Peng Xu

ORCID: 0000-0003-3399-9722

Affiliations:

The Hong Kong University of Science and Technology

According to our database, Peng Xu authored at least 41 papers between 2018 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2024

ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities.

,

,

,

,

Mohammad Shoeybi

,

Bryan Catanzaro

CoRR, 2024

ChatQA: Building GPT-4 Level Conversational QA Models.

,

,

,

,

,

Mohammad Shoeybi

,

Bryan Catanzaro

CoRR, 2024

InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining.

,

,

Lawrence McAfee

,

,

,

Mohammad Shoeybi

,

Bryan Catanzaro

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Retrieval meets Long Context Large Language Models.

,

,

,

Lawrence McAfee

,

,

,

Sandeep Subramanian

,

Evelina Bakhturina

,

Mohammad Shoeybi

,

Bryan Catanzaro

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models.

,

,

,

Mohammad Shoeybi

,

Kevin Chen-Chuan Chang

,

Bryan Catanzaro

CoRR, 2023

Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study.

,

,

,

Lawrence McAfee

,

,

Mohammad Shoeybi

,

,

Oleksii Kuchaiev

,

,

,

Anima Anandkumar

,

Bryan Catanzaro

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Context Generation Improves Open Domain Question Answering.

,

Mostofa Patwary

,

Shrimai Prabhumoye

,

,

,

Mohammad Shoeybi

,

,

Anima Anandkumar

,

Bryan Catanzaro

Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

2022

Factuality Enhanced Language Models for Open-Ended Text Generation.

,

,

,

Mostofa Patwary

,

Mohammad Shoeybi

,

Bryan Catanzaro

CoRR, 2022

CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition.

,

Samuel Cahyawijaya

,

,

Elham J. Barezi

,

,

Cheuk Tung Shadow Yiu

,

,

,

Genta Indra Winata

,

,

,

,

CoRR, 2022

Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models.

,

,

,

,

Mostofa Patwary

,

Mohammad Shoeybi

,

,

Anima Anandkumar

,

Bryan Catanzaro

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Factuality Enhanced Language Models for Open-Ended Text Generation.

,

,

,

Mostofa Patwary

,

,

Mohammad Shoeybi

,

Bryan Catanzaro

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset.

,

,

,

Samuel Cahyawijaya

,

Cheuk Tung Shadow Yiu

,

,

,

Elham J. Barezi

,

,

,

,

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation.

,

Samuel Cahyawijaya

,

Genta Indra Winata

,

,

,

,

,

,

,

Elham J. Barezi

,

,

,

,

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition.

,

Samuel Cahyawijaya

,

,

Elham J. Barezi

,

,

,

,

,

Genta Indra Winata

,

,

,

,

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

QA4QG: Using Question Answering to Constrain Multi-Hop Question Generation.

,

,

Proceedings of the IEEE International Conference on Acoustics, 2022

Evaluating Parameter Efficient Learning for Generation.

,

Mostofa Patwary

,

Shrimai Prabhumoye

,

,

,

,

,

Mohammad Shoeybi

,

Bryan Catanzaro

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021

ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation.

,

Samuel Cahyawijaya

,

Genta Indra Winata

,

,

,

,

,

,

,

Elham J. Barezi

,

CoRR, 2021

X2Parser: Cross-Lingual and Cross-Domain Framework for Task-Oriented Compositional Semantic Parsing.

,

Genta Indra Winata

,

,

Proceedings of the 6th Workshop on Representation Learning for NLP, 2021

BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling.

,

,

Genta Indra Winata

,

,

,

,

,

Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

CAiRE in DialDoc21: Data Augmentation for Information Seeking Dialogue System.

,

,

Genta Indra Winata

,

,

,

,

,

Proceedings of the 1st Workshop on Document-grounded Dialogue and Conversational Question Answering, 2021

2020

EmoGraph: Capturing Emotion Correlations using Graph Networks.

,

,

Genta Indra Winata

,

,

CoRR, 2020

Variational Transformers for Diverse Response Generation.

,

Genta Indra Winata

,

,

,

CoRR, 2020

Getting To Know You: User Attribute Extraction from Dialogues.

,

,

,

,

Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Learning Fast Adaptation on Cross-Accented Speech Recognition.

Genta Indra Winata

,

Samuel Cahyawijaya

,

,

,

,

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Generating Empathetic Responses by Looking Ahead the User's Sentiment.

,

,

,

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

MEGATRON-CNTRL: Controllable Story Generation with External Knowledge Using Large-Scale Language Models.

,

Mostofa Patwary

,

Mohammad Shoeybi

,

,

,

Anima Anandkumar

,

Bryan Catanzaro

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Cross-lingual Spoken Language Understanding with Regularized Representation Alignment.

,

Genta Indra Winata

,

,

,

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Meta-Transfer Learning for Code-Switched Speech Recognition.

Genta Indra Winata

,

Samuel Cahyawijaya

,

,

,

,

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling.

,

Genta Indra Winata

,

,

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Attention-Informed Mixed-Language Training for Zero-Shot Cross-Lingual Task-Oriented Dialogue Systems.

,

Genta Indra Winata

,

,

,

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

CAiRE: An End-to-End Empathetic Chatbot.

,

,

Genta Indra Winata

,

Farhad Bin Siddique

,

,

,

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

CAiRE: An End-to-End Empathetic Chatbot.

,

,

Genta Indra Winata

,

,

CoRR, 2019

HappyBot: Generating Empathetic Dialogue Responses by Improving User Experience Look-ahead.

,

,

,

CoRR, 2019

CAiRE_HKUST at SemEval-2019 Task 3: Hierarchical Attention for Dialogue Emotion Classification.

Genta Indra Winata

,

,

,

,

,

,

Proceedings of the 13th International Workshop on Semantic Evaluation, 2019

A Novel Repetition Normalized Adversarial Reward for Headline Generation.

,

Proceedings of the IEEE International Conference on Acoustics, 2019

Clickbait? Sensational Headline Generation with Auto-tuned Reinforcement Learning.

,

,

,

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Zero-shot Cross-lingual Dialogue Systems with Transferable Latent Variables.

,

,

,

Genta Indra Winata

,

,

,

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

MoEL: Mixture of Empathetic Listeners.

,

,

,

,

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Generalizing Question Answering System with Pre-trained Language Model Fine-tuning.

,

,

Genta Indra Winata

,

,

,

,

Proceedings of the 2nd Workshop on Machine Reading for Question Answering, 2019

2018

Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training.

,

,

,

,

Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, 2018

PlusEmo2Vec at SemEval-2018 Task 1: Exploiting emotion knowledge from emoji and #hashtags.

,

,

Proceedings of The 12th International Workshop on Semantic Evaluation, 2018
