Michael Zeng

Orcid: 0000-0001-5302-5883

Affiliations:
  • Microsoft, Redmond, WA, USA


According to our database, Michael Zeng authored at least 92 papers between 2018 and 2024.

Bibliography

2024
Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation.
CoRR, 2024

TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation.
CoRR, 2024

CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations.
CoRR, 2024

Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like.
CoRR, 2024

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Improving Readability for Automatic Speech Recognition Transcription.
ACM Trans. Asian Low Resour. Lang. Inf. Process., May, 2023

MACSum: Controllable Summarization with Mixed Attributes.
Trans. Assoc. Comput. Linguistics, 2023

Diffusion Conditional Expectation Model for Efficient and Robust Target Speech Extraction.
CoRR, 2023

ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation.
CoRR, 2023

i-Code Studio: A Configurable and Composable Framework for Integrative AI.
CoRR, 2023

LMGQS: A Large-scale Dataset for Query-focused Summarization.
CoRR, 2023

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data.
CoRR, 2023

MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action.
CoRR, 2023

Any-to-Any Generation via Composable Diffusion.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Adapting Multi-Lingual ASR Models for Handling Multiple Talkers.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Generate rather than Retrieve: Large Language Models are Strong Context Generators.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Code-Switching Text Generation and Injection in Mandarin-English ASR.
Proceedings of the IEEE International Conference on Acoustics, 2023

DATA2VEC-SG: Improving Self-Supervised Learning Representations for Speech Generation Tasks.
Proceedings of the IEEE International Conference on Acoustics, 2023

Target Sound Extraction with Variable Cross-Modality Clues.
Proceedings of the IEEE International Conference on Acoustics, 2023

InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

LMGQS: A Large-scale Dataset for Query-focused Summarization.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Automatic Prompt Optimization with "Gradient Descent" and Beam Search.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

ReCo: Region-Controlled Text-to-Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Unifying Vision, Text, and Layout for Universal Document Processing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

UniSumm and SummZoo: Unified Model and Diverse Benchmark for Few-Shot Summarization.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

i-Code: An Integrative and Composable Multimodal Learning Framework.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing.
IEEE J. Sel. Top. Signal Process., 2022

UniSumm: Unified Few-shot Summarization with Multi-Task Pre-Training and Prefix-Tuning.
CoRR, 2022

Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization.
CoRR, 2022

Impossible Triangle: What's Next for Pre-trained Language Models?
CoRR, 2022

Unsupervised Summarization with Customized Granularities.
CoRR, 2022

A Comprehensive Study on Self-Supervised Distillation for Speaker Representation Learning.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Optimizing Alignment of Speech and Language Latent Spaces for End-To-End Speech Recognition and Understanding.
Proceedings of the IEEE International Conference on Acoustics, 2022

Large-Scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2022

Unsupervised Multi-Granularity Summarization.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Narrate Dialogues for Better Summarization.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

ParaTag: A Dataset of Paraphrase Tagging for Fine-Grained Labels, NLG Evaluation, and Data Augmentation.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Automatic Rule Induction for Efficient Semi-Supervised Learning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Task Compass: Scaling Multi-task Pre-training with Task Prefix.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

AdaPrompt: Adaptive Model Training for Prompt-based NLP.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

CLIP-Event: Connecting Text and Images with Event Structures.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

An Empirical Study of Training End-to-End Vision-and-Language Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Training Data is More Valuable than You Think: A Simple and Effective Method by Retrieving from Training Data.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

End-to-End Segmentation-based News Summarization.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Leveraging Knowledge in Multilingual Commonsense Reasoning.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Dict-BERT: Enhancing Language Model Pre-training with Dictionary.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

JAKET: Joint Pre-training of Knowledge Graph and Language Understanding.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
MLP Architectures for Vision-and-Language Modeling: An Empirical Study.
CoRR, 2021

Florence: A New Foundation Model for Computer Vision.
CoRR, 2021

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing.
CoRR, 2021

Does Knowledge Help General NLU? An Empirical Study.
CoRR, 2021

A Joint and Domain-Adaptive Approach to Spoken Language Understanding.
CoRR, 2021

Leveraging Lead Bias for Zero-shot Abstractive News Summarization.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

MediaSum: A Large-scale Media Interview Dataset for Dialogue Summarization.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Enhancing Factual Consistency of Abstractive Summarization.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

SPLAT: Speech-Language Joint Pre-Training for Spoken Language Understanding.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Data Augmentation for Spoken Language Understanding via Pretrained Language Models.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Improving Zero-shot Neural Machine Translation on Language-specific Encoders-Decoders.
Proceedings of the International Joint Conference on Neural Networks, 2021

UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data.
Proceedings of the 38th International Conference on Machine Learning, 2021

Speech-Language Pre-Training for End-to-End Spoken Language Understanding.
Proceedings of the IEEE International Conference on Acoustics, 2021

Generating Human Readable Transcript for Automatic Speech Recognition with Pre-Trained Language Model.
Proceedings of the IEEE International Conference on Acoustics, 2021

Want To Reduce Labeling Cost? GPT-3 Can Help.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Fusing Context Into Knowledge Graph for Commonsense Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Retrieval Enhanced Model for Commonsense Generation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
Fusing Context Into Knowledge Graph for Commonsense Reasoning.
CoRR, 2020

LSTM-LM with Long-Term History for First-Pass Decoding in Conversational Speech Recognition.
CoRR, 2020

Semi-Supervised Speech-Language Joint Pre-Training for Spoken Language Understanding.
CoRR, 2020

Mind The Facts: Knowledge-Boosted Coherent Abstractive Text Summarization.
CoRR, 2020

Meta Dialogue Policy Learning.
CoRR, 2020

Data Augmentation for Spoken Language Understanding via Pretrained Models.
CoRR, 2020

End-to-End Abstractive Summarization for Meetings.
CoRR, 2020

Boosting Factual Correctness of Abstractive Summarization with Knowledge Graph.
CoRR, 2020

Discriminative Transfer Learning for Optimizing ASR and Semantic Labeling in Task-Oriented Spoken Dialog.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Sequence-Level Self-Learning with Multiple Hypotheses.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Mixed-Lingual Pre-training for Cross-lingual Summarization.
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020

A Hierarchical Network for Abstractive Meeting Summarization with Cross-Domain Pretraining.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Few-shot Natural Language Generation for Task-Oriented Dialog.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

2019
Make Lead Bias in Your Favor: A Simple and Effective Method for News Summarization.
CoRR, 2019

Meeting Transcription Using Virtual Microphone Arrays.
CoRR, 2019

SIM: A Slot-Independent Neural Model for Dialogue State Tracking.
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue, 2019

Meeting Transcription Using Asynchronous Distant Microphones.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multi-task Learning for Natural Language Generation in Task-Oriented Dialogue.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018
SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering.
CoRR, 2018

