Xiaofei Wang

Affiliations:

Microsoft, One Microsoft Way, Redmond, WA, USA

According to our database, Xiaofei Wang authored at least 37 papers between 2020 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages.

CoRR, 2024

Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation.

CoRR, 2024

Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech.

CoRR, 2024

E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS.

CoRR, 2024

An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS.

CoRR, 2024

TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation.

CoRR, 2024

CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations.

CoRR, 2024

Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like.

CoRR, 2024

Diarist: Streaming Speech Translation with Speaker Diarization.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

SpeechX: Neural Codec Language Model as a Versatile Speech Transformer.

CoRR, 2023

Speaker Diarization for ASR Output with T-vectors: A Sequence Classification Approach.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Simulating Realistic Speech Overlaps Improves Multi-Talker ASR.

Proceedings of the IEEE International Conference on Acoustics, 2023

data2vec-SG: Improving Self-Supervised Learning Representations for Speech Generation Tasks.

Proceedings of the IEEE International Conference on Acoustics, 2023

VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2023

Self-Supervised Learning with Bi-Label Masked Speech Prediction for Streaming Multi-Talker Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2023

Speech Separation with Large-Scale Self-Supervised Learning.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

Breaking trade-offs in speech separation with sparsely-gated mixture of experts.

CoRR, 2022

Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Streaming Multi-Talker ASR with Token-Level Serialized Output Training.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

All-Neural Beamformer for Continuous Speech Separation.

Proceedings of the IEEE International Conference on Acoustics, 2022

VarArray: Array-Geometry-Agnostic Continuous Speech Separation.

Proceedings of the IEEE International Conference on Acoustics, 2022

PickNet: Real-Time Channel Selection for Ad Hoc Microphone Arrays.

Takuya Yoshioka

Xiaofei Wang

Dongmei Wang

Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction.

Proceedings of the IEEE International Conference on Acoustics, 2022

Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers Using End-to-End Speaker-Attributed ASR.

Proceedings of the IEEE International Conference on Acoustics, 2022

Personalized Speech Enhancement: New Models and Comprehensive Evaluation.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Exploring End-to-End Multi-Channel ASR with Bias Information for Meeting Transcription.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Investigation of End-to-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

End-to-End Speaker-Attributed ASR with Transformer.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Human Listening and Live Captioning: Multi-Task Training for Speech Enhancement.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR.

Proceedings of the IEEE International Conference on Acoustics, 2021

Hypothesis Stitcher for End-to-End Speaker-Attributed ASR on Long-Form Multi-Talker Recordings.

Proceedings of the IEEE International Conference on Acoustics, 2021

Continuous Speech Separation with Ad Hoc Microphone Arrays.

Proceedings of the 29th European Signal Processing Conference, 2021

A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Serialized Output Training for End-to-End Overlapped Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of any Number of Speakers.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
