Xiaofei Wang

Affiliations:
  • Microsoft, One Microsoft Way, Redmond, WA, USA


According to our database, Xiaofei Wang authored at least 36 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation.
CoRR, 2024

Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech.
CoRR, 2024

E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS.
CoRR, 2024

An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS.
CoRR, 2024

TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation.
CoRR, 2024

CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations.
CoRR, 2024

Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like.
CoRR, 2024

Diarist: Streaming Speech Translation with Speaker Diarization.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer.
CoRR, 2023

Speaker Diarization for ASR Output with T-vectors: A Sequence Classification Approach.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Simulating Realistic Speech Overlaps Improves Multi-Talker ASR.
Proceedings of the IEEE International Conference on Acoustics, 2023

DATA2VEC-SG: Improving Self-Supervised Learning Representations for Speech Generation Tasks.
Proceedings of the IEEE International Conference on Acoustics, 2023

VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

Self-Supervised Learning with Bi-Label Masked Speech Prediction for Streaming Multi-Talker Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

Speech Separation with Large-Scale Self-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Breaking Trade-Offs in Speech Separation with Sparsely-Gated Mixture of Experts.
CoRR, 2022

Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Streaming Multi-Talker ASR with Token-Level Serialized Output Training.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

All-Neural Beamformer for Continuous Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2022

VarArray: Array-Geometry-Agnostic Continuous Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2022

PickNet: Real-Time Channel Selection for Ad Hoc Microphone Arrays.
Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction.
Proceedings of the IEEE International Conference on Acoustics, 2022

Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers Using End-to-End Speaker-Attributed ASR.
Proceedings of the IEEE International Conference on Acoustics, 2022

Personalized Speech Enhancement: New Models and Comprehensive Evaluation.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Exploring End-to-End Multi-Channel ASR with Bias Information for Meeting Transcription.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Investigation of End-to-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

End-to-End Speaker-Attributed ASR with Transformer.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Human Listening and Live Captioning: Multi-Task Training for Speech Enhancement.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR.
Proceedings of the IEEE International Conference on Acoustics, 2021

Hypothesis Stitcher for End-to-End Speaker-Attributed ASR on Long-Form Multi-Talker Recordings.
Proceedings of the IEEE International Conference on Acoustics, 2021

Continuous Speech Separation with Ad Hoc Microphone Arrays.
Proceedings of the 29th European Signal Processing Conference, 2021

A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Serialized Output Training for End-to-End Overlapped Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of any Number of Speakers.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

