Yu Shi

ORCID: 0000-0003-1872-3429

Affiliations:
  • Microsoft, Redmond, WA, USA


According to our database, Yu Shi authored at least 41 papers between 2001 and 2024.

Bibliography

2024
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

2023
Improving Readability for Automatic Speech Recognition Transcription.
ACM Trans. Asian Low Resour. Lang. Inf. Process., May, 2023

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data.
CoRR, 2023

Code-Switching Text Generation and Injection in Mandarin-English ASR.
Proceedings of the IEEE International Conference on Acoustics, 2023

Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

i-Code: An Integrative and Composable Multimodal Learning Framework.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Breaking trade-offs in speech separation with sparsely-gated mixture of experts.
CoRR, 2022

Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization.
CoRR, 2022

Automatic Speech Recognition Post-Processing for Readability: Task, Dataset and a Two-Stage Pre-Trained Approach.
IEEE Access, 2022

Optimizing Alignment of Speech and Language Latent Spaces for End-To-End Speech Recognition and Understanding.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition.
CoRR, 2021

Florence: A New Foundation Model for Computer Vision.
CoRR, 2021

A Joint and Domain-Adaptive Approach to Spoken Language Understanding.
CoRR, 2021

Listen, Look and Deliberate: Visual Context-Aware Speech Recognition Using Pre-Trained Text-Video Representations.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Improving Zero-shot Neural Machine Translation on Language-specific Encoders-Decoders.
Proceedings of the International Joint Conference on Neural Networks, 2021

Speech-Language Pre-Training for End-to-End Spoken Language Understanding.
Proceedings of the IEEE International Conference on Acoustics, 2021

Generating Human Readable Transcript for Automatic Speech Recognition with Pre-Trained Language Model.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Discriminative Transfer Learning for Optimizing ASR and Semantic Labeling in Task-Oriented Spoken Dialog.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Mixed-Lingual Pre-training for Cross-lingual Summarization.
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020

MaP: A Matrix-based Prediction Approach to Improve Span Extraction in Machine Reading Comprehension.
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020

2010
A study of irrelevant variability normalization based training and unsupervised online adaptation for LVCSR.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

A Study of Discriminative Training for HMM-Based Online Handwritten Chinese/Japanese Character Recognition.
Proceedings of the International Conference on Frontiers in Handwriting Recognition, 2010

2009
A Study of Feature Design for Online Handwritten Chinese Character Recognition Based on Continuous-Density Hidden Markov Models.
Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009

2008
GPU-accelerated Gaussian clustering for fMPE discriminative training.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

A symbol graph based handwritten math expression recognition.
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

Approximate word-lattice indexing with text indexers: Time-Anchored Lattice Expansion.
Proceedings of the IEEE International Conference on Acoustics, 2008

Symbol graph based discriminative training and rescoring for improved math symbol recognition.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
A Unified Framework for Symbol Segmentation and Recognition of Handwritten Mathematical Expressions.
Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR 2007), 2007

A Segmentation Posterior Based Endpointing Algorithm.
Proceedings of the IEEE International Conference on Acoustics, 2007

Towards spoken-document retrieval for the enterprise: Approximate word-lattice indexing with text indexers.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
A Robust Voice Activity Detection Based on Noise Eigenspace Projection.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Integrating Hypotheses of Multiple Recognizers for Improving Mandarin LVCSR Performance.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Auto-segmentation based VAD for robust ASR.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Auto-Segmentation Based Partitioning and Clustering Approach to Robust Endpointing.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2004
Tone articulation modeling for Mandarin spontaneous speech recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Studies in massively speaker-specific speech recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Segmental tonal modeling for phone set design in Mandarin LVCSR.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Spectrogram-based formant tracking via particle filters.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
A system for spoken query information retrieval on mobile devices.
IEEE Trans. Speech Audio Process., 2002

Power spectral density based channel equalization of large speech database for concatenative TTS system.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2001
Speech lab in a box: a Mandarin speech toolbox to jumpstart speech related research.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

