Ye Bai

Orcid: 0000-0001-5533-6909

According to our database, Ye Bai authored at least 52 papers between 2011 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation.
CoRR, 2024

NEST-RQ: Next Token Prediction for Speech Self-Supervised Pre-Training.
CoRR, 2024

Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition.
CoRR, 2024

TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking.
CoRR, 2024

Jointly Recognizing Speech and Singing Voices Based on Multi-Task Audio Source Separation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

PolyVoice: Language Models for Speech to Speech Translation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Transfer knowledge for punctuation prediction via adversarial training.
Speech Commun., April, 2023

PolyVoice: Language Models for Speech to Speech Translation.
CoRR, 2023

HoloSinger: Semantics and Music Driven Motion Generation with Octahedral Holographic Projection.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Image-driven Audio-visual Universal Source Separation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022
ADD 2022: the First Audio Deep Synthesis Detection Challenge.
CoRR, 2022

Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

K-Converter: An Unsupervised Singing Voice Conversion System.
Proceedings of the IEEE International Conference on Acoustics, 2022

ADD 2022: the first Audio Deep Synthesis Detection Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Integrating Knowledge Into End-to-End Speech Recognition From External Text-Only Data.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Fast End-to-End Speech Recognition Via Non-Autoregressive Models and Cross-Modal Knowledge Transferring From BERT.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Half-Truth: A Partially Fake Audio Detection Dataset.
CoRR, 2021

TSNAT: Two-Step Non-Autoregressvie Transformer Models for Speech Recognition.
CoRR, 2021

Fast End-to-End Speech Recognition via a Non-Autoregressive Model and Cross-Modal Knowledge Transferring from BERT.
CoRR, 2021

Rnn-transducer With Language Bias For End-to-end Mandarin-English Code-switching Speech Recognition.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Hierarchically Attending Time-Frequency and Channel Features for Improving Speaker Verification.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Half-Truth: A Partially Fake Audio Detection Dataset.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Continual Learning for Fake Audio Detection.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

End-to-End Spelling Correction Conditioned on Acoustic Feature for Code-Switching Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Decoupling Pronunciation and Language for End-to-End Code-Switching Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

One In A Hundred: Selecting the Best Predicted Sequence from Numerous Candidates for Speech Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
A Public Chinese Dataset for Language Model Adaptation.
J. Signal Process. Syst., 2020

Deep imitator: Handwriting calligraphy imitation via deep attention networks.
Pattern Recognit., 2020

Adversarial Transfer Learning for Punctuation Restoration.
CoRR, 2020

Focal Loss for Punctuation Prediction.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Synchronous Transformers for end-to-end Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Language-Adversarial Transfer Learning for Low-Resource Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Autonomous "Figure-8" Flights of a Quadcopter: Experimental Datasets.
Data, 2019

Integrating Whole Context to Sequence-to-sequence Speech Recognition.
CoRR, 2019

Research on the effect of psychological stress intervention in music students based on Diffie-Hellman key exchange algorithm.
Clust. Comput., 2019

Self-Attention Transducers for End-to-End Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A Time Delay Neural Network with Shared Weight Self-Attention for Small-Footprint Keyword Spotting.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Language-invariant Bottleneck Features from Adversarial End-to-end Acoustic Models for Low Resource Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Hypersphere Embedding and Additive Margin for Query-by-example Keyword Spotting.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Noise Prior Knowledge Learning for Speech Enhancement via Gated Convolutional Generative Adversarial Network.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Voice Activity Detection Based on Time-Delay Neural Networks.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Utterance-level Permutation Invariant Training with Discriminative Learning for Single Channel Speech Separation.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

CLMAD: A Chinese Language Model Adaptation Dataset.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Adversarial Multilingual Training for Low-Resource Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2016
End-to-end keywords spotting based on connectionist temporal classification for Mandarin.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

2013
Study of a speech coding algorithm based on a contact conduction transmitter in a complicated water area.
Proceedings of the Conference on Underwater Networks and Systems, 2013

2011
Method of the Road Lines Recognition in the Maps of Digital Material Based on Improvemented BP Neural Network.
Proceedings of the Advances in Computer Science, 2011

The Heavy Mineral Analysis Based on Immune Self-organizing Neural Network.
Proceedings of the Advances in Computer Science, 2011

