Shuai Zhang

Orcid: 0000-0002-1094-887X

Affiliations:
  • Chinese Academy of Sciences, Institute of Automation, Beijing, China
  • University of Chinese Academy of Sciences, School of Artificial Intelligence, Beijing, China


According to our database, Shuai Zhang authored at least 28 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Pandora's Box or Aladdin's Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models.
CoRR, 2024

ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation.
CoRR, 2024

Fake News Detection and Manipulation Reasoning via Large Vision-Language Models.
CoRR, 2024

MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation.
CoRR, 2024

Can large language models understand uncommon meanings of common words?
CoRR, 2024

KS-LLM: Knowledge Selection of Large Language Models with Evidence Document for Question Answering.
CoRR, 2024

Bilateral Masking with prompt for Knowledge Graph Completion.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

2023
TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection.
CoRR, 2023

TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Detection of Cross-Dataset Fake Audio Based on Prosodic and Pronunciation Features.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022
Hybrid Autoregressive and Non-Autoregressive Transformer Models for Speech Recognition.
IEEE Signal Process. Lett., 2022

ADD 2022: the First Audio Deep Synthesis Detection Challenge.
CoRR, 2022

Reducing language context confusion for end-to-end code-switching automatic speech recognition.
CoRR, 2022

Reducing multilingual context confusion for end-to-end code-switching automatic speech recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

ADD 2022: the first Audio Deep Synthesis Detection Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Integrating Knowledge Into End-to-End Speech Recognition From External Text-Only Data.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Fast End-to-End Speech Recognition Via Non-Autoregressive Models and Cross-Modal Knowledge Transferring From BERT.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

TSNAT: Two-Step Non-Autoregressive Transformer Models for Speech Recognition.
CoRR, 2021

Fast End-to-End Speech Recognition via a Non-Autoregressive Model and Cross-Modal Knowledge Transferring from BERT.
CoRR, 2021

Rnn-transducer With Language Bias For End-to-end Mandarin-English Code-switching Speech Recognition.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

End-to-End Spelling Correction Conditioned on Acoustic Feature for Code-Switching Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Decoupling Pronunciation and Language for End-to-End Code-Switching Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

One In A Hundred: Selecting the Best Predicted Sequence from Numerous Candidates for Speech Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Synchronous Transformers for end-to-end Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Integrating Whole Context to Sequence-to-sequence Speech Recognition.
CoRR, 2019

