Zhihao Du

Orcid: 0000-0003-3509-9322

According to our database, Zhihao Du authored at least 37 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Assessing the Post-Activation Performance Enhancement of Upper Limbs in Basketball Athletes: A Sensor-Based Study of Rapid Stretch Compound and Blood Flow Restriction Training.
Sensors, July, 2024

Effects of High-Load Bench Press Training with Different Blood Flow Restriction Pressurization Strategies on the Degree of Muscle Activation in the Upper Limbs of Bodybuilders.
Sensors, January, 2024

Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap.
CoRR, 2024

IntrinsicVoice: Empowering LLMs with Intrinsic Real-time Voice Interaction Abilities.
CoRR, 2024

CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens.
CoRR, 2024

FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs.
CoRR, 2024

An Embarrassingly Simple Approach for LLM with Strong ASR Capacity.
CoRR, 2024

FunCodec: A Fundamental, Reproducible and Integrable Open-Source Toolkit for Neural Speech Codec.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT.
CoRR, 2023

FunASR: A Fundamental End-to-End Speech Recognition Toolkit.
CoRR, 2023

CASA-ASR: Context-Aware Speaker-Attributed ASR.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Personality-aware Training based Speaker Adaptation for End-to-end Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

FunASR: A Fundamental End-to-End Speech Recognition Toolkit.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

AttenTPU: Tensor Processor for Attention Mechanism with Fine-Grained Padding.
Proceedings of the IEEE International Conference on Integrated Circuits, 2023

TOLD: a Novel Two-Stage Overlap-Aware Framework for Speaker Diarization.
Proceedings of the IEEE International Conference on Acoustics, 2023

The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Sa-Paraformer: Non-Autoregressive End-To-End Speaker-Attributed ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022
MFCCA: Multi-Frame Cross-Channel attention for multi-speaker ASR in Multi-party meeting scenario.
CoRR, 2022

Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios.
CoRR, 2022

MFCCA: Multi-Frame Cross-Channel Attention for Multi-Speaker ASR in Multi-Party Meeting Scenario.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Separate-to-Recognize: Joint Multi-target Speech Separation and Speech Recognition for Speaker-attributed ASR.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022

M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022

Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021
Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information.
CoRR, 2021

Capturing Temporal Dependencies Through Future Prediction for CNN-Based Audio Classifiers.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
A Joint Framework of Denoising Autoencoder and Generative Vocoder for Monaural Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Self-Supervised Adversarial Multi-Task Learning for Vocoder-Based Monaural Speech Enhancement.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Double Adversarial Network Based Monaural Speech Enhancement for Robust Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

An Efficient Joint Training Framework for Robust Small-Footprint Keyword Spotting.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

PAN: Phoneme-Aware Network for Monaural Speech Enhancement.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
A Monaural Speech Enhancement Method for Robust Small-Footprint Keyword Spotting.
CoRR, 2019

Acoustic Scene Classification by Implicitly Identifying Distinct Sound Events.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Investigation of Monaural Front-End Processing for Robust Speech Recognition Without Retraining or Joint-Training.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Investigation of Monaural Front-End Processing for Robust ASR without Retraining or Joint-Training.
CoRR, 2018

