2025

Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs.

Yehui Tang

Yichun Yin

CoRR, May, 2025

Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs.

CoRR, April, 2025

2024

MART: Learning Hierarchical Music Audio Representations with Part-Whole Transformer.

Companion Proceedings of the ACM on Web Conference 2024, 2024

SA-WavLM: Speaker-Aware Self-Supervised Pre-training for Mixture Speech.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Prompt-Driven Target Speech Diarization.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Music-PAW: Learning Music Representations via Hierarchical Part-whole Interaction and Contrast.

CoRR, 2023

USED: Universal Speaker Extraction and Diarization.

Junyi Ao

Mehmet Sinan Yildirim

CoRR, 2023

DisCover: Disentangled Music Representation Learning for Cover Song Identification.

Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

2022

Reducing language context confusion for end-to-end code-switching automatic speech recognition.

CoRR, 2022

M4Singer: A Multi-Style, Multi-Singer and Musical Score Provided Mandarin Singing Corpus.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

CorrectSpeech: A Fully Automated System for Speech Correction and Accent Reduction.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

CoCA-MDD: A Coupled Cross-Attention based Framework for Streaming Mispronunciation Detection and Diagnosis.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Streamable Speech Representation Disentanglement and Multi-Level Prosody Modeling for Live One-Shot Voice Conversion.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Reducing multilingual context confusion for end-to-end code-switching automatic speech recognition.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

EditSinger: Zero-Shot Text-Based Singing Voice Editing System with Diverse Prosody Modeling.

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

HiFiDenoise: High-Fidelity Denoising Text to Speech with Adversarial Networks.

Proceedings of the IEEE International Conference on Acoustics, 2022

Context-Aware Mask Prediction Network for End-to-End Text-Based Speech Editing.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

CCA-MDD: A Coupled Cross-Attention based Framework for Streaming Mispronunciation Detection and Diagnosis.

CoRR, 2021

Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-Shot Voice Conversion.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Fcl-Taco2: Towards Fast, Controllable and Lightweight Text-to-Speech Synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2021

EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Unified Mandarin TTS Front-end Based on Distilled BERT Model.

Yang Zhang

Liqun Deng

Yasheng Wang

CoRR, 2020

2016

HiGene: A high-performance platform for genomic data analysis.

Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2016

2013

Generating a two-phase lesson for guiding beginners to learn basic dance movements.

Comput. Educ., 2013

2012

Automatic Dance Lesson Generation.

IEEE Trans. Learn. Technol., 2012

Generalized Model-Based Human Motion Recognition with Body Partition Index Maps.

Comput. Graph. Forum, 2012

2011

Real-time mocap dance recognition for an interactive dancing game.

Comput. Animat. Virtual Worlds, 2011

2010

Automated Recognition of Sequential Patterns in Captured Motion Streams.

Proceedings of the Web-Age Information Management, 11th International Conference, 2010

Evaluating Human Motion Complexity Based on Un-Correlation and Non-smoothness.

Proceedings of the Advances in Multimedia Information Processing - PCM 2010, 2010

Automatically Constructing a Compact Concept Map of Dance Motion with Motion Captured Data.

Proceedings of the Advances in Web-Based Learning - ICWL 2010, 2010

Recognizing Dance Motions with Segmental SVD.

Proceedings of the 20th International Conference on Pattern Recognition, 2010