Qian Chen
ORCID: 0000-0001-6939-7438
Affiliations:
- Alibaba Group, DAMO Academy, Speech Lab, China
- University of Science and Technology of China, National Engineering Laboratory of Speech and Language Information Processing, Hefei, China
According to our database, Qian Chen authored at least 79 papers between 2015 and 2024.
Bibliography
2024
IEEE Signal Process. Lett., 2024
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling.
CoRR, 2024
Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization.
CoRR, 2024
Recording for Eyes, Not Echoing to Ears: Contextualized Spoken-to-Written Conversion of ASR Transcripts.
CoRR, 2024
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens.
CoRR, 2024
ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World.
CoRR, 2024
CoRR, 2024
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec.
CoRR, 2024
CoRR, 2024
CoRR, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Advancing Precise Outline-Conditioned Text Generation with Task Duality and Explicit Outline Control.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024
CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation.
CoRR, 2023
Self-Distillation Network with Ensemble Prototypes: Learning Robust Speaker Representations without Supervision.
CoRR, 2023
3D-Speaker: A Large-Scale Multi-Device, Multi-Distance, and Multi-Dialect Corpus for Speech Representation Disentanglement.
CoRR, 2023
Exploiting Correlations Between Contexts and Definitions with Multiple Definition Modeling.
CoRR, 2023
CoRR, 2023
Enhancing Multi-modal Multi-hop Question Answering via Structured Knowledge and Unified Retrieval-Generation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 20th International Conference on Spoken Language Translation, 2023
CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Adapter-tuning with Effective Token-dependent Representation Shift for Automatic Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Personality-aware Training based Speaker Adaptation for End-to-end Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Overview of the ICASSP 2023 General Meeting Understanding and Generation Challenge (MUG).
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Pushing the Limits of Self-Supervised Speaker Verification using Regularized Distillation Framework.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
Exploring Speaker-Related Information in Spoken Language Understanding for Better Speaker Diarization.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
Enhancing Multi-modal and Multi-hop Question Answering via Structured Knowledge and Unified Retrieval-Generation.
CoRR, 2022
PRISM: Pre-trained Indeterminate Speaker Representation Model for Speaker Diarization and Speaker Verification.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the Tenth International Conference on Learning Representations, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
MDERank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022
2021
TRS: Transferability Reduced Ensemble via Encouraging Gradient Diversity and Model Smoothness.
CoRR, 2021
TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Pre-Training for Spoken Language Understanding with Joint Textual and Phonetic Representation Learning.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Sequence Model with Self-Adaptive Sliding Window for Efficient Spoken Document Segmentation.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
2020
Comput. Speech Lang., 2020
Controllable Time-Delay Transformer for Real-Time Punctuation Prediction and Disfluency Detection.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020
2019
Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models.
CoRR, 2019
Several Experiments on Investigating Pretraining and Knowledge-Enhanced Models for Natural Language Inference.
CoRR, 2019
CoRR, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
2018
A Sequential Neural Encoder With Latent Structured Description for Modeling Sentences.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
VisDrone-DET2018: The Vision Meets Drone Object Detection in Image Challenge Results.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018
Proceedings of the 27th International Conference on Computational Linguistics, 2018
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018
2017
Exploring Question Understanding and Adaptation in Neural-Network-Based Question Answering.
CoRR, 2017
Recurrent Neural Network-Based Sentence Encoder with Gated Attention for Natural Language Inference.
Proceedings of the 2nd Workshop on Evaluating Vector Space Representations for NLP, 2017
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017
2016
CoRR, 2016
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016
2015
Automatic phrase boundary labeling of speech synthesis database using context-dependent HMMs and n-gram prior distributions.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015