Guangzhi Sun
Orcid: 0000-0002-5886-056X
According to our database, Guangzhi Sun authored at least 51 papers between 2019 and 2025.
Bibliography
2025
CoRR, January, 2025
Comput. Speech Lang., 2025
2024
Graph Neural Networks for Contextual ASR With the Tree-Constrained Pointer Generator.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
CoRR, 2024
Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization.
CoRR, 2024
CoRR, 2024
Enhancing Low-Resource Language and Instruction Following Capabilities of Audio Language Models.
CoRR, 2024
Extract and Diffuse: Latent Integration for Improved Diffusion-based Speech and Vocal Enhancement.
CoRR, 2024
Whisper-PMFA: Partial Multi-Scale Feature Aggregation for Speaker Verification using Whisper Models.
CoRR, 2024
Wav2Prompt: End-to-End Speech Prompt Generation and Tuning For LLM in Zero and Few-shot Learning.
CoRR, 2024
CoRR, 2024
M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset.
CoRR, 2024
CoRR, 2024
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Building Better AI Agents: A Provocation on the Utilisation of Persona in LLM-based Conversational Agents.
Proceedings of the ACM Conversational User Interfaces 2024, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
Minimising Biasing Word Errors for Contextual ASR With the Tree-Constrained Pointer Generator.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch.
CoRR, 2023
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models.
CoRR, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
TorchAudio 2.1: Advancing Speech Recognition, Self-Supervised Learning, and Audio Processing Components for PyTorch.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
2022
Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Transformer Language Models with LSTM-Based Cross-Utterance Information Representation.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
2020
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior.
CoRR, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Proceedings of the IEEE International Conference on Acoustics, 2019