Siqi Zheng
ORCID: 0009-0002-6787-4223
According to our database, Siqi Zheng authored at least 57 papers between 2018 and 2024.
Bibliography
2024
UbiPhysio: Support Daily Functioning, Fitness, and Rehabilitation with Action Understanding and Feedback in Natural Language.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., March, 2024
MuVi: Video-to-Music Generation with Semantic Alignment and Rhythmic Synchronization.
CoRR, 2024
CoRR, 2024
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling.
CoRR, 2024
Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization.
CoRR, 2024
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens.
CoRR, 2024
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs.
CoRR, 2024
CoRR, 2024
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec.
CoRR, 2024
AudioLCM: Efficient and High-Quality Text-to-Audio Generation with Minimal Inference Steps.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
VoiceTuner: Self-Supervised Pre-training and Efficient Fine-tuning For Voice Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
FunCodec: A Fundamental, Reproducible and Integrable Open-Source Toolkit for Neural Speech Codec.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the CHI Conference on Human Factors in Computing Systems, 2024
2023
Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation.
CoRR, 2023
Self-Distillation Network with Ensemble Prototypes: Learning Robust Speaker Representations without Supervision.
CoRR, 2023
3D-Speaker: A Large-Scale Multi-Device, Multi-Distance, and Multi-Dialect Corpus for Speech Representation Disentanglement.
CoRR, 2023
CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Pushing the Limits of Self-Supervised Speaker Verification using Regularized Distillation Framework.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
A Two-Layer Human-in-the-Loop Optimization Framework for Customizing Lower-Limb Exoskeleton Assistance.
Proceedings of the American Control Conference, 2023
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
Exploring Speaker-Related Information in Spoken Language Understanding for Better Speaker Diarization.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
Multi-Source Time Series Remote Sensing Feature Selection and Urban Forest Extraction Based on Improved Artificial Bee Colony.
Remote. Sens., 2022
Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios.
CoRR, 2022
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022
PRISM: Pre-trained Indeterminate Speaker Representation Model for Speaker Diarization and Speaker Verification.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the International Joint Conference on Neural Networks, 2022
Proceedings of the Tenth International Conference on Learning Representations, 2022
Reformulating Speaker Diarization As Community Detection With Emphasis On Topological Structure.
Proceedings of the IEEE International Conference on Acoustics, 2022
Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
2021
Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information.
CoRR, 2021
Measuring daily-life fear perception change: a computational study in the context of COVID-19.
CoRR, 2021
CoRR, 2021
Investigation of Spatial-Acoustic Features for Overlapping Speech Detection in Multiparty Meetings.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
2020
Phonetically-Aware Coupled Network For Short Duration Text-Independent Speaker Verification.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
2019
Autoencoder-Based Semi-Supervised Curriculum Learning for Out-of-Domain Speaker Verification.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Towards a Fault-Tolerant Speaker Verification System: A Regularization Approach to Reduce the Condition Number.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Factors Influencing University Students' Intention to Redeem Digital Takeaway Coupons - Analysis Based on A Survey in China.
Proceedings of the ICIT 2019, 2019
2018
Proceedings of the 24th International Conference on Pattern Recognition, 2018