kNN-SVC: Robust Zero-Shot Singing Voice Conversion with Additive Synthesis and Concatenation Smoothness Optimization.
CoRR, April, 2025
Representation Learning for Music and Audio Intelligence
PhD thesis, 2024
Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model.
CoRR, 2024
HKDSME: Heterogeneous Knowledge Distillation for Semi-supervised Singing Melody Extraction Using Harmonic Supervision.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Improving Generalization of Speech Separation in Real-World Scenarios: Strategies in Simulation, Optimization, and Evaluation.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Retrieval Guided Music Captioning via Multimodal Prefixes.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
Music Enhancement with Deep Filters: A Technical Report for the ICASSP 2024 Cadenza Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2024
AudioSR: Versatile Audio Super-Resolution at Scale.
Proceedings of the IEEE International Conference on Acoustics, 2024
MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies.
Proceedings of the IEEE International Conference on Acoustics, 2024
MDX-GAN: Enhancing Perceptual Quality in Multi-Class Source Separation Via Adversarial Training.
Proceedings of the IEEE International Conference on Acoustics, 2024
Graph contrastive learning with implicit augmentations.
Neural Networks, 2023
The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation.
CoRR, 2023
Universal Source Separation with Weakly Labelled Data.
CoRR, 2023
Towards Improving Harmonic Sensitivity and Prediction Stability for Singing Melody Extraction.
Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023
Large-Scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation.
Proceedings of the IEEE International Conference on Acoustics, 2023
Multitrack Music Transformer.
Proceedings of the IEEE International Conference on Acoustics, 2023
Multitrack Music Transformer: Learning Long-Term Dependencies in Music with Diverse Instruments.
CoRR, 2022
Latent feature augmentation for chorus detection.
Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022
Improving Choral Music Separation through Expressive Synthesized Data from Sampled Instruments.
Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022
ByteCover2: Towards Dimensionality Reduction of Latent Embedding for Efficient Cover Song Identification.
Proceedings of the IEEE International Conference on Acoustics, 2022
TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music.
Proceedings of the IEEE International Conference on Acoustics, 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection.
Proceedings of the IEEE International Conference on Acoustics, 2022
Zero-Shot Audio Source Separation through Query-Based Learning from Weakly-Labeled Data.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
Learning Audio Embeddings with User Listening Data for Content-Based Music Recommendation.
Proceedings of the IEEE International Conference on Acoustics, 2021
POP909: A Pop-song Dataset for Music Arrangement Generation.
CoRR, 2020
Continuous Melody Generation via Disentangled Short-Term Representations and Structural Conditions.
Proceedings of the IEEE 14th International Conference on Semantic Computing, 2020
MusPy: A Toolkit for Symbolic Music Generation.
Proceedings of the 21st International Society for Music Information Retrieval Conference, 2020
Music SketchNet: Controllable Music Generation via Factorized Representations of Pitch and Rhythm.
Proceedings of the 21st International Society for Music Information Retrieval Conference, 2020
POP909: A Pop-Song Dataset for Music Arrangement Generation.
Proceedings of the 21st International Society for Music Information Retrieval Conference, 2020
Large-vocabulary Chord Transcription Via Chord Structure Decomposition.
Proceedings of the 20th International Society for Music Information Retrieval Conference, 2019
The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation.
CoRR, 2018