2024
Object-Centric Representation Learning for Video Scene Understanding.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024
Accented Text-to-Speech Synthesis With Limited Data.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
RefXVC: Cross-Lingual Voice Conversion With Enhanced Reference Leveraging.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Multi-Scale Accent Modeling with Disentangling for Multi-Speaker Multi-Accent TTS Synthesis.
CoRR, 2024
Team Samsung-RAL: Technical Report for 2024 RoboDrive Challenge-Robust Map Segmentation Track.
CoRR, 2024
The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition.
CoRR, 2024
Is Your HD Map Constructor Reliable under Sensor Corruptions?
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
MBFusion: A New Multi-modal BEV Feature Fusion Method for HD Map Construction.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024
HIMap: HybrId Representation Learning for End-to-end Vectorized HD Map Construction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
Mel-S3R: Combining Mel-spectrogram and self-supervised speech representation with VQ-VAE for any-to-any voice conversion.
Speech Commun., June, 2023
Optimization of Cross-Lingual Voice Conversion With Linguistics Losses to Reduce Foreign Accents.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
TTS-Guided Training for Accent Conversion Without Parallel Data.
IEEE Signal Process. Lett., 2023
Zero-shot multi-speaker accent TTS with limited accent data.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
2022
Slot-VPS: Object-centric Representation Learning for Video Panoptic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
Language Agnostic Speaker Embedding for Cross-Lingual Personalized Speech Generation.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Transfer Learning From Speech Synthesis to Voice Conversion With Non-Parallel Training Data.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Cross-Lingual Voice Conversion with a Cycle Consistency Loss on Linguistic Representation.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
2020
Multi-Task WaveRNN With an Integrated Architecture for Cross-Lingual Voice Conversion.
IEEE Signal Process. Lett., 2020
Personalized Singing Voice Generation Using WaveRNN.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020
The NUS & NWPU system for Voice Conversion Challenge 2020.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
NUS-HLT System for Blizzard Challenge 2020.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
2019
Cross-lingual Voice Conversion with Bilingual Phonetic Posteriorgram and Average Modeling.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019
A Modularized Neural Network with Language-Specific Output Layers for Cross-Lingual Voice Conversion.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Many-to-many Cross-lingual Voice Conversion with a Jointly Trained Speaker Embedding Network.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Speaker-independent Spectral Mapping for Speech-to-Singing Conversion.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
2005
A probabilistic model for robust face alignment in videos.
Proceedings of the 2005 International Conference on Image Processing, 2005
A Bayesian Mixture Model for Multi-View Face Alignment.
Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), 2005