2024
WildAvatar: Web-scale In-the-wild Video Dataset for 3D Avatar Creation.
CoRR, 2024
One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model.
CoRR, 2024
Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask.
CoRR, 2024
Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo.
CoRR, 2024
GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting.
CoRR, 2024
GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo.
Proceedings of the Computer Vision - ECCV 2024, 2024
GauHuman: Articulated Gaussian Splatting from Monocular Human Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
GauHuman: Articulated Gaussian Splatting from Monocular Human Videos.
CoRR, 2023
HumanLiff: Layer-wise 3D Human Generation with Diffusion Model.
CoRR, 2023
ConsistentNeRF: Enhancing Neural Radiance Fields with 3D Consistency for Sparse View Synthesis.
CoRR, 2023
Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Lossless 4-bit Quantization of Architecture Compressed Conformer ASR Systems on the 300-hr Switchboard Corpus.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
SHERF: Generalizable Human NeRF from a Single Image.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Exploiting Prompt Learning with Pre-Trained Language Models for Alzheimer's Disease Detection.
Proceedings of the IEEE International Conference on Acoustics, 2023
2022
DHA: End-to-End Joint Optimization of Data Augmentation Policy, Hyper-parameter and Architecture.
Trans. Mach. Learn. Res., 2022
Bayesian Neural Network Language Modeling for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus.
CoRR, 2022
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Exploring Linguistic Feature and Model Combination for Speech Recognition Based Automatic AD Detection.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Conformer Based Elderly Speech Recognition System for Alzheimer's Disease Detection.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Generalizing Few-Shot NAS with Gradient Matching.
Proceedings of the Tenth International Conference on Learning Representations, 2022
Neural Architecture Search for Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022
Exploiting Cross Domain Acoustic-to-Articulatory Inverted Features for Disordered Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
Audio-Visual Multi-Channel Integration and Recognition of Overlapped Speech.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Mixed Precision Low-Bit Quantization of Neural Network Language Models for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Recent Progress in the CUHK Dysarthric Speech Recognition System.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Bayesian Learning of LF-MMI Trained Time Delay Neural Networks for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Bayesian Parametric and Architectural Domain Adaptation of LF-MMI Trained TDNNs for Elderly and Dysarthric Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Development of the CUHK Elderly Speech Recognition System for Neurocognitive Disorder Detection Using the DementiaBank Corpus.
Proceedings of the IEEE International Conference on Acoustics, 2021
Bayesian Transformer Language Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021
Mixed Precision Quantization of Transformer Language Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021
Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2021
Understanding the Wiring Evolution in Differentiable Neural Architecture Search.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021
2020
Neural Architecture Search for Speech Recognition.
CoRR, 2020
Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020
Exploiting Cross-Domain Visual Feature Generation for Disordered Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Investigation of Data Augmentation Techniques for Disordered Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
DSNAS: Direct Neural Architecture Search Without Parameter Retraining.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
Comparative Study of Parametric and Representation Uncertainty Modeling for Recurrent Neural Network Language Models.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Exploiting Visual Features Using Bayesian Gated Neural Networks for Disordered Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
On the Use of Pitch Features for Disordered Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
LF-MMI Training of Bayesian and Gaussian Process Time Delay Neural Networks for Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
The CUHK Dysarthric Speech Recognition Systems for English and Cantonese.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Recurrent Neural Network Language Model Training Using Natural Gradient.
Proceedings of the IEEE International Conference on Acoustics, 2019
BLHUC: Bayesian Learning of Hidden Unit Contributions for Deep Neural Network Speaker Adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2019
Speech Emotion Recognition Using Capsule Networks.
Proceedings of the IEEE International Conference on Acoustics, 2019
Gaussian Process LSTM Recurrent Neural Network Language Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019
Bayesian and Gaussian Process Neural Networks for Large Vocabulary Continuous Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019
2018
Development of the CUHK Dysarthric Speech Recognition System for the UA Speech Corpus.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Gaussian Process Neural Networks for Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018