2024

WildAvatar: Web-scale In-the-wild Video Dataset for 3D Avatar Creation.

[DOI]

Zihao Huang

Shoukang Hu

CoRR, 2024

One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model.

[DOI]

CoRR, 2024

Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask.

[DOI]

CoRR, 2024

Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo.

[DOI]

CoRR, 2024

GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting.

[DOI]

CoRR, 2024

GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo.

[DOI]

Proceedings of the Computer Vision - ECCV 2024, 2024

GauHuman: Articulated Gaussian Splatting from Monocular Human Videos.

[DOI]

Shoukang Hu

Tao Hu

Ziwei Liu

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

GauHuman: Articulated Gaussian Splatting from Monocular Human Videos.

[DOI]

Shoukang Hu

Ziwei Liu

CoRR, 2023

HumanLiff: Layer-wise 3D Human Generation with Diffusion Model.

[DOI]

CoRR, 2023

ConsistentNeRF: Enhancing Neural Radiance Fields with 3D Consistency for Sparse View Synthesis.

[DOI]

CoRR, 2023

Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition.

[DOI]

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Lossless 4-bit Quantization of Architecture Compressed Conformer ASR Systems on the 300-hr Switchboard Corpus.

[DOI]

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

SHERF: Generalizable Human NeRF from a Single Image.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Exploiting Prompt Learning with Pre-Trained Language Models for Alzheimer's Disease Detection.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

DHA: End-to-End Joint Optimization of Data Augmentation Policy, Hyper-parameter and Architecture.

[DOI]

Trans. Mach. Learn. Res., 2022

Bayesian Neural Network Language Modeling for Speech Recognition.

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks.

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus.

[DOI]

CoRR, 2022

Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus.

[DOI]

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Exploring linguistic feature and model combination for speech recognition based automatic AD detection.

[DOI]

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Conformer Based Elderly Speech Recognition System for Alzheimer's Disease Detection.

[DOI]

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems.

[DOI]

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Generalizing Few-Shot NAS with Gradient Matching.

[DOI]

Proceedings of the Tenth International Conference on Learning Representations, 2022

Neural Architecture Search for Speech Emotion Recognition.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Exploiting Cross Domain Acoustic-to-Articulatory Inverted Features for Disordered Speech Recognition.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

Audio-Visual Multi-Channel Integration and Recognition of Overlapped Speech.

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Mixed Precision Low-Bit Quantization of Neural Network Language Models for Speech Recognition.

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Recent Progress in the CUHK Dysarthric Speech Recognition System.

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Bayesian Learning of LF-MMI Trained Time Delay Neural Networks for Speech Recognition.

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition.

[DOI]

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30 - September 3, 2021

Bayesian Parametric and Architectural Domain Adaptation of LF-MMI Trained TDNNs for Elderly and Dysarthric Speech Recognition.

[DOI]

Jiajun Deng

Fabian Ritter Gutierrez

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30 - September 3, 2021

Development of the CUHK Elderly Speech Recognition System for Neurocognitive Disorder Detection Using the DementiaBank Corpus.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Bayesian Transformer Language Models for Speech Recognition.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Mixed Precision Quantization of Transformer Language Models for Speech Recognition.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Understanding the wiring evolution in differentiable neural architecture search.

[DOI]

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020

Neural Architecture Search for Speech Recognition.

[DOI]

CoRR, 2020

Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification.

[DOI]

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Exploiting Cross-Domain Visual Feature Generation for Disordered Speech Recognition.

[DOI]

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Investigation of Data Augmentation Techniques for Disordered Speech Recognition.

[DOI]

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers.

[DOI]

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

DSNAS: Direct Neural Architecture Search Without Parameter Retraining.

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Comparative Study of Parametric and Representation Uncertainty Modeling for Recurrent Neural Network Language Models.

[DOI]

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Exploiting Visual Features Using Bayesian Gated Neural Networks for Disordered Speech Recognition.

[DOI]

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

On the Use of Pitch Features for Disordered Speech Recognition.

[DOI]

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

LF-MMI Training of Bayesian and Gaussian Process Time Delay Neural Networks for Speech Recognition.

[DOI]

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

The CUHK Dysarthric Speech Recognition Systems for English and Cantonese.

[DOI]

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Recurrent Neural Network Language Model Training Using Natural Gradient.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

BLHUC: Bayesian Learning of Hidden Unit Contributions for Deep Neural Network Speaker Adaptation.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Speech Emotion Recognition Using Capsule Networks.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Gaussian Process LSTM Recurrent Neural Network Language Models for Speech Recognition.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Bayesian and Gaussian Process Neural Networks for Large Vocabulary Continuous Speech Recognition.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

2018

Development of the CUHK Dysarthric Speech Recognition System for the UA Speech Corpus.

[DOI]

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Gaussian Process Neural Networks for Speech Recognition.

[DOI]

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018