Hao Huang
ORCID: 0000-0001-6604-0951
Affiliations:
- Xinjiang University, School of Information Science and Engineering, Xinjiang Provincial Key Laboratory of Multilingual Information Technology, Urumqi, China
- Shanghai Jiao Tong University, Department of Electronic Engineering, Shanghai, China (PhD 2008)
According to our database, Hao Huang authored at least 62 papers between 2009 and 2024.
Bibliography
2024
GLFER-Net: a polyphonic sound source localization and detection network based on global-local feature extraction and recalibration.
EURASIP J. Audio Speech Music. Process., December, 2024
CEA-Net: a co-interactive external attention network for joint intent detection and slot filling.
Neural Comput. Appl., August, 2024
IIFC-Net: A Monaural Speech Enhancement Network With High-Order Information Interaction and Feature Calibration.
IEEE Signal Process. Lett., 2024
J. Intell. Fuzzy Syst., 2024
Proceedings of the IEEE International Conference on Cybernetics and Intelligent Systems, 2024
Proceedings of the International Joint Conference on Neural Networks, 2024
Improving Pointer Network based Dialogue State Tracking via Dual Hierarchical Selective Augmentation.
Proceedings of the International Joint Conference on Neural Networks, 2024
Phase Continuity-Aware Self-Attentive Recurrent Network with Adaptive Feature Selection for Robust VAD.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Introducing Multilingual Phonetic Information to Speaker Embedding for Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2024
SMMA-Net: An Audio Clue-Based Target Speaker Extraction Network with Spectrogram Matching and Mutual Attention.
Proceedings of the IEEE International Conference on Acoustics, 2024
Fact-Aware Summarization with Contrastive Learning for Few-Shot Dialogue State Tracking.
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
Neural RAPT: deep learning-based pitch tracking with prior algorithmic knowledge instillation.
Int. J. Speech Technol., December, 2023
W2VC: WavLM representation based one-shot voice conversion with gradient reversal distillation and CTC supervision.
EURASIP J. Audio Speech Music. Process., December, 2023
Proceedings of the ACM Multimedia Asia 2023, 2023
Reprogramming Self-supervised Learning-based Speech Representations for Speaker Anonymization.
Proceedings of the ACM Multimedia Asia 2023, 2023
Self-supervised Learning Representation based Accent Recognition with Persistent Accent Memory.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
MTANet: Multi-band Time-frequency Attention Network for Singing Melody Extraction from Polyphonic Music.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the International Joint Conference on Neural Networks, 2023
CRA-DIFFUSE: Improved Cross-Domain Speech Enhancement Based on Diffusion Model with T-F Domain Pre-Denoising.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023
Speech-Text Based Multi-Modal Training with Bidirectional Attention for Improved Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Speakeraugment: Data Augmentation for Generalizable Source Separation via Speaker Parameter Manipulation.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
2022
Hierarchic Temporal Convolutional Network With Cross-Domain Encoder for Music Source Separation.
IEEE Signal Process. Lett., 2022
A bimodal network based on Audio-Text-Interactional-Attention with ArcFace loss for speech emotion recognition.
Speech Commun., 2022
Multi-stage music separation network with dual-branch attention and hybrid convolution.
J. Intell. Inf. Syst., 2022
Intermediate-layer output Regularization for Attention-based Speech Recognition with Shared Decoder.
CoRR, 2022
Internal Language Model Estimation based Language Model Fusion for Cross-Domain Code-Switching Speech Recognition.
CoRR, 2022
CoRR, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
A Graph Isomorphism Network with Weighted Multiple Aggregators for Speech Emotion Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the Neural Information Processing - 29th International Conference, 2022
GhostVec: Directly Extracting Speaker Embedding from End-to-End Speech Recognition Model Using Adversarial Examples.
Proceedings of the Neural Information Processing - 29th International Conference, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Minimum Word Error Training For Non-Autoregressive Transformer-Based Code-Switching ASR.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the Biometric Recognition - 16th Chinese Conference, 2022
2021
A gating context-aware text classification model with BERT and graph convolutional networks.
J. Intell. Fuzzy Syst., 2021
Connectionist temporal classification loss for vector quantized variational autoencoder in zero-shot voice conversion.
Digit. Signal Process., 2021
Approaches to Improving Recognition of Underrepresented Named Entities in Hybrid ASR Systems.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
End-to-End Speech Separation Using Orthogonal Representation in Complex and Real Time-Frequency Domain.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Encoder-Decoder Based Pitch Tracking and Joint Model Training for Mandarin Tone Classification.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
2020
Using Deep Time Delay Neural Network for Slot Filling in Spoken Language Understanding.
Symmetry, 2020
Monaural Singing Voice and Accompaniment Separation Based on Gated Nested U-Net Architecture.
Symmetry, 2020
Enriching Under-Represented Named-Entities To Improve Speech Recognition Performance.
CoRR, 2020
A multilingual approach to joint Speech and Accent Recognition with DNN-HMM framework.
CoRR, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Monolingual Data Selection Analysis for English-Mandarin Hybrid Code-Switching Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
2017
2016
Semi-Supervised and Cross-Lingual Knowledge Transfer Learnings for DNN Hybrid Acoustic Models Under Low-Resource Conditions.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Monaural Singing Voice Separation by Non-negative Matrix Partial Co-Factorization with Temporal Continuity and Sparsity Criteria.
Proceedings of the Intelligent Computing Methodologies - 12th International Conference, 2016
I-vector based deep neural network acoustic model adaptation using multilingual language resource.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
2015
Maximum F1-Score Discriminative Training Criterion for Automatic Mispronunciation Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2015
2009
Inf. Sci., 2009