Jiaen Liang

Orcid: 0009-0001-8309-1301

According to our database, Jiaen Liang authored at least 36 papers between 2006 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







Dual-model self-regularization and fusion for domain adaptation of robust speaker verification.
Speech Commun., November, 2023

M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis.
CoRR, 2023

MMT-GD: Multi-Modal Transformer with Graph Distillation for Cross-Cultural Humor Detection.
Proceedings of the 4th on Multimodal Sentiment Analysis Challenge and Workshop: Mimicked Emotions, 2023

Sliding Window Seq2seq Modeling for Engagement Estimation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Answer-Based Entity Extraction and Alignment for Visual Text Question Answering.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

M²-CTTS: End-to-End Multi-Scale Multi-Modal Conversational Text-to-Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2023

Acoustic domain mismatch compensation in bird audio detection.
Int. J. Speech Technol., 2022

Exploring single channel speech separation for short-time text-dependent speaker verification.
Int. J. Speech Technol., 2022

Joint framework with deep feature distillation and adaptive focal loss for weakly supervised audio tagging and acoustic event detection.
Digit. Signal Process., 2022

ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Selective Pseudo-labeling and Class-wise Discriminative Fusion for Sound Event Detection.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Joint Weakly Supervised AT and AED Using Deep Feature Distillation and Adaptive Focal Loss.
CoRR, 2021

Attention-Based Scaling Adaptation for Target Speech Extraction.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

CNN-based Discriminative Training for Domain Compensation in Acoustic Event Detection with Frame-wise Classifier.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Mask-based blind source separation and MVDR beamforming in ASR.
Int. J. Speech Technol., 2020

Attention-based scaling adaptation for target speech extraction.
CoRR, 2020

Self-and-Mixed Attention Decoder with Deep Acoustic Structure for Transformer-Based LVCSR.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Speech Driven Talking Head Generation via Attentional Landmarks Based Representation.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

The SHNU System for Blizzard Challenge 2020.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020

Speaker Direction-of-Arrival Estimation Based on Orthogonal Dipoles.
Circuits Syst. Signal Process., 2019

Active Learning for LF-MMI Trained Neural Networks in ASR.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Speaker Direction-of-Arrival Estimation Based on Frequency-Independent Beampattern.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Frequency-invariant differential microphone array design in the STFT domain.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Exploring nuisance attribute projection and score normalization for GLDS-SVM based automatic mispronunciation detection method.
Proceedings of the IEEE International Conference on Acoustics, 2011

Exploring goodness of prosody by diverse matching templates.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Automatic reference independent evaluation of prosody quality using multiple knowledge fusions.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

High performance automatic mispronunciation detection method based on neural network and TRAP features.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

An efficient mispronunciation detection method using GLDS-SVM and formant enhanced features.
Proceedings of the IEEE International Conference on Acoustics, 2009

Context Dependent Feature Based Bottom-up Rescoring SVM Classifier in Children's English Stress Mis-pronunciation Detection.
Proceedings of the 9th IEEE International Conference on Advanced Learning Technologies, 2009

Improving searching speed and accuracy of query by humming system based on three methods: feature fusion, candidates set reduction and multiple similarity measurement rescoring.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Music Genre Classification Based on Multiple Classifier Fusion.
Proceedings of the Fourth International Conference on Natural Computation, 2008

Improved phonotactic language identification using random forest language models.
Proceedings of the IEEE International Conference on Acoustics, 2008

A Novel Phone-State Matrix Based Vocabulary-Independent Keyword Spotting Method for Spontaneous Speech.
Proceedings of the IEEE International Conference on Acoustics, 2007

Full Utilization of Closed-captions in Broadcast News Recognition.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

An Improved Mandarin Keyword Spotting System Using MCE Training and Context-Enhanced Verification.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
