Mingxing Xu

Orcid: 0000-0003-2883-1802

According to our database, Mingxing Xu authored at least 93 papers between 1997 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.





On csauthors.net:


Rapid monitoring of the spatial distribution of soil organic matter using unmanned aerial vehicle imaging spectroscopy.
Ann. GIS, July, 2024

Nonlinear control strategies for 3-DOF control moment gyroscope using deep reinforcement learning.
Neural Comput. Appl., April, 2024

Whisper-PMFA: Partial Multi-Scale Feature Aggregation for Speaker Verification using Whisper Models.
CoRR, 2024

A Joint Noise Disentanglement and Adversarial Training Framework for Robust Speaker Verification.
CoRR, 2024

Speaker Adaptation for Quantised End-to-End ASR Models.
CoRR, 2024

SAML: Speaker Adaptive Mixture of LoRA Experts for End-to-End ASR.
CoRR, 2024

Enhancing Quantised End-to-End ASR Models Via Personalisation.
Proceedings of the IEEE International Conference on Acoustics, 2024

Automatic Representative Frame Selection and Intrathoracic Lymph Node Diagnosis With Endobronchial Ultrasound Elastography Videos.
IEEE J. Biomed. Health Informatics, 2023

DialoguePCN: Perception and Cognition Network for Emotion Recognition in Conversations.
IEEE Access, 2023

Robust Point Cloud Classification With Permutohedral Lattice-based Representation.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2023

Graph Neural Networks With Lifting-Based Adaptive Graph Wavelets.
IEEE Trans. Signal Inf. Process. over Networks, 2022

Surrogate modeling for spacecraft thermophysical models using deep learning.
Neural Comput. Appl., 2022

Hierarchical Spherical CNNs with Lifting-based Adaptive Wavelets for Pooling and Unpooling.
CoRR, 2022

LiftPool: Lifting-based Graph Pooling for Hierarchical Graph Representation Learning.
CoRR, 2022

How Health-Related Misinformation Spreads Across the Internet: Evidence for the "Typhoon Eye" Effect.
Cyberpsychology Behav. Soc. Netw., 2022

Spectral Graph Convolutional Networks With Lifting-based Adaptive Graph Wavelets.
CoRR, 2021

Cross-Database Replay Detection in Terminal-Dependent Speaker Verification.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Learning from Multiple Noisy Augmented Data Sets for Better Cross-Lingual Spoken Language Understanding.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Depth Estimation From Light Field Using Graph-Based Structure-Aware Analysis.
IEEE Trans. Circuits Syst. Video Technol., 2020

Spatial-Temporal Transformer Networks for Traffic Flow Forecasting.
CoRR, 2020

Guoym at SemEval-2020 Task 8: Ensemble-based Classification of Visuo-Lingual Metaphor in Memes.
Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

FERNet: Fine-grained Extraction and Reasoning Network for Emotion Recognition in Dialogues.
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020

Structure-Aware Graph Construction For Point Cloud Segmentation With Graph Convolutional Networks.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

用于票房收益预测的国产电影信息数据库 (Database of Chinese Domestic Films for Box-office Revenue Forecasting).
计算机科学, 2019

THU-HCSI at SemEval-2019 Task 3: Hierarchical Ensemble Classification of Contextual Emotion in Conversation.
Proceedings of the 13th International Workshop on Semantic Evaluation, 2019

MSGCNN: Multi-scale Graph Convolutional Neural Network for Point Cloud Segmentation.
Proceedings of the Fifth IEEE International Conference on Multimedia Big Data, 2019

Replay detection using CQT-based modified group delay feature and ResNeWt network in ASVspoof 2019.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Multi-Scale Convolutional Recurrent Neural Network with Ensemble Method for Weakly Labeled Sound Event Detection.
Proceedings of the 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos, 2019

THUHCSI in MediaEval 2018 Emotional Impact of Movies Task.
Proceedings of the Working Notes Proceedings of the MediaEval 2018 Workshop, 2018

Imbalance Learning-based Framework for Fear Recognition in the MediaEval Emotional Impact of Movies Task.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Emotion Recognition from Variable-Length Speech Segments Using Deep Learning on Spectrograms.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Multi-scale convolutional recurrent neural network with ensemble method for weakly labeled sound event detection.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2018

MMANN: Multimodal Multilevel Attention Neural Network for Horror Clip Detection.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Multi-modal Multi-scale Speech Expression Evaluation in Computer-Assisted Language Learning.
Proceedings of the Artificial Intelligence and Mobile Services - AIMS 2018, 2018

Multi-scale Context Based Attention for Dynamic Music Emotion Prediction.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

THUHCSI in MediaEval 2017 Emotional Impact of Movies Task.
Proceedings of the Working Notes Proceedings of the MediaEval 2017 Workshop co-located with the Conference and Labs of the Evaluation Forum (CLEF 2017), 2017

Speech Emotion Recognition with Emotion-Pair Based Framework Considering Emotion Distribution Information in Dimensional Emotion Space.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Speaker segmentation using deep speaker vectors for fast speaker change scenarios.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Learning cross-lingual knowledge with multilingual BLSTM for emphasis detection with limited training data.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Study on Feature Subspace of Archetypal Emotions for Speech Emotion Recognition.
CoRR, 2016

THU-HCSI at MediaEval 2016: Emotional Impact of Movies Task.
Proceedings of the Working Notes Proceedings of the MediaEval 2016 Workshop, 2016

Analysis on Gated Recurrent Unit Based Question Detection Approach.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Combining CNN and BLSTM to Extract Textual and Acoustic Features for Recognizing Stances in Mandarin Ideological Debate Competition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Heterogeneity-entropy based unsupervised feature learning for personality prediction with cross-media data.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2016

Recognizing stances in Mandarin social ideological debates with text and acoustic features.
Proceedings of the 2016 IEEE International Conference on Multimedia & Expo Workshops, 2016

DBLSTM-based multi-scale fusion for dynamic emotion prediction in music.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2016

SVR based double-scale regression for dynamic emotion prediction in music.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Question detection from acoustic features using recurrent neural network with gated recurrent unit.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

A deep bidirectional long short-term memory based multi-scale approach for music dynamic emotion prediction.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Relative entropy normalized Gaussian supervector for speech emotion recognition using kernel extreme learning machine.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Chinese Traditional Opera database for Music Genre Recognition.
Proceedings of the 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015

Multi-Scale Approaches to the MediaEval 2015 "Emotion in Music" Task.
Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Predictors of Pause Duration in Read-Aloud Discourse.
IEICE Trans. Inf. Syst., 2014

MediaEval 2014: THU-HCSIL Approach to Emotion in Music Task using Multi-level Regression.
Proceedings of the Working Notes Proceedings of the MediaEval 2014 Workshop, 2014

Improved keyword spotting system by optimizing posterior confidence measure vector using feed-forward neural network.
Proceedings of the 2014 International Joint Conference on Neural Networks, 2014

Intrinsic variation robust speaker verification based on sparse representation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Automatic Emotion Variation Detection in continuous speech.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Comparing feature dimension reduction algorithms for GMM-SVM based speech emotion recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Comparison of adaptation methods for GMM-SVM based speech emotion recognition.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Study on the effects of intrinsic variation using i-vectors in text-independent speaker verification.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Compensation of Intrinsic Variability with Factor Analysis Modeling for Robust Speaker Verification.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Energy classification-assisted fingerprint system for content-based audio copy detection.
Proceedings of the 9th International Conference on Communications, 2012

Proceedings of the TRECVID 2009 workshop participants notebook papers, 2009

ANN based decision fusion for speech emotion recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Shift Window Based Framework for Emotional Change Detection of Speech.
Proceedings of the Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009

Lyric-based Song Sentiment Classification with Sentiment Vector Space Model.
Proceedings of the ACL 2008, 2008

A Cohort-Based Speaker Model Synthesis for Mismatched Channels in Speaker Verification.
IEEE Trans. Speech Audio Process., 2007

Cohort-Based Speaker Model Synthesis for Channel Robust Speaker Recognition.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

A two-step keyword spotting method based on context-dependent a posteriori probability.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

Using word confidence measure for OOV words detection in a spontaneous spoken dialog system.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

An automatic speech recognition strategy directed by the semantic knowledge in dialogue system.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Study on the strategy for hierarchical speech recognition.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Comparison and combination of confidence measures in isolated word recognition.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Preparing for evaluation of a flight spoken dialogue system.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Study on detection of prosodic phrase boundaries in spontaneous speech.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Phoneagent: a conversational interface for telephone exchange system.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Study on framework for Chinese pronunciation variation modeling.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Improved Katz smoothing for language modeling in speech recognition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Robust parsing in spoken dialogue systems.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Topic Forest: a plan-based dialog management structure.
Proceedings of the IEEE International Conference on Acoustics, 2001

Intra-syllable Dependent Phonetic Modeling For Chinese Speech Recognition.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

A Noise Cancellation Method Based on Wavelet Transform.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Word-class Stochastic Model in A Spoken Language Dialogue System.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Acoustic Level Error Analysis in Continuous Speech Recognition.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Semi-continuous segmental probability modeling for continuous speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

An equivalent-class based MMI learning method for MGCPM.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Language understanding component for Chinese dialogue system.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

HarkMan - A vocabulary-independent keyword spotter for spontaneous Chinese speech.
J. Comput. Sci. Technol., 1999

Easytalk: a large-vocabulary speaker-independent Chinese dictation machine.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

A fast and effective state decoding algorithm.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

An effective scoring method for speaking skill evaluation system.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

A Vocabulary-Independent Keyword Spotter for Spontaneous Chinese Speech.
Proceedings of the 1998 International Symposium on Chinese Spoken Language Processing, 1998

Rejection in Speech Recognition Based on CDCPMs.
Proceedings of the 10th Research on Computational Linguistics International Conference, 1997
