Qing Wang

Orcid: 0000-0003-3843-3920

Affiliations:

University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, China

According to our database¹, Qing Wang authored at least 41 papers between 2014 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of five.

Bibliography

2025

An Experimental Study on Joint Modeling for Sound Event Localization and Detection with Source Distance Estimation.

CoRR, January, 2025

2024

Collaborative Viseme Subword and End-to-End Modeling for Word-Level Lip Reading.

IEEE Trans. Multim., 2024

A Variance-Preserving Interpolation Approach for Diffusion Models With Applications to Single Channel Speech Enhancement and Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Optimizing Audio-Visual Speech Enhancement Using Multi-Level Distortion Measures for Audio-Visual Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

See then Tell: Enhancing Key Information Extraction with Vision Grounding.

CoRR, 2024

SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding.

CoRR, 2024

Representation Learning Using Machine Attribute Information for Anomalous Sound Detection in Real Scenarios.

Proceedings of the International Joint Conference on Neural Networks, 2024

The NERCSLIP-USTC System for Semi-Supervised Acoustic Scene Classification of ICME 2024 Grand Challenge.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Maths: Multimodal Transformer-Based Human-Readable Solver.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

2023

Using iterative adaptation and dynamic mask for child speech extraction under real-world multilingual conditions.

Alejandrina Cristià

Speech Commun., July, 2023

A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

The NERCSLIP-USTC System for the L3DAS23 Challenge Task2: 3D Sound Event Localization and Detection (SELD).

Proceedings of the IEEE International Conference on Acoustics, 2023

Loss Function Design for DNN-Based Sound Event Localization and Detection on Low-Resource Realistic Data.

Proceedings of the IEEE International Conference on Acoustics, 2023

An Experimental Study on Sound Event Localization and Detection Under Realistic Testing Conditions.

Proceedings of the IEEE International Conference on Acoustics, 2023

Incorporating Lip Features into Audio-Visual Multi-Speaker DOA Estimation by Gated Fusion.

Proceedings of the IEEE International Conference on Acoustics, 2023

Improving Sound Event Localization and Detection with Class-Dependent Sound Separation for Real-World Scenarios.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

A Study on Joint Modeling and Data Augmentation of Multi-Modalities for Audio-Visual Scene Classification.

Chao-Han Huck Yang

Sabato Marco Siniscalchi

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Deep Segment Model for Acoustic Scene Classification.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Information Fusion in Attention Networks Using Adaptive and Multi-Level Factorized Bilinear Pooling for Audio-Visual Emotion Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust Neural Acoustic Scene Classification.

Chao-Han Huck Yang

Sabato Marco Siniscalchi

CoRR, 2021

A Model Ensemble Approach for Sound Event Localization and Detection.

Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Lightweight Causal Transformer with Local Self-Attention for Real-Time Speech Enhancement.

Koen Oostermeijer

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

MRD: A Memory Relation Decoder for Online Handwritten Mathematical Expression Recognition.

Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021

Speech Enhancement Autoencoder with Hierarchical Latent Structure.

Koen Oostermeijer

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

A Transformer-based Radical Analysis Network for Chinese Character Recognition.

Proceedings of the 25th International Conference on Pattern Recognition, 2020

Stroke Based Posterior Attention for Online Handwritten Mathematical Expression Recognition.

Proceedings of the 25th International Conference on Pattern Recognition, 2020

Geometry Constrained Progressive Learning for LSTM-Based Speech Enhancement.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Frequency Gating: Improved Convolutional Neural Networks for Speech Enhancement in the Time-Frequency Domain.

Koen Oostermeijer

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

A LSTM-Based Joint Progressive Learning Framework for Simultaneous Speech Dereverberation and Denoising.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

A Multiobjective Learning and Ensembling Approach to High-Performance Speech Enhancement With Compact Neural Network Architectures.

IEEE ACM Trans. Audio Speech Lang. Process., 2018

A Progressive Deep Learning Approach to Child Speech Separation.

Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

A Maximum Likelihood Approach to Masking-based Speech Enhancement Using Deep Neural Network.

Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

2017

An information fusion framework with multi-channel feature concatenation and multi-perspective system combination for the deep-learning-based robust recognition of microphone array speech.

Comput. Speech Lang., 2017

Joint noise and mask aware training for DNN-based speech enhancement with sub-band features.

Proceedings of the Hands-free Speech Communications and Microphone Arrays, 2017

2016

Boosting DNN-based speech enhancement via explicit transformations.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015

A universal VAD based on jointly trained deep neural networks.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

An information fusion approach to recognizing microphone array speech in the CHiME-3 challenge based on a deep learning framework.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Robust speech recognition with speech enhanced deep neural networks.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014