Yu Wang

Orcid: 0000-0001-9500-081X

Affiliations:
  • Shanghai Jiao Tong University, Cooperative Medianet Innovation Center, China
  • University of Cambridge, Department of Engineering, UK
  • Imperial College London, Speech and Audio Processing Group, UK (PhD 2015)


According to our database, Yu Wang authored at least 63 papers between 2013 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
Leveraging Diverse Modeling Contexts With Collaborating Learning for Neural Machine Translation.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

DialogMCF: Multimodal Context Flow for Audio Visual Scene-Aware Dialog.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

ReflecTool: Towards Reflection-Aware Tool-Augmented Clinical Agents.
CoRR, 2024

HSDreport: Heart Sound Diagnosis with Echocardiography Reports.
CoRR, 2024

MedCare: Advancing Medical LLMs through Decoupling Clinical Alignment and Knowledge Aggregation.
CoRR, 2024

TAIA: Large Language Models are Out-of-Distribution Data Learners.
CoRR, 2024

MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts.
CoRR, 2024

M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset.
CoRR, 2024

Automatic Interactive Evaluation for Large Language Models with State Aware Patient Simulator.
CoRR, 2024

Post-decoder Biasing for End-to-End Speech Recognition of Multi-turn Medical Interview.
CoRR, 2024

M2K-VDG: Model-Adaptive Multimodal Knowledge Anchor Enhanced Video-grounded Dialogue Generation.
CoRR, 2024

MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in Perception.
CoRR, 2024

Annotation-free Audio-Visual Segmentation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

MSG-BART: Multi-Granularity Scene Graph-Enhanced Encoder-Decoder Language Model for Video-Grounded Dialogue Generation.
Proceedings of the IEEE International Conference on Acoustics, 2024

RA2FD: Distilling Faithfulness into Efficient Dialogue Systems.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

MedCare: Advancing Medical LLMs through Decoupling Clinical Alignment and Knowledge Aggregation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

CE-VDG: Counterfactual Entropy-based Bias Reduction for Video-grounded Dialogue Generation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

SDA: Semantic Discrepancy Alignment for Text-conditioned Image Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

DictLLM: Harnessing Key-Value Data Structures with Large Language Models for Enhanced Medical Diagnostics.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Self-Supervised Masking for Unsupervised Anomaly Detection and Localization.
IEEE Trans. Multim., 2023

Redundancy-Adaptive Multimodal Learning for Imperfect Data.
CoRR, 2023

Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning.
CoRR, 2023

An Automatic Evaluation Framework for Multi-turn Medical Consultations Capabilities of Large Language Models.
CoRR, 2023

LibriSQA: Advancing Free-form and Open-ended Spoken Question Answering with a Novel Dataset and Framework.
CoRR, 2023

Audio-aware Query-enhanced Transformer for Audio-Visual Segmentation.
CoRR, 2023

SelfEvolve: A Code Evolution Framework via Large Language Models.
CoRR, 2023

DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery.
CoRR, 2023

Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition.
CoRR, 2023

Uncertainty-Guided End-to-End Audio-Visual Speaker Diarization for Far-Field Recordings.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Contrastive Learning Based ASR Robust Knowledge Selection For Spoken Dialogue System.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Self-Improvement of Non-autoregressive Model via Sequence-Level Distillation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Enhanced Multimodal Representation Learning with Cross-modal KD.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Unsupervised Ensemble Distillation for Multi-Organ Segmentation.
Proceedings of the 19th IEEE International Symposium on Biomedical Imaging, 2022

Multi-level Fusion of Wav2vec 2.0 and BERT for Multimodal Emotion Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

LAR-SR: A Local Autoregressive Model for Image Super-Resolution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Efficient Use of End-to-End Data in Spoken Language Processing.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Spoken Language 'Grammatical Error Correction'.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Non-Native Children's Automatic Speech Recognition: The INTERSPEECH 2020 Shared Task ALTA Systems.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019
General Sequence Teacher-Student Learning.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Non-native Speaker Verification for Spoken Language Assessment.
CoRR, 2019

Disfluency Detection for Spoken Learner English.
Proceedings of the 8th ISCA International Workshop on Speech and Language Technology in Education, 2019

Impact of ASR Performance on Spoken Grammatical Error Detection.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Non-Intrusive POLQA Estimation of Speech Quality using Recurrent Neural Networks.
Proceedings of the 27th European Signal Processing Conference, 2019

Learning Between Different Teacher and Student Models in ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Model-Based Speech Enhancement in the Modulation Domain.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Towards automatic assessment of spontaneous spoken English.
Speech Commun., 2018

Confidence Estimation and Deletion Prediction Using Bidirectional Recurrent Neural Networks.
CoRR, 2018

Sequence Teacher-Student Training of Acoustic Models for Automatic Free Speaking Language Assessment.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Speaker Adaptation and Adaptive Training for Jointly Optimised Tandem Systems.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Impact of ASR Performance on Free Speaking Language Assessment.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Future Word Contexts in Neural Network Language Models.
CoRR, 2017

An attention based model for off-topic spontaneous spoken response detection: An Initial Study.
Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

Use of Graphemic Lexicons for Spoken Language Assessment.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016
A data-driven non-intrusive measure of speech quality and intelligibility.
Speech Commun., 2016

Speech enhancement using an MMSE spectral amplitude estimator based on a modulation domain Kalman filter with a Gamma prior.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Off-topic Response Detection for Spontaneous Spoken English Assessment.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2014
Speech enhancement using a modulation domain Kalman filter post-processor with a Gaussian Mixture noise model.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Speech enhancement using a robust Kalman filter post-processor in the modulation domain.
Proceedings of the IEEE International Conference on Acoustics, 2013

A subspace method for speech enhancement in the modulation domain.
Proceedings of the 21st European Signal Processing Conference, 2013

