Ming Cheng

ORCID: 0000-0002-4733-3596

Affiliations:
  • Wuhan University, School of Computer Science, China
  • Duke Kunshan University, Suzhou Municipal Key Laboratory of Multimodal Intelligent Systems, China


According to our database, Ming Cheng authored at least 15 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Efficient Personal Voice Activity Detection with Wake Word Reference Speech.
Proceedings of the IEEE International Conference on Acoustics, 2024

Robust Wake Word Spotting With Frame-Level Cross-Modal Attention Based Audio-Visual Conformer.
Proceedings of the IEEE International Conference on Acoustics, 2024

Joint Inference of Speaker Diarization and ASR with Multi-Stage Information Sharing.
Proceedings of the IEEE International Conference on Acoustics, 2024

VoxBlink: A Large Scale Speaker Verification Dataset on Camera.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Computer-Aided Autism Spectrum Disorder Diagnosis With Behavior Signal Processing.
IEEE Trans. Affect. Comput., 2023

VoxBlink: X-Large Speaker Verification Dataset on Camera.
CoRR, 2023

Assessing the Social Skills of Children with Autism Spectrum Disorder via Language-Image Pre-training Models.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis.
Proceedings of the IEEE International Conference on Acoustics, 2023

Target-Speaker Voice Activity Detection Via Sequence-to-Sequence Prediction.
Proceedings of the IEEE International Conference on Acoustics, 2023

The WHU-Alibaba Audio-Visual Speaker Diarization System for the MISP 2022 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
The DKU Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
A Multimodal Dynamic Neural Network for Call for Help Recognition in Elevators.
Proceedings of the ICMI '21 Companion: Companion Publication of the 2021 International Conference on Multimodal Interaction, Montreal, QC, Canada, October 18, 2021

Cross-modal Assisted Training for Abnormal Event Recognition in Elevators.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

2020
Responsive Social Smile: A Machine Learning based Multimodal Behavior Assessment Framework towards Early Stage Autism Screening.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

RWF-2000: An Open Large Scale Video Database for Violence Detection.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

