Xinhao Mei

Orcid: 0000-0001-6079-5130

According to our database, Xinhao Mei authored at least 23 papers between 2021 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Towards Generating Diverse Audio Captions via Adversarial Training.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

AudioLDM 2: Learning Holistic Audio Generation With Self-Supervised Pretraining.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

First-Shot Unsupervised Anomalous Sound Detection with Unknown Anomalies Estimated by Metadata-Assisted Audio Generation.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
FoleyGen: Visually-Guided Audio Generation.
CoRR, 2023

Enhance audio generation controllability through representation similarity regularization.
CoRR, 2023

Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Ontology-aware Learning and Evaluation for Audio Tagging.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

AudioLDM: Text-to-Audio Generation with Latent Diffusion Models.
Proceedings of the International Conference on Machine Learning, 2023

Simple Pooling Front-Ends for Efficient Audio Classification.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Automated audio captioning: an overview of recent progress and new challenges.
EURASIP J. Audio Speech Music. Process., 2022

Automated Audio Captioning via Fusion of Low- and High- Dimensional Features.
CoRR, 2022

Surrey System for DCASE 2022 Task 5: Few-shot Bioacoustic Event Detection with Segment-level Metric Learning.
CoRR, 2022

On Metric Learning for Audio-Text Cross-Modal Retrieval.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Separate What You Describe: Language-Queried Audio Source Separation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Diverse Audio Captioning Via Adversarial Training.
Proceedings of the IEEE International Conference on Acoustics, 2022

Deep Neural Decision Forest for Acoustic Scene Classification.
Proceedings of the 30th European Signal Processing Conference, 2022

Leveraging Pre-trained BERT for Audio Captioning.
Proceedings of the 30th European Signal Processing Conference, 2022

Segment-Level Metric Learning for Few-Shot Bioacoustic Event Detection.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

2021
Audio Captioning Transformer.
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

An Encoder-Decoder Based Audio Captioning System with Transfer and Reinforcement Learning.
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

CL4AC: A Contrastive Loss for Audio Captioning.
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

