Xubo Liu

Orcid: 0000-0002-9643-2099

According to our database, Xubo Liu authored at least 63 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Towards Generating Diverse Audio Captions via Adversarial Training.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

AudioLDM 2: Learning Holistic Audio Generation With Self-Supervised Pretraining.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Learning Source Disentanglement in Neural Audio Codec.
CoRR, 2024

FlowSep: Language-Queried Sound Separation with Rectified Flow Matching.
CoRR, 2024

A Reference-free Metric for Language-Queried Audio Source Separation using Contrastive Language-Audio Pretraining.
CoRR, 2024

Improving Audio Generation with Visual Enhanced Caption.
CoRR, 2024

Fish Tracking, Counting, and Behaviour Analysis in Digital Aquaculture: A Comprehensive Review.
CoRR, 2024

ComposerX: Multi-Agent Symbolic Music Composition with LLMs.
CoRR, 2024

Tracking-forced Referring Video Object Segmentation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining.
Proceedings of the 34th IEEE International Workshop on Machine Learning for Signal Processing, 2024

First-Shot Unsupervised Anomalous Sound Detection with Unknown Anomalies Estimated by Metadata-Assisted Audio Generation.
Proceedings of the IEEE International Conference on Acoustics, 2024

Retrieval-Augmented Text-to-Audio Generation.
Proceedings of the IEEE International Conference on Acoustics, 2024

Audio Prompt Tuning for Universal Sound Separation.
Proceedings of the IEEE International Conference on Acoustics, 2024

CM-PIE: Cross-Modal Perception for Interactive-Enhanced Audio-Visual Video Parsing.
Proceedings of the IEEE International Conference on Acoustics, 2024

Universal Sound Separation with Self-Supervised Audio Masked Autoencoder.
Proceedings of the 32nd European Signal Processing Conference, 2024

Look before You Leap: Dual Logical Verification for Knowledge-based Visual Question Generation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Selective Prompting Tuning for Personalized Conversations with LLMs.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Learning Temporal Resolution in Spectrogram for Audio Classification.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Synth-AC: Enhancing Audio Captioning with Synthetic Supervision.
CoRR, 2023

Multimodal Fish Feeding Intensity Assessment in Aquaculture.
CoRR, 2023

Separate Anything You Describe.
CoRR, 2023

WavJourney: Compositional Audio Creation with Large Language Models.
CoRR, 2023

Text-Driven Foley Sound Generation With Latent Diffusion Model.
CoRR, 2023

Latent Diffusion Model Based Foley Sound Generation System For DCASE Challenge 2023 Task 7.
CoRR, 2023

SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision.
CoRR, 2023

Leveraging Pre-trained AudioLDM for Text to Sound Generation: A Benchmark Study.
CoRR, 2023

Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Ontology-aware Learning and Evaluation for Audio Tagging.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Adapting Language-Audio Models as Few-Shot Audio Learners.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

AudioLDM: Text-to-Audio Generation with Latent Diffusion Models.
Proceedings of the International Conference on Machine Learning, 2023

Simple Pooling Front-Ends for Efficient Audio Classification.
Proceedings of the IEEE International Conference on Acoustics, 2023

Leveraging Pre-Trained AudioLDM for Sound Generation: A Benchmark Study.
Proceedings of the 31st European Signal Processing Conference, 2023

Knowledge Distillation for Efficient Audio-Visual Video Captioning.
Proceedings of the 31st European Signal Processing Conference, 2023

Learning Retrieval Augmentation for Personalized Dialogue Generation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Personalized Dialogue Generation with Persona-Adaptive Attention.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Automated audio captioning: an overview of recent progress and new challenges.
EURASIP J. Audio Speech Music. Process., 2022

Automated Audio Captioning via Fusion of Low- and High- Dimensional Features.
CoRR, 2022

Learning the Spectrogram Temporal Resolution for Audio Classification.
CoRR, 2022

Low-complexity CNNs for Acoustic Scene Classification.
CoRR, 2022

Surrey System for DCASE 2022 Task 5: Few-shot Bioacoustic Event Detection with Segment-level Metric Learning.
CoRR, 2022

Continual Learning For On-Device Environmental Sound Classification.
CoRR, 2022

Fish Feeding Intensity Assessment in Aquaculture: A New Audio Dataset AFFIA3K and a Deep Learning Algorithm.
Proceedings of the 32nd IEEE International Workshop on Machine Learning for Signal Processing, 2022

Audio Visual Multi-Speaker Tracking with Improved GCF and PMBM Filter.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

On Metric Learning for Audio-Text Cross-Modal Retrieval.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Separate What You Describe: Language-Queried Audio Source Separation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Neural Vocoder is All You Need for Speech Super-resolution.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Audio-Visual Tracking of Multiple Speakers Via a PMBM Filter.
Proceedings of the IEEE International Conference on Acoustics, 2022

Diverse Audio Captioning Via Adversarial Training.
Proceedings of the IEEE International Conference on Acoustics, 2022

Path Planning based on Astar Algorithm in Automatic Driving.
Proceedings of the 6th International Conference on Algorithms, Computing and Systems, 2022

Visually Assisted Self-supervised Audio Speaker Localization and Tracking.
Proceedings of the 30th European Signal Processing Conference, 2022

Deep Neural Decision Forest for Acoustic Scene Classification.
Proceedings of the 30th European Signal Processing Conference, 2022

Leveraging Pre-trained BERT for Audio Captioning.
Proceedings of the 30th European Signal Processing Conference, 2022

Continual Learning for On-Device Environmental Sound Classification.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

Segment-Level Metric Learning for Few-Shot Bioacoustic Event Detection.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

2021
Conditional Sound Generation Using Neural Discrete Time-Frequency Representation Learning.
Proceedings of the 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP), 2021

Token-Level Supervised Contrastive Learning for Punctuation Restoration.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Audio Captioning Transformer.
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

An Encoder-Decoder Based Audio Captioning System with Transfer and Reinforcement Learning.
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

CL4AC: A Contrastive Loss for Audio Captioning.
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

2019
Altitude Control for Variable Load Quadrotor via Learning Rate Based Robust Sliding Mode Controller.
IEEE Access, 2019

