Ruibo Fu

Orcid: 0000-0001-9598-1881

According to our database1, Ruibo Fu authored at least 70 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number2 of four.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

On csauthors.net:

Bibliography

2024
Dual-Branch Knowledge Distillation for Noise-Robust Synthetic Speech Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

CFAD: A Chinese dataset for fake audio detection.
Speech Commun., 2024

SceneFake: An initial dataset and benchmarks for scene fake audio detection.
Pattern Recognit., 2024

Mixture of Experts Fusion for Fake Audio Detection Using Frozen wav2vec 2.0.
CoRR, 2024

DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech.
CoRR, 2024

Towards Diverse and Efficient Audio Captioning via Diffusion Models.
CoRR, 2024

Text Prompt is Not Enough: Sound Event Enhanced Prompt Adapter for Target Style Audio Generation.
CoRR, 2024

Exploring the Role of Audio in Multimodal Misinformation Detection.
CoRR, 2024

Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio?
CoRR, 2024

EELE: Exploring Efficient and Extensible LoRA Integration in Emotional Text-to-Speech.
CoRR, 2024

A Noval Feature via Color Quantisation for Fake Audio Detection.
CoRR, 2024

Temporal Variability and Multi-Viewed Self-Supervised Representations to Tackle the ASVspoof5 Deepfake Challenge.
CoRR, 2024

VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing.
CoRR, 2024

MDPE: A Multimodal Deception Dataset with Personality and Emotional Characteristics.
CoRR, 2024

ICAGC 2024: Inspirational and Convincing Audio Generation Challenge 2024.
CoRR, 2024

ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation.
CoRR, 2024

Fake News Detection and Manipulation Reasoning via Large Vision-Language Models.
CoRR, 2024

A multi-speaker multi-lingual voice cloning system based on vits2 for limmits 2024 challenge.
CoRR, 2024

MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation.
CoRR, 2024

Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio.
CoRR, 2024

PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation.
CoRR, 2024

Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection.
CoRR, 2024

Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy.
CoRR, 2024

Generalized Fake Audio Detection via Deep Stable Learning.
CoRR, 2024

The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio.
CoRR, 2024

Emotion selectable end-to-end text-based speech editing.
Artif. Intell., 2024

Learning Speech Representation from Contrastive Token-Acoustic Pretraining.
Proceedings of the IEEE International Conference on Acoustics, 2024

Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era.
Proc. IEEE, October, 2023

Adversarial Multi-Task Learning for Mandarin Prosodic Boundary Prediction With Multi-Modal Embeddings.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Learning to Behave Like Clean Speech: Dual-Branch Knowledge Distillation for Noise-Robust Fake Audio Detection.
CoRR, 2023

Low-rank Adaptation Method for Wav2vec2-based Fake Audio Detection.
CoRR, 2023

Adaptive Fake Audio Detection with Low-Rank Model Squeezing.
CoRR, 2023

TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection.
CoRR, 2023

UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion.
CoRR, 2023

TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Adaptive Fake Audio Detection with Low-Rank Model Squeezing.
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32th International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023

ADD 2023: the Second Audio Deepfake Detection Challenge.
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32th International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023

Low-rank Adaptation Method for Wav2vec2-based Fake Audio Detection.
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32th International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023

The VIBVG Speech Synthesis System for Blizzard Challenge 2023.
Proceedings of the 18th Blizzard Challenge Workshop, Grenoble, France, August 29, 2023, 2023

2022
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection.
CoRR, 2022

System Fingerprints Detection for DeepFake Audio: An Initial Dataset and Investigation.
CoRR, 2022

ADD 2022: the First Audio Deep Synthesis Detection Challenge.
CoRR, 2022

An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio.
Proceedings of the DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, 2022

Fully Automated End-to-End Fake Audio Detection.
Proceedings of the DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, 2022

Singing-Tacotron: Global Duration Control Attention and Dynamic Filter for End-to-end Singing Voice Synthesis.
Proceedings of the DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, 2022

DDAM '22: 1st International Workshop on Deepfake Detection for Audio Multimedia.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

ADD 2022: the first Audio Deep Synthesis Detection Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022

Context-Aware Mask Prediction Network for End-to-End Text-Based Speech Editing.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Half-Truth: A Partially Fake Audio Detection Dataset.
CoRR, 2021

Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Half-Truth: A Partially Fake Audio Detection Dataset.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Patnet : A Phoneme-Level Autoregressive Transformer Network for Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2021

Prosody and Voice Factorization for Few-Shot Speaker Adaptation in the Challenge M2voc 2021.
Proceedings of the IEEE International Conference on Acoustics, 2021

Bi-Level Style and Prosody Decoupling Modeling for Personalized End-to-End Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Spoken Content and Voice Factorization for Few-Shot Speaker Adaptation.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Bi-Level Speaker Supervision for One-Shot Speech Synthesis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Non-Autoregressive End-to-End TTS with Coarse-to-Fine Decoding.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Dynamic Speaker Representations Adjustment and Decoder Factorization for Speaker Adaptation in End-to-End Speech Synthesis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Focusing on Attention: Prosody Transfer and Adaptative Optimization Strategy for Multi-Speaker End-to-End Speech Synthesis.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

The NLPR Speech Synthesis entry for Blizzard Challenge 2020.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020

2019
Phoneme Dependent Speaker Embedding and Model Factorization for Multi-speaker Speech Synthesis and Adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2019

The NLPR Speech Synthesis entry for Blizzard Challenge 2019.
Proceedings of the Blizzard Challenge 2019, Vienna, Austria, September 23, 2019, 2019

2018
On the Application and Compression of Deep Time Delay Neural Network for Embedded Statistical Parametric Speech Synthesis.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Transfer Learning Based Progressive Neural Networks for Acoustic Modeling in Statistical Parametric Speech Synthesis.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017
The NLPR Speech Synthesis entry for Blizzard Challenge 2017.
Proceedings of the Blizzard Challenge 2017, Stockholm, Sweden, August 25, 2017, 2017


  Loading...