Tao Wang

Orcid: 0000-0001-9002-2630

Affiliations:
  • Chinese Academy of Sciences, National Laboratory of Pattern Recognition, Institute of Automation, Beijing, China
  • University of Chinese Academy of Sciences, School of Artificial Intelligence, Beijing, China


According to our database, Tao Wang authored at least 48 papers between 2009 and 2024.

Bibliography

2024
Assessing growth potential of careers with occupational mobility network and ensemble framework.
Eng. Appl. Artif. Intell., January, 2024

CFAD: A Chinese dataset for fake audio detection.
Speech Commun., 2024

WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification.
CoRR, 2024

DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech.
CoRR, 2024

Text Prompt is Not Enough: Sound Event Enhanced Prompt Adapter for Target Style Audio Generation.
CoRR, 2024

VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing.
CoRR, 2024

ICAGC 2024: Inspirational and Convincing Audio Generation Challenge 2024.
CoRR, 2024

ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation.
CoRR, 2024

MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation.
CoRR, 2024

TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking.
CoRR, 2024

PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation.
CoRR, 2024

Emotion selectable end-to-end text-based speech editing.
Artif. Intell., 2024

Fewer-Token Neural Speech Codec with Time-Invariant Codes.
Proceedings of the IEEE International Conference on Acoustics, 2024

Learning Speech Representation from Contrastive Token-Acoustic Pretraining.
Proceedings of the IEEE International Conference on Acoustics, 2024

Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Adversarial Representation Mechanism Learning for Network Embedding.
IEEE Trans. Knowl. Data Eng., 2023

Amer: A New Attribute-Missing Network Embedding Approach.
IEEE Trans. Cybern., 2023

Adversarial Multi-Task Learning for Mandarin Prosodic Boundary Prediction With Multi-Modal Embeddings.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Fewer-token Neural Speech Codec with Time-invariant Codes.
CoRR, 2023

Controllable Residual Speaker Representation for Voice Conversion.
CoRR, 2023

Boosting Fast and High-Quality Speech Synthesis with Linear Diffusion.
CoRR, 2023

UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion.
CoRR, 2023

Slow-Fast Time Parameter Aggregation Network for Class-Incremental Lip Reading.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

ADD 2023: the Second Audio Deepfake Detection Challenge.
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with the 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023

The VIBVG Speech Synthesis System for Blizzard Challenge 2023.
Proceedings of the 18th Blizzard Challenge Workshop, Grenoble, France, August 29, 2023

2022
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

EmoFake: An Initial Dataset for Emotion Fake Audio Detection.
CoRR, 2022

ADD 2022: the First Audio Deep Synthesis Detection Challenge.
CoRR, 2022

An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio.
DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, 2022

Singing-Tacotron: Global Duration Control Attention and Dynamic Filter for End-to-end Singing Voice Synthesis.
DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, 2022

ADD 2022: the first Audio Deep Synthesis Detection Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022

Context-Aware Mask Prediction Network for End-to-End Text-Based Speech Editing.
Proceedings of the IEEE International Conference on Acoustics, 2022

Powerful Graph Convolutional Networks with Adaptive Propagation Mechanism for Homophily and Heterophily.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Powerful Graph Convolutional Networks with Adaptive Propagation Mechanism for Homophily and Heterophily.
CoRR, 2021

Half-Truth: A Partially Fake Audio Detection Dataset.
CoRR, 2021

Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Half-Truth: A Partially Fake Audio Detection Dataset.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Prosody and Voice Factorization for Few-Shot Speaker Adaptation in the Challenge M2voc 2021.
Proceedings of the IEEE International Conference on Acoustics, 2021

Bi-Level Style and Prosody Decoupling Modeling for Personalized End-to-End Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Spoken Content and Voice Factorization for Few-Shot Speaker Adaptation.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Bi-Level Speaker Supervision for One-Shot Speech Synthesis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Non-Autoregressive End-to-End TTS with Coarse-to-Fine Decoding.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Dynamic Speaker Representations Adjustment and Decoder Factorization for Speaker Adaptation in End-to-End Speech Synthesis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Focusing on Attention: Prosody Transfer and Adaptative Optimization Strategy for Multi-Speaker End-to-End Speech Synthesis.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

The NLPR Speech Synthesis entry for Blizzard Challenge 2020.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020

2009
Building the Semantic Relations-Based Web Services Registry through Services Mining.
Proceedings of the 8th IEEE/ACIS International Conference on Computer and Information Science, 2009

