Yang Ai

Orcid: 0009-0006-0157-4980

According to our database, Yang Ai authored at least 49 papers between 2009 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Low-Latency Neural Speech Phase Prediction Based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

APCodec: A Neural Audio Codec With Parallel Amplitude and Phase Spectrum Encoding and Decoding.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

MDCTCodec: A Lightweight MDCT-based Neural Audio Codec towards High Sampling Rate and Low Bitrate Scenarios.
CoRR, 2024

APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm.
CoRR, 2024

Stage-Wise and Prior-Aware Neural Speech Phase Prediction.
CoRR, 2024

Refining Self-Supervised Learnt Speech Representation using Brain Activations.
CoRR, 2024

Multi-Stage Speech Bandwidth Extension with Flexible Sampling Rate Control.
CoRR, 2024

BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation.
CoRR, 2024

Voice Attribute Editing with Text Prompt.
CoRR, 2024

Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction.
CoRR, 2024

Speech Reconstruction from Silent Lip and Tongue Articulation by Diffusion Models and Text-Guided Pseudo Target Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Considering Temporal Connection between Turns for Conversational Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Long-Frame-Shift Neural Speech Phase Prediction With Spectral Continuity Enhancement and Interpolation Error Compensation.
IEEE Signal Process. Lett., 2023

A Dynamic Network for Efficient Point Cloud Registration.
CoRR, 2023

Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement.
CoRR, 2023

Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis.
CoRR, 2023

Face-Driven Zero-Shot Voice Conversion with Memory-based Face-Voice Alignment.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

CMIR: A Unified Cross-Modality Framework for Preoperative Accurate Prediction of Microvascular Invasion in Hepatocellular Carcinoma.
Proceedings of the MEDINFO 2023 - The Future Is Accessible, 2023

Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Speech Reconstruction from Silent Tongue and Lip Articulation by Pseudo Target Generation and Domain Adversarial Training.
Proceedings of the IEEE International Conference on Acoustics, 2023

Zero-Shot Personalized Lip-To-Speech Synthesis with Face Image Based Voice Control.
Proceedings of the IEEE International Conference on Acoustics, 2023

Neural Speech Phase Prediction Based on Parallel Estimation Architecture and Anti-Wrapping Losses.
Proceedings of the IEEE International Conference on Acoustics, 2023

A Self-Attention Based Fusion Model of Radiomics and Deep Features for Early Recurrence Prediction in NSCLC.
Proceedings of the 12th IEEE Global Conference on Consumer Electronics, 2023

MVI-Wise GAN: Synthetic MRI to Improve Microvascular Invasion Prediction in Hepatocellular Carcinoma.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

Vision-Guided Attention-Enhanced Network for Predicting Microvascular Invasion in Hepatocellular Carcinoma.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

The USTC-NERCSLIP System for the Track 1.2 of Audio Deepfake Detection (ADD 2023) Challenge.
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32th International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023

2022
Denoising-and-Dereverberation Hierarchical Neural Vocoder for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

A robust encryption watermarking algorithm for medical images based on ridgelet-DCT and THM double chaos.
J. Cloud Comput., 2022

Residual Multilayer Perceptrons for Genotype-Guided Recurrence Prediction of Non-Small Cell Lung Cancer.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

2021
BDDR: An Effective Defense Against Textual Backdoor Attacks.
Comput. Secur., 2021

Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Phase Spectrum Recovery for Enhancing Low-Quality Speech Captured by Laser Microphones.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

2020
A Neural Vocoder With Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Robust Watermarking Algorithm for Medical Volume Data in Internet of Medical Things.
IEEE Access, 2020

Reverberation Modeling for Source-Filter-Based Neural Vocoder.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Online Speaker Adaptation for WaveNet-based Neural Vocoders.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019
Zero-Watermarking Algorithm for Medical Images Based on Dual-Tree Complex Wavelet Transform and Discrete Cosine Transform.
J. Medical Imaging Health Informatics, 2019

Singing Voice Synthesis Using Deep Autoregressive Neural Networks for Acoustic Modeling.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

DNN-based Spectral Enhancement for Neural Waveform Generators with Low-bit Quantization.
Proceedings of the IEEE International Conference on Acoustics, 2019

The USTC System for Blizzard Challenge 2019.
Proceedings of the Blizzard Challenge 2019, Vienna, Austria, September 23, 2019, 2019

2018
Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

SampleRNN-Based Neural Vocoder for Statistical Parametric Speech Synthesis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2010
An Ontology-Based Platform for Scientific Writing and Publishing.
Proceedings of the Future Generation Information Technology, 2010

2009
Computing Minimal Diagnosis with Binary Decision Diagrams Algorithm.
Proceedings of the Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009

