Yang Ai
Orcid: 0009-0006-0157-4980
According to our database, Yang Ai authored at least 49 papers between 2009 and 2024.
Bibliography
2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Low-Latency Neural Speech Phase Prediction Based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
APCodec: A Neural Audio Codec With Parallel Amplitude and Phase Spectrum Encoding and Decoding.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
MDCTCodec: A Lightweight MDCT-based Neural Audio Codec towards High Sampling Rate and Low Bitrate Scenarios.
CoRR, 2024
APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm.
CoRR, 2024
CoRR, 2024
CoRR, 2024
BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation.
CoRR, 2024
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction.
CoRR, 2024
Speech Reconstruction from Silent Lip and Tongue Articulation by Diffusion Models and Text-Guided Pseudo Target Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Long-Frame-Shift Neural Speech Phase Prediction With Spectral Continuity Enhancement and Interpolation Error Compensation.
IEEE Signal Process. Lett., 2023
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement.
CoRR, 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis.
CoRR, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
CMIR: A Unified Cross-Modality Framework for Preoperative Accurate Prediction of Microvascular Invasion in Hepatocellular Carcinoma.
Proceedings of the MEDINFO 2023 - The Future Is Accessible, 2023
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Speech Reconstruction from Silent Tongue and Lip Articulation by Pseudo Target Generation and Domain Adversarial Training.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Neural Speech Phase Prediction Based on Parallel Estimation Architecture and Anti-Wrapping Losses.
Proceedings of the IEEE International Conference on Acoustics, 2023
A Self-Attention Based Fusion Model of Radiomics and Deep Features for Early Recurrence Prediction in NSCLC.
Proceedings of the 12th IEEE Global Conference on Consumer Electronics, 2023
MVI-Wise GAN: Synthetic MRI to Improve Microvascular Invasion Prediction in Hepatocellular Carcinoma.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023
Vision-Guided Attention-Enhanced Network for Predicting Microvascular Invasion in Hepatocellular Carcinoma.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023
The USTC-NERCSLIP System for the Track 1.2 of Audio Deepfake Detection (ADD 2023) Challenge.
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023
2022
Denoising-and-Dereverberation Hierarchical Neural Vocoder for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
A robust encryption watermarking algorithm for medical images based on ridgelet-DCT and THM double chaos.
J. Cloud Comput., 2022
Residual Multilayer Perceptrons for Genotype-Guided Recurrence Prediction of Non-Small Cell Lung Cancer.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022
2021
Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Phase Spectrum Recovery for Enhancing Low-Quality Speech Captured by Laser Microphones.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
2020
A Neural Vocoder With Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
IEEE Access, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
2019
Zero-Watermarking Algorithm for Medical Images Based on Dual-Tree Complex Wavelet Transform and Discrete Cosine Transform.
J. Medical Imaging Health Informatics, 2019
Singing Voice Synthesis Using Deep Autoregressive Neural Networks for Acoustic Modeling.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
DNN-based Spectral Enhancement for Neural Waveform Generators with Low-bit Quantization.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the Blizzard Challenge 2019, Vienna, Austria, September 23, 2019, 2019
2018
Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2010
Proceedings of the Future Generation Information Technology, 2010
2009
Proceedings of the Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009