2024
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Low-Latency Neural Speech Phase Prediction Based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
APCodec: A Neural Audio Codec With Parallel Amplitude and Phase Spectrum Encoding and Decoding.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram.
CoRR, 2024
Refining Self-Supervised Learnt Speech Representation using Brain Activations.
CoRR, 2024
Multi-Stage Speech Bandwidth Extension with Flexible Sampling Rate Control.
CoRR, 2024
BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation.
CoRR, 2024
Voice Attribute Editing with Text Prompt.
CoRR, 2024
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction.
CoRR, 2024
Pitch-and-Spectrum-Aware Singing Quality Assessment with Bias Correction and Model Fusion.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Stage-Wise and Prior-Aware Neural Speech Phase Prediction.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
MDCTCodec: A Lightweight MDCT-Based Neural Audio Codec Towards High Sampling Rate and Low Bitrate Scenarios.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Speech Reconstruction from Silent Lip and Tongue Articulation by Diffusion Models and Text-Guided Pseudo Target Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
SAMOS: A Neural MOS Prediction Model Leveraging Semantic Representations and Acoustic Features.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024
APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024
Considering Temporal Connection between Turns for Conversational Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2024
Biologically Interpretable Model for Precise Recurrence Prediction of Non-Small Cell Lung Cancer.
Proceedings of the 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2024
2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Long-Frame-Shift Neural Speech Phase Prediction With Spectral Continuity Enhancement and Interpolation Error Compensation.
IEEE Signal Process. Lett., 2023
A Dynamic Network for Efficient Point Cloud Registration.
CoRR, 2023
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement.
CoRR, 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis.
CoRR, 2023
Face-Driven Zero-Shot Voice Conversion with Memory-based Face-Voice Alignment.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
CMIR: A Unified Cross-Modality Framework for Preoperative Accurate Prediction of Microvascular Invasion in Hepatocellular Carcinoma.
Proceedings of the MEDINFO 2023 - The Future Is Accessible, 2023
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Speech Reconstruction from Silent Tongue and Lip Articulation by Pseudo Target Generation and Domain Adversarial Training.
Proceedings of the IEEE International Conference on Acoustics, 2023
Zero-Shot Personalized Lip-To-Speech Synthesis with Face Image Based Voice Control.
Proceedings of the IEEE International Conference on Acoustics, 2023
Neural Speech Phase Prediction Based on Parallel Estimation Architecture and Anti-Wrapping Losses.
Proceedings of the IEEE International Conference on Acoustics, 2023
A Self-Attention Based Fusion Model of Radiomics and Deep Features for Early Recurrence Prediction in NSCLC.
Proceedings of the 12th IEEE Global Conference on Consumer Electronics, 2023
MVI-Wise GAN: Synthetic MRI to Improve Microvascular Invasion Prediction in Hepatocellular Carcinoma.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023
Vision-Guided Attention-Enhanced Network for Predicting Microvascular Invasion in Hepatocellular Carcinoma.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023
The USTC-NERCSLIP System for the Track 1.2 of Audio Deepfake Detection (ADD 2023) Challenge.
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32th International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023
2022
Denoising-and-Dereverberation Hierarchical Neural Vocoder for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
A robust encryption watermarking algorithm for medical images based on ridgelet-DCT and THM double chaos.
J. Cloud Comput., 2022
Residual Multilayer Perceptrons for Genotype-Guided Recurrence Prediction of Non-Small Cell Lung Cancer.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022
2021
BDDR: An Effective Defense Against Textual Backdoor Attacks.
Comput. Secur., 2021
Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Phase Spectrum Recovery for Enhancing Low-Quality Speech Captured by Laser Microphones.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
2020
A Neural Vocoder With Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Robust Watermarking Algorithm for Medical Volume Data in Internet of Medical Things.
IEEE Access, 2020
Reverberation Modeling for Source-Filter-Based Neural Vocoder.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Online Speaker Adaptation for WaveNet-based Neural Vocoders.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
2019
Zero-Watermarking Algorithm for Medical Images Based on Dual-Tree Complex Wavelet Transform and Discrete Cosine Transform.
J. Medical Imaging Health Informatics, 2019
Singing Voice Synthesis Using Deep Autoregressive Neural Networks for Acoustic Modeling.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
DNN-based Spectral Enhancement for Neural Waveform Generators with Low-bit Quantization.
Proceedings of the IEEE International Conference on Acoustics, 2019
The USTC System for Blizzard Challenge 2019.
Proceedings of the Blizzard Challenge 2019, Vienna, Austria, September 23, 2019, 2019
2018
Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
SampleRNN-Based Neural Vocoder for Statistical Parametric Speech Synthesis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2010
An Ontology-Based Platform for Scientific Writing and Publishing.
Proceedings of the Future Generation Information Technology, 2010
2009
Computing Minimal Diagnosis with Binary Decision Diagrams Algorithm.
Proceedings of the Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009