Jie Zhang
Orcid: 0000-0003-1124-0854
Affiliations:
- University of Science and Technology of China, NEL-SLIP, Hefei, China
- Chinese Academy of Sciences, Institute of Acoustics, Beijing, China
- Delft University of Technology, Faculty of Electrical Engineering, Mathematics, and Computer Science, The Netherlands (former)
- Peking University, Shenzhen Graduate School, Key Laboratory of Machine Perception / Engineering Lab on Intelligent Perception for Internet of Things, Beijing, China (former)
According to our database, Jie Zhang authored at least 73 papers between 2014 and 2024.
Bibliography
2024
VatLM: Visual-Audio-Text Pre-Training With Unified Masked Prediction for Speech Representation Learning.
IEEE Trans. Multim., 2024
CoRR, 2024
LCM-SVC: Latent Diffusion Model Based Singing Voice Conversion with Inference Acceleration via Latent Consistency Distillation.
CoRR, 2024
CoRR, 2024
LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance.
CoRR, 2024
FacialPulse: An Efficient RNN-based Depression Detection via Temporal Facial Landmarks.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
An End-to-End EEG Channel Selection Method with Residual Gumbel Softmax for Brain-Assisted Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2024
A Study of Multichannel Spatiotemporal Features and Knowledge Distillation on Robust Target Speaker Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2024
Sifisinger: A High-Fidelity End-to-End Singing Voice Synthesizer Based on Source-Filter Model.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
IEEE Trans. Mob. Comput., August, 2023
IEEE Trans. Serv. Comput., 2023
A Joint Speech Enhancement and Self-Supervised Representation Learning Framework for Noise-Robust Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
SDW-SWF: Speech Distortion Weighted Single-Channel Wiener Filter for Noise Reduction.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Energy-Efficient Sparsity-Driven Speech Enhancement in Wireless Acoustic Sensor Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Memory Storable Network Based Feature Aggregation for Speaker Representation Learning.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
A Semi-Supervised Complementary Joint Training Approach for Low-Resource Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
CoRR, 2023
A Speech Distortion Weighted Single-Channel Wiener Filter Based STFT-Domain Noise Reduction.
Proceedings of the IEEE Statistical Signal Processing Workshop, 2023
Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 20th International Conference on Spoken Language Translation, 2023
Real-Time Causal Spectro-Temporal Voice Activity Detection Based on Convolutional Encoding and Residual Decoding.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
BASEN: Time-Domain Brain-Assisted Speech Enhancement Network with Convolutional Cross Attention in Multi-talker Conditions.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Robust Data2VEC: Noise-Robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning.
Proceedings of the IEEE International Conference on Acoustics, 2023
The NERCSLIP-USTC System for the L3DAS23 Challenge Task2: 3D Sound Event Localization and Detection (SELD).
Proceedings of the IEEE International Conference on Acoustics, 2023
A Multi-Scale Feature Aggregation Based Lightweight Network for Audio-Visual Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2023
The USTC-NERCSLIP System for the Track 1.2 of Audio Deepfake Detection (ADD 2023) Challenge.
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
Learning Semantic Information from Machine Translation to Improve Speech-to-Text Translation.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
2022
Frequency-Invariant Sensor Selection for MVDR Beamforming in Wireless Acoustic Sensor Networks.
IEEE Trans. Wirel. Commun., 2022
A Parametric Unconstrained Beamformer Based Binaural Noise Reduction for Assistive Hearing.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
CoRR, 2022
CoRR, 2022
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition.
CoRR, 2022
External Text Based Data Augmentation for Low-Resource Speech Recognition in the Constrained Condition of OpenASR21 Challenge.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Differential Time-frequency Log-mel Spectrogram Features for Vision Transformer Based Infant Cry Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
A Complementary Joint Training Approach Using Unpaired Speech and Text.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
An Experimental Comparison between Low-Resource Semi-Supervised and High-Resource Supervised Automatic Speech Recognition Models.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022
Learning Contextually Fused Audio-Visual Representations For Audio-Visual Speech Recognition.
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022
A Noise-Robust Self-Supervised Pre-Training Model Based Speech Representation Learning for Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022
Supervised and Self-Supervised Pretraining Based Covid-19 Detection Using Acoustic Breathing/Cough/Speech Signals.
Proceedings of the IEEE International Conference on Acoustics, 2022
Reference Microphone Selection and Low-Rank Approximation Based Multichannel Wiener Filter with Application to Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
Power Optimized and Power Constrained Randomized Gossip Approaches for Wireless Sensor Networks.
IEEE Wirel. Commun. Lett., 2021
Quantization-Aware Binaural MWF Based Noise Reduction Incorporating External Wireless Devices.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Sensor Selection for Relative Acoustic Transfer Function Steered Linearly-Constrained Beamformers.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Multi-Granularity Sequence Alignment Mapping for Encoder-Decoder Based End-to-End ASR.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
An Improved Wav2Vec 2.0 Pre-Training Approach Using Enhanced Local Dependency Modeling for Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
2020
Joint Sampling Synchronization and Source Localization for Wireless Acoustic Sensor Networks.
IEEE Commun. Lett., 2020
Attentive Fusion Enhanced Audio-Visual Encoding for Transformer Based Robust Speech Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
2019
IEEE ACM Trans. Audio Speech Lang. Process., 2019
Sensor Selection and Rate Distribution Based Beamforming in Wireless Acoustic Sensor Networks.
Proceedings of the 27th European Signal Processing Conference, 2019
2018
Rate-Distributed Spatial Filtering Based Noise Reduction in Wireless Acoustic Sensor Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Rate-Distributed Binaural LCMV Beamforming for Assistive Hearing in Wireless Acoustic Sensor Networks.
Proceedings of the 10th IEEE Sensor Array and Multichannel Signal Processing Workshop, 2018
2017
Binaural Sound Localization Based on Reverberation Weighting and Generalized Parametric Mapping.
IEEE ACM Trans. Audio Speech Lang. Process., 2017
2016
Bi-Direction Interaural Matching Filter and Decision Weighting Fusion for Sound Source Localization in Noisy Environments.
IEICE Trans. Inf. Syst., 2016
Structured total least squares based internal delay estimation for distributed microphone auto-localization.
Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016
Probabilistic binaural multiple sources localization based on time-delay compensation estimator and clustering analysis.
Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016
2015
Robust Acoustic Localization Via Time-Delay Compensation and Interaural Matching Filter.
IEEE Trans. Signal Process., 2015
Binaural cues estimates based on Interaural Matching Filter for sound source localization.
Proceedings of the 2015 IEEE International Conference on Robotics and Biomimetics, 2015
Direction of arrival estimation based on reverberation weighting and noise error estimator.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Binaural sound source localization based on generalized parametric model and two-layer matching strategy in complex environments.
Proceedings of the IEEE International Conference on Robotics and Automation, 2015
2014
A new hierarchical binaural sound source localization method based on Interaural Matching Filter.
Proceedings of the 2014 IEEE International Conference on Robotics and Automation, 2014
A binaural sound source localization model based on time-delay compensation and interaural coherence.
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE 3rd International Conference on Cloud Computing and Intelligence Systems, 2014