Zexu Pan

Orcid: 0000-0002-8106-1176

According to our database, Zexu Pan authored at least 31 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Hierarchical Edge Refinement Network for Guided Depth Map Super-Resolution.
IEEE Trans. Computational Imaging, 2024

Speech Separation With Pretrained Frontend to Minimize Domain Mismatch.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

NeuroHeed: Neuro-Steered Speaker Extraction Using EEG Signals.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

pTSE-T: Presentation Target Speaker Extraction using Unaligned Text Cues.
CoRR, 2024

Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions.
CoRR, 2024

Enhanced Reverberation as Supervision for Unsupervised Speech Separation.
CoRR, 2024

TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement.
Proceedings of the 18th International Workshop on Acoustic Signal Enhancement, 2024

GLMB 3D Speaker Tracking with Video-Assisted Multi-Channel Audio Optimization Functions.
Proceedings of the IEEE International Conference on Acoustics, 2024

Late Audio-Visual Fusion for in-the-Wild Speaker Diarization.
Proceedings of the IEEE International Conference on Acoustics, 2024

NeuroHeed+: Improving Neuro-Steered Speaker Extraction with Joint Auditory Attention Detection.
Proceedings of the IEEE International Conference on Acoustics, 2024

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization.
Proceedings of the IEEE International Conference on Acoustics, 2024

Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-Talker Speech.
Proceedings of the IEEE International Conference on Acoustics, 2024

LOCSELECT: Target Speaker Localization with an Auditory Selective Hearing Mechanism.
Proceedings of the IEEE International Conference on Acoustics, 2024

Generation or Replication: Auscultating Audio Latent Diffusion Models.
Proceedings of the IEEE International Conference on Acoustics, 2024

Restoring Speaking Lips from Occlusion for Audio-Visual Speech Recognition.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Time-Domain Speech Separation Networks With Graph Encoding Auxiliary.
IEEE Signal Process. Lett., 2023

Speaker Extraction with Detection of Presence and Absence of Target Speakers.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Rethinking the Visual Cues in Audio-Visual Speaker Extraction.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Target Active Speaker Detection with Audio-visual Cues.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

ImagineNet: Target Speaker Extraction with Intermittent Visual Cue Through Embedding Inpainting.
Proceedings of the IEEE International Conference on Acoustics, 2023

Scenario-Aware Audio-Visual TF-Gridnet for Target Speech Extraction.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Selective Listening by Synchronizing Speech With Lips.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

USEV: Universal Speaker Extraction With Visual Cue.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Speaker Extraction With Co-Speech Gestures Cue.
IEEE Signal Process. Lett., 2022

Towards End-to-end Speaker Diarization in the Wild.
CoRR, 2022

A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

VCSE: Time-Domain Visual-Contextual Speaker Extraction Network.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
Is Someone Speaking?: Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Multi-Target DoA Estimation with an Audio-Visual Fusion Mechanism.
Proceedings of the IEEE International Conference on Acoustics, 2021

Muse: Multi-Modal Target Speaker Extraction with Visual Cues.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Multi-Modal Attention for Speech Emotion Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

