Jeongsoo Choi

Orcid: 0009-0005-6817-604X

According to our database, Jeongsoo Choi authored at least 17 papers between 2022 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2024
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model.
IEEE Trans. Multim., 2024

Textless Unit-to-Unit Training for Many-to-Many Multilingual Speech-to-Speech Translation.
IEEE/ACM Trans. Audio Speech Lang. Process., 2024

ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation.
CoRR, 2024

Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding.
CoRR, 2024

Multilingual Visual Speech Recognition with a Single Model by Learning with Discrete Visual Speech Units.
CoRR, 2024

Exploring Phonetic Context-Aware Lip-Sync for Talking Face Generation.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024

Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-Training and Multi-Modal Tokens.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024

Text-Driven Talking Face Synthesis by Reprogramming Audio-Driven Models.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024

AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Many-to-Many Spoken Language Translation via Unified Speech and Text Representation Learning with Unit-to-Unit Translation.
CoRR, 2023

Reprogramming Audio-driven Talking Face Synthesis into Text-driven.
CoRR, 2023

Exploring Phonetic Context in Lip Movement for Authentic Talking Face Generation.
CoRR, 2023

Intelligible Lip-to-Speech Synthesis with Speech Units.
Proceedings of the 24th Annual Conference of the International Speech Communication Association (Interspeech), 2023

Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
