Yongxin Zhu

Orcid: 0000-0002-4757-543X

Affiliations:
  • University of Science and Technology of China (USTC), Hefei, China


According to our database, Yongxin Zhu authored at least 15 papers between 2022 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer.
CoRR, 2024

Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective.
CoRR, 2024

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs.
CoRR, 2024

Empowering Diffusion Models on the Embedding Space for Text Generation.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Summarizing Like Human: Edit-Based Text Summarization with Keywords.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2024, 2024

Few-shot Temporal Pruning Accelerates Diffusion Models for Text Generation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024

Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Visual Hallucination Elevates Speech Recognition.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
ItrievalKD: An Iterative Retrieval Framework Assisted with Knowledge Distillation for Noisy Text-to-Image Retrieval.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2023

DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Span-level Aspect-based Sentiment Analysis via Table Filling.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Locate Then Generate: Bridging Vision and Language with Bounding Box for Scene-Text VQA.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Difformer: Empowering Diffusion Model on Embedding Space for Text Generation.
CoRR, 2022

Sequence-to-Action: Grammatical Error Correction with Action Guided Sequence Generation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

