Zehuan Yuan

Orcid: 0000-0002-0349-9367

According to our database, Zehuan Yuan authored at least 75 papers between 2018 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2024

ChatSearch: a Dataset and a Generative Retrieval Model for General Conversational Image Retrieval.

CoRR, 2024

HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling.

CoRR, 2024

OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation.

CoRR, 2024

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation.

CoRR, 2024

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction.

CoRR, 2024

View Crafting For Instance-Level Representation from Scene Images.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

General Object Foundation Model for Images and Videos at Scale.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Generative Region-Language Pretraining for Open-Ended Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Sparse R-CNN: An End-to-End Framework for Object Detection.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

DMRNet++: Learning Discriminative Features With Decoupled Networks and Enriched Pairs for One-Step Person Search.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

MCIBI++: Soft Mining Contextual Information Beyond Image for Semantic Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

Trimap-guided feature mining and fusion network for natural image matting.

Comput. Vis. Image Underst., April, 2023

UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces.

CoRR, 2023

Recognize Any Regions.

CoRR, 2023

ChatBridge: Bridging Modalities with Large Language Model as a Language Catalyst.

CoRR, 2023

Meta Compositional Referring Expression Segmentation.

CoRR, 2023

EGC: Image Generation and Classification via a Diffusion Energy-Based Model.

CoRR, 2023

Multi-Level Contrastive Learning for Dense Prediction Task.

CoRR, 2023

MAMO: Fine-Grained Vision-Language Representations Learning with Masked Multimodal Modeling.

Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

CoDet: Co-occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Transformer-based Open-world Instance Segmentation with Cross-task Consistency Regularization.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Learning Object-Language Alignments for Open-Vocabulary Object Detection.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

The First Visual Object Tracking Segmentation VOTS2023 Challenge Results.

Kannappan Palaniappan

Norbert Scherer-Negenborn

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Exploring Transformers for Open-world Instance Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Segment Every Reference Object in Spatial and Temporal Spaces.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

EGC: Image Generation and Classification via a Diffusion Energy-Based Model.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Meta Compositional Referring Expression Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Token Boosting for Robust Self-Supervised Visual Transformer Pre-training.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-Commerce.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Universal Instance Perception as Object Discovery and Retrieval.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Birds of a Feather Flock Together: Category-Divergence Guidance for Domain Adaptive Segmentation.

IEEE Trans. Image Process., 2022

Conditional Hyper-Network for Blind Super-Resolution With Multiple Degradations.

IEEE Trans. Image Process., 2022

MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning.

CoRR, 2022

Self-supervised Video Representation Learning with Motion-Aware Masked Autoencoders.

CoRR, 2022

ManiCLIP: Multi-Attribute Face Manipulation from Text.

Hao Wang

Guosheng Lin

Ana Garcia del Molino

CoRR, 2022

Single-Stage Open-world Instance Segmentation with Cross-task Consistency Regularization.

CoRR, 2022

MetaFormer: A Unified Meta Framework for Fine-Grained Recognition.

CoRR, 2022

QueryPose: Sparse Multi-Person Pose Regression via Spatial-Aware Part-Level Query.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Rethinking Resolution in the Context of Efficient Video Recognition.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Objects in Semantic Topology.

Proceedings of the Tenth International Conference on Learning Representations, 2022

ByteTrack: Multi-object Tracking by Associating Every Detection Box.

Proceedings of the Computer Vision - ECCV 2022, 2022

Masked Generative Distillation.

Proceedings of the Computer Vision - ECCV 2022, 2022

Towards Grand Unification of Object Tracking.

Proceedings of the Computer Vision - ECCV 2022, 2022

Multimodal Transformer with Variable-Length Memory for Vision-and-Language Navigation.

Proceedings of the Computer Vision - ECCV 2022, 2022

You Should Look at All Objects.

Proceedings of the Computer Vision - ECCV 2022, 2022

Focal and Global Knowledge Distillation for Detectors.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Language as Queries for Referring Video Object Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Content-Variant Reference Image Quality Assessment via Knowledge Distillation.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

ByteTrack: Multi-Object Tracking by Associating Every Detection Box.

CoRR, 2021

Memory Based Video Scene Parsing.

CoRR, 2021

Center Prediction Loss for Re-identification.

CoRR, 2021

Conditional Meta-Network for Blind Super-Resolution with Multiple Degradations.

CoRR, 2021

Disentangled Contrastive Learning on Graphs.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Multimodal Video Summarization via Time-Aware Transformers.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

What Makes for End-to-End Object Detection?

Proceedings of the 38th International Conference on Machine Learning, 2021

Exploring Balanced Feature Spaces for Representation Learning.

Proceedings of the 9th International Conference on Learning Representations, 2021

Unsupervised Real-World Super-Resolution: A Domain Adaptation Perspective.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Domain-Invariant Disentangled Network for Generalizable Object Detection.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Weakly Supervised Person Search with Region Siamese Networks.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Sparse R-CNN: End-to-End Object Detection With Learnable Proposals.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Slimmable Generative Adversarial Networks.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

TransTrack: Multiple-Object Tracking with Transformer.

CoRR, 2020

OneNet: Towards End-to-End One-Stage Object Detection.

CoRR, 2020

MoFlowGAN: Video Generation With Flow Guidance.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Controllable Orthogonalization in Training DNNs.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Non-Local Neural Networks With Grouped Bilinear Attentional Transforms.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Towards Good Practices for Instance Segmentation.

CoRR, 2019

Deformable Tube Network for Action Detection in Videos.

CoRR, 2019

2018

Towards Good Practices for Multi-modal Fusion in Large-Scale Video Classification.

Jinlai Liu

Zehuan Yuan

Changhu Wang

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Knowing Where to Look? Analysis on Attention of Visual Question Answering System.

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018
