Hao Zhang

ORCID: 0000-0001-8232-1665

Affiliations:

Hong Kong University of Science and Technology, Guangzhou, China
International Digital Economy Academy (IDEA), China

According to our database¹, Hao Zhang authored at least 36 papers between 2022 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2024

DN-DETR: Accelerate DETR Training by Introducing Query DeNoising.

IEEE Trans. Pattern Anal. Mach. Intell., April, 2024

TAPTRv2: Attention-based Position Update Improves Tracking Any Point.

CoRR, 2024

MotionLLM: Understanding Human Behaviors from Human Motions and Videos.

CoRR, 2024

Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection.

CoRR, 2024

Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks.

CoRR, 2024

Interfacing Foundation Models' Embeddings.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

Grounding DINO: Marrying DINO with Grounded Pre-training for Open-Set Object Detection.

Proceedings of the Computer Vision - ECCV 2024, 2024

LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents.

Proceedings of the Computer Vision - ECCV 2024, 2024

Segment and Recognize Anything at Any Granularity.

Proceedings of the Computer Vision - ECCV 2024, 2024

TAPTR: Tracking Any Point with Transformers as Detection.

Proceedings of the Computer Vision - ECCV 2024, 2024

Visual in-Context Prompting.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Interfacing Foundation Models' Embeddings.

CoRR, 2023

LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models.

CoRR, 2023

Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V.

CoRR, 2023

Semantic-SAM: Segment and Recognize Anything at Any Granularity.

CoRR, 2023

detrex: Benchmarking Detection Transformers.

CoRR, 2023

A Strong and Reproducible Object Detector with Only Public Datasets.

CoRR, 2023

Segment Everything Everywhere All at Once.

CoRR, 2023

A Simple Framework for Open-Vocabulary Segmentation and Detection.

CoRR, 2023

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.

CoRR, 2023

DA-BEV: Depth Aware BEV Transformer for 3D Object Detection.

CoRR, 2023

Segment Everything Everywhere All at Once.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

A Simple Framework for Open-Vocabulary Segmentation and Detection.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Detection Transformer with Stable Matching.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DFA3D: 3D Deformable Attention For 2D-to-3D Feature Lifting.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MP-Former: Mask-Piloted Transformer for Image Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

A Unified Mutual Supervision Framework for Referring Expression Segmentation and Generation.

CoRR, 2022

Active Domain Adaptation with Multi-level Contrastive Units for Semantic Segmentation.

CoRR, 2022

Vision-Language Intelligence: Tasks, Representation Learning, and Large Models.

CoRR, 2022

DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Active Domain Adaptation with Multi-level Contrastive Units for Semantic Segmentation.

Hao Zhang

Ruimao Zhang

Proceedings of the Computer Vision - ACCV 2022, 2022
