Han Hu

ORCID: 0000-0001-5104-6146

Affiliations:

Microsoft Research Asia, Beijing, China
Tsinghua University, Department of Automation, Tsinghua National Laboratory for Information Science and Technology, Beijing, China (PhD 2014)

According to our database, Han Hu authored at least 111 papers between 2009 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

A Survey on Video Diffusion Models.

ACM Comput. Surv., February, 2025

2024

Expediting Large-Scale Vision Transformer for Dense Prediction Without Fine-Tuning.

IEEE Trans. Pattern Anal. Mach. Intell., January, 2024

Xwin-LM: Strong and Scalable Alignment Practice for LLMs.

CoRR, 2024

Common 7B Language Models Already Possess Strong Math Capabilities.

CoRR, 2024

Unsupervised Graphic Layout Grouping with Transformers.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

LarvSeg: Exploring Image Classification Data for Large Vocabulary Semantic Segmentation via Category-Wise Attentive Classifier.

Proceedings of the Pattern Recognition and Computer Vision - 7th Chinese Conference, 2024

Data-efficient Large Vision Models through Sequential Autoregression.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

GAIA: Zero-shot Talking Avatar Generation.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

ScalingFilter: Assessing Data Quality through Inverse Utilization of Scaling Laws.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

SimDA: Simple Diffusion Adapter for Efficient Video Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MotionEditor: Editing Video Motion via Content-Aware Diffusion.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Multiple View Geometry Transformers for 3D Human Pose Estimation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Segment and Caption Anything.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

SAN: Side Adapter Network for Open-Vocabulary Semantic Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Global Context Networks.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models.

CoRR, 2023

FP8-LM: Training FP8 Large Language Models.

CoRR, 2023

TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance.

CoRR, 2023

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks.

CoRR, 2023

DETR Doesn't Need Multi-Scale or Locality Design.

CoRR, 2023

GlyphControl: Glyph Conditional Control for Visual Text Generation.

CoRR, 2023

VanillaKD: Revisit the Power of Vanilla Knowledge Distillation from Small Scale to Large Scale.

CoRR, 2023

DeepMIM: Deep Supervision for Masked Image Modeling.

CoRR, 2023

All in Tokens: Unifying Output Space of Visual Tasks via Soft Token.

CoRR, 2023

GlyphControl: Glyph Conditional Control for Visual Text Generation.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Rank-DETR for High Quality Object Detection.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Revisit the Power of Vanilla Knowledge Distillation: from Small Scale to Large Scale.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Tutel: Adaptive Mixture-of-Experts at Scale.

Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

ClipCrop: Conditioned Cropping Driven by Vision-Language Model.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Attentive Mask CLIP.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Improving CLIP Fine-tuning Performance.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Implicit Temporal Modeling with Learnable Alignment for Video Recognition.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

All in Tokens: Unifying Output Space of Visual Tasks via Soft Token.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DETR Does Not Need Multi-Scale or Locality Design.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Mask-Attention-Free Transformer for 3D Instance Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Efficient Diffusion Training via Min-SNR Weighting Strategy.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Side Adapter Network for Open-Vocabulary Semantic Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

SVFormer: Semi-supervised Video Transformer for Action Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

On Data Scaling in Masked Image Modeling.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Revealing the Dark Secrets of Masked Image Modeling.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

iCLIP: Bridging Image Classification and Contrastive Language-Image Pre-training for Visual Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

ResFormer: Scaling ViTs with Multi-Resolution Training.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

DETRs with Hybrid Matching.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Human Pose as Compositional Tokens.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Exploring Non-additive Randomness on ViT against Query-Based Black-Box Attacks.

Proceedings of the 34th British Machine Vision Conference 2023, 2023

2022

Exploring Discrete Diffusion Models for Image Captioning.

CoRR, 2022

Could Giant Pretrained Image Models Extract Universal Representations?

CoRR, 2022

Tutel: Adaptive Mixture-of-Experts at Scale.

CoRR, 2022

Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation.

CoRR, 2022

Deeper Insights into ViTs Robustness towards Common Corruptions.

CoRR, 2022

iCAR: Bridging Image Classification and Image-text Alignment for Visual Recognition.

CoRR, 2022

Enhancing the Robustness, Efficiency, and Diversity of Differentiable Architecture Search.

CoRR, 2022

Region Rebalance for Long-Tailed Semantic Segmentation.

CoRR, 2022

MLSeg: Image and Video Segmentation as Multi-Label Classification and Selected-Label Pixel Classification.

CoRR, 2022

Could Giant Pre-trained Image Models Extract Universal Representations?

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Graph Hawkes Transformer for Extrapolated Reasoning on Temporal Knowledge Graphs.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-Language Model.

Proceedings of the Computer Vision - ECCV 2022, 2022

A Simple Approach and Benchmark for 21,000-Category Object Detection.

Proceedings of the Computer Vision - ECCV 2022, 2022

RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation.

Proceedings of the Computer Vision - ECCV 2022, 2022

SimMIM: A Simple Framework for Masked Image Modeling.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Video Swin Transformer.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Swin Transformer V2: Scaling Up Capacity and Resolution.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

A Simple Baseline for Zero-shot Semantic Segmentation with Pre-trained Vision-language Model.

CoRR, 2021

Breaking Shortcut: Exploring Fully Convolutional Cycle-Consistency for Video Correspondence Learning.

CoRR, 2021

Self-Supervised Learning with Swin Transformers.

CoRR, 2021

Bootstrap Your Object Detector via Mixed Training.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Aligning Pretraining for Detection via Object-Level Contrastive Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Leveraging Batch Normalization for Vision Transformers.

Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

End-to-End Semi-Supervised Object Detection with Soft Teacher.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Group-Free 3D Object Detection via Transformers.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Capsule Network Is Not More Robust Than Convolutional Network.

Jindong Gu

Volker Tresp

Han Hu

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Boosting Adversarial Transferability through Enhanced Momentum.

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020

RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder.

Cheng Chi

Fangyun Wei

Han Hu

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

RepPoints v2: Verification Meets Regression for Object Detection.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Parametric Instance Classification for Unsupervised Visual Feature Learning.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Disentangled Non-local Neural Networks.

Proceedings of the Computer Vision - ECCV 2020, 2020

Dense RepPoints: Representing Visual Objects with Dense Point Sets.

Proceedings of the Computer Vision - ECCV 2020, 2020

A Closer Look at Local Aggregation Operators in Point Cloud Analysis.

Proceedings of the Computer Vision - ECCV 2020, 2020

Negative Margin Matters: Understanding Margin in Few-Shot Classification.

Proceedings of the Computer Vision - ECCV 2020, 2020

Memory Enhanced Global-Local Aggregation for Video Object Detection.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Deep Metric Transfer for Label Propagation with Limited Annotated Data.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

GCNet: Non-Local Networks Meet Squeeze-Excitation Networks and Beyond.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

RepPoints: Point Set Representation for Object Detection.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Spatial-Temporal Relation Networks for Multi-Object Tracking.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Local Relation Networks for Image Recognition.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Deformable ConvNets V2: More Deformable, Better Results.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Learning Region Features for Object Detection.

Proceedings of the Computer Vision - ECCV 2018, 2018

Relation Networks for Object Detection.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Deformable Convolutional Networks.

Proceedings of the IEEE International Conference on Computer Vision, 2017

2016

Depth Estimation Using a Sliding Camera.

IEEE Trans. Image Process., 2016

2015

Exploiting Unsupervised and Supervised Constraints for Subspace Clustering.

Han Hu

Jianjiang Feng

Jie Zhou

IEEE Trans. Pattern Anal. Mach. Intell., 2015

Progressive feature matching via triplet graph.

Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

2014

Smooth Representation Clustering.

Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013

Multi-Class Constrained Normalized Cut With Hard, Soft, Unary and Pairwise Priors and its Applications to Object Segmentation.

IEEE Trans. Image Process., 2013

2012

Multi-way constrained spectral clustering by nonnegative restriction.

Proceedings of the 21st International Conference on Pattern Recognition, 2012

2011

Video Stabilization and Completion Using Two Cameras.

Jie Zhou

Han Hu

Dingrui Wan

IEEE Trans. Circuits Syst. Video Technol., 2011

2010

HTF: a novel feature for general crack detection.

Han Hu

Quanquan Gu

Jie Zhou

Proceedings of the International Conference on Image Processing, 2010

Trajectory matching from unsynchronized videos.

Han Hu

Jie Zhou

Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009

Multiframe Motion Segmentation via Penalized MAP Estimation and Linear Programming.

Proceedings of the British Machine Vision Conference, 2009
