Han Zhang

ORCID: 0000-0001-7072-2189

Affiliations:
  • Google Brain, Mountain View, CA, USA
  • Rutgers University, Department of Computer Science, Piscataway, NJ, USA (PhD 2018)
  • Beijing University of Posts and Telecommunications, Multimedia Communication and Pattern Recognition Lab, Beijing, China (2009 - 2012)


According to our database, Han Zhang authored at least 70 papers between 2010 and 2024.

Bibliography

2024
DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models.
CoRR, 2024

ε-VAE: Denoising as Visual Decoding.
CoRR, 2024

BACON: Supercharge Your VLM with Bag-of-Concept Graph to Mitigate Hallucinations.
CoRR, 2024

Steering Prototypes with Prompt-tuning for Rehearsal-free Continual Learning.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

CCM: Real-Time Controllable Visual Content Creation Using Text-to-Image Consistency Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Lipschitz Singularities in Diffusion Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

DreamClean: Restoring Clean Image Using Deep Diffusion Prior.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Parrot: Pareto-Optimal Multi-reward Reinforcement Learning Framework for Text-to-Image Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
CCM: Adding Conditional Controls to Text-to-Image Consistency Models.
CoRR, 2023

Eliminating Lipschitz Singularities in Diffusion Models.
CoRR, 2023

Learning Disentangled Prompts for Compositional Image Synthesis.
CoRR, 2023

Steering Prototype with Prompt-tuning for Rehearsal-free Continual Learning.
CoRR, 2023

StraIT: Non-autoregressive Generation with Stratified Image Transformer.
CoRR, 2023

StoryBench: A Multifaceted Benchmark for Continuous Story Visualization.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Muse: Text-To-Image Generation via Masked Generative Transformers.
Proceedings of the International Conference on Machine Learning, 2023

Phenaki: Variable Length Video Generation from Open Domain Textual Descriptions.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

VQ3D: Learning a 3D-Aware Generative Model on ImageNet.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SVDiff: Compact Parameter Space for Diffusion Fine-Tuning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Dimensionality-Varying Diffusion Process.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MAGVIT: Masked Generative Video Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Visual Prompt Tuning for Generative Transfer Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation.
Trans. Mach. Learn. Res., 2022

Deep image synthesis from intuitive user input: A review and perspectives.
Comput. Vis. Media, 2022

Phenaki: Variable Length Video Generation From Open Domain Textual Description.
CoRR, 2022

Vector-quantized Image Modeling with Improved VQGAN.
Proceedings of the Tenth International Conference on Learning Representations, 2022

ViTGAN: Training GANs with Vision Transformers.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Learning Instance-Specific Adaptation for Cross-Domain Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

MaxViT: Multi-axis Vision Transformer.
Proceedings of the Computer Vision - ECCV 2022, 2022

BLT: Bidirectional Layout Transformer for Controllable Layout Generation.
Proceedings of the Computer Vision - ECCV 2022, 2022

DualPrompt: Complementary Prompting for Rehearsal-Free Continual Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

MAXIM: Multi-Axis MLP for Image Processing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

MaskGIT: Masked Generative Image Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Learning to Prompt for Continual Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Aggregating Nested Transformers.
CoRR, 2021

Improved Transformer for High-Resolution GANs.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

PseudoSeg: Designing Pseudo Labels for Semantic Segmentation.
Proceedings of the 9th International Conference on Learning Representations, 2021

Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction.
Proceedings of the 9th International Conference on Learning Representations, 2021

Cross-Modal Contrastive Learning for Text-to-Image Generation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Improved Consistency Regularization for GANs.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Image Augmentations for GAN Training.
CoRR, 2020

A Simple Semi-Supervised Learning Framework for Object Detection.
CoRR, 2020

FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Small-GAN: Speeding up GAN Training using Core-Sets.
Proceedings of the 37th International Conference on Machine Learning, 2020

Consistency Regularization for Generative Adversarial Networks.
Proceedings of the 8th International Conference on Learning Representations, 2020

ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring.
Proceedings of the 8th International Conference on Learning Representations, 2020

Distilling Effective Supervision From Severe Label Noise.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

ReMixMatch: Semi-Supervised Learning with Distribution Alignment and Augmentation Anchoring.
CoRR, 2019

IEG: Robust Neural Network Training to Tackle Severe Label Noise.
CoRR, 2019

Self-Attention Generative Adversarial Networks.
Proceedings of the 36th International Conference on Machine Learning, 2019

Co-Occurrent Features in Semantic Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
SegAN: Adversarial Network with Multi-scale L1 Loss for Medical Image Segmentation.
Neuroinformatics, 2018

Improving GANs Using Optimal Transport.
Proceedings of the 6th International Conference on Learning Representations, 2018

AttnGAN: Fine-Grained Text to Image Generation With Attentional Generative Adversarial Networks.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Multi-feature based benchmark for cervical dysplasia classification evaluation.
Pattern Recognit., 2017

SegAN: Adversarial Network with Multi-scale L1 Loss for Medical Image Segmentation.
CoRR, 2017

StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Link the Head to the "Beak": Zero Shot Learning from Noisy Text Description at Part Precision.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks.
CoRR, 2016

Multimodal Deep Learning for Cervical Dysplasia Diagnosis.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention - MICCAI 2016, 2016

SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2014
Robust shape prior modeling based on Gaussian-Bernoulli restricted Boltzmann Machine.
Proceedings of the IEEE 11th International Symposium on Biomedical Imaging, 2014

2013
Categorization of Underwater Habitats Using Dynamic Video Textures.
Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops, 2013

2012
Video Browser Showdown by NUS.
Proceedings of the Advances in Multimedia Modeling - 18th International Conference, 2012

2011
BUPT-MCPRL at TRECVID 2011.
Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

2010
BUPT-MCPRL at TRECVID 2010.
Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010

