Hao Tan

Affiliations:

Adobe Research
University of North Carolina, Chapel Hill, NC, USA (former)

According to our database, Hao Tan authored at least 48 papers between 2017 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Timeline (chart): publications per year, 2017 to 2024, grouped by type: book, in proceedings, article, PhD thesis, dataset, other.

Bibliography

2024

MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data.

Co-authors include: Georgios Pavlakos

CoRR, 2024

LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers.

CoRR, 2024

Numerical Pruning for Efficient Autoregressive Models.

CoRR, 2024

Turbo3D: Ultra-fast Text-to-3D Generation.

Co-authors include: Shubham Tulsiani

CoRR, 2024

RandAR: Decoder-only Autoregressive Visual Generation in Random Orders.

Co-authors include: William T. Freeman

CoRR, 2024

Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors.

Co-authors include: Gordon Wetzstein

CoRR, 2024

Generating 3D-Consistent Videos from Unposed Internet Photos.

Co-authors include: Bharath Hariharan

CoRR, 2024

LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias.

CoRR, 2024

Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats.

CoRR, 2024

Progressive Autoregressive Video Diffusion Models.

Co-authors include: Arie E. Kaufman

CoRR, 2024

RelitLRM: Generative Relightable Radiance for Large Reconstruction Models.

Co-authors include: William T. Freeman

CoRR, 2024

Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models.

Co-authors include: Franck Dernoncourt, Hanieh Deilamsalehy, Thien Huu Nguyen

CoRR, 2024

LRM-Zero: Training Large Reconstruction Models with Synthesized Data.

Co-authors include: Arie E. Kaufman

CoRR, 2024

Pre-trained Vision-Language Models Learn Discoverable Visual Concepts.

CoRR, 2024

MeshLRM: Large Reconstruction Model for High-Quality Mesh.

Co-authors include: Valentin Deschaintre, Kalyan Sunkavalli

CoRR, 2024

Single-View 3D Human Digitalization with Large Reconstruction Models.

Co-authors include: Serena Yeung-Levy

CoRR, 2024

DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model.

Co-authors include: Kalyan Sunkavalli, Gordon Wetzstein

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Instant3D: Fast Text-to-3D with Sparse-view Generation and Large Reconstruction Model.

Co-authors include: Kalyan Sunkavalli, Greg Shakhnarovich

Proceedings of the Twelfth International Conference on Learning Representations, 2024

LRM: Large Reconstruction Model for Single Image to 3D.

Co-authors include: Kalyan Sunkavalli

Proceedings of the Twelfth International Conference on Learning Representations, 2024

SOHES: Self-supervised Open-world Hierarchical Entity Segmentation.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction.

Co-authors include: Kalyan Sunkavalli

Proceedings of the Twelfth International Conference on Learning Representations, 2024

GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting.

Co-authors include: Kalyan Sunkavalli

Proceedings of the Computer Vision - ECCV 2024, 2024

Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning.

Co-authors include: Arie E. Kaufman

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Boosting Punctuation Restoration with Data Generation and Reinforcement Learning.

Co-authors include: Hanieh Deilamsalehy, Franck Dernoncourt, Thien Huu Nguyen

CoRR, 2023

Boosting Punctuation Restoration with Data Generation and Reinforcement Learning.

Co-authors include: Hanieh Deilamsalehy, Franck Dernoncourt, Thien Huu Nguyen

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Scaling Data Generation in Vision-and-Language Navigation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Navigational Visual Representations with Semantic Map Supervision.

Co-authors include: Franck Dernoncourt

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022

CLEAR: Improving Vision-Language Navigation with Cross-Lingual, Environment-Agnostic Representations.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

How Much Can CLIP Benefit Vision-and-Language Tasks?

Co-authors include: Liunian Harold Li

Proceedings of the Tenth International Conference on Learning Representations, 2022

Envedit: Environment Editing for Vision-and-Language Navigation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Scientific Chart Summarization: Datasets and Improved Text Modeling.

Proceedings of the Workshop on Scientific Document Understanding co-located with 36th AAAI Conference on Artificial Intelligence, 2022

2021

VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning.

CoRR, 2021

VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Improving Cross-Modal Alignment in Vision Language Navigation via Syntactic Information.

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Unifying Vision-and-Language Tasks via Text Generation.

Proceedings of the 38th International Conference on Machine Learning, 2021

2020

Diagnosing the Environment Bias in Vision-and-Language Navigation.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Enabling Robots to Understand Incomplete Natural Language Instructions Using Commonsense Reasoning.

Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions.

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding.

Co-authors include: Michael W. Mahoney

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision.

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Modality-Balanced Models for Visual Dialogue.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout.

Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

LXMERT: Learning Cross-Modality Encoder Representations from Transformers.

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Expressing Visual Relationships via Language.

Co-authors include: Franck Dernoncourt

Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018

Object Ordering with Bidirectional Matchings for Visual Reasoning.

Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Source-Target Inference Models for Spatial Instruction Understanding.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

A Joint Speaker-Listener-Reinforcer Model for Referring Expressions.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017
