Hao Tan

Affiliations:
  • Adobe Research
  • University of North Carolina, Chapel Hill, NC, USA (former)


According to our database, Hao Tan authored at least 39 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Progressive Autoregressive Video Diffusion Models.
CoRR, 2024

RelitLRM: Generative Relightable Radiance for Large Reconstruction Models.
CoRR, 2024

Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models.
CoRR, 2024

LRM-Zero: Training Large Reconstruction Models with Synthesized Data.
CoRR, 2024

Pre-trained Vision-Language Models Learn Discoverable Visual Concepts.
CoRR, 2024

MeshLRM: Large Reconstruction Model for High-Quality Mesh.
CoRR, 2024

Single-View 3D Human Digitalization with Large Reconstruction Models.
CoRR, 2024

DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Instant3D: Fast Text-to-3D with Sparse-view Generation and Large Reconstruction Model.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

LRM: Large Reconstruction Model for Single Image to 3D.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

SOHES: Self-supervised Open-world Hierarchical Entity Segmentation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting.
Proceedings of the Computer Vision - ECCV 2024, 2024

Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Boosting Punctuation Restoration with Data Generation and Reinforcement Learning.
CoRR, 2023

Boosting Punctuation Restoration with Data Generation and Reinforcement Learning.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Scaling Data Generation in Vision-and-Language Navigation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Navigational Visual Representations with Semantic Map Supervision.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
CLEAR: Improving Vision-Language Navigation with Cross-Lingual, Environment-Agnostic Representations.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

How Much Can CLIP Benefit Vision-and-Language Tasks?
Proceedings of the Tenth International Conference on Learning Representations, 2022

Envedit: Environment Editing for Vision-and-Language Navigation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Scientific Chart Summarization: Datasets and Improved Text Modeling.
Proceedings of the Workshop on Scientific Document Understanding co-located with 36th AAAI Conference on Artificial Intelligence, 2022

2021
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning.
CoRR, 2021

VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Improving Cross-Modal Alignment in Vision Language Navigation via Syntactic Information.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Unifying Vision-and-Language Tasks via Text Generation.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
Diagnosing the Environment Bias in Vision-and-Language Navigation.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Enabling Robots to Understand Incomplete Natural Language Instructions Using Commonsense Reasoning.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Modality-Balanced Models for Visual Dialogue.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

LXMERT: Learning Cross-Modality Encoder Representations from Transformers.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Expressing Visual Relationships via Language.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Object Ordering with Bidirectional Matchings for Visual Reasoning.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Source-Target Inference Models for Spatial Instruction Understanding.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

