Haoxuan You

ORCID: 0000-0002-7912-4035

According to our database, Haoxuan You authored at least 30 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Detecting Multimodal Situations with Insufficient Context and Abstaining from Baseless Predictions.
CoRR, 2024

Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models.
CoRR, 2024

LLM-based Conversational AI Therapist for Daily Functioning Screening and Psychotherapeutic Intervention via Everyday Smart Devices.
CoRR, 2024

2023
Ferret: Refer and Ground Anything Anywhere at Any Granularity.
CoRR, 2023

CoBIT: A Contrastive Bi-directional Image-Text Generation Model.
CoRR, 2023

IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks.
CoRR, 2022

CLIP-TD: CLIP Targeted Distillation for Vision-Language Tasks.
CoRR, 2022

SHREC'22 track: Open-Set 3D Object Retrieval.
Comput. Graph., 2022

Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Find Someone Who: Visual Commonsense Understanding in Human-Centric Grounding.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training.
Proceedings of the Computer Vision - ECCV 2022, 2022

SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Graph-MLP: Node Classification without Message Passing in Graph.
CoRR, 2021

Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

2020
PointHop: An Explainable Machine Learning Method for Point Cloud Classification.
IEEE Trans. Multim., 2020

Weakly-supervised VisualBERT: Pre-training without Parallel Images and Captions.
CoRR, 2020

Learning Visual Commonsense for Robust Scene Graph Generation.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
PointDAN: A Multi-Scale 3D Domain Adaption Network for Point Cloud Representation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Decoding EEG by Visual-guided Deep Neural Networks.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Multi-Modality Latent Interaction Network for Visual Question Answering.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

PVRNet: Point-View Relation Neural Network for 3D Shape Recognition.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Hypergraph Neural Networks.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

MeshNet: Mesh Neural Network for 3D Shape Representation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition.
Proceedings of the 2018 ACM Multimedia Conference, 2018

2017
Restricting Greed in Training of Generative Adversarial Network.
CoRR, 2017
