De-An Huang

ORCID: 0000-0002-6945-7768

According to our database, De-An Huang authored at least 58 papers between 2012 and 2024.

Bibliography

2024
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders.
CoRR, 2024

ARDuP: Active Region Video Diffusion for Universal Policies.
CoRR, 2024

X-VILA: Cross-Modality Alignment for Large Language Model.
CoRR, 2024

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching.
CoRR, 2024

Differentially Private Video Activity Recognition.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Eureka: Human-Level Reward Design via Coding Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

LITA: Language Instructed Temporal-Localization Assistant.
Proceedings of the Computer Vision - ECCV 2024, 2024

PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

What is Point Supervision Worth in Video Instance Segmentation?
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Capturing fine-grained details for video-based automation of suturing skills assessment.
Int. J. Comput. Assist. Radiol. Surg., March, 2023

Dr-Fairness: Dynamic Data Ratio Adjustment for Fair Training on Real and Generated Data.
Trans. Mach. Learn. Res., 2023

PerAda: Parameter-Efficient and Generalizable Federated Learning Personalization with Guarantees.
CoRR, 2023

Deep Multimodal Fusion for Surgical Feedback Classification.
Proceedings of the Machine Learning for Health, 2023

I²SB: Image-to-Image Schrödinger Bridge.
Proceedings of the International Conference on Machine Learning, 2023

Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

2022
PlaTe: Visually-Grounded Planning With Transformers in Procedural Tasks.
IEEE Robotics Autom. Lett., 2022

Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Pre-Trained Language Models for Interactive Decision-Making.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Identifying Auxiliary or Adversarial Tasks Using Necessary Condition Analysis for Adversarial Multi-task Video Understanding.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

2021
Auditing AI models for Verified Deployment under Semantic Specifications.
CoRR, 2021

SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
Purposive visual imitation for learning structured tasks from videos.
PhD thesis, 2020

Motion Reasoning for Goal-Based Imitation Learning.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

Procedure Planning in Instructional Videos.
Proceedings of the Computer Vision - ECCV 2020, 2020

Spatio-Temporal Graph for Video Captioning With Knowledge Distillation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
D³TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation.
CoRR, 2019

Action-Agnostic Human Pose Forecasting.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Regression Planning Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Continuous Relaxation of Symbolic Planner for One-Shot Imitation Learning.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Imitation Learning for Human Pose Prediction.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Learning to Decompose and Disentangle Representations for Video Prediction.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Temporal Modular Networks for Retrieving Complex Compositional Activities in Videos.
Proceedings of the Computer Vision - ECCV 2018, 2018

Dynamic Task Prioritization for Multitask Learning.
Proceedings of the Computer Vision - ECCV 2018, 2018

Neural Graph Matching Networks for Fewshot 3D Action Recognition.
Proceedings of the Computer Vision - ECCV 2018, 2018

What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Visual Forecasting by Imitating Dynamics in Natural Sequences.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Forecasting Interactive Dynamics of Pedestrians with Fictitious Play.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Unsupervised Learning of Long-Term Motion Dynamics for Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Activity Forecasting: An Invitation to Predictive Perception.
Proceedings of the Group and Crowd Behavior for Computer Vision, 1st Edition, 2017

2016
A Game-Theoretic Approach to Multi-Pedestrian Activity Forecasting.
CoRR, 2016

Connectionist Temporal Modeling for Weakly Supervised Action Labeling.
Proceedings of the Computer Vision - ECCV 2016, 2016

2015
How do we use our hands? Discovering a diverse set of common grasps.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Approximate MaxEnt Inverse Optimal Control and Its Application for Mental Simulation of Human Interactions.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Self-Learning Based Image Decomposition With Applications to Single Image Denoising.
IEEE Trans. Multim., 2014

Action-Reaction: Forecasting the Dynamics of Human Interaction.
Proceedings of the Computer Vision - ECCV 2014, 2014

2013
With one look: robust face recognition using single sample per person.
Proceedings of the ACM Multimedia Conference, 2013

Coupled Dictionary and Feature Space Learning with Applications to Cross-Domain Image Synthesis and Recognition.
Proceedings of the IEEE International Conference on Computer Vision, 2013

2012
Context-aware single image super-resolution using locality-constrained group sparse representation.
Proceedings of the 2012 Visual Communications and Image Processing, 2012

Self-Learning of Edge-Preserving Single Image Super-Resolution via Contourlet Transform.
Proceedings of the 2012 IEEE International Conference on Multimedia and Expo, 2012

Context-Aware Single Image Rain Removal.
Proceedings of the 2012 IEEE International Conference on Multimedia and Expo, 2012

Compiling program control flows into biochemical reactions.
Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012

