Yifei Huang

Orcid: 0000-0001-8067-6227

Affiliations:

University of Tokyo, Japan
Shanghai Jiao Tong University, China (until 2015)

According to our database, Yifei Huang authored at least 54 papers between 2017 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2024

Matching Compound Prototypes for Few-Shot Action Recognition.

Int. J. Comput. Vis., September, 2024

Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model.

CoRR, 2024

CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding.

CoRR, 2024

Pre-Training for 3D Hand Pose Estimation with Contrastive Learning on Large-Scale Hand Images in the Wild.

CoRR, 2024

EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation.

CoRR, 2024

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding.

CoRR, 2024

Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding.

CoRR, 2024

FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotation.

CoRR, 2024

Masked Video and Body-Worn IMU Autoencoder for Egocentric Action Recognition.

Proceedings of the Computer Vision - ECCV 2024, 2024

InternVideo2: Scaling Foundation Models for Multimodal Video Understanding.

Proceedings of the Computer Vision - ECCV 2024, 2024

ActionVOS: Actions as Prompts for Video Object Segmentation.

Proceedings of the Computer Vision - ECCV 2024, 2024

Retrieval-Augmented Egocentric Video Captioning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.

Triantafyllos Afouras

Santhosh Kumar Ramakrishnan

Oluwatumininu Oguntola

Giovanni Maria Farinella

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

MoVQA: A Benchmark of Versatile Question-Answering for Long-Form Movie Understanding.

CoRR, 2023

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.

CoRR, 2023

VideoLLM: Modeling Video Sequence with Large Language Models.

CoRR, 2023

Fine-grained Affordance Annotation for Egocentric Hand-Object Interaction Videos.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

3D Segmenter: 3D Transformer based Semantic Segmentation via 2D Panoramic Distillation.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Memory-and-Anticipation Transformer for Online Action Understanding.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Weakly Supervised Temporal Sentence Grounding with Uncertainty-Guided Self-training.

Yifei Huang

Lijin Yang

Yoichi Sato

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

First Bite/Chew: distinguish different types of food by first biting/chewing and the corresponding hand movement.

Proceedings of the Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 2023

Proposal-based Temporal Action Localization with Point-level Supervision.

Proceedings of the 34th British Machine Vision Conference 2023, 2023

First Bite/Chew: distinguish typical allergic food by two IMUs.

Proceedings of the Augmented Humans International Conference 2023, 2023

2022

Spatio-Temporal Perturbations for Video Attribution.

IEEE Trans. Circuits Syst. Video Technol., 2022

InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges.

CoRR, 2022

Precise Affordance Annotation for Egocentric Action Video Datasets.

CoRR, 2022

Seeing our Blind Spots: Smart Glasses-based Simulation to Increase Design Students' Awareness of Visual Impairment.

Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology, 2022

Inner self drawing machine.

Proceedings of the SIGGRAPH Asia 2022 Art Gallery, 2022

GazeSync: Eye Movement Transfer Using an Optical Eye Tracker and Monochrome Liquid Crystal Displays.

Proceedings of the IUI 2022: 27th International Conference on Intelligent User Interfaces, Helsinki, Finland, March 22 - 25, 2022, 2022

Compound Prototype Matching for Few-Shot Action Recognition.

Yifei Huang

Lijin Yang

Yoichi Sato

Proceedings of the Computer Vision - ECCV 2022, 2022

CLRNet: Cross Layer Refinement Network for Lane Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Interact before Align: Leveraging Cross-Modal Knowledge for Domain Adaptive Action Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Ego4D: Around the World in 3,000 Hours of Egocentric Video.

Santhosh Kumar Ramakrishnan

Christoph Feichtenhofer

Giovanni Maria Farinella

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Ego4D: Around the World in 3,000 Hours of Egocentric Video.

Santhosh Kumar Ramakrishnan

Christoph Feichtenhofer

Giovanni Maria Farinella

CoRR, 2021

EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2021: Team M3EM Technical Report.

CoRR, 2021

Towards Visually Explaining Video Understanding Networks with Perturbation.

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Precise Multi-Modal In-Hand Pose Estimation using Low-Precision Sensors for Robotic Assembly.

Proceedings of the IEEE International Conference on Robotics and Automation, 2021

Goal-Oriented Gaze Estimation for Zero-Shot Learning.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Stacked Temporal Attention: Improving First-person Action Recognition by Emphasizing Discriminative Clips.

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Leveraging Human Selective Attention for Medical Image Analysis with Limited Training Data.

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Commonsense Knowledge Aware Concept Selection For Diverse and Informative Visual Storytelling.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Mutual Context Network for Jointly Estimating Egocentric Gaze and Action.

IEEE Trans. Image Process., 2020

An Ego-Vision System for Discovering Human Joint Attention.

Yifei Huang

Minjie Cai

Yoichi Sato

IEEE Trans. Hum. Mach. Syst., 2020

Learn to Extract Building Outline from Misaligned Annotation through Nearest Feature Selector.

Remote. Sens., 2020

A Comprehensive Study on Visual Explanations for Spatio-temporal Networks.

CoRR, 2020

Learn to Recover Visible Color for Video Surveillance in a Day.

Proceedings of the Computer Vision - ECCV 2020, 2020

Improving Action Segmentation via Graph-Based Temporal Reasoning.

Yifei Huang

Yusuke Sugano

Yoichi Sato

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Mutual Context Network for Jointly Estimating Egocentric Gaze and Actions.

CoRR, 2019

Manipulation-Skill Assessment from Videos with Spatial Attention Network.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

2018

Predicting Gaze in Egocentric Video by Learning Task-Dependent Attention Transition.

Proceedings of the Computer Vision - ECCV 2018, 2018

Semantic Aware Attention Based Deep Object Co-segmentation.

Hong Chen

Yifei Huang

Hideki Nakayama

Proceedings of the Computer Vision - ACCV 2018, 2018

2017

Temporal Localization and Spatial Segmentation of Joint Attention in Multiple First-Person Videos.

Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017
