Kris Makoto Kitani

Orcid: 0000-0002-9389-4060

Affiliations:
  • Carnegie Mellon University


According to our database, Kris Makoto Kitani authored at least 268 papers between 2008 and 2024.

Bibliography

2024
Harmony4D: A Video Dataset for In-The-Wild Close Human Interactions.
CoRR, 2024

Human Action Anticipation: A Survey.
CoRR, 2024

Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos.
CoRR, 2024

Multi-Modal Diffusion for Hand-Object Grasp Generation.
CoRR, 2024

Unlocking Exocentric Video-Language Data for Egocentric Video Representation Learning.
CoRR, 2024

ExpertAF: Expert Actionable Feedback from Video.
CoRR, 2024

Grasping Diverse Objects with Simulated Humanoids.
CoRR, 2024

DegustaBot: Zero-Shot Visual Preference Estimation for Personalized Multi-Object Rearrangement.
CoRR, 2024

SMPLOlympics: Sports Environments for Physically Simulated Humanoids.
CoRR, 2024

OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning.
CoRR, 2024

Generalizable Neural Human Renderer.
CoRR, 2024

Zero-Shot Multi-Object Shape Completion.
CoRR, 2024

Real-Time Simulated Avatar from Head-Mounted Sensors.
CoRR, 2024

Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation.
CoRR, 2024

Mixed Gaussian Flow for Diverse Trajectory Prediction.
CoRR, 2024

Multi-Person 3D Pose Estimation from Multi-View Uncalibrated Depth Cameras.
CoRR, 2024

SolePoser: Full Body Pose Estimation using a Single Pair of Insole Sensor.
Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 2024

Dual-Modal 3D Human Pose Estimation using Insole Foot Pressure Sensors.
Proceedings of the IEEE International Symposium on Mixed and Augmented Reality Adjunct, 2024

JaywalkerVR: A VR System for Collecting Safety-Critical Pedestrian-Vehicle Interactions.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Multi-Object Tracking by Hierarchical Visual Representations.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Universal Humanoid Motion Representations for Physics-Based Control.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Zero-Shot Multi-object Scene Completion.
Proceedings of the Computer Vision - ECCV 2024, 2024

Video Question Answering with Procedural Programs.
Proceedings of the Computer Vision - ECCV 2024, 2024

EgoSG: Learning 3D Scene Graphs from Egocentric RGB-D Sequences.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Flexible Depth Completion for Sparse and Varying Point Densities.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Real-Time Simulated Avatar from Head-Mounted Sensors.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Bootstrapping Linear Models for Fast Online Adaptation in Human-Agent Collaboration.
Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

2023
Multi-View Person Matching and 3D Pose Estimation with Arbitrary Uncalibrated Camera Networks.
CoRR, 2023

Zero-Shot Video Question Answering with Procedural Programs.
CoRR, 2023

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.
CoRR, 2023

Evaluating a VR System for Collecting Safety-Critical Vehicle-Pedestrian Interactions.
CoRR, 2023

3D-CLFusion: Fast Text-to-3D Rendering with Contrastive Latent Diffusion.
CoRR, 2023

Type-to-Track: Retrieve Any Object via Prompt-based Tracking.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Cost-Aware Evaluation and Model Scaling for LiDAR-Based 3D Object Detection.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Time Will Tell: New Outlooks and A Baseline for Temporal Multi-View 3D Object Detection.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Deep OC-Sort: Multi-Pedestrian Tracking by Adaptive Re-Identification.
Proceedings of the IEEE International Conference on Image Processing, 2023

Joint Metrics Matter: A Better Standard for Trajectory Forecasting.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

EgoHumans: An Egocentric 3D Multi-Human Benchmark.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Deformer: Dynamic Fusion Transformer for Robust Hand Pose Estimation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

TEMPO: Efficient Multi-View Pose Estimation, Tracking, and Forecasting.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Perpetual Humanoid Control for Real-time Simulated Avatars.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ST-MVDNet++: Improve Vehicle Detection with Lidar-Radar Geometrical Augmentation via Self-Training.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Towards Online Adaptation for Autonomous Household Assistants.
Proceedings of the Companion of the 2023 ACM/IEEE International Conference on Human-Robot Interaction, 2023

Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Azimuth Super-Resolution for FMCW Radar in Autonomous Driving.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Origami Sensei: Mixed Reality AI-Assistant for Creative Tasks Using Hands.
Proceedings of the Companion Publication of the 2023 ACM Designing Interactive Systems Conference, 2023

2022
Spatiotemporal Video Highlight by Neural Network Considering Gaze and Hands of Surgeon in Egocentric Surgical Videos.
J. Medical Robotics Res., 2022

HARMONIC: A multimodal dataset of assistive human-robot collaboration.
Int. J. Robotics Res., 2022

3D-Aware Encoding for Style-based Neural Radiance Fields.
CoRR, 2022

From Universal Humanoid Control to Automatic Physically Valid Character Creation.
CoRR, 2022

No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

Embodied Scene-aware Human Pose Estimation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Online No-regret Model-Based Meta RL for Personalized Navigation.
Proceedings of the Learning for Dynamics and Control Conference, 2022

Learnable Spatio-Temporal Map Embeddings for Deep Inertial Localization.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer.
Proceedings of the International Conference on Machine Learning, 2022

Wisdom of Committees: An Overlooked Approach To Faster and More Accurate Models.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Transform2Act: Learning a Transform-and-Control Policy for Efficient Agent Design.
Proceedings of the Tenth International Conference on Learning Representations, 2022

S2Net: Stochastic Sequential Pointcloud Forecasting.
Proceedings of the Computer Vision - ECCV 2022, 2022

Domain Adaptive Hand Keypoint and Pixel Localization in the Wild.
Proceedings of the Computer Vision - ECCV 2022, 2022

Whose Track Is It Anyway? Improving Robustness to Tracking Errors with Affinity-based Trajectory Prediction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Modality-Agnostic Learning for Radar-Lidar Fusion in Vehicle Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Cross-Domain Adaptive Teacher for Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Occluded Human Mesh Recovery.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Sequential Voting with Relational Box Fields for Active Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

HERD: Continuous Human-to-Robot Evolution for Learning from Human Demonstration.
Proceedings of the Conference on Robot Learning, 2022

Multi-View Multi-Person 3D Pose Estimation with Uncalibrated Camera Networks.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

Track Targets by Dense Spatio-Temporal Position Encoding.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021
PTP: Parallelized Tracking and Prediction With Graph Neural Networks and Diversity Sampling.
IEEE Robotics Autom. Lett., 2021

Helping People Through Space and Time: Assistance as a Perspective on Human-Robot Interaction.
Frontiers Robotics AI, 2021

Cross-Domain Object Detection via Adaptive Self-Training.
CoRR, 2021

Sequential Decision-Making for Active Object Detection from Hand.
CoRR, 2021

Ego4D: Around the World in 3,000 Hours of Egocentric Video.
CoRR, 2021

RePOSE: Real-Time Iterative Rendering and Refinement for 6D Object Pose Estimation.
CoRR, 2021

Efficient Model Performance Estimation via Feature Histories.
CoRR, 2021

DeepBLE: Generalizing RSSI-based Localization Across Different Devices.
CoRR, 2021

Audio-Visual Self-Supervised Terrain Type Recognition for Ground Mobile Platforms.
IEEE Access, 2021

Learning Shape Representations for Person Re-Identification under Clothing Change.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Dynamics-regulated kinematic policy for egocentric pose estimation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

KDFNet: Learning Keypoint Distance Field for 6D Object Pose Estimation.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Joint Object Detection and Multi-Object Tracking with Graph Neural Networks.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

Crack Detection and Refinement Via Deep Reinforcement Learning.
Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Rethinking Transformer-based Set Prediction for Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Multi-Echo LiDAR for 3D Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose Estimation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Visio-Temporal Attention for Multi-Camera Multi-Target Association.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

RePOSE: Fast 6D Object Pose Refinement via Deep Texture Rendering.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Wide-Baseline Multi-Camera Calibration Using Person Re-Identification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

SimPoE: Simulated Character Control for 3D Human Pose Estimation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

V-MAO: Generative Modeling for Multi-Arm Manipulation of Articulated Objects.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

Neighborhood-Aware Neural Architecture Search.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Multi-Modality Task Cascade for 3D Object Detection.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

IDOL: Inertial Deep Orientation-Estimation and Localization.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Inverse Reinforcement Learning with Explicit Policy Estimates.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Learning Context-dependent Personal Preferences for Adaptive Recommendation.
ACM Trans. Interact. Intell. Syst., 2020

First-Person Activity Forecasting from Video with Online Inverse Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Human motion trajectory prediction: a survey.
Int. J. Robotics Res., 2020

Virtual navigation for blind people: Transferring route knowledge to the real-World.
Int. J. Hum. Comput. Stud., 2020

AutoSelect: Automatic and Dynamic Detection Selection for 3D Multi-Object Tracking.
CoRR, 2020

Multiple Networks are More Efficient than One: Fast and Accurate Models via Ensembles and Cascades.
CoRR, 2020

Kinematics-Guided Reinforcement Learning for Object-Aware 3D Ego-Pose Estimation.
CoRR, 2020

Audio-Visual Self-Supervised Terrain Type Discovery for Mobile Platforms.
CoRR, 2020

End-to-End 3D Multi-Object Tracking and Trajectory Forecasting.
CoRR, 2020

Few-Shot Learning with Intra-Class Knowledge Transfer.
CoRR, 2020

Graph Neural Networks for 3D Multi-Object Tracking.
CoRR, 2020

AB3DMOT: A Baseline for 3D Multi-Object Tracking and New Evaluation Metrics.
CoRR, 2020

Joint Detection and Multi-Object Tracking with Graph Neural Networks.
CoRR, 2020

GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking with Multi-Feature Learning.
CoRR, 2020

No-Reference Image Quality Assessment via Feature Fusion and Multi-Task Learning.
CoRR, 2020

Unsupervised Sequence Forecasting of 100,000 Points for Unsupervised Trajectory Forecasting.
CoRR, 2020

Joint 3D Tracking and Forecasting with Graph Neural Network and Diversity Sampling.
CoRR, 2020

Learning Shape Representations for Clothing Variations in Person Re-Identification.
CoRR, 2020

Estimating 3D Camera Pose from 2D Pedestrian Trajectories.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Back-Hand-Pose: 3D Hand Pose Estimation for a Wrist-worn Camera via Dorsum Deformation Network.
Proceedings of the UIST '20: The 33rd Annual ACM Symposium on User Interface Software and Technology, 2020

MonoEye: Multimodal Human Motion Capture System Using A Single Ultra-Wide Fisheye Camera.
Proceedings of the UIST '20: The 33rd Annual ACM Symposium on User Interface Software and Technology, 2020

Examining the Effects of Anticipatory Robot Assistance on Human Decision Making.
Proceedings of the Social Robotics - 12th International Conference, 2020

Residual Force Control for Agile Human Behavior Imitation and Extended Motion Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

3D Multi-Object Tracking: A Baseline and New Evaluation Metrics.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

When We First Met: Visual-Inertial Person Localization for Co-Robot Rendezvous.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Diverse Trajectory Forecasting with Determinantal Point Processes.
Proceedings of the 8th International Conference on Learning Representations, 2020

DLow: Diversifying Latent Flows for Diverse Human Motion Prediction.
Proceedings of the Computer Vision - ECCV 2020, 2020

AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification.
Proceedings of the Computer Vision - ECCV 2020, 2020

Efficient Non-Line-of-Sight Imaging from Transient Sinograms.
Proceedings of the Computer Vision - ECCV 2020, 2020

Neural Batch Sampling with Reinforcement Learning for Semi-supervised Anomaly Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking With 2D-3D Multi-Feature Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Optical Non-Line-of-Sight Physics-Based 3D Human Pose Estimation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Generative Hybrid Representations for Activity Forecasting With No-Regret Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud Forecasting for Sequential Pose Forecasting.
Proceedings of the 4th Conference on Robot Learning, 2020

Twitter A11y: A Browser Extension to Make Twitter Images Accessible.
Proceedings of the CHI '20: CHI Conference on Human Factors in Computing Systems, 2020

ReCog: Supporting Blind People in Recognizing Personal Objects.
Proceedings of the CHI '20: CHI Conference on Human Factors in Computing Systems, 2020

Importance of Self-Consistency in Active Learning for Semantic Segmentation.
Proceedings of the 31st British Machine Vision Conference 2020, 2020

MGpi: A Computational Model of Multiagent Group Perception and Interaction.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Making GIFs Accessible.
Proceedings of the ASSETS '20: The 22nd International ACM SIGACCESS Conference on Computers and Accessibility, 2020

3D Human Motion Estimation via Motion Compression and Refinement.
Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020

2019
NavCog3 in the Wild: Large-scale Blind Indoor Navigation Assistant with Semantic Features.
ACM Trans. Access. Comput., 2019

Smartphone-based localization for blind navigation in building-scale indoor environments.
Pervasive Mob. Comput., 2019

A Baseline for 3D Multi-Object Tracking.
CoRR, 2019

Future Near-Collision Prediction from Monocular Video: Feasibility, Dataset, and Challenges.
CoRR, 2019

Modeling Social Group Communication with Multi-Agent Imitation Learning.
CoRR, 2019

Adversarial domain adaptation for cross data source macromolecule in situ structural classification in cellular electron cryo-tomograms.
Bioinform., 2019

"It's almost like they're trying to hide it": How User-Provided Image Descriptions Have Failed to Make Twitter Accessible.
Proceedings of the World Wide Web Conference, 2019

Domain Randomization for Scene-Specific Car Detection and Pose Estimation.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

ADA: Adversarial Data Augmentation for Object Detection.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

An Independent and Interactive Museum Experience for Blind People.
Proceedings of the 16th Web For All 2019 Conference - Personalizing the Web, 2019

Impact of Expertise on Interaction Preferences for Navigation Assistance of Visually Impaired Individuals.
Proceedings of the 16th Web For All 2019 Conference - Personalizing the Web, 2019

GroundNet: Monocular Ground Plane Normal Estimation with Geometric Consistency.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Forecasting Time-to-Collision from Monocular Video: Feasibility, Dataset, and Challenges.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

A-EXP4: Online Social Policy Learning for Adaptive Robot-Pedestrian Interaction.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Directed-Info GAIL: Learning Hierarchical Policies from Unsegmented Demonstrations using Directed Information.
Proceedings of the 7th International Conference on Learning Representations, 2019

Learnable Embedding Space for Efficient Neural Architecture Compression.
Proceedings of the 7th International Conference on Learning Representations, 2019

Improving Lesion Segmentation for Diabetic Retinopathy Using Adversarial Learning.
Proceedings of the Image Analysis and Recognition - 16th International Conference, 2019

Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

PRECOG: PREdiction Conditioned on Goals in Visual Multi-Agent Settings.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Incremental Class Discovery for Semantic Segmentation With RGBD Sensing.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Ego-Pose Estimation and Forecasting As Real-Time PD Control.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

BBeep: A Sonic Collision Avoidance System for Blind Travellers and Nearby Pedestrians.
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 2019

Airport Accessibility and Navigation Assistance for People with Visual Impairments.
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 2019

Learning Spatio-Temporal Features with Two-Stream Deep 3D CNNs for Lipreading.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

CaBot: Designing and Evaluating an Autonomous Navigation Robot for Blind People.
Proceedings of the 21st International ACM SIGACCESS Conference on Computers and Accessibility, 2019

2018
Ego-Surfing: Person Localization in First-Person Videos Using Ego-Motion Signatures.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Deep Learning and Geometry-based Image Localization Enhanced by Bluetooth Signals.
J. Inf. Process., 2018

Variability in Reactions to Instructional Guidance during Smartphone-Based Assisted Navigation of Blind Users.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2018

Crowdsourcing the Installation and Maintenance of Indoor Localization Infrastructure to Support Blind Navigation.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2018

Synthesizing a Scene-Specific Pedestrian Detector and Pose Estimator for Static Video Surveillance - Can We Learn Pedestrian Detectors and Pose Estimators Without Real Data?
Int. J. Comput. Vis., 2018

VADRA: Visual Adversarial Domain Randomization and Augmentation.
CoRR, 2018

GroundNet: Segmentation-Aware Monocular Ground Plane Estimation with Geometric Consistency.
CoRR, 2018

Softer-NMS: Rethinking Bounding Box Regression for Accurate Object Detection.
CoRR, 2018

Understanding hand-object manipulation by modeling the contextual relationship between actions, grasp types and object attributes.
CoRR, 2018

Human-Interactive Subgoal Supervision for Efficient Inverse Reinforcement Learning.
CoRR, 2018

Personalized Dynamics Models for Adaptive Assistive Navigation Interfaces.
CoRR, 2018

SmartPartNet: Part-Informed Person Detection for Body-Worn Smartphones.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Rotational Rectification Network: Enabling Pedestrian Detection for Mobile Vision.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Recognizing Visual Signatures of Spontaneous Head Gestures.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Deep Radio-Visual Localization.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

How Context and User Behavior Affect Indoor Navigation Assistance for Blind People.
Proceedings of the 15th Web for All Conference, 2018

Informedia @ TRECVID 2018: Ad-hoc Video Search, Video to Text Description, Activities in Extended video.
Proceedings of the 2018 TREC Video Retrieval Evaluation, 2018

Semi-automated home-based therapy for the upper extremity of stroke survivors.
Proceedings of the 11th PErvasive Technologies Related to Assistive Environments Conference, 2018

Smartphone-based Indoor Localization for Blind Navigation across Building Complexes.
Proceedings of the 2018 IEEE International Conference on Pervasive Computing and Communications, 2018

Modeling Expertise in Assistive Navigation Interfaces for Blind People.
Proceedings of the 23rd International Conference on Intelligent User Interfaces, 2018

N2N learning: Network to Network Compression via Policy Gradient Reinforcement Learning.
Proceedings of the 6th International Conference on Learning Representations, 2018

3D Ego-Pose Estimation via Imitation Learning.
Proceedings of the Computer Vision - ECCV 2018, 2018

r2p2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting.
Proceedings of the Computer Vision - ECCV 2018, 2018

Learning Neural Parsers with Deterministic Differentiable Imitation Learning.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018

Personalized Dynamics Models for Adaptive Assistive Navigation Systems.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018

Environmental Factors in Indoor Navigation Based on Real-World Trajectories of Blind Users.
Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 2018

Error Correction Maximization for Deep Image Hashing.
Proceedings of the British Machine Vision Conference 2018, 2018

The Present and Future of Museum Accessibility for People with Visual Impairments.
Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility, 2018

Efficient K-Shot Learning With Regularized Deep Networks.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Phase-Parametric Policies for Reinforcement Learning in Cyclic Environments.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Guest Editorial Special Issue on Wearable and Ego-Vision Systems for Augmented Experience.
IEEE Trans. Hum. Mach. Syst., 2017

An Ego-Vision System for Hand Grasp Analysis.
IEEE Trans. Hum. Mach. Syst., 2017

Adversarially Optimizing Intersection over Union for Object Localization Tasks.
CoRR, 2017

Inverse Reinforcement Learning with Conditional Choice Probabilities.
CoRR, 2017

Rotational Rectification Network for Robust Pedestrian Detection.
CoRR, 2017

Beacon-Guided Structure from Motion for Smartphone-Based Navigation.
Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision, 2017

Achieving Practical and Accurate Indoor Navigation for People with Visual Impairments.
Proceedings of the 14th Web for All Conference, 2017

Predictive-State Decoders: Encoding the Future into Recurrent Networks.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

EyeQual: Accurate, Explainable, Retinal Image Quality Assessment.
Proceedings of the 16th IEEE International Conference on Machine Learning and Applications, 2017

The Visual Object Tracking VOT2017 Challenge Results.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Privacy-Preserving Visual Learning Using Doubly Permuted Homomorphic Encryption.
Proceedings of the IEEE International Conference on Computer Vision, 2017

First-Person Activity Forecasting with Online Inverse Reinforcement Learning.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Inference Machines for supervised Bluetooth localization.
Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Forecasting Interactive Dynamics of Pedestrians with Fictitious Play.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

People with Visual Impairment Training Personal Object Recognizers: Feasibility and Challenges.
Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 2017

Video segmentation and stabilization for BallCam.
Proceedings of the 8th Augmented Human International Conference, 2017

NavCog3: An Evaluation of a Smartphone-Based Blind Indoor Navigation Assistant with Semantic Features in a Large-Scale Environment.
Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility, 2017

HOMER: An Interactive System for Home Based Stroke Rehabilitation.
Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility, 2017

Virtual Navigation for Blind People: Building Sequential Representations of the Real-World.
Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility, 2017

Activity Forecasting: An Invitation to Predictive Perception.
Proceedings of the Group and Crowd Behavior for Computer Vision, 1st Edition, 2017

2016
Hybrid macro-micro visual analysis for city-scale state estimation.
Comput. Vis. Image Underst., 2016

Gesture-based Bootstrapping for Egocentric Hand Segmentation.
CoRR, 2016

Contextual Visual Similarity.
CoRR, 2016

In Teacher We Trust: Learning Compressed Models for Pedestrian Detection.
CoRR, 2016

Online Semantic Activity Forecasting with DARKO.
CoRR, 2016

A Game-Theoretic Approach to Multi-Pedestrian Activity Forecasting.
CoRR, 2016

Visual Compiler: Synthesizing a Scene-Specific Pedestrian Detector and Pose Estimator.
CoRR, 2016

Predicting wide receiver trajectories in American football.
Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision, 2016

Cutting through the clutter: Task-relevant features for image matching.
Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision, 2016

NavCog: turn-by-turn smartphone navigation assistant for people with visual impairments or blindness.
Proceedings of the 13th Web for All Conference, 2016

Activity-Aware Video Stabilization for BallCam.
Proceedings of the 29th Annual Symposium on User Interface Software and Technology, 2016

Understanding Hand-Object Manipulation with Grasp Types and Object Attributes.
Proceedings of the Robotics: Science and Systems XII, University of Michigan, Ann Arbor, Michigan, USA, June 18, 2016

NavCog: a navigational cognitive assistant for the blind.
Proceedings of the 18th International Conference on Human-Computer Interaction with Mobile Devices and Services, 2016

Visual Motif Discovery via First-Person Vision.
Proceedings of the Computer Vision - ECCV 2016, 2016

Recognizing Micro-Actions and Reactions from Paired Egocentric Videos.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Learning Action Maps of Large Environments via First-Person Vision.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Going Deeper into First-Person Activity Recognition.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

VizMap: Accessible Visual Information Through Crowdsourced Map Reconstruction.
Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility, 2016

Deep Supervised Hashing with Triplet Labels.
Proceedings of the Computer Vision - ACCV 2016, 2016

Long-Term Activity Forecasting Using First-Person Vision.
Proceedings of the Computer Vision - ACCV 2016, 2016

2015
Face Alignment Refinement.
Proceedings of the 2015 IEEE Winter Conference on Applications of Computer Vision, 2015

Hand parsing for fine-grained recognition of human grasps in monocular images.
Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015

A scalable approach for understanding the visual structures of hand grasps.
Proceedings of the IEEE International Conference on Robotics and Automation, 2015

Recognizing hand-object interactions in wearable camera videos.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Semantic video segmentation using both appearance and geometric information.
Proceedings of the Intelligent Robots and Computer Vision XXXII: Algorithms and Techniques, 2015

Ego-surfing first person videos.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

How do we use our hands? Discovering a diverse set of common grasps.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Learning scene-specific pedestrian detectors without real data.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Approximate MaxEnt Inverse Optimal Control and Its Application for Mental Simulation of Human Interactions.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Action-Reaction: Forecasting the Dynamics of Human Interaction.
Proceedings of the Computer Vision - ECCV 2014, 2014

An Introduction to the 3rd Workshop on Egocentric (First-Person) Vision.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014

Massive City-Scale Surface Condition Analysis Using Ground and Aerial Imagery.
Proceedings of the Computer Vision - ACCV 2014, 2014

Automating Stroke Rehabilitation for Home-Based Therapy.
Proceedings of the 2014 AAAI Fall Symposia, Arlington, Virginia, USA, November 13-15, 2014

2013
Multi-pose multi-target tracking for activity understanding.
Proceedings of the 2013 IEEE Workshop on Applications of Computer Vision, 2013

Model Recommendation with Virtual Probes for Egocentric Hand Detection.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Pixel-Level Hand Detection in Ego-centric Videos.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Experiencing the ball's POV for ballistic sports.
Proceedings of the 4th Augmented Human International Conference, 2013

2012
Ego-Action Analysis for First-Person Sports Videos.
IEEE Pervasive Comput., 2012

BallCam!: dynamic view synthesis from spinning cameras.
Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology, 2012

Activity Forecasting.
Proceedings of the Computer Vision - ECCV 2012, 2012

Detecting Interesting Events Using Unsupervised Density Ratio Estimation.
Proceedings of the Computer Vision - ECCV 2012. Workshops and Demonstrations, 2012

Coupling eye-motion and ego-motion features for first-person activity recognition.
Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012

Human-centric panoramic imaging stitching.
Proceedings of the 3rd Augmented Human International Conference, 2012

2011
Fast unsupervised ego-action learning for first-person sports videos.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

EdgeSonic: image feature sonification for the visually impaired.
Proceedings of the 2nd Augmented Human International Conference, 2011

Ego-motion analysis using average image data intensity.
Proceedings of the 2nd Augmented Human International Conference, 2011

2010
ImprovGenerator: Online Grammatical Induction for On-the-Fly Improvisation Accompaniment.
Proceedings of the 10th International Conference on New Interfaces for Musical Expression, 2010

3-D interaction with a large wall display using transparent markers.
Proceedings of the International Conference on Advanced Visual Interfaces, 2010

2009
Recognizing Multiple Objects via Regression Incorporating the Co-occurrence of Categories.
Proceedings of the Advances in Image and Video Technology, Third Pacific Rim Symposium, 2009

Using individuality to track individuals: Clustering individual trajectories in crowds using local appearance and frequency trait.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

2008
Recovering the Basic Structure of Human Activities from Noisy Video-Based Symbol Strings.
Int. J. Pattern Recognit. Artif. Intell., 2008

Recognizing Overlapped Human Activities from a Sequence of Primitive Actions via Deleted Interpolation.
Int. J. Pattern Recognit. Artif. Intell., 2008

