Rahul Sukthankar

Affiliations:
  • Carnegie Mellon University, Pittsburgh, USA


According to our database, Rahul Sukthankar authored at least 169 papers between 1997 and 2023.

Awards

IEEE Fellow 2018, "For contributions to video understanding".

Bibliography

2023
Self-supervised Hypergraphs for Learning Multiple World Interpretations.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
UFO Depth: Unsupervised learning with flow-based odometry optimization for metric depth estimation.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022

Discrete Representations Strengthen Vision Transformer Robustness.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
HSPACE: Synthetic Parametric Humans Animated in Complex Environments.
CoRR, 2021

THUNDR: Transformer-based 3D HUmaN Reconstruction with Markers.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Neural Descent for Visual 3D Human Pose and Shape.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Depth Distillation: Unsupervised Metric Depth Estimation for UAVs by Finding Consensus Between Kinematics, Optical Flow and Deep Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Semi-Supervised Learning for Multi-Task Scene Understanding by Neural Graph Consensus.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Cognitive Mapping and Planning for Visual Navigation.
Int. J. Comput. Vis., 2020

The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020).
CoRR, 2020

Learning Video Representations from Textual Web Supervision.
CoRR, 2020

D3D: Distilled 3D Networks for Video Action Recognition.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

SelfieDroneStick: A Natural Interface for Quadcopter Photography.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows.
Proceedings of the Computer Vision - ECCV 2020, 2020

GHUM & GHUML: Generative 3D Human Shape and Articulated Pose Models.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Speech2Action: Cross-Modal Supervision for Action Recognition.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Selfie Drone Stick: A Natural Interface for Quadcopter Photography.
CoRR, 2019

Customizing Object Detectors for Indoor Robots.
Proceedings of the International Conference on Robotics and Automation, 2019

Relational Action Forecasting.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

An Efficient 3D CNN for Action/Object Segmentation in Video.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

2018
Deep Learning for Visual Understanding: Part 2 [From the Guest Editors].
IEEE Signal Process. Mag., 2018

Guest Editorial.
Comput. Vis. Image Underst., 2018

Modulated Policy Hierarchies.
CoRR, 2018

Object category learning and retrieval with weak supervision.
CoRR, 2018

Actor-Centric Relation Network.
Proceedings of the Computer Vision - ECCV 2018, 2018

The 2nd YouTube-8M Large-Scale Video Understanding Challenge.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Rethinking the Faster R-CNN Architecture for Temporal Action Localization.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Deep Learning for Visual Understanding [From the Guest Editors].
IEEE Signal Process. Mag., 2017

Video Object Discovery and Co-Segmentation with Extremely Weak Supervision.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Behavior Discovery and Alignment of Articulated Object Classes from Unstructured Video.
Int. J. Comput. Vis., 2017

The THUMOS challenge on action recognition for videos "in the wild".
Comput. Vis. Image Underst., 2017

SfM-Net: Learning of Structure and Motion from Video.
CoRR, 2017

WebVision Challenge: Visual Learning and Understanding With Web Data.
CoRR, 2017

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions.
CoRR, 2017

Motion Prediction Under Multimodality with Conditional Stochastic Networks.
CoRR, 2017

Traffic Lights with Auction-Based Controllers: Algorithms and Real-World Data.
CoRR, 2017

Robust Adversarial Reinforcement Learning.
Proceedings of the 34th International Conference on Machine Learning, 2017

Cloudlet-based just-in-time indexing of IoT video.
Proceedings of the Global Internet of Things Summit, 2017

Cognitive Mapping and Planning for Visual Navigation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Real-Time Temporal Action Localization in Untrimmed Videos by Sub-Action Discovery.
Proceedings of the British Machine Vision Conference 2017, 2017

2016
Variable Rate Image Compression with Recurrent Neural Networks.
Proceedings of the 4th International Conference on Learning Representations, 2016

Beyond Skip Connections: Top-Down Modulation for Object Detection.
CoRR, 2016

Physical and virtual cell phone sensors for traffic control: Algorithms and deployment impact.
Proceedings of the IEEE Sensors Applications Symposium, 2016

Selecting Vantage Points for an Autonomous Quadcopter Videographer.
Proceedings of the Twenty-Ninth International Florida Artificial Intelligence Research Society Conference, 2016

Discovering the Physical Parts of an Articulated Object Class from Multiple Videos.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Labeling the Features Not the Samples: Efficient Video Classification with Minimal Supervision.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Exploring the Benefits of Context in 3D Gesture Recognition for Game-Based Virtual Environments.
ACM Trans. Interact. Intell. Syst., 2015

Coreset-Based Adaptive Tracking.
CoRR, 2015

Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Micro-Auction-Based Traffic-Light Control: Responsive, Local Decision Making.
Proceedings of the IEEE 18th International Conference on Intelligent Transportation Systems, 2015

Approximating the Effects of Installed Traffic Lights: A Behaviorist Approach Based on Travel Tracks.
Proceedings of the IEEE 18th International Conference on Intelligent Transportation Systems, 2015

Preface to 3D Reconstruction and Understanding with Video and Sound.
Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop, 2015

Robust video segment proposals with painless occlusion handling.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Articulated motion discovery using pairs of trajectories.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

MatchNet: Unifying feature and metric learning for patch-based matching.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

The Virtues of Peer Pressure: A Simple Method for Discovering High-Value Mistakes.
Proceedings of the Computer Analysis of Images and Patterns, 2015

2014
Classification of Cinematographic Shots Using Lie Algebra and its Application to Complex Event Recognition.
IEEE Trans. Multim., 2014

Generalized Boundaries from Multiple Image Interpretations.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Recovering Spatiotemporal Correspondence between Deformable Objects by Exploiting Consistent Foreground Motion in Video.
CoRR, 2014

Thoughts on a Recursive Classifier Graph: a Multiclass Network for Deep Object Recognition.
CoRR, 2014

Features in Concert: Discriminative Feature Selection meets Unsupervised Clustering.
CoRR, 2014

Video Object Discovery and Co-segmentation with Extremely Weak Supervision.
Proceedings of the Computer Vision - ECCV 2014, 2014

DaMN - Discriminative and Mutually Nearest: Exploiting Pairwise Category Proximity for Video Action Recognition.
Proceedings of the Computer Vision - ECCV 2014, 2014

Large-Scale Video Classification with Convolutional Neural Networks.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Recognition of Complex Events: Exploiting Temporal Dynamics between Underlying Concepts.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
Exploring the Trade-off Between Accuracy and Observational Latency in Action Recognition.
Int. J. Comput. Vis., 2013

Reports of the 2013 AAAI Spring Symposium Series.
AI Mag., 2013

Multi-armed recommendation bandits for selecting state machine policies for robotic systems.
Proceedings of the 2013 IEEE International Conference on Robotics and Automation, 2013

Spatiotemporal Deformable Part Models for Action Detection.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Discriminative Segment Annotation in Weakly Labeled Video.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

CrowdCam: Instantaneous Navigation of Crowd Images Using Angled Graph.
Proceedings of the 2013 International Conference on 3D Vision, 2013

2012
Unsupervised Learning for Graph Matching.
Int. J. Comput. Vis., 2012

Classification of plant structures from uncalibrated image sequences.
Proceedings of the IEEE Workshop on Applications of Computer Vision, 2012

Importance-weighted label prediction for active learning with noisy annotations.
Proceedings of the 21st International Conference on Pattern Recognition, 2012

Classifier Ensemble Recommendation.
Proceedings of the Computer Vision - ECCV 2012. Workshops and Demonstrations, 2012

Efficient Closed-Form Solution to Generalized Boundary Detection.
Proceedings of the Computer Vision - ECCV 2012, 2012

Weakly Supervised Learning of Object Segmentations from Web-Scale Video.
Proceedings of the Computer Vision - ECCV 2012. Workshops and Demonstrations, 2012

Model recommendation for action recognition.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

D-Nets: Beyond patch-based image descriptors.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
A holistic approach to aesthetic enhancement of photographs.
ACM Trans. Multim. Comput. Commun. Appl., 2011

Fast and accurate global motion compensation.
Pattern Recognit., 2011

Large-Scale Multimedia Retrieval and Mining [Guest editors' introduction].
IEEE Multim., 2011

Incremental Relabeling for Active Learning with Noisy Crowdsourced Annotations.
Proceedings of the PASSAT/SocialCom 2011, Privacy, 2011

Measuring and reducing observational latency when recognizing actions.
Proceedings of the IEEE International Conference on Computer Vision Workshops, 2011

Feature seeding for action recognition.
Proceedings of the IEEE International Conference on Computer Vision, 2011

Prop-free pointing detection in dynamic cluttered environments.
Proceedings of the Ninth IEEE International Conference on Automatic Face and Gesture Recognition (FG 2011), 2011

Localizing actions through sequential 2D video projections.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2011

A probabilistic representation for efficient large scale visual recognition tasks.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Violence Detection in Video Using Computer Vision Techniques.
Proceedings of the Computer Analysis of Images and Patterns, 2011

Robust Active Learning Using Crowdsourced Annotations for Activity Recognition.
Proceedings of the Human Computation, 2011

2010
Guest Editors' Introduction: Labeling the World.
IEEE Pervasive Comput., 2010

A Boosting Framework for Visuality-Preserving Distance Metric Learning and Its Application to Medical Image Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., 2010

The unique strengths and storage access characteristics of discard-based search.
J. Internet Serv. Appl., 2010

Searching Complex Data Without an Index.
Int. J. Next Gener. Comput., 2010

Volumetric Features for Video Event Detection.
Int. J. Comput. Vis., 2010

Decentralized estimation and control of graph connectivity for mobile sensor networks.
Autom., 2010

Exploiting multi-level parallelism for low-latency activity recognition in streaming video.
Proceedings of the First Annual ACM SIGMM Conference on Multimedia Systems, 2010

A framework for photo-quality assessment and enhancement based on visual aesthetics.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Controlling your TV with gestures.
Proceedings of the 11th ACM SIGMM International Conference on Multimedia Information Retrieval, 2010

Motif Discovery and Feature Selection for CRF-based Activity Recognition.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

Representing Pairwise Spatial and Temporal Relations for Action Recognition.
Proceedings of the Computer Vision, 2010

Food recognition using statistics of pairwise local features.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Optimizing one-shot recognition with micro-set learning.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009
SLIPstream: scalable low-latency interactive perception on streaming data.
Proceedings of the Network and Operating System Support for Digital Audio and Video, 2009

An Integer Projected Fixed Point Method for Graph Matching and MAP Inference.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

The 1st workshop on large-scale multimedia retrieval and mining (LS-MMRM'09).
Proceedings of the 17th International Conference on Multimedia 2009, 2009

PFID: Pittsburgh fast-food image dataset.
Proceedings of the International Conference on Image Processing, 2009

Trajectons: Action recognition through the motion analysis of tracked features.
Proceedings of the 12th IEEE International Conference on Computer Vision Workshops, 2009

2008
Activity-Based Computing.
IEEE Pervasive Comput., 2008

Semi-supervised Learning with Weakly-Related Unlabeled Data: Towards Better Text Categorization.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Distributed online anomaly detection in high-content screening.
Proceedings of the 2008 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2008

Unifying discriminative visual codebook generation with classifier training for object category recognition.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

Learning class-specific affinities for image labelling.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

Fast Motion Consistency through Matrix Quantization.
Proceedings of the British Machine Vision Conference 2008, Leeds, UK, September 2008, 2008

Semi-Supervised Clustering via Learnt Codeword Distances.
Proceedings of the British Machine Vision Conference 2008, Leeds, UK, September 2008, 2008

2007
Shadow Elimination and Blinding Light Suppression for Interactive Projected Displays.
IEEE Trans. Vis. Comput. Graph., 2007

Feature-based Part Retrieval for Interactive 3D Reassembly.
Proceedings of the 8th IEEE Workshop on Applications of Computer Vision (WACV 2007), 2007

Bayesian Active Distance Metric Learning.
Proceedings of the UAI 2007, 2007

Learning distance metrics for interactive search-assisted diagnosis of mammograms.
Proceedings of the Medical Imaging 2007: Computer-Aided Diagnosis, San Diego, 2007

Interactive Search of Adipocytes in Large Collections of Digital Cellular Images.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Event Detection in Crowded Videos.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

Semi-supervised Collaborative Text Classification.
Proceedings of the Machine Learning: ECML 2007, 2007

Discriminative Cluster Refinement: Improving Object Category Recognition Given Limited Training Data.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

Beyond Local Appearance: Category Recognition from Pairwise Interactions of Simple Features.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

Spatio-temporal Shape and Flow Correlation for Action Recognition.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

2006
Distributed Inference in Dynamical Systems.
Proceedings of the Advances in Neural Information Processing Systems 19, 2006

Distributed localization of networked cameras.
Proceedings of the Fifth International Conference on Information Processing in Sensor Networks, 2006

Semantic Learning for Audio Applications: A Computer Vision Approach.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2006

Correlated Label Propagation with Application to Multi-label Learning.
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006

Industry and Object Recognition: Applications, Applied Research and Challenges.
Proceedings of the Toward Category-Level Object Recognition, 2006

An Efficient Algorithm for Local Distance Metric Learning.
Proceedings of the Proceedings, 2006

2005
Tools and Applications for Large-Scale Display Walls.
IEEE Computer Graphics and Applications, 2005

Tracking Locations of Moving Hand-Held Displays Using Projected Light.
Proceedings of the Pervasive Computing, 2005

IrisNet: an internet-scale architecture for multimedia sensors.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

A Robust Visual Odometry and Precipice Detection System Using Consumer-grade Monocular Vision.
Proceedings of the 2005 IEEE International Conference on Robotics and Automation, 2005

Evaluating keypoint methods for content-based copyright protection of digital images.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Efficient Visual Event Detection Using Volumetric Features.
Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV 2005), 2005

SOLAR: sound object localization and retrieval in complex audio environments.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Dynamic load balancing for distributed search.
Proceedings of the 14th IEEE International Symposium on High Performance Distributed Computing, 2005

Computer Vision for Music Identification: Video Demonstration.
Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), 2005

Computer Vision for Music Identification.
Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), 2005

2004
An efficient parts-based near-duplicate and sub-image retrieval system.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Forensic video reconstruction.
Proceedings of the ACM 2nd International Workshop on Video Surveillance & Sensor Networks, 2004

Techniques for evaluating optical flow for visual odometry in extreme terrain.
Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, September 28, 2004

SnapFind: brute force interactive image retrieval.
Proceedings of the Third International Conference on Image and Graphics, 2004

Diamond: A Storage Architecture for Early Discard in Interactive Search.
Proceedings of the FAST '04 Conference on File and Storage Technologies, March 31, 2004

PCA-SIFT: A More Distinctive Representation for Local Image Descriptors.
Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), with CD-ROM, 27 June, 2004

Object-Based Image Retrieval Using the Statistical Structure of Images.
Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), with CD-ROM, 27 June, 2004

A Flexible Projector-Camera System for Multi-Planar Displays.
Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), with CD-ROM, 27 June, 2004

Visual Odometry Using Commodity Optical Flow.
Proceedings of the Nineteenth National Conference on Artificial Intelligence, 2004

2003
Shadow Elimination and Occluder Light Suppression for Multi-Projector Displays.
Proceedings of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), 2003

2002
Scalable Alignment of Large-Format Multi-Projector Displays Using Camera Homography Trees.
Proceedings of the 13th IEEE Visualization Conference, 2002

A Theory of the Quasi-Static World.
Proceedings of the 16th International Conference on Pattern Recognition, 2002

Projected light displays using visual feedback.
Proceedings of the Seventh International Conference on Control, 2002

The OD Theory of TOD: The Use and Limits of Temporal Information for Object Discovery.
Proceedings of the Eighteenth National Conference on Artificial Intelligence and Fourteenth Conference on Innovative Applications of Artificial Intelligence, July 28, 2002

2001
Argus: The Digital Doorman.
IEEE Intell. Syst., 2001

Smarter Presentations: Exploiting Homography in Camera-Projector Systems.
Proceedings of the Eighth International Conference On Computer Vision (ICCV-01), Vancouver, British Columbia, Canada, July 7-14, 2001, 2001

Self-Calibrating Camera Projector Systems for Interactive Displays and Presentations.
Proceedings of the Eighth International Conference On Computer Vision (ICCV-01), Vancouver, British Columbia, Canada, July 7-14, 2001, 2001

Dynamic Shadow Elimination for Multi-Projector Displays.
Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), 2001

2000
Applying Machine Learning for High-Performance Named-Entity Extraction.
Comput. Intell., 2000

JKanji: Wavelet-Based Interactive Kanji Completion.
Proceedings of the 15th International Conference on Pattern Recognition, 2000

Complete Cross-Validation for Nearest Neighbor Classifiers.
Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Automatic Keystone Correction for Camera-Assisted Presentation Interfaces.
Proceedings of the Advances in Multimodal Interfaces, 2000

Memory-Based Face Recognition for Visitor Identification.
Proceedings of the 4th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2000), 2000

1999
JGram: Rapid Development of Multi-Agent Pipelines for Real-World Tasks.
Proceedings of the 1st International Symposium on Agent Systems and Applications / 3rd International Symposium on Mobile Agents (ASA/MA '99), 1999

ARGUS: An Automated Multi-Agent Visitor Identification System.
Proceedings of the Sixteenth National Conference on Artificial Intelligence and Eleventh Conference on Innovative Applications of Artificial Intelligence, 1999

1998
Multiple Adaptive Agents for Tactical Driving.
Appl. Intell., 1998

1997
Evolving an intelligent vehicle for tactical reasoning in traffic.
Proceedings of the 1997 IEEE International Conference on Robotics and Automation, 1997

