Alex Hauptmann

ORCID: 0000-0003-2123-0684

Affiliations:
  • Carnegie Mellon University, Pittsburgh, USA


According to our database, Alex Hauptmann authored at least 478 papers between 1986 and 2024.

Bibliography

2024
Combo: Co-speech holistic 3D human motion generation and efficient customizable adaptation in harmony.
CoRR, 2024

SHIELD: LLM-Driven Schema Induction for Predictive Analytics in EV Battery Supply Chain Disruptions.
CoRR, 2024

Multimodal Reranking for Knowledge-Intensive Visual Question Answering.
CoRR, 2024

MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis.
CoRR, 2024

Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions.
CoRR, 2024

Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning.
CoRR, 2024

Learning Visual-Semantic Subspace Representations for Propositional Reasoning.
CoRR, 2024

MM-TTS: A Unified Framework for Multimodal, Prompt-Induced Emotional Text-to-Speech Synthesis.
CoRR, 2024

Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward.
CoRR, 2024

Adversarially Masked Video Consistency for Unsupervised Domain Adaptation.
CoRR, 2024

Hyperbolic vs Euclidean Embeddings in Few-Shot Learning: Two Sides of the Same Coin.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Visual Grounding for User Interfaces.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, 2024

SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition.
Proceedings of the 2nd International Workshop on Multimodal and Responsible Affective Computing, 2024

VICAN: Very Efficient Calibration Algorithm for Large Camera Networks.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Language Model Beats Diffusion - Tokenizer is key to visual generation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

PhISANet: Phonetically Informed Speech Animation Network.
Proceedings of the IEEE International Conference on Acoustics, 2024

The Seven Faces of Stress: Understanding Facial Activity Patterns During Cognitive Stress.
Proceedings of the 18th IEEE International Conference on Automatic Face and Gesture Recognition, 2024

Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

Transitive Consistency Constrained Learning for Entity-to-Entity Stance Detection.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
TN-ZSTAD: Transferable Network for Zero-Shot Temporal Activity Detection.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Video Pivoting Unsupervised Multi-Modal Machine Translation.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Training Vision-Language Transformers from Captions.
Trans. Mach. Learn. Res., 2023

A Comprehensive Survey of Scene Graphs: Generation and Application.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

Document Entity Retrieval with Massive and Noisy Pre-training.
CoRR, 2023

ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules.
CoRR, 2023

SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Breaking The Limits of Text-conditioned 3D Motion Synthesis with Elaborative Descriptions.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DocumentNet: Bridging the Data Gap in Document Pre-training.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023, 2023

STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MAGVIT: Masked Generative Video Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Towards Open-Domain Twitter User Profile Inference.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Zero-Shot and Few-Shot Stance Detection on Varied Topics via Conditional Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

2022
Contrastive Adaptation Network for Single- and Multi-Source Domain Adaptation.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Deep Discrete Cross-Modal Hashing with Multiple Supervision.
Neurocomputing, 2022

Training Vision-Language Transformers from Captions Alone.
CoRR, 2022

Argus++: Robust Real-time Activity Detection for Unconstrained Video Streams with Overlapping Cube Proposals.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2022

TRM: Temporal Relocation Module for Video Recognition.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2022

KAT: A Knowledge Augmented Transformer for Vision-and-Language.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Rethinking Zero-shot Action Recognition: Learning from Latent Atomic Actions.
Proceedings of the Computer Vision - ECCV 2022, 2022

Speech Driven Tongue Animation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Rethinking Spatial Invariance of Convolutional Networks for Object Counting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Subspace Representation Learning for Few-shot Image Classification.
CoRR, 2021

Scene Graphs: A Survey of Generations and Applications.
CoRR, 2021

MSNet: A Multilevel Instance Segmentation Network for Natural Disaster Damage Assessment in Aerial Videos.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

iFetch: Multimodal Conversational Agents for the Online Fashion Marketplace.
Proceedings of the MuCAI'21: Proceedings of the 2nd ACM Multimedia Workshop on Multimodal Conversational AI, 2021

MuCAI'21: 2nd ACM Multimedia Workshop on Multimodal Conversational AI.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

MMPT'21: International Joint Workshop on Multi-Modal Pre-Training for Multimedia Understanding.
Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

Learning Unbiased Transformer for Long-Tail Sports Action Classification.
Proceedings of the Working Notes Proceedings of the MediaEval 2021 Workshop, 2021

Importance of Parasagittal Sensor Information in Tongue Motion Capture Through a Diphonic Analysis.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Person Search Challenges and Solutions: A Survey.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Support-set bottlenecks for video-text representation learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

Pose Guided Person Image Generation With Hidden P-Norm Regression.
Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Learning to Hallucinate Examples from Extrinsic and Intrinsic Supervision.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Statistical Distance Metric Learning for Image Set Retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Fuzzy Least Squares Support Vector Machine With Adaptive Membership for Object Tracking.
IEEE Trans. Multim., 2020

Deep Top-$k$ Ranking for Image-Sentence Matching.
IEEE Trans. Multim., 2020

Learning Distilled Graph for Large-Scale Social Network Data Clustering.
IEEE Trans. Knowl. Data Eng., 2020

Pair-based Uncertainty and Diversity Promoting Early Active Learning for Person Re-identification.
ACM Trans. Intell. Syst. Technol., 2020

Semantics-Preserving Graph Propagation for Zero-Shot Object Detection.
IEEE Trans. Image Process., 2020

Simultaneous Bearing Fault Recognition and Remaining Useful Life Prediction Using Joint-Loss Convolutional Neural Network.
IEEE Trans. Ind. Informatics, 2020

Few-shot activity recognition with cross-modal memory network.
Pattern Recognit., 2020

Spatial-Temporal Alignment Network for Action Recognition and Detection.
CoRR, 2020

From A Glance to "Gotcha": Interactive Facial Image Retrieval with Progressive Relevance Feedback.
CoRR, 2020

SimAug: Learning Robust Representations from 3D Simulation for Pedestrian Trajectory Prediction in Unseen Cameras.
CoRR, 2020

Adaptive Feature Aggregation for Video Object Detection.
Proceedings of the IEEE Winter Applications of Computer Vision Workshops, 2020

Argus: Efficient Activity Detection System for Extended Video Analysis.
Proceedings of the IEEE Winter Applications of Computer Vision Workshops, 2020

CMU Informedia at TRECVID 2020: Activity Detection with Dense Spatio-temporal Proposals.
Proceedings of the 2020 TREC Video Retrieval Evaluation, 2020

Pixel-Level Cycle Association: A New Perspective for Domain Adaptive Semantic Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

MuCAI'20: 1st International Workshop on Multimodal Conversational AI.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Forward and Backward Multimodal NMT for Improved Monolingual and Multilingual Cross-Modal Retrieval.
Proceedings of the 2020 on International Conference on Multimedia Retrieval, 2020

Stacked Pooling for Boosting Scale Invariance of Crowd Counting.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Event-Related Bias Removal for Real-time Disaster Events.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Gun Source and Muzzle Head Detection.
Proceedings of the Imaging and Multimedia Analytics in a Web and Mobile World 2020, 2020

SimAug: Learning Robust Representations from Simulation for Trajectory Prediction.
Proceedings of the Computer Vision - ECCV 2020, 2020

The Eighth Visual Object Tracking VOT2020 Challenge Results.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

Robust Long-Term Object Tracking via Improved Discriminative Model Prediction.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

ZSTAD: Zero-Shot Temporal Activity Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Zero-VIRUS*: Zero-shot Vehicle Route Understanding System for Intelligent Transportation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

ELECTRICITY: An Efficient Multi-camera Vehicle Tracking System for Intelligent City.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Adaptive Semi-Supervised Feature Selection for Cross-Modal Retrieval.
IEEE Trans. Multim., 2019

Generating Video Descriptions With Latent Topic Guidance.
IEEE Trans. Multim., 2019

Automatic Vacant Parking Places Management System Using Multicamera Vehicle Detection.
IEEE Trans. Intell. Transp. Syst., 2019

Scheduled sampling for one-shot learning via matching network.
Pattern Recognit., 2019

Focal Visual-Text Attention for Memex Question Answering.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Report of 2017 NSF Workshop on Multimedia Challenges, Opportunities and Research Roadmaps.
CoRR, 2019

Activitynet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos.
CoRR, 2019

Technical Report of the DAISY System - Shooter Localization, Models, Interface, and Beyond.
CoRR, 2019

Minding the Gaps in a Video Action Analysis Pipeline.
Proceedings of the IEEE Winter Applications of Computer Vision Workshops, 2019

Inf@TRECVID 2019: Instance Search Task.
Proceedings of the 2019 TREC Video Retrieval Evaluation, 2019

MMVG-INF-Etrol@TRECVID 2019: Activities in Extended Video.
Proceedings of the 2019 TREC Video Retrieval Evaluation, 2019

CMU-Informedia at TREC 2019 Incident Streams Track.
Proceedings of the Twenty-Eighth Text REtrieval Conference, 2019


ExCL: Extractive Clip Localization Using Natural Language Descriptions.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Shooter Localization Using Social Media Videos.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Annotation Efficient Cross-Modal Retrieval with Adversarial Attentive Alignment.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Improving the Learning of Multi-column Convolutional Neural Network for Crowd Counting.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

PANEL: Challenges for Multimedia/Multimodal Research in the Next Decade.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Improving What Cross-Modal Retrieval Models Learn through Object-Oriented Inter- and Intra-Modal Attention Networks.
Proceedings of the 2019 on International Conference on Multimedia Retrieval, 2019

Multi-shot Person Re-identification through Set Distance with Visual Distributional Representation.
Proceedings of the 2019 on International Conference on Multimedia Retrieval, 2019

Learning Sound Events from Webly Labeled Data.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Cross-Modal Transfer Hashing Based on Coherent Projection.
Proceedings of the IEEE International Conference on Multimedia & Expo Workshops, 2019

Learning Spatial Awareness to Improve Crowd Counting.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Peeking Into the Future: Predicting Future Person Activities and Locations in Videos.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Contrastive Adaptation Network for Unsupervised Domain Adaptation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Shooter Localization Using Videos in the Wild.
Proceedings of the 2019 International Conference on Content-Based Multimedia Indexing, 2019

Training-free Monocular 3D Event Detection System for Traffic Surveillance.
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

Unsupervised Bilingual Lexicon Induction from Mono-Lingual Multimodal Data.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Video Content Analysis.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Joint Attributes and Event Analysis for Multimedia Event Detection.
IEEE Trans. Neural Networks Learn. Syst., 2018

Adaptive Unsupervised Feature Selection With Structure Regularization.
IEEE Trans. Neural Networks Learn. Syst., 2018

Few-Shot Text and Image Classification via Analogical Transfer Learning.
ACM Trans. Intell. Syst. Technol., 2018

An Adaptive Semisupervised Feature Analysis for Video Semantic Recognition.
IEEE Trans. Cybern., 2018

Deep feature learning via structured graph Laplacian embedding for person re-identification.
Pattern Recognit., 2018

A unified framework with a benchmark dataset for surveillance event detection.
Neurocomputing, 2018

Perceiving Physical Equation by Observing Visual Scenarios.
CoRR, 2018

Accident Forecasting in CCTV Traffic Camera Videos.
CoRR, 2018

Activity Recognition on a Large Scale in Short Videos - Moments in Time Dataset.
CoRR, 2018

Stacked Pooling: Improving Crowd Counting by Boosting Scale Invariance.
CoRR, 2018

Learning Distributional Representation and Set Distance for Multi-shot Person Re-identification.
CoRR, 2018

RUC+CMU: System Report for Dense Captioning Events in Videos.
CoRR, 2018

A Closer Look at Weak Label Learning for Audio Events.
CoRR, 2018

Multimodal Co-Training for Selecting Good Examples from Webly Labeled Video.
CoRR, 2018

Informedia @ TRECVID 2018: Ad-hoc Video Search, Video to Text Description, Activities in Extended video.
Proceedings of the 2018 TREC Video Retrieval Evaluation, 2018


GNAS: A Greedy Neural Architecture Search Method for Multi-Attribute Learning.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Learning to Transfer: Generalizable Attribute Learning with Multitask Neural Model Search.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Multimodal Filtering of Social Media for Temporal Monitoring and Event Analysis.
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

Class-aware Self-Attention for Audio Event Recognition.
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

Distinction of stress and non-stress tasks using facial action units.
Proceedings of the International Conference on Multimodal Interaction: Adjunct, 2018

RCAA: Relational Context-Aware Agents for Person Search.
Proceedings of the Computer Vision - ECCV 2018, 2018

Focal Visual-Text Attention for Visual Question Answering.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

DecideNet: Counting Varying Density Crowds Through Attention Guided Detection and Density Estimation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

News Recommendation and Filter Bubble.
Proceedings of the CIKM 2018 Workshops co-located with 27th ACM International Conference on Information and Knowledge Management (CIKM 2018), 2018

Towards Independent Stress Detection: A Dependent Model Using Facial Action Units.
Proceedings of the 2018 International Conference on Content-Based Multimedia Indexing, 2018

Adaptive Context-aware Reinforced Agent for Handwritten Text Recognition.
Proceedings of the British Machine Vision Conference 2018, 2018

Traffic Danger Recognition With Surveillance Cameras Without Training Data.
Proceedings of the 15th IEEE International Conference on Advanced Video and Signal Based Surveillance, 2018

CADP: A Novel Dataset for CCTV Traffic Camera based Accident Analysis.
Proceedings of the 15th IEEE International Conference on Advanced Video and Signal Based Surveillance, 2018

Hidden Two-Stream Convolutional Networks for Action Recognition.
Proceedings of the Computer Vision - ACCV 2018, 2018

2017
The Many Shades of Negativity.
IEEE Trans. Multim., 2017

Feature Interaction Augmented Sparse Learning for Fast Kinect Motion Detection.
IEEE Trans. Image Process., 2017

Bi-Level Semantic Representation Analysis for Multimedia Event Detection.
IEEE Trans. Cybern., 2017

Avoiding Optimal Mean $\ell_{2,1}$-Norm Maximization-Based Robust PCA for Reconstruction.
Neural Comput., 2017

Efficient human action recognition using histograms of motion gradients and VLAD with descriptor shape information.
Multim. Tools Appl., 2017

Uncovering the Temporal Context for Video Question Answering.
Int. J. Comput. Vis., 2017

Simple to complex cross-modal learning to rank.
Comput. Vis. Image Underst., 2017

MemexQA: Visual Memex Question Answering.
CoRR, 2017

Guided Optical Flow Learning.
CoRR, 2017

Simple to Complex Cross-modal Learning to Rank.
CoRR, 2017

Deep Local Video Feature for Action Recognition.
CoRR, 2017

Video Representation Learning and Latent Concept Mining for Large-scale Multi-label Video Classification.
CoRR, 2017

Deep Feature Learning via Structured Graph Laplacian Embedding for Person Re-Identification.
CoRR, 2017

Delving Deep into Personal Photo and Video Search.
Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, 2017

Informedia @ TRECVID 2017.
Proceedings of the 2017 TREC Video Retrieval Evaluation, 2017

Video Search via Ranking Network with Very Few Query Exemplars.
Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017

MultiEdTech 2017: 1st International Workshop on Multimedia-based Educational and Knowledge Technologies for Personalized and Social Online Training.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Knowing Yourself: Improving Video Caption via In-depth Recap.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Video Captioning with Guidance of Multimodal Latent Topics.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Joint Saliency Estimation and Matching using Image Regions for Geo-Localization of Online Video.
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017

Leveraging Multi-modal Prior Knowledge for Large-scale Concept Learning in Noisy Web Data.
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017

Discriminative Dictionary Learning With Ranking Metric Embedded for Person Re-Identification.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Rewind to track: Parallelized apprenticeship learning with backward tracklets.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Complex Event Detection by Identifying Reliable Shots from Untrimmed Videos.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Temporal localization of audio events for conflict monitoring in social media.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Synchronization for multi-perspective videos in the wild.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Deep Local Video Feature for Action Recognition.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

Probabilistic Non-Negative Matrix Factorization and Its Robust Extensions for Topic Modeling.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Webly-Supervised Learning of Multimodal Video Detectors.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

An Event Reconstruction Tool for Conflict Monitoring Using Social Media.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Visual Memory QA: Your Personal Photo and Video Search Agent.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Dictionary pruning with visual word significance for medical image retrieval.
Neurocomputing, 2016

Smart computing for large scale visual data sensing and processing.
Neurocomputing, 2016

InfAR dataset: Infrared action recognition at different times.
Neurocomputing, 2016

Text-to-video: a semantic search engine for internet videos.
Int. J. Multim. Inf. Retr., 2016

Person Re-identification: Past, Present and Future.
CoRR, 2016

Strategies for Searching Video Content with Text Queries or Video Examples.
CoRR, 2016

Long-Term Identity-Aware Multi-Person Tracking for Surveillance Video Summarization.
CoRR, 2016

Exploiting Multi-modal Curriculum in Noisy Web Data for Large-scale Concept Learning.
CoRR, 2016

UTS-CMU-D2DCRC Submission at TRECVID 2016 Video Localization.
Proceedings of the 2016 TREC Video Retrieval Evaluation, 2016

Informedia @ TRECVID 2016.
Proceedings of the 2016 TREC Video Retrieval Evaluation, 2016

Which Information Sources are More Effective and Reliable in Video Search.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

Describing Videos using Multi-modal Fusion.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Avoiding Optimal Mean Robust PCA/2DPCA with Non-greedy $\ell_1$-Norm Maximization.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Learning to Detect Concepts from Webly-Labeled Video Data.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

The Solution Path Algorithm for Identity-Aware Multi-object Tracking.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

The Best of Both Worlds: Combining Data-Independent and Data-Driven Approaches for Action Recognition.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016

Histograms of Motion Gradients for real-time video classification.
Proceedings of the 14th International Workshop on Content-Based Multimedia Indexing, 2016

Concepts Not Alone: Exploring Pairwise Relationships for Zero-Shot Video Activity Recognition.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Dynamic Concept Composition for Zero-Example Event Detection.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Event Oriented Dictionary Learning for Complex Event Detection.
IEEE Trans. Image Process., 2015

Multi-view discriminative and structured dictionary learning with group sparsity for human action recognition.
Signal Process., 2015

Multi-Class Active Learning by Uncertainty Sampling with Diversity Maximization.
Int. J. Comput. Vis., 2015

Cross-Lingual Cross-Media Content Linking: Annotations and Joint Representations (Dagstuhl Seminar 15201).
Dagstuhl Reports, 2015

Uncovering Temporal Context for Video Question and Answering.
CoRR, 2015

The Best of Both Worlds: Combining Data-independent and Data-driven Approaches for Action Recognition.
CoRR, 2015

Handcrafted Local Features are Convolutional Neural Networks.
CoRR, 2015

Improving Human Activity Recognition Through Ranking and Re-ranking.
CoRR, 2015

Long-short Term Motion Feature for Action Classification and Retrieval.
CoRR, 2015

Beyond Spatial Pyramid Matching: Space-time Extended Descriptor for Action Recognition.
CoRR, 2015


WARD-CMU @ TRECVID 2015.
Proceedings of the 2015 TREC Video Retrieval Evaluation, 2015

Early Implementation Experience with Wearable Cognitive Assistance Applications.
Proceedings of the 2015 workshop on Wearable Systems and Applications, 2015

Fast and Accurate Content-based Semantic Search in 100M Internet Videos.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Image Profiling for History Events on the Fly.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Searching Persuasively: Joint Event Detection and Evidence Recounting with Limited Supervision.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Content-Based Video Search over 1 Million Videos with 1 Core in 1 Second.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Incremental Multimodal Query Construction for Video Search.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Bridging the Ultimate Semantic Gap: A Semantic Search Engine for Internet Videos.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Density Corrected Sparse Recovery when R.I.P. Condition Is Broken.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Semantic Concept Discovery for Large-Scale Zero-Shot Event Detection.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Exploiting Feature Hierarchies with Convolutional Neural Networks for Cultural Event Recognition.
Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop, 2015

A discriminative CNN video representation for event detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Beyond Gaussian Pyramid: Multi-skip Feature Stacking for action recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

DevNet: A Deep Event Network for multimedia event detection and evidence recounting.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Massive Open Online Proctor: Protecting the Credibility of MOOCs certificates.
Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, 2015

Ranking-Based Vocabulary Pruning in Bag-of-Features for Image Retrieval.
Proceedings of the Artificial Life and Computational Intelligence, 2015

Self-Paced Learning for Matrix Factorization.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

Complex Event Detection via Event Oriented Dictionary Learning.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

Self-Paced Curriculum Learning.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

Exploring Semantic Inter-Class Relationships (SIR) for Zero-Shot Action Recognition.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

Monitoring and Coaching the Use of Home Medical Devices.
Proceedings of the Health Monitoring and Personalized Feedback using Multimedia Data, 2015

Overview of Multimedia in Healthcare.
Proceedings of the Health Monitoring and Personalized Feedback using Multimedia Data, 2015

2014
Semi-Supervised Multiple Feature Analysis for Action Recognition.
IEEE Trans. Multim., 2014

Symbiotic Tracker Ensemble Toward A Unified Tracking Framework.
IEEE Trans. Circuits Syst. Video Technol., 2014

Knowledge Adaptation with Partially Shared Features for Event Detection Using Few Exemplars.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

E-LAMP: integration of innovative ideas for multimedia event detection.
Mach. Vis. Appl., 2014

Multimedia classification and event detection using double fusion.
Multim. Tools Appl., 2014

Enhanced and hierarchical structure algorithm for data imbalance problem in semantic extraction under massive video dataset.
Multim. Tools Appl., 2014

Harnessing Lab Knowledge for Real-World Action Recognition.
Int. J. Comput. Vis., 2014

Temporal Extension of Scale Pyramid and Spatial Pyramid Matching for Action Recognition.
CoRR, 2014

Self-Paced Learning with Diversity.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Resource Constrained Multimedia Event Detection.
Proceedings of the MultiMedia Modeling - 20th Anniversary International Conference, 2014

Instructional Videos for Unsupervised Harvesting and Learning of Action Examples.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Multiple Features But Few Labels?: A Symbiotic Solution Exemplified for Video Analysis.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

The Mystery of Faces: Investigating Face Contribution for Multimedia Event Detection.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Towards Efficient Learning of Optimal Spatial Bag-of-Words Representations.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Viral Video Style: A Closer Look at Viral Videos on YouTube.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Zero-Example Event Search using MultiModal Pseudo Relevance Feedback.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Interactive Surveillance Event Detection through Mid-level Discriminative Representation.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Unsupervised Video Adaptation for Parsing Human Motion.
Proceedings of the Computer Vision - ECCV 2014, 2014

Snippet Based Trajectory Statistics Histograms for Assistive Technologies.
Proceedings of the Computer Vision - ECCV 2014 Workshops, 2014

Event Detection Using Multi-level Relevance Labels and Multiple Features.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Everything is in the Face? Represent Faces with Object Bank.
Proceedings of the Computer Vision - ACCV 2014 Workshops, 2014

2013
Multi-Feature Fusion via Hierarchical Regression for Multimedia Analysis.
IEEE Trans. Multim., 2013

Feature Selection for Multimedia Analysis by Sharing Information Among Multiple Tasks.
IEEE Trans. Multim., 2013

Multimedia Event Detection Using A Classifier-Specific Intermediate Representation.
IEEE Trans. Multim., 2013

Infrared Patch-Image Model for Small Target Detection in a Single Image.
IEEE Trans. Image Process., 2013

The co-attention model for tiny activity analysis.
Neurocomputing, 2013

Beyond audio and video retrieval: topic-oriented multimedia summarization.
Int. J. Multim. Inf. Retr., 2013

The Future of Multimedia Analysis and Mining: Visions from the Shonan Meeting.
IEEE Multim., 2013

Corrigendum to cross-domain video concept detection: A joint discriminative and generative active learning approach [Expert Systems with Applications 39 (15) (2012) 12220-12228].
Expert Syst. Appl., 2013

Unified Dictionary Learning and Region Tagging with Hierarchical Sparse Representation.
Comput. Vis. Image Underst., 2013

Multi-camera Egocentric Activity Detection for Personal Assistant.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

Fall detection in multi-camera surveillance videos: experimentations and observations.
Proceedings of the 1st ACM international workshop on Multimedia indexing and information retrieval for healthcare, 2013

We are not equally negative: fine-grained labeling for multimedia event detection.
Proceedings of the ACM Multimedia Conference, 2013

Spatio-temporal fisher vector coding for surveillance event detection.
Proceedings of the ACM Multimedia Conference, 2013

A cognitive assistive system for monitoring the use of home medical devices.
Proceedings of the 1st ACM international workshop on Multimedia indexing and information retrieval for healthcare, 2013

ACM MM MIIRH 2013: workshop on multimedia indexing and information retrieval for healthcare.
Proceedings of the ACM Multimedia Conference, 2013

How Related Exemplars Help Complex Event Detection in Web Videos?
Proceedings of the IEEE International Conference on Computer Vision, 2013

Feature Weighting via Optimal Thresholding for Video Analysis.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Space-Time Robust Representation for Action Recognition.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Multimedia event detection using visual concept signatures.
Proceedings of the Multimedia Content and Mobile Devices 2013, 2013

Harry Potter's Marauder's Map: Localizing and Tracking Multiple Persons-of-Interest by Nonnegative Discretization.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Complex Event Detection via Multi-source Video Attributes.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012
Discriminating Joint Feature Analysis for Multimedia Data Understanding.
IEEE Trans. Multim., 2012

Web and Personal Image Annotation by Mining Label Correlation With Relaxed Visual Graph Embedding.
IEEE Trans. Image Process., 2012

Spline Regression Hashing for Fast Image Search.
IEEE Trans. Image Process., 2012

The Future of Multimedia Analysis and Mining (NII Shonan Meeting 2012-9).
NII Shonan Meet. Rep., 2012

A Framework for Classifier Adaptation for Large-Scale Multimedia Data.
Proc. IEEE, 2012

Web-Scale Multimedia Processing and Applications [Scanning the Issue].
Proc. IEEE, 2012

Societally connected multimedia across cultures.
J. Zhejiang Univ. Sci. C, 2012

Large-Scale Multimedia Data Collections.
IEEE Multim., 2012

Cross-domain video concept detection: A joint discriminative and generative active learning approach.
Expert Syst. Appl., 2012

SRI-Sarnoff AURORA System at TRECVID 2012 Multimedia Event Detection and Recounting.
Proceedings of the 2012 TREC Video Retrieval Evaluation, 2012

Symbiotic Black-Box Tracker.
Proceedings of the Advances in Multimedia Modeling - 18th International Conference, 2012

Double Fusion for Multimedia Event Detection.
Proceedings of the Advances in Multimedia Modeling - 18th International Conference, 2012

Knowledge adaptation for ad hoc multimedia event detection with few exemplars.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Leveraging high-level and low-level features for multimedia event detection.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Multimodal knowledge-based analysis in multimedia event detection.
Proceedings of the International Conference on Multimedia Retrieval, 2012

Classifier-specific intermediate representation for multimedia tasks.
Proceedings of the International Conference on Multimedia Retrieval, 2012

Beyond audio and video retrieval: towards multimedia summarization.
Proceedings of the International Conference on Multimedia Retrieval, 2012

Constrained keypoint quantization: towards better bag-of-words model for large-scale multimedia retrieval.
Proceedings of the International Conference on Multimedia Retrieval, 2012

Activity Recognition from RGB-D Camera with 3D Local Spatio-temporal Features.
Proceedings of the 2012 IEEE International Conference on Multimedia and Expo, 2012

Human action recognition using a Markovian conditional exponential model.
Proceedings of the Multimedia on Mobile Devices 2012; and Multimedia Content Access: Algorithms and Systems VI, 2012

Action recognition by exploring data distribution and feature correlation.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Learning to predict health status of geriatric patients from observational data.
Proceedings of the 2012 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, 2012

2011
Informedia@TRECVID 2011: Surveillance Event Detection.
Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

People detection based on appearance and motion models.
Proceedings of the 8th IEEE International Conference on Advanced Video and Signal-Based Surveillance, 2011

2010
A Multi-Pronged Approach to Improving Semantic Extraction of News Video.
J. Signal Process. Syst., 2010

Representations of Keypoint-Based Semantic Concept Detection: A Comprehensive Study.
IEEE Trans. Multim., 2010

The Application of Spatio-temporal Feature and Multi-Sensor in Home Medical Devices.
J. Digit. Content Technol. its Appl., 2010

Informedia @ TRECVID2010.
Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010

Joint-AL: Joint Discriminative and Generative Active Learning for Cross-Domain Semantic Concept Classification.
Proceedings of the 4th IEEE International Conference on Semantic Computing (ICSC 2010), 2010

Multi-camera Monitoring of Infusion Pump Use.
Proceedings of the 4th IEEE International Conference on Semantic Computing (ICSC 2010), 2010

Exploiting multi-level parallelism for low-latency activity recognition in streaming video.
Proceedings of the First Annual ACM SIGMM Conference on Multimedia Systems, 2010

Hybrid active learning for cross-domain video concept detection.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

ACM international workshop on very-large-scale multimedia corpus, mining and retrieval (VLS-MCMR'10).
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Explicit and implicit concept-based video retrieval with bipartite graph propagation model.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Controlling your TV with gestures.
Proceedings of the 11th ACM SIGMM International Conference on Multimedia Information Retrieval, 2010

Comparing Evaluation Protocols on the KTH Dataset.
Proceedings of the Human Behavior Understanding, First International Workshop, 2010

2009
Video Content Analysis.
Proceedings of the Encyclopedia of Database Systems, 2009

Real-Time Near-Duplicate Elimination for Web Video Search With Content and Context.
IEEE Trans. Multim., 2009

Informedia @ TRECVID2009: Analyzing Video Motions.
Proceedings of the TRECVID 2009 workshop participants notebook papers, 2009

Identifying news videos' ideological perspectives using emphatic patterns of visual concepts.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

ACM SIGMM the first workshop on web-scale multimedia corpus (WSMC09).
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Action recognition via local descriptors and holistic features.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009

2008
Multimodal News Story Clustering With Pairwise Visual Near-Duplicate Constraint.
IEEE Trans. Multim., 2008

Video Retrieval Based on Semantic Concepts.
Proc. IEEE, 2008

VideOlympics: Real-Time Evaluation of Multimedia Retrieval Systems.
IEEE Multim., 2008

Measuring novelty and redundancy with multiple modalities in cross-lingual broadcast news.
Comput. Vis. Image Underst., 2008

Informedia @ TRECVID2008: Exploring New Frontiers.
Proceedings of the TRECVID 2008 workshop participants notebook papers, 2008

Do These News Videos Portray a News Event from Different Ideological Perspectives?
Proceedings of the 2nd IEEE International Conference on Semantic Computing (ICSC 2008), 2008

A Joint Topic and Perspective Model for Ideological Discourse.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2008

Exploring the utility of fast-forward surrogates for bbc rushes.
Proceedings of the 2nd ACM Workshop on Video Summarization, 2008

A framework for classifier adaptation and its applications in concept detection.
Proceedings of the 1st ACM SIGMM International Conference on Multimedia Information Retrieval, 2008

Vox Populi Annotation: Measuring Intensity of Ideological Perspectives by Aggregating Group Judgments.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

The Aware Community.
Proceedings of the Second International Conference on Future Generation Communication and Networking, 2008

(Un)Reliability of video concept detection.
Proceedings of the 7th ACM International Conference on Image and Video Retrieval, 2008

Efficient search: the informedia video retrieval system.
Proceedings of the 7th ACM International Conference on Image and Video Retrieval, 2008

Identifying Ideological Perspectives of Web Videos Using Folksonomies.
Proceedings of the Multimedia Information Extraction, 2008

Multimedia Information Extraction Roadmap.
Proceedings of the Multimedia Information Extraction, 2008

2007
Can High-Level Concepts Fill the Semantic Gap in Video Retrieval? A Case Study With Broadcast News.
IEEE Trans. Multim., 2007

A review of text and image retrieval approaches for broadcast news video.
Inf. Retr., 2007

Summarizing BBC Rushes the Informedia Way.
Proceedings of the TRECVID 2007 workshop participants notebook papers, 2007

A Hybrid Approach to Improving Semantic Extraction of News Video.
Proceedings of the First IEEE International Conference on Semantic Computing (ICSC 2007), 2007

Exploring Concept Selection Strategies for Interactive Video Search.
Proceedings of the First IEEE International Conference on Semantic Computing (ICSC 2007), 2007

Harmonium Models for Semantic Video Representation and Classification.
Proceedings of the Seventh SIAM International Conference on Data Mining, 2007

Discriminative Fields for Modeling Semantic Concepts in Video.
Proceedings of the Computer-Assisted Information Retrieval (Recherche d'Information et ses Applications) - RIAO 2007, 8th International Conference, Carnegie Mellon University, Pittsburgh, PA, USA, May 30, 2007

Cross-domain video concept detection using adaptive svms.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Practical elimination of near-duplicates from web video search.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Novelty detection for cross-lingual news stories with visual duplicates and speech transcripts.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Clever clustering vs. simple speed-up for summarizing rushes.
Proceedings of the 1st ACM Workshop on Video Summarization, 2007

Evaluating bag-of-visual-words representations in scene classification.
Proceedings of the 9th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2007

Undirected Graphical Models for Video Analysis and Classification.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Adapting SVM Classifiers to Data with Shifted Distributions.
Workshops Proceedings of the 7th IEEE International Conference on Data Mining (ICDM 2007), 2007

How many high-level concepts will fill the semantic gap in news video retrieval?
Proceedings of the 6th ACM International Conference on Image and Video Retrieval, 2007

Query expansion using probabilistic local feedback with application to multimedia retrieval.
Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, 2007

2006
Learning rich semantics from news video archives by style analysis.
ACM Trans. Multim. Comput. Commun. Appl., 2006

A Discriminative Learning Framework with Pairwise Constraints for Video Object Classification.
IEEE Trans. Pattern Anal. Mach. Intell., 2006

Large-Scale Concept Ontology for Multimedia.
IEEE Multim., 2006

Multi-Lingual Broadcast News Retrieval.
Proceedings of the 2006 TREC Video Retrieval Evaluation, 2006

Probabilistic latent query analysis for combining multiple retrieval sources.
Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2006), 2006

3WNews: who, where, and when in news video.
Proceedings of the 14th ACM International Conference on Multimedia, 2006

Extreme video retrieval: joint maximization of human and computer performance.
Proceedings of the 14th ACM International Conference on Multimedia, 2006

Exploring temporal consistency for video analysis and retrieval.
Proceedings of the 8th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2006

Diversity in multimedia information retrieval research.
Proceedings of the 8th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2006

Mining Relationship Between Video Concepts using Probabilistic Graphical Models.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Label Disambiguation and Sequence Modeling for Identifying Human Activities from Wearable Physiological Sensors.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Which Thousand Words are Worth a Picture? Experiments on Video Retrieval using a Thousand Concepts.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Which Side are You on? Identifying Perspectives at the Document and Sentence Levels.
Proceedings of the Tenth Conference on Computational Natural Language Learning, 2006

Annotating News Video with Locations.
Proceedings of the Image and Video Retrieval, 5th International Conference, 2006

Efficient Margin-Based Rank Learning Algorithms for Information Retrieval.
Proceedings of the Image and Video Retrieval, 5th International Conference, 2006

Exploring the Synergy of Humans and Machines in Extreme Video Retrieval.
Proceedings of the Image and Video Retrieval, 5th International Conference, 2006

Are These Documents Written from Different Perspectives? A Test of Different Perspectives Based on Statistical Distribution Divergence.
Proceedings of the ACL 2006, 2006

2005
Multimedia information retrieval: workshop report.
SIGIR Forum, 2005

Mining Associated Text and Images with Dual-Wing Harmoniums.
Proceedings of the UAI '05, 2005

CMU Informedia's TRECVID 2005 Skirmishes.
Proceedings of the 2005 TREC Video Retrieval Evaluation, 2005

Multi-modal analysis for person type classification in news video.
Proceedings of the Storage and Retrieval Methods and Applications for Multimedia 2005, 2005

Revisiting the effect of topic set size on retrieval error.
Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005), 2005

Multiple instance learning for labeling faces in broadcasting news video.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

Putting active learning into multimedia applications: dynamic definition and refinement of concept classifiers.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

Assessing Effectiveness in Video Retrieval.
Proceedings of the Image and Video Retrieval, 4th International Conference, 2005

Lessons for the Future from a Decade of Informedia Video Analysis Research.
Proceedings of the Image and Video Retrieval, 4th International Conference, 2005

The Use and Utility of High-Level Semantic Features in Video Retrieval.
Proceedings of the Image and Video Retrieval, 4th International Conference, 2005

2004
Automated Analysis of Nursing Home Observations.
IEEE Pervasive Comput., 2004

Confounded Expectations: Informedia at TRECVID 2004.
Proceedings of the 2004 TREC Video Retrieval Evaluation, 2004

Naming every individual in news video monologues.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Learning query-class dependent weights in automatic video retrieval.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Successful approaches in the TREC video retrieval evaluations.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Video grammar for locating named people.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2004

Multi-modal classification in digital news libraries.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2004

Dining Activity Analysis Using a Hidden Markov Model.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

Multi-class active learning for video semantic feature extraction.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Detection of TV news monologues by style analysis.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Modeling timing features in broadcast news video classification.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Merging rank lists from multiple sources in video classification.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Comparison and combination of two novel commercial detection methods.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Towards robust face recognition from multiple views.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Searching for a specific person in broadcast news video.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Combining Motion Segmentation with Tracking for Activity Analysis.
Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition (FGR 2004), 2004

Articulated Motion Modeling for Activity Analysis.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2004

Finding Person X: Correlating Names with Visual Appearances.
Proceedings of the Image and Video Retrieval: Third International Conference, 2004

Co-retrieval: A Boosted Reranking Approach for Video Retrieval.
Proceedings of the Image and Video Retrieval: Third International Conference, 2004

Towards a Large Scale Concept Ontology for Broadcast Video.
Proceedings of the Image and Video Retrieval: Third International Conference, 2004

What's News, What's Not? Associating News Videos with Words.
Proceedings of the Image and Video Retrieval: Third International Conference, 2004

2003
Web Image Retrieval Re-Ranking with Relevance Model.
Proceedings of the 2003 IEEE / WIC International Conference on Web Intelligence, 2003

Informedia at TRECVID 2003: Analyzing and Searching Broadcast News Video.
Proceedings of the 2003 TREC Video Retrieval Evaluation, 2003

Video retrieval using speech and image information.
Proceedings of the Storage and Retrieval for Media Databases 2003, 2003

Negative pseudo-relevance feedback in content-based video retrieval.
Proceedings of the Eleventh ACM International Conference on Multimedia, 2003

The combination limit in multimedia retrieval.
Proceedings of the Eleventh ACM International Conference on Multimedia, 2003

Constant Density Displays Using Diversity Sampling.
Proceedings of the 9th IEEE Symposium on Information Visualization (InfoVis 2003), 2003

Modified Logistic Regression: An Approximation to SVM and Its Applications in Large-Scale Text Categorization.
Proceedings of the Machine Learning, 2003

A Faster Iterative Scaling Algorithm for Conditional Exponential Model.
Proceedings of the Machine Learning, 2003

Supervised classification for video shot segmentation.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

Learning to identify video shots with people based on face detection.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

Automatically Labeling Video Data Using Multi-class Active Learning.
Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV 2003), 2003

On predicting rare classes with SVM ensembles in scene classification.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Information retrieval for OCR documents: a content-based probabilistic correction model.
Proceedings of the Document Recognition and Retrieval X, 2003

Multimedia Search with Pseudo-relevance Feedback.
Proceedings of the Image and Video Retrieval, Second International Conference, 2003

2002
Video Classification and Retrieval with the Informedia Digital Video Library System.
Proceedings of The Eleventh Text REtrieval Conference, 2002

Language model for IR using collection information.
Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2002), 2002

Title language model for information retrieval.
Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2002), 2002

Meta-Classification of Multimedia Classifiers.
Proceedings of the International Workshop on Knowledge Discovery in Multimedia and Complex Data (KDMCD 2002), 2002

News video classification using SVM-based multimodal classifiers and combination strategies.
Proceedings of the 10th ACM International Conference on Multimedia 2002, 2002

Collages as dynamic summaries for news video.
Proceedings of the 10th ACM International Conference on Multimedia 2002, 2002

Meta-classification: Combining Multimodal Classifiers.
Proceedings of the Mining Multimedia and Complex Data, 2002

A wearable digital library of personal conversations.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2002

Video-cuebik: adapting image search to video shots.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2002

Multi-modal information retrieval from broadcast video using OCR and speech recognition.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2002

Video retrieval with multiple image search strategies.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2002

A Probabilistic Model for Camera Zoom Detection.
Proceedings of the 16th International Conference on Pattern Recognition, 2002

Using a probabilistic source model for comparing images.
Proceedings of the 2002 International Conference on Image Processing, 2002

The TREC2001 Video Track: Information Retrieval on Digital Video Information.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2002

A New Probabilistic Model for Title Generation.
Proceedings of the 19th International Conference on Computational Linguistics, 2002

2001
Video Retrieval with the Informedia Digital Video Library System.
Proceedings of The Tenth Text REtrieval Conference, 2001

Meta-scoring: Automatically Evaluating Term Weighting Schemes in IR without Precision-Recall.
Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2001), 2001

Automatic Title Generation for Spoken Broadcast News.
Proceedings of the First International Conference on Human Language Technology Research, 2001

Demonstration of hierarchical document clustering of digital library retrieval results.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2001

Title Generation for Machine-Translated Documents.
Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, 2001

Learning to Select Good Title Words: An New Approach based on Reverse Information Retrieval.
Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), Williams College, Williamstown, MA, USA, June 28, 2001

Title Generation Using a Training Corpus.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2001

2000
Complementary Video and Audio Analysis for Broadcast News Archives.
Commun. ACM, 2000

Title generation for spoken broadcast news using a training corpus.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Data Analysis for a Multimedia Library.
Proceedings of the Text- and Speech-Triggered Information Access, 2000

Automatic title generation for EM.
Proceedings of the Fifth ACM Conference on Digital Libraries, 2000

1999
Learning to Recognize Speech by Watching Television.
IEEE Intell. Syst., 1999

Guest Editor's Introduction: Integrating and Using Large Databases of Text, Images, Video, and Audio.
IEEE Intell. Syst., 1999

Informedia Experience-on-Demand: Capturing, Integrating and Communicating Experiences across People, Time and Space.
ACM Comput. Surv., 1999

Lessons Learned from Building a Terabyte Digital Video Library.
Computer, 1999

CMU Spoken Document Retrieval in Trec-8: Analysis of the role of Term Frequency TF.
Proceedings of The Eighth Text REtrieval Conference, 1999

Laughter extracted from television closed captions as speech recognizer training data.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Selection for acoustic coverage from unlimited speech extracted from closed-captioned TV.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Improving Acoustic Models with Captioned Multimedia Speech.
Proceedings of the IEEE International Conference on Multimedia Computing and Systems, 1999

Adjustable Filmstrips and Skims as Abstractions for a Digital Video Library.
Proceedings of the IEEE Forum on Research and Technology Advances in Digital Libraries, 1999

1998
Speech Recognition for a Digital Video Library.
J. Am. Soc. Inf. Sci., 1998

Experiments in Spoken Document Retrieval at CMU.
Proceedings of The Seventh Text REtrieval Conference, 1998

Hierarchical cluster language modeling with statistical rule extraction for rescoring n-best hypotheses during speech decoding.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Topic Labeling of Broadcast News Stories in the Informedia Digital Video Library.
Proceedings of the 3rd ACM International Conference on Digital Libraries, 1998

Story Segmentation and Detection of Commercials in Broadcast News Video.
Proceedings of the IEEE Forum on Research and Technology Advances in Digital Libraries, 1998

1997
Experiments in Spoken Document Retrieval at CMU.
Proceedings of The Sixth Text REtrieval Conference, 1997

Indexing and search of multimodal information.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Using Words and Phonetic Strings for Efficient Information Retrieval from Imperfectly Transcribed Spoken Documents.
Proceedings of the 2nd ACM International Conference on Digital Libraries, 1997

Artificial Intelligence Techniques in the Interface to a Digital Video Library.
Proceedings of the Human Factors in Computing Systems, 1997

1995
News-on-Demand: An Application of Informedia® Technology.
D-Lib Mag., 1995

Demonstration of a Reading Coach that Listens.
Proceedings of the 8th Annual ACM Symposium on User Interface Software and Technology, 1995

Speech for Multimedia Information Retrieval.
Proceedings of the 8th Annual ACM Symposium on User Interface Software and Technology, 1995

Speech recognition in the Informedia Digital Video Library: uses and limitations.
Proceedings of the Seventh International Conference on Tools with Artificial Intelligence, 1995

1994
Survey of Current Speech Technology.
Commun. ACM, 1994

A Prototype Reading Coach that Listens: Summary of Project LISTEN.
Proceedings of the Human Language Technology, 1994

A Prototype Reading Coach that Listens.
Proceedings of the 12th National Conference on Artificial Intelligence, Seattle, WA, USA, July 31, 1994

A Reading Coach that Listens: (Edited) Video Transcript.
Proceedings of the 12th National Conference on Artificial Intelligence, Seattle, WA, USA, July 31, 1994

1993
Gestures with Speech for Graphic Manipulation.
Int. J. Man Mach. Stud., 1993

Speech recognition applied to reading assistance for children: a baseline language model.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

SPEAKEZ: a first experiment in concatenation synthesis from a large corpus.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

Towards a Reading Coach that Listens: Automated Detection of Oral Reading Errors.
Proceedings of the 11th National Conference on Artificial Intelligence. Washington, 1993

1991
JANUS: a speech-to-speech translation system using connectionist and symbolic processing strategies.
Proceedings of the 1991 International Conference on Acoustics, 1991

Models for evaluating interaction protocols in speech recognition.
Proceedings of the Conference on Human Factors in Computing Systems, 1991

From Syntax to Meaning in Natural Language Processing.
Proceedings of the 9th National Conference on Artificial Intelligence, 1991

1990
A Comparison of Speech and Typed Input.
Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, 1990

1989
High Level Knowledge Sources in Usable Speech Recognition Systems.
Commun. ACM, 1989

Layering Predictions: Flexible Use of Dialog Expectation in Speech Recognition.
Proceedings of the 11th International Joint Conference on Artificial Intelligence. Detroit, 1989

Speech and gestures for graphic image manipulation.
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1989

1988
Talking to Computers: An Empirical Investigation.
Int. J. Man Mach. Stud., 1988

Parsing spoken phrases despite missing words.
Proceedings of the IEEE International Conference on Acoustics, 1988

Using Dialog-Level Knowledge Sources to Improve Speech Recognition.
Proceedings of the 7th National Conference on Artificial Intelligence, 1988

1987
Sentence parsing with weak grammatical constraints.
Proceedings of the IEEE International Conference on Acoustics, 1987

1986
On quick word spotting techniques.
Proceedings of the IEEE International Conference on Acoustics, 1986

Parsing Spoken Language: A Semantic Caseframe Approach.
Proceedings of the 11th International Conference on Computational Linguistics, 1986

