Winston H. Hsu

ORCID: 0000-0002-3330-0638

Affiliations:
  • National Taiwan University, Taipei, Taiwan


According to our database, Winston H. Hsu authored at least 223 papers between 2003 and 2024.

Bibliography

2024
Attention Tracker: Detecting Prompt Injection Attacks in LLMs.
CoRR, 2024

Revisiting Semi-supervised Adversarial Robustness via Noise-aware Online Robust Distillation.
CoRR, 2024

Distribution Discrepancy and Feature Heterogeneity for Active 3D Object Detection.
CoRR, 2024

Context-Aware Replanning with Pre-explored Semantic Map for Object Navigation.
CoRR, 2024

Bridging Episodes and Semantics: A Novel Framework for Long-Form Video Understanding.
CoRR, 2024

Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies.
CoRR, 2024

Shared-unique Features and Task-aware Prioritized Sampling on Multi-task Reinforcement Learning.
CoRR, 2024

VICtoR: Learning Hierarchical Vision-Instruction Correlation Rewards for Long-horizon Manipulation.
CoRR, 2024

Tracking-Assisted Object Detection with Event Cameras.
CoRR, 2024

AED: Adaptable Error Detection for Few-shot Imitation Policy.
CoRR, 2024

Tel2Veh: Fusion of Telecom Data and Vehicle Flow to Predict Camera-Free Traffic via a Spatio-Temporal Framework.
Companion Proceedings of the ACM on Web Conference 2024, 2024

Enhancing Sustainable Urban Mobility Prediction with Telecom Data: A Spatio-Temporal Framework Approach.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

WLST: Weak Labels Guided Self-training for Weakly-supervised Domain Adaptation on 3D Object Detection.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses.
Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Unsupervised Image Prior via Prompt Learning and CLIP Semantic Guidance for Low-Light Image Enhancement.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

TelTrans: Applying Multi-Type Telecom Data to Transportation Evaluation and Prediction via Multifaceted Graph Modeling.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Dual-Awareness Attention for Few-Shot Object Detection.
IEEE Trans. Multim., 2023

Unsupervised Adversarial Detection without Extra Model: Training Loss Should Change.
CoRR, 2023

MuRAL: Multi-Scale Region-based Active Learning for Object Detection.
CoRR, 2023

Self-Training with High-Dimensional Markers for Cell Instance Segmentation.
Proceedings of the 20th IEEE International Symposium on Biomedical Imaging, 2023

Geographical Cellular Traffic Prediction with Multivariate Spatio-Temporal Modeling.
Proceedings of the 2nd International Workshop on Spatio-Temporal Reasoning and Learning (STRL 2023) co-located with the 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023

CrossDTR: Cross-view and Depth-guided Transformers for 3D Object Detection.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

CFVS: Coarse-to-Fine Visual Servoing for 6-DoF Object-Agnostic Peg-In-Hole Assembly.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Coarse-to-Fine Point Cloud Registration with SE(3)-Equivariant Representations.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Orbeez-SLAM: A Real-time Monocular Visual SLAM with ORB Features and NeRF-realized Mapping.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Pay Attention to Multi-Channel for Improving Graph Neural Networks.
Proceedings of the First Tiny Papers Track at ICLR 2023, 2023

Fair Robust Active Learning by Joint Inconsistency.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

BIRD-PCC: Bi-Directional Range Image-Based Deep Lidar Point Cloud Compression.
Proceedings of the IEEE International Conference on Acoustics, 2023

Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Revisiting Depth-guided Methods for Monocular 3D Object Detection by Hierarchical Balanced Depth.
Proceedings of the Conference on Robot Learning, 2023

CTCam: Enhancing Transportation Evaluation through Fusion of Cellular Traffic and Camera-Based Vehicle Flows.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

STAMINA (Spatial-Temporal Aligned Meteorological INformation Attention) and FPL (Focal Precip Loss): Advancements in Precipitation Nowcasting for Heavy Rainfall Events.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

MiniSUPERB: Lightweight Benchmark for Self-Supervised Speech Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Raw Image Deblurring.
IEEE Trans. Multim., 2022

Fair Robust Active Learning by Joint Inconsistency.
CoRR, 2022

ADeADA: Adaptive Density-aware Active Domain Adaptation for Semantic Segmentation.
CoRR, 2022

Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark.
CoRR, 2022

SeqDNet: Improving Missing Value by Sequential Depth Network.
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

D²ADA: Dynamic Density-Aware Active Domain Adaptation for Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

GenISP: Neural ISP for Low-Light Machine Cognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

Free-form 3D Scene Inpainting with Dual-stream GAN.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

Stage Conscious Attention Network (SCAN): A Demonstration-Conditioned Policy for Few-Shot Imitation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
xCos: An Explainable Cosine Metric for Face Verification Task.
ACM Trans. Multim. Comput. Commun. Appl., 2021

End-to-End Video Question-Answer Generation With Generator-Pretester Network.
IEEE Trans. Circuits Syst. Video Technol., 2021

Learn from the past - sequentially one-to-one video deblurring network.
J. Vis. Commun. Image Represent., 2021

3rd Place Solution for NeurIPS 2021 Shifts Challenge: Vehicle Motion Prediction.
CoRR, 2021

Anomaly-Aware Semantic Segmentation by Leveraging Synthetic-Unknown Data.
CoRR, 2021

Learning from 2D: Pixel-to-Point Knowledge Transfer for 3D Pretraining.
CoRR, 2021

S³: Learnable Sparse Signal Superdensity for Guided Depth Estimation.
CoRR, 2021

Should I Look at the Head or the Tail? Dual-awareness Attention for Few-Shot Object Detection.
CoRR, 2021

Situation and Behavior Understanding by Trope Detection on Films.
Proceedings of the WWW '21: The Web Conference 2021, 2021

Class-agnostic Few-shot Object Counting.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

OCID-Ref: A 3D Robotic Dataset With Embodied Language For Clutter Scene Grounding.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

ODIP: Towards Automatic Adaptation for Object Detection by Interactive Perception.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

ReDAL: Region-based and Diversity-aware Active Learning for Point Cloud Semantic Segmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Role Aware Multi-Party Dialogue Question Answering.
Proceedings of the IEEE International Conference on Acoustics, 2021

S3: Learnable Sparse Signal Superdensity for Guided Depth Estimation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

TrUMAn: Trope Understanding in Movies and Animations.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

Multivariate and Propagation Graph Attention Network for Spatial-Temporal Prediction with Outdoor Cellular Traffic.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

NOD: Taking a Closer Look at Detection under Extreme Low-Light Conditions with Night Object Detection Dataset.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Multi-Stream Attention Learning for Monocular Vehicle Velocity and Inter-Vehicle Distance Estimation.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020
Deep Multi-Kernel Convolutional LSTM Networks and an Attention-Based Mechanism for Videos.
IEEE Trans. Multim., 2020

A Coarse-To-Fine (C2F) Representation for End-To-End 6-DoF Grasp Detection.
CoRR, 2020

Large Margin Mechanism and Pseudo Query Set on Cross-Domain Few-Shot Learning.
CoRR, 2020

Expanding Sparse Guidance for Stereo Matching.
CoRR, 2020

xCos: An Explainable Cosine Metric for Face Verification Task.
CoRR, 2020

Efficient and Phase-Aware Video Super-Resolution for Cardiac MRI.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020

Video Question Generation via Semantic Rich Cross-Modal Self-Attention Networks Learning.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

GDN: A Coarse-To-Fine (C2F) Representation for End-To-End 6-DoF Grasp Detection.
Proceedings of the 4th Conference on Robot Learning, 2020

2019
Deep Long Audio Inpainting.
CoRR, 2019

Organ At Risk Segmentation with Multiple Modality.
CoRR, 2019

Video Question Generation via Cross-Modal Self-Attention Networks Learning.
CoRR, 2019

Learnable Gated Temporal Shift Module for Deep Video Inpainting.
CoRR, 2019

FishNet: A Camera Localizer using Deep Recurrent Networks.
CoRR, 2019

Learning from 3D (Point Cloud) Data.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

DECCNet: Depth Enhanced Crowd Counting.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Indoor Depth Completion with Boundary Consistency and Self-Attention.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Free-Form Video Inpainting With 3D Gated Convolution and Temporal PatchGAN.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Audio Feature Generation for Missing Modality Problem in Video Action Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Saliency Aware: Weakly Supervised Object Localization.
Proceedings of the IEEE International Conference on Acoustics, 2019

VORNet: Spatio-Temporally Consistent Video Inpainting for Object Removal.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Anticipation of Human Actions With Pose-Based Fine-Grained Representations.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Learnable Gated Temporal Shift Module for Free-form Video Inpainting.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

A Unified Point-Based Framework for 3D Segmentation.
Proceedings of the 2019 International Conference on 3D Vision, 2019

2018
Learning From Cross-Domain Media Streams for Event-of-Interest Discovery.
IEEE Trans. Multim., 2018

Netizen-Style Commenting on Fashion Photos: Dataset and Diversity Measures.
Proceedings of the Companion of The Web Conference 2018, 2018

Super-Identity Convolutional Neural Network for Face Hallucination.
Proceedings of the Computer Vision - ECCV 2018, 2018

Deep Disguised Faces Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

Cross-Domain Hallucination Network for Fine-Grained Object Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

Attribute Augmented Convolutional Neural Network for Face Hallucination.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

Drone-View Building Identification by Cross-View Visual Learning and Relative Spatial Estimation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

PIVTONS: Pose Invariant Virtual Try-On Shoe with Conditional Image Completion.
Proceedings of the Computer Vision - ACCV 2018, 2018

Cross-Domain Image-Based 3D Shape Retrieval by View Sequence Learning.
Proceedings of the 2018 International Conference on 3D Vision, 2018

2017
Photo Filter Recommendation by Category-Aware Aesthetic Learning.
IEEE Trans. Multim., 2017

Dehashing: Server-Side Context-Aware Feature Reconstruction for Mobile Visual Search.
IEEE Trans. Circuits Syst. Video Technol., 2017

Scalable Face Track Retrieval in Video Archives Using Bag-of-Faces Sparse Representation.
IEEE Trans. Circuits Syst. Video Technol., 2017

Editorial for the ICMR 2016 special issue.
Int. J. Multim. Inf. Retr., 2017

Harnessing A.I. for Augmenting Creativity: Application to Movie Trailer Creation.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Feature Learning with Rank-Based Candidate Selection for Product Search.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Drone-Based Object Counting by Spatially Regularized Regional Proposal Network.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Multi-task learning for face identification and attribute estimation.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Joint Sequence Learning and Cross-Modality Convolution for 3D Biomedical Segmentation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017


2016
City-view image location identification by multiple geo-social media and graph-based image cluster refinement.
J. Vis. Commun. Image Represent., 2016

De-Hashing: Server-Side Context-Aware Feature Reconstruction for Mobile Visual Search.
CoRR, 2016

We Can "See" You via Wi-Fi - An Overview and Beyond.
CoRR, 2016

Location-Independent WiFi Action Recognition via Vision-based Methods.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Egocentric activity recognition by leveraging multiple mid-level representations.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2016

LDADEEP+: Latent aspect discovery with deep representations.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

WiFi action recognition via vision-based methods.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Mediated experts for deep convolutional networks.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Face Recognition and Retrieval Using Cross-Age Reference Coding With Cross-Age Celebrity Dataset.
IEEE Trans. Multim., 2015

Augmenting flower recognition by automatically expanding training data from web.
Proceedings of the 17th IEEE International Workshop on Multimedia Signal Processing, 2015

Exploiting Word and Visual Word Co-occurrence for Sketch-based Clipart Image Retrieval.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Unsupervised Latent Aspect Discovery for Diverse Event Summarization.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Real-Time Instant Event Detection in Egocentric Videos by Leveraging Sensor-Based Motion Context.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Who are the Devils Wearing Prada in New York City?
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Filter-Invariant Image Classification on Social Media Photos.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Trending pool: Visual analytics for trending event compositions for time-series categorical log data.
Proceedings of the 10th IEEE Conference on Visual Analytics Science and Technology, 2015

Summarizing While Recording: Context-Based Highlight Detection for Egocentric Videos.
Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop, 2015

Identify Visual Human Signature in community via wearable camera.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Enhancing sparse voice annotation for semantic retrieval of personal photos by continuous space word representations.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Approximating Weighted Hamming Distance by Probabilistic Selection for Multiple Hash Tables.
Proceedings of the Advances in Information Retrieval, 2015

Scalable object detection by filter compression with regularized sparse coding.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Visually Interpreting Names as Demographic Attributes by Exploiting Click-Through Data.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Scalable Mobile Visual Classification by Kernel Preserving Projection Over High-Dimensional Features.
IEEE Trans. Multim., 2014

Online image search result grouping with MapReduce-based image clustering and graph construction for large-scale photos.
J. Vis. Commun. Image Represent., 2014

Transfer Learning for Video Recognition with Scarce Training Data.
CoRR, 2014

Me-link: link me to the media - fusing audio and visual cues for robust and efficient mobile media interaction.
Proceedings of the 23rd International World Wide Web Conference, 2014

Learning to personalize trending image search suggestion.
Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2014

Facial Attribute Space Compression by Latent Human Topic Discovery.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Efficient Cross-Domain Image Retrieval by Multi-Level Matching and Spatial Verification for Structural Similarity.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Discovering the City by Mining Diverse and Multimodal Data Streams.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Automatic Facial Image Annotation and Retrieval by Integrating Voice Label and Visual Appearance.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Efficient Face Detection by Leveraging Knowledge from Large-Scale Photos.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Rank-Preserving and Unsupervised Hash Learning from Auxiliary Contextual Cues.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Predicting Viewer Affective Comments Based on Image Content in Social Media.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Learning-based heart rate detection from remote photoplethysmography features.
Proceedings of the IEEE International Conference on Acoustics, 2014

Investigating and predicting social and visual image interestingness on social media by crowdsourcing.
Proceedings of the IEEE International Conference on Acoustics, 2014

Jointly Optimizing 3D Model Fitting and Fine-Grained Classification.
Proceedings of the Computer Vision - ECCV 2014, 2014

Cross-Age Reference Coding for Age-Invariant Face Recognition and Retrieval.
Proceedings of the Computer Vision - ECCV 2014, 2014

2013
Automatic Training Image Acquisition and Effective Feature Selection From Community-Contributed Photos for Facial Attribute Detection.
IEEE Trans. Multim., 2013

Scalable Face Image Retrieval Using Attribute-Enhanced Sparse Codewords.
IEEE Trans. Multim., 2013

Travel Recommendation by Mining People Attributes and Travel Group Types From Community-Contributed Photos.
IEEE Trans. Multim., 2013

Investigating 3-D Model and Part Information for Improving Content-Based Vehicle Retrieval.
IEEE Trans. Circuits Syst. Video Technol., 2013

Graph-based semi-supervised learning with multi-modality propagation for large-scale image datasets.
J. Vis. Commun. Image Represent., 2013

Scalable Mobile Video Retrieval with Sparse Projection Learning and Pseudo Label Mining.
IEEE Multim., 2013

Search-based relevance association with auxiliary contextual cues.
Proceedings of the ACM Multimedia Conference, 2013

Flickr-tag prediction using multi-modal fusion and meta information.
Proceedings of the ACM Multimedia Conference, 2013

Enabling low bitrate mobile visual recognition: a performance versus bandwidth evaluation.
Proceedings of the ACM Multimedia Conference, 2013

City-view image retrieval leveraging check-in data.
Proceedings of the 2nd ACM international workshop on Geotagging and its applications in multimedia, 2013

Real-time privacy-preserving moving object detection in the cloud.
Proceedings of the ACM Multimedia Conference, 2013

3D Sub-query Expansion for Improving Sketch-Based Multi-view Image Retrieval.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Full body human attribute detection in indoor surveillance environment using color-depth information.
Proceedings of the 10th IEEE International Conference on Advanced Video and Signal Based Surveillance, 2013

2012
Preference-Aware View Recommendation System for Scenic Photos Based on Bag-of-Aesthetics-Preserving Features.
IEEE Trans. Multim., 2012

Unsupervised Semantic Feature Discovery for Image Object Retrieval and Tag Refinement.
IEEE Trans. Multim., 2012

Learning by expansion: Exploiting social media for image classification with few training examples.
Neurocomputing, 2012

Where is who: large-scale photo retrieval by facial attributes and canvas layout.
Proceedings of the 35th International ACM SIGIR conference on research and development in Information Retrieval, 2012

Sharing the trees among random forests for effective and efficient concept detection.
Proceedings of the 14th IEEE International Workshop on Multimedia Signal Processing, 2012

Large-scale simultaneous multi-object recognition and localization via bottom up search-based approach.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Sketch-based image retrieval on mobile devices using compact hash bits.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Detecting the directions of viewing landmarks for recommendation by large-scale user-contributed photos.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Emerging challenges and opportunities in exploiting mobile photos and videos.
Proceedings of the 2nd ACM international workshop on Interactive multimedia on mobile and portable devices, 2012

Discovering informative social subgraphs and predicting pairwise relationships from group photos.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Evaluating Gaussian Like Image Representations over Local Features.
Proceedings of the 2012 IEEE International Conference on Multimedia and Expo, 2012

Live Semantic Sport Highlight Detection Based on Analyzing Tweets of Twitter.
Proceedings of the 2012 IEEE International Conference on Multimedia and Expo, 2012

Content-based vehicle retrieval using 3D model and part information.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Two-stage sparse graph construction using MinHash on MapReduce.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Learning facial attributes by crowdsourcing in social media.
Proceedings of the 20th International Conference on World Wide Web, 2011

Multi-layer graph-based semi-supervised learning for large-scale image datasets using mapreduce.
Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011

Region-based landmark discovery by crowdsourcing geo-referenced photos.
Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011

Snap2Read: Automatic Magazine Capturing and Analysis for Adaptive Mobile Reading.
Proceedings of the Advances in Multimedia Modeling, 2011

Scalable mobile video question-answering system with locally aggregated descriptors and random projection.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Scenic photo quality assessment with bag of aesthetics-preserving features.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Multiple object localization by context-aware adaptive window search and search-based object recognition.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Photo search by face positions and facial attributes on touch devices.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Augmenting mobile city-view image retrieval with context-rich user-contributed photos.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Comp2Watch: enhancing the mobile video browsing experience.
Proceedings of the 2011 international ACM workshop on Interactive multimedia on mobile and portable devices, 2011

Personalized travel recommendation by mining people attributes from community-contributed photos.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Semi-supervised face image retrieval using sparse coding with identity constraint.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Coarse-to-fine temporal optimization for video retargeting based on seam carving.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Unsupervised auxiliary visual words discovery for large-scale image object retrieval.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010
Boosting image object retrieval and indexing by automatically discovered pseudo-objects.
J. Vis. Commun. Image Represent., 2010

Knowledge Discovery from Community-Contributed Multimedia.
IEEE Multim., 2010

Interactive inquiry for object of interest in video playback by motion-augmented graph cut.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

A technical demonstration of large-scale image object retrieval by efficient query evaluation and effective auxiliary visual feature discovery.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

GPS, compass, or camera?: investigating effective mobile sensors for automatic search-based image annotation.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Search-Based Automatic Image Annotation via Flickr Photos Using Tag Expansion.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Online Reranking via Ordinal Informative Concepts for Context Fusion in Concept Detection and Video Search.
IEEE Trans. Circuits Syst. Video Technol., 2009

Adaptive Learning for Multimodal Fusion in Video Search.
Proceedings of the Advances in Multimedia Information Processing, 2009

Content-based and concept-based retrieval for large-scale image/video collections.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Query expansion for hash-based image object retrieval.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Canonical image selection and efficient image graph construction for large-scale flickr photos.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

A latent semantic retrieval and clustering system for personal photos with sparse speech annotation.
Proceedings of the third workshop on Searching spontaneous conversational speech, 2009

Knowledge discovery over community-sharing media: From signal to intelligence.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Foreground segmentation for static video via multi-core and multi-modal graph cut.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Boosting object retrieval by estimating pseudo-objects.
Proceedings of the International Conference on Image Processing, 2009

2008
AdImage: video advertising by image matching and ad scheduling optimization.
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2008

ContextSeer: context search and recommendation at query time for shared consumer photos.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

Recent developments in content-based and concept-based image/video retrieval.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

Keyword-based concept search on consumer photos by web-based kernel function.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

SheepDog: group and tag recommendation for flickr photos by automatic search-based learning.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

Video search reranking via online ordinal reranking.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

2007
Reranking Methods for Visual Search.
IEEE Multim., 2007

The NTU Toolkit and Framework for High-Level Feature Detection at TRECVID 2007.
Proceedings of the TRECVID 2007 workshop participants notebook papers, 2007

NTU TRECVID-2007 fast rushes summarization system.
Proceedings of the 1st ACM Workshop on Video Summarization, 2007

Video search reranking through random walk over document-level context graph.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

2006
Large-Scale Concept Ontology for Multimedia.
IEEE Multim., 2006

Columbia University TRECVID-2006 Video Search and High-Level Feature Extraction.
Proceedings of the 2006 TREC Video Retrieval Evaluation, 2006

Video search reranking via information bottleneck principle.
Proceedings of the 14th ACM International Conference on Multimedia, 2006

Topic Tracking Across Broadcast News Videos with Visual Duplicates and Semantic Concepts.
Proceedings of the International Conference on Image Processing, 2006

2005
Columbia University TRECVID-2005 Video Search and High-Level Feature Extraction.
Proceedings of the 2005 TREC Video Retrieval Evaluation, 2005

Visual Cue Cluster Construction via Information Bottleneck Principle and Kernel Density Estimation.
Proceedings of the Image and Video Retrieval, 4th International Conference, 2005

2004
Discovery and fusion of salient multimodal features toward news story segmentation.
Proceedings of the Storage and Retrieval Methods and Applications for Multimedia 2004, 2004

Story boundary detection in large broadcast news video archives: techniques, experience and trends.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Generative, discriminative, and ensemble learning on multi-modal perceptual fusion toward news video story segmentation.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

News video story segmentation using fusion of multi-level multi-modal features in TRECVID 2003.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Discovery and Fusion of Salient Multi-modal Features Towards News Story Segmentation.
Proceedings of the 2003 TREC Video Retrieval Evaluation, 2003

IBM Research TRECVID-2003 Video Retrieval System.
Proceedings of the 2003 TREC Video Retrieval Evaluation, 2003

A statistical framework for fusing mid-level perceptual features in news story segmentation.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

