Zicheng Liu

ORCID: 0000-0001-5894-7828

Affiliations:
  • Microsoft Research, Redmond, WA, USA
  • Princeton University, NJ, USA (PhD 1996)
  • Chinese Academy of Sciences, Institute of Applied Mathematics, Beijing, China


According to our database, Zicheng Liu authored at least 260 papers between 1992 and 2024.

Collaborative distances:
  • Dijkstra number of three.
  • Erdős number of two.

Bibliography

2024
Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization.
CoRR, 2024

AutoDirector: Online Auto-scheduling Agents for Multi-sensory Composition.
CoRR, 2024

MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities.
CoRR, 2024

A Unified Gaussian Process for Branching and Nested Hyperparameter Optimization.
CoRR, 2024

StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis.
CoRR, 2024

MPT: Mesh Pre-Training with Transformers for Human Pose and Mesh Reconstruction.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

OpenLEAF: A Novel Benchmark for Open-Domain Interleaved Image-Text Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Bring Metric Functions into Diffusion Models.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

StrokeNUWA - Tokenizing Strokes for Vector Graphic Synthesis.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Completing Visual Objects via Bridging Generation and Segmentation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Idea2Img: Iterative Self-refinement with GPT-4V for Automatic Image Design and Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

GRiT: A Generative Region-to-Text Transformer for Object Understanding.
Proceedings of the Computer Vision - ECCV 2024, 2024

MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Disco: Disentangled Control for Realistic Human Dance Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Training Diffusion Models Towards Diverse Image Generation with Reinforcement Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Segment and Caption Anything.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ORES: Open-Vocabulary Responsible Visual Synthesis.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Co-Communication Graph Convolutional Network for Multi-View Crowd Counting.
IEEE Trans. Multim., 2023

MGL: Mutual Graph Learning for Camouflaged Object Detection.
IEEE Trans. Image Process., 2023

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation.
CoRR, 2023

MM-VID: Advancing Video Understanding with GPT-4V(ision).
CoRR, 2023

On the Hidden Waves of Image.
CoRR, 2023

Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation.
CoRR, 2023

OpenLEAF: Open-Domain Interleaved Image-Text Generation and Evaluation.
CoRR, 2023

Completing Visual Objects via Bridging Generation and Segmentation.
CoRR, 2023

The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision).
CoRR, 2023

Does Full Waveform Inversion Benefit from Big Data?
CoRR, 2023

Spatial-Frequency U-Net for Denoising Diffusion Probabilistic Models.
CoRR, 2023

DisCo: Disentangled Control for Referring Human Dance Generation in Real World.
CoRR, 2023

RefineVIS: Video Instance Segmentation with Temporal Attention Refinement.
CoRR, 2023

PaintSeg: Training-free Segmentation via Painting.
CoRR, 2023

Image is First-order Norm+Linear Autoregressive.
CoRR, 2023

Simplifying Full Waveform Inversion via Domain-Independent Self-Supervised Learning.
CoRR, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.
CoRR, 2023

MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action.
CoRR, 2023

MMPTRACK: Large-scale Densely Annotated Multi-camera Multiple People Tracking Benchmark.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

PaintSeg: Painting Pixels for Training-free Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning 3D Photography Videos via Self-supervised Diffusion on Single Images.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Energy-Inspired Self-Supervised Pretraining for Vision Models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Label-Efficient Representations.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Zero-Shot Human-Object Interaction (HOI) Classification by Bridging Generative and Contrastive Image-Language Models.
Proceedings of the IEEE International Conference on Image Processing, 2023

Equivariant Similarity for Vision-Language Foundation Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ReCo: Region-Controlled Text-to-Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Binary Latent Diffusion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Adaptive Human Matting for Dynamic Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Deep Frequency Filtering for Domain Generalization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Neural Voting Field for Camera-Space 3D Hand Pose Estimation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
EFRNet: Efficient Feature Reconstructing Network for Real-Time Scene Parsing.
IEEE Trans. Multim., 2022

GIT: A Generative Image-to-text Transformer for Vision and Language.
Trans. Mach. Learn. Res., 2022

Vision-Language Pre-Training: Basics, Recent Advances, and Future Trends.
Found. Trends Comput. Graph. Vis., 2022

Self-Supervised Learning based on Heat Equation.
CoRR, 2022

Exploring Discrete Diffusion Models for Image Captioning.
CoRR, 2022

Should All Proposals be Treated Equally in Object Detection?
CoRR, 2022

Consistent Video Instance Segmentation with Inter-Frame Recurrent Attention.
CoRR, 2022

Cross-modal Representation Learning for Zero-shot Action Recognition.
CoRR, 2022

MPS-NeRF: Generalizable 3D Human Rendering from Multiview Images.
CoRR, 2022

The Overlooked Classifier in Human-Object Interaction Recognition.
CoRR, 2022

Exploring Multi-physics with Extremely Weak Supervision.
CoRR, 2022

SA-VQA: Structured Alignment of Visual and Semantic Representations for Visual Question Answering.
CoRR, 2022

NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

An Intriguing Property of Geophysics Inversion.
Proceedings of the International Conference on Machine Learning, 2022

Unsupervised Learning of Full-Waveform Inversion: Connecting CNN and Partial Differential Equation in a Loop.
Proceedings of the Tenth International Conference on Learning Representations, 2022

UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling.
Proceedings of the Computer Vision - ECCV 2022, 2022

A Simple Approach and Benchmark for 21,000-Category Object Detection.
Proceedings of the Computer Vision - ECCV 2022, 2022

Should All Proposals Be Treated Equally in Object Detection?
Proceedings of the Computer Vision - ECCV 2022, 2022

Crossmodal Representation Learning for Zero-shot Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Injecting Semantic Concepts into End-to-End Image Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

An Empirical Study of Training End-to-End Vision-and-Language Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Mobile-Former: Bridging MobileNet and Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Lifelong Unsupervised Domain Adaptive Person Re-identification with Coordinated Anti-forgetting and Adaptation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Scaling Up Vision-Language Pretraining for Image Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

OVIS: Open-Vocabulary Visual Instance Search via Visual-Semantic Aligned Representation Learning.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Playing Lottery Tickets with Vision and Language.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Cross-Domain Complementary Learning Using Pose for Multi-Person Part Segmentation.
IEEE Trans. Circuits Syst. Video Technol., 2021

Human pose estimation and its application to action recognition: A survey.
J. Vis. Commun. Image Represent., 2021

Decoupling Object Detection from Human-Object Interaction Recognition.
CoRR, 2021

Improving Vision Transformers for Incremental Learning.
CoRR, 2021

MLP Architectures for Vision-and-Language Modeling: An Empirical Study.
CoRR, 2021

VIOLET: End-to-End Video-Language Transformers with Masked Visual-token Modeling.
CoRR, 2021

Scaling Up Vision-Language Pre-training for Image Captioning.
CoRR, 2021

Crossing the Format Boundary of Text and Boxes: Towards Unified Vision-Language Modeling.
CoRR, 2021

Florence: A New Foundation Model for Computer Vision.
CoRR, 2021

UFO: A UniFied TransfOrmer for Vision-Language Representation Learning.
CoRR, 2021

Is Object Detection Necessary for Human-Object Interaction Recognition?
CoRR, 2021

Disentanglement-based Cross-Domain Feature Augmentation for Effective Unsupervised Domain Adaptive Person Re-identification.
CoRR, 2021

Weak NAS Predictors Are All You Need.
CoRR, 2021

Stronger NAS with Weaker Predictors.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Revisiting Dynamic Convolution via Matrix Decomposition.
Proceedings of the 9th International Conference on Learning Representations, 2021

SEED: Self-supervised Distillation For Visual Representation.
Proceedings of the 9th International Conference on Learning Representations, 2021

Learning Nonparametric Human Mesh Reconstruction From A Single Image Without Ground Truth Meshes.
Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

End-to-End Semi-Supervised Object Detection with Soft Teacher.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Mesh Graphormer.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

MicroNet: Improving Image Recognition with Extremely Low FLOPs.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Compressing Visual-linguistic Model via Knowledge Distillation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

End-to-End Human Pose and Mesh Reconstruction with Transformers.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Probabilistic Model Distillation for Semantic Correspondence.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Learning Quintuplet Loss for Large-Scale Visual Geolocalization.
IEEE Multim., 2020

3D Human motion anticipation and classification.
CoRR, 2020

MiniVLM: A Smaller and Faster Vision-Language Model.
CoRR, 2020

MicroNet: Towards Image Recognition with Extremely Low FLOPs.
CoRR, 2020

VIVO: Surpassing Human Performance in Novel Object Captioning with Visual Vocabulary Pre-Training.
CoRR, 2020

Human Action Image Generation with Differential Privacy.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Dynamic ReLU.
Proceedings of the Computer Vision - ECCV 2020, 2020

Dynamic Convolution: Attention Over Convolution Kernels.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Rethinking Classification and Localization for Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Discriminative Spatio-Temporal Pattern Discovery for 3D Action Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2019

Cross-Domain Complementary Learning with Synthetic Data for Multi-Person Part Segmentation.
CoRR, 2019

Rethinking Classification and Localization in R-CNN.
CoRR, 2019

Large Scale Incremental Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
3D cartoon face rigging from sparse examples.
Vis. Comput., 2018

Depth Super-Resolution on RGB-D Video Sequences With Large Displacement 3D Motion.
IEEE Trans. Image Process., 2018

A set-to-set nearest neighbor approach for robust and efficient face recognition with image sets.
J. Vis. Commun. Image Represent., 2018

Incremental Classifier Learning with Generative Adversarial Networks.
CoRR, 2018

Reinforced Temporal Attention and Split-Rate Transfer for Depth-Based Person Re-identification.
Proceedings of the Computer Vision - ECCV 2018, 2018

HP-GAN: Probabilistic 3D Human Motion Prediction via GAN.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

2017
A Tube-and-Droplet-Based Approach for Representing and Analyzing Motion Trajectories.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Person Depth ReID: Robust Person Re-identification with Commodity Depth Sensors.
CoRR, 2017

2016
3D cartoon face generation by local deformation mapping.
Vis. Comput., 2016

Handling Occlusion and Large Displacement Through Improved RGB-D Scene Flow Estimation.
IEEE Trans. Circuits Syst. Video Technol., 2016

Survey on 3D Hand Gesture Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2016

Sparsity-Induced Similarity Measure and Its Applications.
IEEE Trans. Circuits Syst. Video Technol., 2016

An image-to-class dynamic time warping approach for both 3D static and trajectory hand gesture recognition.
Pattern Recognit., 2016

2015
Random Forest Construction With Robust Semisupervised Node Splitting.
IEEE Trans. Image Process., 2015

Propagative Hough Voting for Human Activity Detection and Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2015

Real time gaze estimation with a consumer depth camera.
Inf. Sci., 2015

Auxiliary Training Information Assisted Visual Recognition.
IPSJ Trans. Comput. Vis. Appl., 2015

Modeling and design of a 0.8-30 GHz tunable inductor-less divide-by-2 frequency divider with digital frequency calibration.
Proceedings of the IEEE 58th International Midwest Symposium on Circuits and Systems, 2015

VTouch: Vision-enhanced interaction for large touch displays.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Anomaly detection by using random projection forest.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

ImmerseBoard: Immersive Telepresence Experience using a Digital Whiteboard.
Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, 2015

2014
Human Action Recognition with Depth Cameras.
Springer Briefs in Computer Science, Springer, ISBN: 978-3-319-04560-3, 2014

Face Modeling.
Computer Vision, A Reference Guide, 2014

Activity Recognition.
Computer Vision, A Reference Guide, 2014

On the improvement of human action recognition from depth map sequences using Space-Time Occupancy Patterns.
Pattern Recognit. Lett., 2014

Animated Pose Templates for Modeling and Detecting Human Actions.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Learning Actionlet Ensemble for 3D Human Action Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Kernelized pyramid nearest-neighbor search for object categorization.
Mach. Vis. Appl., 2014

Real world activity summary for senior home monitoring.
Multim. Tools Appl., 2014

A robust elastic net approach for feature learning.
J. Vis. Commun. Image Represent., 2014

Introduction to the special issue on visual understanding and applications with RGB-D cameras.
J. Vis. Commun. Image Represent., 2014

Forging a Close Relationship with Multimedia Communities.
IEEE Multim., 2014

Real-Time Gaze Estimation with Online Calibration.
IEEE Multim., 2014

Realtime gaze estimation with online calibration.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

Video Summarization based on Nonnegative Linear Reconstruction.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

Eye gaze tracking using an RGBD camera: a comparison with a RGB solution.
Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, 2014

Towards accurate and robust cross-ratio based gaze trackers through learning from simulation.
Proceedings of the Eye Tracking Research and Applications, 2014

Detecting Subtle Human-Object Interactions Using Kinect.
Proceedings of the Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 2014

Automatic Camera-Screen Localization.
Proceedings of the Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 2014

Randomized Support Vector Forest.
Proceedings of the British Machine Vision Conference, 2014

Can Visual Recognition Benefit from Auxiliary Information in Training?
Proceedings of the Computer Vision - ACCV 2014, 2014

Discriminative Orderlet Mining for Real-Time Recognition of Human-Object Interaction.
Proceedings of the Computer Vision - ACCV 2014, 2014

Completed Dense Scene Flow in RGB-D Space.
Proceedings of the Computer Vision - ACCV 2014 Workshops, 2014

2013
Action Search by Example Using Randomized Visual Vocabularies.
IEEE Trans. Image Process., 2013

Sparse representation and learning in visual recognition: Theory and applications.
Signal Process., 2013

Image-to-Class Dynamic Time Warping for 3D hand gesture recognition.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

A Compensated Technique for 2.5-GHz Ring-Oscillator-Based PLL used in Wireless Transmission.
Proceedings of the 2013 IEEE International Conference on Green Computing and Communications (GreenCom) and IEEE Internet of Things (iThings) and IEEE Cyber, 2013

Measuring the engagement level of TV viewers.
Proceedings of the 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, 2013

Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Semi-supervised Node Splitting for Random Forest Construction.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Tensor-Based Human Body Modeling.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Action Detection by Fusing Hierarchically Filtered Motion with Spatiotemporal Interest Point Features.
Proceedings of the Human Behavior Recognition Technologies, 2013

2012
Hierarchical Filtered Motion for Action Recognition in Crowded Videos.
IEEE Trans. Syst. Man Cybern. Part C, 2012

Multi-support-region image descriptors and its application to street landmark localization.
Mach. Vis. Appl., 2012

Predicting human activities using spatio-temporal structure of interest points.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Sparsity-based online missing sensor data recovery.
Proceedings of the 2012 IEEE International Symposium on Circuits and Systems, 2012

A Pyramid Nearest Neighbor Search Kernel for object categorization.
Proceedings of the 21st International Conference on Pattern Recognition, 2012

Propagative Hough Voting for Human Activity Recognition.
Proceedings of the Computer Vision - ECCV 2012, 2012

Robust 3D Action Recognition with Random Occupancy Patterns.
Proceedings of the Computer Vision - ECCV 2012, 2012

Mining actionlet ensemble for action recognition with depth cameras.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

STOP: Space-Time Occupancy Patterns for 3D Action Recognition from Depth Map Sequences.
Proceedings of the Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 2012

Human Activity Recognition with 2D and 3D Cameras.
Proceedings of the Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 2012

Combing RGB and Depth Map Features for human activity recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011
Fast Action Detection via Discriminative Random Forest Voting and Top-K Subvolume Search.
IEEE Trans. Multim., 2011

Introduction to the ICME2010 Special Issue.
IEEE Trans. Multim., 2011

Discriminative Video Pattern Search for Efficient Action Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2011

Real-time human action search using random forest based hough voting.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Real world activity summary for senior home monitoring.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Unsupervised random forest indexing for fast action search.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Face Synthesis.
Proceedings of the Handbook of Face Recognition, 2nd Edition., 2011

2010
Image Ratio Features for Facial Expression Recognition Application.
IEEE Trans. Syst. Man Cybern. Part B, 2010

Efficient search of Top-K video subvolumes for multi-instance action detection.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Learning feature transforms for object detection from panoramic images.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Action detection using multiple spatial-temporal interest point features.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Action recognition based on a bag of 3D points.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2010

Cross-dataset action detection.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Human Action Recognition with Expandable Graphical Models.
Proceedings of the Machine Learning for Human Motion Analysis - Theory and Practice., 2010

2009
Dual-RBF based surface reconstruction.
Vis. Comput., 2009

Active Lighting for Video Conferencing.
IEEE Trans. Circuits Syst. Video Technol., 2009

Face Relighting from a Single Image under Arbitrary Unknown Lighting Conditions.
IEEE Trans. Pattern Anal. Mach. Intell., 2009

Implicit Surface Reconstruction with an Analogy of Polar Field Model.
Proceedings of the Advances in Image and Video Technology, Third Pacific Rim Symposium, 2009

Speeding up spatio-temporal sliding-window search for efficient event detection in crowded videos.
Proceedings of the 1st ACM international workshop on Events in multimedia, 2009

Optimal joint linear acoustic echo cancelation and blind source separation in the presence of loudspeaker nonlinearity.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Sparsity induced similarity measure for label propagation.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

Discriminative subvolume search for efficient action detection.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

Efficient Scale-Space Spatiotemporal Saliency Tracking for Distortion-Free Video Retargeting.
Proceedings of the Computer Vision, 2009

2008
Expandable Data-Driven Graphical Modeling of Human Actions Based on Salient Postures.
IEEE Trans. Circuits Syst. Video Technol., 2008

Multisensory processing for speech enhancement and magnitude-normalized spectra for speech modeling.
Speech Commun., 2008

Semantic saliency driven camera control for personal remote collaboration.
Proceedings of the International Workshop on Multimedia Signal Processing, 2008

Graphical modeling and decoding of human actions.
Proceedings of the International Workshop on Multimedia Signal Processing, 2008

Requirements and recommendations for an enhanced meeting viewing experience.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

Blind source separation in a distributed microphone meeting environment for improved teleconferencing.
Proceedings of the IEEE International Conference on Acoustics, 2008

Meta-tag propagation by co-training an ensemble classifier for improving image search relevance.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2008

A deformable local image descriptor.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

2007
A Generic Framework for Efficient 2-D and 3-D Facial Expression Analogy.
IEEE Trans. Multim., 2007

Head-Size Equalization for Improved Visual Perception in Video Conferencing.
IEEE Trans. Multim., 2007

Learning-Based Perceptual Image Quality Improvement for Video Conferencing.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Enhancing a Driver's Situation Awareness using a Global View Map.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Energy-Based Sound Source Localization and Gain Normalization for Ad Hoc Microphone Arrays.
Proceedings of the IEEE International Conference on Acoustics, 2007

Face Re-Lighting from a Single Image under Harsh Lighting Conditions.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

2006
Geometry-Driven Photorealistic Facial Expression Synthesis.
IEEE Trans. Vis. Comput. Graph., 2006

Iterative Local-Global Energy Minimization for Automatic Extraction of Objects of Interest.
IEEE Trans. Pattern Anal. Mach. Intell., 2006

Speech Modeling with Magnitude-Normalized Complex Spectra and Its Application to Multisensory Speech Enhancement.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Subtle Facial Expression Modeling with Vector Field Decomposition.
Proceedings of the International Conference on Image Processing, 2006

Automatic Business Card Scanning with a Camera.
Proceedings of the International Conference on Image Processing, 2006

Real-Time Facial Expression Mapping for High Resolution 3D Meshes.
Proceedings of the Advances in Computer Graphics, 2006

2005
A graphical model for multi-sensory speech processing in air-and-bone conductive microphones.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Multi-sensory speech processing: incorporating automatically extracted hidden dynamic information.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Head-size equalization for better visual perception of video conferencing.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Automatic Head-size Equalization in Panorama Images for Video Conferencing.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Leakage Model and Teeth Clack Removal for Air- and Bone-Conductive Integrated Microphones.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Characters or Faces: A User Study on Ease of Use for HIPs.
Proceedings of the Human Interactive Proofs, Second International Workshop, 2005

2004
ARTiFACIAL: Automated Reverse Turing test using FACIAL features.
Multim. Syst., 2004

Robust and Rapid Generation of Animated Faces from Video Images: A Model-Based Modeling Approach.
Int. J. Comput. Vis., 2004

Image-Based Surface Detail Transfer.
IEEE Computer Graphics and Applications, 2004

Direct filtering for air- and bone-conductive microphones.
Proceedings of the IEEE 6th Workshop on Multimedia Signal Processing, 2004

Nonlinear information fusion in multi-sensor processing - extracting and exploiting hidden dynamics of speech captured by a bone-conductive microphone.
Proceedings of the IEEE 6th Workshop on Multimedia Signal Processing, 2004

Low bit-rate video streaming for face-to-face teleconference.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Multi-sensory microphones for robust speech detection, enhancement and recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Geometry-driven photorealistic facial expression synthesis.
Proceedings of the 2003 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 2003

Excuse me, but are you human?
Proceedings of the Eleventh ACM International Conference on Multimedia, 2003

Why take notes? Use the whiteboard capture system.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Face Relighting with Radiance Environment Maps.
Proceedings of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), 2003

2002
Distributed meetings: a meeting capture and broadcasting system.
Proceedings of the 10th ACM International Conference on Multimedia 2002, 2002

On recovering detailed face deformation under general lighting using height from shading.
Proceedings of the 2002 IEEE International Conference on Multimedia and Expo, 2002

Model-based face image coding using spherical harmonics.
Proceedings of the 2002 International Conference on Image Processing, 2002

2001
Rapid modeling of animated faces from video.
Comput. Animat. Virtual Worlds, 2001

Expressive expression mapping with ratio images.
Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, 2001

A Robust and Fast Face Modeling System.
Proceedings of the Advances in Multimedia Information Processing, 2001

Cloning Your Own Face with a Desktop Camera.
Proceedings of the Eighth International Conference On Computer Vision (ICCV-01), Vancouver, British Columbia, Canada, July 7-14, 2001, 2001

Model-Based Bundle Adjustment with Application to Face Modeling.
Proceedings of the Eighth International Conference On Computer Vision (ICCV-01), Vancouver, British Columbia, Canada, July 7-14, 2001, 2001

2000
Rapid modeling of animated faces from video images.
Proceedings of the 8th ACM International Conference on Multimedia 2000, Los Angeles, CA, USA, October 30, 2000

Robust Head Motion Computation by Taking Advantage of Physical Properties.
Proceedings of the Workshop on Human Motion, 2000

1996
The Bounded Membership Problem of the Monoid SL_2(N).
Math. Syst. Theory, 1996

1994
Efficient Average-Case Algorithms for the Modular Group.
Electron. Colloquium Comput. Complex., 1994

Hierarchical spacetime control.
Proceedings of the 21st Annual Conference on Computer Graphics and Interactive Techniques, 1994

1993
Minimum Steiner Trees in Normed Planes.
Discret. Comput. Geom., 1993

1992
On Steiner Minimal Trees with L_p Distance.
Algorithmica, 1992

