Yadong Mu

ORCID: 0000-0001-7815-3750

Affiliations:
  • Peking University, Beijing, China
  • AT&T Labs, Middletown, NJ, USA (former)
  • Columbia University, New York, NY, USA (former)


According to our database, Yadong Mu authored at least 133 papers between 2007 and 2024.

Bibliography

2024
Multi-Granularity Interaction for Multi-Person 3D Motion Prediction.
IEEE Trans. Circuits Syst. Video Technol., March, 2024

Localized Linear Temporal Dynamics for Self-Supervised Skeleton Action Recognition.
IEEE Trans. Multim., 2024

Hierarchical reinforcement learning for chip-macro placement in integrated circuit.
Pattern Recognit. Lett., 2024

Pyramidal Flow Matching for Efficient Video Generative Modeling.
CoRR, 2024

Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling.
CoRR, 2024

InstructLayout: Instruction-Driven 2D and 3D Layout Synthesis with Semantic Graph Prior.
CoRR, 2024

HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors.
CoRR, 2024

RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance.
CoRR, 2024

Neural Assembler: Learning to Generate Fine-Grained Robotic Assembly Instructions from Multi-View Images.
CoRR, 2024

Text-controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion.
CoRR, 2024

Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Local Occupancy-Enhanced Object Grasping with Multiple Triplanar Projection.
Proceedings of the Computer Vision - ECCV 2024, 2024

Weakly-Supervised Spatio-Temporal Video Grounding with Variational Cross-Modal Alignment.
Proceedings of the Computer Vision - ECCV 2024, 2024

Countering Personalized Text-to-Image Generation with Influence Watermarks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Ink Dot-Oriented Differentiable Optimization for Neural Image Halftoning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Exploring Orthogonality in Open World Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Transferable Video Moment Localization by Moment-Guided Query Prompting.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Language-Guided Multi-Granularity Context Aggregation for Temporal Sentence Grounding.
IEEE Trans. Multim., 2023

Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization.
CoRR, 2023

Curriculum Graph Poisoning.
Proceedings of the ACM Web Conference 2023, 2023

Image Completion with Heterogeneously Filtered Spectral Hints.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Rewiring Neurons in Non-Stationary Environments.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Co-Salient Object Detection with Semantic-Level Consensus Extraction and Dispersion.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Diffused Fourier Network for Video Action Segmentation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Trapdoor Normalization with Irreversible Ownership Verification.
Proceedings of the International Conference on Machine Learning, 2023

Video Action Segmentation via Contextually Refined Temporal Keypoints.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Neural Koopman Pooling: Control-Inspired Temporal Dynamics Encoding for Skeleton-Based Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-Commerce.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Regularizing Second-Order Influences for Continual Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Tree-Structured Trajectory Encoding for Vision-and-Language Navigation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Zero-Shot Video Event Detection With High-Order Semantic Concept Discovery and Matching.
IEEE Trans. Multim., 2022

Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Conditional Diffusion Process for Inverse Halftoning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Patch-based Knowledge Distillation for Lifelong Person Re-Identification.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Video2Subtitle: Matching Weakly-Synchronized Sequences via Dynamic Temporal Alignment.
Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

Learning Sample Importance for Cross-Scenario Video Temporal Grounding.
Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

Complex Video Action Reasoning via Learnable Markov Logic Network.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Joint Video Summarization and Moment Localization by Cross-Task Sample Transfer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Deep High-Resolution Representation Learning for Visual Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Rethinking the Spatial Route Prior in Vision-and-Language Navigation.
CoRR, 2021

Poisoning MorphNet for Clean-Label Backdoor Attack to Point Clouds.
CoRR, 2021

Searching Motion Graphs for Human Motion Synthesis.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Joint Hand-Object Pose Estimation with Differentiably-Learned Physical Contact Point Analysis.
Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

Question-Guided Semantic Dual-Graph Visual Reasoning with Novel Answers.
Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

Multi-Target Invisibly Trojaned Networks for Visual Recognition and Detection.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Learning 3-D Human Pose Estimation from Catadioptric Videos.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Self-Supervised Video Action Localization with Adversarial Temporal Transforms.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Learning Factorized Cross-View Fusion for Multi-View Crowd Counting.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Efficient Fine-Grained Visual-Text Search Using Adversarially-Learned Hash Codes.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

PointerNet: Spatiotemporal Modeling for Crowd Counting in Videos.
Proceedings of the ICDLT 2021: 5th International Conference on Deep Learning Technologies, Qingdao, China, July 23, 2021

Russian Doll Network: Learning Nested Networks for Sample-Adaptive Dynamic Inference.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

DRENet: Giving Full Scope to Detection and Regression-Based Estimation for Video Crowd Counting.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2021, 2021

Dense Events Grounding in Video.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Fast Fourier Convolution.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Cap2Seg: Inferring Semantic and Spatial Context from Captions for Zero-Shot Image Segmentation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Google Helps YouTube: Learning Few-Shot Video Classification from Historic Tasks and Cross-Domain Sample Transfer.
Proceedings of the 2020 on International Conference on Multimedia Retrieval, 2020

Informative Dropout for Robust Representation Learning: A Shape-bias Perspective.
Proceedings of the 37th International Conference on Machine Learning, 2020

Spectrally-Enforced Global Receptive Field For Contextual Medical Image Segmentation And Classification.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Scale Matters: Temporal Scale Aggregation Network For Precise Action Localization In Untrimmed Videos.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Weakly-Supervised Action Localization by Generative Attention Modeling.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Beyond Short-Term Snippet: Video Relation Detection With Spatio-Temporal Global Context.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Visual-Semantic Matching by Exploring High-Order Attention and Distraction.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning Temporal Co-Attention Models for Unsupervised Video Action Localization.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Non-Local Neural Networks With Grouped Bilinear Attentional Transforms.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Localize, Assemble, and Predicate: Contextual Object Proposal Embedding for Visual Relation Detection.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Revisiting Jump-Diffusion Process for Visual Tracking: A Reinforcement Learning Approach.
IEEE Trans. Circuits Syst. Video Technol., 2019

Learning single-shot vehicle orientation estimation from large-scale street panoramas.
Neurocomputing, 2019

Scale Matters: Temporal Scale Aggregation Network for Precise Action Localization in Untrimmed Videos.
CoRR, 2019

High-Resolution Representations for Labeling Pixels and Regions.
CoRR, 2019

Fast Non-Local Neural Networks with Spectral Residual Learning.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

High-Capacity Convolutional Video Steganography with Temporal Residual Modeling.
Proceedings of the 2019 on International Conference on Multimedia Retrieval, 2019

Two-Stream Video Classification with Cross-Modality Attention.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Recurrent Attentive Zooming for Joint Crowd Counting and Precise Localization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Attention-Based Multi-Context Guiding for Few-Shot Semantic Segmentation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
High-Precision Camera Localization in Scenes with Repetitive Patterns.
ACM Trans. Intell. Syst. Technol., 2018

A Stochastic Attribute Grammar for Robust Cross-View Human Tracking.
IEEE Trans. Circuits Syst. Video Technol., 2018

Convolutional Video Steganography with Temporal Residual Modeling.
CoRR, 2018

2017
Stochastic Gradient Made Stable: A Manifold Propagation Approach for Large-Scale Optimization.
IEEE Trans. Knowl. Data Eng., 2017

Deep Steering: Learning End-to-End Driving Model from Spatial and Temporal Visual Cues.
CoRR, 2017

End-to-end Active Object Tracking via Reinforcement Learning.
CoRR, 2017

Classification by Retrieval: Binarizing Data and Classifiers.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

Temporal Binary Coding for Large-Scale Video Search.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

VSCC'2017: Visual Analysis for Smart and Connected Communities.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Learning End-to-End Autonomous Steering Model from Spatial and Temporal Visual Cues.
Proceedings of the Workshop on Visual Analysis in Smart and Connected Communities, 2017

Deep Hashing: A Joint Approach for Image Signature Learning.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Boosting Complementary Hash Tables for Fast Nearest Neighbor Search.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Learning Binary Codes and Binary Weights for Efficient Classification.
CoRR, 2016

Coordinate Discrete Optimization for Efficient Cross-View Image Retrieval.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

A Stochastic Image Grammar for Fine-Grained 3D Scene Reconstruction.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Fixed-Rank Supervised Metric Learning on Riemannian Manifold.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Large-Scale Unsupervised Hashing with Shared Structure Learning.
IEEE Trans. Cybern., 2015

Large-scale multi-task image labeling with adaptive relevance discovery and feature hashing.
Signal Process., 2015

Stochastic Gradient Made Stable: A Manifold Propagation Approach for Large-Scale Optimization.
CoRR, 2015

Piecewise linear approximation of streaming time series data with max-error guarantees.
Proceedings of the 31st IEEE International Conference on Data Engineering, 2015

2014
Mixed image-keyword query adaptive hashing over multilabel images.
ACM Trans. Multim. Comput. Commun. Appl., 2014

Video De-Fencing.
IEEE Trans. Circuits Syst. Video Technol., 2014

Guest Editorial: Special issue on large scale multimedia semantic indexing.
Comput. Vis. Image Underst., 2014

Supervised deep learning with auxiliary networks.
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014

Text vectorization based on character recognition and character stroke modeling.
Proceedings of the Imaging and Multimedia Analytics in a Web and Mobile World 2014, 2014

Hash-SVM: Scalable Kernel Machines for Large-Scale Visual Classification.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
Large-scale multilabel propagation based on efficient sparse graph construction.
ACM Trans. Multim. Comput. Commun. Appl., 2013

Computational facial attractiveness prediction by aesthetics-aware features.
Neurocomputing, 2013

Divide-and-Conquer Subspace Segmentation.
CoRR, 2013

IBM Research and Columbia University TRECVID-2013 Multimedia Event Detection (MED), Multimedia Event Recounting (MER), Surveillance Event Detection (SED), and Semantic Indexing (SIN) Systems.
Proceedings of the 2013 TREC Video Retrieval Evaluation, 2013

Distributed Low-Rank Subspace Segmentation.
Proceedings of the IEEE International Conference on Computer Vision, 2013

2012
Multimedia semantics-aware query-adaptive hashing with bits reconfigurability.
Int. J. Multim. Inf. Retr., 2012

IBM Research and Columbia University TRECVID-2012 Multimedia Event Detection (MED), Multimedia Event Recounting (MER), and Semantic Indexing (SIN) Systems.
Proceedings of the 2012 TREC Video Retrieval Evaluation, 2012

Submodular video hashing: a unified framework towards video pooling and indexing.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Compact hashing for mixed image-keyword query over multi-label images.
Proceedings of the International Conference on Multimedia Retrieval, 2012

Compact Hyperplane Hashing with Bilinear Functions.
Proceedings of the 29th International Conference on Machine Learning, 2012

Accelerated Large Scale Optimization by Concomitant Hashing.
Proceedings of the Computer Vision - ECCV 2012, 2012

Scene Aligned Pooling for Complex Video Recognition.
Proceedings of the Computer Vision - ECCV 2012, 2012

2011
Efficient region-aware large graph construction towards scalable multi-label propagation.
Pattern Recognit., 2011

Non-uniform multiple kernel learning with cluster-based gating functions.
Neurocomputing, 2011

IBM Research and Columbia University TRECVID-2011 Multimedia Event Detection (MED) System.
Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

Learning reconfigurable hashing for diverse semantics.
Proceedings of the 1st International Conference on Multimedia Retrieval, 2011

Towards Optimal Discriminating Order for Multiclass Classification.
Proceedings of the 11th IEEE International Conference on Data Mining, 2011

Accelerated low-rank visual recovery by random projection.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010
Information-Theoretic Analysis of Input Strokes in Visual Object Cutout.
IEEE Trans. Multim., 2010

MC-JBIG2: an improved algorithm for Chinese textual image compression.
Int. J. Document Anal. Recognit., 2010

Efficient large-scale image annotation by probabilistic collaborative multi-label propagation.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Activity recognition using dense long-duration trajectories.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Randomized Locality Sensitive Vocabularies for Bag-of-Features Model.
Proceedings of the Computer Vision, 2010

Weakly-supervised hashing in kernel space.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Non-Metric Locality-Sensitive Hashing.
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

2009
Prior-guided automatic object cutout in personal album.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Connectivity similarity based transductive learning for interactive image segmentation.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Contextual motion field-based distance for video analysis.
Vis. Comput., 2008

Discriminative local binary patterns for human detection in personal album.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

2007
Co-segmentation of Image Pairs with Quadratic Global Constraint in MRFs.
Proceedings of the Computer Vision, 2007

