Linchao Zhu

Orcid: 0000-0002-4093-7557

According to our database, Linchao Zhu authored at least 130 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2024
Noise-Tolerant Hybrid Prototypical Learning with Noisy Web Data.
ACM Trans. Multim. Comput. Commun. Appl., October, 2024

Divide and Retain: A Dual-Phase Modeling for Long-Tailed Visual Recognition.
IEEE Trans. Neural Networks Learn. Syst., October, 2024

Bilaterally Normalized Scale-Consistent Sinkhorn Distance for Few-Shot Image Classification.
IEEE Trans. Neural Networks Learn. Syst., August, 2024

Penalizing the Hard Example But Not Too Much: A Strong Baseline for Fine-Grained Visual Classification.
IEEE Trans. Neural Networks Learn. Syst., May, 2024

CMGNet: Collaborative multi-modal graph network for video captioning.
Comput. Vis. Image Underst., January, 2024

Show Me a Video: A Large-Scale Narrated Video Dataset for Coherent Story Illustration.
IEEE Trans. Multim., 2024

SKIM: Skeleton-Based Isolated Sign Language Recognition With Part Mixing.
IEEE Trans. Multim., 2024

IcoCap: Improving Video Captioning by Compounding Images.
IEEE Trans. Multim., 2024

Zero-Shot Video Grounding With Pseudo Query Lookup and Verification.
IEEE Trans. Image Process., 2024

Collaborative group: Composed image retrieval via consensus learning from noisy annotations.
Knowl. Based Syst., 2024

MC-Bench: A Benchmark for Multi-Context Visual Grounding in the Era of MLLMs.
CoRR, 2024

Point-Calibrated Spectral Neural Operators.
CoRR, 2024

FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention.
CoRR, 2024

High-Fidelity Facial Albedo Estimation via Texture Quantization.
CoRR, 2024

DeltaPhi: Learning Physical Trajectory Residual for PDE Solving.
CoRR, 2024

AudioScenic: Audio-Driven Video Scene Editing.
CoRR, 2024

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback.
CoRR, 2024

EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing.
CoRR, 2024

Ghost Sentence: A Tool for Everyday Users to Copyright Data from Large Language Models.
CoRR, 2024

AntEval: Quantitatively Evaluating Informativeness and Expressiveness of Agent Social Interactions.
CoRR, 2024

GG-Editor: Locally Editing 3D Avatars with Multimodal Large Language Model Guidance.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Neural Interaction Energy for Multi-Agent Trajectory Prediction.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

MoS²: Mixture of Scale and Shift Experts for Text-Only Video Captioning.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Knowledge-Enhanced Dual-Stream Zero-Shot Composed Image Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

CapHuman: Capture Your Moments in Parallel Universes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

FragRel: Exploiting Fragment-level Relations in the External Memory of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

VillagerAgent: A Graph-Based Multi-Agent Framework for Coordinating Complex Task Dependencies in Minecraft.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Stitching Segments and Sentences towards Generalization in Video-Text Pre-training.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Filter Pruning by Switching to Neighboring CNNs With Good Attributes.
IEEE Trans. Neural Networks Learn. Syst., October, 2023

Lightweight Distortion-Aware Network for Salient Object Detection in Omnidirectional Images.
IEEE Trans. Circuits Syst. Video Technol., October, 2023

Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Deep Tabular Data Modeling With Dual-Route Structure-Adaptive Graph Networks.
IEEE Trans. Knowl. Data Eng., September, 2023

PoseGU: 3D human pose estimation with novel human pose generator and unbiased learning.
Comput. Vis. Image Underst., August, 2023

Symbiotic Attention for Egocentric Action Recognition With Object-Centric Alignment.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

A Differentiable Parallel Sampler for Efficient Video Classification.
ACM Trans. Multim. Comput. Commun. Appl., 2023

Align and Tell: Boosting Text-Video Retrieval With Local Alignment and Fine-Grained Supervision.
IEEE Trans. Multim., 2023

Language-Guided Multi-Granularity Context Aggregation for Temporal Sentence Grounding.
IEEE Trans. Multim., 2023

Co-Learning Meets Stitch-Up for Noisy Multi-Label Visual Recognition.
IEEE Trans. Image Process., 2023

Collaborative Contrastive Refining for Weakly Supervised Person Search.
IEEE Trans. Image Process., 2023

Discriminative Radial Domain Adaptation.
IEEE Trans. Image Process., 2023

Exploring viewport features for semi-supervised saliency prediction in omnidirectional images.
Image Vis. Comput., 2023

FlowZero: Zero-Shot Text-to-Video Synthesis with LLM-Driven Dynamic Scene Syntax.
CoRR, 2023

Combating Label Noise With A General Surrogate Model For Sample Selection.
CoRR, 2023

DiverseMotion: Towards Diverse Human Motion Generation via Discrete Diffusion.
CoRR, 2023

Tachikuma: Understading Complex Interactions with Multi-Character and Novel Objects by Large Language Models.
CoRR, 2023

Whitening-based Contrastive Learning of Sentence Embeddings.
CoRR, 2023

CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model.
CoRR, 2023

Temporal Perceiving Video-Language Pre-training.
CoRR, 2023

DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

MAAL: Multimodality-Aware Autoencoder-based Affordance Learning for 3D Articulated Objects.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Text Augmented Spatial Aware Zero-shot Referring Image Segmentation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Efficient Multimodal Fusion via Interactive Prompting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

PointListNet: Deep Learning on 3D Point Lists.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

WhitenedCSE: Whitening-based Contrastive Learning of Sentence Embeddings.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Gloss-Free End-to-End Sign Language Translation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Temporal Cross-Layer Correlation Mining for Action Recognition.
IEEE Trans. Multim., 2022

Weakly Supervised RGB-D Salient Object Detection With Prediction Consistency Training and Active Scribble Boosting.
IEEE Trans. Image Process., 2022

SemGloVe: Semantic Co-Occurrences for GloVe From BERT.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Label Independent Memory for Semi-Supervised Few-Shot Video Classification.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Instance-Invariant Domain Adaptive Object Detection Via Progressive Disentanglement.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

AFE-CNN: 3D Skeleton-based Action Recognition with Action Feature Enhancement.
Neurocomputing, 2022

Weakly Supervised Moment Localization with Decoupled Consistent Concept Prediction.
Int. J. Comput. Vis., 2022

Slimmable Networks for Contrastive Self-supervised Learning.
CoRR, 2022

CenterCLIP: Token Clustering for Efficient Text-Video Retrieval.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Feature-Robust Optimal Transport for High-Dimensional Data.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2022

Fine-Grained Semantically Aligned Vision-Language Pre-Training.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Unified Transformer Tracker for Object Tracking.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

A Simple Episodic Linear Probe Improves Visual Recognition in the Wild.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

SEEG: Semantic Energized Co-speech Gesture Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Complex Video Action Reasoning via Learnable Markov Logic Network.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Few-Shot Common-Object Reasoning Using Common-Centric Localization Network.
IEEE Trans. Image Process., 2021

Training Robust Object Detectors From Noisy Category Labels and Imprecise Bounding Boxes.
IEEE Trans. Image Process., 2021

Learning to Anticipate Egocentric Actions by Imagination.
IEEE Trans. Image Process., 2021

Holistic LSTM for Pedestrian Trajectory Prediction.
IEEE Trans. Image Process., 2021

Visual commonsense reasoning with directional visual connections.
Frontiers Inf. Technol. Electron. Eng., 2021

Less is More: Sparse Sampling for Dense Reaction Predictions.
CoRR, 2021

OR-Net: Pointwise Relational Inference for Data Completion under Partial Observation.
CoRR, 2021

Universal-Prototype Augmentation for Few-Shot Object Detection.
CoRR, 2021

PoseGate-Former: Transformer Encoder with Trainable Gate for 3D Human Pose Estimation Using Weakly Supervised Learning.
Proceedings of the Neural Information Processing - 28th International Conference, 2021

Vector-Decomposed Disentanglement for Domain-Invariant Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Universal-Prototype Enhancing for Few-Shot Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Interactive Prototype Learning for Egocentric Action Recognition.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

A Multi-Mode Modulator for Multi-Domain Few-Shot Classification.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Adaptive Hierarchical Graph Reasoning with Semantic Coherence for Video-and-Language Inference.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

OpenMix: Reviving Known Knowledge for Discovering Novel Visual Categories in an Open World.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Faster Meta Update Strategy for Noise-Robust Deep Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Recurrent Attention Network with Reinforced Generator for Visual Dialog.
ACM Trans. Multim. Comput. Commun. Appl., 2020

Feature Robust Optimal Transport for High-dimensional Data.
CoRR, 2020

UTS Submission at the TRECVID 2020 Disaster Scene Description and Indexing Task.
Proceedings of the 2020 TREC Video Retrieval Evaluation, 2020

Learning to Transfer Learn: Reinforcement Learning-Based Selection for Adaptive Transfer Learning.
Proceedings of the Computer Vision - ECCV 2020, 2020

Motion-Excited Sampler: Video Adversarial Attack with Sparked Prior.
Proceedings of the Computer Vision - ECCV 2020, 2020

SF-Net: Single-Frame Supervision for Temporal Action Localization.
Proceedings of the Computer Vision - ECCV 2020, 2020

ActBERT: Learning Global-Local Video-Text Representations.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Inflated Episodic Memory With Region Self-Attention for Long-Tailed Visual Recognition.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Gated Channel Transformation for Visual Recognition.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Semantic Correspondence as an Optimal Transport Problem.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning Filter Pruning Criteria for Deep Convolutional Neural Networks Acceleration.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

FASTER Recurrent Networks for Efficient Video Classification.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Symbiotic Attention with Privileged Information for Egocentric Action Recognition.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Video representation learning with deep neural networks.
PhD thesis, 2019

Learning to Transfer Learn.
CoRR, 2019

Baidu-UTS Submission to the EPIC-Kitchens Action Recognition Challenge 2019.
CoRR, 2019

FASTER Recurrent Networks for Video Classification.
CoRR, 2019

Meta Filter Pruning to Accelerate Deep Convolutional Neural Networks.
CoRR, 2019

Connective Cognition Network for Directional Visual Commonsense Reasoning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Dual Attention Matching for Audio-Visual Event Localization.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Auto-ReID: Searching for a Part-Aware ConvNet for Person Re-Identification.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Entangled Transformer for Image Captioning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Sim-Real Joint Reinforcement Transfer for 3D Indoor Navigation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Cubic LSTMs for Video Prediction.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Attentive Sequence to Sequence Translation for Localizing Clips of Interest by Natural Language Descriptions.
CoRR, 2018

UTS_CAI submission at TRECVID 2018 Ad-hoc Video Search Task.
Proceedings of the 2018 TREC Video Retrieval Evaluation, 2018

Activities in Extended Video.
Proceedings of the 2018 TREC Video Retrieval Evaluation, 2018

Decoupled Novel Object Captioner.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Fast Parameter Adaptation for Few-shot Image Captioning and Visual Question Answering.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Watching a Small Portion could be as Good as Watching All: Towards Efficient Video Classification.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Compound Memory Networks for Few-Shot Video Classification.
Proceedings of the Computer Vision - ECCV 2018, 2018

2017
Uncovering the Temporal Context for Video Question Answering.
Int. J. Comput. Vis., 2017

UTS submission to Google YouTube-8M Challenge 2017.
CoRR, 2017

Bidirectional Multirate Reconstruction for Temporal Modeling in Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Few-Shot Object Recognition from Machine-Labeled Web Images.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Recognizing an Action Using Its Name: A Knowledge-Based Approach.
Int. J. Comput. Vis., 2016

UTS-CMU-D2DCRC Submission at TRECVID 2016 Video Localization.
Proceedings of the 2016 TREC Video Retrieval Evaluation, 2016

2015
Uncovering Temporal Context for Video Question and Answering.
CoRR, 2015

