Yansong Tang

ORCID: 0000-0002-1534-4549

According to our database, Yansong Tang authored at least 83 papers between 2017 and 2024.

Bibliography

2024
StableSwap: Stable Face Swapping in a Shared and Controllable Latent Space.
IEEE Trans. Multim., 2024

DOVE: Doodled vessel enhancement for photoacoustic angiography super resolution.
Medical Image Anal., 2024

A Multitask Fourier Transformer Network for Seismic Source Characterization Estimation From a Single-Station Waveform.
IEEE Geosci. Remote. Sens. Lett., 2024

Q-VLM: Post-training Quantization for Large Vision-Language Models.
CoRR, 2024

Fully Aligned Network for Referring Image Segmentation.
CoRR, 2024

Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model.
CoRR, 2024

Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena.
CoRR, 2024

RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models.
CoRR, 2024

Hierarchical Memory for Long Video QA.
CoRR, 2024

LIPE: Learning Personalized Identity Prior for Non-rigid Image Editing.
CoRR, 2024

GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation.
CoRR, 2024

VoCo-LLaMA: Towards Vision Compression with Large Language Models.
CoRR, 2024

Localizing Events in Videos with Multimodal Queries.
CoRR, 2024

Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams.
CoRR, 2024

ManiCM: Real-time 3D Diffusion Policy via Consistency Model for Robotic Manipulation.
CoRR, 2024

OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models.
CoRR, 2024

GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling.
CoRR, 2024

Learning Dual-Level Deformable Implicit Representation for Real-World Scale Arbitrary Super-Resolution.
CoRR, 2024

1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation.
CoRR, 2024

Language-Free Compositional Action Generation via Decoupling Refinement.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Post-training Quantization with Progressive Calibration and Activation Relaxing for Text-to-Image Diffusion Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Plan, Posture and Go: Towards Open-Vocabulary Text-to-Motion Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

MotionLCM: Real-Time Controllable Motion Generation via Latent Consistency Model.
Proceedings of the Computer Vision - ECCV 2024, 2024

FlowIE: Efficient Image Enhancement via Rectified Flow.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Narrative Action Evaluation with Prompt-Guided Multimodal Interaction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

PTM-VQA: Efficient Video Quality Assessment Leveraging Diverse PreTrained Models from the Wild.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Towards Accurate Post-Training Quantization for Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Universal Segmentation at Arbitrary Granularity with Language Instruction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Open-Vocabulary Segmentation with Semantic-Assisted Calibration.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Segment and Caption Anything.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

CoSTA: End-to-End Comprehensive Space-Time Entanglement for Spatio-Temporal Video Grounding.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Learning Multi-Scale Video-Text Correspondence for Weakly Supervised Temporal Article Grounding.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Plan, Posture and Go: Towards Open-World Text-to-Motion Generation.
CoRR, 2023

OccNeRF: Self-Supervised Multi-Camera Occupancy Prediction with Neural Radiance Fields.
CoRR, 2023

ThinkBot: Embodied Instruction Following with Thought Chain Reasoning.
CoRR, 2023

Fine-tuning vision foundation model for crack segmentation in civil infrastructures.
CoRR, 2023

Lightweight Diffusion Models with Distillation-Based Block Neural Architecture Search.
CoRR, 2023

Language-free Compositional Action Generation via Decoupling Refinement.
CoRR, 2023

Efficient Text-Guided 3D-Aware Portrait Generation with Score Distillation Sampling on Distribution.
CoRR, 2023

Towards Accurate Data-free Quantization for Diffusion Models.
CoRR, 2023

Self-similarity-based super-resolution of photoacoustic angiography from hand-drawn doodles.
CoRR, 2023

Efficient Meshy Neural Fields for Animatable Human Avatars.
CoRR, 2023

SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

MCUFormer: Deploying Vision Transformers on Microcontrollers with Limited Memory.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Ada-DQA: Adaptive Diverse Quality-aware Feature Acquisition for Video Quality Assessment.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

LUNA: Language as Continuing Anchors for Referring Expression Comprehension.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

HOI-aware Adaptive Network for Weakly-supervised Action Segmentation.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

GAIN: On the Generalization of Instructional Action Understanding.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Context-Aware Inpainter-Refiner for Skeleton-Based Human Motion Completion.
Proceedings of the IEEE International Conference on Image Processing, 2023

FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Skip-Plan: Procedure Planning in Instructional Videos via Condensed Action Space Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Global Knowledge Calibration for Fast Open-Vocabulary Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

LOGO: A Long-Form Video Dataset for Group Action Quality Assessment.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

FLAG3D: A 3D Fitness Activity Dataset with Language Instruction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2022

VideoABC: A Real-World Video Dataset for Abductive Visual Reasoning.
IEEE Trans. Image Process., 2022

HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

ScalableViT: Rethinking the Context-Oriented Generalization of Vision Transformer.
Proceedings of the Computer Vision - ECCV 2022, 2022

Global Spectral Filter Memory Network for Video Object Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

Semantic-Aware Auto-Encoders for Self-supervised Representation Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

BNV-Fusion: Dense 3D Reconstruction using Bi-level Neural Volume Fusion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

LAVT: Language-Aware Vision Transformer for Referring Image Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

YouMVOS: An Actor-centric Multi-shot Video Object Segmentation Dataset.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Comprehensive Instructional Video Analysis: The COIN Dataset and Performance Evaluation.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Unsupervised Embedding Learning from Uncertainty Momentum Modeling.
CoRR, 2021

Breaking Shortcut: Exploring Fully Convolutional Cycle-Consistency for Video Correspondence Learning.
CoRR, 2021

Hierarchical Interaction Network for Video Object Segmentation from Referring Expressions.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020
Graph Interaction Networks for Relation Transfer in Human Activity Videos.
IEEE Trans. Circuits Syst. Video Technol., 2020

Uncertainty-Aware Score Distribution Learning for Action Quality Assessment.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Learning Semantics-Preserving Attention and Contextual Interaction for Group Activity Recognition.
IEEE Trans. Image Process., 2019

Multi-Stream Deep Neural Networks for RGB-D Egocentric Action Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2019

COIN: A Large-Scale Dataset for Comprehensive Instructional Video Analysis.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Mining Semantics-Preserving Attention for Group Activity Recognition.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Deep Progressive Reinforcement Learning for Skeleton-Based Action Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Action recognition in RGB-D egocentric videos.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017
