Li Yuan

ORCID: 0000-0002-2120-5588

Affiliations:
  • Peking University, School of Electronic and Computer Engineering, Beijing, China
  • National University of Singapore, Singapore


According to our database, Li Yuan authored at least 104 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
An Organ-Aware Diagnosis Framework for Radiology Report Generation.
IEEE Trans. Medical Imaging, December, 2024

Adversarial Attacks on Video Object Segmentation With Hard Region Discovery.
IEEE Trans. Circuits Syst. Video Technol., June, 2024

Efficient Long-Short Temporal Attention network for unsupervised Video Object Segmentation.
Pattern Recognit., February, 2024

Fully Transformer-Equipped Architecture for end-to-end Referring Video Object Segmentation.
Inf. Process. Manag., January, 2024

Self-architectural knowledge distillation for spiking neural networks.
Neural Networks, 2024

ETTFS: An Efficient Training Framework for Time-to-First-Spike Neuron.
CoRR, 2024

Spatial-Temporal Search for Spiking Neural Networks.
CoRR, 2024

MoH: Multi-Head Attention as Mixture-of-Head Attention.
CoRR, 2024

Is Parameter Collision Hindering Continual Learning in LLMs?
CoRR, 2024

MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts.
CoRR, 2024

Multi-granularity Score-based Generative Framework Enables Efficient Inverse Design of Complex Organics.
CoRR, 2024

ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis.
CoRR, 2024

Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle.
CoRR, 2024

HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions.
CoRR, 2024

Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation.
CoRR, 2024

ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation.
CoRR, 2024

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions.
CoRR, 2024

EvaGaussians: Event Stream Assisted Gaussian Splatting from Blurry Images.
CoRR, 2024

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators.
CoRR, 2024

QKFormer: Hierarchical Spiking Transformer using Q-K Attention.
CoRR, 2024

Envision3D: One Image to 3D with Anchor Views Interpolation.
CoRR, 2024

TaxDiff: Taxonomic-Guided Diffusion Model for Protein Sequence Generation.
CoRR, 2024

ProLLaMA: A Protein Large Language Model for Multi-Task Protein Language Processing.
CoRR, 2024

LLMBind: A Unified Modality-Task Integration Framework.
CoRR, 2024

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models.
CoRR, 2024

Deep peak property learning for efficient chiral molecules ECD spectra prediction.
CoRR, 2024

Spikformer V2: Join the High Accuracy Club on ImageNet with an SNN Ticket.
CoRR, 2024

Prompt2Poster: Automatically Artistic Chinese Poster Creation from Prompt Only.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Optimal ANN-SNN Conversion with Group Neurons.
Proceedings of the IEEE International Conference on Acoustics, 2024

Temporal Contrastive Learning for Spiking Neural Networks.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2024, 2024

A Multi-modal Spiking Meta-learner with Brain-Inspired Task-Aware Modulation Scheme.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2024, 2024

LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Repaint123: Fast and High-Quality One Image to 3D Generation with Progressive Controllable Repainting.
Proceedings of the Computer Vision - ECCV 2024, 2024

HiFi-123: Towards High-Fidelity One Image to 3D Content Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

FreestyleRet: Retrieving Images from Style-Diversified Queries.
Proceedings of the Computer Vision - ECCV 2024, 2024

Learning Pseudo 3D Guidance for View-Consistent Texturing with 2D Diffusion.
Proceedings of the Computer Vision - ECCV 2024, 2024

Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

GraCo: Granularity-Controllable Interactive Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Regressor-Segmenter Mutual Prompt Learning for Crowd Counting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

SynSP: Synergy of Smoothness and Precision in Pose Sequences Refinement.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Towards Better Seach Query Classification with Distribution-Diverse Multi-Expert Knowledge Distillation in JD Ads Search.
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024

RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Parallel Vertex Diffusion for Unified Visual Grounding.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
MADAv2: Advanced Multi-Anchor Based Active Domain Adaptation Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Truncated attention-aware proposal networks with multi-scale dilation for temporal action detection.
Pattern Recognit., October, 2023

VOLO: Vision Outlooker for Visual Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

Masked Autoencoders for 3D Point Cloud Self-supervised Learning.
World Sci. Annu. Rev. Artif. Intell., 2023

Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting.
CoRR, 2023

Machine Mindset: An MBTI Exploration of Large Language Models.
CoRR, 2023

FreestyleRet: Retrieving Images from Style-Diversified Queries.
CoRR, 2023

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection.
CoRR, 2023

Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding.
CoRR, 2023

Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs.
CoRR, 2023

HiFi-123: Towards High-fidelity One Image to 3D Content Generation.
CoRR, 2023

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment.
CoRR, 2023

Triple-View Knowledge Distillation for Semi-Supervised Semantic Segmentation.
CoRR, 2023

ChatLaw: Open-Source Legal Large Language Model with Integrated External Knowledge Bases.
CoRR, 2023

Auto-Spikformer: Spikformer Architecture Search.
CoRR, 2023

ChatFace: Chat-Guided Real Face Editing via Diffusion Latent Space Manipulation.
CoRR, 2023

Album Storytelling with Iterative Story-aware Captioning and Large Language Models.
CoRR, 2023

Parallel Vertex Diffusion for Unified Visual Grounding.
CoRR, 2023

Spike-driven Transformer.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Spikformer: When Spiking Neural Network Meets Transformer.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Learning Sparse Neural Networks with Identity Layers.
Proceedings of the Image and Graphics - 12th International Conference, 2023

Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DiffusionRet: Generative Text-Video Retrieval with Diffusion Model.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning with Fantasy: Semantic-Aware Virtual Contrastive Constraint for Few-Shot Class-Incremental Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Out-of-Candidate Rectification for Weakly Supervised Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

ACSeg: Adaptive Conceptualization for Unsupervised Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Dynamic Clustering Network for Unsupervised Semantic Segmentation.
CoRR, 2022

Masked Autoencoders for Point Cloud Self-supervised Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

Locality Guidance for Improving Vision Transformers on Tiny Datasets.
Proceedings of the Computer Vision, 2022

Improving Vision Transformers by Revisiting High-Frequency Components.
Proceedings of the Computer Vision, 2022

2021
Exploring global diverse attention via pairwise temporal relation for video summarization.
Pattern Recognit., 2021

Refiner: Refining Self-attention for Vision Transformers.
CoRR, 2021

Token Labeling: Training a 85.4% Top-1 Accuracy Vision Transformer with 56M Parameters on ImageNet.
CoRR, 2021

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet.
CoRR, 2021

All Tokens Matter: Token Labeling for Training Better Vision Transformers.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

PnP-DETR: Towards Efficient Visual Analysis with Transformers.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Continual Learning via Bit-Level Information Preserving.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Unsupervised Video Summarization With Cycle-Consistent Adversarial LSTM Networks.
IEEE Trans. Multim., 2020

Adversarial images for the primate brain.
CoRR, 2020

Towards Accurate Human Pose Estimation in Videos of Crowded Scenes.
CoRR, 2020

A Simple Baseline for Pose Tracking in Videos of Crowded Scenes.
CoRR, 2020

Toward Accurate Person-level Action Recognition in Videos of Crowded Scenes.
CoRR, 2020

Toward Accurate Person-level Action Recognition in Videos of Crowed Scenes.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Towards Accurate Human Pose Estimation in Videos of Crowded Scenes.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

A Simple Baseline for Pose Tracking in Videos of Crowed Scenes.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Central Similarity Quantization for Efficient Image and Video Retrieval.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Revisiting Knowledge Distillation via Label Smoothing Regularization.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Revisit Knowledge Distillation: a Teacher-free Framework.
CoRR, 2019

Central Similarity Hashing via Hadamard matrix.
CoRR, 2019

Few-Shot Adaptive Faster R-CNN.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Distilling Object Detectors With Fine-Grained Feature Imitation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Cycle-SUM: Cycle-Consistent Adversarial LSTM Networks for Unsupervised Video Summarization.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Object Relation Detection Based on One-shot Learning.
CoRR, 2018

