Guanglu Song

Orcid: 0000-0001-5391-5712

According to our database, Guanglu Song authored at least 61 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping.
CoRR, 2024

EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM.
CoRR, 2024

See Further When Clear: Curriculum Consistency Model.
CoRR, 2024

Robo-MUTUAL: Robotic Multimodal Task Specification via Unimodal Learning.
CoRR, 2024

MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines.
CoRR, 2024

Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models.
CoRR, 2024

Phased Consistency Model.
CoRR, 2024

MoVA: Adapting Mixture of Vision Experts to Multimodal Context.
CoRR, 2024

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching.
CoRR, 2024

Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models.
CoRR, 2024

AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning.
CoRR, 2024

AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data.
Proceedings of the SIGGRAPH Asia 2024 Technical Communications, 2024

Three Things We Need to Know About Transferring Stable Diffusion to Visual Dense Prediction Tasks.
Proceedings of the Computer Vision - ECCV 2024, 2024

Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

Be-Your-Outpainter: Mastering Video Outpainting Through Input-Specific Adaptation.
Proceedings of the Computer Vision - ECCV 2024, 2024

ZoLA: Zero-Shot Creative Long Animation Generation with Short Video Model.
Proceedings of the Computer Vision - ECCV 2024, 2024

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis.
Proceedings of the Computer Vision - ECCV 2024, 2024

Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

LMDrive: Closed-Loop End-to-End Driving with Large Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Teach-DETR: Better Training DETR With Teachers.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

UniFormer: Unifying Convolution and Self-Attention for Visual Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Towards Large-scale Masked Face Recognition.
CoRR, 2023

RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths.
CoRR, 2023

Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising.
CoRR, 2023

RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DETRs with Collaborative Hybrid Assignments Training.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Masked Autoencoders Are Stronger Knowledge Distillers.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

UniKD: Universal Knowledge Distillation for Mimicking Homogeneous or Heterogeneous Object Detectors.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
Large-batch Optimization for Dense Visual Predictions.
CoRR, 2022

UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning.
CoRR, 2022

Large-batch Optimization for Dense Visual Predictions: Training Faster R-CNN in 4.2 Minutes.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Self-slimmed Vision Transformer.
Proceedings of the Computer Vision - ECCV 2022, 2022

Towards Robust Face Recognition with Comprehensive Search.
Proceedings of the Computer Vision - ECCV 2022, 2022

Rethinking Robust Representation Learning Under Fine-Grained Noisy Faces.
Proceedings of the Computer Vision - ECCV 2022, 2022

UniNet: Unified Architecture Search with Convolution, Transformer, and MLP.
Proceedings of the Computer Vision - ECCV 2022, 2022

Unifying Visual Perception by Dispersible Points Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
INTERN: A New Learning Paradigm Towards General Vision.
CoRR, 2021

FNAS: Uncertainty-Aware Fast Neural Architecture Search.
CoRR, 2021

Scale Semantic Flow Preserving Across Image Pyramid.
Proceedings of the Neural Information Processing - 28th International Conference, 2021

PCNET: Parallelly Conquer the Large Variance of Person Re-Identification.
Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Rectifying the Data Bias in Knowledge Distillation.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Switchable K-class Hyperplanes for Noise-Robust Representation Learning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020
Weighted triple-sequence loss for video-based person re-identification.
Neurocomputing, 2020

1st place solution for AVA-Kinetics Crossover in AcitivityNet Challenge 2020.
CoRR, 2020

1st Place Solutions for OpenImage2019 - Object Detection and Instance Segmentation.
CoRR, 2020

Top-1 Solution of Multi-Moments in Time Challenge 2019.
CoRR, 2020

Discriminability Distillation in Group Representation Learning.
Proceedings of the Computer Vision - ECCV 2020, 2020

Revisiting the Sibling Head in Object Detector.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

KPNet: Towards Minimal Face Detector.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Spatial-Transformed Regional Quality Estimation Network for Large-Variance Person Re-Identification.
IEEE Access, 2019

Scale Pyramid Attention for Single Shot MultiBox Detector.
IEEE Access, 2019

Towards Flops-Constrained Face Recognition.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

2018
Fast Portrait Matting Using Spatial Detail-Preserving Network.
Proceedings of the Neural Information Processing - 25th International Conference, 2018

Transductive Centroid Projection for Semi-supervised Large-Scale Recognition.
Proceedings of the Computer Vision - ECCV 2018, 2018

Beyond Trade-Off: Accelerate FCN-Based Face Detector With Higher Accuracy.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Region-Based Quality Estimation Network for Large-Scale Person Re-Identification.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Spatial Quality Aware Network for Video-Based Person Re-identification.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

A Multi-level Weighted Representation for Person Re-identification.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2017, 2017

