Yingya Zhang

Orcid: 0009-0008-9524-9218

According to our database, Yingya Zhang authored at least 49 papers between 2013 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2024
CLIP-guided Prototype Modulating for Few-shot Action Recognition.
Int. J. Comput. Vis., June, 2024

CMDFusion: Bidirectional Fusion Network With Cross-Modality Knowledge Distillation for LiDAR Semantic Segmentation.
IEEE Robotics Autom. Lett., January, 2024

EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.
CoRR, 2024

FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing.
CoRR, 2024

UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation.
CoRR, 2024

S³D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis.
Proceedings of the Computer Vision - ECCV 2024, 2024

InstructVideo: Instructing Video Diffusion Models with Human Feedback.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

A Recipe for Scaling up Text-to-Video Generation with Text-free Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DreamVideo: Composing Your Dream Videos with Customized Subject and Motion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models.
CoRR, 2023

VideoLCM: Video Latent Consistency Model.
CoRR, 2023

DreamVideo: Composing Your Dream Videos with Customized Subject and Motion.
CoRR, 2023

I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models.
CoRR, 2023

Few-shot Action Recognition with Captioning Foundation Models.
CoRR, 2023

DeltaSpace: A Semantic-aligned Feature Space for Flexible Text-guided Image Editing.
CoRR, 2023

ModelScope Text-to-Video Technical Report.
CoRR, 2023

Temporally-Adaptive Models for Efficient Video Understanding.
CoRR, 2023

VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation.
CoRR, 2023

FaceComposer: A Unified Model for Versatile Facial Content Creation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

VideoComposer: Compositional Video Synthesis with Motion Controllability.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

The Devil is in the Wrongly-classified Samples: Towards Unified Open-set Recognition.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

RLIPv2: Fast Scaling of Relational Language-Image Pre-training.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Space-time Prompting for Video Class-incremental Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

LipFormer: High-fidelity and Generalizable Talking Face Generation with A Pre-learned Facial Codebook.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Enlarging Instance-specific and Class-specific Information for Open-set Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Revisiting Optimal Convergence Rate for Smooth and Non-convex Stochastic Decentralized Optimization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
ANN Softmax: Acceleration of Extreme Classification Training.
Proc. VLDB Endow., 2021

ACCL: Architecting Highly Scalable Distributed Training Systems With Highly Efficient Collective Communication Library.
IEEE Micro, 2021

Once and for All: Self-supervised Multi-modal Co-training on One-billion Videos at Alibaba.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Extremely Compact Non-local Representation Learning.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Accelerating Gossip SGD with Periodic Global Averaging.
Proceedings of the 38th International Conference on Machine Learning, 2021

DecentLaM: Decentralized Momentum SGD for Large-batch Deep Training.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Communication Efficient SGD via Gradient Sampling With Bayes Prior.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Distribution Adaptive INT8 Quantization for Training CNNs.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Large-Scale Training System for 100-Million Classification at Alibaba.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

EFLOPS: Algorithm and System Co-Design for a High Performance Distributed Training Platform.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

2019
Large-Scale Visual Search with Binary Distributed Graph at Alibaba.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019

2018
Visual Search at Alibaba.
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

2016
Information Theoretic Subspace Clustering.
IEEE Trans. Neural Networks Learn. Syst., 2016

Vehicle trajectory prediction based on Hidden Markov Model.
KSII Trans. Internet Inf. Syst., 2016

A Method for Traffic Congestion Clustering Judgment Based on Grey Relational Analysis.
ISPRS Int. J. Geo Inf., 2016

2015
Robust Subspace Clustering With Complex Noise.
IEEE Trans. Image Process., 2015

A Method of Vehicle Route Prediction Based on Social Network Analysis.
J. Sensors, 2015

2013
Robust Subspace Clustering via Half-Quadratic Minimization.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Robust Low-Rank Representation via Correntropy.
Proceedings of the 2nd IAPR Asian Conference on Pattern Recognition, 2013

