Kai Wang

Orcid: 0000-0002-1154-5175

Affiliations:

National University of Singapore, School of Computing, Singapore
Alibaba Group, DAMO Academy, Hangzhou, China
Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, China

According to our database, Kai Wang authored at least 94 papers between 2017 and 2025.

Collaborative distances:

Dijkstra number of three.
Erdős number of four.

Bibliography

2025

Enhance-A-Video: Better Generated Video for Free.

CoRR, February, 2025

Recurrent Diffusion for Large-Scale Parameter Generation.

CoRR, January, 2025

2024

Region Generation and Assessment Network for Occluded Person Re-Identification.

IEEE Trans. Inf. Forensics Secur., 2024

HARWE: A multi-modal large-scale dataset for context-aware human activity recognition in smart working environments.

Konstantinos N. Plataniotis

Pattern Recognit. Lett., 2024

Faster Vision Mamba is Rebuilt in Minutes via Merged Token Re-training.

Shanmukha Ramakrishna Vedantam

Wangbo Zhao

Kai Wang

Yang You

CoRR, 2024

HunyuanVideo: A Systematic Framework For Large Video Generative Models.

CoRR, 2024

A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs.

CoRR, 2024

Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios.

Konstantinos N. Plataniotis

Alex Hauptmann

Yang You

CoRR, 2024

Dynamic Diffusion Transformer.

CoRR, 2024

Real-Time Video Generation with Pyramid Attention Broadcast.

CoRR, 2024

Prioritize Alignment in Dataset Distillation.

Konstantinos N. Plataniotis

Kai Wang

Yang You

CoRR, 2024

More Than Positive and Negative: Communicating Fine Granularity in Medical Diagnosis.

CoRR, 2024

Conditional LoRA Parameter Generation.

CoRR, 2024

Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning.

CoRR, 2024

AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation.

CoRR, 2024

A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training.

CoRR, 2024

Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond.

CoRR, 2024

DynST: Dynamic Sparse Training for Resource-Constrained Spatio-Temporal Forecasting.

CoRR, 2024

Neural Network Diffusion.

CoRR, 2024

Two Trades is not Baffled: Condensing Graph via Crafting Rational Gradient Matching.

CoRR, 2024

Must: Maximizing Latent Capacity of Spatial Transcriptomics Data.

CoRR, 2024

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning.

Alexander G. Hauptmann

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

The Snowflake Hypothesis: Training and Powering GNN with One Node One Receptive Field.

Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Two Heads Are Better Than One: Boosting Graph Sparse Training via Semantic and Topological Awareness.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

DiffAug: Enhance Unsupervised Contrastive Learning with Domain-Knowledge-Free Diffusion-based Data Augmentation.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Can We Evaluate Domain Adaptation Models Without Target-Domain Labels?

Proceedings of the Twelfth International Conference on Learning Representations, 2024

NuwaDynamics: Discovering and Updating in Causal Spatio-Temporal Modeling.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Towards Lossless Dataset Distillation via Difficulty-Aligned Trajectory Matching.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

MOMA: Mixture-of-Modality-Adaptations for Transferring Knowledge from Image Models Towards Efficient Audio-Visual Action Recognition.

Kai Wang

Dimitrios Hatzinakos

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Dataset Growth.

Proceedings of the Computer Vision - ECCV 2024, 2024

Towards Efficient Audio-Visual Learners via Empowering Pre-trained Vision Transformers with Cross-Modal Adaptation.

Kai Wang

Yapeng Tian

Dimitrios Hatzinakos

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ATOM: Attention Mixer for Efficient Dataset Distillation.

Konstantinos N. Plataniotis

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Summarizing Stream Data for Memory-Constrained Online Continual Learning.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

MLLMs-Augmented Visual-Language Representation Learning.

CoRR, 2023

DREAM+: Efficient Dataset Distillation by Bidirectional Representative Matching.

CoRR, 2023

Can pre-trained models assist in dataset distillation?

CoRR, 2023

Boosting Unsupervised Contrastive Learning Using Diffusion-Based Data Augmentation From Scratch.

CoRR, 2023

Color Prompting for Data-Free Continual Unsupervised Domain Adaptive Person Re-Identification.

CoRR, 2023

The Snowflake Hypothesis: Training Deep GNN with One Node One Receptive field.

CoRR, 2023

Evidential Detection and Tracking Collaboration: New Problem, Benchmark and Algorithm for Robust Anti-UAV System.

CoRR, 2023

Summarizing Stream Data for Memory-Restricted Online Continual Learning.

CoRR, 2023

The 3rd Anti-UAV Workshop & Challenge: Methods and Results.

CoRR, 2023

InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning.

CoRR, 2023

DiM: Distilling Dataset into Generative Model.

CoRR, 2023

Expanding Small-Scale Datasets with Guided Imagination.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Does Graph Distillation See Like Vision Dataset Counterpart?

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Dataset Quantization.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DREAM: Efficient Dataset Distillation by Representative Matching.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

A Spatio-Temporal Decomposition Network for Compressed Video Quality Enhancement.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

SEformer: Dual-Path Conformer Neural Network is a Good Speech Denoiser.

Kai Wang

Dimitrios Hatzinakos

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

Prompt Vision Transformer for Domain Generalization.

CoRR, 2022

Architecture-Agnostic Masked Image Modeling - From ViT back to CNN.

CoRR, 2022

FaceMAE: Privacy-Preserving Face Recognition via Masked Autoencoders.

CoRR, 2022

Reliable Label Correction is a Good Booster When Learning with Extremely Noisy Labels.

CoRR, 2022

Crafting Better Contrastive Views for Siamese Representation Learning.

CoRR, 2022

Dataset Distillation via Factorization.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

DLME: Deep Local-Flatness Manifold Embedding.

Proceedings of the Computer Vision - ECCV 2022, 2022

Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

CAFE: Learning to Condense Dataset by Aligning Features.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

An Efficient Training Approach for Very Large Scale Face Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Crafting Better Contrastive Views for Siamese Representation Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

MSDN: Mutually Semantic Distillation Network for Zero-Shot Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

GarbageNet: A Unified Learning Framework for Robust Garbage Classification.

IEEE Trans. Artif. Intell., 2021

Brain MRI super-resolution using coupled-projection residual network.

Neurocomputing, 2021

Align Yourself: Self-supervised Pre-training for Fine-grained Recognition via Saliency Alignment.

CoRR, 2021

An Efficient Training Approach for Very Large Scale Face Recognition.

CoRR, 2021

Learning to Cluster Faces via Transformer.

CoRR, 2021

Mask Aware Network for Masked Face Recognition in the Wild.

Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

2020

Advancing Image Understanding in Poor Visibility Environments: A Collective Benchmark Study.

IEEE Trans. Image Process., 2020

Region Attention Networks for Pose and Occlusion Robust Facial Expression Recognition.

IEEE Trans. Image Process., 2020

AU-Guided Unsupervised Domain Adaptive Facial Expression Recognition.

CoRR, 2020

Learning Discriminative Representation For Facial Expression Recognition From Uncertainties.

Proceedings of the IEEE International Conference on Image Processing, 2020

Suppressing Mislabeled Data via Grouping and Self-attention.

Proceedings of the Computer Vision - ECCV 2020, 2020

Suppressing Uncertainties for Large-Scale Facial Expression Recognition.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Multiple Transfer Learning and Multi-label Balanced Training Strategies for Facial AU Detection In the Wild.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Interactive Dual Generative Adversarial Networks for Image Captioning.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Coupled-Projection Residual Network for MRI Super-Resolution.

CoRR, 2019

Exploring Emotion Features and Fusion Strategies for Audio-Video Emotion Recognition.

Proceedings of the International Conference on Multimodal Interaction, 2019

Bootstrap Model Ensemble and Rank Loss for Engagement Intensity Regression.

Proceedings of the International Conference on Multimodal Interaction, 2019

Exploring Regularizations with Face, Body and Image Cues for Group Cohesion Prediction.

Proceedings of the International Conference on Multimodal Interaction, 2019

Frame Attention Networks for Facial Expression Recognition in Videos.

Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Multi-Modal Face Anti-Spoofing Attack Detection Challenge at CVPR2019.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

2018

Deep Recurrent Multi-instance Learning with Spatio-temporal Features for Engagement Intensity Prediction.

Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018

Cascade Attention Networks For Group Emotion Recognition with Face, Body and Image Cues.

Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018

2017

Group emotion recognition with individual facial emotion CNNs and global image based CNNs.

Proceedings of the 19th ACM International Conference on Multimodal Interaction, 2017
