Yang You

ORCID: 0000-0003-2816-4384

Affiliations:
  • National University of Singapore
  • UC Berkeley, USA (PhD 2020)


According to our database, Yang You authored at least 150 papers between 2013 and 2024.

Bibliography

2024
Distributed and Joint Evidential K-Nearest Neighbor Classification.
IEEE Trans. Knowl. Data Eng., November, 2024

Scalable Evidential K-Nearest Neighbor Classification on Big Data.
IEEE Trans. Big Data, June, 2024

Self-filling evidential clustering for partial multi-view data.
Expert Syst. Appl., March, 2024

Sparse Reconstructive Evidential Clustering for Multi-View Data.
IEEE CAA J. Autom. Sinica, February, 2024

Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios.
CoRR, 2024

EnvBridge: Bridging Diverse Environments with Cross-Environment Knowledge Transfer for Embodied AI.
CoRR, 2024

MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures.
CoRR, 2024

Dynamic Diffusion Transformer.
CoRR, 2024

Visual Perception in Text Strings.
CoRR, 2024

Real-Time Video Generation with Pyramid Attention Broadcast.
CoRR, 2024

Prioritize Alignment in Dataset Distillation.
CoRR, 2024

More Than Positive and Negative: Communicating Fine Granularity in Medical Diagnosis.
CoRR, 2024

Conditional LoRA Parameter Generation.
CoRR, 2024

WallFacer: Guiding Transformer Model Training Out of the Long-Context Dark Forest with N-body Problem.
CoRR, 2024

Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality.
CoRR, 2024

MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures.
CoRR, 2024

A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training.
CoRR, 2024

Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond.
CoRR, 2024

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation.
CoRR, 2024

DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers.
CoRR, 2024

HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices.
CoRR, 2024

Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning.
CoRR, 2024

Neural Network Diffusion.
CoRR, 2024

Two Trades is not Baffled: Condensing Graph via Crafting Rational Gradient Matching.
CoRR, 2024

RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents.
CoRR, 2024

AutoChunk: Automated Activation Chunk for Memory-Efficient Long Sequence Inference.
CoRR, 2024

Must: Maximizing Latent Capacity of Spatial Transcriptomics Data.
CoRR, 2024

Helen: Optimizing CTR Prediction Models with Frequency-wise Hessian Eigenvalue Regularization.
Proceedings of the ACM on Web Conference 2024, 2024

FastFold: Optimizing AlphaFold Training and Inference on GPU Clusters.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

HeteGen: Efficient Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

The Snowflake Hypothesis: Training and Powering GNN with One Node One Receptive Field.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

Single Domain Generalization For Scene Classification Using Style-Oriented Data Augmentation.
Proceedings of the IGARSS 2024, 2024

Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

DiffAug: Enhance Unsupervised Contrastive Learning with Domain-Knowledge-Free Diffusion-based Data Augmentation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

AutoChunk: Automated Activation Chunk for Memory-Efficient Deep Learning Inference.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Towards Lossless Dataset Distillation via Difficulty-Aligned Trajectory Matching.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

How Does the Textual Information Affect the Retrieval of Multimodal In-Context Learning?
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024


Efficient Dataset Distillation via Minimax Diffusion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks.
Proceedings of the 2024 ACM Symposium on Cloud Computing, 2024

Summarizing Stream Data for Memory-Constrained Online Continual Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Adaptive evidential K-NN classification: Integrating neighborhood search and feature weighting.
Inf. Sci., November, 2023

A Sparse Reconstructive Evidential K-Nearest Neighbor Classifier for High-Dimensional Data.
IEEE Trans. Knowl. Data Eng., June, 2023

Multitask Learning for Visual Question Answering.
IEEE Trans. Neural Networks Learn. Syst., March, 2023

Parallel Training of Pre-Trained Models via Chunk-Based Dynamic Memory Management.
IEEE Trans. Parallel Distributed Syst., 2023

MLLMs-Augmented Visual-Language Representation Learning.
CoRR, 2023

DREAM+: Efficient Dataset Distillation by Bidirectional Representative Matching.
CoRR, 2023

LoBaSS: Gauging Learnability in Supervised Fine-tuning Data.
CoRR, 2023

Let's reward step by step: Step-Level reward model as the Navigators for Reasoning.
CoRR, 2023

Can pre-trained models assist in dataset distillation?
CoRR, 2023

Boosting Unsupervised Contrastive Learning Using Diffusion-Based Data Augmentation From Scratch.
CoRR, 2023

Color Prompting for Data-Free Continual Unsupervised Domain Adaptive Person Re-Identification.
CoRR, 2023

The Snowflake Hypothesis: Training Deep GNN with One Node One Receptive field.
CoRR, 2023

Learning Referring Video Object Segmentation from Weak Annotation.
CoRR, 2023

Summarizing Stream Data for Memory-Restricted Online Continual Learning.
CoRR, 2023

InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning.
CoRR, 2023

DiM: Distilling Dataset into Generative Model.
CoRR, 2023

Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models.
CoRR, 2023

ATP: Adaptive Tensor Parallelism for Foundation Models.
CoRR, 2023

Hanayo: Harnessing Wave-like Pipeline Parallelism for Enhanced Large Model Training Efficiency.
Proceedings of the International Conference for High Performance Computing, 2023

Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Does Graph Distillation See Like Vision Dataset Counterpart?
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

An Efficient 2D Method for Training Super-Large Deep Learning Models.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

Adaptive Computation with Elastic Input Sequence.
Proceedings of the International Conference on Machine Learning, 2023

A Study on Transformer Configuration and Training Objective.
Proceedings of the International Conference on Machine Learning, 2023

Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

One Student Knows All Experts Know: From Sparse to Dense.
Proceedings of the First Tiny Papers Track at ICLR 2023, 2023

Hierarchical Dialogue Understanding with Special Tokens and Turn-level Attention.
Proceedings of the First Tiny Papers Track at ICLR 2023, 2023

Dataset Quantization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DREAM: Efficient Dataset Distillation by Representative Matching.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

CAME: Confidence-guided Adaptive Memory Efficient Optimization.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Sequence Parallelism: Long Sequence Training from System Perspective.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

CowClip: Reducing CTR Prediction Model Training Time from 12 Hours to 10 Minutes on 1 GPU.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

GPTR: Gestalt-Perception Transformer for Diagram Object Detection.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Weakly Supervised Learning for Textbook Question Answering.
IEEE Trans. Image Process., 2022

Distributed evidential clustering toward time series with big data issue.
Expert Syst. Appl., 2022

Elixir: Train a Large Language Model on a Small GPU Cluster.
CoRR, 2022

EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models.
CoRR, 2022

Prompt Vision Transformer for Domain Generalization.
CoRR, 2022

A Frequency-aware Software Cache for Large Recommendation System Embeddings.
CoRR, 2022

FaceMAE: Privacy-Preserving Face Recognition via Masked Autoencoders.
CoRR, 2022

Deeper vs Wider: A Revisit of Transformer Configuration.
CoRR, 2022

Reliable Label Correction is a Good Booster When Learning with Extremely Noisy Labels.
CoRR, 2022

CowClip: Reducing CTR Prediction Model Training Time from 12 hours to 10 minutes on 1 GPU.
CoRR, 2022

FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours.
CoRR, 2022

Sky Computing: Accelerating Geo-distributed Computing in Federated Learning.
CoRR, 2022

Crafting Better Contrastive Views for Siamese Representation Learning.
CoRR, 2022

Random Sharpness-Aware Minimization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Handling heavy-tailed input of transformer inference on GPUs.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

Tesseract: Parallelize the Tensor Parallelism Efficiently.
Proceedings of the 51st International Conference on Parallel Processing, 2022

Concurrent Adversarial Learning for Large-Batch Training.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Joint Evidential K-Nearest Neighbor Classification.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Self-reconstructive evidential clustering for high-dimensional data.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

CAFE: Learning to Condense Dataset by Aligning Features.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

An Efficient Training Approach for Very Large Scale Face Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Crafting Better Contrastive Views for Siamese Representation Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Towards Efficient and Scalable Sharpness-Aware Minimization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Distributed EK-NN Classification.
Proceedings of the Belief Functions: Theory and Applications, 2022

Go Wider Instead of Deeper.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Evidential instance selection for K-nearest neighbor classification of big data.
Int. J. Approx. Reason., 2021

Large-Scale Deep Learning Optimizations: A Comprehensive Survey.
CoRR, 2021

Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training.
CoRR, 2021

Sparse-MLP: A Fully-MLP Architecture with Conditional Computation.
CoRR, 2021

PatrickStar: Parallel Training of Pre-trained Models via a Chunk-based Memory Management.
CoRR, 2021

2.5-dimensional distributed model training.
CoRR, 2021

Maximizing Parallelism in Distributed Training for Huge Neural Networks.
CoRR, 2021

Sequence Parallelism: Making 4D Parallelism Possible.
CoRR, 2021

An Efficient Training Approach for Very Large Scale Face Recognition.
CoRR, 2021

An Efficient 2D Method for Training Super-Large Deep Learning Models.
CoRR, 2021

Communication-avoiding kernel ridge regression on parallel and distributed systems.
CCF Trans. High Perform. Comput., 2021

Auto-Precision Scaling for Distributed Deep Learning.
Proceedings of the High Performance Computing - 36th International Conference, 2021

Online evolutionary batch size orchestration for scheduling deep learning workloads in GPU clusters.
Proceedings of the International Conference for High Performance Computing, 2021

Dynamic scaling for low-precision learning.
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

Mask Aware Network for Masked Face Recognition in the Wild.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

2020
Fast and Accurate Machine Learning on Distributed Systems and Supercomputers.
PhD thesis, 2020

Fast LSTM by dynamic decomposition on cloud and distributed systems.
Knowl. Inf. Syst., 2020

How much progress have we made in neural network training? A New Evaluation Protocol for Benchmarking Optimizers.
CoRR, 2020

The Limit of the Batch Size.
CoRR, 2020

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes.
Proceedings of the 8th International Conference on Learning Representations, 2020

Rethinking the Value of Asynchronous Solvers for Distributed Deep Learning.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2020

2019
Fast Deep Neural Network Training on Distributed Systems and Cloud TPUs.
IEEE Trans. Parallel Distributed Syst., 2019

Reducing BERT Pre-Training Time from 3 Days to 76 Minutes.
CoRR, 2019

Large-batch training for LSTM and beyond.
Proceedings of the International Conference for High Performance Computing, 2019

Fast LSTM Inference by Dynamic Decomposition on Cloud Systems.
Proceedings of the 2019 IEEE International Conference on Data Mining, 2019

2018
Accurate, Fast and Scalable Kernel Ridge Regression on Parallel and Distributed Systems.
Proceedings of the 32nd International Conference on Supercomputing, 2018

ImageNet Training in Minutes.
Proceedings of the 47th International Conference on Parallel Processing, 2018

2017
Design and Implementation of a Communication-Optimal Classifier for Distributed Kernel Support Vector Machines.
IEEE Trans. Parallel Distributed Syst., 2017

Parallel Multiclass Support Vector Machine for Remote Sensing Data Classification on Multicore and Many-Core Architectures.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2017

Designing and implementing a heuristic cross-architecture combination for graph traversal.
J. Parallel Distributed Comput., 2017

100-epoch ImageNet Training with AlexNet in 24 Minutes.
CoRR, 2017

Scaling deep learning on GPU and knights landing clusters.
Proceedings of the International Conference for High Performance Computing, 2017

Runtime Data Layout Scheduling for Machine Learning Dataset.
Proceedings of the 46th International Conference on Parallel Processing, 2017

2016
Asynchronous Parallel Greedy Coordinate Descent.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

2015
Scaling Support Vector Machines on modern HPC platforms.
J. Parallel Distributed Comput., 2015

CA-SVM: Communication-Avoiding Support Vector Machines on Distributed Systems.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

2014
Evaluating multi-core and many-core architectures through accelerating the three-dimensional Lax-Wendroff correction stencil.
Int. J. High Perform. Comput. Appl., 2014

MIC-SVM: Designing a Highly Efficient Support Vector Machine for Advanced Modern Multi-core and Many-Core Architectures.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

An adaptive cross-architecture combination method for graph traversal.
Proceedings of the 2014 International Conference on Supercomputing, 2014

Designing a Heuristic Cross-Architecture Combination for Breadth-First Search.
Proceedings of the 43rd International Conference on Parallel Processing, 2014

Scaling and analyzing the stencil performance on multi-core and many-core architectures.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

2013
Accelerating the 3D Elastic Wave Forward Modeling on GPU and MIC.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

