Shiwei Liu

ORCID: 0000-0001-6195-771X

Affiliations:
  • University of Oxford, Mathematical Institute, UK
  • University of Texas at Austin, TX, USA
  • Eindhoven University of Technology, Eindhoven, The Netherlands (PhD)


According to our database, Shiwei Liu authored at least 66 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models.
CoRR, 2024

(PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork.
CoRR, 2024

From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients.
CoRR, 2024

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
CoRR, 2024

Composable Interventions for Language Models.
CoRR, 2024

Dynamic Data Pruning for Automatic Speech Recognition.
CoRR, 2024

MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization.
CoRR, 2024

Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion.
CoRR, 2024

OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning.
CoRR, 2024

Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding.
CoRR, 2024

Q-Hitter: A Better Token Oracle for Efficient LLM Inference via Sparse-Quantized KV Cache.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Advancing Dynamic Sparse Training by Exploring Optimization Opportunities.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

CaM: Cache Merging for Memory-efficient LLMs Inference.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

AdaMerging: Adaptive Model Merging for Multi-Task Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

NeurRev: Train Better Sparse Neural Network Practically via Neuron Revitalization.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Don't Be So Dense: Sparse-to-Sparse GAN Training Without Sacrificing Performance.
Int. J. Comput. Vis., October 2023

Supervised Feature Selection with Neuron Evolution in Sparse Neural Networks.
Trans. Mach. Learn. Res., 2023

The Counterattack of CNNs in Self-Supervised Learning: Larger Kernel Size might be All You Need.
CoRR, 2023

E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D Medical Image Segmentation.
CoRR, 2023

Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective.
CoRR, 2023

Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity.
CoRR, 2023

Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity.
CoRR, 2023

Ten Lessons We Have Learned in the New "Sparseland": A Short Handbook for Sparse Neural Network Researchers.
CoRR, 2023

REST: Enhancing Group Robustness in DNNs Through Reweighted Sparse Training.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Enhancing Adversarial Training via Reweighting Optimization Trajectory.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Towards Data-Agnostic Pruning At Initialization: What Makes a Good Sparse Mask?
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Don't just prune by magnitude! Your mask topology is a secret weapon.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Dynamic Sparsity Is Channel-Level Sparsity Learner.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models.
Proceedings of the International Conference on Machine Learning, 2023

Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication.
Proceedings of the International Conference on Machine Learning, 2023

Are Large Kernels Better Teachers than Transformers for ConvNets?
Proceedings of the International Conference on Machine Learning, 2023

Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Proceedings of the Eleventh International Conference on Learning Representations, 2023

More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Revisiting Pruning at Initialization Through the Lens of Ramanujan Graph.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Data Augmented Flatness-aware Gradient Projection for Continual Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Many-Task Federated Learning: A New Problem Setting and A Simple Baseline.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
A brain-inspired algorithm for training highly sparse neural networks.
Mach. Learn., 2022

More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity.
CoRR, 2022

Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training.
CoRR, 2022

Achieving Personalized Federated Learning with Sparse Local Models.
CoRR, 2022

Dynamic Sparse Network for Time Series Classification: Learning What to "See".
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets.
Proceedings of the Learning on Graphs Conference, 2022

The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Efficient and effective training of sparse recurrent neural networks.
Neural Comput. Appl., 2021

Sparse evolutionary deep learning with over one million artificial neurons on commodity hardware.
Neural Comput. Appl., 2021

FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity.
CoRR, 2021

Sparse Training via Boosting Pruning Plasticity with Neuroregeneration.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training.
Proceedings of the 38th International Conference on Machine Learning, 2021

Selfish Sparse RNN Training.
Proceedings of the 38th International Conference on Machine Learning, 2021

Hierarchical Semantic Segmentation using Psychometric Learning.
Proceedings of the Asian Conference on Machine Learning, 2021

2020
Topological Insights in Sparse Neural Networks.
CoRR, 2020

Topological Insights into Sparse Neural Networks.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2020

Learning Sparse Neural Networks for Better Generalization.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Network Performance Optimization with Real Time Traffic Prediction in Data Center Network.
Proceedings of the European Conference on Optical Communications, 2020

2019
On improving deep learning generalization with adaptive sparse connectivity.
CoRR, 2019

Intrinsically Sparse Long Short-Term Memory Networks.
CoRR, 2019

