2024
ModuLoRA: Finetuning 2-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers.
Trans. Mach. Learn. Res., 2024
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice.
CoRR, 2024
Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices.
CoRR, 2024
Gradient Descent on Logistic Regression with Non-Separable Data and Large Step Sizes.
CoRR, 2024
Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity.
CoRR, 2024
STAT: Shrinking Transformers After Training.
CoRR, 2024
QTIP: Quantization with Trellises and Incoherence Processing.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Diffusion Models With Learned Adaptive Noise.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Shadow Cones: A Generalized Framework for Partial Order Embeddings.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Arbitrariness and Social Prediction: The Confounding Role of Variance in Fair Classification.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Decentralized Learning: Theoretical Optimality and Practical Improvements.
J. Mach. Learn. Res., 2023
Report of the 1st Workshop on Generative AI and Law.
CoRR, 2023
ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers.
CoRR, 2023
Shadow Cones: Unveiling Partial Orders in Hyperbolic Space.
CoRR, 2023
Scale up with Order: Finding Good Data Permutations for Distributed Training.
CoRR, 2023
Variance, Self-Consistency, and Arbitrariness in Fair Classification.
CoRR, 2023
Inference for probabilistic dependency graphs.
Proceedings of the Uncertainty in Artificial Intelligence, 2023
Neural Caches for Monte Carlo Partial Differential Equation Solvers.
Proceedings of the SIGGRAPH Asia 2023 Conference Papers, 2023
Coneheads: Hierarchy Aware Attention.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Riemannian Residual Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
CD-GraB: Coordinating Distributed Example Orders for Provably Accelerated Training.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
QuIP: 2-Bit Quantization of Large Language Models With Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
TART: A plug-and-play Transformer module for task-agnostic reasoning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models.
Proceedings of the International Conference on Machine Learning, 2023
CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks.
Proceedings of the International Conference on Machine Learning, 2023
STEP: Learning N:M Structured Sparsity Masks from Scratch with Precondition.
Proceedings of the International Conference on Machine Learning, 2023
Random Laplacian Features for Learning with Hyperbolic Space.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
2022
MCTensor: A High-Precision Deep Learning Library with Multi-Component Floating-Point.
CoRR, 2022
Non-Determinism and the Lawlessness of ML Code.
CoRR, 2022
Structured Pruning is All You Need for Pruning CNNs at Initialization.
CoRR, 2022
HyLa: Hyperbolic Laplacian Features For Graph Learning.
CoRR, 2022
Understanding Hyperdimensional Computing for Parallel Single-Pass Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
GraB: Finding Provably Better Data Permutations than Random Reshuffling.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Model Preserving Compression for Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Low-Precision Stochastic Gradient Langevin Dynamics.
Proceedings of the International Conference on Machine Learning, 2022
How Low Can We Go: Trading Memory for Error in Low-Precision Training.
Proceedings of the Tenth International Conference on Learning Representations, 2022
A General Analysis of Example-Selection for Stochastic Gradient Descent.
Proceedings of the Tenth International Conference on Learning Representations, 2022
Non-Determinism and the Lawlessness of Machine Learning Code.
Proceedings of the 2022 Symposium on Computer Science and Law, 2022
2021
Tecnologica cosa: Modeling Storyteller Personalities in Boccaccio's Decameron.
CoRR, 2021
Pruning Neural Networks with Interpolative Decompositions.
CoRR, 2021
Model Selection's Disparate Impact in Real-World Deep Learning Applications.
CoRR, 2021
Variance Reduction in Training Forecasting Models with Subgroup Sampling.
CoRR, 2021
Low-Precision Reinforcement Learning.
CoRR, 2021
Hyperparameter Optimization Is Deceiving Us, and How to Stop It.
CoRR, 2021
Representing Hyperbolic Space Accurately using Multi-Component Floats.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Equivariant Manifold Flows.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Hyperparameter Optimization Is Deceiving Us, and How to Stop It.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
PipeMare: Asynchronous Pipeline Parallel DNN Training.
Proceedings of the Fourth Conference on Machine Learning and Systems, 2021
Optimal Complexity in Decentralized Training.
Proceedings of the 38th International Conference on Machine Learning, 2021
Variance Reduced Training with Stratified Sampling for Forecasting Models.
Proceedings of the 38th International Conference on Machine Learning, 2021
Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision.
Proceedings of the 38th International Conference on Machine Learning, 2021
Accuracy-Efficiency Trade-Offs and Accountability in Distributed ML Systems.
Proceedings of the EAAMO 2021: ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, Virtual Event, USA, October 5, 2021
Meta-Learning Divergences for Variational Inference.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021
2020
Revisiting BFloat16 Training.
CoRR, 2020
Meta-Learning for Variational Inference.
CoRR, 2020
Regulating Accuracy-Efficiency Trade-Offs in Distributed Machine Learning Systems.
CoRR, 2020
Towards Optimal Convergence Rate in Decentralized Stochastic Training.
CoRR, 2020
MixML: A Unified Analysis of Weakly Consistent Parallel Learning.
CoRR, 2020
Optimizing JPEG Quantization for Classification Networks.
CoRR, 2020
Asymptotically Optimal Exact Minibatch Metropolis-Hastings.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Random Reshuffling is Not Always Better.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Neural Manifold Ordinary Differential Equations.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Moniqua: Modulo Quantized Communication in Decentralized SGD.
Proceedings of the 37th International Conference on Machine Learning, 2020
Differentiating through the Fréchet Mean.
Proceedings of the 37th International Conference on Machine Learning, 2020
AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020
2019
Cloud-Hosted Intelligence for Real-time IoT Applications.
ACM SIGOPS Oper. Syst. Rev., 2019
Overwrite Quantization: Opportunistic Outlier Handling for Neural Network Accelerators.
CoRR, 2019
SysML: The New Frontier of Machine Learning Systems.
CoRR, 2019
Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
QPyTorch: A Low-Precision Arithmetic Simulation Framework.
Proceedings of the Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing, 2019
Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
Dimension-Free Bounds for Low-Precision Training.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
Channel Gating Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
Boosting the Performance of CNN Accelerators with Dynamic Fine-Grained Channel Gating.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting.
Proceedings of the 36th International Conference on Machine Learning, 2019
SWALP: Stochastic Weight Averaging in Low Precision Training.
Proceedings of the 36th International Conference on Machine Learning, 2019
A Kernel Theory of Modern Data Augmentation.
Proceedings of the 36th International Conference on Machine Learning, 2019
Distributed Learning with Sublinear Communication.
Proceedings of the 36th International Conference on Machine Learning, 2019
A Formal Framework for Probabilistic Unclean Databases.
Proceedings of the 22nd International Conference on Database Theory, 2019
Building Efficient Deep Neural Networks With Unitary Group Convolutions.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
2018
Soft optoelectronic sensory foams with proprioception.
Sci. Robotics, 2018
Channel Gating Neural Networks.
CoRR, 2018
High-Accuracy Low-Precision Training.
CoRR, 2018
A Two-pronged Progress in Structured Dense Matrix Vector Multiplication.
Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, 2018
The Convergence of Stochastic Gradient Descent in Asynchronous Shared Memory.
Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, 2018
Representation Tradeoffs for Hyperbolic Embeddings.
Proceedings of the 35th International Conference on Machine Learning, 2018
Minibatch Gibbs Sampling on Large Graphical Models.
Proceedings of the 35th International Conference on Machine Learning, 2018
Accelerated Stochastic Power Iteration.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018
2017
Incremental knowledge base construction using DeepDive.
VLDB J., 2017
Flipper: A Systematic Approach to Debugging Training Sets.
Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics, 2017
Gaussian Quadrature for Kernel Features.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017
Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017
2016
DeepDive: Declarative Knowledge Base Construction.
SIGMOD Rec., 2016
Parallel SGD: When does averaging help?
CoRR, 2016
Data Programming: Creating Large Training Sets, Quickly.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016
Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on How Much.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016
Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling.
Proceedings of the 33rd International Conference on Machine Learning, 2016
Have abstraction and eat performance, too: optimized heterogeneous computing with parallel patterns.
Proceedings of the 2016 International Symposium on Code Generation and Optimization, 2016
Generating Configurable Hardware from Parallel Patterns.
Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, 2016
2015
Incremental Knowledge Base Construction Using DeepDive.
Proc. VLDB Endow., 2015
Taming the Wild: A Unified Analysis of Hogwild-Style Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015
Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015
Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems.
Proceedings of the 32nd International Conference on Machine Learning, 2015
2014
Global Convergence of Stochastic Gradient Descent for Some Nonconvex Matrix Problems.
CoRR, 2014