Michael W. Mahoney

Orcid: 0000-0001-7920-4652

Affiliations:
  • University of California, Berkeley, Department of Statistics
  • Stanford University, Department of Mathematics


According to our database, Michael W. Mahoney authored at least 316 papers between 2003 and 2024.

Bibliography

2024
End-to-end codesign of Hessian-aware quantized neural networks for FPGAs.
ACM Trans. Reconfigurable Technol. Syst., September, 2024

Fully Stochastic Trust-Region Sequential Quadratic Programming for Equality-Constrained Optimization Problems.
SIAM J. Optim., 2024

AI and Memory Wall.
IEEE Micro, 2024

AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models.
CoRR, 2024

Elucidating the Design Choice of Probability Paths in Flow Matching for Forecasting.
CoRR, 2024

Mitigating Memorization In Language Models.
CoRR, 2024

Tuning Frequency Bias of State Space Models.
CoRR, 2024

Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models.
CoRR, 2024

Learning Physics for Unveiling Hidden Earthquake Ground Motions via Conditional Generative Modeling.
CoRR, 2024

Comparing and Contrasting Deep Learning Weather Prediction Backbones on Navier-Stokes and Atmospheric Dynamics.
CoRR, 2024

Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance.
CoRR, 2024

WaveCastNet: An AI-enabled Wavefield Forecasting Framework for Earthquake Early Warning.
CoRR, 2024

There is HOPE to Avoid HiPPOs for Long-memory State Space Models.
CoRR, 2024

Chronos: Learning the Language of Time Series.
CoRR, 2024

Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning.
CoRR, 2024

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization.
CoRR, 2024

SALSA: Sequential Approximate Leverage-Score Algorithm with Application in Analyzing Big Time Series Data.
CoRR, 2024


Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

Towards Scalable and Versatile Weight Space Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

An LLM Compiler for Parallel Function Calling.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

SqueezeLLM: Dense-and-Sparse Quantization.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Robustifying State-space Models for Long Sequences via Approximate Diagonalization.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Generative Modeling of Regular and Irregular Time Series Data via Koopman VAEs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

NoisyMix: Boosting Model Robustness to Common Corruptions.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Flow-Based Algorithms for Improving Clusters: A Unifying Framework, Software, and Performance.
SIAM Rev., February, 2023

Hessian averaging in stochastic Newton methods achieves superlinear convergence.
Math. Program., 2023

Multi-scale Local Network Structure Critically Impacts Epidemic Spread and Interventions.
CoRR, 2023

DMLR: Data-centric Machine Learning Research - Past, Present and Future.
CoRR, 2023

CholeskyQR with Randomization and Pivoting for Tall Matrices (CQRRPT).
CoRR, 2023

A PAC-Bayesian Perspective on the Interpolating Information Criterion.
CoRR, 2023

Surrogate-based Autotuning for Randomized Sketching Algorithms in Regression Problems.
CoRR, 2023

Probabilistic Forecasting with Coherent Aggregation.
CoRR, 2023

The Interpolating Information Criterion for Overparameterized Models.
CoRR, 2023

GEANN: Scalable Graph Augmentations for Multi-Horizon Time Series Forecasting.
CoRR, 2023

SuperBench: A Super-Resolution Benchmark Dataset for Scientific Machine Learning.
CoRR, 2023

End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs.
CoRR, 2023

Full Stack Optimization of Transformer Inference: a Survey.
CoRR, 2023

Randomized Numerical Linear Algebra: A Perspective on the Field With an Eye to Software.
CoRR, 2023

Big Little Transformer Decoder.
CoRR, 2023

Extensions to the SENSEI In situ Framework for Heterogeneous Architectures.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

When are ensembles really effective?
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Heavy-Tailed Algebra for Probabilistic Programming.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Speculative Decoding with Big Little Decoder.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Test Accuracy vs. Generalization Gap: Model Selection in NLP without Accessing Training or Testing Data.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

A Three-regime Model of Network Pruning.
Proceedings of the International Conference on Machine Learning, 2023

Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching.
Proceedings of the International Conference on Machine Learning, 2023

Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes.
Proceedings of the International Conference on Machine Learning, 2023

Learning Physical Models that Can Respect Conservation Laws.
Proceedings of the International Conference on Machine Learning, 2023

Gradient Gating for Deep Multi-Rate Learning on Graphs.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Learning differentiable solvers for systems with hard constraints.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Adaptive Self-Supervision Algorithms for Physics-Informed Neural Networks.
Proceedings of the ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland, 2023

Fast Feature Selection with Fairness Constraints.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms.
J. Mach. Learn. Res., 2022

LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data.
J. Mach. Learn. Res., 2022

Newton-MR: Inexact Newton Method with minimum residual sub-problem solver.
EURO J. Comput. Optim., 2022

Gated Recurrent Neural Networks with Weighted Time-Delay Feedback.
CoRR, 2022

GACT: Activation Compressed Training for General Architectures.
CoRR, 2022

Asymptotic Convergence Rate and Statistical Inference for Stochastic Sequential Quadratic Programming.
CoRR, 2022

The Sky Above The Clouds.
CoRR, 2022

Learning continuous models for continuous physics.
CoRR, 2022

Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data.
CoRR, 2022

NoisyMix: Boosting Robustness by Combining Data Augmentations, Stability Training, and Noise Injections.
CoRR, 2022

Hessian-Aware Pruning and Optimal Neural Implant.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

A Fast Post-Training Pruning Framework for Transformers.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Squeezeformer: An Efficient Transformer for Automatic Speech Recognition.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Neurotoxin: Durable Backdoors in Federated Learning.
Proceedings of the International Conference on Machine Learning, 2022

AutoIP: A United Framework to Integrate Physics into Gaussian Processes.
Proceedings of the International Conference on Machine Learning, 2022

GACT: Activation Compressed Training for Generic Network Architectures.
Proceedings of the International Conference on Machine Learning, 2022

Fat-Tailed Variational Inference with Anisotropic Tail Adaptive Flows.
Proceedings of the International Conference on Machine Learning, 2022

Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers.
Proceedings of the International Conference on Machine Learning, 2022

Long Expressive Memory for Sequence Modeling.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Noisy Feature Mixup.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Integer-Only Zero-Shot Quantization for Efficient Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Inexact Nonconvex Newton-Type Methods.
INFORMS J. Optim., January, 2021

Parallel and Communication Avoiding Least Angle Regression.
SIAM J. Sci. Comput., 2021

Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning.
J. Mach. Learn. Res., 2021

Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings.
J. Mach. Learn. Res., 2021

Statistical guarantees for local graph clustering.
J. Mach. Learn. Res., 2021

Learning from learning machines: a new generation of AI technology to meet the needs of science.
CoRR, 2021

Generalization Properties of Stochastic Optimizers via Trajectory Analysis.
CoRR, 2021

Compressing Deep ODE-Nets using Basis Function Expansions.
CoRR, 2021

Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics.
CoRR, 2021

MLPruning: A Multilevel Structured Pruning Framework for Transformer-based Models.
CoRR, 2021

LocalNewton: Reducing Communication Bottleneck for Distributed Learning.
CoRR, 2021

Q-ASR: Integer-only Zero-shot Quantization for Efficient Speech Recognition.
CoRR, 2021

A Survey of Quantization Methods for Efficient Neural Network Inference.
CoRR, 2021

A Differential Geometry Perspective on Orthogonal Recurrent Models.
CoRR, 2021

Hessian-Aware Pruning and Optimal Neural Implant.
CoRR, 2021

Geometric rates of convergence for kernel-based sampling algorithms.
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021

Stochastic continuous normalizing flows: training SDEs as ODEs.
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021

LocalNewton: Reducing communication rounds for distributed learning.
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021

Noise-Response Analysis of Deep Neural Networks Quantifies Robustness and Fingerprints Structural Malware.
Proceedings of the 2021 SIAM International Conference on Data Mining, 2021

Taxonomizing local versus global structure in neural network loss landscapes.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Stateful ODE-Nets using Basis Function Expansions.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Noisy Recurrent Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Hessian Eigenspectra of More Realistic Nonlinear Models.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Characterizing possible failure modes in physics-informed neural networks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Improved Guarantees and a Multiple-descent Curve for Column Subset Selection and the Nystrom Method (Extended Abstract).
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

HAWQ-V3: Dyadic Neural Network Quantization.
Proceedings of the 38th International Conference on Machine Learning, 2021

I-BERT: Integer-only BERT Quantization.
Proceedings of the 38th International Conference on Machine Learning, 2021

Multiplicative Noise and Heavy Tails in Stochastic Optimization.
Proceedings of the 38th International Conference on Machine Learning, 2021

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training.
Proceedings of the 38th International Conference on Machine Learning, 2021

Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification.
Proceedings of the 9th International Conference on Learning Representations, 2021

Sparse Quantized Spectral Clustering.
Proceedings of the 9th International Conference on Learning Representations, 2021

Lipschitz Recurrent Neural Networks.
Proceedings of the 9th International Conference on Learning Representations, 2021

What's Hidden in a One-layer Randomly Weighted Transformer?
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Sparse sketches with small inversion bias.
Proceedings of the Conference on Learning Theory, 2021

Improving Semi-supervised Federated Learning by Reducing the Gradient Diversity of Models.
Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), 2021

Good Classifiers are Abundant in the Interpolating Regime.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Newton-type methods for non-convex optimization under inexact Hessian information.
Math. Program., 2020

HAWQV3: Dyadic Neural Network Quantization.
CoRR, 2020

Fast Distributed Training of Deep Neural Networks: Dynamic Communication Thresholding for Model and Data Parallelism.
CoRR, 2020

Benchmarking Semi-supervised Federated Learning.
CoRR, 2020

Continuous-in-Depth Neural Networks.
CoRR, 2020

Noise-response Analysis for Rapid Detection of Backdoors in Deep Neural Networks.
CoRR, 2020

Adversarially-Trained Deep Nets Transfer Better.
CoRR, 2020

Good linear classifiers are abundant in the interpolating regime.
CoRR, 2020

Lipschitz Recurrent Neural Networks.
CoRR, 2020

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning.
CoRR, 2020

Determinantal Point Processes in Randomized Numerical Linear Algebra.
CoRR, 2020

Rethinking Batch Normalization in Transformers.
CoRR, 2020

Stochastic Normalizing Flows.
CoRR, 2020

Improved guarantees and a multiple-descent curve for the Column Subset Selection Problem and the Nyström method.
CoRR, 2020

Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data.
CoRR, 2020

Second-order Optimization for Non-convex Machine Learning: an Empirical Study.
Proceedings of the 2020 SIAM International Conference on Data Mining, 2020

Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks.
Proceedings of the 2020 SIAM International Conference on Data Mining, 2020

Newton-ADMM: a distributed GPU-accelerated optimizer for multiclass classification problems.
Proceedings of the International Conference for High Performance Computing, 2020

Boundary thickness and robustness in learning models.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Exact expressions for double descent and implicit regularization via surrogate random design.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Precise expressions for random projections: Low-rank approximation and randomized Newton.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nystrom method.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

A Statistical Framework for Low-bitwidth Training of Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks.
Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods, 2020

PowerNorm: Rethinking Batch Normalization in Transformers.
Proceedings of the 37th International Conference on Machine Learning, 2020

Error Estimation for Sketched SVD via the Bootstrap.
Proceedings of the 37th International Conference on Machine Learning, 2020

Forecasting Sequential Data Using Consistent Koopman Autoencoders.
Proceedings of the 37th International Conference on Machine Learning, 2020

MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

ZeroQ: A Novel Zero Shot Quantization Framework.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

PyHessian: Neural Networks Through the Lens of the Hessian.
Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 2020

OverSketched Newton: Fast Convex Optimization for Serverless Systems.
Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 2020

Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Bayesian experimental design using regularized determinantal point processes.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Inefficiency of K-FAC for Large Batch Size Training.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
The Difficulties of Addressing Interdisciplinary Challenges at the Foundations of Data Science.
SIGACT News, 2019

Avoiding Communication in Primal and Dual Block Coordinate Descent Methods.
SIAM J. Sci. Comput., 2019

Block Basis Factorization for Scalable Kernel Evaluation.
SIAM J. Matrix Anal. Appl., 2019

Sub-sampled Newton methods.
Math. Program., 2019

Variational perspective on local graph clustering.
Math. Program., 2019

Scalable Kernel K-Means Clustering with Nyström Approximation: Relative-Error Bounds.
J. Mach. Learn. Res., 2019

A Bootstrap Method for Error Estimation in Randomized Matrix Multiplication.
J. Mach. Learn. Res., 2019

Group Collaborative Representation for Image Set Classification.
Int. J. Comput. Vis., 2019

HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks.
CoRR, 2019

Running Alchemist on Cray XC and CS Series Supercomputers: Dask and PySpark Interfaces, Deployment Options, and Data Transfer Times.
CoRR, 2019

Bootstrapping the Operator Norm in High Dimensions: Error Estimation for Covariance Matrices and Sketching.
CoRR, 2019

The Difficulties of Addressing Interdisciplinary Challenges at the Foundations of Data Science.
CoRR, 2019

On Linear Convergence of Weighted Kernel Herding.
CoRR, 2019

ANODEV2: A Coupled Neural ODE Evolution Framework.
CoRR, 2019

Residual Networks as Nonlinear Systems: Stability Analysis using Linearization.
CoRR, 2019

Parallel and Communication Avoiding Least Angle Regression.
CoRR, 2019

Physics-informed Autoencoders for Lyapunov-stable Fluid Flow Prediction.
CoRR, 2019

Shallow Learning for Fluid Flow Reconstruction with Limited Sensors and Limited Data.
CoRR, 2019

Alchemist: An Apache Spark ⇔ MPI interface.
Concurr. Comput. Pract. Exp., 2019

GPU Accelerated Sub-Sampled Newton's Method for Convex Classification Problems.
Proceedings of the 2019 SIAM International Conference on Data Mining, 2019

ANODEV2: A Coupled Neural ODE Framework.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Distributed estimation of the inverse Hessian by determinantal averaging.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Statistical Mechanics Methods for Discovering Knowledge from Modern Production Quality Neural Networks.
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019

Traditional and Heavy Tailed Self Regularization in Neural Network Models.
Proceedings of the 36th International Conference on Machine Learning, 2019

HAWQ: Hessian AWare Quantization of Neural Networks With Mixed-Precision.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Trust Region Based Adversarial Attack on Neural Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression.
Proceedings of the Conference on Learning Theory, 2019

2018
Parameter Re-Initialization through Cyclical Batch Size Schedules.
CoRR, 2018

On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent.
CoRR, 2018

A Short Introduction to Local Graph Clustering Methods and Software.
CoRR, 2018

Large batch size training of neural networks with adversarial training and second-order information.
CoRR, 2018

Newton-MR: Newton's Method Without Smoothness or Convexity.
CoRR, 2018

Distributed Second-order Convex Optimization.
CoRR, 2018

GPU Accelerated Sub-Sampled Newton's Method.
CoRR, 2018

LASAGNE: Locality and Structure Aware Graph Node Embedding.
Proceedings of the 2018 IEEE/WIC/ACM International Conference on Web Intelligence, 2018

Hessian-based Analysis of Large Batch Training and Robustness to Adversaries.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

GIANT: Globally Improved Approximate Newton Method for Distributed Optimization.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist.
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

Avoiding Synchronization in First-Order Methods for Sparse Convex Optimization.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Error Estimation for Randomized Least-Squares Algorithms via the Bootstrap.
Proceedings of the 35th International Conference on Machine Learning, 2018

Out-of-sample extension of graph adjacency spectral embedding.
Proceedings of the 35th International Conference on Machine Learning, 2018

FLAG n' FLARE: Fast Linearly-Coupled Adaptive Gradient Methods.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017
DCAR: A Discriminative and Compact Audio Representation for Audio Processing.
IEEE Trans. Multim., 2017

An Optimization Approach to Locally-Biased Graph Algorithms.
Proc. IEEE, 2017

Principles and Applications of Science of Information.
Proc. IEEE, 2017

A local perspective on community structure in multilayer networks.
Netw. Sci., 2017

Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning.
J. Mach. Learn. Res., 2017

Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging.
J. Mach. Learn. Res., 2017

Lectures on Randomized Numerical Linear Algebra.
CoRR, 2017

A Berkeley View of Systems Challenges for AI.
CoRR, 2017

Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior.
CoRR, 2017

Social Discrete Choice Models.
CoRR, 2017

Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds.
CoRR, 2017

Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Capacity Releasing Diffusion for Speed and Locality.
Proceedings of the 34th International Conference on Machine Learning, 2017

Skip-Gram - Zipf + Uniform = Vector Additivity.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
The Fast Cauchy Transform and Faster Robust Linear Regression.
SIAM J. Comput., 2016

Recent Advances in Randomized Numerical Linear Algebra (NII Shonan Meeting 2016-10).
NII Shonan Meet. Rep., 2016

Parallel Local Graph Clustering.
Proc. VLDB Endow., 2016

Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments.
Proc. IEEE, 2016

A Statistical Perspective on Randomized Sketching for Ordinary Least-Squares.
J. Mach. Learn. Res., 2016

Revisiting the Nystrom Method for Improved Large-scale Machine Learning.
J. Mach. Learn. Res., 2016

Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels.
J. Mach. Learn. Res., 2016

Tree decompositions and social graphs.
Internet Math., 2016

Sub-Sampled Newton Methods II: Local Convergence Rates.
CoRR, 2016

Sub-Sampled Newton Methods I: Globally Convergent Algorithms.
CoRR, 2016

Lecture Notes on Spectral Graph Methods.
CoRR, 2016

Lecture Notes on Randomized Linear Algebra.
CoRR, 2016

Mapping the Similarities of Spectra: Global and Locally-biased Approaches to SDSS Galaxy Data.
CoRR, 2016

DCAR: A Discriminative and Compact Audio Representation to Improve Event Detection.
CoRR, 2016

Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies.
CoRR, 2016

FLAG: Fast Linearly-Coupled Adaptive Gradient Method.
CoRR, 2016

RandNLA: randomized numerical linear algebra.
Commun. ACM, 2016

Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning.
Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, 2016

Feature-distributed sparse regression: a screen-and-clean approach.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Sub-sampled Newton Methods with Non-uniform Sampling.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

A Discriminative and Compact Audio Representation for Event Detection.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

A Multi-Platform Evaluation of the Randomized CX Low-Rank Matrix Factorization in Spark.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

A Simple and Strongly-Local Flow-Based Method for Cut Improvement.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Unified Acceleration Method for Packing and Covering Problems via Diameter Reduction.
Proceedings of the 43rd International Colloquium on Automata, Languages, and Programming, 2016

Approximating the Solution to Mixed Packing and Covering LPs in Parallel $\tilde{O}(\epsilon^{-3})$ Time.
Proceedings of the 43rd International Colloquium on Automata, Languages, and Programming, 2016

Matrix factorizations at scale: A comparison of scientific data analytics in spark and C+MPI using three case studies.
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

Structural Properties Underlying High-Quality Randomized Numerical Linear Algebra Algorithms.
Proceedings of the Handbook of Big Data, 2016

Mining Large Graphs.
Proceedings of the Handbook of Big Data, 2016

2015
Randomized Dimensionality Reduction for k-Means Clustering.
IEEE Trans. Inf. Theory, 2015

A statistical perspective on algorithmic leveraging.
J. Mach. Learn. Res., 2015

Faster Parallel Solver for Positive Linear Programs via Dynamically-Bucketed Selective Coordinate Descent.
CoRR, 2015

Structured Block Basis Factorization for Scalable Kernel Matrix Evaluation.
CoRR, 2015

Fast Randomized Kernel Ridge Regression with Statistical Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Using Local Spectral Methods to Robustify Graph-Based Learning Algorithms.
Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

Statistical and Algorithmic Perspectives on Randomized Sketching for Ordinary Least-Squares.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nyström Method.
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015

2014
Signal Processing for Big Data [From the Guest Editors].
IEEE Signal Process. Mag., 2014

Quantile Regression for Large-Scale Applications.
SIAM J. Sci. Comput., 2014

LSRN: A Parallel Iterative Solver for Strongly Over- or Underdetermined Systems.
SIAM J. Sci. Comput., 2014

Semi-supervised eigenvectors for large-scale locally-biased learning.
J. Mach. Learn. Res., 2014

Think Locally, Act Locally: The Detection of Small, Medium-Sized, and Large Communities in Large Networks.
CoRR, 2014

Fast Randomized Kernel Methods With Statistical Guarantees.
CoRR, 2014

A new spin on an old algorithm: technical perspective.
Commun. ACM, 2014

Anti-differentiating approximation algorithms: A case study with min-cuts, spectral, and flow.
Proceedings of the 31st International Conference on Machine Learning, 2014

Random Laplace Feature Maps for Semigroup Kernels on Histograms.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
On the Hyperbolicity of Small-World and Treelike Random Graphs.
Internet Math., 2013

Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression.
Proceedings of the Symposium on Theory of Computing Conference, 2013

Evaluating OpenMP Tasking at Scale for the Computation of Graph Hyperbolicity.
Proceedings of the OpenMP in the Era of Low Power Devices and Accelerators, 2013

Robust Regression on MapReduce.
Proceedings of the 30th International Conference on Machine Learning, 2013

Tree-Like Structure in Large Social and Information Networks.
Proceedings of the 2013 IEEE 13th International Conference on Data Mining, 2013

2012
A local spectral method for graphs: with applications to improving graph partitions and exploring data graphs locally.
J. Mach. Learn. Res., 2012

Fast approximation of matrix coherence and statistical leverage.
J. Mach. Learn. Res., 2012

The Fast Cauchy Transform: with Applications to Basis Construction, Regression, and Subspace Approximation in L1
CoRR, 2012

On the Hyperbolicity of Small-World Networks and Tree-Like Graphs
CoRR, 2012

rCUR: an R package for CUR matrix decomposition.
BMC Bioinform., 2012

Approximate computation and implicit regularization for very large-scale data analysis.
Proceedings of the 31st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2012

Semi-supervised Eigenvectors for Locally-biased Learning.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012, 2012

On the Hyperbolicity of Small-World and Tree-Like Random Graphs.
Proceedings of the Algorithms and Computation - 23rd International Symposium, 2012

2011
Faster least squares approximation.
Numerische Mathematik, 2011

Randomized Algorithms for Matrices and Data.
Found. Trends Mach. Learn., 2011

Stochastic Dimensionality Reduction for K-means Clustering
CoRR, 2011

LSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems
CoRR, 2011

Localization on low-order eigenvectors of data matrices
CoRR, 2011

Regularized Laplacian Estimation and Fast Eigenvector Approximation.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Implementing regularization implicitly via approximate eigenvector computation.
Proceedings of the 28th International Conference on Machine Learning, 2011

2010
Computation in large-scale scientific and internet data applications is a focus of MMDS 2010.
SIGKDD Explor., 2010

SIGACT news algorithms column: computation in large-scale scientific and internet data applications is a focus of MMDS 2010.
SIGACT News, 2010

Algorithmic and Statistical Perspectives on Large-Scale Data Analysis
CoRR, 2010

Effective Resistances, Statistical Leverage, and Applications to Linear Equation Solving
CoRR, 2010

Empirical comparison of algorithms for network community detection.
Proceedings of the 19th International Conference on World Wide Web, 2010

Approximating Higher-Order Distances Using Random Projections.
Proceedings of the UAI 2010, 2010

CUR from a Sparse Optimization Viewpoint.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

2009
Sampling Algorithms and Coresets for $\ell_p$ Regression.
SIAM J. Comput., 2009

CUR matrix decompositions for improved data analysis.
Proc. Natl. Acad. Sci. USA, 2009

Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters.
Internet Math., 2009

A Spectral Algorithm for Improving Graph Partitions
CoRR, 2009

Learning with Spectral Kernels and Heavy-Tailed Data
CoRR, 2009

Empirical Evaluation of Graph Partitioning Using Spectral Embeddings and Flow.
Proceedings of the Experimental Algorithms, 8th International Symposium, 2009

An improved approximation algorithm for the column subset selection problem.
Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, 2009

Unsupervised Feature Selection for the $k$-means Clustering Problem.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

2008
Algorithmic and statistical challenges in modern large-scale data analysis are the focus of MMDS 2008.
SIGKDD Explor., 2008

Tensor-CUR Decompositions for Tensor-Based Data.
SIAM J. Matrix Anal. Appl., 2008

Relative-Error CUR Matrix Decompositions.
SIAM J. Matrix Anal. Appl., 2008

Sampling subproblems of heterogeneous Max-Cut problems and approximation algorithms.
Random Struct. Algorithms, 2008

Algorithmic and Statistical Challenges in Modern Large-Scale Data Analysis are the Focus of MMDS 2008
CoRR, 2008

Statistical properties of community structure in large social and information networks.
Proceedings of the 17th International Conference on World Wide Web, 2008

Sampling algorithms and coresets for $\ell_p$ regression.
Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 2008

Unsupervised feature selection for principal components analysis.
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008

2007
Sampling Algorithms and Coresets for Lp Regression
CoRR, 2007

Feature selection methods for text classification.
Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007

07071 Abstracts Collection -- Web Information Retrieval and Linear Algebra Algorithms.
Proceedings of the Web Information Retrieval and Linear Algebra Algorithms, 11.02., 2007

07071 Report on Dagstuhl Seminar -- Web Information Retrieval and Linear Algebra Algorithms.
Proceedings of the Web Information Retrieval and Linear Algebra Algorithms, 11.02., 2007

2006
Fast Monte Carlo Algorithms for Matrices III: Computing a Compressed Approximate Matrix Decomposition.
SIAM J. Comput., 2006

Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix.
SIAM J. Comput., 2006

Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication.
SIAM J. Comput., 2006

Randomized Algorithms for Matrices and Massive Data Sets.
Proceedings of the 32nd International Conference on Very Large Data Bases, 2006

Sampling algorithms for $\ell_2$ regression and applications.
Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, 2006

Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods.
Proceedings of the Algorithms, 2006

Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods.
Proceedings of the Approximation, 2006

2005
On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning.
J. Mach. Learn. Res., 2005

Sampling Sub-problems of Heterogeneous Max-cut Problems and Approximation Algorithms.
Proceedings of the STACS 2005, 2005

Approximating a Gram Matrix for Improved Kernel-Based Learning.
Proceedings of the Learning Theory, 18th Annual Conference on Learning Theory, 2005

2003
Rapid Mixing of Several Markov Chains for a Hard-Core Model.
Proceedings of the Algorithms and Computation, 14th International Symposium, 2003

