Mark Schmidt

ORCID: 0000-0003-1129-5273

Affiliations:
  • University of British Columbia, Department of Computer Science, Vancouver, Canada
  • École Normale Supérieure, INRIA SIERRA project team, Paris, France
  • University of Alberta, Department of Computing Science, Edmonton, Canada


According to our database, Mark Schmidt authored at least 100 papers between 2005 and 2024.

Bibliography

2024
Why Line Search when you can Plane Search? SO-Friendly Neural Networks allow Per-Iteration Optimization of Learning and Momentum Rates for Every Layer.
CoRR, 2024

BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and Optimizing the Right Coordinate Blocks.
CoRR, 2024

Enhancing Policy Gradient with the Polyak Step-Size Adaption.
CoRR, 2024

Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation.
CoRR, 2024

Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models.
CoRR, 2024

2023
Predicting DNA kinetics with a truncated continuous-time Markov chain method.
Comput. Biol. Chem., June, 2023

Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm.
CoRR, 2023

Simplifying Momentum-based Riemannian Submanifold Optimization.
CoRR, 2023

Optimistic Thompson Sampling-based algorithms for episodic reinforcement learning.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Fast Convergence of Random Reshuffling Under Over-Parameterization and the Polyak-Łojasiewicz Condition.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Searching for Optimal Per-Coordinate Step-sizes with Multidimensional Backtracking.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

BiSLS/SPS: Auto-tune Step Sizes for Stable Bi-level Optimization.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning.
Proceedings of the International Conference on Machine Learning, 2023

Target-based Surrogates for Stochastic Optimization.
Proceedings of the International Conference on Machine Learning, 2023

Noise Is Not the Main Factor Behind the Gap Between Sgd and Adam on Transformers, But Sign Descent Might Be.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
SVRG meets AdaGrad: painless variance reduction.
Mach. Learn., 2022

Let's Make Block Coordinate Descent Converge Faster: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence.
J. Mach. Learn. Res., 2022

Homeomorphic-Invariance of EM: Non-Asymptotic Convergence in KL Divergence for Exponential Families via Mirror Descent (Extended Abstract).
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Improved Policy Optimization for Online Imitation Learning.
Proceedings of the Conference on Lifelong Learning Agents, 2022

2021
Structured second-order methods via natural gradient descent.
CoRR, 2021

AutoRetouch: Automatic Professional Face Retouching.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Robust Asymmetric Learning in POMDPs.
Proceedings of the 38th International Conference on Machine Learning, 2021

Tractable structured natural-gradient descent using local parameterizations.
Proceedings of the 38th International Conference on Machine Learning, 2021

Homeomorphic-Invariance of EM: Non-Asymptotic Convergence in KL Divergence for Exponential Families via Mirror Descent.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
Variance-Reduced Methods for Machine Learning.
Proc. IEEE, 2020

Combining Bayesian optimization and Lipschitz optimization.
Mach. Learn., 2020

Adaptive Gradient Methods Converge Faster with Over-Parameterization (and you can do a line-search).
CoRR, 2020

Regret Bounds without Lipschitz Continuity: Online Learning with Relative-Lipschitz Losses.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Handling the Positive-Definite Constraint in the Bayesian Learning Rule.
Proceedings of the 37th International Conference on Machine Learning, 2020

Proposal-Based Instance Segmentation With Point Supervision.
Proceedings of the IEEE International Conference on Image Processing, 2020

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
"Active-set complexity" of proximal gradient: How long does it take to find the sparsity pattern?
Optim. Lett., 2019

Stein's Lemma for the Reparameterization Trick with Exponential Family Mixtures.
CoRR, 2019

Instance Segmentation with Point Supervision.
CoRR, 2019

Efficient Deep Gaussian Process Models for Variable-Sized Input.
CoRR, 2019

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Efficient Deep Gaussian Process Models for Variable-Sized Inputs.
Proceedings of the International Joint Conference on Neural Networks, 2019

Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations.
Proceedings of the 36th International Conference on Machine Learning, 2019

Efficient Parameter Estimation for DNA Kinetics Modeled as Continuous-Time Markov Chains.
Proceedings of the DNA Computing and Molecular Programming - 25th International Conference, 2019

A Less Biased Evaluation of Out-of-distribution Sample Detectors.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

Where are the Masks: Instance Segmentation with Image-level Supervision.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Are we there yet? Manifold identification of gradient-related proximal methods.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Distributed Maximization of "Submodular plus Diversity" Functions for Multi-label Feature Selection on Huge Datasets.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Does Your Model Know the Digit 6 Is Not a Cat? A Less Biased Evaluation of "Outlier" Detectors.
CoRR, 2018

New Insights into Bootstrapping for Bandits.
CoRR, 2018

MASAGA: A Linearly-Convergent Stochastic First-Order Method for Optimization on Manifolds.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2018

SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Online Learning Rate Adaptation with Hypergradient Descent.
Proceedings of the 6th International Conference on Learning Representations, 2018

Where Are the Blobs: Counting by Localization with Point Supervision.
Proceedings of the Computer Vision - ECCV 2018, 2018

2017
Erratum to: Minimizing finite sums with the stochastic average gradient.
Math. Program., 2017

Minimizing finite sums with the stochastic average gradient.
Math. Program., 2017

Diffusion Independent Semi-Bandit Influence Maximization.
CoRR, 2017

Model-Independent Online Learning for Influence Maximization.
Proceedings of the 34th International Conference on Machine Learning, 2017

Inferring Parameters for an Elementary Step Model of DNA Structure Kinetics with Locally Context-Dependent Arrhenius Rates.
Proceedings of the DNA Computing and Molecular Programming - 23rd International Conference, 2017

Horde of Bandits using Gaussian Markov Random Fields.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

2016
Convergence Rates for Greedy Kaczmarz Algorithms, and Faster Randomized Kaczmarz Rules Using the Orthogonality Graph.
CoRR, 2016

Fast Patch-based Style Transfer of Arbitrary Style.
CoRR, 2016

Convergence Rates for Greedy Kaczmarz Algorithms, and Randomized Kaczmarz Rules Using the Orthogonality Graph.
Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, 2016

Faster Stochastic Variational Inference using Proximal-Gradient Methods with General Divergence Functions.
Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, 2016

Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2016

Play and Learn: Using Video Games to Train Computer Vision Models.
Proceedings of the British Machine Vision Conference 2016, 2016

2015
Hierarchical Maximum-Margin Clustering.
CoRR, 2015

Convergence of Proximal-Gradient Stochastic Variational Inference under Non-Decreasing Step-Size Sequence.
CoRR, 2015

Stop Wasting My Gradients: Practical SVRG.
CoRR, 2015

Stop Wasting My Gradients: Practical SVRG.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Coordinate Descent Converges Faster with the Gauss-Southwell Rule Than Random Selection.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields.
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015

2014
Convex Optimization for Big Data: Scalable, randomized, and parallel algorithms for big data analytics.
IEEE Signal Process. Mag., 2014

Convex Optimization for Big Data.
CoRR, 2014

2013
Erratum: Hybrid Deterministic-Stochastic Methods for Data Fitting.
SIAM J. Sci. Comput., 2013

Block-Coordinate Frank-Wolfe Optimization for Structural SVMs.
Proceedings of the 30th International Conference on Machine Learning, 2013

2012
Hybrid Deterministic-Stochastic Methods for Data Fitting.
SIAM J. Sci. Comput., 2012

On Sparse, Spectral and Other Parameterizations of Binary Probabilistic Models.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method.
CoRR, 2012

Stochastic Block-Coordinate Frank-Wolfe Optimization for Structural SVMs.
CoRR, 2012

A Stochastic Gradient Method with an Exponential Convergence Rate for Strongly-Convex Optimization with Finite Training Sets.
CoRR, 2012

A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012, 2012

2011
Generalized Fast Approximate Energy Minimization via Graph Cuts: Alpha-Expansion Beta-Shrink Moves.
CoRR, 2011

Generalized Fast Approximate Energy Minimization via Graph Cuts: α-Expansion β-Shrink Moves.
Proceedings of the UAI 2011, 2011

Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, 2011

2010
Modeling annotator expertise: Learning when everybody knows a bit of something.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Convex Structure Learning in Log-Linear Models: Beyond Pairwise Potentials.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Causal learning without DAGs.
Proceedings of the Causality: Objectives and Assessment (NIPS 2008 Workshop), 2010

2009
Optimizing Costly Functions with Simple Constraints: A Limited-Memory Projected Quasi-Newton Algorithm.
Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, 2009

Modeling Discrete Interventional Data using Directed Cyclic Graphical Models.
Proceedings of the UAI 2009, 2009

Group Sparse Priors for Covariance Estimation.
Proceedings of the UAI 2009, 2009

Increased discrimination in level set methods with embedded conditional random fields.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

2008
An interior-point stochastic approximation method and an L1-regularized delta rule.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Structure learning in random fields for heart motion abnormality detection.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

2007
3D Variational Brain Tumor Segmentation using a High Dimensional Feature Set.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

Fast Optimization Methods for L1 Regularization: A Comparative Study and Two New Approaches.
Proceedings of the Machine Learning: ECML 2007, 2007

Learning Graphical Model Structure Using L1-Regularization Paths.
Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

2006
Learning a Classification-based Glioma Growth Model Using MRI Data.
J. Comput., 2006

Accelerated training of conditional random fields with stochastic gradient methods.
Proceedings of the Machine Learning, 2006

A Classification-Based Glioma Diffusion Model Using MRI Data.
Proceedings of the Advances in Artificial Intelligence, 2006

2005
Support Vector Random Fields for Spatial Classification.
Proceedings of the Knowledge Discovery in Databases: PKDD 2005, 2005

Segmenting brain tumors using alignment-based features.
Proceedings of the Fourth International Conference on Machine Learning and Applications, 2005

Segmenting Brain Tumors with Conditional Random Fields and Support Vector Machines.
Proceedings of the Computer Vision for Biomedical Image Applications, 2005
