Andrej Risteski

According to our database, Andrej Risteski authored at least 67 papers between 2012 and 2024.

Bibliography

2024
Progressive distillation induces an implicit curriculum.
CoRR, 2024

On the Benefits of Memory for Modeling Time-Dependent PDEs.
CoRR, 2024

Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Outliers with Opposing Signals Have an Outsized Effect on Neural Network Optimization.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Fit Like You Sample: Sample-Efficient Generalized Score Matching from Fast Mixing Diffusions.
Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

2023
Fit Like You Sample: Sample-Efficient Generalized Score Matching from Fast Mixing Markov Chains.
CoRR, 2023

Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation.
CoRR, 2023

Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Provable benefits of score matching.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Deep Equilibrium Based Neural Operators for Steady-State PDEs.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Provable benefits of annealing for estimating normalizing constants: Importance Sampling, Noise-Contrastive Estimation, and beyond.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective.
Proceedings of the International Conference on Machine Learning, 2023

How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding.
Proceedings of the International Conference on Machine Learning, 2023

Pitfalls of Gaussians as a noise distribution in NCE.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Statistical Efficiency of Score Matching: The View from Isoperimetry.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
Masked prediction tasks: a parameter identifiability view.
CoRR, 2022

Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization.
CoRR, 2022

Continual learning: a feature extraction formalization, an efficient algorithm, and fundamental obstructions.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Masked Prediction: A Parameter Identifiability View.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Iterative Feature Matching: Toward Provable Domain Generalization with Logarithmic Environments.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

The Effects of Invertibility on the Representational Complexity of Encoders in Variational Autoencoders.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Variational autoencoders in the presence of low-dimensional data: landscape and implicit bias.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Sampling Approximately Low-Rank Ising Models: MCMC meets Variational Methods.
Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

An Online Learning Approach to Interpolation and Extrapolation in Domain Generalization.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Contrasting the landscape of contrastive and non-contrastive learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Universal Approximation for Log-concave Distributions using Well-conditioned Normalizing Flows.
CoRR, 2021

Parametric Complexity Bounds for Approximating PDEs with Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Universal Approximation Using Well-Conditioned Normalizing Flows.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Representational aspects of depth and conditioning in normalizing flows.
Proceedings of the 38th International Conference on Machine Learning, 2021

The Risks of Invariant Risk Minimization.
Proceedings of the 9th International Conference on Learning Representations, 2021

Efficient sampling from the Bingham distribution.
Proceedings of the Algorithmic Learning Theory, 2021

Contrastive learning of strong-mixing continuous-time stochastic processes.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

The Limitations of Limited Context for Constituency Parsing.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Fast Convergence for Langevin Diffusion with Matrix Manifold Structure.
CoRR, 2020

Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models.
Proceedings of the 37th International Conference on Machine Learning, 2020

On Learning Language-Invariant Representations for Universal Machine Translation.
Proceedings of the 37th International Conference on Machine Learning, 2020

2019
Benefits of Overparameterization in Single-Layer Latent Variable Generative Models.
CoRR, 2019

Mean-field approximation, convex hierarchies, and the optimality of correlation rounding: a unified perspective.
Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, 2019

The Comparative Power of ReLU Networks and Polynomial Kernels in the Presence of Sparse Latent Structure.
Proceedings of the 7th International Conference on Learning Representations, 2019

Approximability of Discriminators Implies Diversity in GANs.
Proceedings of the 7th International Conference on Learning Representations, 2019

Sum-of-squares meets square loss: Fast rates for agnostic tensor completion.
Proceedings of the Conference on Learning Theory, 2019

2018
Linear Algebraic Structure of Word Senses, with Applications to Polysemy.
Trans. Assoc. Comput. Linguistics, 2018

Simulated Tempering Langevin Monte Carlo II: An Improved Proof using Soft Markov Chain Decomposition.
CoRR, 2018

Representational Power of ReLU Networks and Polynomial Kernels: Beyond Worst-Case Analysis.
CoRR, 2018

Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering Langevin Monte Carlo.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Do GANs learn the distribution? Some Theory and Empirics.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
New Techniques for Learning and Inference in Bayesian Models.
PhD thesis, 2017

Theoretical limitations of Encoder-Decoder GAN architectures.
CoRR, 2017

Extending and Improving Wordnet via Unsupervised Word Embeddings.
CoRR, 2017

Provable benefits of representation learning.
CoRR, 2017

Provable learning of noisy-OR networks.
Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, 2017

On the Ability of Neural Nets to Express Distributions.
Proceedings of the 30th Conference on Learning Theory, 2017

2016
A Latent Variable Model Approach to PMI-based Word Embeddings.
Trans. Assoc. Comput. Linguistics, 2016

On Routing Disjoint Paths in Bounded Treewidth Graphs.
Proceedings of the 15th Scandinavian Symposium and Workshops on Algorithm Theory, 2016

Algorithms and matching lower bounds for approximately-convex optimization.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Recovery guarantee of weighted low-rank approximation via alternating minimization.
Proceedings of the 33rd International Conference on Machine Learning, 2016

How to calculate partition functions using convex programming hierarchies: provable bounds for variational methods.
Proceedings of the 29th Conference on Learning Theory, 2016

2015
Random Walks on Context Spaces: Towards an Explanation of the Mysteries of Semantic Word Embeddings.
CoRR, 2015

On some provably correct cases of variational inference for topic models.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Label optimal regret bounds for online local learning.
Proceedings of The 28th Conference on Learning Theory, 2015

2014
Skeletal configurations of ribbon trees.
Discret. Appl. Math., 2014

2012
Skeletal Rigidity of Phylogenetic Trees.
CoRR, 2012

What makes a Tree a Straight Skeleton?
Proceedings of the 24th Canadian Conference on Computational Geometry, 2012

