Jascha Sohl-Dickstein
Affiliations:
- Google Brain, Mountain View, CA, USA
- UC Berkeley, Redwood Center for Theoretical Neuroscience, CA, USA (PhD 2012)
According to our database, Jascha Sohl-Dickstein authored at least 110 papers between 2010 and 2024.
Online presence:
- on github.com
- on ai.google
- on rctn.org
Bibliography
2024
Trans. Mach. Learn. Res., 2024
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability.
CoRR, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.
Trans. Mach. Learn. Res., 2023
Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"
CoRR, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC.
Proceedings of the International Conference on Machine Learning, 2023
2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies (Extended Abstract).
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022
Proceedings of the International Conference on Machine Learning, 2022
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling.
Proceedings of the International Conference on Machine Learning, 2022
Proceedings of the Conference on Lifelong Learning Agents, 2022
2021
CoRR, 2021
Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping.
CoRR, 2021
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization.
Proceedings of the 38th International Conference on Machine Learning, 2021
Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies.
Proceedings of the 38th International Conference on Machine Learning, 2021
Proceedings of the 9th International Conference on Learning Representations, 2021
2020
Is Batch Norm unique? An empirical investigation and prescription to emulate the best properties of common normalizers without batch dependence.
CoRR, 2020
Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves.
CoRR, 2020
Whitening and second order optimization both destroy information about the dataset, and can make generalization impossible.
CoRR, 2020
A new method for parameter estimation in probabilistic models: Minimum probability flow.
CoRR, 2020
CoRR, 2020
CoRR, 2020
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Your GAN is Secretly an Energy-based Model and You Should Use Discriminator Driven Latent Sampling.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Proceedings of the 37th International Conference on Machine Learning, 2020
Proceedings of the 8th International Conference on Learning Representations, 2020
2019
J. Mach. Learn. Res., 2019
CoRR, 2019
Eliminating all bad Local Minima from Loss Landscapes without even adding an Extra Unit.
CoRR, 2019
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study.
Proceedings of the 36th International Conference on Machine Learning, 2019
Proceedings of the 36th International Conference on Machine Learning, 2019
Proceedings of the 36th International Conference on Machine Learning, 2019
Proceedings of the 7th International Conference on Learning Representations, 2019
Proceedings of the 7th International Conference on Learning Representations, 2019
Proceedings of the 7th International Conference on Learning Representations, 2019
Proceedings of the 7th International Conference on Learning Representations, 2019
Proceedings of the Deep Generative Models for Highly Structured Data, 2019
2018
CoRR, 2018
Guided evolutionary strategies: escaping the curse of dimensionality in random search.
CoRR, 2018
CoRR, 2018
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks.
Proceedings of the 35th International Conference on Machine Learning, 2018
Proceedings of the 6th International Conference on Learning Representations, 2018
Proceedings of the 6th International Conference on Learning Representations, 2018
Proceedings of the 6th International Conference on Learning Representations, 2018
Proceedings of the 6th International Conference on Learning Representations, 2018
2017
Minimum and Maximum Entropy Distributions for Binary Systems with Known Means and Pairwise Correlations.
Entropy, 2017
CoRR, 2017
SVCCA: Singular Vector Canonical Correlation Analysis for Deep Understanding and Improvement.
CoRR, 2017
REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017
SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017
Proceedings of the 34th International Conference on Machine Learning, 2017
Proceedings of the 34th International Conference on Machine Learning, 2017
Proceedings of the 34th International Conference on Machine Learning, 2017
REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models.
Proceedings of the 5th International Conference on Learning Representations, 2017
Proceedings of the 5th International Conference on Learning Representations, 2017
Proceedings of the 5th International Conference on Learning Representations, 2017
Proceedings of the 5th International Conference on Learning Representations, 2017
Proceedings of the 5th International Conference on Learning Representations, 2017
Proceedings of the 5th International Conference on Learning Representations, 2017
2016
CoRR, 2016
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016
2015
Technical Note on Equivalence Between Recurrent Neural Network Time Series Models and Variational Bayesian Models.
CoRR, 2015
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015
Proceedings of the 32nd International Conference on Machine Learning, 2015
2014
PLoS Comput. Biol., 2014
Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods.
Proceedings of the 31st International Conference on Machine Learning, 2014
Proceedings of the 31st International Conference on Machine Learning, 2014
2013
Proceedings of the Workshops at the 16th International Conference on Artificial Intelligence in Education AIED 2013, 2013
Proceedings of the Workshops at the 16th International Conference on Artificial Intelligence in Education AIED 2013, 2013
2012
PhD thesis, 2012
CoRR, 2012
The Natural Gradient by Analogy to Signal Whitening, and Recipes and Tricks for its Use.
CoRR, 2012
Training sparse natural image models with a fast Gibbs sampler of an extended state space.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012
2011
Proceedings of the 28th International Conference on Machine Learning, 2011
Proceedings of the IEEE International Conference on Computer Vision, 2011
Proceedings of the 2011 Data Compression Conference (DCC 2011), 2011
2010