Jascha Sohl-Dickstein

Affiliations:

Google Brain, Mountain View, CA, USA
UC Berkeley, Redwood Center for Theoretical Neuroscience, CA, USA (PhD 2012)

According to our database, Jascha Sohl-Dickstein authored at least 110 papers between 2010 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2024

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models.

Trans. Mach. Learn. Res., 2024

Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability.

CoRR, 2024

Training LLMs over Neurally Compressed Text.

,

,

,

Jeffrey Pennington

,

,

Jascha Sohl-Dickstein

,

CoRR, 2024

The boundary of neural network trainability is fractal.

Jascha Sohl-Dickstein

CoRR, 2024

Position: Levels of AGI for Operationalizing Progress on the Path to AGI.

Meredith Ringel Morris

,

Jascha Sohl-Dickstein

,

,

,

,

Aleksandra Faust

,

Clément Farabet

,

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Scaling Exponents Across Parameterizations and Optimizers.

Katie E. Everett

,

,

Mitchell Wortsman

,

Alexander A. Alemi

,

,

,

,

Jascha Sohl-Dickstein

,

Leslie Pack Kaelbling

,

,

Jeffrey Pennington

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Small-scale proxies for large-scale Transformer training instabilities.

Mitchell Wortsman

,

,

,

Katie E. Everett

,

Alexander A. Alemi

,

,

John D. Co-Reyes

,

,

,

,

Jeffrey Pennington

,

Jascha Sohl-Dickstein

,

,

,

,

Simon Kornblith

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.

Aarohi Srivastava

,

Abhinav Rastogi

,

,

Abu Awal Md Shoeb

,

,

,

,

,

,

Adrià Garriga-Alonso

,

Agnieszka Kluska

,

Aitor Lewkowycz

,

,

,

,

,

Alexander W. Kocurek

,

,

,

,

,

,

,

,

,

,

,

Anantharaman S. Iyer

,

Anders Andreassen

,

,

Andrea Santilli

,

Andreas Stuhlmüller

,

,

,

Andrew K. Lampinen

,

,

,

,

,

,

,

Antonio Norelli

,

,

Arash Gholamidavoodi

,

,

,

Arun Kirubarajan

,

Asher Mullokandov

,

Ashish Sabharwal

,

,

,

,

,

B. Ryan Roberts

,

,

,

Bartlomiej Bojanowski

,

Batuhan Özyurt

,

Behnam Hedayatnia

,

Behnam Neyshabur

,

,

,

,

Bill Yuchen Lin

,

,

,

,

,

Catherine Stinson

,

Cedrick Argueta

,

Cèsar Ferri Ramírez

,

,

Charles Rathkopf

,

,

,

,

Chris Callison-Burch

,

,

Christian Voigt

,

Christopher D. Manning

,

Christopher Potts

,

,

Clara E. Rivera

,

,

,

Courtney Ashcraft

,

Cristina Garbacea

,

,

,

,

,

,

,

Daniel Khashabi

,

,

Daniel Moseguí González

,

Danielle Perszyk

,

Danny Hernandez

,

,

Daphne Ippolito

,

,

,

,

,

Debajyoti Datta

,

,

,

,

,

,

,

,

,

,

Dimitri Coelho Mollo

,

,

,

,

Ekaterina Shutova

,

Ekin Dogus Cubuk

,

,

Eleanor Hagerman

,

Elizabeth Barnes

,

Elizabeth Donoway

,

,

Emanuele Rodolà

,

,

,

,

,

,

,

,

Ethan J. Jerzak

,

,

Eunice Engefu Manyasi

,

Evgenii Zheltonozhskii

,

,

,

Fernando Martínez-Plumed

,

Francesca Happé

,

François Chollet

,

,

,

Genta Indra Winata

,

,

Germán Kruszewski

,

Giambattista Parascandolo

,

Giorgio Mariani

,

,

Gonzalo Jaimovitch-López

,

,

,

Hana Galijasevic

,

,

,

Hannaneh Hajishirzi

,

,

,

,

Hinrich Schütze

,

,

,

,

,

,

,

Jack Geissinger

,

Jackson Kernion

,

,

,

Jaime Fernández Fisac

,

,

,

,

,

,

,

Janelle Wingfield

,

,

,

Jascha Sohl-Dickstein

,

,

,

,

Jekaterina Novikova

,

,

,

,

,

,

,

,

,

,

,

,

,

,

Jonathan Batchelder

,

Jonathan Berant

,

,

,

José Hernández-Orallo

,

Joseph Boudeman

,

,

,

Joshua B. Tenenbaum

,

,

,

,

,

,

Karthik Gopalakrishnan

,

Katerina Ignatyeva

,

,

Kaustubh D. Dhole

,

,

,

,

Kristen Chiafullo

,

Ksenia Shkaruta

,

,

,

Kyle Richardson

,

,

,

,

,

,

Lidia Contreras Ochando

,

Louis-Philippe Morency

,

,

,

,

,

,

Luis Oliveros Colón

,

,

Lütfi Kerem Senel

,

,

,

Maartje ter Hoeve

,

,

,

,

,

,

,

María José Ramírez-Quintana

,

,

Mario Giulianelli

,

,

Martin Potthast

,

Matthew L. Leavitt

,

,

Mátyás Schubert

,

Medina Baitemirova

,

,

Melvin McElrath

,

,

,

,

Michael I. Ivanitskiy

,

Michael Starritt

,

,

Michal Swedrowski

,

Michele Bevilacqua

,

Michihiro Yasunaga

,

,

,

,

,

,

,

,

Moin Aminnaseri

,

,

,

Mukund Varma T.

,

,

,

,

Neta Gur-Ari Krakover

,

Nicholas Cameron

,

Nicholas Roberts

,

,

Nicole Martinez

,

,

,

Niklas Muennighoff

,

Nitish Shirish Keskar

,

,

,

,

,

,

,

Omar Elbaghdadi

,

,

,

Pablo Antonio Moreno Casares

,

,

,

,

,

Pegah Alipoormolabashi

,

,

,

,

Peter Eckersley

,

,

,

Piotr Milkowski

,

,

Pouya Pezeshkpour

,

,

,

,

,

,

Rachel Etta Rudolph

,

,

,

,

Raphaël Millière

,

,

,

,

,

Robbe Raymaekers

,

,

,

,

,

,

,

,

,

Ruslan Salakhutdinov

,

,

,

,

,

,

,

Saif M. Mohammad

,

,

,

,

,

Samuel Gruetter

,

Samuel R. Bowman

,

Samuel S. Schoenholz

,

,

,

,

Sarik Ghazarian

,

,

,

Sebastian Bischoff

,

Sebastian Gehrmann

,

Sebastian Schuster

,

Sepideh Sadeghi

,

,

,

Shashank Srivastava

,

,

,

,

Shixiang Shane Gu

,

Shubh Pachchigar

,

Shubham Toshniwal

,

,

Shyamolima (Shammie) Debnath

,

,

Simon Thormeyer

,

,

,

Sneha Priscilla Makini

,

,

,

Sriharsha Hatwar

,

Stanislas Dehaene

,

,

,

Stella Biderman

,

,

,

Steven T. Piantadosi

,

Stuart M. Shieber

,

Summer Misherghi

,

Svetlana Kiritchenko

,

,

,

,

,

,

,

Tatsu Hashimoto

,

,

Théo Desbordes

,

Theodore Rothschild

,

,

,

Tiberius Nkinyili

,

,

,

,

Tobias Gerstenberg

,

,

Trishala Neeraj

,

,

,

,

,

,

Victoria Nyamai

,

,

Vinay V. Ramasesh

,

Vinay Uday Prabhu

,

Vishakh Padmakumar

,

,

,

William Saunders

,

,

,

,

,

,

,

,

Yadollah Yaghoobzadeh

,

,

,

,

,

,

,

,

Yonatan Belinkov

,

,

,

,

,

,

,

,

,

Trans. Mach. Learn. Res., 2023

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

CoRR, 2023

Levels of AGI: Operationalizing Progress on the Path to AGI.

Meredith Ringel Morris

,

Jascha Sohl-Dickstein

,

,

,

,

Aleksandra Faust

,

Clément Farabet

,

CoRR, 2023

Noise-Reuse in Online Evolution Strategies.

,

,

Jascha Sohl-Dickstein

,

,

CoRR, 2023

Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies.

,

,

Jascha Sohl-Dickstein

,

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC.

,

,

,

Joshua B. Tenenbaum

,

Sander Dieleman

,

,

Jascha Sohl-Dickstein

,

,

Will Sussman Grathwohl

Proceedings of the International Conference on Machine Learning, 2023

2022

General-Purpose In-Context Learning by Meta-Learning Transformers.

,

,

Jascha Sohl-Dickstein

,

CoRR, 2022

VeLO: Training Versatile Learned Optimizers by Scaling Up.

,

,

C. Daniel Freeman

,

,

,

,

,

,

,

,

Jascha Sohl-Dickstein

CoRR, 2022

Language Model Cascades.

,

,

Aitor Lewkowycz

,

,

,

Raphael Gontijo Lopes

,

,

Henryk Michalewski

,

,

Jascha Sohl-Dickstein

,

,

CoRR, 2022

A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases.

,

,

Jascha Sohl-Dickstein

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies (Extended Abstract).

,

,

Jascha Sohl-Dickstein

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Fast Finite Width Neural Tangent Kernel.

,

Jascha Sohl-Dickstein

,

Samuel S. Schoenholz

Proceedings of the International Conference on Machine Learning, 2022

Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling.

,

,

Jeffrey Pennington

,

Jascha Sohl-Dickstein

Proceedings of the International Conference on Machine Learning, 2022

Practical Tradeoffs between Memory, Compute, and Performance in Learned Optimizers.

,

C. Daniel Freeman

,

,

Niru Maheswaranathan

,

Jascha Sohl-Dickstein

Proceedings of the Conference on Lifelong Learning Agents, 2022

2021

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation.

Kaustubh D. Dhole

,

,

Sebastian Gehrmann

,

,

,

,

Abinaya Mahendiran

,

,

Ashish Srivastava

,

,

,

Jascha Sohl-Dickstein

,

,

,

,

Sebastian Ruder

,

,

,

,

,

,

Ian Berlot-Attwell

,

,

,

Marco Antonio Sobrevilla Cabezudo

,

Samuel Cahyawijaya

,

,

,

Mukund Choudhary

,

Christian Clauss

,

,

,

,

,

,

Thomas Dopierre

,

Paul-Alexis Dray

,

,

Tatiana Ekeinhor

,

Marco Di Giovanni

,

,

,

,

,

Fabrice Harel-Canada

,

,

,

Przemyslaw K. Joniak

,

,

Venelin Kovatchev

,

Kalpesh Krishna

,

,

,

Seungjae Ryan Lee

,

Corey James Levinson

,

,

,

,

Andrey Lukyanenko

,

Vukosi Marivate

,

,

,

,

,

Nafise Sadat Moosavi

,

Niklas Muennighoff

,

Timothy Sum Hon Mun

,

,

,

,

,

Nivranshu Pasricha

,

,

,

,

,

,

,

Pawan Kumar Rajpoot

,

,

,

Nicholas Roberts

,

Juan Diego Rodriguez

,

,

Paulo Henrique Santos Vasconcellos

,

,

Robin M. Schmidt

,

,

Tshephisho Sefara

,

,

,

,

,

,

,

,

,

,

,

,

Taylor Sorensen

,

,

Aman Srivastava

,

K. V. Aditya Srivatsa

,

,

Mukund Varma T.

,

,

Fiona Anting Tan

,

,

,

,

,

,

,

,

,

,

Genta Indra Winata

,

,

Witold Wydmanski

,

,

,

,

,

CoRR, 2021

Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping.

,

,

Guillaume Desjardins

,

Grzegorz Swirszcz

,

Valentin Dalibard

,

Jascha Sohl-Dickstein

,

Samuel S. Schoenholz

CoRR, 2021

Training Learned Optimizers with Randomly Initialized Learned Optimizers.

,

C. Daniel Freeman

,

Niru Maheswaranathan

,

Jascha Sohl-Dickstein

CoRR, 2021

Reverse engineering learned optimizers reveals known and novel mechanisms.

Niru Maheswaranathan

,

,

,

,

Jascha Sohl-Dickstein

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization.

,

Daniel Duckworth

,

Samuel S. Schoenholz

,

,

Jascha Sohl-Dickstein

Proceedings of the 38th International Conference on Machine Learning, 2021

Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies.

,

,

Jascha Sohl-Dickstein

Proceedings of the 38th International Conference on Machine Learning, 2021

Score-Based Generative Modeling through Stochastic Differential Equations.

,

Jascha Sohl-Dickstein

,

Diederik P. Kingma

,

,

,

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

Parallel Training of Deep Networks with Local Updates.

,

,

,

,

Badreddine Noune

,

,

Jascha Sohl-Dickstein

,

CoRR, 2020

Towards NNGP-guided Neural Architecture Search.

,

,

,

,

Jascha Sohl-Dickstein

CoRR, 2020

Is Batch Norm unique? An empirical investigation and prescription to emulate the best properties of common normalizers without batch dependence.

,

Jascha Sohl-Dickstein

CoRR, 2020

Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves.

,

Niru Maheswaranathan

,

C. Daniel Freeman

,

,

Jascha Sohl-Dickstein

CoRR, 2020

Whitening and second order optimization both destroy information about the dataset, and can make generalization impossible.

,

Daniel Duckworth

,

Samuel S. Schoenholz

,

,

Jascha Sohl-Dickstein

CoRR, 2020

A new method for parameter estimation in probabilistic models: Minimum probability flow.

Jascha Sohl-Dickstein

,

Peter Battaglino

,

Michael Robert DeWeese

CoRR, 2020

Exact posterior distributions of wide Bayesian neural networks.

,

,

,

Jeffrey Pennington

,

Jascha Sohl-Dickstein

CoRR, 2020

The large learning rate phase of deep learning: the catapult mechanism.

Aitor Lewkowycz

,

,

,

Jascha Sohl-Dickstein

,

CoRR, 2020

Using a thousand optimization tasks to learn hyperparameter search strategies.

,

Niru Maheswaranathan

,

,

C. Daniel Freeman

,

,

Jascha Sohl-Dickstein

CoRR, 2020

On the infinite width limit of neural networks with a standard parameterization.

Jascha Sohl-Dickstein

,

,

Samuel S. Schoenholz

,

CoRR, 2020

Finite Versus Infinite Neural Networks: an Empirical Study.

,

Samuel S. Schoenholz

,

Jeffrey Pennington

,

,

,

,

Jascha Sohl-Dickstein

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Your GAN is Secretly an Energy-based Model and You Should Use Discriminator Driven Latent Sampling.

,

,

Jascha Sohl-Dickstein

,

Hugo Larochelle

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Infinite attention: NNGP and NTK for deep attention networks.

,

,

Jascha Sohl-Dickstein

,

Proceedings of the 37th International Conference on Machine Learning, 2020

Neural Tangents: Fast and Easy Infinite Neural Networks in Python.

,

,

,

,

Alexander A. Alemi

,

Jascha Sohl-Dickstein

,

Samuel S. Schoenholz

Proceedings of the 8th International Conference on Learning Representations, 2020

2019

Measuring the Effects of Data Parallelism on Neural Network Training.

Christopher J. Shallue

,

,

Joseph M. Antognini

,

Jascha Sohl-Dickstein

,

,

J. Mach. Learn. Res., 2019

Neural reparameterization improves structural optimization.

,

Jascha Sohl-Dickstein

,

CoRR, 2019

Using learned optimizers to make models robust to input noise.

,

Niru Maheswaranathan

,

Jonathon Shlens

,

Jascha Sohl-Dickstein

,

CoRR, 2019

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent.

,

,

Samuel S. Schoenholz

,

,

Jascha Sohl-Dickstein

,

Jeffrey Pennington

CoRR, 2019

Eliminating all bad Local Minima from Loss Landscapes without even adding an Extra Unit.

Jascha Sohl-Dickstein

,

Kenji Kawaguchi

CoRR, 2019

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent.

,

,

Samuel S. Schoenholz

,

,

,

Jascha Sohl-Dickstein

,

Jeffrey Pennington

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Invertible Convolutional Flow.

,

Dale Schuurmans

,

Jascha Sohl-Dickstein

,

,

Daniel Duckworth

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study.

,

Jascha Sohl-Dickstein

,

,

Samuel L. Smith

Proceedings of the 36th International Conference on Machine Learning, 2019

Understanding and correcting pathologies in the training of learned optimizers.

,

Niru Maheswaranathan

,

,

C. Daniel Freeman

,

Jascha Sohl-Dickstein

Proceedings of the 36th International Conference on Machine Learning, 2019

Guided evolutionary strategies: augmenting random search with surrogate gradients.

Niru Maheswaranathan

,

,

,

,

Jascha Sohl-Dickstein

Proceedings of the 36th International Conference on Machine Learning, 2019

A Mean Field Theory of Batch Normalization.

,

Jeffrey Pennington

,

,

Jascha Sohl-Dickstein

,

Samuel S. Schoenholz

Proceedings of the 7th International Conference on Learning Representations, 2019

Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes.

,

,

,

,

,

,

Daniel A. Abolafia

,

Jeffrey Pennington

,

Jascha Sohl-Dickstein

Proceedings of the 7th International Conference on Learning Representations, 2019

Meta-Learning Update Rules for Unsupervised Representation Learning.

,

Niru Maheswaranathan

,

,

Jascha Sohl-Dickstein

Proceedings of the 7th International Conference on Learning Representations, 2019

Adversarial Reprogramming of Neural Networks.

Gamaleldin F. Elsayed

,

Ian J. Goodfellow

,

Jascha Sohl-Dickstein

Proceedings of the 7th International Conference on Learning Representations, 2019

A RAD approach to deep mixture models.

,

Jascha Sohl-Dickstein

,

,

Hugo Larochelle

Proceedings of the Deep Generative Models for Highly Structured Data, 2019

2018

Learned optimizers that outperform SGD on wall-clock and test loss.

,

Niru Maheswaranathan

,

,

C. Daniel Freeman

,

Jascha Sohl-Dickstein

CoRR, 2018

Bayesian Convolutional Neural Networks with Many Channels are Gaussian Processes.

,

,

,

,

Daniel A. Abolafia

,

Jeffrey Pennington

,

Jascha Sohl-Dickstein

CoRR, 2018

Guided evolutionary strategies: escaping the curse of dimensionality in random search.

Niru Maheswaranathan

,

,

,

Jascha Sohl-Dickstein

CoRR, 2018

Stochastic natural gradient descent draws posterior samples in function space.

Samuel L. Smith

,

Daniel Duckworth

,

,

Jascha Sohl-Dickstein

CoRR, 2018

Learning Unsupervised Learning Rules.

,

Niru Maheswaranathan

,

,

Jascha Sohl-Dickstein

CoRR, 2018

Adversarial Examples that Fool both Human and Computer Vision.

Gamaleldin F. Elsayed

,

,

,

Nicolas Papernot

,

,

Ian J. Goodfellow

,

Jascha Sohl-Dickstein

CoRR, 2018

Adversarial Examples that Fool both Computer Vision and Time-Limited Humans.

Gamaleldin F. Elsayed

,

,

,

Nicolas Papernot

,

,

Ian J. Goodfellow

,

Jascha Sohl-Dickstein

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

PCA of high dimensional random walks with comparison to neural network training.

Joseph M. Antognini

,

Jascha Sohl-Dickstein

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks.

,

,

Jascha Sohl-Dickstein

,

Samuel S. Schoenholz

,

Jeffrey Pennington

Proceedings of the 35th International Conference on Machine Learning, 2018

Sensitivity and Generalization in Neural Networks: an Empirical Study.

,

,

Daniel A. Abolafia

,

Jeffrey Pennington

,

Jascha Sohl-Dickstein

Proceedings of the 6th International Conference on Learning Representations, 2018

Learning to Learn Without Labels.

,

Niru Maheswaranathan

,

,

Jascha Sohl-Dickstein

Proceedings of the 6th International Conference on Learning Representations, 2018

Generalizing Hamiltonian Monte Carlo with Neural Networks.

,

Matthew D. Hoffman

,

Jascha Sohl-Dickstein

Proceedings of the 6th International Conference on Learning Representations, 2018

Deep Neural Networks as Gaussian Processes.

,

,

,

Samuel S. Schoenholz

,

Jeffrey Pennington

,

Jascha Sohl-Dickstein

Proceedings of the 6th International Conference on Learning Representations, 2018

2017

Minimum and Maximum Entropy Distributions for Binary Systems with Known Means and Pairwise Correlations.

Badr F. Albanna

,

Christopher Hillar

,

Jascha Sohl-Dickstein

,

Michael Robert DeWeese

Entropy, 2017

A Correspondence Between Random Neural Networks and Statistical Field Theory.

Samuel S. Schoenholz

,

Jeffrey Pennington

,

Jascha Sohl-Dickstein

CoRR, 2017

SVCCA: Singular Vector Canonical Correlation Analysis for Deep Understanding and Improvement.

,

,

,

Jascha Sohl-Dickstein

CoRR, 2017

REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models.

,

,

Chris J. Maddison

,

Dieterich Lawson

,

Jascha Sohl-Dickstein

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability.

,

,

,

Jascha Sohl-Dickstein

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Learned Optimizers that Scale and Generalize.

Olga Wichrowska

,

Niru Maheswaranathan

,

Matthew W. Hoffman

,

Sergio Gomez Colmenarejo

,

,

Nando de Freitas

,

Jascha Sohl-Dickstein

Proceedings of the 34th International Conference on Machine Learning, 2017

On the Expressive Power of Deep Neural Networks.

,

,

Jon M. Kleinberg

,

,

Jascha Sohl-Dickstein

Proceedings of the 34th International Conference on Machine Learning, 2017

Input Switched Affine Networks: An RNN Architecture Designed for Interpretability.

Jakob N. Foerster

,

,

Jascha Sohl-Dickstein

,

,

Proceedings of the 34th International Conference on Machine Learning, 2017

REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models.

,

,

Chris J. Maddison

,

Jascha Sohl-Dickstein

Proceedings of the 5th International Conference on Learning Representations, 2017

Deep Information Propagation.

Samuel S. Schoenholz

,

,

,

Jascha Sohl-Dickstein

Proceedings of the 5th International Conference on Learning Representations, 2017

Unrolled Generative Adversarial Networks.

,

,

,

Jascha Sohl-Dickstein

Proceedings of the 5th International Conference on Learning Representations, 2017

Explaining the Learning Dynamics of Direct Feedback Alignment.

,

,

Samuel S. Schoenholz

,

,

Jascha Sohl-Dickstein

Proceedings of the 5th International Conference on Learning Representations, 2017

Density estimation using Real NVP.

,

Jascha Sohl-Dickstein

,

Proceedings of the 5th International Conference on Learning Representations, 2017

Capacity and Trainability in Recurrent Neural Networks.

Jasmine Collins

,

Jascha Sohl-Dickstein

,

Proceedings of the 5th International Conference on Learning Representations, 2017

2016

Survey of Expressivity in Deep Neural Networks.

,

,

Jon M. Kleinberg

,

,

Jascha Sohl-Dickstein

CoRR, 2016

Improved generator objectives for GANs.

,

Alexander A. Alemi

,

Jascha Sohl-Dickstein

,

Anelia Angelova

CoRR, 2016

A universal tradeoff between power, precision and speed in physical communication.

Subhaneil Lahiri

,

Jascha Sohl-Dickstein

,

CoRR, 2016

Intelligible Language Modeling with Input Switched Affine Networks.

Jakob N. Foerster

,

,

,

Jascha Sohl-Dickstein

,

CoRR, 2016

Exponential expressivity in deep neural networks through transient chaos.

,

Subhaneil Lahiri

,

,

Jascha Sohl-Dickstein

,

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

2015

A Device for Human Ultrasonic Echolocation.

Jascha Sohl-Dickstein

,

,

Benjamin M. Gaub

,

Chris C. Rodgers

,

,

Michael Robert DeWeese

,

Nicol S. Harper

IEEE Trans. Biomed. Eng., 2015

Technical Note on Equivalence Between Recurrent Neural Network Time Series Models and Variational Bayesian Models.

Jascha Sohl-Dickstein

,

Diederik P. Kingma

CoRR, 2015

Deep Knowledge Tracing.

,

Jonathan Bassen

,

,

,

,

Leonidas J. Guibas

,

Jascha Sohl-Dickstein

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Deep Unsupervised Learning using Nonequilibrium Thermodynamics.

Jascha Sohl-Dickstein

,

,

Niru Maheswaranathan

,

Proceedings of the 32nd International Conference on Machine Learning, 2015

2014

Modeling Higher-Order Correlations within Cortical Microcolumns.

,

Jascha Sohl-Dickstein

,

Charles M. Gray

,

Bruno A. Olshausen

PLoS Comput. Biol., 2014

Analyzing noise in autoencoders and deep networks.

,

Jascha Sohl-Dickstein

,

CoRR, 2014

Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods.

Jascha Sohl-Dickstein

,

,

Proceedings of the 31st International Conference on Machine Learning, 2014

Hamiltonian Monte Carlo Without Detailed Balance.

Jascha Sohl-Dickstein

,

Mayur Mudigonda

,

Michael Robert DeWeese

Proceedings of the 31st International Conference on Machine Learning, 2014

2013

An adaptive low dimensional quasi-Newton sum of functions optimizer.

Jascha Sohl-Dickstein

,

,

CoRR, 2013

Measurably Increasing Motivation in MOOCs.

Joseph Jay Williams

,

,

,

Jascha Sohl-Dickstein

Proceedings of the Workshops at the 16th International Conference on Artificial Intelligence in Education AIED 2013, 2013

Controlled experiments on millions of students to personalize learning.

,

,

,

,

Jascha Sohl-Dickstein

Proceedings of the Workshops at the 16th International Conference on Artificial Intelligence in Education AIED 2013, 2013

2012

Efficient Methods for Unsupervised Learning of Probabilistic Models.

Jascha Sohl-Dickstein

PhD thesis, 2012

Efficient Methods for Unsupervised Learning of Probabilistic Models.

Jascha Sohl-Dickstein

CoRR, 2012

Hamiltonian Monte Carlo with Reduced Momentum Flips.

Jascha Sohl-Dickstein

CoRR, 2012

Hamiltonian Annealed Importance Sampling for partition function estimation.

Jascha Sohl-Dickstein

,

Benjamin J. Culpepper

CoRR, 2012

The Natural Gradient by Analogy to Signal Whitening, and Recipes and Tricks for its Use.

Jascha Sohl-Dickstein

CoRR, 2012

Training sparse natural image models with a fast Gibbs sampler of an extended state space.

,

Jascha Sohl-Dickstein

,

Matthias Bethge

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

2011

Minimum Probability Flow Learning.

Jascha Sohl-Dickstein

,

Peter Battaglino

,

Michael Robert DeWeese

Proceedings of the 28th International Conference on Machine Learning, 2011

Building a better probabilistic model of images by factorization.

Benjamin J. Culpepper

,

Jascha Sohl-Dickstein

,

Bruno A. Olshausen

Proceedings of the IEEE International Conference on Computer Vision, 2011

Lie Group Transformation Models for Predictive Video Coding.

Ching Ming Wang

,

Jascha Sohl-Dickstein

,

,

Bruno A. Olshausen

Proceedings of the 2011 Data Compression Conference (DCC 2011), 2011

2010

An Unsupervised Algorithm For Learning Lie Group Transformations.

Jascha Sohl-Dickstein

,

,

Bruno A. Olshausen

CoRR, 2010
