Ilya Sutskever

Affiliations:
  • Safe Superintelligence Inc., San Francisco, CA, USA
  • OpenAI, San Francisco, CA, USA (former)
  • Google Inc, Mountain View, CA, USA (former)
  • University of Toronto, Canada (PhD 2013)


According to our database, Ilya Sutskever authored at least 82 papers between 2007 and 2024.

Bibliography

2024
Scaling and evaluating sparse autoencoders.
CoRR, 2024

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Let's Verify Step by Step.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Consistency Models.
Proceedings of the International Conference on Machine Learning, 2023

Robust Speech Recognition via Large-Scale Weak Supervision.
Proceedings of the International Conference on Machine Learning, 2023

Formal Mathematics Statement Curriculum Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models.
Proceedings of the International Conference on Machine Learning, 2022

2021
Unsupervised Neural Machine Translation with Generative Language Models Only.
CoRR, 2021

Evaluating Large Language Models Trained on Code.
CoRR, 2021

Zero-Shot Text-to-Image Generation.
Proceedings of the 38th International Conference on Machine Learning, 2021

Learning Transferable Visual Models From Natural Language Supervision.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
Generative Language Modeling for Automated Theorem Proving.
CoRR, 2020

Jukebox: A Generative Model for Music.
CoRR, 2020


Distribution Augmentation for Generative Modeling.
Proceedings of the 37th International Conference on Machine Learning, 2020

Generative Pretraining From Pixels.
Proceedings of the 37th International Conference on Machine Learning, 2020

Deep Double Descent: Where Bigger Models and More Data Hurt.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Dota 2 with Large Scale Deep Reinforcement Learning.
CoRR, 2019

Generating Long Sequences with Sparse Transformers.
CoRR, 2019

GamePad: A Learning Environment for Theorem Proving.
Proceedings of the 7th International Conference on Learning Representations, 2019

FFJORD: Free-Form Continuous Dynamics for Scalable Reversible Generative Models.
Proceedings of the 7th International Conference on Learning Representations, 2019

2018
Some Considerations on Learning to Explore via Meta-Reinforcement Learning.
CoRR, 2018

The Importance of Sampling in Meta-Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Emergent Complexity via Multi-Agent Competition.
Proceedings of the 6th International Conference on Learning Representations, 2018

Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Evolution Strategies as a Scalable Alternative to Reinforcement Learning.
CoRR, 2017

Learning to Generate Reviews and Discovering Sentiment.
CoRR, 2017

An online sequence-to-sequence model for noisy speech recognition.
CoRR, 2017

ImageNet classification with deep convolutional neural networks.
Commun. ACM, 2017

One-Shot Imitation Learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Third Person Imitation Learning.
Proceedings of the 5th International Conference on Learning Representations, 2017

Variational Lossy Autoencoder.
Proceedings of the 5th International Conference on Learning Representations, 2017

Learning online alignments with continuous rewards policy gradient.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Mastering the game of Go with deep neural networks and tree search.
Nat., 2016

Neural Random Access Machines.
ERCIM News, 2016

Extensions and Limitations of the Neural GPU.
CoRR, 2016

Neural Programmer: Inducing Latent Programs with Gradient Descent.
Proceedings of the 4th International Conference on Learning Representations, 2016

Multi-task Sequence to Sequence Learning.
Proceedings of the 4th International Conference on Learning Representations, 2016

Neural GPUs Learn Algorithms.
Proceedings of the 4th International Conference on Learning Representations, 2016

MuProp: Unbiased Backpropagation for Stochastic Neural Networks.
Proceedings of the 4th International Conference on Learning Representations, 2016

RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning.
CoRR, 2016

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems.
CoRR, 2016

Improving Variational Autoencoders with Inverse Autoregressive Flow.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

An Online Sequence-to-Sequence Model Using Partial Conditioning.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Continuous Deep Q-Learning with Model-based Acceleration.
Proceedings of the 33rd International Conference on Machine Learning, 2016

2015
Reinforcement Learning Neural Turing Machines.
CoRR, 2015

Towards Principled Unsupervised Learning.
CoRR, 2015

Adding Gradient Noise Improves Learning for Very Deep Networks.
CoRR, 2015

Move Evaluation in Go Using Deep Convolutional Neural Networks.
Proceedings of the 3rd International Conference on Learning Representations, 2015

An Online Sequence-to-Sequence Model Using Partial Conditioning.
CoRR, 2015

Grammar as a Foreign Language.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

An Empirical Exploration of Recurrent Network Architectures.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Addressing the Rare Word Problem in Neural Machine Translation.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
Dropout: a simple way to prevent neural networks from overfitting.
J. Mach. Learn. Res., 2014

Recurrent Neural Network Regularization.
CoRR, 2014

Learning to Execute.
CoRR, 2014

Intriguing properties of neural networks.
Proceedings of the 2nd International Conference on Learning Representations, 2014

Learning Factored Representations in a Deep Mixture of Experts.
Proceedings of the 2nd International Conference on Learning Representations, 2014

Sequence to Sequence Learning with Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013
Training Recurrent Neural Networks.
PhD thesis, 2013

Exploiting Similarities among Languages for Machine Translation.
CoRR, 2013

Distributed Representations of Words and Phrases and their Compositionality.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Stochastic k-Neighborhood Selection for Supervised and Unsupervised Learning.
Proceedings of the 30th International Conference on Machine Learning, 2013

On the importance of initialization and momentum in deep learning.
Proceedings of the 30th International Conference on Machine Learning, 2013

2012
Training Deep and Recurrent Networks with Hessian-Free Optimization.
Proceedings of the Neural Networks: Tricks of the Trade - Second Edition, 2012

Improving neural networks by preventing co-adaptation of feature detectors.
CoRR, 2012

Cardinality Restricted Boltzmann Machines.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Estimating the Hessian by Back-propagating Curvature.
Proceedings of the 29th International Conference on Machine Learning, 2012

2011
Generating Text with Recurrent Neural Networks.
Proceedings of the 28th International Conference on Machine Learning, 2011

Learning Recurrent Neural Networks with Hessian-Free Optimization.
Proceedings of the 28th International Conference on Machine Learning, 2011

2010
Temporal-Kernel Recurrent Neural Networks.
Neural Networks, 2010

On the Convergence Properties of Contrastive Divergence.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Parallelizable Sampling of Markov Random Fields.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

2009
Modelling Relational Data using Bayesian Clustered Tensor Factorization.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

A simpler unified analysis of budget perceptrons.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

2008
Deep, Narrow Sigmoid Belief Networks Are Universal Approximators.
Neural Comput., 2008

The Recurrent Temporal Restricted Boltzmann Machine.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Using matrices to model symbolic relationship.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Mimicking Go Experts with Convolutional Neural Networks.
Proceedings of the Artificial Neural Networks, 2008

2007
Learning Multilevel Distributed Representations for High-Dimensional Sequences.
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007

Visualizing Similarity Data with a Mixture of Maps.
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007

