Charles Sutton

ORCID: 0000-0002-0041-3820

Affiliations:
  • Google Research, Mountain View, CA, USA
  • University of Edinburgh, School of Informatics
  • The Alan Turing Institute, London, UK


According to our database, Charles Sutton authored at least 129 papers between 2003 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Natural Language Outlines for Code: Literate Programming in the LLM Era.
CoRR, 2024

UQE: A Query Engine for Unstructured Databases.
CoRR, 2024

NExT: Teaching Large Language Models to Reason about Code Execution.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

A Probabilistic Framework for Modular Continual Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Programming Language Processing (Dagstuhl Seminar 23062).
Dagstuhl Reports, February, 2023

PaLM: Scaling Language Modeling with Pathways.
J. Mach. Learn. Res., 2023

Universal Self-Consistency for Large Language Model Generation.
CoRR, 2023

ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis.
CoRR, 2023

LambdaBeam: Neural Program Search with Higher-Order Functions and Lambdas.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Training Chain-of-Thought via Latent-Variable Inference.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Can Large Language Models Reason about Program Invariants?
Proceedings of the International Conference on Machine Learning, 2023

Any-scale Balanced Samplers for Discrete Space.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Natural Language to Code Generation in Interactive Data Science Notebooks.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Conditional Independence by Typing.
ACM Trans. Program. Lang. Syst., 2022

A Library for Representing Python Programs as Graphs for Machine Learning.
CoRR, 2022

Language Model Cascades.
CoRR, 2022

Repairing Systematic Outliers by Learning Clean Subspaces in VAEs.
CoRR, 2022

Compositional Generalization and Decomposition in Neural Program Synthesis.
CoRR, 2022

CrossBeam: Learning to Search in Bottom-Up Program Synthesis.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Show Your Work: Scratchpads for Intermediate Computation with Language Models.
CoRR, 2021

Program Synthesis with Large Language Models.
CoRR, 2021

A Bayesian-Symbolic Approach to Reasoning and Learning in Intuitive Physics.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning Semantic Representations to Verify Hardware Designs.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Latent Programmer: Discrete Latent Codes for Program Synthesis.
Proceedings of the 38th International Conference on Machine Learning, 2021

SpreadsheetCoder: Formula Prediction from Semi-structured Context.
Proceedings of the 38th International Conference on Machine Learning, 2021

BUSTLE: Bottom-Up Program Synthesis Through Learning-Guided Exploration.
Proceedings of the 9th International Conference on Learning Representations, 2021

Couplings for Multinomial Hamiltonian Monte Carlo.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
BUSTLE: Bottom-up program-Synthesis Through Learning-guided Exploration.
CoRR, 2020

Neural Program Synthesis with a Differentiable Fixer.
CoRR, 2020

SCELMo: Source Code Embeddings from Language Models.
CoRR, 2020

OptTyper: Probabilistic Type Inference by Optimising Logical and Natural Constraints.
CoRR, 2020

Towards Modular Algorithm Induction.
CoRR, 2020

Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

How Often Do Single-Statement Bugs Occur?: The ManySStuBs4J Dataset.
Proceedings of the MSR '20: 17th International Conference on Mining Software Repositories, 2020

Learning to Fix Build Errors with Graph2Diff Neural Networks.
Proceedings of the ICSE '20: 42nd International Conference on Software Engineering, Workshops, Seoul, Republic of Korea, 27 June, 2020

Where should I comment my code?: a dataset and model for predicting locations that need comments.
Proceedings of the ICSE-NIER 2020: 42nd International Conference on Software Engineering, New Ideas and Emerging Results, Seoul, South Korea, 27 June, 2020

Open-vocabulary models for source code.
Proceedings of the ICSE '20: 42nd International Conference on Software Engineering, Companion Volume, Seoul, South Korea, 27 June, 2020

Big code != big vocabulary: open-vocabulary models for source code.
Proceedings of the ICSE '20: 42nd International Conference on Software Engineering, Seoul, South Korea, 27 June, 2020

Incremental Sampling Without Replacement for Sequence Models.
Proceedings of the 37th International Conference on Machine Learning, 2020

Generative Ratio Matching Networks.
Proceedings of the 8th International Conference on Learning Representations, 2020

Learning to Represent Programs with Property Signatures.
Proceedings of the 8th International Conference on Learning Representations, 2020

Global Relational Models of Source Code.
Proceedings of the 8th International Conference on Learning Representations, 2020

Robust Variational Autoencoders for Outlier Detection and Repair of Mixed-Type Data.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
Probabilistic programming with densities in SlicStan: efficient, flexible, and deterministic.
Proc. ACM Program. Lang., 2019

Wrangling messy CSV files by detecting row and type patterns.
Data Min. Knowl. Discov., 2019

Robust Variational Autoencoders for Outlier Detection in Mixed-Type Data.
CoRR, 2019

Maybe Deep Neural Networks are the Best Choice for Modeling Source Code.
CoRR, 2019

Learning Semantic Annotations for Tabular Data.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Variational Russian Roulette for Deep Bayesian Nonparametrics.
Proceedings of the 36th International Conference on Machine Learning, 2019

GEMSEC: graph embedding with self clustering.
Proceedings of the ASONAM '19: International Conference on Advances in Social Networks Analysis and Mining, 2019

ColNet: Embedding the Semantics of Web Tables for Column Type Prediction.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Mining Semantic Loop Idioms.
IEEE Trans. Software Eng., 2018

A Survey of Machine Learning for Big Code and Naturalness.
ACM Comput. Surv., 2018

Deep Learning to Detect Redundant Method Comments.
CoRR, 2018

Ratio Matching MMD Nets: Low dimensional projections for effective deep generative models.
CoRR, 2018

Variational Inference In Pachinko Allocation Machines.
CoRR, 2018

Synthesis of Differentiable Functional Programs for Lifelong Learning.
CoRR, 2018

Interpreting Deep Classifier by Visual Distillation of Dark Knowledge.
CoRR, 2018

Wrattler: Reproducible, live and polyglot notebooks.
Proceedings of the 10th USENIX Workshop on the Theory and Practice of Provenance, 2018

HOUDINI: Lifelong Learning as Program Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Deep Dungeons and Dragons: Learning Character-Action Interactions from Role-Playing Game Transcripts.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Data Diff: Interpretable, Executable Summaries of Changes in Distributions for Data Wrangling.
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

Summarizing Software API Usage Examples Using Clustering Techniques.
Proceedings of the Fundamental Approaches to Software Engineering, 2018

Sequence-to-Point Learning With Neural Networks for Non-Intrusive Load Monitoring.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Autofolding for Source Code Summarization.
IEEE Trans. Software Eng., 2017

Popularity of arXiv.org within Computer Science.
CoRR, 2017

VEEGAN: Reducing Mode Collapse in GANs using Implicit Variational Learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Learning Continuous Semantic Representations of Symbolic Expressions.
Proceedings of the 34th International Conference on Machine Learning, 2017

Autoencoding Variational Inference For Topic Models.
Proceedings of the 5th International Conference on Learning Representations, 2017

2016
A Bayesian Approach to Parameter Inference in Queueing Networks.
ACM Trans. Model. Comput. Simul., 2016

Sequence-to-point learning with neural networks for nonintrusive load monitoring.
CoRR, 2016

Clustering with a Reject Option: Interactive Clustering as Bayesian Prior Elicitation.
CoRR, 2016

Tailored Mutants Fit Bugs Better.
CoRR, 2016

More Semantics More Robust: Improving Android Malware Classifiers.
Proceedings of the 9th ACM Conference on Security & Privacy in Wireless and Mobile Networks, 2016

Parameter-free probabilistic API mining across GitHub.
Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2016

Composite Denoising Autoencoders.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2016

A Bayesian Network Model for Interesting Itemsets.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2016

A Subsequence Interleaving Model for Sequential Pattern Mining.
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016

Context Matters: Towards Extracting a Citation's Context Using Linguistic Features.
Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, 2016

On Robust Malware Classifiers by Verifying Unwanted Behaviours.
Proceedings of the Integrated Formal Methods - 12th International Conference, 2016

TASSAL: autofolding for source code summarization.
Proceedings of the 38th International Conference on Software Engineering, 2016

A Convolutional Attention Network for Extreme Summarization of Source Code.
Proceedings of the 33rd International Conference on Machine Learning, 2016

A text-mining approach to explain unwanted behaviours.
Proceedings of the 9th European Workshop on System Security, 2016

Explaining Unwanted Behaviours in Context.
Proceedings of the 1st International Workshop on Innovations in Mobile Privacy and Security, 2016

2015
Programming with "Big Code" (Dagstuhl Seminar 15472).
Dagstuhl Reports, 2015

Scheduled denoising autoencoders.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Compressing LSTMs into CNNs.
CoRR, 2015

Parameter-Free Probabilistic API Mining at GitHub Scale.
CoRR, 2015

Suggesting accurate method and class names.
Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, 2015

Latent Bayesian melding for integrating individual and population models.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

2014
Autofolding for Source Code Summarization.
CoRR, 2014

Learning Natural Coding Conventions.
CoRR, 2014

Word storms: multiples of word clouds for visual comparison of documents.
Proceedings of the 23rd International World Wide Web Conference, 2014

Mining idioms from source code.
Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, (FSE-22), Hong Kong, China, November 16, 2014

Learning natural coding conventions.
Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, (FSE-22), Hong Kong, China, November 16, 2014

Signal Aggregate Constraints in Additive Factorial HMMs, with Application to Energy Disaggregation.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Semi-Separable Hamiltonian Monte Carlo for Inference in Bayesian Hierarchical Models.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013
Supporting User-Defined Functions on Uncertain Data.
Proc. VLDB Endow., 2013

Mining source code repositories at massive scale using language modeling.
Proceedings of the 10th Working Conference on Mining Software Repositories, 2013

Why, when, and what: analyzing stack overflow questions by topic, type, and code.
Proceedings of the 10th Working Conference on Mining Software Repositories, 2013

Multiple-source cross-validation.
Proceedings of the 30th International Conference on Machine Learning, 2013

2012
An Introduction to Conditional Random Fields.
Found. Trends Mach. Learn., 2012

Continuous Relaxations for Discrete Hamiltonian Monte Carlo.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Machine learning and multimedia content generation for energy demand reduction.
Proceedings of the Sustainable Internet and ICT for Sustainability, 2012

2011
Distributed inference and query processing for RFID tracking and monitoring.
Proc. VLDB Endow., 2011

Quasi-Newton Methods for Markov Chain Monte Carlo.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

2010
Inference and Learning in Networks of Queues.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Bayesian Inference in Queueing Networks.
CoRR, 2010

2009
Piecewise training for structured prediction.
Mach. Learn., 2009

Probabilistic Inference over RFID Streams in Mobile Environments.
Proceedings of the 25th International Conference on Data Engineering, 2009

Automatic exploration of datacenter performance regimes.
Proceedings of the 1st Workshop on Automated Control for Datacenters and Clouds, 2009

Statistical Machine Learning Makes Automatic Control Practical for Internet Datacenters.
Proceedings of the Workshop on Hot Topics in Cloud Computing, 2009

Capturing Data Uncertainty in High-Volume Stream Processing.
Proceedings of the Fourth Biennial Conference on Innovative Data Systems Research, 2009

2008
Probabilistic Inference in Queueing Networks.
Proceedings of the Third Workshop on Tackling Computer Systems Problems with Machine Learning Techniques, 2008

Exploiting Machine Learning to Subvert Your Spam Filter.
Proceedings of the First USENIX Workshop on Large-Scale Exploits and Emergent Threats, 2008

Unsupervised deduplication using cross-field dependencies.
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008

2007
Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data.
J. Mach. Learn. Res., 2007

Improved Dynamic Schedules for Belief Propagation.
Proceedings of the UAI 2007, 2007

Piecewise pseudolikelihood for efficient training of conditional random fields.
Proceedings of the Machine Learning, 2007

2006
Reducing Weight Undertraining in Structured Discriminative Learning.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2006

Sparse Forward-Backward Using Minimum Divergence Beams for Fast Training Of Conditional Random Fields.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Piecewise Training for Undirected Models.
Proceedings of the UAI '05, 2005

Composition of Conditional Random Fields for Transfer Learning.
Proceedings of the HLT/EMNLP 2005, 2005

Joint Parsing and Semantic Role Labeling.
Proceedings of the Ninth Conference on Computational Natural Language Learning, 2005

Learning in Markov Random Fields with Contrastive Free Energies.
Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, 2005

2003
Guided Incremental Construction of Belief Networks.
Proceedings of the Advances in Intelligent Data Analysis V, 2003

Very Predictive Ngrams for Space-Limited Probabilistic Models.
Proceedings of the Advances in Intelligent Data Analysis V, 2003
