Quoc V. Le

Orcid: 0000-0002-1087-2844

Affiliations:
  • Google Inc., Mountain View, CA, USA
  • Stanford University, Computer Science Department, CA, USA


According to our database, Quoc V. Le authored at least 238 papers between 2005 and 2024.

Bibliography

2024
Solving olympiad geometry without human demonstrations.
Nat., January, 2024

Scaling Instruction-Finetuned Language Models.
J. Mach. Learn. Res., 2024

Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries.
CoRR, 2024

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling.
CoRR, 2024

HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning.
CoRR, 2024

NATURAL PLAN: Benchmarking LLMs on Natural Language Planning.
CoRR, 2024

Long-form factuality in large language models.
CoRR, 2024

Self-Discover: Large Language Models Self-Compose Reasoning Structures.
CoRR, 2024

Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Large Language Models as Optimizers.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning.
Proceedings of the Computer Vision - ECCV 2024, 2024

Beyond ChatBots: ExploreLLM for Structured Thoughts and Personalized Model Responses.
Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 2024

FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Combined scaling for zero-shot transfer learning.
Neurocomputing, October, 2023

AutoNumerics-Zero: Automated Discovery of State-of-the-Art Mathematical Functions.
CoRR, 2023

Simple synthetic data reduces sycophancy in large language models.
CoRR, 2023

FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search.
CoRR, 2023

Symbolic Discovery of Optimization Algorithms.
CoRR, 2023

Unified Functional Hashing in Automatic Machine Learning.
CoRR, 2023

Noise2Music: Text-conditioned Music Generation with Diffusion Models.
CoRR, 2023

PyGlove: Efficiently Exchanging ML Ideas as Code.
CoRR, 2023

DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Symbolic Discovery of Optimization Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Brainformers: Trading Simplicity for Efficiency.
Proceedings of the International Conference on Machine Learning, 2023

The Flan Collection: Designing Data and Methods for Effective Instruction Tuning.
Proceedings of the International Conference on Machine Learning, 2023

Least-to-Most Prompting Enables Complex Reasoning in Large Language Models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Self-Consistency Improves Chain of Thought Reasoning in Language Models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Inverse Scaling Can Become U-Shaped.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Symbol tuning improves in-context learning in language models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Transcending Scaling Laws with 0.1% Extra Compute.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Hyperscale Hardware Optimized Neural Architecture Search.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition.
IEEE J. Sel. Top. Signal Process., 2022

Inverse scaling can become U-shaped.
CoRR, 2022

Scaling Instruction-Finetuned Language Models.
CoRR, 2022

Rationale-Augmented Ensembles in Language Models.
CoRR, 2022

Least-to-Most Prompting Enables Complex Reasoning in Large Language Models.
CoRR, 2022

Resource-Constrained Neural Architecture Search on Tabular Datasets.
CoRR, 2022

Revisiting Multi-Scale Feature Fusion for Semantic Segmentation.
CoRR, 2022

Self-Consistency Improves Chain of Thought Reasoning in Language Models.
CoRR, 2022

DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection.
CoRR, 2022

Chain of Thought Prompting Elicits Reasoning in Large Language Models.
CoRR, 2022

LaMDA: Language Models for Dialog Applications.
CoRR, 2022

The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink.
Computer, 2022

G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Mixture-of-Experts with Expert Choice Routing.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

TabNAS: Rejection Sampling for Neural Architecture Search on Tabular Datasets.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Transformer Quality in Linear Time.
Proceedings of the International Conference on Machine Learning, 2022


Finetuned Language Models are Zero-Shot Learners.
Proceedings of the Tenth International Conference on Learning Representations, 2022

DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

A full-stack search technique for domain optimized deep learning accelerators.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021
A graph placement methodology for fast chip design.
Nat., 2021

Combined Scaling for Zero-shot Transfer Learning.
CoRR, 2021

Primer: Searching for Efficient Transformers for Language Modeling.
CoRR, 2021

Program Synthesis with Large Language Models.
CoRR, 2021

A Full-stack Accelerator Search Technique for Vision Applications.
CoRR, 2021

Carbon Emissions and Large Neural Network Training.
CoRR, 2021

SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network.
CoRR, 2021

Searching for Efficient Transformers for Language Modeling.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Pay Attention to MLPs.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

CoAtNet: Marrying Convolution and Attention for All Data Sizes.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

Towards Domain-Agnostic Contrastive Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

EfficientNetV2: Smaller Models and Faster Training.
Proceedings of the 38th International Conference on Machine Learning, 2021

Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision.
Proceedings of the 38th International Conference on Machine Learning, 2021

Evolving Reinforcement Learning Algorithms.
Proceedings of the 9th International Conference on Learning Representations, 2021

Multi-Task Self-Training for Learning General Representations.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

STraTA: Self-Training with Task Augmentation for Better Few-shot Learning.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Meta Pseudo Labels.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Simple Copy-Paste Is a Strong Data Augmentation Method for Instance Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Searching for Fast Model Families on Datacenter Accelerators.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

AutoDropout: Learning Dropout Patterns to Regularize Deep Networks.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Towards Domain-Agnostic Contrastive Learning.
CoRR, 2020

Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition.
CoRR, 2020

Smooth Adversarial Training.
CoRR, 2020

AutoHAS: Differentiable Hyper-parameter and Architecture Search.
CoRR, 2020

Chip Placement with Deep Reinforcement Learning.
CoRR, 2020

Towards a Human-like Open-Domain Chatbot.
CoRR, 2020

Rethinking Pre-training and Self-training.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Unsupervised Data Augmentation for Consistency Training.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

PyGlove: Symbolic Programming for Automated Machine Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Evolving Normalization-Activation Layers.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

RandAugment: Practical Automated Data Augmentation with a Reduced Search Space.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Neural Input Search for Large Scale Recommendation Models.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Improved Noisy Student Training for Automatic Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Go Wide, Then Narrow: Efficient Training of Deep Thin Networks.
Proceedings of the 37th International Conference on Machine Learning, 2020

AutoML-Zero: Evolving Machine Learning Algorithms From Scratch.
Proceedings of the 37th International Conference on Machine Learning, 2020

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators.
Proceedings of the 8th International Conference on Learning Representations, 2020

Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension.
Proceedings of the 8th International Conference on Learning Representations, 2020

Specaugment on Large Scale Datasets.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Pre-Training Transformers as Energy-Based Cloze Models.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Learning Data Augmentation Strategies for Object Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

BigNAS: Scaling up Neural Architecture Search with Big Single-Stage Models.
Proceedings of the Computer Vision - ECCV 2020, 2020

Efficient Scale-Permuted Backbone with Learned Resource Distribution.
Proceedings of the Computer Vision - ECCV 2020, 2020

Improving 3D Object Detection Through Progressive Population Based Augmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Adversarial Examples Improve Image Recognition.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Self-Training With Noisy Student Improves ImageNet Classification.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

EfficientDet: Scalable and Efficient Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

MnasFPN: Learning Latency-Aware Pyramid Architecture for Object Detection on Mobile Devices.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Can Weight Sharing Outperform Random Architecture Search? An Investigation With TuNAS.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Natural Questions: a Benchmark for Question Answering Research.
Trans. Assoc. Comput. Linguistics, 2019

RandAugment: Practical data augmentation with no separate search.
CoRR, 2019

Neural Input Search for Large Scale Recommendation Models.
CoRR, 2019

Selfie: Self-supervised Pretraining for Image Embedding.
CoRR, 2019

Unsupervised Data Augmentation.
CoRR, 2019

Using Videos to Evaluate Image Model Robustness.
CoRR, 2019

NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection.
CoRR, 2019

Soft Conditional Computation.
CoRR, 2019

Mixtape: Breaking the Softmax Bottleneck Efficiently.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

XLNet: Generalized Autoregressive Pretraining for Language Understanding.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

CondConv: Conditionally Parameterized Convolutions for Efficient Inference.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Saccader: Improving Accuracy of Hard Attention Models for Vision.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks.
Proceedings of the 36th International Conference on Machine Learning, 2019

The Evolved Transformer.
Proceedings of the 36th International Conference on Machine Learning, 2019

The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study.
Proceedings of the 36th International Conference on Machine Learning, 2019

Diversity and Depth in Per-Example Routing Models.
Proceedings of the 7th International Conference on Learning Representations, 2019

Searching for MobileNetV3.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Attention Augmented Convolutional Networks.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

MnasNet: Platform-Aware Neural Architecture Search for Mobile.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Do Better ImageNet Models Transfer Better?
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

AutoAugment: Learning Augmentation Strategies From Data.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

MixConv: Mixed Depthwise Convolutional Kernels.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

Transformer-XL: Attentive Language Models beyond a Fixed-Length Context.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

BAM! Born-Again Multi-Task Networks for Natural Language Understanding.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Regularized Evolution for Image Classifier Architecture Search.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Scalable and accurate deep learning with electronic health records.
npj Digit. Medicine, 2018

Domain Adaptive Transfer Learning with Specialist Models.
CoRR, 2018

GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism.
CoRR, 2018

Backprop Evolution.
CoRR, 2018

MnasNet: Platform-Aware Neural Architecture Search for Mobile.
CoRR, 2018

Memory Augmented Policy Optimization for Program Synthesis with Generalization.
CoRR, 2018

Stochastic natural gradient descent draws posterior samples in function space.
CoRR, 2018

A Simple Method for Commonsense Reasoning.
CoRR, 2018

AutoAugment: Learning Augmentation Policies from Data.
CoRR, 2018

Scalable and accurate deep learning for electronic health records.
CoRR, 2018

Neural Program Synthesis with Priority Queue Training.
CoRR, 2018

Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

DropBlock: A regularization method for convolutional networks.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Learning Longer-term Dependencies in RNNs with Auxiliary Losses.
Proceedings of the 35th International Conference on Machine Learning, 2018

Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games?
Proceedings of the 35th International Conference on Machine Learning, 2018

Efficient Neural Architecture Search via Parameter Sharing.
Proceedings of the 35th International Conference on Machine Learning, 2018

Understanding and Simplifying One-Shot Architecture Search.
Proceedings of the 35th International Conference on Machine Learning, 2018

QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension.
Proceedings of the 6th International Conference on Learning Representations, 2018

Learning Longer-term Dependencies in RNNs with Auxiliary Losses.
Proceedings of the 6th International Conference on Learning Representations, 2018

A Bayesian Perspective on Generalization and Stochastic Gradient Descent.
Proceedings of the 6th International Conference on Learning Representations, 2018

Don't Decay the Learning Rate, Increase the Batch Size.
Proceedings of the 6th International Conference on Learning Representations, 2018

Searching for Activation Functions.
Proceedings of the 6th International Conference on Learning Representations, 2018

Faster Discovery of Neural Architectures by Searching for Paths in a Large Model.
Proceedings of the 6th International Conference on Learning Representations, 2018

A Hierarchical Model for Device Placement.
Proceedings of the 6th International Conference on Learning Representations, 2018

Intriguing Properties of Adversarial Examples.
Proceedings of the 6th International Conference on Learning Representations, 2018

Evolving modular neural sequence architectures with genetic programming.
Proceedings of the Genetic and Evolutionary Computation Conference Companion, 2018

AirDialogue: An Environment for Goal-Oriented Dialogue Research.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Semi-Supervised Sequence Modeling with Cross-View Training.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Learning Transferable Architectures for Scalable Image Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation.
Trans. Assoc. Comput. Linguistics, 2017

Don't Decay the Learning Rate, Increase the Batch Size.
CoRR, 2017

Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model.
CoRR, 2017

Large-Scale Evolution of Image Classifiers.
CoRR, 2017

Massive Exploration of Neural Machine Translation Architectures.
CoRR, 2017

Effective Domain Mixing for Neural Machine Translation.
Proceedings of the Second Conference on Machine Translation, 2017

Tacotron: Towards End-to-End Speech Synthesis.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Large-Scale Evolution of Image Classifiers.
Proceedings of the 34th International Conference on Machine Learning, 2017

Device Placement Optimization with Reinforcement Learning.
Proceedings of the 34th International Conference on Machine Learning, 2017

Neural Optimizer Search with Reinforcement Learning.
Proceedings of the 34th International Conference on Machine Learning, 2017

Neural Architecture Search with Reinforcement Learning.
Proceedings of the 5th International Conference on Learning Representations, 2017

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer.
Proceedings of the 5th International Conference on Learning Representations, 2017

Learning a Natural Language Interface with Neural Programmer.
Proceedings of the 5th International Conference on Learning Representations, 2017

HyperNetworks.
Proceedings of the 5th International Conference on Learning Representations, 2017

Latent Sequence Decompositions.
Proceedings of the 5th International Conference on Learning Representations, 2017

Neural Combinatorial Optimization with Reinforcement Learning.
Proceedings of the 5th International Conference on Learning Representations, 2017

Unsupervised Pretraining for Sequence to Sequence Learning.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Learning to Skim Text.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation.
CoRR, 2016

Neural Programmer: Inducing Latent Programs with Gradient Descent.
Proceedings of the 4th International Conference on Learning Representations, 2016

Multi-task Sequence to Sequence Learning.
Proceedings of the 4th International Conference on Learning Representations, 2016

Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision (Short Version).
CoRR, 2016

End-to-end Learning for Text and Speech.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

An Online Sequence-to-Sequence Model Using Partial Conditioning.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Listen, attend and spell: A neural network for large vocabulary conversational speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
A Neural Conversational Model.
CoRR, 2015

Adding Gradient Noise Improves Learning for Very Deep Networks.
CoRR, 2015

A Simple Way to Initialize Recurrent Networks of Rectified Linear Units.
CoRR, 2015

An Online Sequence-to-Sequence Model Using Partial Conditioning.
CoRR, 2015

Document Embedding with Paragraph Vectors.
CoRR, 2015

Listen, Attend and Spell.
CoRR, 2015

Semi-supervised Sequence Learning.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Addressing the Rare Word Problem in Neural Machine Translation.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
Grounded Compositional Semantics for Finding and Describing Images with Sentences.
Trans. Assoc. Comput. Linguistics, 2014

Fastfood: Approximate Kernel Expansions in Loglinear Time.
CoRR, 2014

Sequence to Sequence Learning with Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Distributed Representations of Sentences and Documents.
Proceedings of the 31st International Conference on Machine Learning, 2014

2013
Scalable feature learning.
PhD thesis, 2013

Exploiting Similarities among Languages for Machine Translation.
CoRR, 2013

Using Web Co-occurrence Statistics for Improving Image Categorization.
CoRR, 2013

Fastfood - Computing Hilbert Space Expansions in loglinear time.
Proceedings of the 30th International Conference on Machine Learning, 2013

On rectified linear units for speech processing.
Proceedings of the IEEE International Conference on Acoustics, 2013

Building high-level features using large scale unsupervised learning.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Large Scale Distributed Deep Networks.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Learning invariant features of tumor signatures.
Proceedings of the 9th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2012

Recurrent Neural Networks for Noise Reduction in Robust ASR.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Building high-level features using large scale unsupervised learning.
Proceedings of the 29th International Conference on Machine Learning, 2012

2011
ICA with Reconstruction Cost for Efficient Overcomplete Feature Learning.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

On optimization methods for deep learning.
Proceedings of the 28th International Conference on Machine Learning, 2011

Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010
Bundle Methods for Regularized Risk Minimization.
J. Mach. Learn. Res., 2010

Tiled convolutional neural networks.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Grasping novel objects with depth segmentation.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

Low-cost accelerometers for robotic manipulator perception.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

Learning to grasp objects with multiple contact points.
Proceedings of the IEEE International Conference on Robotics and Automation, 2010

2009
Learning Graph Matching.
IEEE Trans. Pattern Anal. Mach. Intell., 2009

Estimating Labels from Label Proportions.
J. Mach. Learn. Res., 2009

Measuring Invariances in Deep Networks.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Joint calibration of multiple sensors.
Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009

Scalable learning for object detection with GPU hardware.
Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009

High-accuracy 3D sensing for mobile manipulation: Improving object detection and door opening.
Proceedings of the 2009 IEEE International Conference on Robotics and Automation, 2009

Proximal regularization for online and batch learning.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

2008
Tighter Bounds for Structured Estimation.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

2007
Direct Optimization of Ranking Measures.
CoRR, 2007

COFI RANK - Maximum Margin Matrix Factorization for Collaborative Ranking.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Bundle Methods for Machine Learning.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

A scalable modular convex solver for regularized risk minimization.
Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007

Learning Graph Matching.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

2006
Nonparametric Quantile Estimation.
J. Mach. Learn. Res., 2006

Learning to Rank with Nonsmooth Cost Functions.
Proceedings of the Advances in Neural Information Processing Systems 19, 2006

Simpler knowledge-based support vector machines.
Proceedings of the Machine Learning, 2006

Transductive Gaussian Process Regression with Automatic Model Selection.
Proceedings of the Machine Learning: ECML 2006, 2006

2005
Mapping Maintenance for Data Integration Systems.
Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30, 2005

Large-Scale Multiclass Transduction.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Heteroscedastic Gaussian process regression.
Proceedings of the Machine Learning, 2005

