Sainbayar Sukhbaatar

According to our database, Sainbayar Sukhbaatar authored at least 49 papers between 2011 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Thinking LLMs: General Instruction Following with Thought Generation.
CoRR, 2024

Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces.
CoRR, 2024

Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge.
CoRR, 2024

Following Length Constraints in Instructions.
CoRR, 2024

Contextual Position Encoding: Learning to Count What's Important.
CoRR, 2024

Iterative Reasoning Preference Optimization.
CoRR, 2024

Reverse Training to Nurse the Reversal Curse.
CoRR, 2024

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM.
CoRR, 2024

Teaching Large Language Models to Reason with Reinforcement Learning.
CoRR, 2024

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping.
CoRR, 2024

Self-Rewarding Language Models.
CoRR, 2024

Self-Rewarding Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023
Some things are more CRINGE than others: Preference Optimization with the Pairwise Cringe Loss.
CoRR, 2023

System 2 Attention (is something you might need too).
CoRR, 2023

Improving Open Language Models by Learning from Organic Interactions.
CoRR, 2023

Large Language Model Programs.
CoRR, 2023

Think Before You Act: Unified Policy for Interleaving Language Reasoning with Actions.
CoRR, 2023

MINOTAUR: Multi-task Video Grounding From Multimodal Queries.
CoRR, 2023

Learning to Reason and Memorize with Self-Notes.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

The CRINGE Loss: Learning what language not to model.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

A Data Source for Reasoning Embodied Agents.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Walk the Random Walk: Learning to Discover and Reach Goals Without Supervision.
CoRR, 2022

Temporal abstractions-augmented temporally contrastive learning: An alternative to the Laplacian in RL.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

Staircase Attention for Recurrent Processing of Sequences.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Memory-Augmented Reinforcement Learning for Image-Goal Navigation.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

Director: Generator-Classifiers For Supervised Language Modeling.
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022

Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping.
Proceedings of the Conference on Robot Learning, 2022

2021
Hash Layers For Large Sparse Models.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Not All Memories are Created Equal: Learning to Forget by Expiring.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
Learning to Visually Navigate in Photorealistic Environments Without any Supervision.
CoRR, 2020

Accessing Higher-level Representations in Sequential Transformers with Feedback Memory.
CoRR, 2020

2019
Augmenting Self-attention with Persistent Memory.
CoRR, 2019

Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks.
Proceedings of the 7th International Conference on Learning Representations, 2019

Adaptive Attention Span in Transformers.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Training Hybrid Language Models by Marginalizing over Segmentations.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Elements of Intelligence: Memory, Communication and Intrinsic Motivation.
PhD thesis, 2018

Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning.
CoRR, 2018

Planning with Arithmetic and Geometric Attributes.
CoRR, 2018

Composable Planning with Attributes.
Proceedings of the 35th International Conference on Machine Learning, 2018

Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play.
CoRR, 2017

2016
Learning Multiagent Communication with Backpropagation.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

2015
Simple Baseline for Visual Question Answering.
CoRR, 2015

Weakly Supervised Memory Networks.
CoRR, 2015

MazeBase: A Sandbox for Learning from Games.
CoRR, 2015

Learning from Noisy Labels with Deep Neural Networks.
Proceedings of the 3rd International Conference on Learning Representations, 2015

End-To-End Memory Networks.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

2013
Auto-pooling: Learning to Improve Invariance of Image Features from Image Sequences.
Proceedings of the 1st International Conference on Learning Representations, 2013

2011
Robust Generation of Dynamical Patterns in Human Motion by a Deep Belief Nets.
Proceedings of the 3rd Asian Conference on Machine Learning, 2011

