Tomasz Korbak

ORCID: 0000-0002-6258-2013

According to our database, Tomasz Korbak authored at least 29 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2024
Learning from Natural Language Feedback.
Trans. Mach. Learn. Res., 2024

Aligning language models with human preferences.
CoRR, 2024

Foundational Challenges in Assuring Alignment and Safety of Large Language Models.
CoRR, 2024

Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data.
CoRR, 2024

Towards Understanding Sycophancy in Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Compositional Preference Models for Aligning LMs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A".
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Self-organisation, (M, R)-systems and enactive cognitive science.
Adapt. Behav., February, 2023

Inverse Scaling: When Bigger Isn't Better.
Trans. Mach. Learn. Res., 2023

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback.
Trans. Mach. Learn. Res., 2023

Towards Understanding Sycophancy in Language Models.
CoRR, 2023

Taken out of context: On measuring situational awareness in LLMs.
CoRR, 2023

Training Language Models with Language Feedback at Scale.
CoRR, 2023

Improving Code Generation by Training with Natural Language Feedback.
CoRR, 2023

Models of symbol emergence in communication: a conceptual review and a guide for avoiding local minima.
CoRR, 2023

Pretraining Language Models with Human Preferences.
Proceedings of the International Conference on Machine Learning, 2023

Aligning Language Models with Preferences through f-divergence Minimization.
Proceedings of the International Conference on Machine Learning, 2023

2022
On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Controlling Conditional Language Models without Catastrophic Forgetting.
Proceedings of the International Conference on Machine Learning, 2022

RL with KL penalties is better viewed as Bayesian inference.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Controlling Conditional Language Models with Distributional Policy Gradients.
CoRR, 2021

Energy-Based Models for Code Generation under Compilability Constraints.
CoRR, 2021

Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020
Measuring non-trivial compositionality in emergent communication.
CoRR, 2020

The Emergence of Action-grounded Compositional Communication.
Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, 2020

2019
Developmentally motivated emergence of compositional communication via template transfer.
CoRR, 2019

Exploiting Unsupervised Pre-training and Automated Feature Engineering for Low-resource Hate Speech Detection in Polish.
CoRR, 2019

2017
Fine-tuning Tree-LSTM for phrase-level sentiment classification on a Polish dependency treebank. Submission to PolEval task 2.
CoRR, 2017

Fine-Tuning Tree-LSTM for Phrase-Level Sentiment Classification on a Polish Dependency Treebank.
Proceedings of the Human Language Technology. Challenges for Computer Science and Linguistics, 2017
