Piotr Stanczyk

According to our database, Piotr Stanczyk authored at least 12 papers between 2010 and 2024.

Bibliography

2024
Gemma 2: Improving Open Language Models at a Practical Size.
CoRR, 2024

BOND: Aligning LLMs with Best-of-N Distillation.
CoRR, 2024

On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
GKD: Generalized Knowledge Distillation for Auto-regressive Sequence Models.
CoRR, 2023

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2021
RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning.
CoRR, 2021

Launchpad: A Programming Model for Distributed Machine Learning Research.
CoRR, 2021

What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study.
CoRR, 2020

SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference.
Proceedings of the 8th International Conference on Learning Representations, 2020

Google Research Football: A Novel Reinforcement Learning Environment.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2010
Perfect Matching for Biconnected Cubic Graphs in O(n log² n) Time.
Proceedings of the SOFSEM 2010: Theory and Practice of Computer Science, 2010

