Atli Kosson

According to our database, Atli Kosson authored at least 11 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training.
CoRR, 2024

Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations.
CoRR, 2024

Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Ghost Noise for Regularizing Deep Neural Networks.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Memory Efficient Mixed-Precision Optimizers.
CoRR, 2023

Rotational Optimizers: Simple & Robust DNN Training.
CoRR, 2023

Hardware-Efficient Transformer Training via Piecewise Affine Operations.
CoRR, 2023

Multiplication-Free Transformer Training via Piecewise Affine Operations.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2021
Pipelined Backpropagation at Scale: Training Large Models without Batches.
Proceedings of the Fourth Conference on Machine Learning and Systems, 2021

2020
Adaptive Braking for Mitigating Gradient Delay.
CoRR, 2020

2019
Online Normalization for Training Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

