Omid Saremi

ORCID: 0000-0001-6718-093X

According to our database, Omid Saremi authored at least 13 papers between 2015 and 2024.

Bibliography

2024
The Slingshot Effect: A Late-Stage Optimization Anomaly in Adaptive Gradient Methods.
Trans. Mach. Learn. Res., 2024

How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks.
CoRR, 2024

How Far Can Transformers Reason? The Locality Barrier and Inductive Scratchpad.
CoRR, 2024

What Algorithms can Transformers Learn? A Study in Length Generalization.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Vanishing Gradients in Reinforcement Finetuning of Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

When can transformers reason with abstract symbols?
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Adaptivity and Modularity for Efficient Generalization Over Task Complexity.
CoRR, 2023

2022
The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon.
CoRR, 2022

2021
Implicit Greedy Rank Learning in Autoencoders via Overparameterized Linear Networks.
CoRR, 2021

Implicit Acceleration and Feature Learning in Infinitely Wide Neural Networks with Bottlenecks.
CoRR, 2021

2015
An Improved Continuous-Action Extended Classifier Systems for Function Approximation.
Proceedings of the Complex Adaptive Systems 2015 Conference, San Jose, 2015

An Improved eXtended Classifier System for the Real-time-input Real-time-output (XCSRR) Stability Control of a Biped Robot.
Proceedings of the Complex Adaptive Systems 2015 Conference, San Jose, 2015
