Rohan Anil

According to our database, Rohan Anil authored at least 33 papers between 2016 and 2024.

Bibliography

2024
A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs.
CoRR, 2024

Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries.
CoRR, 2024

Learning from straggler clients in federated learning.
CoRR, 2024

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
CoRR, 2024

Combining Axes Preconditioners through Kronecker Approximation for Deep Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Layerwise Bregman Representation Learning of Neural Networks with Applications to Knowledge Distillation.
Trans. Mach. Learn. Res., 2023

Gemini: A Family of Highly Capable Multimodal Models.
CoRR, 2023

Heterogeneous Federated Learning Using Knowledge Codistillation.
CoRR, 2023

Benchmarking Neural Network Training Algorithms.
CoRR, 2023

PaLM 2 Technical Report.
CoRR, 2023

Sketchy: Memory-efficient Adaptive Regularization with Frequent Directions.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Computationally Efficient Sparsified Online Newton Method.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022
Layerwise Bregman Representation Learning with Applications to Knowledge Distillation.
CoRR, 2022

N-Grammer: Augmenting Transformers with latent n-grams.
CoRR, 2022

Learning from Randomly Initialized Neural Network Features.
CoRR, 2022

Step-size Adaptation Using Exponentiated Gradient Updates.
CoRR, 2022

On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models.
Proceedings of the 5th Workshop on Online Recommender Systems and User Modeling co-located with the 16th ACM Conference on Recommender Systems, 2022

Large-Scale Differentially Private BERT.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Knowledge distillation: A good teacher is patient and consistent.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

LocoProp: Enhancing BackProp via Local Loss Optimization.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across Batch Sizes.
CoRR, 2021

Efficiently Identifying Task Groupings for Multi-Task Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020
Measuring and Harnessing Transference in Multi-Task Learning.
CoRR, 2020

Disentangling Adaptive Gradient Methods from Learning Rates.
CoRR, 2020

Second Order Optimization Made Practical.
CoRR, 2020

Stochastic Optimization with Laggard Data Pipelines.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling.
CoRR, 2019

Memory-Efficient Adaptive Optimization for Large-Scale Learning.
CoRR, 2019

Memory Efficient Adaptive Optimization.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Robust Bi-Tempered Logistic Loss Based on Bregman Divergences.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank.
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019

2018
Large scale distributed neural network training through online distillation.
Proceedings of the 6th International Conference on Learning Representations, 2018

2016
Wide & Deep Learning for Recommender Systems.
Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, 2016

