Nitish Shirish Keskar

Orcid: 0000-0002-2223-8496

According to our database, Nitish Shirish Keskar authored at least 38 papers between 2015 and 2023.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2023

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.

Aarohi Srivastava, Abhinav Rastogi, […], Abu Awal Md Shoeb, […], Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, […], Alexander W. Kocurek, […], Anantharaman S. Iyer, Anders Andreassen, […], Andrea Santilli, Andreas Stuhlmüller, […], Andrew K. Lampinen, […], Antonio Norelli, […], Arash Gholamidavoodi, […], Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, […], B. Ryan Roberts, […], Bartlomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, […], Bill Yuchen Lin, […], Catherine Stinson, Cedrick Argueta, Cèsar Ferri Ramírez, […], Charles Rathkopf, […], Chris Callison-Burch, […], Christian Voigt, Christopher D. Manning, Christopher Potts, […], Clara E. Rivera, […], Courtney Ashcraft, Cristina Garbacea, […], Daniel Khashabi, […], Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, […], Daphne Ippolito, […], Debajyoti Datta, […], Dimitri Coelho Mollo, […], Ekaterina Shutova, Ekin Dogus Cubuk, […], Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, […], Emanuele Rodolà, […], Ethan J. Jerzak, […], Eunice Engefu Manyasi, Evgenii Zheltonozhskii, […], Fernando Martínez-Plumed, Francesca Happé, François Chollet, […], Genta Indra Winata, […], Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, […], Gonzalo Jaimovitch-López, […], Hana Galijasevic, […], Hannaneh Hajishirzi, […], Hinrich Schütze, […], Jack Geissinger, Jackson Kernion, […], Jaime Fernández Fisac, […], Janelle Wingfield, […], Jascha Sohl-Dickstein, […], Jekaterina Novikova, […], Jonathan Batchelder, Jonathan Berant, […], José Hernández-Orallo, Joseph Boudeman, […], Joshua B. Tenenbaum, […], Karthik Gopalakrishnan, Katerina Ignatyeva, […], Kaustubh D. Dhole, […], Kristen Chiafullo, Ksenia Shkaruta, […], Kyle Richardson, […], Lidia Contreras Ochando, Louis-Philippe Morency, […], Luis Oliveros Colón, […], Lütfi Kerem Senel, […], Maartje ter Hoeve, […], María José Ramírez-Quintana, […], Mario Giulianelli, […], Martin Potthast, Matthew L. Leavitt, […], Mátyás Schubert, Medina Baitemirova, […], Melvin McElrath, […], Michael I. Ivanitskiy, Michael Starritt, […], Michal Swedrowski, Michele Bevilacqua, Michihiro Yasunaga, […], Moin Aminnaseri, […], Mukund Varma T., […], Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, […], Nicole Martinez, […], Niklas Muennighoff, Nitish Shirish Keskar, […], Omar Elbaghdadi, […], Pablo Antonio Moreno Casares, […], Pegah Alipoormolabashi, […], Peter Eckersley, […], Piotr Milkowski, […], Pouya Pezeshkpour, […], Rachel Etta Rudolph, […], Raphaël Millière, […], Robbe Raymaekers, […], Ruslan Salakhutdinov, […], Saif M. Mohammad, […], Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, […], Sarik Ghazarian, […], Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, […], Shashank Srivastava, […], Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, […], Shyamolima (Shammie) Debnath, […], Simon Thormeyer, […], Sneha Priscilla Makini, […], Sriharsha Hatwar, Stanislas Dehaene, […], Stella Biderman, […], Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, […], Tatsu Hashimoto, […], Théo Desbordes, Theodore Rothschild, […], Tiberius Nkinyili, […], Tobias Gerstenberg, […], Trishala Neeraj, […], Victoria Nyamai, […], Vinay V. Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, […], William Saunders, […], Yadollah Yaghoobzadeh, […], Yonatan Belinkov, […]

Trans. Mach. Learn. Res., 2023

2022

Generating Negative Samples for Sequential Recommendation.

[…], Nitish Shirish Keskar, […], Julian J. McAuley, […]

CoRR, 2022

Modeling Multi-hop Question Answering as Single Sequence Prediction.

[…], Kazuma Hashimoto, […], Nitish Shirish Keskar, […]

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021

Combining Data-driven Supervision with Human-in-the-loop Feedback for Entity Resolution.

[…], Shelby Heinecke, […], Nitish Shirish Keskar, […], Stanislav Georgiev, […], Joseph Esposito, […]

CoRR, 2021

Mirostat: a Neural Text decoding Algorithm that directly controls perplexity.

[…], Govardana Sachitanandam Ramachandran, Nitish Shirish Keskar, Lav R. Varshney

Proceedings of the 9th International Conference on Learning Representations, 2021

Unsupervised Paraphrasing with Pretrained Language Models.

[…], Nitish Shirish Keskar, […]

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

GeDi: Generative Discriminator Guided Sequence Generation.

[…], Akhilesh Deepak Gotmare, […], Nitish Shirish Keskar, […], Nazneen Fatema Rajani

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality.

Gustavo Aguilar, […], Nitish Shirish Keskar, […]

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

2020

Unsupervised Paraphrase Generation via Dynamic Blocking.

[…], Nitish Shirish Keskar, […]

CoRR, 2020

Char2Subword: Extending the Subword Embedding Space from Pre-trained Models Using Robust Character Compositionality.

Gustavo Aguilar, […], Nazneen Fatema Rajani, Nitish Shirish Keskar, […]

CoRR, 2020

Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm.

[…], Govardana Sachitanandam Ramachandran, Nitish Shirish Keskar, Lav R. Varshney

CoRR, 2020

ProGen: Language Modeling for Protein Generation.

[…], Nitish Shirish Keskar, […], Raphael R. Eguchi, […]

CoRR, 2020

Improving out-of-distribution generalization via multi-task self-supervised pretraining.

Isabela Albuquerque, […], Nitish Shirish Keskar, […]

CoRR, 2020

Limits of Detecting Text Generated by Large-Scale Language Models.

Lav R. Varshney, Nitish Shirish Keskar, […]

Proceedings of the Information Theory and Applications Workshop, 2020

Simple Data Augmentation with the Mask Token Improves Domain Adaptation for Dialog Act Tagging.

[…], Kazuma Hashimoto, […], Nitish Shirish Keskar, […]

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

The Thieves on Sesame Street are Polyglots - Extracting Multilingual Models from Monolingual APIs.

Nitish Shirish Keskar, […]

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Assessing Local Generalization Capability in Deep Models.

[…], Nitish Shirish Keskar, […]

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019

Balancing Communication and Computation in Distributed Optimization.

Albert S. Berahas, Raghu Bollapragada, Nitish Shirish Keskar, […]

IEEE Trans. Autom. Control., 2019

A limited-memory quasi-Newton algorithm for bound-constrained non-smooth optimization.

Nitish Shirish Keskar, Andreas Wächter

Optim. Methods Softw., 2019

Global Capacity Measures for Deep ReLU Networks via Path Sampling.

[…], Jason M. Klusowski, […], Nitish Shirish Keskar, […]

CoRR, 2019

CTRL: A Conditional Transformer Language Model for Controllable Generation.

Nitish Shirish Keskar, […], Lav R. Varshney, […]

CoRR, 2019

Pretrained AI Models: Performativity, Mobility, and Change.

Lav R. Varshney, Nitish Shirish Keskar, […]

CoRR, 2019

XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering.

[…], Nitish Shirish Keskar, […]

CoRR, 2019

Unifying Question Answering and Text Classification via Span Extraction.

Nitish Shirish Keskar, […]

CoRR, 2019

Coarse-grain Fine-grain Coattention Network for Multi-evidence Question Answering.

[…], Nitish Shirish Keskar, […]

Proceedings of the 7th International Conference on Learning Representations, 2019

A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation.

Akhilesh Gotmare, Nitish Shirish Keskar, […]

Proceedings of the 7th International Conference on Learning Representations, 2019

Neural Text Summarization: A Critical Evaluation.

Wojciech Kryscinski, Nitish Shirish Keskar, […]

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018

Identifying Generalization Properties in Neural Networks.

[…], Nitish Shirish Keskar, […]

CoRR, 2018

The Natural Language Decathlon: Multitask Learning as Question Answering.

[…], Nitish Shirish Keskar, […]

CoRR, 2018

Using Mode Connectivity for Loss Landscape Analysis.

Akhilesh Gotmare, Nitish Shirish Keskar, […]

CoRR, 2018

An Analysis of Neural Language Modeling at Multiple Scales.

[…], Nitish Shirish Keskar, […]

CoRR, 2018

Regularizing and Optimizing LSTM Language Models.

[…], Nitish Shirish Keskar, […]

Proceedings of the 6th International Conference on Learning Representations, 2018

2017

Improving Generalization Performance by Switching from Adam to SGD.

Nitish Shirish Keskar, […]

CoRR, 2017

Weighted Transformer Network for Machine Translation.

[…], Nitish Shirish Keskar, […]

CoRR, 2017

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima.

Nitish Shirish Keskar, Dheevatsa Mudigere, […], Mikhail Smelyanskiy, Ping Tak Peter Tang

Proceedings of the 5th International Conference on Learning Representations, 2017

2016

A second-order method for convex l₁-regularized optimization with active-set prediction.

Nitish Shirish Keskar, […], Figen Öztoprak, Andreas Wächter

Optim. Methods Softw., 2016

adaQN: An Adaptive Quasi-Newton Algorithm for Training RNNs.

Nitish Shirish Keskar, Albert S. Berahas

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2016

2015

A nonmonotone learning rate strategy for SGD training of deep neural networks.

Nitish Shirish Keskar, […]

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
