Dimitris S. Papailiopoulos

Affiliations:
  • University of Wisconsin-Madison
  • University of California, Berkeley, AMPLab (former)
  • University of Texas at Austin, Dept. of ECE (former)


According to our database, Dimitris S. Papailiopoulos authored at least 92 papers between 2008 and 2024.

Bibliography

2024
Predictive Pipelined Decoding: A Compute-Latency Trade-off for Exact LLM Decoding.
Trans. Mach. Learn. Res., 2024

Mini-Batch Optimization of Contrastive Loss.
Trans. Mach. Learn. Res., 2024

Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition.
CoRR, 2024

From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data.
CoRR, 2024

How Well Can Transformers Emulate In-context Newton's Method?
CoRR, 2024

Can Mamba Learn How To Learn? A Comparative Study on In-Context Learning Tasks.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

CHAI: Clustered Head Attention for Efficient LLM Inference.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Looped Transformers are Better at Learning Learning Algorithms.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Teaching Arithmetic to Small Transformers.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Dissecting Chain-of-Thought: A Study on Compositional In-Context Learning of MLPs.
CoRR, 2023

The Expressive Power of Tuning Only the Norm Layers.
CoRR, 2023

Transformers as Algorithms: Generalization and Implicit Model Selection in In-context Learning.
CoRR, 2023

Dissecting Chain-of-Thought: Compositionality through In-Context Filtering and Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Cuttlefish: Low-Rank Model Training without All the Tuning.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

Transformers as Algorithms: Generalization and Stability in In-context Learning.
Proceedings of the International Conference on Machine Learning, 2023

Looped Transformers as Programmable Computers.
Proceedings of the International Conference on Machine Learning, 2023

The Expressive Power of Tuning Only the Normalization Layers.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

Prompted LLMs as Chatbot Modules for Long Open-domain Conversation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
A Better Way to Decay: Proximal Gradient Training Algorithms for Neural Nets.
CoRR, 2022

Rare Gems: Finding Lottery Tickets at Initialization.
CoRR, 2022

Rare Gems: Finding Lottery Tickets at Initialization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

LIFT: Language-Interfaced Fine-Tuning for Non-language Machine Learning Tasks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

On the Utility of Gradient Compression in Distributed Training Systems.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

GenLabel: Mixup Relabeling using Generative Models.
Proceedings of the International Conference on Machine Learning, 2022

Permutation-Based SGD: Is Random Optimal?
Proceedings of the Tenth International Conference on Learning Representations, 2022

Utilizing Language-Image Pretraining for Efficient and Robust Bilingual Word Alignment.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Finding Nearly Everything within Random Binary Networks.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Finding Everything within Random Binary Networks.
CoRR, 2021

An Exponential Improvement on the Memorization Capacity of Deep Threshold Networks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Pufferfish: Communication-efficient Models At No Extra Cost.
Proceedings of the Fourth Conference on Machine Learning and Systems, 2021

Adaptive Gradient Communication via Critical Learning Regime Identification.
Proceedings of the Fourth Conference on Machine Learning and Systems, 2021

2020
Machine Learning From Distributed, Streaming Data [From the Guest Editors].
IEEE Signal Process. Mag., 2020

Accordion: Adaptive Gradient Communication via Critical Learning Regime Identification.
CoRR, 2020

Optimal Lottery Tickets via SubsetSum: Logarithmic Over-Parameterization is Sufficient.
CoRR, 2020

Attack of the Tails: Yes, You Really Can Backdoor Federated Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Optimal Lottery Tickets via Subset Sum: Logarithmic Over-Parameterization is Sufficient.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Bad Global Minima Exist and SGD Can Reach Them.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Closing the convergence gap of SGD without replacement.
Proceedings of the 37th International Conference on Machine Learning, 2020

Federated Learning with Matched Averaging.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Convergence and Margin of Adversarial Training on Separable Data.
CoRR, 2019

SysML: The New Frontier of Machine Learning Systems.
CoRR, 2019

ErasureHead: Distributed Gradient Descent without Delays Using Approximate Gradient Coding.
CoRR, 2019

DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Does Data Augmentation Lead to Positive Margin?
Proceedings of the 36th International Conference on Machine Learning, 2019

A Geometric Perspective on the Transferability of Adversarial Directions.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Speeding Up Distributed Machine Learning Using Codes.
IEEE Trans. Inf. Theory, 2018

Coding Theory for Inference, Learning and Optimization (Dagstuhl Seminar 18112).
Dagstuhl Reports, 2018

Gradient Coding via the Stochastic Block Model.
CoRR, 2018

DRACO: Robust Distributed Training via Redundant Gradients.
CoRR, 2018

ATOMO: Communication-efficient Learning via Atomic Sparsification.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

The Effect of Network Width on the Performance of Large-batch Training.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Gradient Coding Using the Stochastic Block Model.
Proceedings of the 2018 IEEE International Symposium on Information Theory, 2018

DRACO: Byzantine-resilient Distributed Training via Redundant Gradients.
Proceedings of the 35th International Conference on Machine Learning, 2018

Stability and Generalization of Learning Algorithms that Converge to Global Optima.
Proceedings of the 35th International Conference on Machine Learning, 2018

Gradient Diversity: a Key Ingredient for Scalable Distributed Learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017
Perturbed Iterate Analysis for Asynchronous Stochastic Optimization.
SIAM J. Optim., 2017

Approximate Gradient Coding via Sparse Random Graphs.
CoRR, 2017

Gradient Diversity Empowers Distributed Learning.
CoRR, 2017

Coded computation for multicore setups.
Proceedings of the 2017 IEEE International Symposium on Information Theory, 2017

2016
Optimal Locally Repairable Codes and Connections to Matroid Theory.
IEEE Trans. Inf. Theory, 2016

Locality and Availability in Distributed Storage.
IEEE Trans. Inf. Theory, 2016

CYCLADES: Conflict-free Asynchronous Machine Learning.
CoRR, 2016

Bipartite Correlation Clustering: Maximizing Agreements.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

2015
On the Worst-Case Approximability of Sparse PCA.
CoRR, 2015

Parallel Correlation Clustering on Big Graphs.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Sparse PCA via Bipartite Matchings.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Orthogonal NMF through Subspace Exploration.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

2014
Locally Repairable Codes.
IEEE Trans. Inf. Theory, 2014

The Sparse Principal Component of a Constant-Rank Matrix.
IEEE Trans. Inf. Theory, 2014

A Repair Framework for Scalar MDS Codes.
IEEE J. Sel. Areas Commun., 2014

Provable deterministic leverage score sampling.
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014

On codes with availability for distributed storage.
Proceedings of the 6th International Symposium on Communications, 2014

Finding Dense Subgraphs via Low-Rank Bilinear Optimization.
Proceedings of the 31st International Conference on Machine Learning, 2014

Nonnegative Sparse PCA with Provable Guarantees.
Proceedings of the 31st International Conference on Machine Learning, 2014

Combinatorial QPs via a low-dimensional subspace sampling.
Proceedings of the 48th Annual Conference on Information Sciences and Systems, 2014

2013
Maximum-Likelihood Noncoherent PAM Detection.
IEEE Trans. Commun., 2013

XORing Elephants: Novel Erasure Codes for Big Data.
Proc. VLDB Endow., 2013

Sparse PCA through Low-rank Approximations.
Proceedings of the 30th International Conference on Machine Learning, 2013

Availability and locality in distributed storage.
Proceedings of the IEEE Global Conference on Signal and Information Processing, 2013

2012
Interference Alignment as a Rank Constrained Rank Minimization.
IEEE Trans. Signal Process., 2012

Feedback in the K-user interference channel.
Proceedings of the 2012 IEEE International Symposium on Information Theory, 2012

Simple regenerating codes: Network coding for cloud storage.
Proceedings of the IEEE INFOCOM 2012, Orlando, FL, USA, March 25-30, 2012, 2012

Maximum-likelihood blind PAM detection.
Proceedings of IEEE International Conference on Communications, 2012

2011
Distributed storage codes through Hadamard designs.
Proceedings of the 2011 IEEE International Symposium on Information Theory Proceedings, 2011

Sparse principal component of a rank-deficient matrix.
Proceedings of the 2011 IEEE International Symposium on Information Theory Proceedings, 2011

Repair optimal erasure codes through Hadamard designs.
Proceedings of the 49th Annual Allerton Conference on Communication, 2011

2010
Maximum-likelihood noncoherent OSTBC detection with polynomial complexity.
IEEE Trans. Wirel. Commun., 2010

Distributed storage codes meet multiple-access wiretap channels.
Proceedings of the 48th Annual Allerton Conference on Communication, 2010

MCMC methods for integer least-squares problems.
Proceedings of the 48th Annual Allerton Conference on Communication, 2010

2008
Polynomial-complexity maximum-likelihood block noncoherent MPSK detection.
Proceedings of the IEEE International Conference on Acoustics, 2008

Efficient computation of the M-phase vector that maximizes a rank-deficient quadratic form.
Proceedings of the 42nd Annual Conference on Information Sciences and Systems, 2008

Efficient maximum-likelihood noncoherent orthogonal STBC detection.
Proceedings of the 46th Annual Allerton Conference on Communication, 2008

