Esha Choukse

ORCID: 0000-0003-0371-5522

According to our database, Esha Choukse authored at least 26 papers between 2016 and 2024.

Timeline

Publications per year, 2016–2024, by type (book, in proceedings, article, PhD thesis, dataset, other).

Bibliography

2024
DroidSpeak: Enhancing Cross-LLM Communication.
CoRR, 2024

Mnemosyne: Parallelization Strategies for Efficiently Serving Multi-Million Context Length LLM Inference Requests Without Approximations.
CoRR, 2024

Intelligent Router for LLM Workloads: Improving Performance Through Workload-Aware Scheduling.
CoRR, 2024

DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency.
CoRR, 2024

Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference.
CoRR, 2024

Junctiond: Extending FaaS Runtimes with Kernel-Bypass.
CoRR, 2024

Input-Dependent Power Usage in GPUs.
Proceedings of the SC24-W: Workshops of the International Conference for High Performance Computing, 2024

Making Kernel Bypass Practical for the Cloud with Junction.
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024

Mosaic: Harnessing the Micro-Architectural Resources of Servers in Serverless Environments.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

Memory Allocation Under Hardware Compression.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

Designing Cloud Servers for Lower Carbon.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

SmartOClock: Workload- and Risk-Aware Overclocking in the Cloud.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

Splitwise: Efficient Generative LLM Inference Using Phase Splitting.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

DyLeCT: Achieving Huge-page-like Translation Performance for Hardware-compressed Memory.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

Characterizing Power Management Opportunities for LLMs in the Cloud.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
POLCA: Power Oversubscription in LLM Cloud Providers.
CoRR, 2023

Towards Improved Power Management in Cloud GPUs.
IEEE Comput. Archit. Lett., 2023

Myths and Misconceptions Around Reducing Carbon Embedded in Cloud Platforms.
Proceedings of the 2nd Workshop on Sustainable Computer Systems, 2023

2022
Overclocking in Immersion-Cooled Datacenters.
IEEE Micro, 2022

Translation-optimized Memory Compression for Capacity.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

2020
Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

2019
PruneTrain: Gradual Structured Pruning from Scratch for Faster Neural Network Training.
CoRR, 2019

PruneTrain: fast neural network training by dynamic sparse model reconfiguration.
Proceedings of the International Conference for High Performance Computing, 2019

2018
CompressPoints: An Evaluation Methodology for Compressed Memory Systems.
IEEE Comput. Archit. Lett., 2018

Compresso: Pragmatic Main Memory Compression.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

2016
Bit-Plane Compression: Transforming Data for Better Compression in Many-Core Architectures.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

