Ashwin M. Aji

According to our database, Ashwin M. Aji authored at least 29 papers between 2008 and 2024.

Bibliography

2024
Scalable Training of Graph Foundation Models for Atomistic Materials Modeling: A Case Study with HydraGNN.
CoRR, 2024

2023
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies.
CoRR, 2023


2019
Optimizing Hyperplane Sweep Operations Using Asynchronous Multi-grain GPU Tasks.
Proceedings of the IEEE International Symposium on Workload Characterization, 2019

Adaptive Task Aggregation for High-Performance Sparse Solvers on GPUs.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018
Investigating Data Layout Transformations in Chapel.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Taming irregular applications via advanced dynamic parallelism on GPUs.
Proceedings of the 15th ACM International Conference on Computing Frontiers, 2018

2017
Characterizing data organization effects on heterogeneous memory architectures.
Proceedings of the 2017 International Symposium on Code Generation and Optimization, 2017

2016
MPI-ACC: Accelerator-Aware MPI for Scientific Applications.
IEEE Trans. Parallel Distributed Syst., 2016

MultiCL: Enabling automatic scheduling for task-parallel workloads in OpenCL.
Parallel Comput., 2016

Implementing directed acyclic graphs with the heterogeneous system architecture.
Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit, 2016

2015
Programming High-Performance Clusters with Heterogeneous Computing Devices.
PhD thesis, 2015

Automatic Command Queue Scheduling for Task-Parallel Workloads in OpenCL.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

2013
Synchronization and Ordering Semantics in Hybrid MPI+GPU Programming.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Online Performance Projection for Clusters with Heterogeneous GPUs.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

pVOCL: Power-Aware Dynamic Placement and Migration in Virtualized GPU Environments.
Proceedings of the IEEE 33rd International Conference on Distributed Computing Systems, 2013

On the efficacy of GPU-integrated MPI for scientific applications.
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

Contagion Diffusion with EpiSimdemics.
Parallel Science and Engineering Applications - The Charm++ Approach, 2013

2012
Efficient Intranode Communication in GPU-Accelerated Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Simulating the Spread of Infectious Disease over Large Realistic Social Networks Using Charm++.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

DMA-Assisted, Intranode Communication in GPU Accelerated Systems.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

MPI-ACC: An Integrated and Extensible Approach to Data Movement in Accelerator-based Systems.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

2011
Poster: large-scale computational epidemiology modeling using Charm++.
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011

High-performance biocomputing for simulating the spread of contagion over large contact networks.
Proceedings of the IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences, 2011

Bounding the effect of partition camping in GPU kernels.
Proceedings of the 8th Conference on Computing Frontiers, 2011

2010
GPU-RMAP: Accelerating Short-Read Mapping on Graphics Processors.
Proceedings of the 13th IEEE International Conference on Computational Science and Engineering, 2010

2009
On the Robust Mapping of Dynamic Programming onto a Graphics Processing Unit.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

2008
Cell-SWat: modeling and scheduling wavefront computations on the Cell Broadband Engine.
Proceedings of the 5th Conference on Computing Frontiers, 2008

Optimizing performance, cost, and sensitivity in pairwise sequence search on a cluster of PlayStations.
Proceedings of the 8th IEEE International Conference on Bioinformatics and Bioengineering, 2008

