Mahesh Ravishankar

Orcid: 0000-0003-4782-6638

According to our database, Mahesh Ravishankar authored at least 16 papers between 2010 and 2022.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2022
TinyIREE: An ML Execution Environment for Embedded Systems From Compilation to Deployment.
IEEE Micro, 2022

Composable and Modular Code Generation in MLIR: A Structured and Retargetable Approach to Tensor Compiler Construction.
CoRR, 2022

Structured Operations: Modular Design of Code Generators for Tensor Compilers.
Proceedings of the Languages and Compilers for Parallel Computing, 2022

2019
Automatic acceleration of Numpy applications on GPUs and multicore CPUs.
CoRR, 2019

2018
Domain-Specific Optimization and Generation of High-Performance GPU Code for Stencil Computations.
Proc. IEEE, 2018

Diesel: DSL for linear algebra and neural net computations on GPUs.
Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages, 2018

2016
Effective resource management for enhancing performance of 2D and 3D stencils on GPUs.
Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit, 2016

Resource Conscious Reuse-Driven Tiling for GPUs.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015
Forma: a DSL for image processing applications to target GPUs and multi-core CPUs.
Proceedings of the 8th Workshop on General Purpose Processing using GPUs, 2015

Distributed memory code generation for mixed Irregular/Regular computations.
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

Fusing convolution kernels through tiling.
Proceedings of the 2nd ACM SIGPLAN International Workshop on Libraries, 2015

2014
Automatic parallelization of a class of irregular loops for distributed memory systems.
ACM Trans. Parallel Comput., 2014

2013
Beyond reuse distance analysis: Dynamic analysis for characterization of data locality potential.
ACM Trans. Archit. Code Optim., 2013

2012
Code generation for parallel execution of a class of irregular loops on distributed memory systems.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Dynamic trace-based analysis of vectorization potential of applications.
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2012

2010
Optimal loop unrolling for GPGPU programs.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010
