Michel Steuwer

Orcid: 0000-0001-5048-0741

Affiliations:
  • Technische Universität Berlin, Germany
  • University of Edinburgh, UK (former)
  • University of Glasgow, UK (former)


According to our database, Michel Steuwer authored at least 68 papers between 2011 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Timeline (legend: book, in proceedings, article, PhD thesis, dataset, other)

Bibliography

2024
Shoggoth: A Formal Foundation for Strategic Rewriting.
Proc. ACM Program. Lang., January, 2024

Guided Equality Saturation.
Proc. ACM Program. Lang., January, 2024

Descend: A Safe GPU Systems Programming Language.
Proc. ACM Program. Lang., 2024

Collection skeletons: Declarative abstractions for data collections.
J. Syst. Softw., 2024

The MLIR Transform Dialect. Your compiler is more powerful than you think.
CoRR, 2024

Welcome from the Program Chairs.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2024

A shared compilation stack for distributed-memory parallelism in stencil DSLs.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
Artifact for Shoggoth - A Formal Foundation for Strategic Rewriting.
Dataset, November, 2023

Structural Subtyping as Parametric Polymorphism.
Proc. ACM Program. Lang., October, 2023

Artifact for Shoggoth - A Formal Foundation for Strategic Rewriting.
Dataset, October, 2023

Achieving High Performance the Functional Way: Expressing High-Performance Optimizations as Rewrite Strategies.
Commun. ACM, March, 2023

Primrose: Selecting Container Data Types by Their Properties.
Art Sci. Eng. Program., February, 2023

Sidekick compilation with xDSL.
CoRR, 2023

Traced Types for Safe Strategic Rewriting.
CoRR, 2023

BaCO: A Fast and Portable Bayesian Compiler Optimization Framework.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
RISE & Shine: Language-Oriented Compiler Design.
CoRR, 2022

Systematically extending a high-level code generator with support for tensor cores.
Proceedings of the GPGPU@PPoPP 2022: 14th Workshop on General Purpose Processing Using GPU, 2022

Investigating magic numbers: improving the inlining heuristic in the Glasgow Haskell Compiler.
Proceedings of the Haskell '22: 15th ACM SIGPLAN International Haskell Symposium, Ljubljana, Slovenia, September 15, 2022

Generating Work Efficient Scan Implementations for GPUs the Functional Way.
Proceedings of the Euro-Par 2022: Parallel Processing, 2022

2021
Efficient Auto-Tuning of Parallel Programs with Interdependent Tuning Parameters via Auto-Tuning Framework (ATF).
ACM Trans. Archit. Code Optim., 2021

Sketch-Guided Equality Saturation: Scaling Equality Saturation to Complex Optimizations in Languages with Bindings.
CoRR, 2021

Row-Polymorphic Types for Strategic Rewriting.
CoRR, 2021

Code Generation for Room Acoustics Simulations with Complex Boundary Conditions.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

Generating high performance code for irregular data structures using dependent types.
Proceedings of the FHPNC 2021: 9th ACM SIGPLAN International Workshop on Functional High-Performance and Numerical Computing, 2021

Report from the Artifact Evaluation Committee.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2021

Towards a Domain-Extensible Compiler: Optimizing an Image Processing Pipeline on Mobile CPUs.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2021

Integrating a functional pattern-based IR into MLIR.
Proceedings of the CC '21: 30th ACM SIGPLAN International Conference on Compiler Construction, 2021

2020
Tiling Optimizations for Stencil Computations Using Rewrite Rules in Lift.
ACM Trans. Archit. Code Optim., 2020

Achieving high-performance the functional way: a functional pearl on expressing high-performance optimizations as rewrite strategies.
Proc. ACM Program. Lang., 2020

A Language for Describing Optimization Strategies.
CoRR, 2020

High-level hardware feature extraction for GPU performance prediction of stencils.
Proceedings of the GPGPU@PPoPP '20: 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit colocated with 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

DelayRepay: delayed execution for kernel fusion in Python.
Proceedings of the DLS 2020: 16th ACM SIGPLAN International Symposium on Dynamic Languages, 2020

Generating fast sparse matrix vector multiplication from a high level generic functional IR.
Proceedings of the CC '20: 29th International Conference on Compiler Construction, 2020

2019
High-level synthesis of functional patterns with Lift.
Proceedings of the 6th ACM SIGPLAN International Workshop on Libraries, 2019

Position-dependent arrays and their application for high performance code generation.
Proceedings of the 8th ACM SIGPLAN International Workshop on Functional High-Performance and Numerical Computing, 2019

Generating efficient FFT GPU code with Lift.
Proceedings of the 8th ACM SIGPLAN International Workshop on Functional High-Performance and Numerical Computing, 2019

2018
Introducing Parallelism to the Ranges TS.
Proceedings of the International Workshop on OpenCL, 2018

High performance stencil code generation with lift.
Proceedings of the 2018 International Symposium on Code Generation and Optimization, 2018

Automatic Matching of Legacy Code to Heterogeneous APIs: An Idiomatic Approach.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017
Strategy Preserving Compilation for Parallel Functional Code.
CoRR, 2017

Just-In-Time GPU Compilation for Interpreted Languages with Partial Evaluation.
Proceedings of the 13th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2017

Towards Composable GPU Programming: Programming GPUs with Eager Actions and Lazy Views.
Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, 2017

A Transformation-Based Approach to Developing High-Performance GPU Programs.
Proceedings of the Perspectives of System Informatics, 2017

Lift: a functional data-parallel IR for high-performance GPU code generation.
Proceedings of the 2017 International Symposium on Code Generation and Optimization, 2017

2016
Performance portable GPU code generation for matrix multiplication.
Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit, 2016

Multi-stage programming for GPUs in C++ using PACXX.
Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit, 2016

Matrix multiplication beyond auto-tuning: rewrite-based GPU code generation.
Proceedings of the 2016 International Conference on Compilers, 2016

2015
Patterns and Rewrite Rules for Systematic Code Generation (From High-Level Functional Patterns to High-Performance OpenCL Code).
CoRR, 2015

Autotuning OpenCL Workgroup Size for Stencil Patterns.
CoRR, 2015

Runtime Code Generation and Data Management for Heterogeneous Computing in Java.
Proceedings of the Principles and Practices of Programming on The Java Platform, 2015

Generating performance portable code using rewrite rules: from high-level functional expressions to high-performance OpenCL code.
Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming, 2015

Verbesserung der Programmierbarkeit und Performance-Portabilität von Manycore-Prozessoren.
Proceedings of the Ausgezeichnete Informatikdissertationen 2015, 2015

Improving programmability and performance portability on many-core processors.
PhD thesis, 2015

2014
SkelCL: a high-level extension of OpenCL for multi-GPU systems.
J. Supercomput., 2014

High-Level Programming of Stencil Computations on Multi-GPU Systems Using the SkelCL Library.
Parallel Process. Lett., 2014

Introducing and Implementing the Allpairs Skeleton for Programming Multi-GPU Systems.
Int. J. Parallel Program., 2014

gCUP: rapid GPU-based HIV-1 co-receptor usage prediction for next-generation sequencing.
Bioinform., 2014

A Composable Array Function Interface for Heterogeneous Computing in Java.
Proceedings of the ARRAY'14: 2014 ACM SIGPLAN International Workshop on Libraries, 2014

Towards High-Level Programming for Systems with Many Cores.
Proceedings of the Perspectives of System Informatics, 2014

2013
dOpenCL: Towards uniform programming of distributed heterogeneous multi-/many-core systems.
J. Parallel Distributed Comput., 2013

SkelCL: Enhancing OpenCL for High-Level Programming of Multi-GPU Systems.
Proceedings of the Parallel Computing Technologies - 12th International Conference, 2013

High-Level Programming for Medical Imaging on Multi-GPU Systems Using the SkelCL Library.
Proceedings of the International Conference on Computational Science, 2013

2012
A High-Level Programming Approach for Distributed Systems with Accelerators.
Proceedings of the New Trends in Software Methodologies, Tools and Techniques, 2012

Towards High-Level Programming of Multi-GPU Systems Using the SkelCL Library.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

dOpenCL: Towards a Uniform Programming Approach for Distributed Heterogeneous Multi-/Many-Core Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Uniform High-Level Programming of Many-Core and Multi-GPU Systems.
Proceedings of the Transition of HPC Towards Exascale Computing, 2012

Using the SkelCL Library for High-Level GPU Programming of 2D Applications.
Proceedings of the Euro-Par 2012: Parallel Processing Workshops, 2012

2011
SkelCL - A Portable Skeleton Library for High-Level GPU Programming.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

