Keshav Pingali

Orcid: 0000-0002-0484-4636

According to our database, Keshav Pingali authored at least 207 papers between 1985 and 2024.

Awards

IEEE Fellow 2010, "For contributions to compilers and parallel computing".

Bibliography

2024
A Deep Dive into Task-Based Parallelism in Python.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Modeling Tsunami Waves at the Coastline of Valparaiso Area of Chile with Physics Informed Neural Networks.
Proceedings of the Computational Science - ICCS 2024, 2024

Kimbap: A Node-Property Map System for Distributed Graph Analytics.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
Fast parallel IGA-ADS solver for time-dependent Maxwell's equations.
Comput. Math. Appl., December, 2023

Physics Informed Neural Network Code for 2D Transient Problems (PINN-2DT) Compatible with Google Colab.
CoRR, 2023

2022
Tuning three-dimensional tumor progression simulations on a cluster of GPGPUs.
J. Comput. Appl. Math., 2022

Supermodeling, a convergent data assimilation meta-procedure used in simulation of tumor progression.
Comput. Math. Appl., 2022

Parla: A Python Orchestration System for Heterogeneous Architectures.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

HyperNetVec: Fast and Scalable Hierarchical Embedding for Hypergraphs.
Proceedings of the Network Science - 7th International Winter Conference, 2022

A Simple, Fast, and GPU-friendly Steiner-Tree Heuristic.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

SPRoute 2.0: A detailed-routability-driven deterministic parallel global router with soft capacity.
Proceedings of the 27th Asia and South Pacific Design Automation Conference, 2022

2021
CuSP: A Customizable Streaming Edge Partitioner for Distributed Graph Analytics.
ACM SIGOPS Oper. Syst. Rev., 2021

Parallel graph-grammar-based algorithm for the longest-edge refinement of triangular meshes and the pollution simulations in Lesser Poland area.
Eng. Comput., 2021

An Open-Source EDA Flow for Asynchronous Logic.
IEEE Des. Test, 2021

Sonic: A Sampling-based Online Controller for Streaming Applications.
CoRR, 2021

Optimizing Graph Transformer Networks with Graph-based Techniques.
CoRR, 2021

NetVec: A Scalable Hypergraph Embedding System.
CoRR, 2021

Exploiting Asynchronous Priority Scheduling in Parallel Eikonal Solvers.
CoRR, 2021

BiPart: a parallel and deterministic hypergraph partitioner.
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

Sandslash: a two-level framework for efficient graph pattern mining.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

2020
Groute: Asynchronous Multi-GPU Programming Model with Applications to Large-scale Graph Processing.
ACM Trans. Parallel Comput., 2020

Single Machine Graph Analytics on Massive Datasets Using Intel Optane DC Persistent Memory.
Proc. VLDB Endow., 2020

Pangolin: An Efficient and Flexible Graph Mining System on CPU and GPU.
Proc. VLDB Endow., 2020

BiPart: A Parallel and Deterministic Multilevel Hypergraph Partitioner.
CoRR, 2020

A Fine-Grained Hybrid CPU-GPU Algorithm for Betweenness Centrality Computations.
CoRR, 2020

A Study of Graph Analytics for Massive Datasets on Distributed Multi-GPUs.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

A Study of APIs for Graph Analytics Workloads.
Proceedings of the IEEE International Symposium on Workload Characterization, 2020


Parallel Shared-Memory Isogeometric Residual Minimization (iGRM) for Three-Dimensional Advection-Diffusion Problems.
Proceedings of the Computational Science - ICCS 2020, 2020

Cyclone: A Static Timing and Power Engine for Asynchronous Circuits.
Proceedings of the 26th IEEE International Symposium on Asynchronous Circuits and Systems, 2020

A Methodology for Principled Approximation in Visual SLAM.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2019
Proteus: Language and Runtime Support for Self-Adaptive Software Development.
IEEE Softw., 2019

Derivative grammars: a symbolic approach to parsing with derivatives.
Proc. ACM Program. Lang., 2019

Supermodeling of tumor dynamics with parallel isogeometric analysis solver.
CoRR, 2019

An Adaptive Load Balancer For Graph Analytical Applications on GPUs.
CoRR, 2019

An elementary introduction to Kalman filtering.
Commun. ACM, 2019

Hypergraph grammar based multi-thread multi-frontal direct solver with Galois scheduler.
Comput. Sci., 2019

A round-efficient distributed betweenness centrality algorithm.
Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019

GrAPL Keynote 1.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019

SPRoute: A Scalable Parallel Negotiation-based Global Router.
Proceedings of the International Conference on Computer-Aided Design, 2019

DistTC: High Performance Distributed Triangle Counting.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

Phoenix: A Substrate for Resilient Distributed Graph Analytics.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

SLAMBooster: An Application-Aware Online Controller for Approximation in Dense SLAM.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

Gluon-Async: A Bulk-Asynchronous System for Distributed and Heterogeneous Graph Analytics.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018
A Study of Partitioning Policies for Graph Analytics on Large-scale Distributed Platforms.
Proc. VLDB Endow., 2018

SLAMBooster: An Application-aware Controller for Approximation in SLAM.
CoRR, 2018

Applications of A Hyper-Graph Grammar System in Adaptive Finite-Element Computations.
Int. J. Appl. Math. Comput. Sci., 2018

Gluon: a communication-optimizing substrate for distributed heterogeneous graph analytics.
Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2018

A Lightweight Communication Runtime for Distributed Graph Analytics.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Unlocking fine-grain parallelism for AIG rewriting.
Proceedings of the International Conference on Computer-Aided Design, 2018

Abelian: A Compiler for Graph Analytics on Distributed, Heterogeneous Platforms.
Proceedings of the Euro-Par 2018: Parallel Processing, 2018

Can Parallel Programming Revolutionize EDA Tools?
Proceedings of the Advanced Logic Synthesis, 2018

2017
IGA-ADS: Isogeometric analysis FEM using ADS solver.
Comput. Phys. Commun., 2017

Dynamic Load Balancing Strategies for Graph Applications on GPUs.
CoRR, 2017

Capri: A Control System for Approximate Programs.
CoRR, 2017

Groute: An Asynchronous Multi-GPU Programming Model for Irregular Computations.
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

Parallel triangle counting and k-truss identification using graph-centric methods.
Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

What Scalable Programs Need from Transactional Memory.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

2016
Adaptive Work-Efficient Connected Components on the GPU.
CoRR, 2016

Lowering IrGL to CUDA.
CoRR, 2016

Parallel graph analytics.
Commun. ACM, 2016

DSMR: a shared and distributed memory algorithm for single-source shortest path problem.
Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2016

A compiler for throughput optimization of graph algorithms on GPUs.
Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, 2016

Synchronization Trade-Offs in GPU Implementations of Graph Algorithms.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

DSMR: A Parallel Algorithm for Single-Source Shortest Path Problem.
Proceedings of the 2016 International Conference on Supercomputing, 2016

Hypergraph Grammars in Non-stationary hp-adaptive Finite Element Method.
Proceedings of the International Conference on Computational Science 2016, 2016

Hybrid Direct and Iterative Solver with Library of Multi-criteria Optimal Orderings for h Adaptive Finite Element Method Computations.
Proceedings of the International Conference on Computational Science 2016, 2016

Proactive Control of Approximate Programs.
Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, 2016

2015
Introduction to the Special Issue on PPoPP'12.
ACM Trans. Parallel Comput., 2015

Quasi-Optimal Elimination Trees for 2D Grids with Singularities.
Sci. Program., 2015

Scaling Runtimes for Irregular Algorithms to Large-Scale NUMA Systems.
Computer, 2015

Parallel program = operator + schedule + parallel data structure.
Proceedings of the 2015 International Conference on Embedded Computer Systems: Architectures, 2015

Stochastic gradient descent on GPUs.
Proceedings of the 8th Workshop on General Purpose Processing using GPUs, 2015

Synthesizing parallel graph programs via automated planning.
Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2015

Automatic Tuning of Task Scheduling Policies on Multicore Architectures.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Telescopic Hybrid Fast Solver for 3D Elliptic Problems with Point Singularities.
Proceedings of the International Conference on Computational Science, 2015

Scalable Data-Driven PageRank: Algorithms, System Issues, and Lessons Learned.
Proceedings of the Euro-Par 2015: Parallel Processing, 2015

Priority Queues Are Not Good Concurrent Priority Schedulers.
Proceedings of the Euro-Par 2015: Parallel Processing, 2015

A Graphical Model for Context-Free Grammar Parsing.
Proceedings of the Compiler Construction - 24th International Conference, 2015

Kinetic Dependence Graphs.
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015

2014
Brief announcement: parallelization of asynchronous variational integrators for shared memory architectures.
Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures, 2014

Parallelization of Reordering Algorithms for Bandwidth and Wavefront Reduction.
Proceedings of the International Conference for High Performance Computing, 2014

High-speed graph analytics with the galois system.
Proceedings of the first workshop on Parallel programming for analytics applications, 2014

Author retrospective for synthesizing transformations for locality enhancement of imperfectly-nested loop nests.
Proceedings of the ACM International Conference on Supercomputing 25th Anniversary Volume, 2014

Graph Grammar based Multi-thread Multi-frontal Direct Solver with Galois Scheduler.
Proceedings of the International Conference on Computational Science, 2014

Deterministic galois: on-demand, portable and parameterless.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014

Adaptive heterogeneous scheduling for integrated GPUs.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013
A lightweight infrastructure for graph analytics.
Proceedings of the ACM SIGOPS 24th Symposium on Operating Systems Principles, 2013

Betweenness centrality: algorithms and implementations.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013

Morph algorithms on GPUs.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013

Data-Driven Versus Topology-driven Irregular Computations on GPUs.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Atomic-free irregular computations on GPUs.
Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, 2013

2012
Parallelizing SuperFine.
Proceedings of the ACM Symposium on Applied Computing, 2012

A GPU implementation of inclusion-based points-to analysis.
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

Elixir: a system for synthesizing concurrent graph programs.
Proceedings of the 27th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2012

Parallel Clustered Low-Rank Approximation of Graphs and Its Application to Link Prediction.
Proceedings of the Languages and Compilers for Parallel Computing, 2012

A quantitative study of irregular programs on GPUs.
Proceedings of the 2012 IEEE International Symposium on Workload Characterization, 2012

Processor Allocation for Optimistic Parallelization of Irregular Programs.
Proceedings of the Computational Science and Its Applications - ICCSA 2012, 2012

2011
Locality of Reference and Parallel Processing.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Brief announcement: processor allocation for optimistic parallelization of irregular programs.
Proceedings of the SPAA 2011: Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2011

Ordered vs. unordered: a comparison of parallelism and work-efficiency in irregular algorithms.
Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011

A shape analysis for optimizing parallel graph programs.
Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2011

Parallelizing irregular algorithms: a pattern language.
Proceedings of the 18th Conference on Pattern Languages of Programs, 2011

The tao of parallelism in algorithms.
Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, 2011

Exploiting the commutativity lattice.
Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, 2011

Synthesizing concurrent schedulers for irregular algorithms.
Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems, 2011

2010
La prossima vita at TOPLAS.
ACM Trans. Program. Lang. Syst., 2010

La dolce vita at TOPLAS.
ACM Trans. Program. Lang. Syst., 2010

Programming Multicores: Do Applications Programmers Need to Write Explicitly Parallel Programs?
IEEE Micro, 2010

Structure-driven optimizations for amorphous data-parallel programs.
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

Parallel inclusion-based points-to analysis.
Proceedings of the 25th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2010

Parallel Graph Partitioning on Multicore Architectures.
Proceedings of the Languages and Compilers for Parallel Computing, 2010

Towards a science of parallel programming.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

Ordered and unordered algorithms for parallel breadth first search.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2009
Remembrances of things past.
ACM Trans. Program. Lang. Syst., 2009

Optimistic parallelism requires abstractions.
Commun. ACM, 2009

Compiler research: the next 50 years.
Commun. ACM, 2009

How much parallelism is there in irregular applications?
Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2009

Lonestar: A suite of parallel irregular programs.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2009

Compiler-enhanced incremental checkpointing for OpenMP applications.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

09191 Abstracts Collection - Fault Tolerance in High-Performance Computing and Grids.
Proceedings of the Fault Tolerance in High-Performance Computing and Grids, 03.05., 2009

2008
Parallel and Vector Programming Languages.
Proceedings of the Wiley Encyclopedia of Computer Science and Engineering, 2008

An Experimental Study of Self-Optimizing Dense Linear Algebra Software.
Proc. IEEE, 2008

Scheduling strategies for optimistic parallel execution of irregular programs.
Proceedings of the SPAA 2008: Proceedings of the 20th Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2008

On the Scalability of an Automatically Parallelized Irregular Application.
Proceedings of the Languages and Compilers for Parallel Computing, 2008

Data-parallel abstractions for irregular programs.
Proceedings of the 5th Conference on Computing Frontiers, 2008

Optimistic parallelism benefits from data partitioning.
Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, 2008

2007
Editorial: A changing of the guard.
ACM Trans. Program. Lang. Syst., 2007

An experimental comparison of cache-oblivious and cache-conscious programs.
Proceedings of the SPAA 2007: Proceedings of the 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2007

Compiler-Enhanced Incremental Checkpointing.
Proceedings of the Languages and Compilers for Parallel Computing, 2007

Scheduling Issues in Optimistic Parallelization.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

2006
Mobile MPI programs in computational grids.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2006

Is Cache-Oblivious DGEMM Viable?
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Recent advances in checkpoint/recovery systems.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

A distributed system based on web services for computational science simulations.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Experimental evaluation of application-level checkpointing for OpenMP programs.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006

2005
Is Search Really Necessary to Generate High-Performance BLAS?
Proc. IEEE, 2005

Automatic measurement of memory hierarchy parameters.
Proceedings of the International Conference on Measurements and Modeling of Computer Systems, 2005

Automatic Measurement of Instruction Cache Capacity.
Proceedings of the Languages and Compilers for Parallel Computing, 2005

Analytic Models and Empirical Search: A Hybrid Approach to Code Optimization.
Proceedings of the Languages and Compilers for Parallel Computing, 2005

A Language for the Compact Representation of Multiple Program Versions.
Proceedings of the Languages and Compilers for Parallel Computing, 2005

Optimizing Checkpoint Sizes in the C3 System.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Think globally, search locally.
Proceedings of the 19th Annual International Conference on Supercomputing, 2005

2004
A Load Balancing Framework for Adaptive and Asynchronous Applications.
IEEE Trans. Parallel Distributed Syst., 2004

Look Left, Look Right, Look Left Again: An Application of Fractal Symbolic Analysis to Linear Algebra Code Restructuring.
Int. J. Parallel Program., 2004

Implementation and Evaluation of a Scalable Application-Level Checkpoint-Recovery Scheme for MPI Programs.
Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 2004

O'SOAP - A Web Services Framework for DDDAS Applications.
Proceedings of the Computational Science, 2004

Application-level checkpointing for shared memory programs.
Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004

2003
Fractal symbolic analysis.
ACM Trans. Program. Lang. Syst., 2003

Algorithms for computing the static single assignment form.
J. ACM, 2003

Automated application-level checkpointing of MPI programs.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2003

A comparison of empirical and model-driven optimization.
Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation 2003, 2003

C³: A System for Automating Application-Level Checkpointing of MPI Programs.
Proceedings of the Languages and Compilers for Parallel Computing, 2003

Collective operations in application-level fault-tolerant MPI.
Proceedings of the 17th Annual International Conference on Supercomputing, 2003


2002
Data movement and control substrate for parallel adaptive applications.
Concurr. Comput. Pract. Exp., 2002

Next Generation System Software for Future High-End Computing Systems.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

2001
Data-Centric Transformations for Locality Enhancement.
Int. J. Parallel Program., 2001

Synthesizing Transformations for Locality Enhancement of Imperfectly-Nested Loop Nests.
Int. J. Parallel Program., 2001

Topic 04: Compilers for High Performance.
Proceedings of the Euro-Par 2001: Parallel Processing, 2001

Solving Alignment Using Elementary Linear Algebra.
Proceedings of the Compiler Optimizations for Scalable Parallel Systems Languages, 2001

2000
Landing CG on EARTH: A Case Study of Fine-Grained Multithreading on an Evolutionary Path.
Proceedings of Supercomputing 2000, 2000

A Framework for Sparse Matrix Code Synthesis from High-level Specifications.
Proceedings of Supercomputing 2000, 2000

Tiling Imperfectly-Nested Loop Nests.
Proceedings of Supercomputing 2000, 2000

Parallel FEM Simulation of Crack Propagation - Challenges, Status, and Perspectives.
Proceedings of the Parallel and Distributed Processing, 2000

Next-generation generic programming and its application to sparse matrix computations.
Proceedings of the 14th international conference on Supercomputing, 2000

Left-Looking to Right-Looking and Vice Versa: An Application of Fractal Symbolic Analysis to Linear Algebra Code Restructuring.
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

Automatic Generation of Block-Recursive Codes.
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

1999
High-level semantic optimization of numerical codes.
Proceedings of the 13th international conference on Supercomputing, 1999

An experimental evaluation of tiling and shackling for memory hierarchy management.
Proceedings of the 13th international conference on Supercomputing, 1999

A case for source-level transformations in MATLAB.
Proceedings of the Second Conference on Domain-Specific Languages (DSL '99), 1999

1997
Optimal Control Dependence Computation and the Roman Chariots Problem.
ACM Trans. Program. Lang. Syst., 1997

Compiling Parallel Code for Sparse Matrix Applications.
Proceedings of the ACM/IEEE Conference on Supercomputing, 1997

Compiling Parallel Sparse Code for User-Defined Data Structures.
Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

Data-centric Multi-level Blocking.
Proceedings of the ACM SIGPLAN '97 Conference on Programming Language Design and Implementation (PLDI), 1997

Sparse Code Generation for Imperfectly Nested Loops with Dependences.
Proceedings of the 11th international conference on Supercomputing, 1997

Compiler and Run-Time Support for Semi-Structured Applications.
Proceedings of the 11th international conference on Supercomputing, 1997

A Relational Approach to the Compilation of Sparse Matrix Programs.
Proceedings of the Euro-Par '97 Parallel Processing, 1997

Data Movement and Control Substrate for Parallel Scientific Computing.
Proceedings of the Communication and Architectural Support for Network-Based Parallel Computing, 1997

1996
Transformations for Imperfectly Nested Loops.
Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, 1996

Generalized Dominance and Control Dependence.
Proceedings of the ACM SIGPLAN'96 Conference on Programming Language Design and Implementation (PLDI), 1996

1995
APT: A Data Structure for Optimal Control Dependence Computation.
Proceedings of the ACM SIGPLAN'95 Conference on Programming Language Design and Implementation (PLDI), 1995

Automatic Parallelization of the Conjugate Gradient Algorithm.
Proceedings of the Languages and Compilers for Parallel Computing, 1995

1994
Compiling for Distributed Memory Architectures.
IEEE Trans. Parallel Distributed Syst., 1994

A singular loop transformation framework based on non-singular matrices.
Int. J. Parallel Program., 1994

The Program Structure Tree: Computing Control Regions in Linear Time.
Proceedings of the ACM SIGPLAN'94 Conference on Programming Language Design and Implementation (PLDI), 1994

1993
Access Normalization: Loop Restructuring for NUMA Compilers.
ACM Trans. Comput. Syst., 1993

Dependence-Based Program Analysis.
Proceedings of the ACM SIGPLAN'93 Conference on Programming Language Design and Implementation (PLDI), 1993

Register renaming and dynamic speculation: an alternative approach.
Proceedings of the 26th Annual International Symposium on Microarchitecture, 1993

1992
Loop Transformations for NUMA Machines.
Proceedings of the 2nd SIGPLAN Workshop on Languages, Compilers, and Run-Time Environments for Distributed Memory Multiprocessors, Boulder, Colorado, September 30, 1992

Abstract Semantics for a Higher-Order Functional Language with Logic Variables.
Proceedings of the Conference Record of the Nineteenth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1992

1991
A Fully Abstract Semantics for a First-Order Functional Language with Logic Variables.
ACM Trans. Program. Lang. Syst., 1991

Accumulators: New Logic Variable Abstractions for Functional Languages.
Theor. Comput. Sci., 1991

From Control Flow to Dataflow.
J. Parallel Distributed Comput., 1991

Dependence Flow Graphs: An Algebraic Approach to Program Dependencies.
Proceedings of the Conference Record of the Eighteenth Annual ACM Symposium on Principles of Programming Languages, 1991

An Executable Representation of Distance and Direction.
Proceedings of the Languages and Compilers for Parallel Computing, 1991

1990
Static Scheduling for Dynamic Dataflow Machines.
J. Parallel Distributed Comput., 1990

Compiling for Locality.
Proceedings of the 1990 International Conference on Parallel Processing, 1990

From Control Flow to Dataflow.
Proceedings of the 1990 International Conference on Parallel Processing, 1990

1989
I-Structures: Data Structures for Parallel Computing.
ACM Trans. Program. Lang. Syst., 1989

Process Decomposition Through Locality of Reference.
Proceedings of the ACM SIGPLAN'89 Conference on Programming Language Design and Implementation (PLDI), 1989

A Fully Abstract Semantics for a Functional Language with Logic Variables.
Proceedings of the Fourth Annual Symposium on Logic in Computer Science (LICS '89), 1989

1988
Fine-grain compilation for pipelined machines.
J. Supercomput., 1988

Lazy evaluation and the logic variable.
Proceedings of the 2nd international conference on Supercomputing, 1988

1986
Clarification of "Feeding Inputs on Demand" in Efficient Demand-Driven Evaluation - Part 1.
ACM Trans. Program. Lang. Syst., 1986

Efficient Demand-Driven Evaluation - Part 2.
ACM Trans. Program. Lang. Syst., 1986

1985
Efficient Demand-Driven Evaluation - Part 1.
ACM Trans. Program. Lang. Syst., 1985

