Rajiv Gupta

ORCID: 0000-0002-9348-3974

Affiliations:
  • University of California, Riverside, CA, USA
  • University of Arizona, Tucson, AZ, USA (1999 - 2007)
  • University of Pittsburgh, PA, USA (1990 - 1999)
  • Philips Laboratories, Briarcliff Manor, NY, USA (1987 - 1990)
  • University of Pittsburgh, PA, USA (PhD 1987)


According to our database, Rajiv Gupta authored at least 295 papers between 1985 and 2024.

Bibliography

2024
Core Graph: Exploiting Edge Centrality to Speedup the Evaluation of Iterative Graph Queries.
Proceedings of the Nineteenth European Conference on Computer Systems, 2024

2023
Graph Analytics on Evolving Data (Abstract).
CoRR, 2023

MEGA Evolving Graph Accelerator.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

OMRGx: Programmable and Transparent Out-of-Core Graph Partitioning and Processing.
Proceedings of the 2023 ACM SIGPLAN International Symposium on Memory Management, 2023

Taming Misaligned Graph Traversals in Concurrent Graph Processing (Abstract).
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing, 2023

CommonGraph: Graph Analytics on Evolving Data (Abstract).
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing, 2023

Expressway: Prioritizing Edges for Distributed Evaluation of Graph Queries.
Proceedings of the IEEE International Conference on Big Data, 2023

Glign: Taming Misaligned Graph Traversals in Concurrent Graph Processing.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

CommonGraph: Graph Analytics on Evolving Data.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
SimGQ+: Simultaneously evaluating iterative point-to-all and point-to-point graph queries.
J. Parallel Distributed Comput., 2022

2021
VRGQ: Evaluating a Stream of Iterative Graph Queries via Value Reuse.
ACM SIGOPS Oper. Syst. Rev., 2021

GO: Out-Of-Core Partitioning of Large Irregular Graphs.
Proceedings of the IEEE International Conference on Networking, Architecture and Storage, 2021

JetStream: Graph Analytics on Streaming Data with Event-Driven Hardware Accelerator.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

DSGEN: concolic testing GPU implementations of concurrent dynamic data structures.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

Tripoline: generalized incremental graph processing via graph triangle inequality.
Proceedings of the EuroSys '21: Sixteenth European Conference on Computer Systems, 2021

G-Morph: Induced Subgraph Isomorphism Search of Labeled Graphs on a GPU.
Proceedings of the Euro-Par 2021: Parallel Processing, 2021

2020
GraphPulse: An Event-Driven Hardware Accelerator for Asynchronous Graph Processing.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

SimGQ: Simultaneously Evaluating Iterative Graph Queries.
Proceedings of the 27th IEEE International Conference on High Performance Computing, 2020

Subway: minimizing data transfer during out-of-GPU-memory graph processing.
Proceedings of the EuroSys '20: Fifteenth EuroSys Conference 2020, 2020

BEAD: Batched Evaluation of Iterative Graph Queries with Evolving Analytics Demands.
Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 2020

2019
DProf: distributed profiler with strong guarantees.
Proc. ACM Program. Lang., 2019

Annotation guided collection of context-sensitive parallel execution profiles.
Formal Methods Syst. Des., 2019

When the Attacker Knows a Lot: The GAGA Graph Anonymizer.
Proceedings of the Information Security - 22nd International Conference, 2019

Figment: Fine-grained Permission Management for Mobile Apps.
Proceedings of the 2019 IEEE Conference on Computer Communications, 2019

Dynamic slicing for Android.
Proceedings of the 41st International Conference on Software Engineering, 2019

White-Box Program Tuning.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

Efficient concolic testing of MPI applications.
Proceedings of the 28th International Conference on Compiler Construction, 2019

MultiLyra: Scalable Distributed Evaluation of Batches of Iterative Graph Queries.
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

Enabling Faster Convergence in Distributed Irregular Graph Processing.
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

PnP: Pruning and Prediction for Point-To-Point Iterative Graph Analytics.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018
Software Speculation on Caching DSMs.
Int. J. Parallel Program., 2018

OMR: out-of-core MapReduce for large data sets.
Proceedings of the 2018 ACM SIGPLAN International Symposium on Memory Management, 2018

COMPI: Concolic Testing for MPI Applications.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Non-intrusively Avoiding Scaling Problems in and out of MPI Collectives.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Droid M+: Developer Support for Imbibing Android's New Permission Model.
Proceedings of the 2018 on Asia Conference on Computer and Communications Security, 2018

2017
Parastack: efficient hang detection for MPI programs at large scale.
Proceedings of the International Conference for High Performance Computing, 2017

Enabling Work-Efficiency for High Performance Vertex-Centric Graph Analytics on GPUs.
Proceedings of the Seventh Workshop on Irregular Applications: Architectures and Algorithms, 2017

Annotation Guided Collection of Context-Sensitive Parallel Execution Profiles.
Proceedings of the Runtime Verification - 17th International Conference, 2017

Where Is the Weakest Link? A Study on Security Discrepancies Between Android Apps and Their Website Counterparts.
Proceedings of the Passive and Active Measurement - 18th International Conference, 2017

CoRAL: Confined Recovery in Distributed Asynchronous Graph Processing.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

KickStarter: Fast and Accurate Computations on Streaming Graphs via Trimmed Approximations.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

2016
Synergistic Analysis of Evolving Graphs.
ACM Trans. Archit. Code Optim., 2016

Tumbler: An Effective Load-Balancing Technique for Multi-CPU Multicore Systems.
ACM Trans. Archit. Code Optim., 2016

Load the Edges You Need: A Generic I/O Optimization for Disk-based Graph Processing.
Proceedings of the 2016 USENIX Annual Technical Conference, 2016

Proving Concurrent Data Structures Linearizable.
Proceedings of the 27th IEEE International Symposium on Software Reliability Engineering, 2016

Eliminating Intra-Warp Load Imbalance in Irregular Nested Patterns via Collaborative Task Engagement.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

CuMAS: Data Transfer Aware Multi-Application Scheduling for Shared GPUs.
Proceedings of the 2016 International Conference on Supercomputing, 2016

Efficient Processing of Large Graphs via Input Reduction.
Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, 2016

Parallel Execution Profiles.
Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, 2016

Automatic fault location for data structures.
Proceedings of the 25th International Conference on Compiler Construction, 2016

Safe and flexible adaptation via alternate data structure representations.
Proceedings of the 25th International Conference on Compiler Construction, 2016

2015
MG++: Memory graphs for analyzing dynamic data structures.
Proceedings of the 22nd IEEE International Conference on Software Analysis, 2015

RAIVE: runtime assessment of floating-point instability by vectorization.
Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, 2015

Efficient warp execution in presence of divergence with collaborative context collection.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

Size Oblivious Programming with InfiniMem.
Proceedings of the Languages and Compilers for Parallel Computing, 2015

Experience report: How do bug characteristics differ across severity classes: A multi-platform study.
Proceedings of the 26th IEEE International Symposium on Software Reliability Engineering, 2015

PeerWave: Exploiting Wavefront Parallelism on GPUs with Peer-SM Synchronization.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

A cross-platform analysis of bugs and bug-fixing in open source projects: desktop vs. Android vs. iOS.
Proceedings of the 19th International Conference on Evaluation and Assessment in Software Engineering, 2015

Predicting concurrency bugs: how many, what kind and where are they?
Proceedings of the 19th International Conference on Evaluation and Assessment in Software Engineering, 2015

Optimizing Caching DSM for Distributed Software Speculation.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

Scalable SIMD-Efficient Graph Processing on GPUs.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

Stadium Hashing: Scalable and Flexible Hashing on GPUs.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

2014
Erratum: A system for debugging via online tracing and dynamic slicing.
Softw. Pract. Exp., 2014

Adapting Graph Application Performance via Alternate Data Structure Representation.
CoRR, 2014

Fence Scoping.
Proceedings of the International Conference for High Performance Computing, 2014

Lock contention aware thread migrations.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

ASPIRE: exploiting asynchronous parallelism in iterative algorithms using a relaxed consistency based DSM.
Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014

LightPlay: Efficient Replay with GPUs.
Proceedings of the Languages and Compilers for Parallel Computing, 2014

Optimistic Parallelism on GPUs.
Proceedings of the Languages and Compilers for Parallel Computing, 2014

ABC2: Adaptively Balancing Computation and Communication in a DSM Cluster of Multicores for Irregular Applications.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

CuSha: vertex-centric graph processing on GPUs.
Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing, 2014

A paradigm shift in GP-GPU computing: task based execution of applications with dynamic data dependencies.
Proceedings of the DIDC'14, 2014

DrDebug: Deterministic Replay based Cyclic Debugging with Dynamic Slicing.
Proceedings of the 12th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2014

Shuffling: a framework for lock contention aware thread scheduling for multicore multiprocessor systems.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013
ADAPT: A framework for coscheduling multithreaded programs.
ACM Trans. Archit. Code Optim., 2013

A dynamic self-scheduling scheme for heterogeneous multiprocessor architectures.
ACM Trans. Archit. Code Optim., 2013

A state alteration and inspection-based interactive debugger.
Proceedings of the 13th IEEE International Working Conference on Source Code Analysis and Manipulation, 2013

Generating sound and effective memory debuggers.
Proceedings of the International Symposium on Memory Management, 2013

Relevant inputs analysis and its applications.
Proceedings of the IEEE 24th International Symposium on Software Reliability Engineering, 2013

Programming Support for Speculative Execution with Software Transactional Memory.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Address-aware fences.
Proceedings of the International Conference on Supercomputing, 2013

Lightweight fault detection in parallelized programs.
Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013

2012
Thread Tranquilizer: Dynamically reducing performance variation.
ACM Trans. Archit. Code Optim., 2012

PLDS: Partitioning linked data structures for parallelism.
ACM Trans. Archit. Code Optim., 2012

Erratum: A system for debugging via online tracing and dynamic slicing.
Softw. Pract. Exp., 2012

A system for debugging via online tracing and dynamic slicing.
Softw. Pract. Exp., 2012

Efficient Sequential Consistency Using Conditional Fences.
Int. J. Parallel Program., 2012

Speculative parallelization on GPGPUs.
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

Effective parallelization of loops in the presence of I/O operations.
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2012

Efficient sequential consistency via conflict ordering.
Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems, 2012

Enhancing LRU replacement via phantom associativity.
Proceedings of the 16th Workshop on Interaction between Compilers and Computer Architectures, 2012

2011
Dynamic access distance driven cache replacement.
ACM Trans. Archit. Code Optim., 2011

Isolating bugs in multithreaded programs using execution suppression.
Softw. Pract. Exp., 2011

Enhanced speculative parallelization via incremental recovery.
Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011

SpiceC: scalable parallelism via implicit copying and explicit commit.
Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011

Thread reinforcer: Dynamically determining number of threads via OS level monitoring.
Proceedings of the 2011 IEEE International Symposium on Workload Characterization, 2011

No More Backstabbing... A Faithful Scheduling Policy for Multithreaded Programs.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010
Execution suppression: An automated iterative technique for locating memory errors.
ACM Trans. Program. Lang. Syst., 2010

Supporting speculative parallelization in the presence of dynamic data structures.
Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation, 2010

Learning universal probabilistic models for fault localization.
Proceedings of the 9th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, 2010

Speculative parallelization using state separation and multiple value prediction.
Proceedings of the 9th International Symposium on Memory Management, 2010

2009
Compiler-Assisted Memory Encryption for Embedded Processors.
Trans. High Perform. Embed. Archit. Compil., 2009

Automated dynamic detection of busy-wait synchronizations.
Softw. Pract. Exp., 2009

Runtime monitoring on multicores via OASES.
ACM SIGOPS Oper. Syst. Rev., 2009

Speculative Parallelization of Sequential Loops on Multicores.
Int. J. Parallel Program., 2009

Architectural support for shadow memory in multiprocessors.
Proceedings of the 5th International Conference on Virtual Execution Environments, 2009

Speculative Optimizations for Parallel Programs on Multicores.
Proceedings of the Languages and Compilers for Parallel Computing, 2009

BugFix: A learning-based tool to assist developers in fixing bugs.
Proceedings of the 17th IEEE International Conference on Program Comprehension, 2009

Self-recovery in server programs.
Proceedings of the 8th International Symposium on Memory Management, 2009

ECMon: exposing cache events for monitoring.
Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

Effective and efficient localization of multiple faults using value replacement.
Proceedings of the 25th IEEE International Conference on Software Maintenance (ICSM 2009), 2009

Detecting virus mutations via dynamic matching.
Proceedings of the 25th IEEE International Conference on Software Maintenance (ICSM 2009), 2009

2008
Copy or Discard execution model for speculative parallelization on multicores.
Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-41 2008), 2008

Dynamic recognition of synchronization operations for improved data race detection.
Proceedings of the ACM/SIGSOFT International Symposium on Software Testing and Analysis, 2008

Support for symmetric shadow memory in multiprocessors.
Proceedings of the 6th Workshop on Parallel and Distributed Systems: Testing, 2008

Fault localization using value replacement.
Proceedings of the ACM/SIGSOFT International Symposium on Software Testing and Analysis, 2008

Scalable dynamic information flow tracking and its applications.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Dynamic slicing of multithreaded programs for race detection.
Proceedings of the 24th IEEE International Conference on Software Maintenance (ICSM 2008), September 28, 2008

Identifying the root causes of memory bugs using corrupted memory location suppression.
Proceedings of the 24th IEEE International Conference on Software Maintenance (ICSM 2008), September 28, 2008

Avoiding Program Failures Through Safe Execution Perturbations.
Proceedings of the 32nd Annual IEEE International Computer Software and Applications Conference, 2008

2007
Introduction to the special LCTES'05 issue.
ACM Trans. Embed. Comput. Syst., 2007

Unified control flow and data dependence traces.
ACM Trans. Archit. Code Optim., 2007

Locating faulty code by multiple points slicing.
Softw. Pract. Exp., 2007

The design and evaluation of path matching schemes on compressed control flow traces.
J. Syst. Softw., 2007

A study of effectiveness of dynamic slicing in locating real faults.
Empir. Softw. Eng., 2007

Towards locating execution omission errors.
Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, 2007

Enabling tracing of long-running multithreaded programs via dynamic execution reduction.
Proceedings of the ACM/SIGSOFT International Symposium on Software Testing and Analysis, 2007

ExPert: Dynamic Analysis Based Fault Location via Execution Perturbations.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

ONTRAC: A system for efficient ONline TRACing for debugging.
Proceedings of the 23rd IEEE International Conference on Software Maintenance (ICSM 2007), 2007

Matching Control Flow of Program Versions.
Proceedings of the 23rd IEEE International Conference on Software Maintenance (ICSM 2007), 2007

Whole Execution Traces and Their Use in Debugging.
Proceedings of the Compiler Design Handbook: Optimizations and Machine Code Generation, 2007

2006
Compressing heap data for improved memory performance.
Softw. Pract. Exp., 2006

Dynamic slicing long running programs through execution fast forwarding.
Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2006

Pruning dynamic slices with confidence.
Proceedings of the ACM SIGPLAN 2006 Conference on Programming Language Design and Implementation, 2006

Temporal Analysis of Routing Activity for Anomaly Detection in Ad hoc Networks.
Proceedings of the IEEE 3rd International Conference on Mobile Adhoc and Sensor Systems, 2006

Locating faults through automated predicate switching.
Proceedings of the 28th International Conference on Software Engineering (ICSE 2006), 2006

2005
Cost and precision tradeoffs of dynamic data slicing algorithms.
ACM Trans. Program. Lang. Syst., 2005

Dynamic coalescing for 16-bit instructions.
ACM Trans. Embed. Comput. Syst., 2005

Whole execution traces and their applications.
ACM Trans. Archit. Code Optim., 2005

Matching execution histories of program versions.
Proceedings of the 10th European Software Engineering Conference held jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2005

Efficient Use of Invisible Registers in Thumb Code.
Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-38 2005), 2005

Locating faulty code using failure-inducing chops.
Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering (ASE 2005), 2005

Data Dependence Based Testability Transformation in Automated Test Generation.
Proceedings of the 16th International Symposium on Software Reliability Engineering (ISSRE 2005), 2005

SENSS: Security Enhancement to Symmetric Shared Memory Multiprocessors.
Proceedings of the 11th International Conference on High-Performance Computer Architecture (HPCA-11 2005), 2005

Exploiting a Computation Reuse Cache to Reduce Energy in Network Processors.
Proceedings of the High Performance Embedded Architectures and Compilers, 2005

Experimental evaluation of using dynamic slices for fault location.
Proceedings of the Sixth International Workshop on Automated Debugging, 2005

Extended Whole Program Paths.
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT 2005), 2005

2004
Frequent value encoding for low power data buses.
ACM Trans. Design Autom. Electr. Syst., 2004

Cost effective dynamic program slicing.
Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation 2004, 2004

Whole Execution Traces.
Proceedings of the 37th Annual International Symposium on Microarchitecture (MICRO-37 2004), 2004

Selective backbone construction for topology control in ad hoc networks.
Proceedings of the 2004 IEEE International Conference on Mobile Ad-hoc and Sensor Systems, 2004

Speculative Subword Register Allocation in Embedded Processors.
Proceedings of the Languages and Compilers for High Performance Computing, 2004

Profile-Guided Java Program Partitioning for Power Aware Computing.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Efficient Forward Computation of Dynamic Slices Using Reduced Ordered Binary Decision Diagrams.
Proceedings of the 26th International Conference on Software Engineering (ICSE 2004), 2004

Extending Path Profiling across Loop Backedges and Procedure Boundaries.
Proceedings of the 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 2004

2003
Algorithms for Supporting Compiled Communication.
IEEE Trans. Parallel Distributed Syst., 2003

Mixed-width instruction sets.
Commun. ACM, 2003

Code Compaction of Matching Single-Entry Multiple-Exit Regions.
Proceedings of the Static Analysis, 10th International Symposium, 2003

Bitwidth aware global register allocation.
Proceedings of the Conference Record of POPL 2003: The 30th SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2003

Enhancing the performance of 16-bit code using augmenting instructions.
Proceedings of the 2003 Conference on Languages, 2003

Precise Dynamic Slicing Algorithms.
Proceedings of the 25th International Conference on Software Engineering, 2003

Enabling Partial Cache Line Prefetching Through Data Compression.
Proceedings of the 32nd International Conference on Parallel Processing (ICPP 2003), 2003

Hiding Program Slices for Software Security.
Proceedings of the 1st IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2003), 2003

Simple offset assignment in presence of subword data.
Proceedings of the International Conference on Compilers, 2003

2002
Frequent value locality and its applications.
ACM Trans. Embed. Comput. Syst., 2002

Dynamic Memory Disambiguation in the Presence of Out-of-order Store Issuing.
J. Instr. Level Parallelism, 2002

Debugging and Testing Optimizers through Comparison Checking.
Proceedings of the Compiler Optimization Meets Compiler Verification, 2002

Energy efficient frequent value data cache design.
Proceedings of the 35th Annual International Symposium on Microarchitecture, 2002

Profile guided selection of ARM and thumb instructions.
Proceedings of the 2002 Joint Conference on Languages, 2002

Path Matching in Compressed Control Flow Trace.
Proceedings of the 2002 Data Compression Conference (DCC 2002), 2002

Data Compression Transformations for Dynamically Allocated Data Structures.
Proceedings of the Compiler Construction, 11th International Conference, 2002

Optimizing Static Power Dissipation by Functional Units in Superscalar Processors.
Proceedings of the Compiler Construction, 11th International Conference, 2002

A Representation for Bit Section Based Analysis and Optimization.
Proceedings of the Compiler Construction, 11th International Conference, 2002

Bit section instruction set extension of ARM for embedded applications.
Proceedings of the International Conference on Compilers, 2002

Profile-Guided Compiler Optimizations.
Proceedings of the Compiler Design Handbook: Optimizations and Machine Code Generation, 2002

Data Flow Testing.
Proceedings of the Compiler Design Handbook: Optimizations and Machine Code Generation, 2002

2001
Performance of Multi-hop Communications Using Logical Topologies on Optical Torus Networks.
J. Parallel Distributed Comput., 2001

Timestamped Whole Program Path Representation and its Applications.
Proceedings of the 2001 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2001

FV encoding for low-power data I/O.
Proceedings of the 2001 International Symposium on Low Power Electronics and Design, 2001

Energy-efficient load and store reuse.
Proceedings of the 2001 International Symposium on Low Power Electronics and Design, 2001

Load and store reuse using register file contents.
Proceedings of the 15th international conference on Supercomputing, 2001

Instruction Wake-Up in Wide Issue Superscalars.
Proceedings of the Euro-Par 2001: Parallel Processing, 2001

SPMD Execution in the Presence of Dynamic Data Structures.
Proceedings of the Compiler Optimizations for Scalable Parallel Systems Languages, 2001

2000
FULLDOC: A Full Reporting Debugger for Optimized Code.
Proceedings of the Static Analysis, 7th International Symposium, 2000

ABCD: eliminating array bounds checks on demand.
Proceedings of the 2000 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2000

Frequent value compression in data caches.
Proceedings of the 33rd Annual IEEE/ACM International Symposium on Microarchitecture, 2000

Load Redundancy Removal through Instruction Reuse.
Proceedings of the 2000 International Conference on Parallel Processing, 2000

Frequent Value Locality and Value-Centric Data Cache Design.
Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IX), 2000

1999
Distributed Path Reservation Algorithms for Multiplexed All-Optical Interconnection Networks.
IEEE Trans. Computers, 1999

Distributed Control Protocols for Wavelength Reservation and their Performance Evaluation.
Photonic Netw. Commun., 1999

Compilation techniques for parallel systems.
Parallel Comput., 1999

Load-Reuse Analysis: Design and Evaluation.
Proceedings of the 1999 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 1999

Value Prediction in VLIW Machines.
Proceedings of the 26th Annual International Symposium on Computer Architecture, 1999

Compiler Analysis to Support Compiled Communication for HPF-Like Programs.
Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

Global Context-Based Value Prediction.
Proceedings of the Fifth International Symposium on High-Performance Computer Architecture, 1999

Comparison Checking: An Approach to Avoid Debugging of Optimized Code.
Proceedings of the Software Engineering, 1999

Register Pressure Sensitive Redundancy Elimination.
Proceedings of the Compiler Construction, 8th International Conference, 1999

Caching and Predicting Branch Sequences for Improved Fetch Effectiveness.
Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques, 1999

1998
Experimental evaluation of on-line techniques for removing monitoring intrusion.
Proceedings of the SIGMETRICS Symposium on Parallel and Distributed Tools, 1998

Complete removal of redundant expressions (with retrospective).
Proceedings of the 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation 1979-1999, 1998

Complete Removal of Redundant Computations.
Proceedings of the ACM SIGPLAN '98 Conference on Programming Language Design and Implementation (PLDI), 1998

Data Flow Analysis Driven Dynamic Data Partitioning.
Proceedings of the Languages, 1998

Integrated Instruction Scheduling and Register Allocation Techniques.
Proceedings of the Languages and Compilers for Parallel Computing, 1998

A Protocol for Removing Communication Intrusion in Monitored Distributed Systems.
Proceedings of the 18th International Conference on Distributed Computing Systems, 1998

Automatic Generation of Microarchitecture Simulators.
Proceedings of the 1998 International Conference on Computer Languages, 1998

Path Profile Guided Partial Redundancy Elimination Using Speculation.
Proceedings of the 1998 International Conference on Computer Languages, 1998

Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks.
Proceedings of the International Conference On Computer Communications and Networks (ICCCN 1998), 1998

A Code Motion Framework for Global Instruction Scheduling.
Proceedings of the Compiler Construction, 7th International Conference, 1998

Superscalar Execution with Direct Data Forwarding.
Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques, 1998

Capturing the Effects of Code Improving Transformations.
Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques, 1998

1997
Hybrid Slicing: Integrating Dynamic Information with Static Analysis.
ACM Trans. Softw. Eng. Methodol., 1997

A Practical Framework for Demand-Driven Interprocedural Data Flow Analysis.
ACM Trans. Program. Lang. Syst., 1997

Demand-Driven Data Flow Analysis for Communication Optimization.
Parallel Process. Lett., 1997

On-line error detection through data duplication in distributed-memory systems.
Microprocess. Microsystems, 1997

Interprocedural Conditional Branch Elimination.
Proceedings of the ACM SIGPLAN '97 Conference on Programming Language Design and Implementation (PLDI), 1997

Partial Dead Code Elimination using Slicing Transformations.
Proceedings of the ACM SIGPLAN '97 Conference on Programming Language Design and Implementation (PLDI), 1997

Does Time-Division Multiplexing Close the Gap between Memory and Optical Communication Speeds?
Proceedings of the Parallel Computer Routing and Communication, 1997

Resource-Sensitive Profile-Directed Data Flow Analysis for Code Optimization.
Proceedings of the Thirtieth Annual IEEE/ACM International Symposium on Microarchitecture, 1997

An Array Data Flow Analysis Based Communication Optimizer.
Proceedings of the Languages and Compilers for Parallel Computing, 1997

Code optimization as a side effect of instruction scheduling.
Proceedings of the Fourth International Conference on High-Performance Computing, 1997

Refining Data Flow Information Using Infeasible Paths.
Proceedings of the Software Engineering, 1997

Path Profile Guided Partial Dead Code Elimination Using Predication.
Proceedings of the 1997 Conference on Parallel Architectures and Compilation Techniques (PACT '97), 1997

1996
Loop Transformations for Fault Detection in Regular Loops on Massively Parallel Systems.
IEEE Trans. Parallel Distributed Syst., 1996

Program Slicing-Based Regression Testing Techniques.
Softw. Test. Verification Reliab., 1996

A Compact Task Graph Representation for Real-Time Scheduling.
Real Time Syst., 1996

Array Data Flow Analysis for Load-Store Optimizations in Fine-Grained Architectures.
Int. J. Parallel Program., 1996

Guaranteed intrusion removal from monitored distributed applications.
Proceedings of the Eighth IEEE Symposium on Parallel and Distributed Processing, 1996

Compiled Communication for All-Optical TDM Networks.
Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, 1996

Integrating Program Optimizations and Transformations with the Scheduling of Instruction Level Parallelism.
Proceedings of the Languages and Compilers for Parallel Computing, 1996

A Demand-Driven Analyzer for Data Flow Testing at the Integration Level.
Proceedings of the 18th International Conference on Software Engineering, 1996

A Timestamp-based Selective Invalidation Scheme for Multiprocessor Cache Coherence.
Proceedings of the 1996 International Conference on Parallel Processing, 1996

Designing a Non-intrusive Monitoring Tool for Developing Complex Distributed Applications.
Proceedings of the 2nd IEEE International Conference on Engineering of Complex Computer Systems (ICECCS '96), 1996

On-Line Avoidance of the Intrusive Effects of Monitoring on Runtime Scheduling Decisions.
Proceedings of the 16th International Conference on Distributed Computing Systems, 1996

Real-Time Scheduling Using Compact Task Graphs.
Proceedings of the 16th International Conference on Distributed Computing Systems, 1996

1995
Loop Monotonic Statements.
IEEE Trans. Software Eng., 1995

Generalized Dominators.
Inf. Process. Lett., 1995

Adaptive loop transformations for scientific programs.
Proceedings of the Seventh IEEE Symposium on Parallel and Distributed Processing, 1995

Hybrid Slicing: An Approach for Refining Static Slices Using Dynamic Information.
Proceedings of the Third ACM SIGSOFT Symposium on Foundations of Software Engineering, 1995

Demand-driven Computation of Interprocedural Data Flow.
Proceedings of the Conference Record of POPL'95: 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1995

Array Data Flow Analysis for Load-Store Optimizations in Superscalar Architectures.
Proceedings of the Languages and Compilers for Parallel Computing, 1995

GURRR: A Global Unified Resource Requirements Representation.
Proceedings of the ACM SIGPLAN Workshop on Intermediate Representations (IR'95), 1995

Priority based data flow testing.
Proceedings of the International Conference on Software Maintenance, 1995

Dynamic Techniques for Minimizing the Intrusive Effect of Monitoring Actions.
Proceedings of the 15th International Conference on Distributed Computing Systems, Vancouver, British Columbia, Canada, May 30, 1995

1994
The Combining DAG: A Technique for Parallel Data Flow Analysis.
IEEE Trans. Parallel Distributed Syst., 1994

Efficient Register Allocation via Coloring Using Clique Separators.
ACM Trans. Program. Lang. Syst., 1994

Exploiting Program Semantics for Efficient Instrumentation of Distributed Event Recognitions.
Proceedings of the 13th Symposium on Reliable Distributed Systems, 1994

Busy-Idle Profiles and Compact Task Graphs: Compile-Time Support for Interleaved and Overlapped Scheduling of Real-Time Tasks.
Proceedings of the 15th IEEE Real-Time Systems Symposium (RTSS '94), 1994

Resource Spackling: A Framework for Integrating Register Allocation in Local and Global Schedulers.
Proceedings of the Parallel Architectures and Compilation Techniques, 1994

A Framework for Partial Data Flow Analysis.
Proceedings of the International Conference on Software Maintenance, 1994

Perturbation Analysis: A Static Analysis Approach for the Non-Intrusive Monitoring of Distributed Programs.
Proceedings of the 1994 International Conference on Parallel Processing, 1994

Debugging Distributed Programs through the Detection of Simultaneous Events.
Proceedings of the 14th International Conference on Distributed Computing Systems, 1994

Reducing the Cost of Data Flow Analysis By Congruence Partitioning.
Proceedings of the Compiler Construction, 5th International Conference, 1994

1993
A Methodology for Controlling the Size of a Test Suite.
ACM Trans. Softw. Eng. Methodol., 1993

Employing Static Information in the Generation of Test Cases.
Softw. Test. Verification Reliab., 1993

Optimizing Array Bound Checks Using Flow Analysis.
LOPLAS, 1993

A Practical Data Flow Framework for Array Reference Analysis and its Use in Optimizations.
Proceedings of the ACM SIGPLAN'93 Conference on Programming Language Design and Implementation (PLDI), 1993

Towards a Non-Intrusive Approach for Monitoring Distributed Computations through Perturbation Analysis.
Proceedings of the Languages and Compilers for Parallel Computing, 1993

URSA: A Unified ReSource Allocator for Registers and Functional Units in VLIW Architectures.
Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism, 1993

Compilation Techniques for Optimizing Communication on Distributed-Memory Systems.
Proceedings of the 1993 International Conference on Parallel Processing, 1993

1992
Synchronization and Communication Costs of Loop Partitioning on Shared-Memory Multiprocessor Systems.
IEEE Trans. Parallel Distributed Syst., 1992

Compiler Support for Object-Oriented Real-Time Software.
IEEE Softw., 1992

SPMD Execution of Programs with Pointer-Based Data Structures on Distributed Memory Machines.
J. Parallel Distributed Comput., 1992

Exploiting parallelism on a fine-grained MIMD architecture based upon channel queues.
Int. J. Parallel Program., 1992

Techniques for Integrating Parallelizing Transformations and Compiler-Based Scheduling Methods.
Proceedings of Supercomputing '92, 1992

Generalized Dominators and Post-Dominators.
Proceedings of the Conference Record of the Nineteenth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1992

A shape matching approach for scheduling fine-grained parallelism.
Proceedings of the 25th Annual International Symposium on Microarchitecture, 1992

Distributed Slicing and Partial Re-execution for Distributed Programs.
Proceedings of the Languages and Compilers for Parallel Computing, 1992

The Combining DAG: A Technique for Parallel Data Flow Analysis.
Proceedings of the 6th International Parallel Processing Symposium, 1992

Automatic Generation of a Compact Test Suite.
Proceedings of the Algorithms, Software, Architecture, 1992

An approach to regression testing using slicing.
Proceedings of the Conference on Software Maintenance, 1992

SPMD execution of programs with dynamic data structures on distributed memory machines.
Proceedings of the ICCL'92, 1992

Register Pipelining: An Integrated Approach to Register Allocation for Scalar and Subscripted Variables.
Proceedings of the Compiler Construction, 1992

1991
Compile-Time Techniques for Improving Scalar Access Performance in Parallel Memories.
IEEE Trans. Parallel Distributed Syst., 1991

Executing Loops on a Fine-Grained MIMD Architecture.
Proceedings of the 24th Annual IEEE/ACM International Symposium on Microarchitecture, 1991

Loop Monotonic Computations: An Approach for the Efficient Run-Time Detection of Races.
Proceedings of the Symposium on Testing, Analysis, and Verification, 1991

1990
Region Scheduling: An Approach for Detecting and Redistributing Parallelism.
IEEE Trans. Software Eng., 1990

Debugging Code Reorganized by a Trace Scheduling Compiler.
Struct. Program., 1990

High speed synchronization of processors using fuzzy barriers.
Int. J. Parallel Program., 1990

Achieving low cost synchronization in a multiprocessor system.
Future Gener. Comput. Syst., 1990

The design of a RISC based multiprocessor chip.
Proceedings of Supercomputing '90, New York, NY, USA, November 12-16, 1990

Improving instruction cache behavior by reducing cache pollution.
Proceedings of Supercomputing '90, New York, NY, USA, November 12-16, 1990

Loop displacement: an approach for transforming and scheduling loops for parallel execution.
Proceedings of Supercomputing '90, New York, NY, USA, November 12-16, 1990

Applying Compiler Techniques to Scheduling in Real-Time Systems.
Proceedings of the Real-Time Systems Symposium, 1990

Employing Register Channels for the Exploitation of Instruction Level Parallelism.
Proceedings of the Second ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), 1990

A Fresh Look at Optimizing Array Bound Checking.
Proceedings of the ACM SIGPLAN'90 Conference on Programming Language Design and Implementation (PLDI), 1990

A fine-grained MIMD architecture based upon register channels.
Proceedings of the 23rd Annual Workshop and Symposium on Microprogramming and Microarchitecture, 1990

Opportunistic Evaluation of Communication Link Loads.
Proceedings of the 10th International Conference on Distributed Computing Systems (ICDCS 1990), May 28, 1990

1989
Compilation techniques for a reconfigurable LIW architecture.
J. Supercomput., 1989

A scalable implementation of barrier synchronization using an adaptive combining tree.
Int. J. Parallel Program., 1989

Register Allocation via Clique Separators.
Proceedings of the ACM SIGPLAN'89 Conference on Programming Language Design and Implementation (PLDI), 1989

The Fuzzy Barrier: A Mechanism for High Speed Synchronization of Processors.
Proceedings of ASPLOS-III, 1989

1988
Compile-time Techniques for Efficient Utilization of Parallel Memories.
Proceedings of the ACM/SIGPLAN PPEALS 1988, 1988

1987
A Reconfigurable LIW Architecture.
Proceedings of the International Conference on Parallel Processing, 1987

1986
SHAPE: a highly adaptable and parallel system.
Proceedings of the 14th ACM Annual Conference on Computer Science, 1986

1985
The efficiency of storage management schemes for Ada programs.
ACM SIGPLAN Notices, 1985
