David J. Lilja
ORCID: 0000-0003-3785-8206
According to our database, David J. Lilja authored at least 233 papers between 1988 and 2022.
Awards
IEEE Fellow 2006, "For contributions to statistical methodologies for performance assessment of computing systems."
Bibliography
2022
Work-in-Progress: ExpCache: Online-Learning based Cache Replacement Policy for Non-Volatile Memory.
Proceedings of the International Conference on Compilers, 2022
2021
IEEE Trans. Emerg. Top. Comput., 2021
Proceedings of the PEARC '21: Practice and Experience in Advanced Research Computing, 2021
HeuristicDB: a hybrid storage database system using a non-volatile memory block device.
Proceedings of the SYSTOR '21: The 14th ACM International Systems and Storage Conference, 2021
2020
ACM Trans. Model. Perform. Evaluation Comput. Syst., 2020
IEEE Trans. Computers, 2020
Enhancing the Top-Down Microarchitectural Analysis Method Using Purchasing Power Parity Theory.
Proceedings of the Languages and Compilers for Parallel Computing, 2020
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020
AdaEmb-Encoder: Adaptive Embedding Spatial Encoder-Based Deduplication for Backing Up Classifier Training Data.
Proceedings of the 39th IEEE International Performance Computing and Communications Conference, 2020
PBCCF: Accelerated Deduplication by Prefetching Backup Content Correlated Fingerprints.
Proceedings of the 38th IEEE International Conference on Computer Design, 2020
2019
IEEE Trans. Very Large Scale Integr. Syst., 2019
NetStorage: A synchronized trace-driven replayer for network-storage system evaluation.
Perform. Evaluation, 2019
Neural Network Classifiers Using a Hardware-Based Approximate Activation Function with a Hybrid Stochastic Multiplier.
ACM J. Emerg. Technol. Comput. Syst., 2019
ACM J. Emerg. Technol. Comput. Syst., 2019
Proceedings of the 10th IEEE Annual Ubiquitous Computing, 2019
Proceedings of the 20th International Symposium on Quality Electronic Design, 2019
Using DCT-based Approximate Communication to Improve MPI Performance in Parallel Clusters.
Proceedings of the 38th IEEE International Performance Computing and Communications Conference, 2019
HAML-SSD: A Hardware Accelerated Hotness-Aware Machine Learning based SSD Management.
Proceedings of the International Conference on Computer-Aided Design, 2019
Proceedings of the 2019 on Great Lakes Symposium on VLSI, 2019
Energy-Efficient Convolutional Neural Networks with Deterministic Bit-Stream Processing.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019
2018
IEEE Trans. Very Large Scale Integr. Syst., 2018
Approximate Communication: Techniques for Reducing Communication Bottlenecks in Large-Scale Parallel Systems.
ACM Comput. Surv., 2018
Enhancing the Ensemble of Exemplar-SVMs for Binary Classification Using Concurrent Selection and Ensemble Learning.
Proceedings of the 9th IEEE Annual Ubiquitous Computing, 2018
Reducing Relational Database Performance Bottlenecks Using 3D XPoint Storage Technology.
Proceedings of the 17th IEEE International Conference On Trust, 2018
Efficient and Fast Approximate Consensus with Epidemic Failure Detection at Extreme Scale.
Proceedings of the 26th Euromicro International Conference on Parallel, 2018
Tier-Code: An XOR-Based RAID-6 Code with Improved Write and Degraded-Mode Read Performance.
Proceedings of the 2018 IEEE International Conference on Networking, 2018
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Towards Theoretical Cost Limit of Stochastic Number Generators for Stochastic Computing.
Proceedings of the 2018 IEEE Computer Society Annual Symposium on VLSI, 2018
Parallel implementation of finite state machines for reducing the latency of stochastic computing.
Proceedings of the 19th International Symposium on Quality Electronic Design, 2018
Proceedings of the 19th International Symposium on Quality Electronic Design, 2018
TNT: A Solver for Large Dense Least-Squares Problems that Takes Conjugate Gradient from Bad in Theory, to Good in Practice.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018
HyperProtect: Enhancing the Performance of a Dynamic Backup System Using Intelligent Scheduling.
Proceedings of the 37th IEEE International Performance Computing and Communications Conference, 2018
Proceedings of the International Conference on Computer-Aided Design, 2018
2017
IEEE Trans. Very Large Scale Integr. Syst., 2017
IEEE Trans. Computers, 2017
ACM J. Emerg. Technol. Comput. Syst., 2017
IET Comput. Digit. Tech., 2017
TraceRAR: An I/O Performance Evaluation Tool for Replaying, Analyzing, and Regenerating Traces.
Proceedings of the 2017 International Conference on Networking, Architecture, and Storage, 2017
Proceedings of the 18th International Symposium on Quality Electronic Design, 2017
Determining work partitioning on closely coupled heterogeneous computing systems using statistical design of experiments.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017
Kinetic Action: Performance Analysis of Integrated Key-Value Storage Devices vs. LevelDB Servers.
Proceedings of the 23rd IEEE International Conference on Parallel and Distributed Systems, 2017
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017
Neural Network Classifiers Using Stochastic Computing with a Hardware-Oriented Approximate Activation Function.
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017
TNT-NN: A Fast Active Set Method for Solving Large Non-Negative Least Squares Problems.
Proceedings of the International Conference on Computational Science, 2017
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017
Proceedings of the 12th IEEE International Conference on ASIC, 2017
2016
A High-Capacity Separable Reversible Method for Hiding Multiple Messages in Encrypted Images.
CoRR, 2016
Ps-Code: A New Code for Improved Degraded Mode Read and Write Performance of RAID Systems.
Proceedings of the IEEE International Conference on Networking, 2016
Using Stochastic Computing to Reduce the Hardware Requirements for a Restricted Boltzmann Machine Classifier.
Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016
2015
Design space exploration for efficient computing in Solid State drives with the Storage Processing Unit.
Proceedings of the 10th IEEE International Conference on Networking, 2015
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015
A hardware implementation of a radial basis function neural network using stochastic logic.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015
An FPGA implementation of a Restricted Boltzmann Machine classifier using stochastic bit streams.
Proceedings of the 26th IEEE International Conference on Application-specific Systems, 2015
2014
IEEE Trans. Very Large Scale Integr. Syst., 2014
IEEE Trans. Computers, 2014
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014
2013
Int. J. Comput. Sci. Eng., 2013
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013
Exploiting free silicon for energy-efficient computing directly in NAND flash-based solid-state storage systems.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2013
A Stepwise Approach to Software-Hardware Performance Co-optimization Using Design of Experiments.
Proceedings of the 39th International Computer Measurement Group Conference, 2013
A divide-and-conquer approach for solving singular value decomposition on a heterogeneous system.
Proceedings of the Computing Frontiers Conference, 2013
Accelerating the performance of stochastic encoding-based computations by sharing bits in consecutive bit streams.
Proceedings of the 24th International Conference on Application-Specific Systems, 2013
2012
Proceedings of the IEEE 24th International Symposium on Computer Architecture and High Performance Computing, 2012
Proceedings of the Integrated Circuit and System Design. Power and Timing Modeling, 2012
Proceedings of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, 2012
Proceedings of the 30th International IEEE Conference on Computer Design, 2012
A stochastic reconfigurable architecture for fault-tolerant computation with sequential logic.
Proceedings of the 30th International IEEE Conference on Computer Design, 2012
An efficient implementation of numerical integration using logical computation on stochastic bit streams.
Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012
The synthesis of complex arithmetic computation on stochastic bit streams using sequential logic.
Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012
Weighted area technique for electromechanically enabled logic computation with cantilever-based NEMS switches.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012
Romano: autonomous storage management using performance prediction in multi-tenant datacenters.
Proceedings of the ACM Symposium on Cloud Computing, SOCC '12, 2012
The synthesis of linear Finite State Machine-based Stochastic Computational Elements.
Proceedings of the 17th Asia and South Pacific Design Automation Conference, 2012
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
2011
IEEE Trans. Computers, 2011
Fault tolerance for nanotechnology devices at the bit and module levels with history index of correct computation.
IET Comput. Digit. Tech., 2011
Performance analysis of single-phase, multiphase, and multicomponent lattice-Boltzmann fluid flow simulations on GPU clusters.
Concurr. Comput. Pract. Exp., 2011
Sampling-based garbage collection metadata management scheme for flash-based storage.
Proceedings of the IEEE 27th Symposium on Mass Storage Systems and Technologies, 2011
Proceedings of the 2011 International Conference on Distributed Computing Systems, 2011
Proceedings of the IEEE 29th International Conference on Computer Design, 2011
A programmable and scalable technique to design spintronic logic circuits based on magnetic tunnel junctions.
Proceedings of the 21st ACM Great Lakes Symposium on VLSI 2011, 2011
Performing bitwise logic operations in cache using spintronics-based magnetic tunnel junctions.
Proceedings of the 8th Conference on Computing Frontiers, 2011
A low power fault-tolerance architecture for the kernel density estimation based image segmentation algorithm.
Proceedings of the 22nd IEEE International Conference on Application-specific Systems, 2011
2010
Cross-layer speculative architecture for end systems and gateways in computer networks with lossy links.
Wirel. Networks, 2010
Using Resampling Techniques to Compute Confidence Intervals for the Harmonic Mean of Rate-Based Performance Metrics.
IEEE Comput. Archit. Lett., 2010
Proceedings of the IEEE 26th Symposium on Mass Storage Systems and Technologies, 2010
Proceedings of the IEEE 26th Symposium on Mass Storage Systems and Technologies, 2010
Proceedings of the 2010 IEEE International Symposium on Workload Characterization, 2010
Proceedings of the 28th International Conference on Computer Design, 2010
2009
IEEE Trans. Very Large Scale Integr. Syst., 2009
Comput. Geosci., 2009
Proceedings of the 2nd International Conference on Security of Information and Networks, 2009
Proceedings of the 17th Annual Meeting of the IEEE/ACM International Symposium on Modelling, 2009
Proceedings of the ICPP 2009, 2009
Proceedings of the 2009 International Conference on Computer-Aided Design, 2009
Proceedings of the 19th ACM Great Lakes Symposium on VLSI 2009, 2009
Using a Statistical Approach for Optimal Security Parameter Determination.
Proceedings of the 2009 International Conference on Security & Management, 2009
2008
IEEE Trans. Very Large Scale Integr. Syst., 2008
Exploiting the Impact of Database System Configuration Parameters: A Design of Experiments Approach.
IEEE Data Eng. Bull., 2008
Independent Component Analysis and Evolutionary Algorithms for Building Representative Benchmark Subsets.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2008
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Proceedings of the 24th International Conference on Data Engineering Workshops, 2008
Statistically translating low-level error probabilities to increase the accuracy and efficiency of reliability simulations in hardware description languages.
Proceedings of the 18th ACM Great Lakes Symposium on VLSI 2008, 2008
Proceedings of the Design, Automation and Test in Europe, 2008
Archer: A Community Distributed Computing Infrastructure for Computer Architecture Research and Education.
Proceedings of the Collaborative Computing: Networking, 2008
Proceedings of the 5th Conference on Computing Frontiers, 2008
2007
IEEE Trans. Computers, 2007
IEEE Comput. Archit. Lett., 2007
Proceedings of the 20th International Conference on VLSI Design (VLSI Design 2007), 2007
Improving nanoelectronic designs using a statistical approach to identify key parameters in circuit level SEU simulations.
Proceedings of the 2007 IEEE International Symposium on Nanoscale Architectures, 2007
MEMESTAR: A Simulation Framework for Reliability Evaluation over Multiple Environments.
Proceedings of the 8th International Symposium on Quality of Electronic Design (ISQED 2007), 2007
SCRAP: A Statistical Approach for Creating a Database Query Workload Based on Performance Bottlenecks.
Proceedings of the IEEE 10th International Symposium on Workload Characterization, 2007
Analysis of Statistical Sampling in Microarchitecture Simulation: Metric, Methodology and Program Characterization.
Proceedings of the IEEE 10th International Symposium on Workload Characterization, 2007
Exploring subsets of standard cell libraries to exploit natural fault masking capabilities for reliable logic.
Proceedings of the 17th ACM Great Lakes Symposium on VLSI 2007, 2007
Scaling Analytical Models for Soft Error Rate Estimation Under a Multiple-Fault Environment.
Proceedings of the Tenth Euromicro Conference on Digital System Design: Architectures, 2007
Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007
2006
Simulation of Computer Architectures: Simulators, Benchmarks, Methodologies, and Recommendations.
IEEE Trans. Computers, 2006
Int. J. Commun. Syst., 2006
Proceedings of the 2006 IEEE International Symposium on Performance Analysis of Systems and Software, 2006
Proceedings of the 2006 IEEE International Symposium on Performance Analysis of Systems and Software, 2006
Temperature-aware floorplanning of microarchitecture blocks with IPC-power dependence modeling and transient analysis.
Proceedings of the 2006 International Symposium on Low Power Electronics and Design, 2006
Proceedings of the 2006 IEEE International Symposium on Workload Characterization, 2006
The exigency of benchmark and compiler drift: designing tomorrow's processors with yesterday's tools.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006
Proceedings of the Eleventh Annual IEEE International High-Level Design Validation and Test Workshop 2006, 2006
Proceedings of the Handbook of Nature-Inspired and Innovative Computing, 2006
2005
The Impact of Incorrectly Speculated Memory Operations in a Multithreaded Architecture.
IEEE Trans. Parallel Distributed Syst., 2005
IEEE Trans. Computers, 2005
A Novel Memory Structure for Embedded Systems: Flexible Sequential and Random Access Memory.
J. Comput. Sci. Technol., 2005
Proceedings of the 22nd IEEE / 13th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST 2005), 2005
The Applicability of Adaptive Control Theory to QoS Design: Limitations and Solutions.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005
Proceedings of the 11th International Conference on High-Performance Computer Architecture (HPCA-11 2005), 2005
Dynamic Code Region (DCR) Based Program Phase Tracking and Prediction for Dynamic Optimizations.
Proceedings of the High Performance Embedded Architectures and Compilers, 2005
Microarchitecture-aware floorplanning using a statistical design of experiments approach.
Proceedings of the 42nd Design Automation Conference, 2005
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005
2004
IEEE Trans. Computers, 2004
State Pruning for Test Vector Generation for a Multiprocessor Cache Coherence Protocol.
Proceedings of the 15th IEEE International Workshop on Rapid System Prototyping (RSP 2004), 2004
The NanoBox Project: Exploring Fabrics of Self-Correcting Logic Blocks for High Defect Rate Molecular Device Technologies.
Proceedings of the 2004 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2004), 2004
Proceedings of the 10th International Conference on Parallel and Distributed Systems, 2004
Using ECN Marks to Improve TCP Performance over Lossy Links.
Proceedings of the ICETE 2004, 2004
Comparing Exact and Approximate Spatial Auto-regression Model Solutions for Spatial Data Analysis.
Proceedings of the Geographic Information Science, Third International Conference, 2004
Proceedings of the Euro-Par 2004 Parallel Processing, 2004
The Recursive NanoBox Processor Grid: A Reliable System Architecture for Unreliable Nanotechnology Devices.
Proceedings of the 2004 International Conference on Dependable Systems and Networks (DSN 2004), 28 June, 2004
An active data-aware cache consistency protocol for highly-scalable data-shipping DBMS architectures.
Proceedings of the First Conference on Computing Frontiers, 2004
Proceedings of the 1st International Conference on Broadband Networks (BROADNETS 2004), 2004
Enhancing the Memory Performance of Embedded Systems with the Flexible Sequential and Random Access Memory.
Proceedings of the Advances in Computer Systems Architecture, 9th Asia-Pacific Conference, 2004
2003
IEEE Trans. Computers, 2003
IEEE Comput. Archit. Lett., 2003
Using Incorrect Speculation to Prefetch Data in a Concurrent Multithreaded Processor.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003
Proceedings of the Intelligent Data Engineering and Automated Learning, 2003
Proceedings of the Ninth International Symposium on High-Performance Computer Architecture (HPCA'03), 2003
2002
Dynamically adapting to system load and program behavior in multiprogrammed multiprocessor systems.
Concurr. Comput. Pract. Exp., 2002
MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research.
IEEE Comput. Archit. Lett., 2002
Proceedings of the 20th International Conference on Computer Design (ICCD 2002), 2002
Increasing Instruction-Level Parallelism with Instruction Precomputation (Research Note).
Proceedings of the Euro-Par 2002, 2002
Exploiting the Prefetching Effect Provided by Executing Mispredicted Load Instructions.
Proceedings of the Euro-Par 2002, 2002
2001
Coarse-Grained Thread Pipelining: A Speculative Parallel Execution Model for Shared-Memory Multiprocessors.
IEEE Trans. Parallel Distributed Syst., 2001
Implementing a dynamic processor allocation policy for multiprogrammed parallel applications in the Solaris.
Concurr. Comput. Pract. Exp., 2001
Proceedings of the 19th International Conference on Computer Design (ICCD 2001), 2001
Automatic Verification of Instruction Set Simulation Using Synchronized State Comparison.
Proceedings of the 34th Annual Simulation Symposium (SS 2001), 2001
2000
IEEE Trans. Parallel Distributed Syst., 2000
IEEE Trans. Computers, 2000
Adv. Comput., 2000
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000
A Comprehensive Dynamic Processor Allocation Scheme for Multiprogrammed Multiprocessor Systems.
Proceedings of the 2000 International Conference on Parallel Processing, 2000
A Balanced Approach to High-Level Verification: Performance Trade-Offs in Verifying Large-Scale Multiprocessors.
Proceedings of the 2000 International Conference on Parallel Processing, 2000
Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques (PACT'00), 2000
1999
Performance-Based Path Determination for Interprocessor Communication in Distributed Computing Systems.
IEEE Trans. Parallel Distributed Syst., 1999
Special Issue on Compilation and Architectural Support for Parallel Applications - Guest Editor's Introduction.
J. Parallel Distributed Comput., 1999
Proceedings of the 1999 workshop on Computer architecture education, 1999
A Network Status Predictor to Support Dynamic Scheduling in Network-Based Computing Systems.
Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999
Proceedings of the IEEE International Conference On Computer Design, 1999
Proceedings of the Fifth International Symposium on High-Performance Computer Architecture, 1999
1998
Comparing Processor Allocation Strategies in Multiprogrammed Shared-Memory Multiprocessors.
J. Parallel Distributed Comput., 1998
Integrating Parallelizing Compilation Technology and Processor Architecture for Cost-Effective Concurrent Multithreading.
J. Inf. Sci. Eng., 1998
Concurr. Pract. Exp., 1998
An Efficient Strategy for Developing a Simulator for a Novel Concurrent Multithreaded Processor Architecture.
Proceedings of the MASCOTS 1998, 1998
Proceedings of the 12th International Parallel Processing Symposium / 9th Symposium on Parallel and Distributed Processing (IPPS/SPDP '98), March 30, 1998
Proceedings of the 12th international conference on Supercomputing, 1998
The Effect of using State-Based Priority Information in a Shared-Memory Multiprocessor Cache Replacement Policy.
Proceedings of the 1998 International Conference on Parallel Processing (ICPP '98), 1998
High-Level Information - An Approach for Integrating Front-End and Back-End Compilers.
Proceedings of the 1998 International Conference on Parallel Processing (ICPP '98), 1998
Characterization of Communication Patterns in Message-Passing Parallel Scientific Application Programs.
Proceedings of the Network-Based Parallel Computing: Communication, 1998
1997
An Effective Processor Allocation Strategy for Multiprogrammed Shared-Memory Multiprocessors.
IEEE Trans. Parallel Distributed Syst., 1997
J. Parallel Distributed Comput., 1997
Proceedings of the 6th International Symposium on High Performance Distributed Computing, 1997
Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97), 1997
Exploiting multiple heterogeneous networks to reduce communication costs in parallel programs.
Proceedings of the 6th Heterogeneous Computing Workshop, 1997
1996
Proceedings of the 1996 workshop on Computer architecture education, 1996
Efficient Execution of Parallel Applications in Multiprogrammed Multiprocessor Systems.
Proceedings of IPPS '96, 1996
Performance Analysis and Prediction of Processor Scheduling Strategies in Multiprogrammed Shared-Memory Multiprocessors.
Proceedings of the 1996 International Conference on Parallel Processing, 1996
Proceedings of the 16th International Conference on Distributed Computing Systems, 1996
1995
The Potential of Compile-Time Analysis to Adapt the Cache Coherence Enforcement Strategy to the Data Sharing Characteristics.
IEEE Trans. Parallel Distributed Syst., 1995
Partitioning tasks between a pair of interconnected heterogeneous processors: A case study.
Concurr. Pract. Exp., 1995
Concurr. Pract. Exp., 1995
Loop-Level Process Control: An Effective Processor Allocation Policy for Multiprogrammed Shared-Memory Multiprocessors.
Proceedings of the Job Scheduling Strategies for Parallel Processing, 1995
A Circulating Active Barrier Synchronization Mechanism.
Proceedings of the 1995 International Conference on Parallel Processing, 1995
Proceedings of the 1995 International Conference on Computer Design (ICCD '95), 1995
Proceedings of the 28th Annual Hawaii International Conference on System Sciences (HICSS-28), 1995
1994
The Impact of Parallel Loop Scheduling Strategies on Prefetching in a Shared Memory Multiprocessor.
IEEE Trans. Parallel Distributed Syst., 1994
A Multiprocessor Architecture Combining Fine-Grained and Coarse-Grained Parallelism Strategies.
Parallel Comput., 1994
Proceedings of the Parallel Architectures and Compilation Techniques, 1994
An evaluation of a compiler optimization for improving the performance of a coherence directory.
Proceedings of the 8th international conference on Supercomputing, 1994
A Distributed Hardware Mechanism for Process Synchronization on Shared-Bus Multiprocessors.
Proceedings of the 1994 International Conference on Parallel Processing, 1994
Self-Adjusting Scheduling: An On-Line Optimization Technique for Locality Management and Load Balancing.
Proceedings of the 1994 International Conference on Parallel Processing, 1994
Proceedings of the 1994 IEEE International Conference on Computer Design: VLSI in Computer & Processors, 1994
A Data Parallel Implementation of the TRFD Program from the Perfect Benchmarks.
Proceedings of the Massively Parallel Processing Applications and Development, 1994
1993
IEEE Trans. Parallel Distributed Syst., 1993
Cache Coherence in Large-Scale Shared-Memory Multiprocessors: Issues and Comparisons.
ACM Comput. Surv., 1993
Proceedings of the 1993 International Conference on Parallel Processing, 1993
1991
Processor parallelism considerations and memory latency reduction in shared memory multiprocessors
PhD thesis, 1991
Proceedings of the 5th international conference on Supercomputing, 1991
Architectural alternatives for exploiting parallelism.
IEEE, ISBN: 978-0-8186-2642-5, 1991
1990
Comparing Parallelism Extraction Techniques: Superscalar Processors, Pipelined Processors, and Multiprocessors.
Proceedings of the 1990 International Conference on Parallel Processing, 1990
1988