Krishna M. Kavi

ORCID: 0000-0003-1581-8166

  • University of North Texas, Denton, USA

According to our database, Krishna M. Kavi authored at least 129 papers between 1980 and 2024.

SecurityCloak: Protection against cache timing and speculative memory access attacks.
J. Syst. Archit., 2024

Guard Cache: Creating Noisy Side-Channels.
IEEE Comput. Archit. Lett., 2023

Guard Cache: Creating False Cache Hits and Misses To Mitigate Side-Channel Attacks.
Proceedings of the Silicon Valley Cybersecurity Conference, 2023

Streaming Sparse Data on Architectures with Vector Extensions using Near Data Processing.
Proceedings of the International Symposium on Memory Systems, 2023

Performance Implications of Async Memcpy and UVM: A Tale of Two Data Transfer Modes.
Proceedings of the IEEE International Symposium on Workload Characterization, 2023

NextGen-Malloc: Giving Memory Allocator Its Own Room in the House.
Proceedings of the 19th Workshop on Hot Topics in Operating Systems, 2023

Memory-Side Acceleration and Sparse Compression for Quantized Packed Convolutions.
Proceedings of the 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2022

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

Sparse-T: Hardware Accelerator Thread for Unstructured Sparse Data Processing.
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

Dynamically Adapting Page Migration Policies Based on Applications' Memory Access Behaviors.
ACM J. Emerg. Technol. Comput. Syst., 2021

On-the-fly Page Migration and Address Reconciliation for Heterogeneous Memory Systems.
ACM J. Emerg. Technol. Comput. Syst., 2020

CHASM: Security Evaluation of Cache Mapping Schemes.
Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2020

Towards Application-Specific Address Mapping for Emerging Memory Devices.
Proceedings of the MEMSYS 2020: The International Symposium on Memory Systems, 2020

ExPress: Simultaneously Achieving Storage, Execution and Energy Efficiencies in Moderately Sparse Matrix Computations.
Proceedings of the MEMSYS 2020: The International Symposium on Memory Systems, 2020

Reconfigurable dataflow graphs for processing-in-memory.
Proceedings of the 20th International Conference on Distributed Computing and Networking, 2019

3D-DRAM Performance for Different OpenMP Scheduling Techniques in Multicore Systems.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

Exploring the Processing-in-Memory design space.
J. Syst. Archit., 2017

Dataflow based Near Data Computing Achieves Excellent Energy Efficiency.
Proceedings of the 8th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, 2017

DVFS Space Exploration in Power Constrained Processing-in-Memory Systems.
Proceedings of the Architecture of Computing Systems - ARCS 2017, 2017

HBM-Resident Prefetching for Heterogeneous Memory System.
Proceedings of the Architecture of Computing Systems - ARCS 2017, 2017

Prefetching as a Potentially Effective Technique for Hybrid Memory Optimization.
Proceedings of the Second International Symposium on Memory Systems, 2016

Memory organizations for 3D-DRAMs and PCMs in processor memory hierarchy.
J. Syst. Archit., 2015

Optimus: Framework of Vulnerabilities, Attacks, Defenses and SLA Ontologies.
Int. J. Next Gener. Comput., 2015

Concurrency, Synchronization, and Speculation - The Dataflow Way.
Adv. Comput., 2015

Recycling trash in cache.
Proceedings of the 2015 ACM SIGPLAN International Symposium on Memory Management, 2015

Ontology of Secure Service Level Agreement.
Proceedings of the 16th IEEE International Symposium on High Assurance Systems Engineering, 2015

Processing-in-Memory: Exploring the Design Space.
Proceedings of the Architecture of Computing Systems - ARCS 2015, 2015

Potential Energy Savings through Eliminating Unnecessary Writes in the Cache-Memory Hierarchy.
Int. J. Comput. Their Appl., 2014

Hardware and Application Profiling Tools.
Adv. Comput., 2014

Characterizing Workload of Web Applications on Virtualized Servers.
Proceedings of the Big Data Benchmarks, Performance Optimization, and Emerging Hardware, 2014

Trash in cache: detecting eternally silent stores.
Proceedings of the workshop on Memory Systems Performance and Correctness, 2014

Improving Node-Level MapReduce Performance Using Processing-in-Memory Technologies.
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

3D DRAM and PCMs in Processor Memory Hierarchy.
Proceedings of the Architecture of Computing Systems - ARCS 2014, 2014

Gleipnir: a memory profiling and tracing tool.
SIGARCH Comput. Archit. News, 2013

Finding Near-Optimum Message Scheduling Settings for SHA-256 Variants Using Genetic Algorithms.
J. Inf. Sci. Eng., 2013

VULCAN: Vulnerability Assessment Framework for Cloud Computing.
Proceedings of the IEEE 7th International Conference on Software Security and Reliability, 2013

A Multi-core Memory Organization for 3-D DRAM as Main Memory.
Proceedings of the Architecture of Computing Systems - ARCS 2013, 2013

A comparative analysis of performance improvement schemes for cache memories.
Comput. Electr. Eng., 2012

Trace Driven Data Structure Transformations.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

New Memory Organizations for 3D DRAM and PCMs.
Proceedings of the Architecture of Computing Systems - ARCS 2012 - 25th International Conference, Munich, Germany, February 28, 2012

Gleipnir: A Memory Analysis Tool.
Proceedings of the International Conference on Computational Science, 2011

Parabilis: Speeding up Single-Threaded Applications by Extracting Fine-Grained Threads for Multi-core Execution.
Proceedings of the 10th International Symposium on Parallel and Distributed Computing, 2011

Evaluation of Techniques to Improve Cache Access Uniformities.
Proceedings of the International Conference on Parallel Processing, 2011

Smaller Split L-1 Data Caches for Multi-core Processing Systems.
Proceedings of the 10th International Symposium on Pervasive Systems, 2009

Improving Uniformity of Cache Access Pattern using Split Data Caches.
Proceedings of the 22nd International Conference on Parallel and Distributed Computing and Communication Systems, 2009

Real-Time Systems: An Introduction and the State-of-the-Art.
Proceedings of the Wiley Encyclopedia of Computer Science and Engineering, 2008

Dataflow Computers: Their History and Future.
Proceedings of the Wiley Encyclopedia of Computer Science and Engineering, 2008

An Ontology-Based Integrated Assessment Framework for High-Assurance Systems.
Proceedings of the 2nd IEEE International Conference on Semantic Computing (ICSC 2008), 2008

A Non-blocking Multithreaded Architecture with Support for Speculative Threads.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2008

Feasibility of decoupling memory management from the execution pipeline.
J. Syst. Archit., 2007

A non-preemptive scheduling algorithm for soft real-time systems.
Comput. Electr. Eng., 2007

Reconfigurable split data caches: a novel scheme for embedded systems.
Proceedings of the 2007 ACM Symposium on Applied Computing (SAC), 2007

Estimate Validity Regions for Nearest Neighbor Queries.
Proceedings of the ICSOFT 2007, 2007

A Methodology to Evaluate Agent Oriented Software Engineering Techniques.
Proceedings of the 40th Hawaii International Conference on Systems Science (HICSS-40 2007), 2007

Making a case for split data caches for embedded applications.
SIGARCH Comput. Archit. News, 2006

Intelligent memory manager: Reducing cache pollution due to memory management functions.
J. Syst. Archit., 2006

Tiny split data-caches make big performance impact for embedded applications.
J. Embed. Comput., 2006

A Page-based Hybrid (Software-Hardware) Dynamic Memory Allocator.
IEEE Comput. Archit. Lett., 2006

Performance Enhancement by Eliminating Redundant Function Execution.
Proceedings of the 39th Annual Simulation Symposium (ANSS-39 2006), 2006

A Study of Reconfigurable Split Data Caches and Instruction Caches.
Proceedings of the ISCA 19th International Conference on Parallel and Distributed Computing Systems, 2006

A Hardware Assisted High Performance PHK Memory Manager.
Proceedings of the ISCA 19th International Conference on Parallel and Distributed Computing Systems, 2006

Speculative Thread Execution in a Multithreaded Dataflow Architecture.
Proceedings of the ISCA 19th International Conference on Parallel and Distributed Computing Systems, 2006

Improving data cache performance with integrated use of split caches, victim cache and stream buffers.
SIGARCH Comput. Archit. News, 2005

An Efficient Non-Preemptive Real-Time Scheduling.
Proceedings of the ISCA 18th International Conference on Parallel and Distributed Computing Systems, 2005

What can we gain by unfolding loops?
ACM SIGPLAN Notices, 2004

Multi-Agent System Case Studies in Command and Control, Information Fusion and Data Management.
Informatica (Slovenia), 2004

A Study of Separate Array and Scalar Caches.
Proceedings of the 18th Annual Symposium on High Performance Computing Systems and Applications, 2004

An Unfolding-Based Loop Optimization Technique.
Proceedings of the Software and Compilers for Embedded Systems, 7th International Workshop, 2003

Loop Transformation Techniques To Aid In Loop Unrolling and Multithreading.
Proceedings of the ISCA 16th International Conference on Parallel and Distributed Computing Systems, 2003

Utilization of Separate Caches to Eliminate Cache Pollution Caused by Memory Management Functions.
Proceedings of the ISCA 16th International Conference on Parallel and Distributed Computing Systems, 2003

Mutual Exclusion on Optical Buses.
Parallel Process. Lett., 2002

Visual requirement representation.
J. Syst. Softw., 2002

Modeling Multithreaded Applications Using Petri Nets.
Int. J. Parallel Program., 2002

Scheduled Dataflow: Execution Paradigm, Architecture, and Performance Evaluation.
IEEE Trans. Computers, 2001

File Allocation Algorithms to Minimize Data Transmission Time in Distributed Computing Systems.
J. Inf. Sci. Eng., 2001

Storage Allocation for Real-Time, Embedded Systems.
Proceedings of the Embedded Software, First International Workshop, 2001

Performance Evaluation of a Non-Blocking Multithreaded Architecture for Embedded, Real-Time and DSP Applications.
Proceedings of the ISCA 14th International Conference on Parallel and Distributed Computing Systems, 2001

Multimedia File Allocation on VC Networks Using Multipath Routing.
IEEE Trans. Computers, 2000

Execution and Cache Performance of the Scheduled Dataflow Architecture.
J. Univers. Comput. Sci., 2000

Shared memory and distributed shared memory systems: A survey.
Adv. Comput., 2000

Computer Systems Research: The Pressure Is On.
Computer, 1999

A Decoupled Scheduled Dataflow Multithreaded Architecture.
Proceedings of the 1999 International Symposium on Parallel Architectures, 1999

Fault-Tolerance Using Cache-Coherent Distributed Shared Memory Systems.
Proceedings of the 1999 International Symposium on Parallel Architectures, 1999

Concurrent Data Access in Mobile Heterogeneous Systems.
Proceedings of the 32nd Annual Hawaii International Conference on System Sciences (HICSS-32), 1999

Cyclic Staggered Scheme: A Loop Allocation Policy for DOACROSS Loops.
IEEE Trans. Computers, 1998

Design of cache memories for dataflow architecture.
J. Syst. Archit., 1998

Multithreaded Systems.
Adv. Comput., 1998

Memory Latency And Thread Migration Challenges For Distributed Shared Memory Systems.
Proceedings of the Thirty-First Annual Hawaii International Conference on System Sciences, 1998

Parallelization of DOALL and DOACROSS Loops - A Survey.
Adv. Comput., 1997

Multi-Threaded Systems: Issues, Solutions and Future.
Proceedings of the 30th Annual Hawaii International Conference on System Sciences (HICSS-30), 1997

VL-STAG: An Allocation Policy for DOACROSS Loops.
Proceedings of the IASTED International Conference on Parallel and Distributed Systems, 1997

Guest Editors' Introduction: Software Tools Assessment.
IEEE Softw., 1996

Specification and Analysis of Real-Time Systems Using CSP and Petri Nets.
Int. J. Softw. Eng. Knowl. Eng., 1996

Cache Memories for Dataflow Systems.
IEEE Parallel Distributed Technol. Syst. Appl., 1996

A loop allocation policy for DOACROSS loops.
Proceedings of the Eighth IEEE Symposium on Parallel and Distributed Processing, 1996

Cache memories in dataflow architecture.
Proceedings of the Seventh IEEE Symposium on Parallel and Distributed Processing, 1995

Design of Cache Memories for Multi-Threaded Dataflow Architecture.
Proceedings of the 22nd Annual International Symposium on Computer Architecture, 1995

Reliability analysis of CSP specifications using Petri nets and Markov processes.
Proceedings of the 28th Annual Hawaii International Conference on System Sciences (HICSS-28), 1995

Staggered Scheme: A Loop Allocation Policy.
Proceedings of the PARLE '94: Parallel Architectures and Languages Europe, 1994

Loop Allocation Scheme for Multithreaded Dataflow Computers.
Proceedings of the 8th International Symposium on Parallel Processing, 1994

Specification of Stochastic Properties with CSP.
Proceedings of the 1994 International Conference on Parallel and Distributed Systems, 1994

Improvements to the ETS Dynamic Dataflow Architecture.
Proceedings of the 27th Annual Hawaii International Conference on System Sciences (HICSS-27), 1994

A Performability Model for Soft Real-Time Systems.
Proceedings of the 27th Annual Hawaii International Conference on System Sciences (HICSS-27), 1994

Cache Design for an Explicit Token Store Data Flow Architecture.
Proceedings of the Fifth IEEE Symposium on Parallel and Distributed Processing, 1993

PARSA: A Parallel Program Scheduling and Assessment Environment.
Proceedings of the 1993 International Conference on Parallel Processing, 1993

Parallelism in Object-Oriented Languages: A Survey.
IEEE Softw., 1992

Reliability Measurement: From Theory to Practice.
IEEE Softw., 1992

Real-time systems design methodologies: An introduction and a survey.
J. Syst. Softw., 1992

Hierarchical Interconnection Networks: Routing Performance in the Presence of Faults.
Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing, 1992

SUVS: a distributed real-time system testbed for fault-tolerant computing.
Proceedings of the 1992 ACM/SIGAPP Symposium on Applied Computing: Technological Challenges of the 1990's, 1992

Extending N-grid group authorization using compact encoding.
Proceedings of the 1992 ACM/SIGAPP Symposium on Applied Computing: Technological Challenges of the 1990's, 1992

A New Cache Coherency and Address Translation Consistency Protocol.
Proceedings of the 1992 International Conference on Parallel Processing, 1992

An Efficient Data Interface for Heterogeneous Distributed Environments.
Proceedings of the 12th International Conference on Distributed Computing Systems, 1992

Real-time systems - abstractions, languages, and design methodologies.
IEEE, ISBN: 978-0-8186-3152-8, 1992

Specification of concurrent processes using a dataflow model of computation and partially ordered events.
J. Syst. Softw., 1991

Stochastic Data flow Graph Models for the Reliability Analysis of Interconnection and Computer Networks.
J. Inf. Sci. Eng., 1991

A Heterogeneous Distributed Processing Interface Specification Language.
Proceedings of the International Conference on Parallel Processing, 1991

A decomposition approach for analysis of parallel processing systems.
Proceedings of the Second IEEE Symposium on Parallel and Distributed Processing, 1990

An n-grid model for group authorization.
Proceedings of the Sixth Annual Computer Security Applications Conference, 1990

A review of specification and verification methods for parallel programs including the dataflow approach.
Proc. IEEE, 1989

Isomorphisms Between Petri Nets and Dataflow Graphs.
IEEE Trans. Software Eng., 1987

Architectural Support for Object Oriented Languages.
Proceedings of the COMPCON'87, 1987

A Formal Definition of Data Flow Graph Models.
IEEE Trans. Computers, 1986

Architecture quality.
SIGARCH Comput. Archit. News, 1984

Message Repository Definitional Facility: An Architectural Model for Interprocess Communication.
Proceedings of the 11th Annual Symposium on Computer Architecture, 1984

Effect of Declarations on Software Metrics: An Experiment in Software Science.
Proceedings of the Selected papers of the 1982 ACM SIGMETRICS Workshop on Software Metrics, 1982

HLL architectures: Pitfalls and predilections.
Proceedings of the 9th International Symposium on Computer Architecture (ISCA 1982), 1982

Innovative architectures and commercial computers: a summary of the panel discussion at NCC 1981.
SIGARCH Comput. Archit. News, 1981

Semantics of an algorithm.
SIGARCH Comput. Archit. News, 1980
