Omer Khan

Orcid: 0000-0001-6293-7403

Affiliations:
  • University of Connecticut, USA


According to our database, Omer Khan authored at least 98 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of two.

Bibliography

2024
ASPLOS 2024 Artifact for "MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training".
Dataset, February, 2024

PruneGNN: Algorithm-Architecture Pruning Framework for Graph Neural Network Acceleration.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

Masked Memory Primitive for Key Insulated Schemes.
Proceedings of the IEEE International Symposium on Hardware Oriented Security and Trust, 2024

MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
ASM: An Adaptive Secure Multicore for Co-located Mutually Distrusting Processes.
ACM Trans. Archit. Code Optim., September, 2023

Characterization of Timing-based Software Side-channel Attacks and Mitigations on Network-on-Chip Hardware.
ACM J. Emerg. Technol. Comput. Syst., July, 2023

Hardware Root-of-Trust implementations in Trusted Execution Environments.
IACR Cryptol. ePrint Arch., 2023

MaxK-GNN: Towards Theoretical Speed Limits for Accelerating Graph Neural Networks Training.
CoRR, 2023

MergePath-SpMM: Parallel Sparse Matrix-Matrix Algorithm for Graph Neural Network Acceleration.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

2022
Secure Remote Attestation with Strong Key Insulation Guarantees.
CoRR, 2022

A performance predictor for implementation selection of parallelized static and temporal graph algorithms.
Concurr. Comput. Pract. Exp., 2022

SSE: Security Service Engines to Accelerate Enclave Performance in Secure Multicore Processors.
IEEE Comput. Archit. Lett., 2022

Protecting On-Chip Data Access Against Timing-Based Side-Channel Attacks on Multicores.
Proceedings of the 2022 IEEE International Symposium on Secure and Private Execution Environment Design (SEED), 2022

Characterization of mitigation schemes against timing-based side-channel attacks on PCIe hardware.
Proceedings of the 23rd International Symposium on Quality Electronic Design, 2022

Towards Sparsification of Graph Neural Networks.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

CoDG-ReRAM: An Algorithm-Hardware Co-design to Accelerate Semi-Structured GNNs on ReRAM.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

On the Design of Quantum Graph Convolutional Neural Network in the NISQ-Era and Beyond.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

Towards Real-Time Temporal Graph Learning.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

MultiCon: An Efficient Timing-based Side Channel Attack on Shared Memory Multicores.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

HD-CPS: Hardware-assisted Drift-aware Concurrent Priority Scheduler for Shared Memory Multicores.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

2021
PRISM: Strong Hardware Isolation-based Soft-Error Resilient Multicore Architecture with High Performance and Availability at Low Hardware Overheads.
ACM Trans. Archit. Code Optim., 2021

Bilinear Map Based One-Time Signature Scheme with Secret Key Exposure.
IACR Cryptol. ePrint Arch., 2021

Autonomous Secure Remote Attestation even when all Used and to be Used Digital Keys Leak.
IACR Cryptol. ePrint Arch., 2021

Accelerating Concurrent Priority Scheduling Using Adaptive in-Hardware Task Distribution in Multicores.
IEEE Comput. Archit. Lett., 2021

Seeds of SEED: Characterizing Enclave-level Parallelism in Secure Multicore Processors.
Proceedings of the 2021 International Symposium on Secure and Private Execution Environment Design (SEED), 2021

Message from the General Chairs.
Proceedings of the IEEE International Symposium on Workload Characterization, 2021

Timing-based side-channel attack and mitigation on PCIe connected distributed embedded systems.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

An Efficient Algorithm for the Construction of Dynamically Updating Trajectory Networks.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

ConNOC: A Practical Timing Channel Attack on Network-on-chip Hardware in a Multicore Processor.
Proceedings of the IEEE International Symposium on Hardware Oriented Security and Trust, 2021

2020
OPTIMUS: A Security-Centric Dynamic Hardware Partitioning Scheme for Processors that Prevent Microarchitecture State Attacks.
IEEE Trans. Computers, 2020

In-Hardware Moving Compute to Data Model to Accelerate Thread Synchronization on Large Multicores.
IEEE Micro, 2020

Exploring accelerator and parallel graph algorithmic choices for temporal graphs.
Proceedings of the PMAM@PPoPP '20: Eleventh International Workshop on Programming Models and Applications for Multicores and Manycores colocated with the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

Accelerating relax-ordered task-parallel workloads using multi-level dependency checking.
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

IRONHIDE: A Secure Multicore that Efficiently Mitigates Microarchitecture State Attacks for Interactive Applications.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

2019
Guest Editors Introduction: Special Section on Emerging Technologies in Computer Design.
IEEE Trans. Emerg. Top. Comput., 2019

Advancing the State-of-the-Art in Hardware Trojans Detection.
IEEE Trans. Dependable Secur. Comput., 2019

Accelerating Synchronization Using Moving Compute to Data Model at 1,000-core Multicore Scale.
ACM Trans. Archit. Code Optim., 2019

IRONHIDE: A Secure Multicore Architecture that Leverages Hardware Isolation Against Microarchitecture State Attacks.
CoRR, 2019

HeteroMap: A Runtime Performance Predictor for Efficient Processing of Graph Analytics on Heterogeneous Multi-Accelerators.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2019

POSTER: Exploiting Multi-Level Task Dependencies to Prune Redundant Work in Relax-Ordered Task-Parallel Algorithms.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018
Guest Editorial: Special Section on Defect and Fault Tolerance in VLSI and Nanotechnology.
IEEE Trans. Emerg. Top. Comput., 2018

Declarative Resilience: A Holistic Soft-Error Resilient Multicore Architecture that Trades off Program Accuracy for Efficiency.
ACM Trans. Embed. Comput. Syst., 2018

Multicore Resource Isolation for Deterministic, Resilient and Secure Concurrent Execution of Safety-Critical Applications.
IEEE Comput. Archit. Lett., 2018

Software-Hardware Managed Last-level Cache Allocation Scheme for Large-Scale NVRAM-Based Multicores Executing Parallel Data Analytics Applications.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Breaking the Oblivious-RAM Bandwidth Wall.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018

Accelerating Synchronization in Graph Analytics Using Moving Compute to Data Model on Tilera TILE-Gx72.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018

2017
Efficient Situational Scheduling of Graph Workloads on Single-Chip Multicores and GPUs.
IEEE Micro, 2017

Exploiting the Tradeoff between Program Accuracy and Soft-error Resiliency Overhead for Machine Learning Workloads.
CoRR, 2017

Accelerating Graph and Machine Learning Workloads Using a Shared Memory Multicore Architecture with Auxiliary Support for In-hardware Explicit Messaging.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

GraphTuner: An Input Dependence Aware Loop Perforation Scheme for Efficient Execution of Approximated Graph Algorithms.
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

2016
Locality-aware data replication in the last-level cache for large scale multicores.
J. Supercomput., 2016

LDAC: Locality-Aware Data Access Control for Large-Scale Multicore Cache Hierarchies.
ACM Trans. Archit. Code Optim., 2016

Efficient Error-Detection and Recovery Mechanisms for Reliability and Resiliency of Multicores.
Proceedings of the 29th International Conference on VLSI Design and 15th International Conference on Embedded Systems, 2016

GPU concurrency choices in graph analytics.
Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016

Foreword.
Proceedings of the 2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, 2016

2015
M-MAP: Multi-Factor Memory Authentication for Secure Embedded Processors.
IACR Cryptol. ePrint Arch., 2015

The Execution Migration Machine: Directoryless Shared-Memory Architecture.
Computer, 2015

A Cross-Layer Multicore Architecture to Tradeoff Program Accuracy and Resilience Overheads.
IEEE Comput. Archit. Lett., 2015

Efficient parallel packet processing using a shared memory many-core processor with hardware support to accelerate communication.
Proceedings of the 10th IEEE International Conference on Networking, 2015

Exploring the performance implications of memory safety primitives in many-core processors executing multi-threaded workloads.
Proceedings of the Fourth Workshop on Hardware and Architectural Support for Security and Privacy, 2015

CRONO: A Benchmark Suite for Multithreaded Graph Algorithms Executing on Futuristic Multicores.
Proceedings of the 2015 IEEE International Symposium on Workload Characterization, 2015

Efficient parallelization of path planning workload on single-chip shared-memory multicores.
Proceedings of the 2015 IEEE High Performance Extreme Computing Conference, 2015

OSPREY: Implementation of Memory Consistency Models for Cache Coherence Protocols involving Invalidation-Free Data Access.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

2014
NUCA-L1: A Non-Uniform Access Latency Level-1 Cache Architecture for Multicores Operating at Near-Threshold Voltages.
ACM Trans. Archit. Code Optim., 2014

HaTCh: Hardware Trojan Catcher.
IACR Cryptol. ePrint Arch., 2014

Thread Migration Prediction for Distributed Shared Caches.
IEEE Comput. Archit. Lett., 2014

Locality-aware data replication in the Last-Level Cache.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

Suppressing the Oblivious RAM timing channel while making information leakage and program efficiency trade-offs.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

2013
Toward Holistic Soft-Error-Resilient Shared-Memory Multicores.
Computer, 2013

A framework to accelerate sequential programs on homogeneous multicores.
Proceedings of the 21st IEEE/IFIP International Conference on VLSI and System-on-Chip, 2013

The locality-aware adaptive cache coherence protocol.
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

Towards efficient dynamic data placement in NoC-based multicores.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

A private level-1 cache architecture to exploit the latency and capacity tradeoffs in multicores operating at near-threshold voltages.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

MARTHA: architecture for control and emulation of power electronics and smart grid systems.
Proceedings of the Design, Automation and Test in Europe, 2013

2012
HORNET: A Cycle-Level Multicore Simulator.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2012

Empirical model for cooperative resizing of processor structures to exploit power-performance efficiency at runtime.
IET Circuits Devices Syst., 2012

Low-Latency Mechanisms for Near-Threshold Operation of Private Caches in Shared Memory Multicores.
Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012

A low-overhead dynamic optimization framework for multicores.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011
Microvisor: A Runtime Architecture for Thermal Management in Chip Multiprocessors.
Trans. High Perform. Embed. Archit. Compil., 2011

Hardware/Software Codesign Architecture for Online Testing in Chip Multiprocessors.
IEEE Trans. Dependable Secur. Comput., 2011

DCC: A Dependable Cache Coherence Multicore Architecture.
IEEE Comput. Archit. Lett., 2011

Brief announcement: distributed shared memory based on computation migration.
Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2011), 2011

Time-Predictable Computer Architecture for Cyber-Physical Systems: Digital Emulation of Power Electronics Systems.
Proceedings of the 32nd IEEE Real-Time Systems Symposium, 2011

Deadlock-free fine-grained thread migration.
Proceedings of the NOCS 2011, 2011

Scalable, accurate multicore simulation in the 1000-core era.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2011

ARCc: A case for an architecturally redundant cache-coherence architecture for large multicores.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011

Performance Per Watt Benefits of Dynamic Core Morphing in Asymmetric Multicores.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010
Thread Relocation: A Runtime Architecture for Tolerating Hard Errors in Chip Multiprocessors.
IEEE Trans. Computers, 2010

Shadow checker (SC): A low-cost hardware scheme for online detection of faults in small memory structures of a microprocessor.
Proceedings of the 2011 IEEE International Test Conference, 2010

A self-adaptive scheduler for asymmetric multi-cores.
Proceedings of the 20th ACM Great Lakes Symposium on VLSI 2009, 2010

A model to exploit power-performance efficiency in superscalar processors via structure resizing.
Proceedings of the 20th ACM Great Lakes Symposium on VLSI 2009, 2010

2009
Predictive Thermal Management for Chip Multiprocessors Using Co-designed Virtual Machines.
Proceedings of the High Performance Embedded Architectures and Compilers, 2009

Improving yield and reliability of chip multiprocessors.
Proceedings of the Design, Automation and Test in Europe, 2009

Hardware/software co-design architecture for thermal management of chip multiprocessors.
Proceedings of the Design, Automation and Test in Europe, 2009

A self-adaptive system architecture to address transistor aging.
Proceedings of the Design, Automation and Test in Europe, 2009

2008
A framework for predictive dynamic temperature management of microprocessor systems.
Proceedings of the 2008 International Conference on Computer-Aided Design, 2008

