Omer Khan
Orcid: 0000-0001-6293-7403
Affiliations:
- University of Connecticut, USA
According to our database, Omer Khan authored at least 99 papers between 2008 and 2024.
Bibliography
2024
ASPLOS 2024 Artifact for "MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training".
Dataset, February, 2024
SSE: Security Service Engines to Scale Enclave Parallelism for System Interactive Applications.
Proceedings of the International Symposium on Secure and Private Execution Environment Design, 2024
PruneGNN: Algorithm-Architecture Pruning Framework for Graph Neural Network Acceleration.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Proceedings of the IEEE International Symposium on Hardware Oriented Security and Trust, 2024
MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
ACM Trans. Archit. Code Optim., September, 2023
Characterization of Timing-based Software Side-channel Attacks and Mitigations on Network-on-Chip Hardware.
ACM J. Emerg. Technol. Comput. Syst., July, 2023
IACR Cryptol. ePrint Arch., 2023
MaxK-GNN: Towards Theoretical Speed Limits for Accelerating Graph Neural Networks Training.
CoRR, 2023
MergePath-SpMM: Parallel Sparse Matrix-Matrix Algorithm for Graph Neural Network Acceleration.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023
2022
A performance predictor for implementation selection of parallelized static and temporal graph algorithms.
Concurr. Comput. Pract. Exp., 2022
SSE: Security Service Engines to Accelerate Enclave Performance in Secure Multicore Processors.
IEEE Comput. Archit. Lett., 2022
Protecting On-Chip Data Access Against Timing-Based Side-Channel Attacks on Multicores.
Proceedings of the 2022 IEEE International Symposium on Secure and Private Execution Environment Design (SEED), 2022
Characterization of mitigation schemes against timing-based side-channel attacks on PCIe hardware.
Proceedings of the 23rd International Symposium on Quality Electronic Design, 2022
Proceedings of the IEEE 40th International Conference on Computer Design, 2022
CoDG-ReRAM: An Algorithm-Hardware Co-design to Accelerate Semi-Structured GNNs on ReRAM.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022
On the Design of Quantum Graph Convolutional Neural Network in the NISQ-Era and Beyond.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022
Proceedings of the IEEE 40th International Conference on Computer Design, 2022
Proceedings of the IEEE 40th International Conference on Computer Design, 2022
HD-CPS: Hardware-assisted Drift-aware Concurrent Priority Scheduler for Shared Memory Multicores.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022
2021
PRISM: Strong Hardware Isolation-based Soft-Error Resilient Multicore Architecture with High Performance and Availability at Low Hardware Overheads.
ACM Trans. Archit. Code Optim., 2021
IACR Cryptol. ePrint Arch., 2021
Autonomous Secure Remote Attestation even when all Used and to be Used Digital Keys Leak.
IACR Cryptol. ePrint Arch., 2021
Accelerating Concurrent Priority Scheduling Using Adaptive in-Hardware Task Distribution in Multicores.
IEEE Comput. Archit. Lett., 2021
Seeds of SEED: Characterizing Enclave-level Parallelism in Secure Multicore Processors.
Proceedings of the 2021 International Symposium on Secure and Private Execution Environment Design (SEED), 2021
Proceedings of the IEEE International Symposium on Workload Characterization, 2021
Timing-based side-channel attack and mitigation on PCIe connected distributed embedded systems.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021
An Efficient Algorithm for the Construction of Dynamically Updating Trajectory Networks.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021
ConNOC: A Practical Timing Channel Attack on Network-on-chip Hardware in a Multicore Processor.
Proceedings of the IEEE International Symposium on Hardware Oriented Security and Trust, 2021
2020
OPTIMUS: A Security-Centric Dynamic Hardware Partitioning Scheme for Processors that Prevent Microarchitecture State Attacks.
IEEE Trans. Computers, 2020
In-Hardware Moving Compute to Data Model to Accelerate Thread Synchronization on Large Multicores.
IEEE Micro, 2020
Proceedings of the PMAM@PPoPP '20: Eleventh International Workshop on Programming Models and Applications for Multicores and Manycores colocated with the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020
Accelerating relax-ordered task-parallel workloads using multi-level dependency checking.
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020
IRONHIDE: A Secure Multicore that Efficiently Mitigates Microarchitecture State Attacks for Interactive Applications.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020
2019
Guest Editors Introduction: Special Section on Emerging Technologies in Computer Design.
IEEE Trans. Emerg. Top. Comput., 2019
IEEE Trans. Dependable Secur. Comput., 2019
Accelerating Synchronization Using Moving Compute to Data Model at 1,000-core Multicore Scale.
ACM Trans. Archit. Code Optim., 2019
IRONHIDE: A Secure Multicore Architecture that Leverages Hardware Isolation Against Microarchitecture State Attacks.
CoRR, 2019
HeteroMap: A Runtime Performance Predictor for Efficient Processing of Graph Analytics on Heterogeneous Multi-Accelerators.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2019
POSTER: Exploiting Multi-Level Task Dependencies to Prune Redundant Work in Relax-Ordered Task-Parallel Algorithms.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019
2018
Guest Editorial: Special Section on Defect and Fault Tolerance in VLSI and Nanotechnology.
IEEE Trans. Emerg. Top. Comput., 2018
Declarative Resilience: A Holistic Soft-Error Resilient Multicore Architecture that Trades off Program Accuracy for Efficiency.
ACM Trans. Embed. Comput. Syst., 2018
Multicore Resource Isolation for Deterministic, Resilient and Secure Concurrent Execution of Safety-Critical Applications.
IEEE Comput. Archit. Lett., 2018
Software-Hardware Managed Last-level Cache Allocation Scheme for Large-Scale NVRAM-Based Multicores Executing Parallel Data Analytics Applications.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018
Proceedings of the 36th IEEE International Conference on Computer Design, 2018
Accelerating Synchronization in Graph Analytics Using Moving Compute to Data Model on Tilera TILE-Gx72.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018
2017
Efficient Situational Scheduling of Graph Workloads on Single-Chip Multicores and GPUs.
IEEE Micro, 2017
Exploiting the Tradeoff between Program Accuracy and Soft-error Resiliency Overhead for Machine Learning Workloads.
CoRR, 2017
Accelerating Graph and Machine Learning Workloads Using a Shared Memory Multicore Architecture with Auxiliary Support for In-hardware Explicit Messaging.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017
GraphTuner: An Input Dependence Aware Loop Perforation Scheme for Efficient Execution of Approximated Graph Algorithms.
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017
2016
J. Supercomput., 2016
LDAC: Locality-Aware Data Access Control for Large-Scale Multicore Cache Hierarchies.
ACM Trans. Archit. Code Optim., 2016
Efficient Error-Detection and Recovery Mechanisms for Reliability and Resiliency of Multicores.
Proceedings of the 29th International Conference on VLSI Design and 15th International Conference on Embedded Systems, 2016
Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016
Proceedings of the 2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, 2016
2015
IACR Cryptol. ePrint Arch., 2015
Computer, 2015
A Cross-Layer Multicore Architecture to Tradeoff Program Accuracy and Resilience Overheads.
IEEE Comput. Archit. Lett., 2015
Efficient parallel packet processing using a shared memory many-core processor with hardware support to accelerate communication.
Proceedings of the 10th IEEE International Conference on Networking, 2015
Exploring the performance implications of memory safety primitives in many-core processors executing multi-threaded workloads.
Proceedings of the Fourth Workshop on Hardware and Architectural Support for Security and Privacy, 2015
CRONO: A Benchmark Suite for Multithreaded Graph Algorithms Executing on Futuristic Multicores.
Proceedings of the 2015 IEEE International Symposium on Workload Characterization, 2015
Efficient parallelization of path planning workload on single-chip shared-memory multicores.
Proceedings of the 2015 IEEE High Performance Extreme Computing Conference, 2015
OSPREY: Implementation of Memory Consistency Models for Cache Coherence Protocols involving Invalidation-Free Data Access.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015
2014
NUCA-L1: A Non-Uniform Access Latency Level-1 Cache Architecture for Multicores Operating at Near-Threshold Voltages.
ACM Trans. Archit. Code Optim., 2014
IEEE Comput. Archit. Lett., 2014
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014
Suppressing the Oblivious RAM timing channel while making information leakage and program efficiency trade-offs.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014
2013
Proceedings of the 21st IEEE/IFIP International Conference on VLSI and System-on-Chip, 2013
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013
A private level-1 cache architecture to exploit the latency and capacity tradeoffs in multicores operating at near-threshold voltages.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013
MARTHA: architecture for control and emulation of power electronics and smart grid systems.
Proceedings of the Design, Automation and Test in Europe, 2013
2012
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2012
Empirical model for cooperative resizing of processor structures to exploit power-performance efficiency at runtime.
IET Circuits Devices Syst., 2012
Low-Latency Mechanisms for Near-Threshold Operation of Private Caches in Shared Memory Multicores.
Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
2011
Trans. High Perform. Embed. Archit. Compil., 2011
IEEE Trans. Dependable Secur. Comput., 2011
IEEE Comput. Archit. Lett., 2011
Proceedings of the SPAA 2011: Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2011
Time-Predictable Computer Architecture for Cyber-Physical Systems: Digital Emulation of Power Electronics Systems.
Proceedings of the 32nd IEEE Real-Time Systems Symposium, 2011
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2011
ARCc: A case for an architecturally redundant cache-coherence architecture for large multicores.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011
2010
Thread Relocation: A Runtime Architecture for Tolerating Hard Errors in Chip Multiprocessors.
IEEE Trans. Computers, 2010
Shadow checker (SC): A low-cost hardware scheme for online detection of faults in small memory structures of a microprocessor.
Proceedings of the 2011 IEEE International Test Conference, 2010
Proceedings of the 20th ACM Great Lakes Symposium on VLSI 2009, 2010
A model to exploit power-performance efficiency in superscalar processors via structure resizing.
Proceedings of the 20th ACM Great Lakes Symposium on VLSI 2009, 2010
2009
Predictive Thermal Management for Chip Multiprocessors Using Co-designed Virtual Machines.
Proceedings of the High Performance Embedded Architectures and Compilers, 2009
Proceedings of the Design, Automation and Test in Europe, 2009
Hardware/software co-design architecture for thermal management of chip multiprocessors.
Proceedings of the Design, Automation and Test in Europe, 2009
Proceedings of the Design, Automation and Test in Europe, 2009
2008
Proceedings of the 2008 International Conference on Computer-Aided Design, 2008