David R. Kaeli

IEEE Trans. Emerg. Top. Comput., 2020

ArmorAll: Compiler-based Resilience Targeting GPU Applications.

ACM Trans. Archit. Code Optim., 2020

Editorial: A Message from the Editor-in-Chief.

ACM Trans. Archit. Code Optim., 2020

Exploiting Bank Conflict-based Side-channel Timing Leakage of GPUs.

Zhen Hang Jiang

ACM Trans. Archit. Code Optim., 2020

Exploring GPU acceleration of Deep Neural Networks using Block Circulant Matrices.

Parallel Comput., 2020

MGPU-TSM: A Multi-GPU System with Truly Shared Memory.

CoRR, 2020

HALCONE: A Hardware-Level Timestamp-based Cache Coherence Scheme for Multi-GPU systems.

CoRR, 2020

Design Space Exploration of Accelerators and End-to-End DNN Evaluation with TFLITE-SOC.

Elmira Karimi

Marti Torrents Lapuerta

José Cano

José L. Abellán

Proceedings of the 32nd IEEE International Symposium on Computer Architecture and High Performance Computing, 2020

A Smart Background Scheduler for Storage Systems.

Maher Kachmar

Proceedings of the 28th International Symposium on Modeling, 2020

Message from the Program Chairs: IISWC 2020.

Devesh Tiwari

Proceedings of the IEEE International Symposium on Workload Characterization, 2020

Using Undersampling with Ensemble Learning to Identify Factors Contributing to Preterm Birth.

Proceedings of the 19th IEEE International Conference on Machine Learning and Applications, 2020

Griffin: Hardware-Software Support for Efficient Page Migration in Multi-GPU Systems.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Hardware/Software Obfuscation against Timing Side-channel Attack on a GPU.

Elmira Karimi

Proceedings of the 2020 IEEE International Symposium on Hardware Oriented Security and Trust, 2020

Vega: A Computer Vision Processing Enhancement Framework with Graph-based Acceleration.

Julian Gutierrez

Proceedings of the 53rd Hawaii International Conference on System Sciences, 2020

A Novel GPU Overdrive Fault Attack.

Majid Sabbagh

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Introducing Gamettes: A Playful Approach for Capturing Decision-Making for Informing Behavioral Models.

Jacqueline A. Griffin

Fernando Fernandes dos Santos

Stacy Marsella

Casper Harteveld

Proceedings of the CHI '20: CHI Conference on Human Factors in Computing Systems, 2020

Valkyrie: Leveraging Inter-TLB Locality to Enhance GPU Performance.

Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2019

Analyzing and Increasing the Reliability of Convolutional Neural Networks on GPUs.

Pedro Foletto Pimenta

IEEE Trans. Reliab., 2019

Intra-Cluster Coalescing and Distributed-Block Scheduling to Reduce GPU NoC Pressure.

IEEE Trans. Computers, 2019

Side-channel Timing Attack of RSA on a GPU.

Chao Luo

ACM Trans. Archit. Code Optim., 2019

HAWS: Accelerating GPU Wavefront Execution through Selective Out-of-order Execution.

ACM Trans. Archit. Code Optim., 2019

Student cluster competition 2018, team Northeastern University: Reproducing performance of a multi-physics simulations of the Tsunamigenic 2004 Sumatra Megathrust earthquake on the AMD EPYC 7551 architecture.

Parallel Comput., 2019

Summarizing CPU and GPU Design Trends with Product Data.

Yifan Sun

CoRR, 2019

Priority-Based PCIe Scheduling for Multi-Tenant Multi-GPU Systems.

IEEE Comput. Archit. Lett., 2019

MGPUSim: enabling multi-GPU performance modeling and optimization.

Proceedings of the 46th International Symposium on Computer Architecture, 2019

Exploiting Adaptive Data Compression to Improve Performance and Energy-Efficiency of Compute Workloads in Multi-GPU Systems.

Yifan Sun

Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

Discovering Programmer Intention Behind Written Source Code.

Gadiel Sznaier Camps

Proceedings of the 18th IEEE International Conference On Machine Learning And Applications, 2019

A Comprehensive Evaluation of the Effects of Input Data on the Resilience of GPU Applications.

Proceedings of the 2019 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, 2019

PCFI: Program Counter Guided Fault Injection for Accelerating GPU Reliability Assessment.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

2018

Lightweight Hardware Transactional Memory for GPU Scratchpad Memory.

IEEE Trans. Computers, 2018

Block Cooperation: Advancing Lifetime of Resistive Memories by Increasing Utilization of Error Correcting Codes.

Amir Kavyan Ziabari

ACM Trans. Archit. Code Optim., 2018

Student cluster competition 2017, team Northeastern University: Reproducing vectorization of the Tersoff multi-body potential on the NVIDIA V100.

Parallel Comput., 2018

Power Analysis Attack of an AES GPU Implementation.

J. Hardw. Syst. Secur., 2018

MGSim + MGMark: A Framework for Multi-GPU System Research.

CoRR, 2018

An Integrated Simulation Framework for Examining Resiliency in Pharmaceutical Supply Chains Considering Human Behaviors.

Jacqueline A. Griffin

Proceedings of the 2018 Winter Simulation Conference, 2018

Characterizing the Microarchitectural Implications of a Convolutional Neural Network (CNN) Execution on GPUs.

Proceedings of the 2018 ACM/SPEC International Conference on Performance Engineering, 2018

PRISM: predicting resilience of GPU applications using statistical methods.

Proceedings of the International Conference for High Performance Computing, 2018

Employing Student Retention Strategies for an Introductory GPU Programming Course.

Julian Gutierrez

Fritz Previlon

Arturo González-Escribano

Proceedings of the 2018 IEEE/ACM Workshop on Education for High-Performance Computing, 2018

Peachy Parallel Assignments (EduHPC 2018).

Eduardo Rodriguez-Gutiez

David P. Bunde

Proceedings of the 2018 IEEE/ACM Workshop on Education for High-Performance Computing, 2018

Evaluating Performance Tradeoffs on the Radeon Open Compute Platform.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2018

Intra-Cluster Coalescing to Reduce GPU NoC Pressure.

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Profiling DNN Workloads on a Volta-based DGX-1 System.

Proceedings of the 2018 IEEE International Symposium on Workload Characterization, 2018

A Timing Side-Channel Attack on a Mobile GPU.

Proceedings of the 36th IEEE International Conference on Computer Design, 2018

Defensive dropout for hardening deep neural networks under adversarial attacks.

Proceedings of the International Conference on Computer-Aided Design, 2018

Effective simple-power analysis attacks of elliptic curve cryptography on embedded systems.

Chao Luo

Proceedings of the International Conference on Computer-Aided Design, 2018

GPU acceleration of RSA is vulnerable to side-channel timing attacks.

Chao Luo

Proceedings of the International Conference on Computer-Aided Design, 2018

Evaluating the Resilience of Parallel Applications.

Proceedings of the 2018 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, 2018

Evaluating the impact of execution parameters on program vulnerability in GPU applications.

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Airavat: Improving energy efficiency of heterogeneous applications.

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

An Efficient Data Management Framework for Puerto Rico Testsite for Exploring Contamination Threats (PROTECT).

Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

A Hybrid Approach to Identifying Key Factors in Environmental Health Studies.

Zlatan Feric

Xiangyu Li

Sheikh Mokhlesur Rahman

Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

Interactive Kernel Dimension Alternative Clustering on GPUs.

Proceedings of the IEEE/ACM 2018 International Conference on Advances in Social Networks Analysis and Mining, 2018

Iterative Spectral Method for Alternative Clustering.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017

Scalable and massively parallel Monte Carlo photon transport simulations for heterogeneous computing platforms.

Leiming Yu

Qianqian Fang

CoRR, 2017

DNNMark: A Deep Neural Network Benchmark Suite for GPUs.

Proceedings of the General Purpose GPUs, 2017

Combining architectural fault-injection and neutron beam testing approaches toward better understanding of GPU soft-error resilience.

Proceedings of the IEEE 60th International Midwest Symposium on Circuits and Systems, 2017

REMAP: a reliability/endurance mechanism for advancing PCM.

Proceedings of the International Symposium on Memory Systems, 2017

Multi2Sim Kepler: A detailed architectural GPU simulator.

Xun Gong

Rafael Ubal

Proceedings of the 2017 IEEE International Symposium on Performance Analysis of Systems and Software, 2017

Moka: Model-based concurrent kernel analysis.

Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

Dual Dictionary Compression for the Last Level Cache.

Akshay Lahiry

Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

Quality of Service-Aware Dynamic Voltage and Frequency Scaling for Mobile 3D Graphics Applications.

Navid Farazmand

Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

Cost-effective write disturbance mitigation techniques for advancing PCM density.

Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017

A Novel Side-Channel Timing Attack on GPUs.

Zhen Hang Jiang

Proceedings of the Great Lakes Symposium on VLSI 2017, 2017

Hardware Support for Scratchpad Memory Transactions on GPU Architectures.

Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

Exploring the Potential for Collaborative Data Compression and Hard-Error Tolerance in PCM Memories.

Amin Jadidi

Mohammad Arjomand

Mahmut T. Kandemir

Chita R. Das

Proceedings of the 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2017

Live together or Die Alone: Block cooperation to extend lifetime of resistive memories.

Amir Kavyan Ziabari

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

TwinKernels: an execution model to improve GPU hardware scheduling at compile time.

Proceedings of the 2017 International Symposium on Code Generation and Optimization, 2017

High-Performance Monte Carlo Simulations for Photon Migration and Applications in Optical Brain Functional Imaging.

Leiming Yu

Qianqian Fang

Proceedings of the Handbook of Large-Scale Distributed Computing in Smart Healthcare, 2017

2016

UMH: A Hardware-Based Unified Memory Hierarchy for Systems with Multiple Discrete GPUs.

ACM Trans. Archit. Code Optim., 2016

21st Century Computer Architecture.

CoRR, 2016

A Fast Level-Set Segmentation Algorithm for Image Processing Designed For Parallel Architectures.

Julian Gutierrez

Proceedings of the 6th Workshop on Irregular Applications: Architecture and Algorithms, 2016

A comprehensive performance analysis of HSA and OpenCL 2.0.

Proceedings of the 2016 IEEE International Symposium on Performance Analysis of Systems and Software, 2016

Mystic: Predictive Scheduling for GPU Based Cloud Servers Using Machine Learning.

Xiangyu Li

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Balancing Scalar and Vector Execution on GPU Architectures.

Zhongliang Chen

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Hetero-mark, a benchmark suite for CPU-GPU collaborative computing.

Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016

Hardware thread reordering to boost OpenCL throughput on FPGAs.

Proceedings of the 34th IEEE International Conference on Computer Design, 2016

A complete key recovery timing attack on a GPU.

Zhen Hang Jiang

Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Modeling player decisions in a supply chain game.

Proceedings of the IEEE Conference on Computational Intelligence and Games, 2016

2015

Exploring the Efficiency of the OpenCL Pipe Semantic on an FPGA.

SIGARCH Comput. Archit. News, 2015

A reuse-based refresh policy for energy-aware eDRAM caches.

Microprocess. Microsystems, 2015

Side-Channel Analysis of MAC-Keccak Hardware Implementations.

IACR Cryptol. ePrint Arch., 2015

NUPAR: A Benchmark Suite for Modern GPU Architectures.

Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering, Austin, TX, USA, January 31, 2015

Visualization of OpenCL application execution on CPU-GPU systems.

Proceedings of the Workshop on Computer Architecture Education, 2015

Engaging sophomores in embedded design using robotics.

Amir Momeni

Fritz Previlon

Agamemnon Despopoulos

Gunar Schirner

John Kimani

Proceedings of the Workshop on Computer Architecture Education, 2015

Field, experimental, and analytical data on large-scale HPC systems and evaluation of the implications for exascale system design.

Proceedings of the 33rd IEEE VLSI Test Symposium, 2015

High performance computing of fiber scattering simulation.

Proceedings of the 8th Workshop on General Purpose Processing using GPUs, 2015

Asymmetric NoC Architectures for GPU Systems.

Proceedings of the 9th International Symposium on Networks-on-Chip, 2015

Securing virtual execution environments through machine learning-based intrusion detection.

Proceedings of the 25th IEEE International Workshop on Machine Learning for Signal Processing, 2015

A framework for visualization of OpenCL applications execution: a tutorial.

Proceedings of the 3rd International Workshop on OpenCL, 2015

Exploring the features of OpenCL 2.0.

Proceedings of the 3rd International Workshop on OpenCL, 2015

Leveraging Silicon-Photonic NoC for Designing Scalable GPUs.

Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

Side-channel power analysis of a GPU AES implementation.

Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

Bridging Architecture and Programming for Throughput-Oriented Vision Processing (Abstract Only).

Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

Performance of the NVIDIA Jetson TK1 in HPC.

Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

2014

Harnessing the Power of GPUs to Speed Up Feature Selection for Outlier Detection.

J. Comput. Sci. Technol., 2014

Aggressive Value Prediction on a GPU.

Enqiang Sun

Int. J. Parallel Program., 2014

Analyzing power efficiency of optimization techniques and algorithm design methods for applications on heterogeneous platforms.

Int. J. High Perform. Comput. Appl., 2014

Power Analysis Attack on Hardware Implementation of MAC-Keccak on FPGAs.

IACR Cryptol. ePrint Arch., 2014

System Call Anomaly Detection Using Multi-HMMs.

Esra N. Yolacan

Jennifer G. Dy

Proceedings of the IEEE Eighth International Conference on Software Security and Reliability, 2014

Runtime Support for Adaptive Spatial Partitioning and Inter-Kernel Communication on GPUs.

Proceedings of the 26th IEEE International Symposium on Computer Architecture and High Performance Computing, 2014

Calculating Architectural Vulnerability Factors for Spatial Multi-Bit Transient Faults.

Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

A parallel clustering algorithm for placement.

Amir Momeni

Perhaad Mistry

Proceedings of the Fifteenth International Symposium on Quality Electronic Design, 2014

Scalable and efficient implementation of correlation power analysis using graphics processing units (GPUs).

Proceedings of the HASP 2014, 2014

Scalar Waving: Improving the Efficiency of SIMD Execution on GPUs.

Ayse Yilmazer

Zhongliang Chen

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

GPU-Accelerated HMM for Speech Recognition.

Leiming Yu

Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

Accelerated Connected Component Labeling Using CUDA Framework.

Proceedings of the Computer Vision and Graphics - International Conference, 2014

Exploring the Heterogeneous Design Space for both Performance and Reliability.

Proceedings of the 51st Annual Design Automation Conference 2014, 2014

Performance Evaluation and Optimization Mechanisms for Inter-operable Graphics and Computation on GPUs.

Xiang Gong

Proceedings of the Seventh Workshop on General Purpose Processing Using GPUs, 2014

Fast Fourier Transform (FFT) on GPUs.

Gunar Schirner

Proceedings of the Numerical Computations with GPUs, 2014

2013

Quantifying the energy efficiency of FFT on heterogeneous platforms.

Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

Characterizing scalar opportunities in GPGPU applications.

Zhongliang Chen

Norman Rubin

Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

HQL: A Scalable Synchronization Mechanism for GPUs.

Ayse Yilmazer

Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Analyzing Optimization Techniques for Power Efficiency on Heterogeneous Platforms.

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Unstructured Control Flow in GPGPU.

Rodrigo Dominguez

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Datacenters as Controllable Load Resources in the Electricity Market.

Proceedings of the IEEE 33rd International Conference on Distributed Computing Systems, 2013

Architecture-Independent Dynamic Information Flow Tracking.

Ryan Whelan

Tim Leek

Proceedings of the Compiler Construction - 22nd International Conference, 2013

Valar: a benchmark suite to study the dynamic behavior of heterogeneous systems.

Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, 2013

Heterogeneous Computing with OpenCL - Revised OpenCL 1.2 Edition.

Morgan Kaufmann, ISBN: 978-0-12-405894-1, 2013

2012

A Sequentially Consistent Multiprocessor Architecture for Out-of-Order Retirement of Instructions.

IEEE Trans. Parallel Distributed Syst., 2012

Local Kernel Density Ratio-Based Feature Selection for Outlier Detection.

Proceedings of the 4th Asian Conference on Machine Learning, 2012

Dione: A Flexible Disk Monitoring and Analysis Framework.

Jennifer Mankin

Dimitrios S. Nikolopoulos

Proceedings of the Research in Attacks, Intrusions, and Defenses, 2012

GPU-Accelerated Feature Selection for Outlier Detection Using the Local Kernel Density Ratio.

Proceedings of the 12th IEEE International Conference on Data Mining, 2012

Feature Weighting and Selection Using Hypothesis Margin of Boosting.

Proceedings of the 12th IEEE International Conference on Data Mining, 2012

Topic 16: GPU and Accelerators Computing.

Alex Ramírez

Satoshi Matsuoka

Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

Enabling task-level scheduling on heterogeneous platforms.

Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units, 2012

Multi2Sim: a simulation framework for CPU-GPU computing.

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011

Exploiting Memory Access Patterns to Improve Memory Performance in Data-Parallel Architectures.

IEEE Trans. Parallel Distributed Syst., 2011

Guest Editor's Introduction: Special Issue on High-Performance Computing with Accelerators.

David A. Bader

Volodymyr V. Kindratenko

IEEE Trans. Parallel Distributed Syst., 2011

Accelerating an Imaging Spectroscopy Algorithm for Submerged Marine Environments Using Graphics Processing Units.

James A. Goodman

Dana Schaa

IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2011

Virtual machine monitor-based lightweight intrusion detection.

ACM SIGOPS Oper. Syst. Rev., 2011

Workload Characterization at the Virtualization Layer.

Proceedings of the MASCOTS 2011, 2011

A Novel Feature Selection for Intrusion Detection in Virtual Machine Environments.

Proceedings of the IEEE 23rd International Conference on Tools with Artificial Intelligence, 2011

Feature Selection Metric Using AUC Margin for Small Samples and Imbalanced Data Classification Problems.

Proceedings of the 10th International Conference on Machine Learning and Applications and Workshops, 2011

The convergence of HPC and embedded systems in our heterogeneous computing future.

David Akodes

Proceedings of the IEEE 29th International Conference on Computer Design, 2011

Increasing power/performance resource efficiency on virtualized enterprise servers.

Emmanuel Arzuaga

Proceedings of the 8th Conference on Computing Frontiers, 2011

Analyzing program flow within a many-kernel OpenCL application.

Proceedings of 4th Workshop on General Purpose Processing on Graphics Processing Units, 2011

Caracal: dynamic translation of runtime environments for GPUs.

Rodrigo Dominguez

Dana Schaa

Proceedings of 4th Workshop on General Purpose Processing on Graphics Processing Units, 2011

2010

Quantifying load imbalance on virtualized enterprise servers.

Emmanuel Arzuaga

Proceedings of the first joint WOSP/SIPEW International Conference on Performance Engineering, 2010

Data Structures and Transformations for Physically Based Simulation on a GPU.

Proceedings of the High Performance Computing for Computational Science - VECPAR 2010, 2010

Toward Whole-System Dynamic Analysis for ARM-Based Mobile Devices.

Ryan Whelan

Proceedings of the Recent Advances in Intrusion Detection, 13th International Symposium, 2010

Data transformations enabling loop vectorization on multithreaded data parallel architectures.

Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

Using hardware vulnerability factors to enhance AVF analysis.

Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

Effective Virtual Machine Monitor Intrusion Detection Using Feature Selection on Highly Imbalanced Data.

Proceedings of the Ninth International Conference on Machine Learning and Applications, 2010

Out-of-order retirement of instructions in sequentially consistent multiprocessors.

Proceedings of the 28th International Conference on Computer Design, 2010

Accelerating the local outlier factor algorithm on a GPU for intrusion detection systems.

Malak Alshawabkeh

Byunghyun Jang

Proceedings of 3rd Workshop on General Purpose Processing on Graphics Processing Units, 2010

2009

AGAMOS: A Graph-Based Approach to Modulo Scheduling for Clustered Microarchitectures.

IEEE Trans. Computers, 2009

Obtaining FPGA soft error rate in high performance information systems.

Brian Mullins

Microelectron. Reliab., 2009

Software transactional memory for multicore embedded systems.

Jennifer Mankin

John Ardini

Proceedings of the 2009 ACM SIGPLAN/SIGBED Conference on Languages, 2009

Profile-Guided Optimization of Critical Medical Imaging Algorithms.

Proceedings of the 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, Boston, MA, USA, June 28, 2009

Multi GPU Implementation of Iterative Tomographic Reconstruction Algorithms.

Proceedings of the 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, Boston, MA, USA, June 28, 2009

Exploring the multiple-GPU design space.

Dana Schaa

Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Eliminating microarchitectural dependency from Architectural Vulnerability.

Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

Accelerating phase unwrapping and affine transformations for optical quadrature microscopy using CUDA.

Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, 2009

Architecture-aware optimization targeting multithreaded stream computing.

Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, 2009

2008

Acknowledgment to special issue reviewers.

Miriam Leeser

J. Parallel Distributed Comput., 2008

Special issue: General-purpose processing using graphics processing units.

Miriam Leeser

J. Parallel Distributed Comput., 2008

Interactive Deformable Registration Visualization and Analysis of 4D Computed Tomography.

Proceedings of the Medical Biometrics, First International Conference, 2008

A Field Analysis of System-level Effects of Soft Errors Occurring in Microprocessors used in Information Systems.

Syed Zafar Shazli

Mohammed A. Abdul-Aziz

Proceedings of the 2008 IEEE International Test Conference, 2008

Archer: A Community Distributed Computing Infrastructure for Computer Architecture Research and Education.

Renato J. O. Figueiredo

Proceedings of the Collaborative Computing: Networking, 2008

Applying Spectral Analysis to Identify Individual Application Signatures.

George Smirnov

Kenneth Hu

Proceedings of the 34th International Computer Measurement Group Conference, 2008

Quantifying software vulnerability.

Proceedings of the 5th Conference on Computing Frontiers, 2008

2007

Power Aware External Bus Arbitration for System-on-a-Chip Embedded Systems.

Ke Ning

Trans. High Perform. Embed. Archit. Compil., 2007

Characterization of file I/O activity for SPEC CPU2006.

Dong Ye

Joydeep Ray

SIGARCH Comput. Archit. News, 2007

Case Study: Soft Error Rate Analysis in Storage Systems.

Brian Mullins

Proceedings of the 25th IEEE VLSI Test Symposium (VTS 2007), 2007

Exploring Novel Parallelization Technologies for 3-D Imaging Applications.

Proceedings of the 19th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2007), 2007

Stream Image Processing on a Dual-Core Embedded System.

Michael G. Benjamin

Proceedings of the Embedded Computer Systems: Architectures, 2007

External memory page remapping for embedded multimedia systems.

Ke Ning

Proceedings of the 2007 ACM SIGPLAN/SIGBED Conference on Languages, 2007

Heterogeneous Clustered VLIW Microarchitectures.

Proceedings of the Fifth International Symposium on Code Generation and Optimization (CGO 2007), 2007

2006

Addressing a workload characterization study to the design of consistency protocols.

J. Supercomput., 2006

Reducing Data Cache Susceptibility to Soft Errors.

IEEE Trans. Dependable Secur. Comput., 2006

An adjustable linear time parallel algorithm for maximum weight bipartite matching.

Waleed Meleis

Inf. Process. Lett., 2006

Experiences with the Blackfin architecture in an embedded systems lab.

Michael G. Benjamin

Richard Platcow

Proceedings of the 2006 Workshop on Computer Architecture Education, 2006

Performance Characterization of SPEC CPU2006 Integer Benchmarks on x86-64 Architecture.

Proceedings of the 2006 IEEE International Symposium on Workload Characterization, 2006

Acceleration of Maximum Likelihood Estimation for Tomosynthesis Mammography.

Proceedings of the 12th International Conference on Parallel and Distributed Systems, 2006

Vulnerability analysis of L2 cache elements to single event upsets.

Proceedings of the Conference on Design, Automation and Test in Europe, 2006

Hunting Trojan Horses.

Proceedings of the 1st Workshop on Architectural and System Support for Improving Software Dependability, 2006

2005

A reliable return address stack: microarchitectural features to defeat stack smashing.

Dong Ye

SIGARCH Comput. Archit. News, 2005

Characterizing antivirus workload execution.

Derek Uluski

Micha Moffie

SIGARCH Comput. Archit. News, 2005

ASM: application security monitor.

Micha Moffie

SIGARCH Comput. Archit. News, 2005

Introduction to the special issue.

Robert Cohn

SIGARCH Comput. Archit. News, 2005

Subsequence Matching on Structured Time Series Data.

Proceedings of the ACM SIGMOD International Conference on Management of Data, 2005

Demystifying on-the-fly spill code.

Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, 2005

A multinomial clustering model for fast simulation of computer architecture designs.

[BibT_eX]

[DOI]

Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2005

Balancing Performance and Reliability in the Memory Hierarchy.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005

Load Balancing using Grid-based Peer-to-Peer Parallel I/O.

[BibT_eX]

[DOI]

Proceedings of the 2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26, 2005

Exploiting temporal locality in drowsy cache policies.

[BibT_eX]

[DOI]

Proceedings of the Second Conference on Computing Frontiers, 2005

2004

Removing communications in clustered microarchitectures through instruction replication.

[BibT_eX]

[DOI]

ACM Trans. Archit. Code Optim., 2004

Developing object-oriented parallel iterative methods.

[BibT_eX]

[DOI]

Chakib Ouarraui

Int. J. High Perform. Comput. Netw., 2004

Characterizing the Dynamic Behavior of Workload Execution in SVM systems.

[BibT_eX]

[DOI]

Proceedings of the 16th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2004), 2004

A Study of Errant Pipeline Flushes Caused by Value Misspeculation.

[BibT_eX]

[DOI]

Deniz Balkan

Proceedings of the 16th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2004), 2004

Bus Power Estimation and Power-Efficient Bus Arbitration for System-on-a-Chip Embedded Systems.

[BibT_eX]

[DOI]

Ke Ning

Proceedings of Power-Aware Computer Systems, 4th International Workshop, 2004

Execution-Driven Simulation of Network Storage Systems.

[BibT_eX]

[DOI]

Proceedings of the 12th International Workshop on Modeling, 2004

Parallel Maximum Weight Bipartite Matching Algorithms for Scheduling in Input-Queued Switches.

[BibT_eX]

[DOI]

Luis O. Jimenez-Rodriguez

Waleed Meleis

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

A MATLAB toolbox for Hyperspectral Image Analysis.

[BibT_eX]

[DOI]

Emmanuel Arzuaga-Cruz

Miguel Velez-Reyes

Hector T. Velazquez-Santana

Eladio Rodriguez-Diaz

Alexey Castrodad-Carrau

Laura E. Santos-Campis

Cesar Santiago

Proceedings of the 2004 IEEE International Geoscience and Remote Sensing Symposium, 2004

Bi-Criteria Models for All-Uses Test Suite Reduction.

[BibT_eX]

[DOI]

Jennifer Black

Emanuel Melachrinoudis

Proceedings of the 26th International Conference on Software Engineering (ICSE 2004), 2004

2003

Realizing high IPC through a scalable memory-latency tolerant multipath microarchitecture.

[BibT_eX]

[DOI]

SIGARCH Comput. Archit. News, 2003

Levo - A Scalable Processor With High IPC.

[BibT_eX]

[DOI]

J. Instr. Level Parallelism, 2003

The CenSSIS Image Database.

[BibT_eX]

[DOI]

Proceedings of the 15th International Conference on Scientific and Statistical Database Management (SSDBM 2003), 2003

Source level transformations to improve I/O data partitioning.

[BibT_eX]

[DOI]

Proceedings of the International Workshop on Storage Network Architecture and Parallel I/Os, 2003

Dynamic Input Buffer Allocation (DIBA) for Fault Tolerant Ethernet Packet Switching.

[BibT_eX]

Zainalabedin Navabi

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2003

Instruction Replication for Clustered Microarchitectures.

[BibT_eX]

[DOI]

Proceedings of the 36th Annual International Symposium on Microarchitecture, 2003

Profile-guided I/O partitioning.

[BibT_eX]

[DOI]

Proceedings of the 17th Annual International Conference on Supercomputing, 2003

2002

Localized Message Passing Structure for High Speed Ethernet Packet Switching.

[BibT_eX]

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2002

Realizing High IPC Using Time-Tagged Resource-Flow Computing.

[BibT_eX]

[DOI]

Proceedings of Euro-Par 2002, 2002

Exploiting Pseudo-Schedules to Guide Data Dependence Graph Partitioning.

[BibT_eX]

[DOI]

Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques (PACT 2002), 2002

2001

Introduction to the Special Section on High Performance Memory Systems.

[BibT_eX]

[DOI]

Haldun Hadimioglu

Fabrizio Lombardi

IEEE Trans. Computers, 2001

Workshop on binary translation - 2001.

[BibT_eX]

[DOI]

Erik R. Altman

SIGARCH Comput. Archit. News, 2001

WBT-2000: workshop on binary translation - 2000.

[BibT_eX]

[DOI]

Erik R. Altman

SIGARCH Comput. Archit. News, 2001

2000

Using cache line coloring to perform aggressive procedure inlining.

[BibT_eX]

[DOI]

Hakan Aydin

SIGARCH Comput. Archit. News, 2000

Welcome to the Opportunities of Binary Translation.

[BibT_eX]

[DOI]

Erik R. Altman

Yaron Sheffer

Computer, 2000

Learning outside of the classroom: the Northeastern University research co-op fellowship program.

[BibT_eX]

[DOI]

Gabby Yi

Ellen Duwart

Proceedings of the 2000 workshop on Computer architecture education, 2000

Accurate simulation and evaluation of code reordering.

[BibT_eX]

[DOI]

Proceedings of the 2000 IEEE International Symposium on Performance Analysis of Systems and Software, 2000

DSPTune: A Performance Evaluation Toolset for the SHARC Signal Processor.

[BibT_eX]

[DOI]

Proceedings of the 33rd Annual Simulation Symposium (SS 2000), 2000

1999

Analysis of Temporal-Based Program Behavior for Improved Instruction Cache Performance.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 1999

Improving the accuracy of indirect branch prediction via branch classification.

[BibT_eX]

[DOI]

SIGARCH Comput. Archit. News, 1999

Branch-directed and pointer-based data cache prefetching.

[BibT_eX]

[DOI]

Yue Liu

Mona Dimitri

J. Syst. Archit., 1999

Indirect Branch Prediction Using Data Compression Techniques.

[BibT_eX]

[DOI]

J. Instr. Level Parallelism, 1999

Fifth Annual Workshop on Computer Education.

[BibT_eX]

[DOI]

Bruce L. Jacob

Proceedings of the Fifth International Symposium on High-Performance Computer Architecture, 1999

1998

VLSI design in the 3rd dimension.

[BibT_eX]

[DOI]

Integr., 1998

Tracing and Characterization of Windows NT-based System Workloads.

[BibT_eX]

[DOI]

Jason P. Casmira

David P. Hunter

Digit. Tech. J., 1998

Predicting Indirect Branches via Data Compression.

[BibT_eX]

[DOI]

Proceedings of the 31st Annual IEEE/ACM International Symposium on Microarchitecture, 1998

Temporal-Based Procedure Reordering for Improved Instruction Cache Performance.

[BibT_eX]

[DOI]

Proceedings of the Fourth International Symposium on High-Performance Computer Architecture, Las Vegas, Nevada, USA, January 31, 1998

Operating System Impact on Trace-Driven Simulation.

[BibT_eX]

[DOI]

Proceedings of the 31st Annual Simulation Symposium (SS '98), 1998

1997

Improving the Accuracy of History Based Branch Prediction.

[BibT_eX]

[DOI]

Philip G. Emma

IEEE Trans. Computers, 1997

Performance analysis on a CC-NUMA prototype.

[BibT_eX]

[DOI]

IBM J. Res. Dev., 1997

Operating-system level tracing tools for the DEC AXP architecture.

[BibT_eX]

[DOI]

Jason P. Casmira

John Fraser

Proceedings of the 1997 workshop on Computer architecture education, 1997

Efficient Procedure Mapping Using Cache Line Coloring.

[BibT_eX]

[DOI]

Amir H. Hashemi

Brad Calder

Proceedings of the ACM SIGPLAN '97 Conference on Programming Language Design and Implementation (PLDI), 1997

Analytic Models of Workload Behavior and Pipeline Performance.

[BibT_eX]

[DOI]

Mark S. Squillante

Himanshu Sinha

Proceedings of MASCOTS 1997, 1997

Digital Computer Architecture.

[BibT_eX]

Proceedings of the Computer Science and Engineering Handbook, 1997

1996

A discussion on non-blocking/lockup-free caches.

[BibT_eX]

[DOI]

Samson Belayneh

SIGARCH Comput. Archit. News, 1996

Real-Time Trace Generation.

[BibT_eX]

Int. J. Comput. Simul., 1996

Improving Multiprocessor Scalability Using Lockup Free Caches.

[BibT_eX]

Samson Belayneh

H. Sinha

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1996

Branch-Directed and Stride-Based Data Cache Prefetching.

[BibT_eX]

[DOI]

Yue Liu

Proceedings of the 1996 International Conference on Computer Design (ICCD '96), 1996

Performance Modeling Using Object-Oriented Execution-Driven Simulation.

[BibT_eX]

[DOI]

Christopher J. Sniezek

Proceedings of the 29th Annual Simulation Symposium (SS '96), 1996

The DLX instruction set architecture handbook.

[BibT_eX]

Philip M. Sailor

Morgan Kaufmann, ISBN: 978-1-55860-371-4, 1996

1995

Combining object-oriented design and computer architecture into a single senior-level course.

[BibT_eX]

[DOI]

Proceedings of the 1995 Workshop on Computer Architecture Education, 1995

Scalable Performance on a Distributed Shared-Memory Machine.

[BibT_eX]

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1995

1993

Issues in Trace-Driven Simulation.

[BibT_eX]

[DOI]

Proceedings of Performance Evaluation of Computer and Communication Systems, 1993

1992

Contrasting instruction-fetch time and instruction-decode time branch prediction mechanisms: Achieving synergy through their cooperative operation.

[BibT_eX]

[DOI]

Microprocess. Microprogramming, 1992

1991

A Study of 80X86/80X87 Floating-Point Execution.

[BibT_eX]

[DOI]

Zoran Mijanic

Proceedings of the 1991 ACM SIGSMALL/PC Symposium on Small Systems, 1991

Branch History Table Prediction of Moving Target Branches due to Subroutine Returns.

[BibT_eX]

[DOI]

Philip G. Emma

Proceedings of the 18th Annual International Symposium on Computer Architecture. Toronto, 1991

1989

PC Workload Characterization.

[BibT_eX]