Hans Vandierendonck

Deepu John

Bo Jin

CoRR, 2024

Selective Parallel Loading of Large-Scale Compressed Graphs with ParaGrapher.

CoRR, 2024

The Effects of Weight Quantization on Online Federated Learning for the IoT: A Case Study.

Nil Llisterri Giménez

Felix Freitag

IEEE Access, 2024

Differentiating Set Intersections in Maximal Clique Enumeration by Function and Subproblem Size.

Proceedings of the 38th ACM International Conference on Supercomputing, 2024

QClique: Optimizing Performance and Accuracy in Maximum Weighted Clique.

Qasim Abbas

Ian M. Overton

Proceedings of the Euro-Par 2024: Parallel Processing, 2024

Exploiting Data Redundancy in CKKS Encoding for High-Speed Homomorphic Encryption.

Proceedings of the 19th ACM Asia Conference on Computer and Communications Security, 2024

2023

Resource-Efficient Convolutional Networks: A Survey on Model-, Arithmetic-, and Implementation-Level Techniques.

Jesús Martínez del Rincón

Yang Hua

Kiril Dichev

Cheol-Ho Hong

ACM Comput. Surv., 2023

MS-BioGraphs: Sequence Similarity Graph Datasets.

CoRR, 2023

ROMA: Run-Time Object Detection To Maximize Real-Time Accuracy.

Blesson Varghese

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Decentralised Biomedical Signal Classification using Early Exits.

Li Xiaolin

Bo Jin

Barry Cardiff

Deepu John

Proceedings of the 21st IEEE Interregional NEWCAS Conference, 2023

Dataset Announcement: MS-BioGraphs, Trillion-Scale Public Real-World Sequence Similarity Graphs.

Proceedings of the IEEE International Symposium on Workload Characterization, 2023

On Overcoming HPC Challenges of Trillion-Scale Real-World Graph Datasets.

Proceedings of the IEEE International Conference on Big Data, 2023

2022

Increased Leverage of Transprecision Computing for Machine Vision Applications at the Edge.

Roger F. Woods

J. Signal Process. Syst., 2022

Mixed-Precision Kernel Recursive Least Squares.

IEEE Trans. Neural Networks Learn. Syst., 2022

Model-Agnostic Counterfactual Explanations in Credit Scoring.

Xolani Dastile

Turgay Çelik

IEEE Access, 2022

LOTUS: locality optimizing triangle counting.

Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022

Comparison of Two Microcontroller Boards for On-Device Model Training in a Keyword Spotting Task.

Nil Llisterri Giménez

Felix Freitag

Proceedings of the 11th Mediterranean Conference on Embedded Computing, 2022

SAPCo Sort: optimizing Degree-Ordering for Power-Law Graphs.

Proceedings of the International IEEE Symposium on Performance Analysis of Systems and Software, 2022

Software-defined floating-point number formats and their application to graph processing.

Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

MASTIFF: structure-aware minimum spanning tree/forest.

Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

Dengue Fever: From Extreme Climates to Outbreak Prediction.

Proceedings of the IEEE International Conference on Data Mining, 2022

Low-Precision Floating-Point Formats: From General-Purpose to Application-Specific.

Azadeh Alsadat Emrani Zarandi

Leonel Sousa

Proceedings of the Approximate Computing, 2022

2021

Towards Lower Precision Adaptive Filters: Facts From Backward Error Analysis of RLS.

IEEE Trans. Signal Process., 2021

Revealing DRAM Operating GuardBands Through Workload-Aware Error Predictive Modeling.

Konstantinos Tovletoglou

IEEE Trans. Computers, 2021

Resource-Efficient Deep Learning: A Survey on Model-, Arithmetic-, and Implementation-Level Techniques.

Jesús Martínez del Rincón

Yang Hua

Kiril Dichev

Cheol-Ho Hong

CoRR, 2021

Reducing the burden of parallel loop schedulers for many-core processors.

Concurr. Comput. Pract. Exp., 2021

Leveraging Transprecision Computing for Machine Vision Applications at the Edge.

Roger F. Woods

Proceedings of the IEEE Workshop on Signal Processing Systems, 2021

How Do Graph Relabeling Algorithms Improve Memory Locality?

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021

Locality Analysis of Graph Reordering Algorithms.

Proceedings of the IEEE International Symposium on Workload Characterization, 2021

Exploiting in-Hub Temporal Locality in SpMV-based Graph Processing.

Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

TOD: Transprecise Object Detection to Maximise Real-Time Accuracy on the Edge.

Proceedings of the 5th IEEE International Conference on Fog and Edge Computing, 2021

Achieving Scalable Consensus by Being Less Writey.

Michael Davis

Proceedings of the HPDC '21: The 30th International Symposium on High-Performance Parallel and Distributed Computing, 2021

Thrifty Label Propagation: Fast Connected Components for Skewed-Degree Graphs.

Dimitrios Christofidellis

Proceedings of the IEEE International Conference on Cluster Computing, 2021

Understood in Translation: Transformers for Domain Understanding.

Matteo Manica

Leonidas Georgopoulos

Proceedings of the Workshop on Scientific Document Understanding co-located with 35th AAAI Conference on Artificial Intelligence, 2021

2020

AIR: Iterative refinement acceleration using arbitrary dynamic precision.

Gregory D. Peterson

Parallel Comput., 2020

Fast load balance parallel graph analytics with an automatic graph data structure selection algorithm.

Future Gener. Comput. Syst., 2020

Graptor: efficient pull and push style vectorized graph processing.

Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

Half-Precision Floating-Point Formats for PageRank: Opportunities and Challenges.

Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

2019

Hyperqueues: Design and Implementation of Deterministic Concurrent Queues.

ACM Trans. Parallel Comput., 2019

Fast and Energy-Efficient OLAP Data Management on Hybrid Main Memory Systems.

IEEE Trans. Computers, 2019

VEBO: a vertex- and edge-balanced ordering heuristic to load balance parallel graph processing.

Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019

SAFIRE: Scalable and Accurate Fault Injection for Parallel Multithreaded Applications.

Ignacio Laguna

Martin Schulz

Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

Workload-Aware DRAM Error Prediction using Machine Learning.

Konstantinos Tovletoglou

Proceedings of the IEEE International Symposium on Workload Characterization, 2019

Stream-Based Representation and Incremental optimization of Technical Market Indicators.

Konstantin Bakanov

Ivor T. A. Spence

Proceedings of the 17th International Conference on High Performance Computing & Simulation, 2019

2018

DARE.

Konstantinos Tovletoglou

Int. J. High Perform. Comput. Appl., 2018

Energy-Efficient Iterative Refinement Using Dynamic Precision.

Sharatchandra Varma Bogaraju

IEEE J. Emerg. Sel. Topics Circuits Syst., 2018

The VINEYARD integrated framework for hardware accelerators in the cloud.

Proceedings of the 18th International Conference on Embedded Computer Systems: Architectures, 2018

Userspace Hypervisor Data Characterization in Virtualized Environment.

Bin Wang

Proceedings of the 24th IEEE International Conference on Parallel and Distributed Systems, 2018

Code and Data Transformations to Address Garbage Collector Performance in Big Data Processing.

Damon Fenacci

Proceedings of the 25th IEEE International Conference on High Performance Computing, 2018

An energy-efficient and error-resilient server ecosystem exceeding conservative scaling limits.

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

2017

SCALO: Scalability-Aware Parallelism Orchestration for Multi-Threaded Workloads.

Peter Thoman

Bronis R. de Supinski

Thomas Fahringer

ACM Trans. Archit. Code Optim., 2017

GraphGrind: addressing load imbalance of graph partitioning.

Proceedings of the International Conference on Supercomputing, 2017

Accelerating Graph Analytics by Utilising the Memory Locality of Graph Partitioning.

Proceedings of the 46th International Conference on Parallel Processing, 2017

2016

Exploiting Significance of Computations for Energy-Constrained Approximate Computing.

Vassilis Vassiliadis

Konstantinos Parasyris

Int. J. Parallel Program., 2016

Energy Optimization of Memory Intensive Parallel workloads.

Chhaya Trehan

CoRR, 2016

Brief Announcement: Energy Optimization of Memory Intensive Parallel Workloads.

Chhaya Trehan

Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, 2016

NanoStreams: Codesigned microservers for edge analytics in real time.

Proceedings of the International Conference on Embedded Computer Systems: Architectures, 2016

Operator and Workflow Optimization for High-Performance Analytics.

Proceedings of the Workshops of the EDBT/ICDT 2016 Joint Conference, 2016

HPTA: High-performance text analytics.

Karen L. Murphy

Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

A scalable and composable map-reduce system.

Bronis R. de Supinski

Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

2015

On the potential of significance-driven execution for energy-aware HPC.

Philipp Gschwandtner

Thomas Fahringer

Comput. Sci. Res. Dev., 2015

On the Energy-Efficiency of Byte-Addressable Non-Volatile Memory.

IEEE Comput. Archit. Lett., 2015

A programming model and runtime system for significance-aware energy-efficient computing.

Vassilis Vassiliadis

Konstantinos Parasyris

Charalambos Chalios

Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

A Case Study of OpenMP Applied to Map/Reduce-Style Computations.

Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Energy-Efficient In-Memory Data Stores on Hybrid Memory Hierarchies.

Proceedings of the 11th International Workshop on Data Management on New Hardware, 2015

A significance-driven programming framework for energy-constrained approximate computing.

Vassilis Vassiliadis

Konstantinos Parasyris

Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

Software-managed energy-efficient hybrid DRAM/NVM main memory.

Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

Energy-Efficient Hybrid DRAM/NVM Main Memory.

Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

2014

Energy Efficiency through Significance-Based Computing.

Andreas Burg

Uwe Naumann

Computer, 2014

Fast Dynamic Binary Rewriting for flexible thread migration on shared-ISA heterogeneous MPSoCs.

Proceedings of the XIVth International Conference on Embedded Computer Systems: Architectures, 2014

Rigorous specification and low-latency implementation of technical market indicators.

Proceedings of the first workshop on Parallel programming for analytics applications, 2014

2013

Analysis of dependence tracking algorithms for task dataflow execution.

ACM Trans. Archit. Code Optim., 2013

Deterministic scale-free pipeline parallelism with hyperqueues.

Kallia Chronaki

Proceedings of the International Conference for High Performance Computing, 2013

BDDT: Block-Level Dynamic Dependence Analysis for Task-Based Parallelism.

Angelos Papatriantafyllou

Proceedings of the Advanced Parallel Processing Technologies, 2013

2012

Techniques and Tools for Parallelizing Software.

Tom Mens

IEEE Softw., 2012

BDDT: block-level dynamic dependence analysis for deterministic task-based parallelism.

Angelos Papatriantafyllou

John Kesapides

Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

2011

Managing SMT resource usage through speculative instruction window weighting.

ACM Trans. Archit. Code Optim., 2011

Averting the Next Software Crisis.

Tom Mens

Computer, 2011

Fairness Metrics for Multi-Threaded Processors.

IEEE Comput. Archit. Lett., 2011

A programming model for deterministic task parallelism.

Spyros Lyberis

Proceedings of the 2011 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '11, 2011

Parallel Programming of General-Purpose Programs Using Task-Based Programming Models.

Proceedings of the 3rd USENIX Workshop on Hot Topics in Parallelism, 2011

A Unified Scheduler for Recursive and Task Dataflow Parallelism.

Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010

A profile-based tool for finding pipeline parallelism in sequential programs.

Parallel Comput., 2010

Accelerating Multiple Sequence Alignment with the Cell BE Processor.

Comput. J., 2010

Implicit hints: Embedding hint bits in programs without ISA changes.

Proceedings of the 28th International Conference on Computer Design, 2010

A methodology for precise comparisons of processor core architectures for homogeneous many-core DSP platforms.

Proceedings of the 2010 Conference on Design & Architectures for Signal & Image Processing, 2010

The Paralax infrastructure: automatic parallelization with a helping hand.

Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2009

Fetch Gating Control through Speculative Instruction Window Weighting.

Trans. High Perform. Embed. Archit. Compil., 2009

Towards automatic program partitioning.

Proceedings of the 6th Conference on Computing Frontiers, 2009

2008

Speculative return address stack management revisited.

ACM Trans. Archit. Code Optim., 2008

Behavior-Based Branch Prediction by Dynamically Clustering Branch Instructions.

Veerle Desmet

J. Inf. Sci. Eng., 2008

Extracting coarse-grain parallelism in general-purpose programs.

Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008

Experiences with Parallelizing a Bio-informatics Program on the Cell BE.

Proceedings of the High Performance Embedded Architectures and Compilers, 2008

Constructing Optimal XOR-Functions to Minimize Cache Conflict Misses.

Proceedings of the Architecture of Computing Systems, 2008

2007

Function level parallelism driven by data dependencies.

SIGARCH Comput. Archit. News, 2007

Clustered indexing for branch predictors.

Veerle Desmet

Microprocess. Microsystems, 2007

By-passing the out-of-order execution pipeline to increase energy-efficiency.

Proceedings of the 4th Conference on Computing Frontiers, 2007

2006

Building and Validating a Reduced TPC-H Benchmark.

Pedro Trancoso

Proceedings of the 14th International Symposium on Modeling, 2006

On the Impact of OS and Linker Effects on Level-2 Cache Performance.

Proceedings of the 14th International Symposium on Modeling, 2006

The exigency of benchmark and compiler drift: designing tomorrow's processors with yesterday's tools.

Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Application-specific reconfigurable XOR-indexing to eliminate cache conflict misses.

Philippe Manet

Jean-Didier Legat

Proceedings of the Conference on Design, Automation and Test in Europe, 2006

2005

XOR-Based Hash Functions.

IEEE Trans. Computers, 2005

2FAR: A 2bcgskew Predictor Fused by an Alloyed Redundant History Skewed Perceptron Branch Predictor.

Veerle Desmet

J. Instr. Level Parallelism, 2005

Reducing TPC-H Benchmarking Time.

Pedro Trancoso

Christodoulos Adamou

Proceedings of the Advances in Informatics, 2005

2004

On Generating Set Index Functions for Randomized Caches.

Comput. J., 2004

Eccentric and fragile benchmarks.

Proceedings of the 2004 IEEE International Symposium on Performance Analysis of Systems and Software, 2004

2003

Highly accurate and efficient evaluation of randomising set index functions.

J. Syst. Archit., 2003

Quantifying the Impact of Input Data Sets on Program Behavior and its Applications.

Lieven Eeckhout

J. Instr. Level Parallelism, 2003

Designing Computer Architecture Research Workloads.

Lieven Eeckhout

Computer, 2003

Trade-offs for Skewed-Associative Caches.

Proceedings of the Parallel Computing: Software Technology, 2003

On the side-effects of code abstraction.

Bjorn De Sutter

Bruno De Bus

Proceedings of the 2003 Conference on Languages, 2003

Trace Substitution.

Hans Logie

Proceedings of the Euro-Par 2003. Parallel Processing, 2003

2002

An Address Transformation Combining Block- and Word-Interleaving.

IEEE Comput. Archit. Lett., 2002

A Comparative Study of Redundancy in Trace Caches (Research Note).

Alex Ramírez

Mateo Valero

Proceedings of the Euro-Par 2002, 2002

Workload Design: Selecting Representative Program-Input Pairs.

Lieven Eeckhout

Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques (PACT 2002), 2002

2001

Efficient profile-based evaluation of randomising set index functions for cache memories.

Proceedings of the 2001 IEEE International Symposium on Performance Analysis of Systems and Software, 2001

Differential FCM: Increasing Value Prediction Accuracy by Improving Table Usage Efficiency.

Bart Goeman

Proceedings of the Seventh International Symposium on High-Performance Computer Architecture (HPCA'01), 2001

2000

A Comparison of Locality-Based and Recency-Based Replacement Policies.

Proceedings of the High Performance Computing, Third International Symposium, 2000

A Technique for High Bandwidth and Deterministic Low Latency Load/Store Accesses to Multiple Cache Banks.

Henk Neefs