Wim Vanderbauwhede

Orcid: 0000-0001-6768-0037

According to our database, Wim Vanderbauwhede authored at least 113 papers between 2001 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.





Dynamic Loop Fusion in High-Level Synthesis.
Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2025

Compiler Support for Speculation in Decoupled Access/Execute Architectures.
Proceedings of the 34th ACM SIGPLAN International Conference on Compiler Construction, 2025

Optimising Iteration Scheduling for Full-State Vector Simulation of Quantum Circuits on FPGAs.
CoRR, 2024

Estimating the Increase in Emissions caused by AI-augmented Search.
CoRR, 2024

Wiring Circuits Is Easy as {0, 1, ω}, or Is It... (Artifact).
Dagstuhl Artifacts Ser., 2023

Frugal Computing - On the need for low-carbon and sustainable computing and the path towards zero-carbon computing.
CoRR, 2023

Quantum Circuit-Width Reduction through Parameterisation and Specialisation.
Algorithms, 2023

A High-Frequency Load-Store Queue with Speculative Allocations for High-Level Synthesis.
Proceedings of the International Conference on Field Programmable Technology, 2023

Compiler Discovered Dynamic Scheduling of Irregular Code in High-Level Synthesis.
Proceedings of the 33rd International Conference on Field-Programmable Logic and Applications, 2023

Dynamically Scheduled Memory Operations in Static High-Level Synthesis.
Proceedings of the 31st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2023

Wiring Circuits Is Easy as {0, 1, ω}, or Is It...
Proceedings of the 37th European Conference on Object-Oriented Programming, 2023

Making legacy Fortran code type safe through automated program transformation.
J. Supercomput., 2022

Transformations for accelerator-based quantum circuit simulation in Haskell.
CoRR, 2022

Reducing FPGA Memory Footprint of Stencil Codes through Automatic Extraction of Memory Patterns.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

PERCEPTRON: an open-source GPU-accelerated proteoform identification pipeline for top-down proteomics.
Nucleic Acids Res., 2021

FPGAs for Domain Experts.
Int. J. Reconfigurable Comput., 2020

A Framework for Resource Dependent EDSLs in a Dependently Typed Language (Artifact).
Dagstuhl Artifacts Ser., 2020

A Framework for Resource Dependent EDSLs in a Dependently Typed Language (Pearl).
Proceedings of the 34th European Conference on Object-Oriented Programming, 2020

FPGA design space exploration for scientific HPC applications using a fast and accurate cost model based on roofline analysis.
J. Parallel Distributed Comput., 2019

Automatic Pipelining and Vectorization of Scientific Code for FPGAs.
Int. J. Reconfigurable Comput., 2019

Type-Driven Automated Program Transformations and Cost Modelling for Optimising Streaming Programs on FPGAs.
Int. J. Parallel Program., 2019

A Typing Discipline for Hardware Interfaces (Artifact).
Dagstuhl Artifacts Ser., 2019

Value-Dependent Session Design in a Dependently Typed Language.
Proceedings of Programming Language Approaches to Concurrency- and Communication-cEntric Software, 2019

Towards Automatic Transformation of Legacy Scientific Code into OpenCL for Optimal Performance on FPGAs.
CoRR, 2019

Efficient FPGA Cost-Performance Space Exploration using Type-Driven Program Transformations.
Proceedings of the 2019 International Conference on ReConFigurable Computing and FPGAs, 2019

Smart-Cache: Optimising Memory Accesses for Arbitrary Boundaries and Stencils on FPGAs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019

A Typing Discipline for Hardware Interfaces.
Proceedings of the 33rd European Conference on Object-Oriented Programming, 2019

The Glasgow Fortran Source-to-Source Compiler.
J. Open Source Softw., 2018

MP-STREAM: A Memory Performance Benchmark for Design Space Exploration on Heterogeneous HPC Devices.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

2D Image Convolution using Three Parallel Programming Models on the Xeon Phi.
CoRR, 2017

Domain-Specific Acceleration and Auto-Parallelization of Legacy Scientific Code in FORTRAN 77 using Source-to-Source Compilation.
CoRR, 2017

An analysis of the feasibility and benefits of GPU/multicore acceleration of the Weather Research and Forecasting model.
Concurr. Comput. Pract. Exp., 2016

Evaluation of the Memory Communication Traffic in a Hierarchical Cache Model for Massively-Manycore Processors.
Proceedings of the 24th Euromicro International Conference on Parallel, 2016

Document classification systems in heterogeneous computing environments.
Proceedings of the 26th International Workshop on Power and Timing Modeling, 2016

A Fast and Accurate Cost Model for FPGA Design Space Exploration in HPC Applications.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Improving Resilience by Deploying Permuted Code onto Physically Unclonable Unique Processors.
Proceedings of the Cybersecurity and Cyberforensics Conference, 2016

Putting Heterogeneous High-Performance Computing at the Fingertips of Domain Experts (NII Shonan Meeting 2015-18).
NII Shonan Meet. Rep., 2015

Steal Locally, Share Globally - A Strategy for Multiprogramming in the Manycore Era.
Int. J. Parallel Program., 2015

Inferring Program Transformations from Type Transformations for Partitioning of Ordered Sets.
CoRR, 2015

Model Coupling between the Weather Research and Forecasting Model and the DPRI Large Eddy Simulator for Urban Flows on GPU-accelerated Multicore Systems.
CoRR, 2015

An Intermediate Language and Estimator for Automated Design Space Exploration on FPGAs.
CoRR, 2015

A Reconfigurable Vector Instruction Processor for Accelerating a Convection Parametrization Model on FPGAs.
CoRR, 2015

Using type transformations to generate program variants for FPGA design space exploration.
Proceedings of the International Conference on ReConFigurable Computing and FPGAs, 2015

Number of Tasks, not Threads, is Key.
Proceedings of the 23rd Euromicro International Conference on Parallel, 2015

FPGAs as Components in Heterogeneous High-Performance Computing Systems: Raising the Abstraction Level.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Efficient Parallel Linked List Processing.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

FPGA Port of a Large Scientific Model from Legacy Code: The Emanuel Convection Scheme.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Twinned buffering: A simple and highly effective scheme for parallelization of Successive Over-Relaxation on GPUs and other accelerators.
Proceedings of the 2015 International Conference on High Performance Computing & Simulation, 2015

High Level Programming of Document Classification Systems for Heterogeneous Environments using OpenCL (Abstract Only).
Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

The Impact of Traffic Localisation on the Performance of NoCs for Very Large Manycore Systems.
Proceedings of the 10th International Conference on Future Networks and Communications (FNC 2015) / The 12th International Conference on Mobile Systems and Pervasive Computing (MobiSPC 2015) / Affiliated Workshops, 2015

Shortest Path Routing Algorithm for Hierarchical Interconnection Network-on-Chip.
Proceedings of the 10th International Conference on Future Networks and Communications (FNC 2015) / The 12th International Conference on Mobile Systems and Pervasive Computing (MobiSPC 2015) / Affiliated Workshops, 2015

Cache-aware Parallel Programming for Manycore Processors.
CoRR, 2014

List-based Monadic Computations for Dynamically Typed Languages.
Proceedings of the Workshop on Dynamic Languages and Applications, 2014

Accelerating Lagrangian particle dispersion in the atmosphere with OpenCL across multiple platforms.
Proceedings of the International Workshop on OpenCL, 2014

A Parallel Task-Based Approach to Linear Algebra.
Proceedings of the IEEE 13th International Symposium on Parallel and Distributed Computing, 2014

High level programming of FPGAs for HPC and data centric applications.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2014

Comparison of Three Popular Parallel Programming Models on the Intel Xeon Phi.
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

Design and Evaluation of High-Performance Processing Elements for Reconfigurable Systems.
IEEE Trans. Very Large Scale Integr. Syst., 2013

Throughput/Resource-Efficient Reconfigurable Processor for Multimedia Applications.
IEEE Trans. Very Large Scale Integr. Syst., 2013

A hybrid CPU-FPGA system for high throughput (10Gb/s) streaming document classification.
SIGARCH Comput. Archit. News, 2013

The Glasgow Parallel Reduction Machine: Programming Shared-memory Many-core Systems using Parallel Task Composition.
Proceedings of the 6th Workshop on Programming Language Approaches to Concurrency and Communication-cEntric Software, 2013

An Efficient Thread Mapping Strategy for Multiprogramming on Manycore Processors.
Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

An investigation into the feasibility and benefits of GPU/multicore acceleration of the weather research and forecasting model.
Proceedings of the International Conference on High Performance Computing & Simulation, 2013

Implementing data parallelisation in a Nested-Sampling Monte Carlo algorithm.
Proceedings of the International Conference on High Performance Computing & Simulation, 2013

High throughput filtering using FPGA-acceleration.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

Impact of Random Dopant Fluctuations on the Timing Characteristics of Flip-Flops.
IEEE Trans. Very Large Scale Integr. Syst., 2012

Throughput Analysis for a High-Performance FPGA-Accelerated Real-Time Search Application.
Int. J. Reconfigurable Comput., 2012

Evaluating FPGA-acceleration for real-time unstructured search.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2012

Improving user experience of submitting jobs to HPC resources.
Proceedings of the 2012 International Conference on High Performance Computing & Simulation, 2012

An analytical model of broadcast in QoS-aware wormhole-routed NoCs.
J. Syst. Softw., 2011

A few lines of code, thousands of cores: High-level FPGA programming using vector processor networks.
Proceedings of the 2011 International Conference on High Performance Computing & Simulation, 2011

Communication modeling of multicast in all-port wormhole-routed NoCs.
J. Syst. Softw., 2010

An analytical performance model for the Spidergon NoC with virtual channels.
J. Syst. Archit., 2010

Performance Analysis of On-Chip Communication Structures under Device Variability.
Int. J. Embed. Real Time Commun. Syst., 2010

Radiation-Hardened Reconfigurable Array With Instruction Roll-Back.
IEEE Embed. Syst. Lett., 2010

Search system requirements of patent analysts.
Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

A survey of patent users: an analysis of tasks, behavior, search functionality and system requirements.
Proceedings of the Information Interaction in Context Symposium, 2010

A high-level language for programming a NoC-based Dynamic Reconfiguration Infrastructure.
Proceedings of the 2010 Conference on Design & Architectures for Signal & Image Processing, 2010

A C++-embedded Domain-Specific Language for programming the MORA soft processor array.
Proceedings of the 21st IEEE International Conference on Application-specific Systems Architectures and Processors, 2010

An Analytical Comparison of the Spidergon and Rectangular Mesh NoCs.
J. Interconnect. Networks, 2009

Developing energy efficient filtering systems.
Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009

Dynamic counter-based broadcast in MANETs.
Proceedings of the 4th ACM workshop on Performance monitoring and measurement of heterogeneous wireless and wired networks, 2009

Debugging FPGA-based packet processing systems through transaction-level communication-centric monitoring.
Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, 2009

Automated instrumentation of FPGA-based systems for system-level transaction monitoring.
Proceedings of the 2008 IEEE International Symposium on System-on-Chip, 2009

Impact of device variability in the communication structures for future synchronous SoC designs.
Proceedings of the 2008 IEEE International Symposium on System-on-Chip, 2009

A performance model of multicast communication in wormhole-routed networks on-chip.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Design and implementation of the Quarc Network on-Chip.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

FPGA-accelerated Information Retrieval: High-efficiency document filtering.
Proceedings of the 19th International Conference on Field Programmable Logic and Applications, 2009

A low cost reconfigurable soft processor for multimedia applications: Design synthesis and programming model.
Proceedings of the 19th International Conference on Field Programmable Logic and Applications, 2009

Architectural Comparison of Instruments for Transaction Level Monitoring of FPGA-Based Packet Processing Systems.
Proceedings of the FCCM 2009, 2009

Programming Model and Low-level Language for a Coarse-Grained Reconfigurable Multimedia Processor.
Proceedings of the 2009 International Conference on Engineering of Reconfigurable Systems & Algorithms, 2009

MAW: A Reliable Lightweight Multi-hop Wireless Sensor Network Routing Protocol.
Proceedings of the 12th IEEE International Conference on Computational Science and Engineering, 2009

A Communication Model of Broadcast in Wormhole-Routed Networks on-Chip.
Proceedings of the IEEE 23rd International Conference on Advanced Information Networking and Applications, 2009

Quarc: A High-Efficiency Network on-Chip Architecture.
Proceedings of the IEEE 23rd International Conference on Advanced Information Networking and Applications, 2009

MORA - An Architecture and Programming Model for a Resource Efficient Coarse Grained Reconfigurable Processor.
Proceedings of the NASA/ESA Conference on Adaptive Hardware and Systems, 2009

Communication Modeling of QoS-Aware Wormhole-Routed NoCs.
J. Interconnect. Networks, 2008

A coarse-grained Dynamically Reconfigurable MAC Processor for power-sensitive multi-standard devices.
Proceedings of the 21st Annual IEEE International SoC Conference, SoCC 2008, 2008

A Performance Model of Communication in the Quarc NoC.
Proceedings of the 14th International Conference on Parallel and Distributed Systems, 2008

Quarc: A Novel Network-On-Chip Architecture.
Proceedings of the 14th International Conference on Parallel and Distributed Systems, 2008

Interface and Reconfiguration Controller for a wireless MAC-oriented dynamically reconfigurable hardware co-processor.
Proceedings of the FPL 2008, 2008

A type system for static typing of a domain-specific language.
Proceedings of the ACM/SIGDA 16th International Symposium on Field Programmable Gate Arrays, 2008

A Formal Semantics for Control and Data flow in the Gannet Service-based System-on-Chip Architecture.
Proceedings of the 2008 International Conference on Engineering of Reconfigurable Systems & Algorithms, 2008

A Hardware Relaxation Paradigm for Solving NP-Hard Problems.
Proceedings of the Visions of Computer Science, 2008

Modeling Differentiated Services-Based QoS in Wormhole-Routed NoCs.
Proceedings of the 22nd International Conference on Advanced Information Networking and Applications, 2008

The Gannet Service Manager: A Distributed Dataflow Controller for Heterogeneous Multi-core SoCs.
Proceedings of the NASA/ESA Conference on Adaptive Hardware and Systems, 2008

A Dynamically Reconfigurable Hardware Co-Processor for a Multi-Standard Wireless MAC Processor.
Proceedings of the NASA/ESA Conference on Adaptive Hardware and Systems, 2008

Communication Modelling of the Spidergon NoC with Virtual Channels.
Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Analytical modelling of communication in the rectangular mesh NoC.
Proceedings of the 13th International Conference on Parallel and Distributed Systems, 2007

An Analytical Performance Model for the Spidergon NoC.
Proceedings of the 21st International Conference on Advanced Information Networking and Applications (AINA 2007), 2007

Separation of Data flow and Control flow in Reconfigurable Multi-core SoCs using the Gannet Service-based Architecture.
Proceedings of the Second NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2007), 2007

Implementation of Finite State Machines on a Reconfigurable Device.
Proceedings of the Second NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2007), 2007

The Gannet Service-Based SoC: A Service-level Reconfigurable Architecture.
Proceedings of the First NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2006), 2006

A compact test structure for characterisation of leakage currents in sub-micron CMOS technologies.
Microelectron. Reliab., 2001
