Steven Derrien

Orcid: 0000-0002-6281-083X

According to our database, Steven Derrien authored at least 81 papers between 2000 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







An Irredundant and Compressed Data Layout to Optimize Bandwidth Utilization of FPGA Accelerators.
CoRR, 2024

Efficient Design Space Exploration for Dynamic & Speculative High-Level Synthesis.
Proceedings of the 34th International Conference on Field-Programmable Logic and Applications, 2024

A Unified Memory Dependency Framework for Speculative High-Level Synthesis.
Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction, 2024

Increasing FPGA Accelerators Memory Bandwidth With a Burst-Friendly Memory Layout.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., May, 2023

An Irredundant Decomposition of Data Flow with Affine Dependences.
CoRR, 2023

Rapid Prototyping of Complex Micro-architectures Through High-Level Synthesis.
Proceedings of the Applied Reconfigurable Computing. Architectures, Tools, and Applications, 2023

Automatic Algorithm-Based Fault Tolerance (AABFT) of Stencil Computations.
Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023

Special Issue on Applied Reconfigurable Computing.
J. Signal Process. Syst., 2022

SpecHLS: Speculative Accelerator Design Using High-Level Synthesis.
IEEE Micro, 2022

Maximal Atomic irRedundant Sets: a Usage-based Dataflow Partitioning Algorithm.
CoRR, 2022

Design Exploration of RISC-V Soft-Cores through Speculative High-Level Synthesis.
Proceedings of the International Conference on Field-Programmable Technology, 2022

Safe Overclocking for CNN Accelerators Through Algorithm-Level Error Detection.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Toward Speculative Loop Pipelining for High-Level Synthesis.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Application-Specific Arithmetic in High-Level Synthesis Tools.
ACM Trans. Archit. Code Optim., 2020

Worst-Case Execution-Time-Aware Parallelization of Model-Based Avionics Applications.
J. Aerosp. Inf. Syst., November, 2019

Hybrid-DBT: Hardware/Software Dynamic Binary Translation Targeting VLIW.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

Reconciling Compiler Optimizations and WCET Estimation Using Iterative Compilation.
Proceedings of the IEEE Real-Time Systems Symposium, 2019

Hiding Communication Delays in Contention-Free Execution for SPM-Based Multi-Core Architectures.
Proceedings of the 31st Euromicro Conference on Real-Time Systems, 2019

Aggressive Memory Speculation in HW/SW Co-Designed Machines.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Fine-Grain Iterative Compilation for WCET Estimation.
Proceedings of the 18th International Workshop on Worst-Case Execution Time Analysis, 2018

Enabling Overclocking Through Algorithm-Level Error Detection.
Proceedings of the International Conference on Field-Programmable Technology, 2018

Supporting runtime reconfigurable VLIWs cores through dynamic binary translation.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Using polyhedral techniques to tighten WCET estimates of optimized code: A case study with array contraction.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Foreword to the Special Section on Reconfigurable Computing.
J. Signal Process. Syst., 2017

Tightening Contention Delays While Scheduling Parallel Applications on Multi-core Architectures.
ACM Trans. Embed. Comput. Syst., 2017

Bridging high-level synthesis and application-specific arithmetic: The case study of floating-point summations.
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

One size does not fit all: Implementation trade-offs for iterative stencil computations on FPGAs.
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

A High-Level Synthesis Approach Optimizing Accumulations in Floating-Point Programs Using Custom Formats and Operators.
Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017

Hardware-accelerated dynamic binary translation.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Superword level parallelism aware word length optimization.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

WCET-aware parallelization of model-based applications for multi-cores: The ARGO approach.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Communication-Based Power Modelling for Heterogeneous Multiprocessor Architectures.
Proceedings of the 10th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2016

System level synthesis for virtual memory enabled hardware threads.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

Demo: SLP-aware word length optimization.
Proceedings of the 2016 Conference on Design and Architectures for Signal and Image Processing (DASIP), 2016

Combining execution pipelines to improve parallel implementation of HMMER on FPGA.
Microprocess. Microsystems, 2015

Component reuse methodology for multi-clock Data-Flow parallel embedded Systems.
ARIMA J., 2014

Toward scalable source level accuracy analysis for floating-point to fixed-point conversion.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2014

Low Power Reconfigurable Controllers for Wireless Sensor Network Nodes.
Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014

Polyhedral Bubble Insertion: A Method to Improve Nested Loop Pipelining for High-Level Synthesis.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2013

Compiling Scilab to high performance embedded multicore systems.
Microprocess. Microsystems, 2013

GeCoS: A framework for prototyping custom hardware design flows.
Proceedings of the 13th IEEE International Working Conference on Source Code Analysis and Manipulation, 2013

Derivation of efficient FSM from loop nests.
Proceedings of the 2013 International Conference on Field-Programmable Technology, 2013

Using Model Types to Support Contract-Aware Model Substitutability.
Proceedings of the Modelling Foundations and Applications - 9th European Conference, 2013

Component-Level Datapath Merging in System-Level Design of Wireless Sensor Node Controllers for FPGA-Based Implementations.
Proceedings of the 2013 Euromicro Conference on Digital System Design, 2013

Runtime dependency analysis for loop pipelining in high-level synthesis.
Proceedings of the 50th Annual Design Automation Conference 2013, 2013

System-Level Synthesis for Wireless Sensor Node Controllers: A Complete Design Flow.
ACM Trans. Design Autom. Electr. Syst., 2012

Bridging the chasm between MDE and the world of compilation.
Softw. Syst. Model., 2012

Efficient hardware implementation of data-flow parallel embedded systems.
Proceedings of the 2012 International Conference on Embedded Computer Systems: Architectures, 2012

A flexible approach for compiling scilab to reconfigurable multi-core embedded systems.
Proceedings of the 7th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC), 2012

On Model Subtyping.
Proceedings of the Modelling Foundations and Applications - 8th European Conference, 2012

From Scilab to High Performance Embedded Multicore Systems: The ALMA Approach.
Proceedings of the 15th Euromicro Conference on Digital System Design, 2012

A semiempirical model for wakeup time estimation in power-gated logic clusters.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012

A Compilation- and Simulation-Oriented Architecture Description Language for Multicore Systems.
Proceedings of the 15th IEEE International Conference on Computational Science and Engineering, 2012

A Polynomial Based Approach to Wakeup Time and Energy Estimation in Power-Gated Logic Clusters.
J. Low Power Electron., 2011

Wakeup Time and Wakeup Energy Estimation in Power-Gated Logic Clusters.
Proceedings of the VLSI Design 2011: 24th International Conference on VLSI Design, 2011

Model-Driven Engineering and Optimizing Compilers: A Bridge Too Far?
Proceedings of the Model Driven Engineering Languages and Systems, 2011

ompVerify: Polyhedral Analysis for the OpenMP Programmer.
Proceedings of the OpenMP in the Petascale Era - 7th International Workshop on OpenMP, 2011

Efficient nested loop pipelining in high level synthesis using polyhedral bubble insertion.
Proceedings of the 2011 International Conference on Field-Programmable Technology, 2011

HLS Tools for FPGA: Faster Development with Better Performance.
Proceedings of the Reconfigurable Computing: Architectures, Tools and Applications, 2011

Contributions à la conception d'architectures matérielles dédiées.
2011

Hardware Acceleration of HMMER on FPGAs.
J. Signal Process. Syst., 2010

Accelerating HMMER on FPGA using parallel prefixes and reductions.
Proceedings of the International Conference on Field-Programmable Technology, 2010

System Level Synthesis for Ultra Low-Power Wireless Sensor Nodes.
Proceedings of the 13th Euromicro Conference on Digital System Design, 2010

A complete design-flow for the generation of ultra low-power WSN node architectures based on micro-tasking.
Proceedings of the 47th Design Automation Conference, 2010

Ultra Low-power FSM for Control Oriented Applications.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

Deriving efficient control in Process Networks with Compaan/Laura.
Int. J. Embed. Syst., 2008

Parallelizing HMMER for Hardware Acceleration on FPGAs.
Proceedings of the IEEE International Conference on Application-Specific Systems, 2007

Combining Flash Memory and FPGAs to Efficiently Implement a Massively Parallel Algorithm for Content-Based Image Retrieval.
Proceedings of the Reconfigurable Computing: Architectures, 2007

Acceleration of a content-based image-retrieval application on the RDISK cluster.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Cluster of re-configurable nodes for scanning large genomic banks.
Parallel Comput., 2005

Hardware/Software Interface for Multi-Dimensional Processor Arrays.
Proceedings of the 16th IEEE International Conference on Application-Specific Systems, 2005

A Reconfigurable Parallel Disk System for Filtering Genomic Banks.
Proceedings of the International Conference on Engineering of Reconfigurable Systems and Algorithms, June 23, 2003

Energy/Power Estimation of Regular Processor Arrays.
Proceedings of the 15th International Symposium on System Synthesis (ISSS 2002), 2002

Combined instruction and loop parallelism in array synthesis for FPGAs.
Proceedings of the 14th International Symposium on Systems Synthesis, 2001

Loop Tiling for Reconfigurable Accelerators.
Proceedings of the Field-Programmable Logic and Applications, 2001

Combining Instruction and Loop Level Parallelism for FPGAs.
Proceedings of the 9th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, 2001

Interfacing compiled FPGA programs: the MMAlpha approach.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2000

Optimal Partitioning for FPGA Based Regular Array Implementations.
Proceedings of the 2000 International Conference on Parallel Computing in Electrical Engineering (PARELEC 2000), 2000

Approximating a Single Viewpoint in Panoramic Imaging Devices.
Proceedings of the 2000 IEEE International Conference on Robotics and Automation, 2000

FCCMs and the Memory Wall.
Proceedings of the 8th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2000), 2000
