Andreas Gerstlauer

Orcid: 0000-0002-6748-2054

Affiliations:
  • University of Texas at Austin, USA


According to our database, Andreas Gerstlauer authored at least 166 papers between 2000 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
A Hierarchical Classification Method for High-accuracy Instruction Disassembly with Near-field EM Measurements.
ACM Trans. Embed. Comput. Syst., January, 2024

Distributed Convolutional Neural Network Training on Mobile and Edge Clusters.
CoRR, 2024

A Survey of Distributed Learning in Cloud, Mobile, and Edge Settings.
CoRR, 2024

Characterizing Machine Learning-Based Runtime Prefetcher Selection.
IEEE Comput. Archit. Lett., 2024

SoK Paper: Power Side-Channel Malware Detection.
Proceedings of the 13th International Workshop on Hardware and Architectural Support for Security and Privacy, 2024

Efficient Approaches for GEMM Acceleration on Leading AI-Optimized FPGAs.
Proceedings of the 32nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2024

2023
Machine Learning-Based Microarchitecture-Level Power Modeling of CPUs.
IEEE Trans. Computers, April, 2023

Learning-based Phase-aware Multi-core CPU Workload Forecasting.
ACM Trans. Design Autom. Electr. Syst., March, 2023

Lightweight ML-based Runtime Prefetcher Selection on Many-core Platforms.
CoRR, 2023

Performance and Energy Simulation of Spiking Neuromorphic Architectures for Fast Exploration.
Proceedings of the 2023 International Conference on Neuromorphic Systems, 2023

LAWS: Large-Scale Accelerated Wave Simulations on FPGAs.
Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2023

Memory Latency Distribution-Driven Regulation for Temporal Isolation in MPSoCs.
Proceedings of the 35th Euromicro Conference on Real-Time Systems, 2023

Special Session: Machine Learning for Embedded System Design.
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2023

FAWS: FPGA Acceleration of Large-Scale Wave Simulations.
Proceedings of the 34th IEEE International Conference on Application-specific Systems, 2023

2022
Introduction to the Special Issue on Approximate Systems.
ACM Trans. Design Autom. Electr. Syst., 2022

Characterizing Approximate Adders and Multipliers for Mitigating Aging and Temperature Degradations.
IEEE Trans. Circuits Syst. I Regul. Pap., 2022

CASPHAr: Cache-Managed Accelerator Staging and Pipelining in Heterogeneous System Architectures.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Report on the 2021 Embedded Systems Week (ESWEEK).
IEEE Des. Test, 2022

High-Level Simulation of Embedded Software Vulnerabilities to EM Side-Channel Attacks.
Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2022

Memory Utilization-Based Dynamic Bandwidth Regulation for Temporal Isolation in Multi-Cores.
Proceedings of the 28th IEEE Real-Time and Embedded Technology and Applications Symposium, 2022

MAFAT: Memory-Aware Fusing and Tiling of Neural Networks for Accelerated Edge Inference.
Proceedings of the Designing Modern Embedded Systems: Software, Hardware, and Applications, 2022

GAPS: GPU-acceleration of PDE solvers for wave simulation.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

2021
Cross-Layer Approximate Hardware Synthesis for Runtime Configurable Accuracy.
IEEE Trans. Very Large Scale Integr. Syst., 2021

Hardware Accelerator Integration Tradeoffs for High-Performance Computing: A Case Study of GEMM Acceleration in N-Body Methods.
IEEE Trans. Parallel Distributed Syst., 2021

Horizontal Side-Channel Vulnerabilities of Post-Quantum Key Exchange and Encapsulation Protocols.
ACM Trans. Embed. Comput. Syst., 2021

DeeperThings: Fully Distributed CNN Inference on Resource-Constrained Edge Devices.
Int. J. Parallel Program., 2021

Report on the 2020 Embedded Systems Week (ESWEEK): A Virtual Event during a Pandemic, September 20-25.
IEEE Des. Test, 2021

Approximate Systems (Dagstuhl Seminar 21302).
Dagstuhl Reports, 2021

Exploiting Errors for Efficiency: A Survey from Circuits to Applications.
ACM Comput. Surv., 2021

Memory-Aware Fusing and Tiling of Neural Networks for Accelerated Edge Inference.
CoRR, 2021

Phase-Aware CPU Workload Forecasting.
Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2021

Learning based Memory Interference Prediction for Co-running Applications on Multi-Cores.
Proceedings of the 3rd ACM/IEEE Workshop on Machine Learning for CAD, 2021

Learning-Based Workload Phase Classification and Prediction Using Performance Monitoring Counters.
Proceedings of the 3rd ACM/IEEE Workshop on Machine Learning for CAD, 2021

Virtual-Link: A Scalable Multi-Producer Multi-Consumer Message Queue Architecture for Cross-Core Communication.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

Wave-PIM: Accelerating Wave Simulation Using Processing-in-Memory.
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

Approximate Computing for ML: State-of-the-art, Challenges and Visions.
Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

2020
Network-level Design Space Exploration of Resource-constrained Networks-of-Systems.
ACM Trans. Embed. Comput. Syst., 2020

Aging Compensation With Dynamic Computation Approximation.
IEEE Trans. Circuits Syst. I Fundam. Theory Appl., 2020

Dynamic Power and Energy Management for NCFET-Based Processors.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Cacheline Utilization-Aware Link Traffic Compression for Modular GPUs.
Proceedings of the 33rd International Conference on VLSI Design and 19th International Conference on Embedded Systems, 2020

Off-Chip Congestion Management for GPU-based Non-Uniform Processing-in-Memory Networks.
Proceedings of the 28th Euromicro International Conference on Parallel, 2020

The Non-Uniform Compute Device (NUCD) Architecture for Lightweight Accelerator Offload.
Proceedings of the 28th Euromicro International Conference on Parallel, 2020

Energy Optimization in NCFET-based Processors.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

Runtime Accuracy-Configurable Approximate Hardware Synthesis Using Logic Gating and Relaxation.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

TAFE: Thread Address Footprint Estimation for Capturing Data/Thread Locality in GPU Systems.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2019
Quality/Latency-Aware Real-time Scheduling of Distributed Streaming IoT Applications.
ACM Trans. Embed. Comput. Syst., 2019

On the Efficiency of Voltage Overscaling under Temperature and Aging Effects.
IEEE Trans. Computers, 2019

A Study of Core Utilization and Residency in Heterogeneous Smart Phone Architectures.
Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, 2019

Horus Testbed: Implementation of Real-Time Video Streaming Protocols.
Proceedings of the 2019 IEEE International Systems Conference, 2019

Real-Time Rate Distortion Optimized and Adaptive Low Complexity Algorithms for Video Streaming.
Proceedings of the 2019 IEEE International Systems Conference, 2019

Fully Distributed Deep Learning Inference on Resource-Constrained Edge Devices.
Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2019

Aging Gracefully with Approximation.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

Using Power-Anomalies to Counter Evasive Micro-Architectural Attacks in Embedded Systems.
Proceedings of the IEEE International Symposium on Hardware Oriented Security and Trust, 2019

Approximate High-Level Synthesis of Custom Hardware.
Proceedings of the Approximate Circuits, Methodologies and CAD, 2019

2018
Learning-Based, Fine-Grain Power Modeling of System-Level Hardware IPs.
ACM Trans. Design Autom. Electr. Syst., 2018

DeepThings: Distributed Adaptive Deep Learning Inference on Resource-Constrained IoT Edge Clusters.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

Start Late or Finish Early: A Distributed Graph Processing System with Redundancy Reduction.
Proc. VLDB Endow., 2018

Data-Dependent Loop Approximations for Performance-Quality Driven High-Level Synthesis.
IEEE Embed. Syst. Lett., 2018

Exploiting Errors for Efficiency: A Survey from Circuits to Algorithms.
CoRR, 2018

MASES: Mobility And Slack Enhanced Scheduling For Latency-Optimized Pipelined Dataflow Graphs.
Proceedings of the 21st International Workshop on Software and Compilers for Embedded Systems, 2018

Trading Off Temperature Guardbands via Adaptive Approximations.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018

Horizontal side-channel vulnerabilities of post-quantum key exchange protocols.
Proceedings of the 2018 IEEE International Symposium on Hardware Oriented Security and Trust, 2018

CAMP: Accurate modeling of core and memory locality for proxy generation of big-data applications.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

BUQS: Battery- and user-aware QoS scaling for interactive mobile devices.
Proceedings of the 23rd Asia and South Pacific Design Automation Conference, 2018

2017
SCE: System-on-Chip Environment.
Proceedings of the Handbook of Hardware/Software Codesign, 2017

Host-Compiled Simulation.
Proceedings of the Handbook of Hardware/Software Codesign, 2017

Introduction to Hardware/Software Codesign.
Proceedings of the Handbook of Hardware/Software Codesign, 2017

Source-Level Performance, Energy, Reliability, Power and Thermal (PERPT) Simulation.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

LACross: Learning-Based Analytical Cross-Platform Performance and Power Prediction.
Int. J. Parallel Program., 2017

Guest Editorial: Special Issue on the 2015 International Conference on Embedded Computer Systems - Architectures, Modeling and Simulation (SAMOS XV).
Int. J. Parallel Program., 2017

A Reactive and Adaptive Data Flow Model for Network-of-System Specification.
IEEE Embed. Syst. Lett., 2017

Network/system co-simulation for design space exploration of IoT applications.
Proceedings of the 2017 International Conference on Embedded Computer Systems: Architectures, 2017

Cloud-Guided QoS and Energy Management for Mobile Interactive Web Applications.
Proceedings of the 4th IEEE/ACM International Conference on Mobile Software Engineering and Systems, 2017

POWSER: A novel user-experience based power management metric.
Proceedings of the Eighth International Green and Sustainable Computing Conference, 2017

Fine-Grain Program Snippets Generator for Mobile Core Design.
Proceedings of the Great Lakes Symposium on VLSI 2017, 2017

Exploring Heterogeneous-ISA Core Architectures for High-Performance and Energy-Efficient Mobile SoCs.
Proceedings of the Great Lakes Symposium on VLSI 2017, 2017

Sampling-based binary-level cross-platform performance estimation.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

GATSim: Abstract timing simulation of GPUs.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

High-level synthesis of approximate hardware under joint precision and voltage scaling.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Statistical Pattern Based Modeling of GPU Memory Access Streams.
Proceedings of the 54th Annual Design Automation Conference, 2017

Towards Aging-Induced Approximations.
Proceedings of the 54th Annual Design Automation Conference, 2017

2016
Adaptive resolution control in distributed cyber-physical system simulation.
Proceedings of the Winter Simulation Conference, 2016

Genesys: Automatically generating representative training sets for predictive benchmarking.
Proceedings of the International Conference on Embedded Computer Systems: Architectures, 2016

Simulator calibration for accelerator-rich architecture studies.
Proceedings of the International Conference on Embedded Computer Systems: Architectures, 2016

Statistical quality modeling of approximate hardware.
Proceedings of the 17th International Symposium on Quality Electronic Design, 2016

Optimizing GPGPU Kernel Summation for Performance and Energy Efficiency.
Proceedings of the 45th International Conference on Parallel Processing Workshops, 2016

Proxy-Guided Load Balancing of Graph Processing Workloads on Heterogeneous Clusters.
Proceedings of the 45th International Conference on Parallel Processing, 2016

Accurate phase-level cross-platform power and performance estimation.
Proceedings of the 53rd Annual Design Automation Conference, 2016

Reliability-aware design to suppress aging.
Proceedings of the 53rd Annual Design Automation Conference, 2016

Fine-grained power analysis of emerging graph processing workloads for cloud operations management.
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

2015
Distributed Real-Time Implementation of Interference Alignment with Analog Feedback.
IEEE Trans. Veh. Technol., 2015

Learning-based analytical cross-platform performance prediction.
Proceedings of the 2015 International Conference on Embedded Computer Systems: Architectures, 2015

PowerTrain: A learning-based calibration of McPAT power models.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

Learning-Based Power Modeling of System-Level Black-Box IPs.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2015

Dynamic power and performance back-annotation for fast and accurate functional hardware simulation.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

The next generation of virtual prototyping: ultra-fast yet accurate simulation of HW/SW systems.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

2014
A Highly Efficient Multicore Floating-Point FFT Architecture Based on Hybrid Linear Algebra/FFT Cores.
J. Signal Process. Syst., 2014

Host-Compiled Multicore System Simulation for Early Real-Time Performance Evaluation.
ACM Trans. Embed. Comput. Syst., 2014

Algorithm, Architecture, and Floating-Point Unit Codesign of a Matrix Factorization Accelerator.
IEEE Trans. Computers, 2014

Real-Time Rate-Distortion Optimized Streaming of Wireless Video.
CoRR, 2014

FastSpot: Host-compiled thermal estimation for early design space exploration.
Proceedings of the Fifteenth International Symposium on Quality Electronic Design, 2014

Performance analysis of HPC applications with irregular tree data structures.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Multi-level approximate logic synthesis under general error constraints.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2014

2013
Circuit-Level Timing-Error Acceptance for Design of Energy-Efficient DCT/IDCT-Based Systems.
IEEE Trans. Circuits Syst. Video Technol., 2013

SimConnect and SimTalk for distributed cyber-physical system simulation.
Simul., 2013

Fine Grain Precision Scaling for Datapath Approximations in Digital Signal Processing Systems.
Proceedings of the VLSI-SoC: At the Crossroads of Emerging Trends, 2013

Fine grain word length optimization for dynamic precision scaling in DSP systems.
Proceedings of the 21st IEEE/IFIP International Conference on VLSI and System-on-Chip, 2013

Dynamic resolution in distributed cyber-physical system simulation.
Proceedings of the SIGSIM Principles of Advanced Discrete Simulation, 2013

Low-energy digital filter design based on controlled timing error acceptance.
Proceedings of the International Symposium on Quality Electronic Design, 2013

Hardware and Software Implementations of Prim's Algorithm for Efficient Minimum Spanning Tree Computation.
Proceedings of the Embedded Systems: Design, Analysis and Verification, 2013

Approximate logic synthesis under general error magnitude and frequency constraints.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2013

Automated, retargetable back-annotation for host compiled performance and power modeling.
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2013

Transforming a linear algebra core to an FFT accelerator.
Proceedings of the 24th International Conference on Application-Specific Systems, 2013

Toward a fast stochastic simulation processor for biochemical reaction networks.
Proceedings of the 24th International Conference on Application-Specific Systems, 2013

Floating Point Architecture Extensions for Optimized Matrix Factorization.
Proceedings of the 21st IEEE Symposium on Computer Arithmetic, 2013

2012
Communication-aware Heterogeneous Multiprocessor Mapping for Real-time Streaming Systems.
J. Signal Process. Syst., 2012

Codesign Tradeoffs for High-Performance, Low-Power Linear Algebra Architectures.
IEEE Trans. Computers, 2012

Predictive OS Modeling for Host-Compiled Simulation of Periodic Real-Time Task Sets.
IEEE Embed. Syst. Lett., 2012

On the Efficiency of Register File versus Broadcast Interconnect for Collective Communications in Data-Parallel Hardware Accelerators.
Proceedings of the IEEE 24th International Symposium on Computer Architecture and High Performance Computing, 2012

Low-energy signal processing using circuit-level timing-error acceptance.
Proceedings of the IEEE International Conference on IC Design & Technology, 2012

Modeling and synthesis of quality-energy optimal approximate adders.
Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012

Synthesis of optimized hardware transactors from abstract communication specifications.
Proceedings of the 10th International Conference on Hardware/Software Codesign and System Synthesis, 2012

Automatic timing granularity adjustment for host-compiled software simulation.
Proceedings of the 17th Asia and South Pacific Design Automation Conference, 2012

Abstract system-level models for early performance and power exploration.
Proceedings of the 17th Asia and South Pacific Design Automation Conference, 2012

A Linear Algebra Core Design for Efficient Level-3 BLAS.
Proceedings of the 23rd IEEE International Conference on Application-Specific Systems, 2012

Implementation of a real-time wireless interference alignment network.
Proceedings of the Conference Record of the Forty Sixth Asilomar Conference on Signals, 2012

2011
Real-Time Optimization of Video Transmission in a Network of AAVs.
Proceedings of the 74th IEEE Vehicular Technology Conference, 2011

Heterogeneous multiprocessor mapping for real-time streaming systems.
Proceedings of the IEEE International Conference on Acoustics, 2011

A programmable and configurable multi-port System-on-Chip for stimulating electrokinetically-driven microfluidic devices.
Proceedings of the 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2011

Expression-Level Parallelism for Distributed Spice Circuit Simulation.
Proceedings of the 15th IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications, 2011

Host-compiled multicore RTOS simulator for embedded real-time software development.
Proceedings of the Design, Automation and Test in Europe, 2011

Controlled timing-error acceptance for low energy IDCT design.
Proceedings of the Design, Automation and Test in Europe, 2011

Multi-core parallel simulation of System-level Description Languages.
Proceedings of the 16th Asia South Pacific Design Automation Conference, 2011

A high-performance, low-power linear algebra core.
Proceedings of the 22nd IEEE International Conference on Application-specific Systems, 2011

2010
Fast and accurate processor models for efficient MPSoC design.
ACM Trans. Design Autom. Electr. Syst., 2010

A system-level synthesis approach from formal application models to generic bus-based MPSoCs.
Proceedings of the 2010 International Conference on Embedded Computer Systems: Architectures, 2010

Host-compiled simulation of multi-core platforms.
Proceedings of the 21st IEEE International Symposium on Rapid System Prototyping, 2010

System-level development of embedded software.
Proceedings of the 15th Asia South Pacific Design Automation Conference, 2010

Platform modeling for exploration and synthesis.
Proceedings of the 15th Asia South Pacific Design Automation Conference, 2010

2009
Electronic System-Level Synthesis Methodologies.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2009

Modeling Cache Effects at the Transaction Level.
Proceedings of the Analysis, 2009

Transaction Level Modeling of Best-Effort Channels for Networked Embedded Devices.
Proceedings of the Analysis, 2009

Introduction to hardware-dependent software design hardware-dependent software for multi- and many-core embedded systems.
Proceedings of the 14th Asia South Pacific Design Automation Conference, 2009

2008
An Interactive Design Environment for C-Based High-Level Synthesis of RTL Processors.
IEEE Trans. Very Large Scale Integr. Syst., 2008

System-on-Chip Environment: A SpecC-Based Framework for Heterogeneous MPSoC Design.
EURASIP J. Embed. Syst., 2008

Specify-explore-refine (SER): from specification to implementation.
Proceedings of the 45th Design Automation Conference, 2008

Automatic generation of hardware dependent software for MPSoCs from abstract system specifications.
Proceedings of the 13th Asia South Pacific Design Automation Conference, 2008

2007
Automatic Layer-Based Generation of System-On-Chip Bus Communication Models.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2007

An Interactive Design Environment for C-based High-Level Synthesis.
Proceedings of the Embedded System Design: Topics, Techniques and Trends, IFIP TC10 Working Conference: International Embedded Systems Symposium (IESS), May 30, 2007

Embedded Software Development in a System-Level Design Flow.
Proceedings of the Embedded System Design: Topics, Techniques and Trends, IFIP TC10 Working Conference: International Embedded Systems Symposium (IESS), May 30, 2007

Abstract, Multifaceted Modeling of Embedded Processors for System Level Design.
Proceedings of the 12th Conference on Asia South Pacific Design Automation, 2007

2006
Automatic generation of transaction level models for rapid design space exploration.
Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis, 2006

2005
Automatic Generation of Communication Architectures.
Proceedings of the From Specification to Embedded Systems Application, International Embedded Systems Symposium, 2005

Automatic network generation for system-on-chip communication design.
Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2005

System-level communication modeling for network-on-chip synthesis.
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005

Multi-metric and multi-entity characterization of applications for early system design exploration.
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005

2004
Retargetable profiling for rapid, early system-level design space exploration.
Proceedings of the 41st Design Automation Conference, 2004

2003
RTOS Modeling for System Level Design.
Proceedings of the 2003 Design, 2003

RTOS scheduling in transaction level models.
Proceedings of the 1st IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2003

RTOS Modeling for System Level Design.
Proceedings of the Embedded Software for SoC, 2003

2002
Seamless approach for the design of control systems for power electronics and electric drives.
Proceedings of the IEEE International Conference on Systems, Man and Cybernetics: Bridging the Digital Divide, Yasmine Hammamet, Tunisia, October 6-9, 2002, 2002

System-Level Abstraction Semantics.
Proceedings of the 15th International Symposium on System Synthesis (ISSS 2002), 2002

The Formal Execution Semantics of SpecC.
Proceedings of the 15th International Symposium on System Synthesis (ISSS 2002), 2002

Co-design of embedded controllers for power electronics and electric systems.
Proceedings of the 2002 IEEE International Symposium on Intelligent Control, 2002

2001
System Design - A Practical Guide with SpecC.
Springer, ISBN: 978-0-7923-7387-2, 2001

2000
The Specification Language SpecC within the PARADISE Design Environment.
Proceedings of the Architecture and Design of Distributed Embedded Systems, 2000

