Magnus Jahre

Orcid: 0000-0001-9147-5228

According to our database, Magnus Jahre authored at least 61 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Per-Instruction Cycle Stacks Through Time-Proportional Event Analysis.
IEEE Micro, 2024

CoFaaS: Automatic Transformation-based Consolidation of Serverless Functions.
Proceedings of the 2nd Workshop on SErverless Systems, Applications and MEthodologies, 2024

AIO: An Abstraction for Performance Analysis Across Diverse Accelerator Architectures.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

ECM: Improving IoT Throughput with Energy-Aware Connection Management.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

2023
Characterizing Multi-Chip GPU Data Sharing.
ACM Trans. Archit. Code Optim., December, 2023

Near-optimal multi-accelerator architectures for predictive maintenance at the edge.
Future Gener. Comput. Syst., 2023

PES: An Energy and Throughput Model for Energy Harvesting IoT Systems.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

SAC: Sharing-Aware Caching in Multi-Chip GPUs.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

TEA: Time-Proportional Event Analysis.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

Balancing Accuracy and Evaluation Overhead in Simulation Point Selection.
Proceedings of the IEEE International Symposium on Workload Characterization, 2023

ESS: Repeatable Evaluation of Energy Harvesting Subsystems for Industry-Grade IoT Platforms.
Proceedings of the IEEE International Symposium on Workload Characterization, 2023

NUBA: Non-Uniform Bandwidth GPUs.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
LMT: Accurate and Resource-Scalable Slowdown Prediction.
IEEE Comput. Archit. Lett., 2022

Delegated Replies: Alleviating Network Clogging in Heterogeneous Architectures.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

2021
Fast and Accurate Edge Computing Energy Modeling and DVFS Implementation in GEM5 Using System Call Emulation Mode.
J. Signal Process. Syst., 2021

Modeling Periodic Energy-Harvesting Computing Systems.
IEEE Comput. Archit. Lett., 2021

TIP: Time-Proportional Instruction Profiling.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

2020
Scalability analysis of AVX-512 extensions.
J. Supercomput., 2020

DCMI: A Scalable Strategy for Accelerating Iterative Stencil Loops on FPGAs.
ACM Trans. Archit. Code Optim., 2020

MDM: The GPU Memory Divergence Model.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

Selective Replication in Memory-Side GPU Caches.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

HSM: A Hybrid Slowdown Model for Multitasking GPUs.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

2019
EPIC: An Energy-Efficient, High-Performance GPGPU Computing Research Infrastructure.
CoRR, 2019

Modeling Emerging Memory-Divergent GPU Applications.
IEEE Comput. Archit. Lett., 2019

2018
Get Out of the Valley: Power-Efficient Address Mapping for GPUs.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

GDP: Using Dataflow Properties to Accurately Estimate Interference-Free Performance at Runtime.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

Supporting Utilities for Heterogeneous Embedded Image Processing Platforms (STHEM): An Overview.
Proceedings of the Applied Reconfigurable Computing. Architectures, Tools, and Applications, 2018

2017
Streamlined Deployment for Quantized Neural Networks.
CoRR, 2017

The READEX formalism for automatic tuning for energy efficiency.
Computing, 2017

Extending OMPT to Support Grain Graphs.
Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

Scaling Binarized Neural Networks on Reconfigurable Logic.
Proceedings of the 8th Workshop and 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms, 2017

DTP: Enabling Exhaustive Exploration of FPGA Temporal Partitions for Streaming HPC Applications.
Proceedings of the 8th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, 2017

FINN: A Framework for Fast, Scalable Binarized Neural Network Inference.
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

Towards Efficient Design Space Exploration of FPGA-based Accelerators for Streaming HPC Applications (Abstract Only).
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

Towards efficient quantized neural network inference on mobile devices: work-in-progress.
Proceedings of the 2017 International Conference on Compilers, 2017

2016
Random access schemes for efficient FPGA SpMV acceleration.
Microprocess. Microsystems, 2016

TULIPP: Towards ubiquitous low-power image processing platforms.
Proceedings of the International Conference on Embedded Computer Systems: Architectures, 2016

Efficient control flow restructuring for GPUs.
Proceedings of the International Conference on High Performance Computing & Simulation, 2016

2015
Tuning the victim selection policy of Intel TBB.
J. Syst. Archit., 2015

ParVec: vectorizing the PARSEC benchmark suite.
Computing, 2015

Hybrid breadth-first search on a single-chip FPGA-CPU heterogeneous platform.
Proceedings of the 25th International Conference on Field Programmable Logic and Applications, 2015

A Vector Caching Scheme for Streaming FPGA SpMV Accelerators.
Proceedings of the Applied Reconfigurable Computing - 11th International Symposium, 2015

2014
Perfect Reconstructability of Control Flow from Demand Dependence Graphs.
ACM Trans. Archit. Code Optim., 2014

Patterned Heterogeneous CMPs: The Case for Regularity-Driven System-Level Synthesis.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2014

Optimized hardware for suboptimal software: The case for SIMD-aware benchmarks.
Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software, 2014

A Study of Energy and Locality Effects Using Space-Filling Curves.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

An energy efficient column-major backend for FPGA SpMV accelerators.
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

Victim Selection Policies for Intel TBB: Overheads and Energy Footprint.
Proceedings of the Architecture of Computing Systems - ARCS 2014, 2014

Graph-based performance accounting for chip multiprocessor memory systems.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013
On the energy footprint of task based parallel applications.
Proceedings of the International Conference on High Performance Computing & Simulation, 2013

Challenges of Reducing Cycle-Accurate Simulation Time for TBP Applications.
Proceedings of the International Conference on Computational Science, 2013

2011
A High Performance Adaptive Miss Handling Architecture for Chip Multiprocessors.
Trans. High Perform. Embed. Archit. Compil., 2011

Storage Efficient Hardware Prefetching using Delta-Correlating Prediction Tables.
J. Instr. Level Parallelism, 2011

Exploring the Prefetcher/Memory Controller Design Space: An Opportunistic Prefetch Scheduling Strategy.
Proceedings of the Architecture of Computing Systems - ARCS 2011, 2011

2010
Computational Computer Architecture Research at NTNU.
ERCIM News, 2010

DIEF: An Accurate Interference Feedback Mechanism for Chip Multiprocessor Memory Systems.
Proceedings of the High Performance Embedded Architectures and Compilers, 2010

Multi-level Hardware Prefetching Using Low Complexity Delta Correlating Prediction Tables with Partial Matching.
Proceedings of the High Performance Embedded Architectures and Compilers, 2010

2009
Experimental Validation of the Learning Effect for a Pedagogical Game on Computer Fundamentals.
IEEE Trans. Educ., 2009

A Quantitative Study of Memory System Interference in Chip Multiprocessor Architectures.
Proceedings of the 11th IEEE International Conference on High Performance Computing and Communications, 2009

A light-weight fairness mechanism for chip multiprocessor memory systems.
Proceedings of the 6th Conference on Computing Frontiers, 2009

2008
Low-cost open-page prefetch scheduling in chip multiprocessors.
Proceedings of the 26th International Conference on Computer Design, 2008

