Magnus Själander

Orcid: 0000-0003-4232-6976

According to our database, Magnus Själander authored at least 75 papers between 2004 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
ARCTIC: Approximate Real-Time Computing in a Cache-Conscious Multicore Environment.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., October, 2024

R-HLS: An IR for Dynamic High-Level Synthesis and Memory Disambiguation based on Regions and State Edges.
CoRR, 2024

CATBench: A Compiler Autotuning Benchmarking Suite for Black-box Optimization.
CoRR, 2024

TEEMO: Temperature Aware Energy Efficient Multi-Retention STT-RAM Cache Architecture.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

MAFin: Maximizing Accuracy in FinFET based Approximated Real-Time Computing.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

2023
BISDU: A Bit-Serial Dot-Product Unit for Microcontrollers.
ACM Trans. Embed. Comput. Syst., September, 2023

Delay-on-Squash: Stopping Microarchitectural Replay Attacks in Their Tracks.
ACM Trans. Archit. Code Optim., March, 2023

DELICIOUS: Deadline-Aware Approximate Computing in Cache-Conscious Multicore.
IEEE Trans. Parallel Distributed Syst., February, 2023

ReCon: Efficient Detection, Management, and Use of Non-Speculative Information Leakage.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Doppelganger Loads: A Safe, Complexity-Effective Optimization for Secure Speculation Schemes.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

Architecting Selective Refresh based Multi-Retention Cache for Heterogeneous System (ARMOUR).
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

2022
Data-Out Instruction-In (DOIN!): Leveraging Inclusive Caches to Attack Speculative Delay Schemes.
Proceedings of the 2022 IEEE International Symposium on Secure and Private Execution Environment Design (SEED), 2022

STIFF: thermally safe temperature effect inversion aware FinFET based multi-core.
Proceedings of the CF '22: 19th ACM International Conference on Computing Frontiers, Turin, Italy, May 17, 2022

2021
Prepare: Power-Aware Approximate Real-time Task Scheduling for Energy-Adaptive QoS Maximization.
ACM Trans. Embed. Comput. Syst., 2021

WaFFLe: Gated Cache-Ways with Per-Core Fine-Grained DVFS for Reduced On-Chip Temperature and Leakage Consumption.
ACM Trans. Archit. Code Optim., 2021

"It's a Trap!"-How Speculation Invariance Can Be Abused with Forward Speculative Interference.
CoRR, 2021

Selectively Delaying Instructions to Prevent Microarchitectural Replay Attacks.
CoRR, 2021

On Value Recomputation to Accelerate Invisible Speculation.
CoRR, 2021

Reorder Buffer Contention: A Forward Speculative Interference Attack for Speculation Invariant Instructions.
IEEE Comput. Archit. Lett., 2021

Seeds of SEED: Preventing Priority Inversion in Instruction Scheduling to Disrupt Speculative Interference.
Proceedings of the 2021 International Symposium on Secure and Private Execution Environment Design (SEED), 2021

Do Not Predict - Recompute! How Value Recomputation Can Truly Boost the Performance of Invisible Speculation.
Proceedings of the 2021 International Symposium on Secure and Private Execution Environment Design (SEED), 2021

2020
Understanding Selective Delay as a Method for Efficient Secure Speculative Execution.
IEEE Trans. Computers, 2020

Evaluating the Potential Applications of Quaternary Logic for Approximate Computing.
ACM J. Emerg. Technol. Comput. Syst., 2020

Twig: Multi-Agent Task Management for Colocated Latency-Critical Cloud Services.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

RePAiR: A Strategy for Reducing Peak Temperature while Maximising Accuracy of Approximate Real-Time Computing: Work-in-Progress.
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2020

Clearing the Shadows: Recovering Lost Performance for Invisible Speculative Execution through HW/SW Co-Design.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2019
Optimizing Bit-Serial Matrix Multiplication for Reconfigurable Computing.
ACM Trans. Reconfigurable Technol. Syst., 2019

EPIC: An Energy-Efficient, High-Performance GPGPU Computing Research Infrastructure.
CoRR, 2019

RVSDG: An Intermediate Representation for Optimizing Compilers.
CoRR, 2019

Efficient invisible speculative execution through selective delay and value prediction.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

Improving Memory Access Locality for Vectorized Bit-Serial Matrix Multiplication in Reconfigurable Computing.
Proceedings of the International Conference on Field-Programmable Technology, 2019

Ghost loads: what is the cost of invisible speculation?
Proceedings of the 16th ACM International Conference on Computing Frontiers, 2019

2018
Static Instruction Scheduling for High Performance on Limited Hardware.
IEEE Trans. Computers, 2018

SWOOP: software-hardware co-design for non-speculative, execute-ahead, in-order cores.
Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2018

BISMO: A Scalable Bit-Serial Matrix Multiplication Overlay for Reconfigurable Computing.
Proceedings of the 28th International Conference on Field Programmable Logic and Applications, 2018

2017
Transcending Hardware Limits with Software Out-of-Order Processing.
IEEE Comput. Archit. Lett., 2017

Clairvoyance: look-ahead compile-time scheduling.
Proceedings of the 2017 International Symposium on Code Generation and Optimization, 2017

2016
Poster: Approximation: A New Paradigm also for Wireless Sensing.
Proceedings of the International Conference on Embedded Wireless Systems and Networks, 2016

Practical way halting by speculatively accessing halt tags.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

Techniques for modulating error resilience in emerging multi-value technologies.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

Redesigning a tagless access buffer to require minimal ISA changes.
Proceedings of the 2016 International Conference on Compilers, 2016

2015
Improving Data Access Efficiency by Using Context-Aware Loads and Stores.
Proceedings of the 16th ACM SIGPLAN/SIGBED Conference on Languages, 2015

Optimizing Transfers of Control in the Static Pipeline Architecture.
Proceedings of the 16th ACM SIGPLAN/SIGBED Conference on Languages, 2015

Scheduling instruction effects for a statically pipelined processor.
Proceedings of the 2015 International Conference on Compilers, 2015

2014
Power-Efficient Computer Architectures: Recent Advances.
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01745-2, 2014

A tunable cache for approximate computing.
Proceedings of the IEEE/ACM International Symposium on Nanoscale Architectures, 2014

Reducing set-associative L1 data cache energy by early load data dependence detection (ELD³).
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

2013
Reducing instruction fetch energy in multi-issue processors.
ACM Trans. Archit. Code Optim., 2013

Designing a practical data filter cache to improve both energy efficiency and performance.
ACM Trans. Archit. Code Optim., 2013

FlexCore: Implementing an exposed datapath processor.
Proceedings of the 2013 International Conference on Embedded Computer Systems: Architectures, 2013

Improving processor efficiency by statically pipelining instructions.
Proceedings of the SIGPLAN/SIGBED Conference on Languages, 2013

Speculative tag access for reduced energy dissipation in set-associative L1 data caches.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

Improving data access efficiency by using a tagless access buffer (TAB).
Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013

2012
Techniques to Measure, Model, and Manage Power.
Adv. Comput., 2012

Configurable RTL model for level-1 caches.
Proceedings of the NORCHIP 2012, Copenhagen, Denmark, November 12-13, 2012

An LTE Uplink Receiver PHY benchmark and subframe-based power management.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2012

Viterbi Accelerator for Embedded Processor Datapaths.
Proceedings of the 23rd IEEE International Conference on Application-Specific Systems, 2012

2011
Power-Aware Resource Scheduling in Base Stations.
Proceedings of the MASCOTS 2011, 2011

Reconfigurable Instruction Decoding for a Wide-Control-Word Processor.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

2010
A High-Speed, Energy-Efficient Two-Cycle Multiply-Accumulate (MAC) Architecture and Its Application to a Double-Throughput MAC Unit.
IEEE Trans. Circuits Syst. I Regul. Pap., 2010

Design space exploration for an embedded processor with flexible datapath interconnect.
Proceedings of the 21st IEEE International Conference on Application-specific Systems Architectures and Processors, 2010

2009
FlexCore: Utilizing Exposed Datapath Control for Efficient Computing.
J. Signal Process. Syst., 2009

Multiplication Acceleration Through Twin Precision.
IEEE Trans. Very Large Scale Integr. Syst., 2009

High-speed, energy-efficient 2-cycle Multiply-Accumulate architecture.
Proceedings of the Annual IEEE International SoC Conference, SoCC 2009, 2009

Scheduling for an Embedded Architecture with a Flexible Datapath.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2009

Double Throughput Multiply-Accumulate unit for FlexCore processor enhancements.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Custom layout strategy for rectangle-shaped log-depth multiplier reduction tree.
Proceedings of the 16th IEEE International Conference on Electronics, 2009

A Flexible Code Compression Scheme Using Partitioned Look-Up Tables.
Proceedings of the High Performance Embedded Architectures and Compilers, 2009

2008
Early detection and bypassing of trivial operations to improve energy efficiency of processors.
Microprocess. Microsystems, 2008

High-speed and low-power multipliers using the Baugh-Wooley algorithm and HPM reduction tree.
Proceedings of the 15th IEEE International Conference on Electronics, Circuits and Systems, 2008

A Look-Ahead Task Management Unit for Embedded Multi-Core Architectures.
Proceedings of the 11th Euromicro Conference on Digital System Design: Architectures, 2008

2007
A Flexible Datapath Interconnect for Embedded Applications.
Proceedings of the 2007 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2007), 2007

2006
Multiplier reduction tree with logarithmic logic depth and regular connectivity.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

2005
A low-leakage twin-precision multiplier using reconfigurable power gating.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

2004
An Efficient Twin-Precision Multiplier.
Proceedings of the 22nd IEEE International Conference on Computer Design: VLSI in Computers & Processors (ICCD 2004), 2004

