Mario R. Casu

Orcid: 0000-0002-1026-0178

According to our database, Mario R. Casu authored at least 80 papers between 2000 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Mix & Latch: Comparison With State-of-the-Art Retiming on a RISC-V Benchmark.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., July, 2024

High-Level Design of Precision-Scalable DNN Accelerators Based on Sum-Together Multipliers.
IEEE Access, 2024

STAR: Sum-Together/Apart Reconfigurable Multipliers for Precision-Scalable ML Workloads.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

2023
To Spike or Not to Spike: A Digital Hardware Perspective on Deep Learning Acceleration.
IEEE J. Emerg. Sel. Topics Circuits Syst., December, 2023

Mix & Latch: An Optimization Flow for High-Performance Designs With Single-Clock Mixed-Polarity Latches and Flip-Flops.
IEEE Access, 2023

Design-Space Exploration of Mixed-precision DNN Accelerators based on Sum-Together Multipliers.
Proceedings of the 18th Conference on Ph.D. Research in Microelectronics and Electronics, 2023

2022
Fast Energy-Optimal Multikernel DNN-Like Application Allocation on Multi-FPGA Platforms.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

A Reconfigurable Depth-Wise Convolution Module for Heterogeneously Quantized DNNs.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

HLS-based dataflow hardware architecture for Support Vector Machine in FPGA.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

A Reconfigurable 2D-Convolution Accelerator for DNNs Quantized with Mixed-Precision.
Proceedings of the Applications in Electronics Pervading Industry, Environment and Society, 2022

Multi-objective Framework for Training and Hardware Co-optimization in FPGAs.
Proceedings of the Applications in Electronics Pervading Industry, Environment and Society, 2022

2021
CNN-on-AWS: Efficient Allocation of Multikernel Applications on Multi-FPGA Platforms.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2021

Machine-Learning-Based Microwave Sensing: A Case Study for the Food Industry.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2021

High-Level Annotation of Routing Congestion for Xilinx Vivado HLS Designs.
IEEE Access, 2021

FPGA Acceleration of 3D FDTD for Multi-Antennas Microwave Imaging Using HLS.
IEEE Access, 2021

Efficient Training and Hardware Co-design of Machine Learning Models.
Proceedings of the Applications in Electronics Pervading Industry, Environment and Society, 2021

2020
Power-Optimal Mapping of CNN Applications to Cloud-Based Multi-FPGA Platforms.
IEEE Trans. Circuits Syst., 2020

A Prototype Microwave System for 3D Brain Stroke Imaging.
Sensors, 2020

A Machine-Learning Based Microwave Sensing Approach to Food Contaminant Detection.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020

2019
Efficient FPGA Implementation of PCA Algorithm for Large Data using High Level Synthesis.
Proceedings of the 15th Conference on Ph.D. Research in Microelectronics and Electronics, 2019

HLS-Based Flexible Hardware Accelerator for PCA Algorithm on a Low-Cost ZYNQ SoC.
Proceedings of the 2019 IEEE Nordic Circuits and Systems Conference, 2019

Exact and Heuristic Allocation of Multi-kernel Applications to Multi-FPGA Platforms.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

2018
Lower-Order Compensation Chain Threshold-Reduction Technique for Multi-Stage Voltage Multipliers.
Sensors, 2018

Design-Space Exploration of Pareto-Optimal Architectures for Deep Learning with DVFS.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Energy-performance design exploration of a low-power microprogrammed deep-learning accelerator.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

2017
Accelerators for Breast Cancer Detection.
ACM Trans. Embed. Comput. Syst., 2017

A COTS-Based Microwave Imaging System for Breast-Cancer Detection.
IEEE Trans. Biomed. Circuits Syst., 2017

Power-performance assessment of different DVFS control policies in NoCs.
J. Parallel Distributed Comput., 2017

ICARO-PAPM: Congestion Management with Selective Queue Power-Gating.
Proceedings of the 2017 International Conference on High Performance Computing & Simulation, 2017

2016
Increasing the Efficiency of Latency-Driven DVFS with a Smart NoC Congestion Management Strategy.
Proceedings of the 10th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2016

2015
A synchronous latency-insensitive RISC for better than worst-case design.
Integr., 2015

Acceleration of microwave imaging algorithms for breast cancer detection via High-Level Synthesis.
Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

Rate-based vs delay-based control for DVFS in NoC.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

A low-cost, fast, and accurate microwave imaging system for breast cancer detection.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2015

Microwave Imaging for Breast Cancer Detection: A COTS-Based Prototype.
Proceedings of the Applications in Electronics Pervading Industry, Environment and Society, 2015

2014
UWB microwave imaging for breast cancer detection: Many-core, GPU, or FPGA?
ACM Trans. Embed. Comput. Syst., 2014

Simulation and design of an UWB imaging system for breast cancer detection.
Integr., 2014

Accelerator Memory Reuse in the Dark Silicon Era.
IEEE Comput. Archit. Lett., 2014

2013
Hardware Acceleration of Beamforming in a UWB Imaging Unit for Breast Cancer Detection.
VLSI Design, 2013

LAURA-NoC: Local Automatic Rate Adjustment in Network-on-Chips With a Simple DVFS.
IEEE Trans. Circuits Syst. II Express Briefs, 2013

UWB receiver for breast cancer detection: Comparison between two different approaches.
Proceedings of the 2013 IEEE International SOC Conference, Erlangen, Germany, 2013

Joint delay and power control in single-server queueing systems.
Proceedings of the IEEE Online Conference on Green Communications, OnlineGreenComm 2013, 2013

Breast cancer detection based on an UWB imaging system: Receiver design and simulations.
Proceedings of 2013 International Conference on IC Design & Technology, 2013

2012
Exploiting space diversity and Dynamic Voltage Frequency Scaling in multiplane Network-on-Chips.
Proceedings of the 2012 IEEE Global Communications Conference, 2012

DVFS Based on Voltage Dithering and Clock Scheduling for GALS Systems.
Proceedings of the 18th IEEE International Symposium on Asynchronous Circuits and Systems, 2012

2011
A NoC-based hybrid message-passing/shared-memory approach to CMP design.
Microprocess. Microsystems, 2011

Half-buffer retiming and token cages for synchronous elastic circuits.
IET Comput. Digit. Tech., 2011

Coupling latency-insensitivity with variable-latency for better than worst case design: a RISC case study.
Proceedings of the 21st ACM Great Lakes Symposium on VLSI 2010, 2011

2010
MEDEA: a hybrid shared-memory/message-passing multiprocessor NoC-based architecture.
Proceedings of the Design, Automation and Test in Europe, 2010

A flexible UWB Transmitter for breast cancer detection imaging systems.
Proceedings of the Design, Automation and Test in Europe, 2010

Improving Synchronous Elastic Circuits: Token Cages and Half-Buffer Retiming.
Proceedings of the 16th IEEE International Symposium on Asynchronous Circuits and Systems, 2010

2009
A Case Study for NoC-Based Homogeneous MPSoC Architectures.
IEEE Trans. Very Large Scale Integr. Syst., 2009

A mixed-signal demodulator for a low-complexity IR-UWB receiver: Methodology, simulation and design.
Integr., 2009

Adaptive Latency Insensitive Protocols and Elastic Circuits with Early Evaluation: A Comparative Analysis.
Proceedings of the 4th International Workshop on the Application of Formal Methods for Globally Asynchronous and Locally Synchronous Design, 2009

A Fully Differential Digital CMOS UWB Pulse Generator.
Circuits Syst. Signal Process., 2009

2008
A VHDL-AMS Simulation Environment for an UWB Impulse Radio Transceiver.
IEEE Trans. Circuits Syst. I Regul. Pap., 2008

2007
Adaptive Latency-Insensitive Protocols.
IEEE Des. Test Comput., 2007

A methodology and a case-study for network-on-chip based MP-SoC architectures.
Proceedings of the 2nd International ICST Conference on Nano-Networks, 2007

The NoCRay Graphic Accelerator: a Case-study for MP-SoC Network-on-Chip Design Methodology.
Proceedings of the International Symposium on System-on-Chip, 2007

An effective AMS top-down methodology applied to the design of a mixed-signal UWB system-on-chip.
Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007

2006
Floorplanning With Wire Pipelining in Adaptive Communication Channels.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2006

Implementation analysis of NoC: a MPSoC trace-driven approach.
Proceedings of the 16th ACM Great Lakes Symposium on VLSI 2006, Philadelphia, PA, USA, April 30, 2006

2005
Implementation aspects of a transmitted-reference UWB receiver.
Wirel. Commun. Mob. Comput., 2005

Throughput-driven floorplanning with wire pipelining.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2005

Floorplan assisted data rate enhancement through wire pipelining: a real assessment.
Proceedings of the 2005 International Symposium on Physical Design, 2005

On the implementation of a transmitted-reference UWB receiver.
Proceedings of the 13th European Signal Processing Conference, 2005

A New System Design Methodology for Wire Pipelined SoC.
Proceedings of the 2005 Design, 2005

2004
An electromigration and thermal model of power wires for a priori high-level reliability prediction.
IEEE Trans. Very Large Scale Integr. Syst., 2004

Effects of temperature in deep-submicron global interconnect optimization in future technology nodes.
Microelectron. J., 2004

Floorplanning for throughput.
Proceedings of the 2004 International Symposium on Physical Design, 2004

On-Chip Transparent Wire Pipelining.
Proceedings of the 22nd IEEE International Conference on Computer Design: VLSI in Computers & Processors (ICCD 2004), 2004

Issues in Implementing Latency Insensitive Protocols.
Proceedings of the 2004 Design, 2004

A new approach to latency insensitive design.
Proceedings of the 41st Design Automation Conference, 2004

2003
Coupled electro-thermal modeling and optimization of clock networks.
Microelectron. J., 2003

Effects of Temperature in Deep-Submicron Global Interconnect Optimization.
Proceedings of the Integrated Circuit and System Design, 2003

A Block-Based Approach for SoC Global Interconnect Electrical Parameters Characterization.
Proceedings of the Integrated Circuit and System Design, 2003

2002
Clock Distribution Network Optimization under Self-Heating and Timing Constraints.
Proceedings of the Integrated Circuit Design. Power and Timing Modeling, 2002

Converting an Embedded Low-Power SRAM from Bulk to PD-SOI.
Proceedings of the 10th IEEE International Workshop on Memory Technology, 2002

2001
Synthesis of low-leakage PD-SOI circuits with body-biasing.
Proceedings of the 2001 International Symposium on Low Power Electronics and Design, 2001

2000
A high accuracy-low complexity model for CMOS delays.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2000

