Michaela Blott

Orcid: 0000-0002-7833-4057

Affiliations:
  • AMD Adaptive and Embedded Computing Group (AECG) Labs, Dublin, Ireland


According to our database, Michaela Blott authored at least 72 papers between 1998 and 2024.

Bibliography

2024
EcoFlow: Efficient Convolutional Dataflows on Low-Power Neural Network Accelerators.
IEEE Trans. Computers, September, 2024

High-efficiency Compressor Trees for Latest AMD FPGAs.
ACM Trans. Reconfigurable Technol. Syst., June, 2024

LL-GNN: Low Latency Graph Neural Networks on FPGAs for High Energy Physics.
ACM Trans. Embed. Comput. Syst., March, 2024

Corrigendum: Applications and techniques for fast machine learning in science.
Frontiers Big Data, 2024

ACCL+: an FPGA-Based Collective Engine for Distributed Applications.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs.
Proceedings of the 34th International Conference on Field-Programmable Logic and Applications, 2024

Optimizing Communication for Latency Sensitive HPC Applications on up to 48 FPGAs Using ACCL.
Proceedings of the Euro-Par 2024: Parallel Processing, 2024

2023
On the RTL Implementation of FINN Matrix Vector Unit.
ACM Trans. Embed. Comput. Syst., November, 2023

Fault-Tolerant Neural Network Accelerators With Selective TMR.
IEEE Des. Test, April, 2023

AMNES: Accelerating the computation of data correlation using FPGAs.
Proc. VLDB Endow., 2023

Post-Training Quantization with Low-precision Minifloats and Integers on FPGAs.
CoRR, 2023

2022
Elastic-DF: Scaling Performance of DNN Inference in FPGA Clouds through Automatic Partitioning.
ACM Trans. Reconfigurable Technol. Syst., 2022

RadioML Meets FINN: Enabling Future RF Applications With FPGA Streaming Architectures.
IEEE Micro, 2022

Applications and Techniques for Fast Machine Learning in Science.
Frontiers Big Data, 2022

Evaluating Theoretical Baselines for ML Benchmarking Across Different Accelerators.
IEEE Des. Test, 2022

Implementing Neural Network-Based Equalizers in a Coherent Optical Transmission System Using Field-Programmable Gate Arrays.
CoRR, 2022

LL-GNN: Low Latency Graph Neural Networks on FPGAs for Particle Detectors.
CoRR, 2022

Towards FPGA Implementation of Neural Network-Based Nonlinearity Mitigation Equalizers in Coherent Optical Transmission Systems.
CoRR, 2022

Open-source FPGA-ML codesign for the MLPerf Tiny Benchmark.
CoRR, 2022

QONNX: Representing Arbitrary-Precision Quantized Neural Networks.
CoRR, 2022

On the RTL Implementation of FINN Matrix Vector Compute Unit.
CoRR, 2022

Machine Learning Aided Hardware Resource Estimation for FPGA DNN Implementations.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

2021
Evaluation of Optimized CNNs on Heterogeneous Accelerators Using a Novel Benchmarking Approach.
IEEE Trans. Computers, 2021

Benchmarking vision kernels and neural network inference accelerators on embedded platforms.
J. Syst. Archit., 2021

Performance vs. hardware requirements in state-of-the-art automatic speech recognition.
EURASIP J. Audio Speech Music. Process., 2021

Applications and Techniques for Fast Machine Learning in Science.
CoRR, 2021

ACCL: FPGA-Accelerated Collectives over 100 Gbps TCP-IP.
Proceedings of the IEEE/ACM International Workshop on Heterogeneous High-performance Reconfigurable Computing, 2021

Optimized Implementation of the HPCG Benchmark on Reconfigurable Hardware.
Proceedings of the Euro-Par 2021: Parallel Processing, 2021

2020
FAT: Training Neural Networks for Reliable Inference Under Hardware Faults.
Proceedings of the IEEE International Test Conference, 2020

Memory-Efficient Dataflow Inference for Deep CNNs on FPGA.
Proceedings of the International Conference on Field-Programmable Technology, 2020

DarwiNN: efficient distributed neuroevolution under communication constraints.
Proceedings of the GECCO '20: Genetic and Evolutionary Computation Conference, 2020

Evolutionary bin packing for memory-efficient dataflow inference acceleration on FPGA.
Proceedings of the GECCO '20: Genetic and Evolutionary Computation Conference, 2020

LogicNets: Co-Designed Neural Networks and Circuits for Extreme-Throughput Applications.
Proceedings of the 30th International Conference on Field-Programmable Logic and Applications, 2020

LPAC: A Low-Precision Accelerator for CNN on FPGAs.
Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

Evaluation of Optimized CNNs on FPGA and non-FPGA based Accelerators using a Novel Benchmarking Approach.
Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

High-Throughput DNN Inference with LogicNets.
Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020

2019
QuTiBench: Benchmarking Neural Networks on Heterogeneous Hardware.
ACM J. Emerg. Technol. Comput. Syst., 2019

Synetgy: Algorithm-hardware Co-design for ConvNet Accelerators on Embedded FPGAs.
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

Efficient Error-Tolerant Quantized Neural Network Accelerators.
Proceedings of the 2019 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, 2019

2018
FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks.
ACM Trans. Reconfigurable Technol. Syst., 2018

FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks.
CoRR, 2018

Designing scalable FPGA architectures using high-level synthesis.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

FINN-L: Library Extensions and Design Trade-Off Analysis for Variable Precision LSTM Networks on FPGAs.
Proceedings of the 28th International Conference on Field Programmable Logic and Applications, 2018

Customizing Low-Precision Deep Neural Networks for FPGAs.
Proceedings of the 28th International Conference on Field Programmable Logic and Applications, 2018

Inference of quantized neural networks on heterogeneous all-programmable devices.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Accuracy to Throughput Trade-Offs for Reduced Precision Neural Networks on Reconfigurable Logic.
Proceedings of the Applied Reconfigurable Computing. Architectures, Tools, and Applications, 2018

2017
Databases on Future Hardware (Dagstuhl Seminar 17101).
Dagstuhl Reports, 2017

Compressing Low Precision Deep Neural Networks Using Sparsity-Induced Regularization in Ternary Networks.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

Scaling Neural Network Performance through Customized Hardware Architectures on Reconfigurable Logic.
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

Scaling Binarized Neural Networks on Reconfigurable Logic.
Proceedings of the 8th Workshop and 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms, 2017

FINN: A Framework for Fast, Scalable Binarized Neural Network Inference.
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

Architectural optimizations for high performance and energy efficient Smith-Waterman implementation on FPGAs using OpenCL.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Towards exascale computing with heterogeneous architectures.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

2016
On How to Improve FPGA-Based Systems Design Productivity via SDAccel.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Reconfigurable future for HPC.
Proceedings of the International Conference on High Performance Computing & Simulation, 2016

2015
A Hash Table for Line-Rate Data Processing.
ACM Trans. Reconfigurable Technol. Syst., 2015

Scaling Out to a Single-Node 80Gbps Memcached Server with 40Terabytes of Memory.
Proceedings of the 7th USENIX Workshop on Hot Topics in Storage and File Systems, 2015

Scalable 10Gbps TCP/IP Stack Architecture for Reconfigurable Hardware.
Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

2014
OSNT: open source network tester.
IEEE Netw., 2014

High-Level Synthesis Case Study: Implementation of a Memcached Server.
CoRR, 2014

2013
Achieving 10Gbps Line-rate Key-value Stores with FPGAs.
Proceedings of the 5th USENIX Workshop on Hot Topics in Cloud Computing, 2013

Dataflow architectures for 10Gbps line-rate key-value-stores.
Proceedings of the 2013 IEEE Hot Chips 25 Symposium (HCS), 2013

A flexible hash table design for 10Gbps key-value stores on FPGAs.
Proceedings of the 23rd International Conference on Field programmable Logic and Applications, 2013

Architecture for an open source network tester.
Proceedings of the Symposium on Architecture for Networking and Communications Systems, 2013

2012
A Low-Latency Library in FPGA Hardware for High-Frequency Trading (HFT).
Proceedings of the IEEE 20th Annual Symposium on High-Performance Interconnects, 2012

2010
Design of a flexible high-speed FPGA-based flow monitor for next generation networks.
Proceedings of the 2010 International Conference on Embedded Computer Systems: Architectures, 2010

2009
Debugging FPGA-based packet processing systems through transaction-level communication-centric monitoring.
Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, 2009

Automated instrumentation of FPGA-based systems for system-level transaction monitoring.
Proceedings of the 2008 IEEE International Symposium on System-on-Chip, 2009

Architectural Comparison of Instruments for Transaction Level Monitoring of FPGA-Based Packet Processing Systems.
Proceedings of the FCCM 2009, 2009

2001
Starburst: Building next-generation internet devices.
Bell Labs Tech. J., 2001

1998
A Networked Frame Buffer with Windows Management Support.
Proceedings of the EUROMEDIA 1998 featuring WEBTEC-MEDIATEC-COMTEC-APTEC, 1998

