Andrew Boutros

Orcid: 0000-0002-8044-1644

According to our database, Andrew Boutros authored at least 31 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
High Throughput FPGA-Based Object Detection via Algorithm-Hardware Co-Design.
ACM Trans. Reconfigurable Technol. Syst., March, 2024

Field-Programmable Gate Array Architecture for Deep Learning: Survey & Future Directions.
CoRR, 2024

A Software-Programmable Neural Processing Unit for Graph Neural Network Inference on FPGAs.
Proceedings of the 34th International Conference on Field-Programmable Logic and Applications, 2024

Stay Flexible: A High-Performance FPGA NPU Overlay for Graph Neural Networks.
Proceedings of the 32nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2024

2023
Koios 2.0: Open-Source Deep Learning Benchmarks for FPGA Architecture and CAD Research.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2023

A Fast and Flexible FPGA-based Accelerator for Natural Language Processing Neural Networks.
ACM Trans. Archit. Code Optim., March, 2023

Into the Third Dimension: Architecture Exploration Tools for 3D Reconfigurable Acceleration Devices.
Proceedings of the International Conference on Field Programmable Technology, 2023

A Whole New World: How to Architect Beyond-FPGA Reconfigurable Acceleration Devices?
Proceedings of the 33rd International Conference on Field-Programmable Logic and Applications, 2023

Placement Optimization for NoC-Enhanced FPGAs.
Proceedings of the 31st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2023

2022
Recurrent Neural Networks With Column-Wise Matrix-Vector Multiplication on FPGAs.
IEEE Trans. Very Large Scale Integr. Syst., 2022

FPGA-based AI Smart NICs for Scalable Distributed AI Training Systems.
CoRR, 2022

FPGA-Based AI Smart NICs for Scalable Distributed AI Training Systems.
IEEE Comput. Archit. Lett., 2022

Architecture and Application Co-Design for Beyond-FPGA Reconfigurable Acceleration Devices.
IEEE Access, 2022

RAD-Sim: Rapid Architecture Exploration for Novel Reconfigurable Acceleration Devices.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

2021
Specializing for Efficiency: Customizing AI Inference Processors on FPGAs.
Proceedings of the International Conference on Microelectronics, 2021

Koios: A Deep Learning Benchmark Suite for FPGA Architecture and CAD Research.
Proceedings of the 31st International Conference on Field-Programmable Logic and Applications, 2021

End-to-End FPGA-based Object Detection Using Pipelined CNN and Non-Maximum Suppression.
Proceedings of the 31st International Conference on Field-Programmable Logic and Applications, 2021

Compute-Capable Block RAMs for Efficient Deep Learning Acceleration on FPGAs.
Proceedings of the 29th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2021

2020
FPGA Logic Block Architectures for Efficient Deep Learning Inference.
ACM Trans. Reconfigurable Technol. Syst., 2020

Beyond Peak Performance: Comparing the Real Performance of AI-Optimized FPGAs and GPUs.
Proceedings of the International Conference on Field-Programmable Technology, 2020

Neighbors From Hell: Voltage Attacks Against Deep Learning Accelerators on Multi-Tenant FPGAs.
Proceedings of the International Conference on Field-Programmable Technology, 2020

2019
Scalable Low-Latency Persistent Neural Machine Translation on CPU Server with Multiple FPGAs.
Proceedings of the International Conference on Field-Programmable Technology, 2019

Evaluating and Enhancing Intel® Stratix® 10 FPGAs for Persistent Real-Time AI.
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

Math Doesn't Have to be Hard: Logic Block Architectures to Enhance Low-Precision Multiply-Accumulate on FPGAs.
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

Why Compete When You Can Work Together: FPGA-ASIC Integration for Persistent RNNs.
Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019

2018
You Cannot Improve What You Do not Measure: FPGA vs. ASIC Efficiency Gaps for Convolutional Neural Network Inference.
ACM Trans. Reconfigurable Technol. Syst., 2018

Embracing Diversity: Enhanced DSP Blocks for Low-Precision Deep Learning on FPGAs.
Proceedings of the 28th International Conference on Field Programmable Logic and Applications, 2018

2017
HW/SW Co-Design of the HOG algorithm on a Xilinx Zynq SoC.
J. Parallel Distributed Comput., 2017

Build fast, trade fast: FPGA-based high-frequency trading using high-level synthesis.
Proceedings of the International Conference on ReConFigurable Computing and FPGAs, 2017

Hardware acceleration of novel chaos-based image encryption for IoT applications.
Proceedings of the 29th International Conference on Microelectronics, 2017

2015
Real-time pedestrian detection on a Xilinx Zynq using the HOG algorithm.
Proceedings of the International Conference on ReConFigurable Computing and FPGAs, 2015

