Mohamed M. Sabry

Orcid: 0000-0002-8018-1264

According to our database, Mohamed M. Sabry authored at least 63 papers between 2008 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.






PSRR-MaxpoolNMS++: Fast Non-Maximum Suppression With Discretization and Pooling.
IEEE Trans. Pattern Anal. Mach. Intell., February, 2025

LUT-DLA: Lookup Table as Efficient Extreme Low-Bit Deep Learning Accelerator.
CoRR, January, 2025

From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks.
CoRR, 2024

ViTA: A Highly Efficient Dataflow and Architecture for Vision Transformers.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

MC-ELMM: Multi-Chip Endurance-Limited Memory Management.
Proceedings of the International Symposium on Memory Systems, 2023

EXTENT: Enabling Approximation-Oriented Energy Efficient STT-RAM Write Circuit.
CoRR, 2022

EXTENT: Enabling Approximation-Oriented Energy Efficient STT-RAM Write Circuit.
IEEE Access, 2022

Sub-10nm Ultra-thin ZnO Channel FET with Record-High 561 µA/µm ION at VDS 1V, High µ-84 cm²/V-s and 1T-1RRAM Memory Cell Demonstration Memory Implications for Energy-Efficient Deep-Learning Computing.
Proceedings of the IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits 2022), 2022

RDO-Q: Extremely Fine-Grained Channel-Wise Quantization via Rate-Distortion Optimization.
Proceedings of the Computer Vision - ECCV 2022, 2022

Scalable Hardware Acceleration of Non-Maximum Suppression.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

SonicFFT: A system architecture for ultrasonic-based FFT acceleration.
Proceedings of the 27th Asia and South Pacific Design Automation Conference, 2022

Pearl: Towards Optimization of DNN-accelerators Via Closed-Form Analytical Representation.
Proceedings of the 27th Asia and South Pacific Design Automation Conference, 2022

A*HAR: A New Benchmark towards Semi-supervised learning for Class-imbalanced Human Activity Recognition.
CoRR, 2021

Rate-Distortion Optimized Coding for Efficient CNN Compression.
Proceedings of the 31st Data Compression Conference, 2021

Efficient Tunstall Decoder for Deep Neural Network Compression.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

PSRR-MaxpoolNMS: Pyramid Shifted MaxpoolNMS With Relationship Recovery.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Towards Deeply Scaled 3D MPSoCs with Integrated Flow Cell Array Technology.
Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

Quantifying the Benefits of Monolithic 3D Computing Systems Enabled by TFT and RRAM.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

Fledge: Flexible Edge Platforms Enabled by In-memory Computing.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

The N3XT Approach to Energy-Efficient Abundant-Data Computing.
Proc. IEEE, 2019

Dataflow-based Joint Quantization of Weights and Activations for Deep Neural Networks.
CoRR, 2019

A 43pJ/Cycle Non-Volatile Microcontroller with 4.7μs Shutdown/Wake-up Integrating 2.3-bit/Cell Resistive RAM and Resilience Techniques.
Proceedings of the IEEE International Solid-State Circuits Conference, 2019

TEA-DNN: the Quest for Time-Energy-Accuracy Co-optimized Deep Neural Networks.
Proceedings of the 2019 IEEE/ACM International Symposium on Low Power Electronics and Design, 2019

N3XT Monolithic 3D Energy-Efficient Computing Systems.
Proceedings of the 2019 on Great Lakes Symposium on VLSI, 2019

Dataflow-Based Joint Quantization for Deep Neural Networks.
Proceedings of the Data Compression Conference, 2019

MaxpoolNMS: Getting Rid of NMS Bottlenecks in Two-Stage Object Detectors.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

EAST-DNN: Expediting architectural SimulaTions using deep neural networks: work-in-progress.
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis Companion, 2019

PowerCool: Simulation of Cooling and Powering of 3D MPSoCs with Integrated Flow Cell Arrays.
IEEE Trans. Computers, 2018

Hardware-Aware Softmax Approximation for Deep Neural Networks.
Proceedings of the Computer Vision - ACCV 2018, 2018

Classification of Resilience Techniques Against Functional Errors at Higher Abstraction Layers of Digital Systems.
ACM Comput. Surv., 2017

3D nanosystems enable embedded abundant-data computing: special session paper.
Proceedings of the Twelfth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis Companion, 2017

Nano-engineered architectures for ultra-low power wireless body sensor nodes.
Proceedings of the Eleventh IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2016

Memories for NTC.
Proceedings of the Near Threshold Computing, Technology, Methods and Applications, 2016

Power-Thermal Modeling and Control of Energy-Efficient Servers and Datacenters.
Proceedings of the Handbook on Data Centers, 2015

Classification Framework for Analysis and Modeling of Physically Induced Reliability Violations.
ACM Comput. Surv., 2015

Energy-Efficient Abundant-Data Computing: The N3XT 1,000x.
Computer, 2015

ICCAD 2015 Contest in 3D Interlayer Cooling Optimized Network.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2015

Monolithic 3D integration: a path from concept to reality.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

OCEAN: An Optimized HW/SW Reliability Mitigation Approach for Scratchpad Memories in Real-Time SoCs.
ACM Trans. Embed. Comput. Syst., 2014

A Semi-Analytical Thermal Modeling Framework for Liquid-Cooled ICs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2014

Temperature-Aware Design and Management for 3D Multi-Core Architectures.
Found. Trends Electron. Des. Autom., 2014

PowerCool: simulation of integrated microfluidic power generation in bright silicon MPSoCs.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2014

Integrated microfluidic power generation and cooling for bright silicon MPSoCs.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Global fan speed control considering non-ideal temperature measurements in enterprise servers.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

A quality-scalable and energy-efficient approach for spectral analysis of heart rate variability.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Resolving the memory bottleneck for single supply near-threshold computing.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

GreenCool: An Energy-Efficient Liquid Cooling Design Technique for 3-D MPSoCs Via Channel Width Modulation.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2013

Wearout-aware compiler-directed register assignment for embedded systems.
Proceedings of the Thirteenth International Symposium on Quality Electronic Design, 2012

Thermal balancing of liquid-cooled 3D-MPSoCs using channel modulation.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

A hybrid HW-SW approach for intermittent error mitigation in streaming-based embedded systems.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

Design of energy efficient and dependable health monitoring systems under unreliable nanometer technologies.
Proceedings of the 7th International Conference on Body Area Networks, 2012

Energy-Efficient Multiobjective Thermal Control for Liquid-Cooled 3-D Stacked Architectures.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2011

Attaining Single-Chip, High-Performance Computing through 3D Systems with Active Cooling.
IEEE Micro, 2011

Hierarchical Thermal Management Policy for High-Performance 3D Systems With Liquid Cooling.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2011

Thermal analysis and active cooling management for 3D MPSoCs.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2011), 2011

PRO3D, Programming for Future 3D Manycore Architectures: Project's Interim Status.
Proceedings of the Formal Methods for Components and Objects, 10th International Symposium, 2011

Towards thermally-aware design of 3D MPSoCs with inter-tier cooling.
Proceedings of the Design, Automation and Test in Europe, 2011

Thermal-Aware Compilation for Register Window-Based Embedded Processors.
IEEE Embed. Syst. Lett., 2010

Fuzzy control for enforcing energy efficiency in high-performance 3D systems.
Proceedings of the 2010 International Conference on Computer-Aided Design, 2010

Performance and energy trade-offs analysis of L2 on-chip cache architectures for embedded MPSoCs.
Proceedings of the 20th ACM Great Lakes Symposium on VLSI 2009, 2010

Thermal-aware compilation for system-on-chip processing architectures.
Proceedings of the 20th ACM Great Lakes Symposium on VLSI 2009, 2010

TLM-Based Verification of a Combined Switching Networks-on-Chip Router.
Proceedings of the Forum on specification and Design Languages, 2008
