Mehdi Kamal

Orcid: 0000-0001-7098-6440

According to our database, Mehdi Kamal authored at least 113 papers between 2006 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







FACTER: Fairness-Aware Conformal Thresholding and Prompt Engineering for Enabling Fair LLM-Based Recommender Systems.
CoRR, February, 2025

Dynamic Co-Optimization Compiler: Leveraging Multi-Agent Reinforcement Learning for Enhanced DNN Accelerator Performance.
Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025

ICD²S: A Hybrid Ising-Classical-Machines Data-Driven QUBO Solver Method.
Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025

Low-Precision Mixed-Computation Models for Inference on Edge.
IEEE Trans. Very Large Scale Integr. Syst., August, 2024

GEMA: A Genome Exact Mapping Accelerator Based on Learned Indexes.
IEEE Trans. Biomed. Circuits Syst., June, 2024

SAIM: Scalable Analog Ising Machine for Solving Quadratic Binary Optimization Problems.
CoRR, 2024

MENAGE: Mixed-Signal Event-Driven Neuromorphic Accelerator for Edge Applications.
CoRR, 2024

Efficient Noise Mitigation for Enhancing Inference Accuracy in DNNs on Mixed-Signal Accelerators.
CoRR, 2024

On the Impact of ISA Extension on Energy Consumption of I-Cache in Extensible Processors.
CoRR, 2024

Enhancing Layout Hotspot Detection Efficiency with YOLOv8 and PCA-Guided Augmentation.
CoRR, 2024

ARCO: Adaptive Multi-Agent Reinforcement Learning-Based Hardware/Software Co-Optimization Compiler for Improved Performance in DNN Accelerator Design.
CoRR, 2024

Scalable Superconductor Neuron with Ternary Synaptic Connections for Ultra-Fast SNN Hardware.
CoRR, 2024

Scalable Superconductor Ising Machine for Combinatorial Optimization Problems.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2024

X-IMM: Mixed-Signal Iterative Montgomery Modular Multiplication.
Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design, 2024

A²P-MANN: Adaptive Attention Inference Hops Pruned Memory-Augmented Neural Networks.
IEEE Trans. Neural Networks Learn. Syst., November, 2023

Memristive-based Mixed-signal CGRA for Accelerating Deep Neural Network Inference.
ACM Trans. Design Autom. Electr. Syst., July, 2023

Federated learning by employing knowledge distillation on edge devices with limited hardware resources.
Neurocomputing, April, 2023

Accuracy Configurable Adders with Negligible Delay Overhead in Exact Operating Mode.
ACM Trans. Design Autom. Electr. Syst., January, 2023

Unsupervised SFQ-Based Spiking Neural Network.
CoRR, 2023

A Josephson Parametric Oscillator-Based Ising Machine.
CoRR, 2023

ReMeCo: Reliable Memristor-Based in-Memory Neuromorphic Computation.
Proceedings of the 28th Asia and South Pacific Design Automation Conference, 2023

Posit Process Element for Using in Energy-Efficient DNN Accelerators.
IEEE Trans. Very Large Scale Integr. Syst., 2022

An Adaptive Memory-Side Encryption Method for Improving Security and Lifetime of PCM-Based Main Memory.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Distributing DNN training over IoT edge devices based on transfer learning.
Neurocomputing, 2022

AMR-MUL: An Approximate Maximally Redundant Signed Digit Multiplier.
CoRR, 2022

Heterogeneous Multi-core Array-based DNN Accelerator.
CoRR, 2022

SySCIM: SystemC-AMS Simulation of Memristive Computation In-Memory.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

Design Techniques for Approximate Realization of Data-Flow Graphs.
Proceedings of the Approximate Computing, 2022

OPTIMA: An Approach for Online Management of Cache Approximation Levels in Approximate Processing Systems.
IEEE Trans. Very Large Scale Integr. Syst., 2021

An Energy-Efficient Inference Method in Convolutional Neural Networks Based on Dynamic Adjustment of the Pruning Level.
ACM Trans. Design Autom. Electr. Syst., 2021

LATIM: Loading-Aware Offline Training Method for Inverter-Based Memristive Neural Networks.
IEEE Trans. Circuits Syst. II Express Briefs, 2021

Reliability Enhancement of Inverter-Based Memristor Crossbar Neural Networks Using Mathematical Analysis of Circuit Non-Idealities.
IEEE Trans. Circuits Syst. I Regul. Pap., 2021

Loading-Aware Reliability Improvement of Ultra-Low Power Memristive Neural Networks.
IEEE Trans. Circuits Syst. I Regul. Pap., 2021

A2P-MANN: Adaptive Attention Inference Hops Pruned Memory-Augmented Neural Networks.
CoRR, 2021

BRDS: An FPGA-based LSTM Accelerator with Row-Balanced Dual-Ratio Sparsification.
CoRR, 2021

DART: A Framework for Determining Approximation Levels in an Approximable Memory Hierarchy.
IEEE Trans. Very Large Scale Integr. Syst., 2020

Interstice: Inverter-Based Memristive Neural Networks Discretization for Function Approximation Applications.
IEEE Trans. Very Large Scale Integr. Syst., 2020

RandShift: An Energy-Efficient Fault-Tolerant Method in Secure Nonvolatile Main Memory.
IEEE Trans. Very Large Scale Integr. Syst., 2020

POLAR: A Pipelined/Overlapped FPGA-Based LSTM Accelerator.
IEEE Trans. Very Large Scale Integr. Syst., 2020

Design Exploration of Energy-Efficient Accuracy-Configurable Dadda Multipliers With Improved Lifetime Based on Voltage Overscaling.
IEEE Trans. Very Large Scale Integr. Syst., 2020

Self-Adjusting Monitor for Measuring Aging Rate and Advancement.
IEEE Trans. Emerg. Top. Comput., 2020

Offline Training Improvement of Inverter-Based Memristive Neural Networks Using Inverter Voltage Characteristic Smoothing.
IEEE Trans. Circuits Syst., 2020

Res-DNN: A Residue Number System-Based DNN Accelerator Unit.
IEEE Trans. Circuits Syst. I Regul. Pap., 2020

O⁴-DNN: A Hybrid DSP-LUT-Based Processing Unit With Operation Packing and Out-of-Order Execution for Efficient Realization of Convolutional Neural Networks on FPGA Devices.
IEEE Trans. Circuits Syst. I Regul. Pap., 2020

Block-Based Carry Speculative Approximate Adder for Energy-Efficient Applications.
IEEE Trans. Circuits Syst. II Express Briefs, 2020

X-CGRA: An Energy-Efficient Approximate Coarse-Grained Reconfigurable Architecture.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Circuit-Level Techniques for Logic and Memory Blocks in Approximate Computing Systems.
Proc. IEEE, 2020

EGAN: A Framework for Exploring the Accuracy vs. Energy Efficiency Trade-off in Hardware Implementation of Error Resilient Applications.
Proceedings of the 21st International Symposium on Quality Electronic Design, 2020

Low-power Accuracy-configurable Carry Look-ahead Adder Based on Voltage Overscaling Technique.
Proceedings of the 21st International Symposium on Quality Electronic Design, 2020

TOSAM: An Energy-Efficient Truncation- and Rounding-Based Scalable Approximate Multiplier.
IEEE Trans. Very Large Scale Integr. Syst., 2019

A Theoretical Framework for Quality Estimation and Optimization of DSP Applications Using Low-Power Approximate Adders.
IEEE Trans. Circuits Syst. I Regul. Pap., 2019

OCTAN: An On-Chip Training Algorithm for Memristive Neuromorphic Circuits.
IEEE Trans. Circuits Syst. I Regul. Pap., 2019

ACHILLES: Accuracy-Aware High-Level Synthesis Considering Online Quality Management.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

Space Expansion of Feature Selection for Designing more Accurate Error Predictors.
CoRR, 2019

Approximate Reverse Carry Propagate Adder for Energy-Efficient DSP Applications.
IEEE Trans. Very Large Scale Integr. Syst., 2018

An Efficient False Path-Aware Heuristic Critical Path Selection Method with High Coverage of the Process Variation Space.
ACM Trans. Design Autom. Electr. Syst., 2018

RAP-CLA: A Reconfigurable Approximate Carry Look-Ahead Adder.
IEEE Trans. Circuits Syst. II Express Briefs, 2018

TheSPoT: Thermal Stress-Aware Power and Temperature Management for Multiprocessor Systems-on-Chip.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

PHAX: Physical Characteristics Aware Ex-Situ Training Framework for Inverter-Based Memristive Neuromorphic Circuits.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

Toward Approximate Computing for Coarse-Grained Reconfigurable Architectures.
IEEE Micro, 2018

An Ultra Low-Power Memristive Neuromorphic Circuit for Internet of Things Smart Sensors.
IEEE Internet Things J., 2018

Lifetime improvement by exploiting aggressive voltage scaling during runtime of error-resilient applications.
Integr., 2018

Energy and Reliability Improvement of Voltage-Based, Clustered, Coarse-Grain Reconfigurable Architectures by Employing Quality-Aware Mapping.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2018

An Energy-Efficient, Yet Highly-Accurate, Approximate Non-Iterative Divider.
Proceedings of the International Symposium on Low Power Electronics and Design, 2018

Energy Consumption and Lifetime Improvement of Coarse-Grained Reconfigurable Architectures Targeting Low-Power Error-Tolerant Applications.
Proceedings of the 2018 on Great Lakes Symposium on VLSI, 2018

PX-CGRA: Polymorphic approximate coarse-grained reconfigurable architecture.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

RoBA Multiplier: A Rounding-Based Approximate Multiplier for High-Speed yet Energy-Efficient Digital Signal Processing.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Dual-Quality 4:2 Compressors for Utilizing in Dynamic Accuracy Configurable Multipliers.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Efficient Critical Path Identification Based on Viability Analysis Method Considering Process Variations.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Hybrid TFET-MOSFET circuit: A solution to design soft-error resilient ultra-low power digital circuit.
Integr., 2017

CL-CPA: A hybrid carry-lookahead/carry-propagate adder for low-power or high-performance operation mode.
Integr., 2017

LETAM: A low energy truncation-based approximate multiplier.
Comput. Electr. Eng., 2017

An energy and area efficient yet high-speed square-root carry select adder structure.
Comput. Electr. Eng., 2017

TruncApp: A truncation-based approximate divider for energy efficient DSP applications.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Robust neuromorphic computing in the presence of process variation.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

High-Speed and Energy-Efficient Carry Skip Adder Operating Under a Wide Range of Supply Voltage Levels.
IEEE Trans. Very Large Scale Integr. Syst., 2016

Yield and Speedup Improvements in Extensible Processors by Allocating Extra Cycles to Some Custom Instructions.
ACM Trans. Design Autom. Electr. Syst., 2016

All-Region Statistical Model for Delay Variation Based on Log-Skew-Normal Distribution.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

An efficient temperature dependent hot carrier injection reliability simulation flow.
Microelectron. Reliab., 2016

A comparative study on performance and reliability of 32-bit binary adders.
Integr., 2016

Power and energy reduction of racetrack-based caches by exploiting shared shift operations.
Proceedings of the 2016 IFIP/IEEE International Conference on Very Large Scale Integration, 2016

Robust Hybrid TFET-MOSFET Circuits in Presence of Process Variations and Soft Errors.
Proceedings of the VLSI-SoC: System-on-Chip in the Nanoscale Era - Design, Verification and Reliability, 2016

Hybrid TFET-MOSFET circuits: An approach to design reliable ultra-low power circuits in the presence of process variation.
Proceedings of the 2016 IFIP/IEEE International Conference on Very Large Scale Integration, 2016

SEERAD: A high speed yet energy-efficient rounding-based approximate divider.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

OPLE: A Heuristic Custom Instruction Selection Algorithm Based on Partitioning and Local Exploration of Application Dataflow Graphs.
ACM Trans. Embed. Comput. Syst., 2015

CSAM: A clock skew-aware aging mitigation technique.
Microelectron. Reliab., 2015

Workload and temperature dependent evaluation of BTI-induced lifetime degradation in digital circuits.
Microelectron. Reliab., 2015

Design of NBTI-resilient extensible processors.
Integr., 2015

An efficient network on-chip architecture based on isolating local and non-local communications.
Comput. Electr. Eng., 2015

A heuristic machine learning-based algorithm for power and thermal management of heterogeneous MPSoCs.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

Online self adjusting progressive age monitoring of timing variations.
Proceedings of the 10th International Conference on Design & Technology of Integrated Systems in Nanoscale Era, 2015

A thermal stress-aware algorithm for power and temperature management of MPSoCs.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Implementation-aware selection of the custom instruction set for extensible processors.
Microprocess. Microsystems, 2014

Impact of Process Variations on Speedup and Maximum Achievable Frequency of Extensible Processors.
ACM J. Emerg. Technol. Comput. Syst., 2014

CupCarbon: a multi-agent and discrete event wireless sensor network design and simulation tool.
Proceedings of the 7th International ICST Conference on Simulation Tools and Techniques, 2014

A heuristic path selection method for small delay defects test.
Proceedings of the 2014 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, 2014

Improving efficiency of extensible processors by using approximate custom instructions.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Considering the effect of process variations during the ISA extension design flow.
Microprocess. Microsystems, 2013

A new merit function for custom instruction selection under an area budget constraint.
Des. Autom. Embed. Syst., 2013

Capturing and mitigating the NBTI effect during the design flow for extensible processors.
Proceedings of the 8th International Conference on Design & Technology of Integrated Systems in Nanoscale Era, 2013

An efficient reliability simulation flow for evaluating the hot carrier injection effect in CMOS VLSI circuits.
Proceedings of the 30th International IEEE Conference on Computer Design, 2012

An architecture-level approach for mitigating the impact of process variations on extensible processors.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

GPH: A group-based partitioning scheme for reducing total power consumption of parallel buses.
Microprocess. Microsystems, 2011

Securing Embedded Processors against Power Analysis Based Side Channel Attacks Using Reconfigurable Architecture.
Proceedings of the IEEE/IFIP 9th International Conference on Embedded and Ubiquitous Computing, 2011

Timing variation-aware custom instruction extension technique.
Proceedings of the Design, Automation and Test in Europe, 2011

Energy-aware design space exploration of registerfile for extensible processors.
Proceedings of the 2010 International Conference on Embedded Computer Systems: Architectures, 2010

Dual-purpose custom instruction identification algorithm based on Particle Swarm Optimization.
Proceedings of the 21st IEEE International Conference on Application-specific Systems Architectures and Processors, 2010

A Novel Partitioned Encoding Scheme for Reducing Total Power Consumption of Parallel Bus.
Proceedings of the Advances in Computer Science and Engineering, 2008

GABIST: A New Methodology to Find near Optimal LFSR for BIST Structure.
Proceedings of the 14th IEEE International Conference on Electronics, 2007

HW/SW partitioning using discrete particle swarm.
Proceedings of the 17th ACM Great Lakes Symposium on VLSI 2007, 2007

Empirical Analysis of the Dependence of Test Power, Delay, Energy and Fault Coverage on the Architecture of LFSR-Based TPGs.
Proceedings of the 22nd IEEE International Symposium on Defect and Fault-Tolerance in VLSI Systems (DFT 2007), 2007

Parallel-Genetic-Algorithm-Based HW/SW Partitioning.
Proceedings of the Fifth International Conference on Parallel Computing in Electrical Engineering (PARELEC 2006), 2006

SOPC-Based Parallel Genetic Algorithm.
Proceedings of the IEEE International Conference on Evolutionary Computation, 2006
