Naifeng Jing

Orcid: 0000-0001-8417-5796

According to our database, Naifeng Jing authored at least 116 papers between 2009 and 2024.

Bibliography

2024
RecPIM: Efficient In-Memory Processing for Personalized Recommendation Inference Using Near-Bank Architecture.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., October, 2024

DeltaGNN: Accelerating Graph Neural Networks on Dynamic Graphs With Delta Updating.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., April, 2024

3A-ReRAM: Adaptive Activation Accumulation in ReRAM-Based CNN Accelerator.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., January, 2024

A novel vehicle collision detection system: Integrating audio-visual fusion for enhanced performance.
Expert Syst. Appl., 2024

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration.
CoRR, 2024

A Flexible and High-Precision Activation Function Unit Based on Equi-Error Partitioning Algorithm.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2024

A 0.8-ps RMS Precision Period Jitter Measurement Circuit with Offset Reduction.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2024

UM-PIM: DRAM-based PIM with Uniform & Shared Memory Space.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

VDA: A Simple but Efficient Virtual-Channel-Based Deadlock Avoidance Scheme for Scalable Chiplet Networks.
Proceedings of the Great Lakes Symposium on VLSI 2024, 2024

Watt: A Write-Optimized RRAM-Based Accelerator for Attention.
Proceedings of the Euro-Par 2024: Parallel Processing, 2024

Enabling Multiple Tensor-wise Operator Fusion for Transformer Models on Spatial Accelerators.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

CMC: Video Transformer Acceleration via CODEC Assisted Matrix Condensing.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

SparGNN: Efficient Joint Feature-Model Sparsity Exploitation in Graph Neural Network Acceleration.
Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024

Bridge-NDP: Achieving Efficient Communication-Computation Overlap in Near Data Processing with Bridge Architecture.
Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024

2023
A Point Cloud Video Recognition Acceleration Framework Based on Tempo-Spatial Information.
IEEE Trans. Parallel Distributed Syst., December, 2023

Real-Time Video Recognition via Decoder-Assisted Neural Network Acceleration Framework.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., July, 2023

A Reschedulable Dataflow-SIMD Execution for Increased Utilization in CGRA Cross-Domain Acceleration.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., March, 2023

E²-VOR: An End-to-End En/Decoder Architecture for Efficient Video Object Recognition.
ACM Trans. Design Autom. Electr. Syst., January, 2023

Exploiting bit sparsity in both activation and weight in neural networks accelerators.
Integr., 2023

RTMDet-R2: An Improved Real-Time Rotated Object Detector.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

HyAcc: A Hybrid CAM-MAC RRAM-based Accelerator for Recommendation Model.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

ACET: An Adaptive Clock Scheme Exploiting Comprehensive Timing Slack for Reconfigurable Processors.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

Pipeline Balancing for Integrated Mapping in High Performance Spatial Programmable Architecture.
Proceedings of the 33rd International Conference on Field-Programmable Logic and Applications, 2023

PRADA: Point Cloud Recognition Acceleration via Dynamic Approximation.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

AdaS: A Fast and Energy-Efficient CNN Accelerator Exploiting Bit-Sparsity.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

An Efficient near-Bank Processing Architecture for Personalized Recommendation System.
Proceedings of the 28th Asia and South Pacific Design Automation Conference, 2023

An NoC-based CNN Accelerator for Edge Computing.
Proceedings of the 15th IEEE International Conference on ASIC, 2023

High-Performance Genomic Analysis Heterogeneous System Using OpenCL.
Proceedings of the 15th IEEE International Conference on ASIC, 2023

ReMap: Reorder Mapping for Multi-level Uneven Distribution on Sparse ReRAM Accelerator.
Proceedings of the 15th IEEE International Conference on ASIC, 2023

PRADA: Point Cloud Recognition Acceleration via Dynamic Approximation.
Proceedings of the ACM Turing Award Celebration Conference - China 2023, 2023

2022
An Efficient CNN Accelerator Using Inter-Frame Data Reuse of Videos on FPGAs.
IEEE Trans. Very Large Scale Integr. Syst., 2022

A Novel Architecture Design for Output Significance Aligned Flow with Adaptive Control in ReRAM-based Neural Network Accelerator.
ACM Trans. Design Autom. Electr. Syst., 2022

A Universal RRAM-Based DNN Accelerator With Programmable Crossbars Beyond MVM Operator.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

A Hybrid-Grained Remapping Defense Scheme Against Hard Failures for Row-Column-NVM.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

A Low Coupling and Lightweight Algorithm for Ship Detection in Optical Remote Sensing Images.
IEEE Geosci. Remote. Sens. Lett., 2022

RePAST: A ReRAM-based PIM Accelerator for Second-order Training of DNN.
CoRR, 2022

DNN Training Acceleration via Exploring GPGPU Friendly Sparsity.
CoRR, 2022

CP-ViT: Cascade Vision Transformer Pruning via Progressive Sparsity Prediction.
CoRR, 2022

Ristretto: An Atomized Processing Architecture for Sparsity-Condensed Stream Flow in CNN.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

GCNTrain: A Unified and Efficient Accelerator for Graph Convolutional Neural Network Training.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

Gzippo: Highly-Compact Processing-in-Memory Graph Accelerator Alleviating Sparsity and Redundancy.
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

E²SR: an end-to-end video CODEC assisted system for super resolution acceleration.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

EBSP: evolving bit sparsity patterns for hardware-friendly inference of quantized deep neural networks.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Boosting ReRAM-based DNN by Row Activation Oversubscription.
Proceedings of the 27th Asia and South Pacific Design Automation Conference, 2022

2021
A 3.85-Gb/s 8 × 8 Soft-Output MIMO Detector With Lattice-Reduction-Aided Channel Preprocessing.
IEEE Trans. Very Large Scale Integr. Syst., 2021

ITT-RNA: Imperfection Tolerable Training for RRAM-Crossbar-Based Deep Neural-Network Accelerator.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2021

SME: ReRAM-based Sparse-Multiplication-Engine to Squeeze-Out Bit Sparsity of Neural Network.
CoRR, 2021

PIPArch: Programmable Image Processing Architecture Using Sliding Array.
Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, September 30, 2021

Fast FPGA-Based Emulation for ReRAM-Enabled Deep Neural Network Accelerator.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

Re2PIM: A Reconfigurable ReRAM-Based PIM Design for Variable-Sized Vector-Matrix Multiplication.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

Subgraph Decoupling and Rescheduling for Increased Utilization in CGRA Architecture.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

A Mapping Method for Reconfigurable Array based on Decoupled DataFlow.
Proceedings of the 7th IEEE International Conference on Big Data Security on Cloud, 2021

Exploiting Dynamic Bit Sparsity in Activation for Deep Neural Network Acceleration.
Proceedings of the 14th IEEE International Conference on ASIC, 2021

2020
A Hierarchical Scrubbing Technique for SEU Mitigation on SRAM-Based FPGAs.
IEEE Trans. Very Large Scale Integr. Syst., 2020

Priority Branches for Ship Detection in Optical Remote Sensing Images.
Remote. Sens., 2020

Design of a Hardware Accelerator for Zero-Knowledge Proof in Blockchains.
Proceedings of the Smart Computing and Communication - 5th International Conference, 2020

Reliable SoC Design and Implementation of SHA-3-HMAC Algorithm with Attack Protection.
Proceedings of the IEEE International Conference on Smart Cloud, 2020

VR-DANN: Real-Time Video Recognition via Decoder-Assisted Neural Network Acceleration.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

A Low Power Temperature-Compensated Common-Mode Voltage Detector for Dynamic Amplifiers.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020

Decoupling the Multi-Rate Dataflow Execution in Coarse-Grained Reconfigurable Array.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020

DRQ: Dynamic Region-based Quantization for Deep Neural Network Acceleration.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

PRArch: Pattern-Based Reconfigurable Architecture for Deep Neural Network Acceleration.
Proceedings of the 22nd IEEE International Conference on High Performance Computing and Communications; 18th IEEE International Conference on Smart City; 6th IEEE International Conference on Data Science and Systems, 2020

Enabling Resistive-RAM-based Activation Functions for Deep Neural Network Acceleration.
Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

A Winograd-Based CNN Accelerator with a Fine-Grained Regular Sparsity Pattern.
Proceedings of the 30th International Conference on Field-Programmable Logic and Applications, 2020

GPNPU: Enabling Efficient Hardware-Based Direct Convolution with Multi-Precision Support in GPU Tensor Cores.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Identifying patch-level MSI from histological images of Colorectal Cancer by a Knowledge Distillation Model.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2020

A High-Throughput Tumor Location System with Deep Learning for Colorectal Cancer Histopathology Image.
Proceedings of the Artificial Intelligence in Medicine, 2020

FIMIL: A high-throughput deep learning model for abnormality detection with weak annotation in microscopy images.
Proceedings of the Australasian Computer Science Week, 2020

2019
A New Cellular-Based Redundant TSV Structure for Clustered Faults.
IEEE Trans. Very Large Scale Integr. Syst., 2019

A Novel Resistive Memory-based Process-in-memory Architecture for Efficient Logic and Add Operations.
ACM Trans. Design Autom. Electr. Syst., 2019

Energy-Efficient and Quality-Assured Approximate Computing Framework Using a Co-Training Method.
ACM Trans. Design Autom. Electr. Syst., 2019

Energy-Efficient Nonvolatile SRAM Design Based on Resistive Switching Multi-Level Cells.
IEEE Trans. Circuits Syst. II Express Briefs, 2019

Scale Adaptive Proposal Network for Object Detection in Remote Sensing Images.
IEEE Geosci. Remote. Sens. Lett., 2019

A Rapid Scrubbing Technique for SEU Mitigation on SRAM-Based FPGAs.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

A sharing-aware L1.5D cache for data reuse in GPGPUs.
Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

HUBPA: high utilization bidirectional pipeline architecture for neuromorphic computing.
Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

2018
IBOM: An Integrated and Balanced On-Chip Memory for High Performance GPGPUs.
IEEE Trans. Parallel Distributed Syst., 2018

CNFET-Based High Throughput SIMD Architecture.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

Invocation-driven neural approximate computing with a multiclass-classifier and multiple approximators.
Proceedings of the International Conference on Computer-Aided Design, 2018

AXNet: approximate computing using an end-to-end trainable neural network.
Proceedings of the International Conference on Computer-Aided Design, 2018

A FPGA Friendly Approximate Computing Framework with Hybrid Neural Networks: (Abstract Only).
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

2017
Bank Stealing for a Compact and Efficient Register File Architecture in GPGPU.
IEEE Trans. Very Large Scale Integr. Syst., 2017

A 0.33 V 2.5 μW cross-point data-aware write structure, read-half-select disturb-free sub-threshold SRAM in 130 nm CMOS.
Integr., 2017

Incorporating selective victim cache into GPGPU for high-performance computing.
Concurr. Comput. Pract. Exp., 2017

A wideband simplified transformer-based VCO with digital amplitude calibration.
Proceedings of the IEEE 60th International Midwest Symposium on Circuits and Systems, 2017

On Quality Trade-off Control for Approximate Computing Using Iterative Training.
Proceedings of the 54th Annual Design Automation Conference, 2017

Sneak-Path Based Test and Diagnosis for 1R RRAM Crossbar Using Voltage Bias Technique.
Proceedings of the 54th Annual Design Automation Conference, 2017

2016
A Novel Test Method for Metallic CNTs in CNFET-Based SRAMs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

Energy-Efficient eDRAM-Based On-Chip Storage Architecture for GPGPUs.
IEEE Trans. Computers, 2016

Cache-emulated register file: An integrated on-chip memory architecture for high performance GPGPUs.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Applying Victim Cache in High Performance GPGPU Computing.
Proceedings of the 15th International Symposium on Parallel and Distributed Computing, 2016

CNFET-based high throughput register file architecture.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016

Enabling in-situ logic-in-memory capability using resistive-RAM crossbar memory.
Proceedings of the 2016 International Conference on Field-Programmable Technology, 2016

2015
Buddy SM: Sharing Pipeline Front-End for Improved Energy Efficiency in GPGPUs.
ACM Trans. Archit. Code Optim., 2015

Timing-driven placement for carbon nanotube circuits.
Proceedings of the 28th IEEE International System-on-Chip Conference, 2015

Resource-saving compile flow for coarse-grained reconfigurable architectures.
Proceedings of the International Conference on ReConFigurable Computing and FPGAs, 2015

On diagnosable and tunable 3D clock network design for lifetime reliability enhancement.
Proceedings of the 2015 IEEE International Test Conference, 2015

CGSharing: Efficient content sharing in GPU-based cloud gaming.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

Bank stealing for conflict mitigation in GPGPU Register File.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

Redundancy based Interconnect Duplication to Mitigate Soft Errors in SRAM-based FPGAs.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2015

Jump test for metallic CNTs in CNFET-based SRAM.
Proceedings of the 52nd Annual Design Automation Conference, 2015

2014
IPF: In-Place X-Filling Algorithm for the Reliability of Modern FPGAs.
IEEE Trans. Very Large Scale Integr. Syst., 2014

2013
Compiler assisted dynamic register file in GPGPU.
Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013

An energy-efficient and scalable eDRAM-based register file architecture for GPGPU.
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

2012
SEU fault evaluation and characteristics for SRAM-based FPGA architectures and synthesis algorithms.
ACM Trans. Design Autom. Electr. Syst., 2012

Contention and energy aware mapping for real-time applications on Network-on-Chip.
Proceedings of the International SoC Design Conference, 2012

Heterogeneous configuration memory scrubbing for soft error mitigation in FPGAs.
Proceedings of the 2012 International Conference on Field-Programmable Technology, 2012

2011
A general statistical estimation for application mapping in Network-on-Chip.
Proceedings of the IEEE/IFIP 19th International Conference on VLSI and System-on-Chip, 2011

A thermal-aware task mapping flow for coarse-grain dynamic reconfigurable processor.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2011), 2011

Mitigating FPGA interconnect soft errors by in-place LUT inversion.
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011

Quantitative SEU Fault Evaluation for SRAM-Based FPGA Architectures and Synthesis Algorithms.
Proceedings of the International Conference on Field Programmable Logic and Applications, 2011

IPF: In-Place X-Filling to Mitigate Soft Errors in SRAM-Based FPGAs.
Proceedings of the International Conference on Field Programmable Logic and Applications, 2011

Fault modeling and characteristics of SRAM-based FPGAs (abstract only).
Proceedings of the ACM/SIGDA 19th International Symposium on Field Programmable Gate Arrays, 2011

2010
Statistical estimation and evaluation for communication mapping in Network-on-Chip.
Integr., 2010

Resource constrained mapping of data flow graphs onto coarse-grained reconfigurable array.
Proceedings of the Annual IEEE International SoC Conference, SoCC 2010, 2010

2009
Statistical Estimation for Total Communication Load in Application-Specific Network-on-Chip.
Proceedings of the International Conference on Embedded Software and Systems, 2009

