Geng Yuan

Orcid: 0000-0001-9844-992X

According to our database1, Geng Yuan authored at least 92 papers between 2017 and 2024.

Collaborative distances:

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

On csauthors.net:

Bibliography

2024
Brain Tumor Classification on MRI in Light of Molecular Markers.
CoRR, 2024

AyE-Edge: Automated Deployment Space Search Empowering Accuracy yet Efficient Real-Time Object Detection on the Edge.
CoRR, 2024

Improving GPU Multi-Tenancy Through Dynamic Multi-Instance GPU Reconfiguration.
CoRR, 2024

Efficient Pruning of Large Language Model with Adaptive Estimation Fusion.
CoRR, 2024

EdgeOL: Efficient in-situ Online Learning on Edge Devices.
CoRR, 2024

Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM.
CoRR, 2024

Advancing Dynamic Sparse Training by Exploring Optimization Opportunities.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Waxing-and-Waning: a Generic Similarity-based Framework for Efficient Self-Supervised Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

SuperFlow: A Fully-Customized RTL-to-GDS Design Automation Flow for Adiabatic Quantum- Flux - Parametron Superconducting Circuits.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

DACO: Pursuing Ultra-low Power Consumption via DNN-Adaptive CPU-GPU CO-optimization on Mobile Devices.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

Late Breaking Result: AQFP-aware Binary Neural Network Architecture Search.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

2023
Memristor-Based Spectral Decomposition of Matrices and Its Applications.
IEEE Trans. Computers, May, 2023

MTS-LOF: Medical Time-Series Representation Learning via Occlusion-Invariant Features.
CoRR, 2023

Towards Artificial General Intelligence (AGI) in the Internet of Things (IoT): Opportunities and Challenges.
CoRR, 2023

A Life-Cycle Energy and Inventory Analysis of Adiabatic Quantum-Flux-Parametron Circuits.
CoRR, 2023

Uncertainty Quantification in Neural Networks Using Stochastic Differential Equations.
Proceedings of the 62nd Annual Conference of the Society of Instrument and Control Engineers, 2023

PackQViT: Faster Sub-8-bit Vision Transformers via Full and Packed Quantization on the Mobile.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

HotBEV: Hardware-oriented Transformer-based Multi-View 3D Detector for BEV Perception.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

SupeRBNN: Randomized Binary Neural Network Using Adiabatic Superconductor Josephson Devices.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Data Level Lottery Ticket Hypothesis for Vision Transformers.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

MOC: Multi-Objective Mobile CPU-GPU Co-Optimization for Power-Efficient DNN Inference.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

ESRU: Extremely Low-Bit and Hardware-Efficient Stochastic Rounding Unit Design for Low-Bit DNN Training.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

A Scalable Real-time Semantic Segmentation Network for Autonomous Driving.
Proceedings of the 2023 Workshop on Advanced Multimedia Computing for Smart Manufacturing and Engineering, 2023

Towards Real-Time Segmentation on the Edge.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Mobile or FPGA? A Comprehensive Evaluation on Energy Efficiency and a Unified Optimization Framework.
ACM Trans. Embed. Comput. Syst., September, 2022

Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration.
ACM Trans. Design Autom. Electr. Syst., 2022

Non-Structured DNN Weight Pruning - Is It Beneficial in Any Platform?
IEEE Trans. Neural Networks Learn. Syst., 2022

The Lottery Ticket Hypothesis for Vision Transformers.
CoRR, 2022

DeltaFS: Pursuing Zero Update Overhead via Metadata-Enabled Delta Compression for Log-structured File System on Mobile Devices.
CoRR, 2022

EfficientFormer: Vision Transformers at MobileNet Speed.
CoRR, 2022

An Automatically Privacy Protection Solution for Implementing the Right to Be Forgotten in Embedded System.
IEEE Access, 2022

Optimizing Data Layout for Training Deep Neural Networks.
Proceedings of the Companion of The Web Conference 2022, Virtual Event / Lyon, France, April 25, 2022

Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

EfficientFormer: Vision Transformers at MobileNet Speed.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

SparCL: Sparse Continual Learning on the Edge.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

BLCR: Towards Real-time DNN Execution with Block-based Reweighted Pruning.
Proceedings of the 23rd International Symposium on Quality Electronic Design, 2022

Reliability Improvement in RRAM-based DNN for Edge Computing.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Real-Time Portrait Stylization on the Edge.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding.
Proceedings of the Computer Vision - ECCV 2022, 2022

SPViT: Enabling Faster Vision Transformers via Latency-Aware Soft Token Pruning.
Proceedings of the Computer Vision - ECCV 2022, 2022

Fault-Tolerant Deep Neural Networks for Processing-In-Memory based Autonomous Edge Systems.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

FPGA-aware automatic acceleration framework for vision transformer with mixed-scheme quantization: late breaking results.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Hardware-efficient stochastic rounding unit design for DNN training: late breaking results.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

A Data-Loader Tunable Knob to Shorten GPU Idleness for Distributed Deep Learning.
Proceedings of the IEEE 15th International Conference on Cloud Computing, 2022

2021
Achieving Real-Time Object Detection on MobileDevices with Neural Pruning Search.
CoRR, 2021

Lottery Ticket Implies Accuracy Degradation, Is It a Desirable Phenomenon?
CoRR, 2021

Brief Industry Paper: Towards Real-Time 3D Object Detection for Autonomous Vehicles with Pruning Search.
Proceedings of the 27th IEEE Real-Time and Embedded Technology and Applications Symposium, 2021

Work in Progress: Mobile or FPGA? A Comprehensive Evaluation on Energy Efficiency and a Unified Optimization Framework.
Proceedings of the 27th IEEE Real-Time and Embedded Technology and Applications Symposium, 2021

MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI.
Proceedings of the 22nd International Symposium on Quality Electronic Design, 2021

FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Towards Fast and Accurate Multi-Person Pose Estimation on Mobile Devices.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

A Compression-Compilation Framework for On-mobile Real-time BERT Applications.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

ClickTrain: efficient and accurate end-to-end deep learning training via fine-grained architecture-preserving pruning.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

Lottery Ticket Preserves Weight Correlation: Is It Desirable or Not?
Proceedings of the 38th International Conference on Machine Learning, 2021

Achieving on-Mobile Real-Time Super-Resolution with Neural Architecture and Pruning Search.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

ScaleDNN: Data Movement Aware DNN Training on Multi-GPU.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

HMC-TRAN: A Tensor-core Inspired Hierarchical Model Compression for Transformer-based DNNs on GPU.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

TinyADC: Peripheral Circuit-aware Weight Pruning Framework for Mixed-signal DNN Accelerators.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

Neural Pruning Search for Real-Time Object Detection of Autonomous Vehicles.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

NPAS: A Compiler-Aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Teachers Do More Than Teach: Compressing Image-to-Image Models.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

An Explainable Convolutional Neural Networks for Automatic Segmentation of the Left Ventricle in Cardiac MRI.
Proceedings of CECNet 2021, 2021

Real-Time Mobile Acceleration of DNNs: From Computer Vision to Medical Applications.
Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

A Compression-Compilation Co-Design Framework Towards Real-Time Object Detection on Mobile Devices.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device.
CoRR, 2020

6.7ms on Mobile with over 78% ImageNet Accuracy: Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration.
CoRR, 2020

An Efficient End-to-End Deep Learning Training Framework via Fine-Grained Pattern-Based Pruning.
CoRR, 2020

Achieving Real-Time Execution of Transformer-based Large-scale Models on Mobile with Compiler-aware Neural Architecture Optimization.
CoRR, 2020

SS-Auto: A Single-Shot, Automatic Structured Weight Pruning Framework of DNNs with Ultra-High Efficiency.
CoRR, 2020

A DNN Compression Framework for SOT-MRAM-based Processing-In-Memory Engine.
Proceedings of the 33rd IEEE International System-on-Chip Conference, 2020

New Passive and Active Attacks on Deep Neural Networks in Medical Applications.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

Tiny but Accurate: A Pruned, Quantized and Optimized Memristor Crossbar Framework for Ultra Efficient DNN Implementation.
Proceedings of the 25th Asia and South Pacific Design Automation Conference, 2020

2019
A SOT-MRAM-based Processing-In-Memory Engine for Highly Compressed DNN Implementation.
CoRR, 2019

Non-structured DNN Weight Pruning Considered Harmful.
CoRR, 2019

Toward Extremely Low Bit and Lossless Accuracy in DNNs with Progressive ADMM.
CoRR, 2019

ResNet Can Be Pruned 60x: Introducing Network Purification and Unused Path Removal (P-RM) after Weight Pruning.
CoRR, 2019

ResNet Can Be Pruned 60×: Introducing Network Purification and Unused Path Removal (P-RM) after Weight Pruning.
Proceedings of the IEEE/ACM International Symposium on Nanoscale Architectures, 2019

An Ultra-Efficient Memristor-Based DNN Framework with Structured Weight Pruning and Quantization Using ADMM.
Proceedings of the 2019 IEEE/ACM International Symposium on Low Power Electronics and Design, 2019

2018
An area and energy efficient design of domain-wall memory-based deep convolutional neural networks using stochastic computing.
Proceedings of the 19th International Symposium on Quality Electronic Design, 2018

Structured Weight Matrices-Based Hardware Accelerators in Deep Neural Networks: FPGAs and ASICs.
Proceedings of the 2018 on Great Lakes Symposium on VLSI, 2018

Towards Ultra-High Performance and Energy Efficiency of Deep Learning Systems: An Algorithm-Hardware Co-Optimization Framework.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-CirculantWeight Matrices.
CoRR, 2017

Memristor crossbar-based ultra-efficient next-generation baseband processors.
Proceedings of the IEEE 60th International Midwest Symposium on Circuits and Systems, 2017

CirCNN: accelerating and compressing deep neural networks using block-circulant weight matrices.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017


  Loading...