Guangyu Sun
Orcid: 0000-0002-7315-6589Affiliations:
- Peking University, Center for Energy-efficient Computing and Applications, Beijing, China
- Pennsylvania State University, Department of Computer Science and Engineering, University Park, PA, USA
According to our database1,
Guangyu Sun
authored at least 173 papers
between 2008 and 2024.
Collaborative distances:
Collaborative distances:
Timeline
Legend:
Book In proceedings Article PhD thesis Dataset OtherLinks
Online presence:
-
on orcid.org
On csauthors.net:
Bibliography
2024
CoMN: Algorithm-Hardware Co-Design Platform for Nonvolatile Memory-Based Convolutional Neural Network Accelerators.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., July, 2024
OriGen:Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection.
CoRR, 2024
Theseus: Towards High-Efficiency Wafer-Scale Chip Design Space Exploration for Large Language Models.
CoRR, 2024
CoRR, 2024
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024
34.3 A 22nm 64kb Lightning-Like Hybrid Computing-in-Memory Macro with a Compressed Adder Tree and Analog-Storage Quantizers for Transformer and CNNs.
Proceedings of the IEEE International Solid-State Circuits Conference, 2024
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Algorithm-Hardware Co-Design for Energy-Efficient A/D Conversion in ReRAM-Based Accelerators.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024
Oltron: Algorithm-Hardware Co-design for Outlier-Aware Quantization of LLMs with Inter-/Intra-Layer Adaptation.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024
SpecPIM: Accelerating Speculative Inference on PIM-Enabled System via Architecture-Dataflow Co-Exploration.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
FD-CNN: A Frequency-Domain FPGA Acceleration Scheme for CNN-Based Image-Processing Applications.
ACM Trans. Embed. Comput. Syst., November, 2023
COPPER: a combinatorial optimization problem solver with processing-in-memory architecture.
Frontiers Inf. Technol. Electron. Eng., May, 2023
IEEE Trans. Computers, February, 2023
Energon: Toward Efficient Acceleration of Transformers Using Dynamic Sparse Attention.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2023
ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models.
CoRR, 2023
Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance.
CoRR, 2023
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
NMExplorer: An Efficient Exploration Framework for DIMM-based Near-Memory Tensor Reduction.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
Proceedings of the Advanced Parallel Processing Technologies, 2023
Caffeine: Towards Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.
Proceedings of the ACM Turing Award Celebration Conference - China 2023, 2023
2022
Editorial for the special issue on memory architectures and systems for modern applications.
CCF Trans. High Perform. Comput., December, 2022
PIMulator-NN: An Event-Driven, Cross-Level Simulation Framework for Processing-In-Memory-Based Neural Network Accelerators.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022
A Survey of Trustworthy Graph Learning: Reliability, Explainability, and Privacy Protection.
CoRR, 2022
Proceedings of the 2022 USENIX Annual Technical Conference, 2022
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
An 82nW 0.53pJ/SOP Clock-Free Spiking Neural Network with 40µs Latency for AloT Wake-Up Functions Using Ultimate-Event-Driven Bionic Architecture and Computing-in-Memory Technique.
Proceedings of the IEEE International Solid-State Circuits Conference, 2022
Enabling High-Quality Uncertainty Quantification in a PIM Designed for Bayesian Neural Network.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022
PTQ4ViT: Post-training Quantization for Vision Transformers with Twin Uniform Quantization.
Proceedings of the Computer Vision - ECCV 2022, 2022
Tailor: removing redundant operations in memristive analog neural network accelerators.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022
Proceedings of the 4th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2022
GNNear: Accelerating Full-Batch Training of Graph Neural Networks with near-Memory Processing.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022
2021
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2021
ACM Trans. Archit. Code Optim., 2021
J. Comput. Sci. Technol., 2021
GCNear: A Hybrid Architecture for Efficient GCN Training with Near-Memory Processing.
CoRR, 2021
Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention.
CoRR, 2021
METRO: A Software-Hardware Co-Design of Interconnections for Spatial DNN Accelerators.
CoRR, 2021
NAS4RRAM: neural network architecture search for inference on RRAM-based accelerators.
Sci. China Inf. Sci., 2021
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021
Rapid Configuration of Asynchronous Recurrent Neural Networks for ASIC Implementations.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021
Proceedings of the 27th IEEE International Symposium on Asynchronous Circuits and Systems, 2021
2020
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020
Crane: Mitigating Accelerator Under-utilization Caused by Sparsity Irregularities in CNNs.
IEEE Trans. Computers, 2020
J. Comput. Sci. Technol., 2020
Customizing Trusted AI Accelerators for Efficient Privacy-Preserving Machine Learning.
CoRR, 2020
CoRR, 2020
Edge-Stream: a Stream Processing Approach for Distributed Applications on a Hierarchical Edge-computing System.
Proceedings of the 5th IEEE/ACM Symposium on Edge Computing, 2020
MobiLattice: A Depth-wise DCNN Accelerator with Hybrid Digital/Analog Nonvolatile Processing-In-Memory Block.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020
Proceedings of the 3rd USENIX Workshop on Hot Topics in Edge Computing, 2020
S2DNAS: Transforming Static CNN Model for Dynamic Inference via Neural Architecture Search.
Proceedings of the Computer Vision - ECCV 2020, 2020
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020
Proceedings of the Advanced Computer Architecture - 13th Conference, 2020
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
2019
Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019
RC-NVM: Dual-Addressing Non-Volatile Memory Architecture Supporting Both Row and Column Memory Accesses.
IEEE Trans. Computers, 2019
EdgeFlow: Open-Source Multi-layer Data Flow Processing in Edge Computing for 5G and Beyond.
IEEE Netw., 2019
Joint Task Assignment, Transmission, and Computing Resource Allocation in Multilayer Mobile Edge Computing Systems.
IEEE Internet Things J., 2019
Generalization in Generative Adversarial Networks: A Novel Perspective from Privacy Protection.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
BAYHENN: Combining Bayesian Deep Learning and Homomorphic Encryption for Secure DNN Inference.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, 2019
ZUMA: Enabling Direct Insertion/Deletion Operations with Emerging Skyrmion Racetrack Memory.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
P3SGD: Patient Privacy Preserving SGD for Regularizing Deep CNNs in Pathological Image Classification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019
G2C: A Generator-to-Classifier Framework Integrating Multi-Stained Visual Cues for Pathological Glomerulus Classification.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
2018
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018
CRAT: Enabling Coordinated Register Allocation and Thread-Level Parallelism Optimization for GPUs.
IEEE Trans. Computers, 2018
Proceedings of the IEEE 7th Non-Volatile Memory Systems and Applications Symposium, 2018
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Proceedings of the 36th IEEE International Conference on Computer Design, 2018
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018
2017
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017
IEEE Trans. Computers, 2017
CoRR, 2017
Reducing Overfitting in Deep Convolutional Neural Networks Using Redundancy Regularizer.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2017, 2017
FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates.
Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017
Protect non-volatile memory from wear-out attack based on timing difference of row buffer hit/miss.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017
Toss-up Wear Leveling: Protecting Phase-Change Memories from Inconsistent Write Patterns.
Proceedings of the 54th Annual Design Automation Conference, 2017
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017
2016
Perspectives of Racetrack Memory for Large-Capacity On-Chip Memory: From Device to System.
IEEE Trans. Circuits Syst. I Regul. Pap., 2016
Proceedings of the IEEE/ACM International Symposium on Nanoscale Architectures, 2016
Proceedings of the IEEE/ACM International Symposium on Nanoscale Architectures, 2016
Proceedings of the 2016 International Symposium on Low Power Electronics and Design, 2016
Proceedings of the 32nd IEEE International Conference on Data Engineering, 2016
Proceedings of the 26th edition on Great Lakes Symposium on VLSI, 2016
Proceedings of the 26th edition on Great Lakes Symposium on VLSI, 2016
Proceedings of the 53rd Annual Design Automation Conference, 2016
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016
2015
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2015
Proceedings of the IEEE Non-Volatile Memory System and Applications Symposium, 2015
Proceedings of the 2015 IEEE/ACM International Symposium on Nanoscale Architectures, 2015
Proceedings of the IEEE 31st Symposium on Mass Storage Systems and Technologies, 2015
Proceedings of the 48th International Symposium on Microarchitecture, 2015
Enabling coordinated register allocation and thread-level parallelism optimization for GPUs.
Proceedings of the 48th International Symposium on Microarchitecture, 2015
Leveraging emerging nonvolatile memory in high-level synthesis with loop transformations.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015
Perspectives of racetrack memory based on current-induced domain wall motion: From device to system.
Proceedings of the 2015 IEEE International Symposium on Circuits and Systems, 2015
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015
An energy efficient backup scheme with low inrush current for nonvolatile SRAM in energy harvesting sensor nodes.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015
Proceedings of the 52nd Annual Design Automation Conference, 2015
Quantitative modeling of racetrack memory, a tradeoff among area, performance, and power.
Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015
Proceedings of the 6th Asia-Pacific Workshop on Systems, 2015
Improving Memory Access Performance of In-Memory Key-Value Store Using Data Prefetching Techniques.
Proceedings of the Advanced Parallel Processing Technologies, 2015
2014
SIGARCH Comput. Archit. News, 2014
Proceedings of the International Symposium on Low Power Electronics and Design, 2014
Proceedings of the IEEE International Symposium on Circuits and Systemss, 2014
Half-DRAM: A high-bandwidth and low-power DRAM architecture from the rethinking of fine-grained activation.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014
Proceedings of the Great Lakes Symposium on VLSI 2014, GLSVLSI '14, Houston, TX, USA - May 21, 2014
An efficient design and implementation of LSM-tree based key-value store on open-channel SSD.
Proceedings of the Ninth Eurosys Conference 2014, 2014
Proceedings of the 51st Annual Design Automation Conference 2014, 2014
Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014
2013
Optimizing GPU energy efficiency with 3D die-stacking graphics memory and reconfigurable memory interface.
ACM Trans. Archit. Code Optim., 2013
Exploring the vulnerability of CMPs to soft errors with 3D stacked nonvolatile memory.
ACM J. Emerg. Technol. Comput. Syst., 2013
Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013
Lazy Precharge: An overhead-free method to reduce precharge overhead for memory parallelism improvement of DRAM system.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013
Proceedings of the Great Lakes Symposium on VLSI 2013 (part of ECRC), 2013
Proceedings of the International Conference on Compilers, 2013
Proceedings of the Web Technologies and Applications - 15th Asia-Pacific Web Conference, 2013
2012
ACM Trans. Design Autom. Electr. Syst., 2012
Proceedings of the International Symposium on Low Power Electronics and Design, 2012
Proceedings of the International Symposium on Low Power Electronics and Design, 2012
Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012
3DHLS: Incorporating high-level synthesis in physical planning of three-dimensional (3D) ICs.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012
2011
Proceedings of the 3D Integration for NoC-based SoC Architectures, 2011
Found. Trends Electron. Des. Autom., 2011
IEEE J. Emerg. Sel. Topics Circuits Syst., 2011
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011
Proceedings of the IEEE 29th International Conference on Computer Design, 2011
Exploring the vulnerability of CMPs to soft errors with 3D stacked non-volatile memory.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011
Proceedings of the 21st ACM Great Lakes Symposium on VLSI 2010, 2011
Proceedings of the 9th International Conference on Hardware/Software Codesign and System Synthesis, 2011
Proceedings of the 16th Asia South Pacific Design Automation Conference, 2011
Proceedings of the 2011 IEEE International 3D Systems Integration Conference (3DIC), Osaka, Japan, January 31, 2011
Proceedings of the 2011 IEEE International 3D Systems Integration Conference (3DIC), Osaka, Japan, January 31, 2011
2010
IEEE Trans. Very Large Scale Integr. Syst., 2010
A Hybrid solid-state storage architecture for the performance, energy consumption, and lifetime improvement.
Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010
Proceedings of the Design, Automation and Test in Europe, 2010
Proceedings of the 47th Design Automation Conference, 2010
2009
Exploration of 3D stacked L2 cache design for high performance and efficient thermal control.
Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009
3D GPU architecture using cache stacking: Performance, cost, power and thermal analysis.
Proceedings of the 27th International Conference on Computer Design, 2009
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009
Proceedings of the 14th Asia South Pacific Design Automation Conference, 2009
Proceedings of the IEEE International Conference on 3D System Integration, 2009
2008
Thermal-aware Design Considerations for Application-Specific Instruction Set Processor.
Proceedings of the IEEE Symposium on Application Specific Processors, 2008
Proceedings of the Design, Automation and Test in Europe, 2008
Circuit and microarchitecture evaluation of 3D stacking magnetic RAM (MRAM) as a universal memory replacement.
Proceedings of the 45th Design Automation Conference, 2008