Zhiru Zhang
ORCID: 0000-0002-0778-0308
According to our database, Zhiru Zhang authored at least 156 papers between 2003 and 2024.
Bibliography
2024
Proc. ACM Program. Lang., 2024
Proc. ACM Program. Lang., 2024
CoRR, 2024
CoRR, 2024
Supporting a Virtual Vector Instruction Set on a Commercial Compute-in-SRAM Accelerator.
IEEE Comput. Archit. Lett., 2024
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Second Tiny Papers Track at ICLR 2024, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
LibPreemptible: Enabling Fast, Adaptive, and Hardware-Assisted User-Space Scheduling.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Less is More: Hop-Wise Graph Attention for Scalable and Generalizable Learning on Circuits.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024
Slapo: A Schedule Language for Progressive Optimization of Large Deep Learning Model Training.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
RapidStream 2.0: Automated Parallel Implementation of Latency-Insensitive FPGA Designs Through Partial Reconfiguration.
ACM Trans. Reconfigurable Technol. Syst., December, 2023
TAPA: A Scalable Task-parallel Dataflow Programming Framework for Modern FPGAs with Co-optimization of HLS and Physical Design.
ACM Trans. Reconfigurable Technol. Syst., December, 2023
A 28-nm 8-bit Floating-Point Tensor Core-Based Programmable CNN Training Processor With Dynamic Structured Sparsity.
IEEE J. Solid State Circuits, 2023
Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference.
CoRR, 2023
CoRR, 2023
CoRR, 2023
Proceedings of the ACM SIGCOMM 2023 Conference, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the 2023 International Symposium on Physical Design, 2023
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2023
2022
ACM Trans. Reconfigurable Technol. Syst., 2022
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022
CoRR, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
GARNET: Reduced-Rank Topology Learning for Robust and Scalable Graph Neural Networks.
Proceedings of the Learning on Graphs Conference, 2022
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
Proceedings of the 7th IEEE/ACM Symposium on Edge Computing, 2022
HeteroFlow: An Accelerator Programming Model with Decoupled Data Placement for Software-Defined FPGAs.
Proceedings of the FPGA '22: The 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Virtual Event, USA, 27 February 2022, 2022
Proceedings of the FPGA '22: The 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Virtual Event, USA, 27 February 2022, 2022
High-Performance Sparse Linear Algebra on HBM-Equipped FPGAs Using HLS: A Case Study on SpMV.
Proceedings of the FPGA '22: The 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Virtual Event, USA, 27 February 2022, 2022
Proceedings of the 30th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2022
A 28nm 8-bit Floating-Point Tensor Core based CNN Training Processor with Dynamic Activation/Weight Sparsification.
Proceedings of the 48th IEEE European Solid State Circuits Conference, 2022
Accelerator design with decoupled hardware customizations: benefits and challenges: invited.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
Programming and Synthesis for Software-defined FPGA Acceleration: Status and Future Prospects.
ACM Trans. Reconfigurable Technol. Syst., 2021
Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Codesign.
IEEE Des. Test, 2021
CoRR, 2021
Dagger: Accelerating RPCs in Cloud Microservices Through Tightly-Coupled Reconfigurable NICs.
CoRR, 2021
Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Co-design.
CoRR, 2021
BulletTrain: Accelerating Robust Neural Network Training via Boundary Example Mining.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Proceedings of the 38th International Conference on Machine Learning, 2021
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021
FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations.
Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021
AutoBridge: Coupling Coarse-Grained Floorplanning and Pipelining for High-Frequency HLS Design on Multi-Die FPGAs.
Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021
Scaling Up Hardware Accelerator Verification using A-QED with Functional Decomposition.
Proceedings of the Formal Methods in Computer Aided Design, 2021
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021
Distilling Arbitration Logic from Traces using Machine Learning: A Case Study on NoC.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021
Dagger: efficient and fast RPCs in cloud microservices with near-memory reconfigurable NICs.
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021
Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021
2020
MgX: Near-Zero Overhead Memory Protection with an Application to Secure DNN Acceleration.
CoRR, 2020
Dagger: Towards Efficient RPCs in Cloud Microservices With Near-Memory Reconfigurable NICs.
IEEE Comput. Archit. Lett., 2020
Proceedings of the International Conference for High Performance Computing, 2020
Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2020
MatRaptor: A Sparse-Sparse Matrix Multiplication Accelerator Based on Row-Wise Product.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations.
Proceedings of the 8th International Conference on Learning Representations, 2020
GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding.
Proceedings of the 8th International Conference on Learning Representations, 2020
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020
SuSy: A Programming Model for Productive Construction of High-Performance Systolic Arrays on FPGAs.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020
Analysis and Optimization of the Implicit Broadcasts in FPGA HLS to Improve Maximum Frequency.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020
2019
PIMap: A Flexible Framework for Improving LUT-Based Technology Mapping via Parallelized Iterative Optimization.
ACM Trans. Reconfigurable Technol. Syst., 2019
Overwrite Quantization: Opportunistic Outlier Handling for Neural Network Accelerators.
CoRR, 2019
A 1.4 GHz 695 Giga Risc-V Inst/s 496-Core Manycore Processor With Mesh On-Chip Network and an All-Digital Synthesized PLL in 16nm CMOS.
Proceedings of the 2019 Symposium on VLSI Circuits, Kyoto, Japan, June 9-14, 2019, 2019
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
Boosting the Performance of CNN Accelerators with Dynamic Fine-Grained Channel Gating.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting.
Proceedings of the 36th International Conference on Machine Learning, 2019
HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Reconfigurable Computing.
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019
Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019
T2S-Tensor: Productively Generating High-Performance Spatial Hardware for Dense Tensor Computations.
Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Painting on Placement: Forecasting Routing Congestion using Conditional Generative Adversarial Nets.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Rapid Generation of High-Quality RISC-V Processors from Functional Instruction Set Specifications.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Designing Secure Cryptographic Accelerators with Information Flow Enforcement: A Case Study on AES.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Improving Scalability of Exact Modulo Scheduling with Specialized Conflict-Driven Learning.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
2018
The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric: Fast Architectures and Design Methodologies for Fast Chips.
IEEE Micro, 2018
Proceedings of the International Conference on Computer-Aided Design, 2018
Rosetta: A Realistic High-Level Synthesis Benchmark Suite for Software Programmable FPGAs.
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018
DATuner: An Extensible Distributed Autotuning Framework for FPGA Design and Design Automation: (Abstract Only).
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018
A Scalable Approach to Exact Resource-Constrained Scheduling Based on a Joint SDC and SAT Formulation.
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018
Fast and Accurate Estimation of Quality of Results in High-Level Synthesis with Machine Learning.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018
Reverse engineering convolutional neural networks through side-channel information leaks.
Proceedings of the 55th Annual Design Automation Conference, 2018
2017
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017
Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017
Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs.
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017
A Parallelized Iterative Improvement Approach to Area Optimization for LUT-Based Technology Mapping.
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017
FPGA-Based Real-Time Charged Particle Trajectory Reconstruction at the Large Hadron Collider.
Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017
Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017
Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017
2016
Platform choices and design demands for IoT platforms: cost, power, and performance tradeoffs.
IET Cyber-Phys. Syst.: Theory & Appl., 2016
Characterizing the Benefits and Limitations of Smart Building Meeting Room Scheduling.
Proceedings of the 7th ACM/IEEE International Conference on Cyber-Physical Systems, 2016
Proceedings of the 53rd Annual Design Automation Conference, 2016
2015
IPSJ Trans. Syst. LSI Des. Methodol., 2015
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2015
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2015
Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015
Proceedings of the 52nd Annual Design Automation Conference, 2015
Proceedings of the 52nd Annual Design Automation Conference, 2015
2014
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014
Proceedings of the International Symposium on Low Power Electronics and Design, 2014
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2014
Proceedings of the 51st Annual Design Automation Conference 2014, 2014
2013
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2013
2012
2011
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2011
2010
Behavior-Level Observability Analysis for Operation Gating in Low-Power Behavioral Synthesis.
ACM Trans. Design Autom. Electr. Syst., 2010
Proceedings of the 18th International Conference on Geoinformatics: GIScience in Change, 2010
Proceedings of the ACM/SIGDA 18th International Symposium on Field Programmable Gate Arrays, 2010
2009
Behavior-level observability don't-cares and application to low-power behavioral synthesis.
Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009
Proceedings of the 2009 International Conference on Computer-Aided Design, 2009
Proceedings of the ACM/SIGDA 17th International Symposium on Field Programmable Gate Arrays, 2009
Proceedings of the FCCM 2009, 2009
2008
Proceedings of the 13th Asia South Pacific Design Automation Conference, 2008
Behavioral synthesis with activating unused flip-flops for reducing glitch power in FPGA.
Proceedings of the 13th Asia South Pacific Design Automation Conference, 2008
2007
Proceedings of the 12th Conference on Asia South Pacific Design Automation, 2007
2006
Architecture and Compiler Optimizations for Data Bandwidth Improvement in Configurable Processors.
IEEE Trans. Very Large Scale Integr. Syst., 2006
Proceedings of the 2006 IEEE International SOC Conference, Austin, Texas, USA, 2006
Proceedings of the 43rd Design Automation Conference, 2006
Behavior and communication co-optimization for systems with sequential communication media.
Proceedings of the 43rd Design Automation Conference, 2006
2005
Architecture and compilation for data bandwidth improvement in configurable embedded processors.
Proceedings of the 2005 International Conference on Computer-Aided Design, 2005
Proceedings of the ACM/SIGDA 13th International Symposium on Field Programmable Gate Arrays, 2005
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005
2004
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2004
Application-specific instruction generation for configurable processor architectures.
Proceedings of the ACM/SIGDA 12th International Symposium on Field Programmable Gate Arrays, 2004
Proceedings of the 41th Design Automation Conference, 2004
2003
Proceedings of the 2003 International Symposium on Physical Design, 2003
Proceedings of the 2003 International Conference on Computer-Aided Design, 2003
Architectural Synthesis Integrated with Global Placement for Multi-Cycle Communication.
Proceedings of the 2003 International Conference on Computer-Aided Design, 2003
Proceedings of the 1st IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2003