Xiaofan Zhang

ORCID: 0000-0001-5081-3972

Affiliations:
  • University of Illinois Urbana-Champaign, IL, USA


According to our database, Xiaofan Zhang authored at least 43 papers between 2017 and 2024.

Bibliography

2024
AutoAI2C: An Automated Hardware Generator for DNN Acceleration on Both FPGA and ASIC.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., October, 2024

TBA: Faster Large Language Model Training Using SSD-Based Activation Offloading.
CoRR, 2024

New Solutions on LLM Acceleration, Optimization, and Application.
CoRR, 2024

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization.
CoRR, 2024

Invited: New Solutions on LLM Acceleration, Optimization, and Application.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

HomeSGN: A Smarter Home with Novel Rule Mining Enabled by a Scorer-Generator GAN.
Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024

Invited Paper: Software/Hardware Co-design for LLM and Its Application for Design Verification.
Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024

2023
Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization.
CoRR, 2023

Mixed Precision Post Training Quantization of Neural Networks with Sensitivity Guided Search.
CoRR, 2023

2022
Algorithm/Accelerator Co-Design and Co-Search for Edge AI.
IEEE Trans. Circuits Syst. II Express Briefs, 2022

Exploring HW/SW Co-Design for Video Analysis on CPU-FPGA Heterogeneous Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Efficient Machine Learning, Compilers, and Optimizations for Embedded Systems.
CoRR, 2022

AutoDistill: an End-to-End Framework to Explore and Distill Hardware-Efficient Language Models.
CoRR, 2022

YouHome System and Dataset: Making Your Home Know You Better.
Proceedings of the IEEE International Symposium on Smart Electronic Systems, 2022

2021
Efficient Methods for Mapping Neural Machine Translator on FPGAs.
IEEE Trans. Parallel Distributed Syst., 2021

EH-DNAS: End-to-End Hardware-aware Differentiable Neural Architecture Search.
CoRR, 2021

Being-ahead: Benchmarking and Exploring Accelerators for Hardware-Efficient AI Deployment.
CoRR, 2021

Exploring HW/SW Co-Optimizations for Accelerating Large-scale Texture Identification on Distributed GPUs.
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

Scaling Up Hardware Accelerator Verification using A-QED with Functional Decomposition.
Proceedings of the Formal Methods in Computer Aided Design, 2021

F-CAD: A Framework to Explore Hardware Accelerators for Codec Avatar Decoding.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

2020
SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems.
Proceedings of the Third Conference on Machine Learning and Systems, 2020

DNNExplorer: A Framework for Modeling and Exploring a Novel Paradigm of FPGA-based DNN Accelerator.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

Effective Algorithm-Accelerator Co-design for AI Solutions on Edge Devices.
Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

AutoDNNchip: An Automated DNN Chip Predictor and Builder for Both FPGAs and ASICs.
Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

HybridDNN: A Framework for High-Performance Hybrid DNN Accelerator Design and Implementation.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

A-QED Verification of Hardware Accelerators.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

2019
SkyNet: A Champion Model for DAC-SDC on Low Power Object Detection.
CoRR, 2019

A Bi-Directional Co-Design Approach to Enable Deep Learning on IoT Devices.
CoRR, 2019

SiamVGG: Visual Tracking using Deeper Siamese Networks.
CoRR, 2019

T-DLA: An Open-source Deep Learning Accelerator for Ternarized DNN Models on Embedded FPGA.
Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI, 2019

µL2Q: An Ultra-Low Loss Quantization Method for DNN Compression.
Proceedings of the International Joint Conference on Neural Networks, 2019

Cloud-DNN: An Open Framework for Mapping DNN Models to Cloud FPGAs.
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

FPGA/DNN Co-Design: An Efficient Design Methodology for IoT Intelligence on the Edge.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Implementing neural machine translation with bi-directional GRU and attention mechanism on FPGAs using HLS.
Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

2018
DNNBuilder: an automated tool for building high-performance DNN hardware accelerators for FPGAs.
Proceedings of the International Conference on Computer-Aided Design, 2018

Face Recognition with Hybrid Efficient Convolution Algorithms on FPGAs.
Proceedings of the 2018 on Great Lakes Symposium on VLSI, 2018

Design Flow of Accelerating Hybrid Extremely Low Bit-Width Neural Network in Embedded FPGA.
Proceedings of the 28th International Conference on Field Programmable Logic and Applications, 2018

AccDNN: An IP-Based DNN Generator for FPGAs.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Machine learning on FPGAs to face the IoT revolution.
Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017

An energy efficient approach for C4.5 algorithm using OpenCL design flow.
Proceedings of the International Conference on Field Programmable Technology, 2017

High-performance video content recognition with long-term recurrent convolutional network for FPGA.
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

