Sai Qian Zhang

Orcid: 0000-0002-4815-9235

According to our database, Sai Qian Zhang authored at least 47 papers between 2015 and 2024.

Bibliography

2024
Estimating Power, Performance, and Area for On-Sensor Deployment of AR/VR Workloads Using an Analytical Framework.
ACM Trans. Design Autom. Electr. Syst., 2024

PETAH: Parameter Efficient Task Adaptation for Hybrid Transformers in a resource-limited Context.
CoRR, 2024

Neural Architecture Search of Hybrid Models for NPU-CIM Heterogeneous AR/VR Devices.
CoRR, 2024

DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing.
CoRR, 2024

HQ-DiT: Efficient Diffusion Transformer with FP4 Hybrid Quantization.
CoRR, 2024

Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey.
CoRR, 2024

T3M: Text Guided 3D Human Motion Synthesis from Speech.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Rome was Not Built in a Single Step: Hierarchical Prompting for LLM-based Chip Design.
Proceedings of the 2024 ACM/IEEE International Symposium on Machine Learning for CAD, 2024

Learning Client Selection Strategy for Federated Learning across Heterogeneous Mobile Devices.
Proceedings of the 25th International Symposium on Quality Electronic Design, 2024

sLLM: Accelerating LLM Inference using Semantic Load Balancing with Shared Memory Data Structures.
Proceedings of the 25th International Symposium on Quality Electronic Design, 2024

JointNF: Enhancing DNN Performance through Adaptive N:M Pruning across both Weight and Activation.
Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design, 2024

Hyft: A Reconfigurable Softmax Accelerator with Hybrid Numeric Format for both Training and Inference.
Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design, 2024

Murmuration: On-the-fly DNN Adaptation for SLO-Aware Distributed Inference in Dynamic Edge Environments.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

CAMEL: Co-Designing AI Models and eDRAMs for Efficient On-Device Learning.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

DLoRA: Distributed Parameter-Efficient Fine-Tuning Solution for Large Language Model.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023
BIM: Block-Wise Self-Supervised Learning with Masked Image Modeling.
CoRR, 2023

Softmax Acceleration with Adaptive Numeric Format for both Training and Inference.
CoRR, 2023

CAMEL: Co-Designing AI Models and Embedded DRAMs for Efficient On-Device Learning.
CoRR, 2023

2022
FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

SphereFed: Hyperspherical Federated Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

A Multi-Agent Reinforcement Learning Approach for Efficient Client Selection in Federated Learning.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Saturation RRAM Leveraging Bit-Level Sparsity Resulting from Term Quantization.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

Training for multi-resolution inference using reusable quantization terms.
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

2020
Term Revealing: Furthering Quantization at Run Time on Quantized DNNs.
CoRR, 2020

On the Robustness of Cooperative Multi-Agent Reinforcement Learning.
Proceedings of the 2020 IEEE Security and Privacy Workshops, 2020

Term quantization: furthering quantization at run time.
Proceedings of the International Conference for High Performance Computing, 2020

Succinct and Robust Multi-Agent Communication With Temporal Message Control.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

RTN: Reparameterized Ternary Network.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Full-stack Optimization for Accelerating CNNs with FPGA Validation.
CoRR, 2019

Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Systolic Building Block for Logic-on-Logic 3D-IC Implementations of Convolutional Neural Networks.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

Full-stack optimization for accelerating CNNs using powers-of-two weights with FPGA validation.
Proceedings of the ACM International Conference on Supercomputing, 2019

Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

Maestro: A Memory-on-Logic Architecture for Coordinated Parallel Use of Many Systolic Arrays.
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019

2018
InferBeam: A Fast Beam Alignment Protocol for Millimeter-wave Networking.
CoRR, 2018

A Machine Learning Assisted Cell Selection Method for Drones in Cellular Networks.
Proceedings of the 19th IEEE International Workshop on Signal Processing Advances in Wireless Communications, 2018

Mapping Systolic Arrays onto 3D Circuit Structures: Accelerating Convolutional Neural Network Inference.
Proceedings of the 2018 IEEE International Workshop on Signal Processing Systems, 2018

Adaptive Tiling: Applying Fixed-size Systolic Arrays To Sparse Convolutional Neural Networks.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

2017
TCAM space-efficient routing in a software defined network.
Comput. Networks, 2017

2016
Sector: TCAM Space Aware Routing on SDN.
Proceedings of the 28th International Teletraffic Congress, 2016

Joint NFV placement and routing for multicast service on SDN.
Proceedings of the 2016 IEEE/IFIP Network Operations and Management Symposium, 2016

2015
Routing Algorithms for Network Function Virtualization Enabled Multicast Topology on SDN.
IEEE Trans. Netw. Serv. Manag., 2015

Kaleidoscope: Real-time content delivery in software defined infrastructures.
Proceedings of the IFIP/IEEE International Symposium on Integrated Network Management, 2015

Fast Network Flow Resumption for Live Virtual Machine Migration on SDN.
Proceedings of the 23rd IEEE International Conference on Network Protocols, 2015

Aurora: Adaptive Block Replication in Distributed File Systems.
Proceedings of the 35th IEEE International Conference on Distributed Computing Systems, 2015

Network Function Virtualization enabled multicast routing on SDN.
Proceedings of the 2015 IEEE International Conference on Communications, 2015

