Quanlu Zhang

Orcid: 0000-0003-0557-1104

According to our database, Quanlu Zhang authored at least 35 papers between 2015 and 2024.

Bibliography

2024
Automating Cloud Deployment for Real-Time Online Foundation Model Inference.
IEEE/ACM Trans. Netw., April, 2024

Efficient Large Language Models: A Survey.
Trans. Mach. Learn. Res., 2024

You Only Cache Once: Decoder-Decoder Architectures for Language Models.
CoRR, 2024

Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

nnScaler: Constraint-Guided Parallelization Plan Generation for Deep Learning Training.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

2023
AutoTaskFormer: Searching Vision Transformers for Multi-task Learning.
CoRR, 2023

SparDA: Accelerating Dynamic Sparse Deep Neural Networks via Sparse-Dense Transformation.
CoRR, 2023

SuperScaler: Supporting Flexible DNN Parallelization via a Unified Abstraction.
CoRR, 2023

PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation.
Proceedings of the 29th Symposium on Operating Systems Principles, 2023

Efficient GPU Kernels for N:M-Sparse Weights in Deep Learning.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SiloD: A Co-design of Caching and Scheduling for Deep Learning Clusters.
Proceedings of the Eighteenth European Conference on Computer Systems, 2023

2022
SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute.
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

Privacy-preserving Online AutoML for Domain-Specific Face Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
AceNAS: Learning to Rank Ace Neural Architectures with Weak Supervision of Weight Sharing.
CoRR, 2021

2020
How Does Supernet Help in Neural Architecture Search?
CoRR, 2020

Deeper Insights into Weight Sharing in Neural Architecture Search.
CoRR, 2020

A Novel Hybrid Active Contour Model for Intracranial Tuberculosis MRI Segmentation Applications.
IEEE Access, 2020

AutoSys: The Design and Operation of Learning-Augmented Systems.
Proceedings of the 2020 USENIX Annual Technical Conference, 2020

HiveD: Sharing a GPU Cluster for Deep Learning with Guarantees.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Retiarii: A Deep Learning Exploratory-Training Framework.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Automating Cloud Deployment for Deep Learning Inference of Real-time Online Services.
Proceedings of the 39th IEEE Conference on Computer Communications, 2020

LadaBERT: Lightweight Adaptation of BERT through Hybrid Model Compression.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

2018
Gandiva: Introspective Cluster Scheduling for Deep Learning.
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018

Building efficient and available distributed transaction with Paxos-based coding consensus.
Proceedings of the IEEE INFOCOM 2018, 2018

Towards Web-based Delta Synchronization for Cloud Storage Services.
Proceedings of the 16th USENIX Conference on File and Storage Technologies, 2018

SDPaxos: Building Efficient Semi-Decentralized Geo-replicated State Machines.
Proceedings of the ACM Symposium on Cloud Computing, 2018

Scheduling CPU for GPU-based Deep Learning Jobs.
Proceedings of the ACM Symposium on Cloud Computing, 2018

2017
DeltaCFS: Boosting Delta Sync for Cloud Storage Services by Learning from NFS.
Proceedings of the 37th IEEE International Conference on Distributed Computing Systems, 2017

2015
CHARM: A Cost-Efficient Multi-Cloud Data Hosting Scheme with High Availability.
IEEE Trans. Cloud Comput., 2015

UStore: A Low Cost Cold and Archival Data Storage System for Data Centers.
Proceedings of the 35th IEEE International Conference on Distributed Computing Systems, 2015

Understanding and Surpassing Dropbox: Efficient Incremental Synchronization in Cloud Storage Services.
Proceedings of the 2015 IEEE Global Communications Conference, 2015

DSwitch: a dual mode direct and network attached disk.
Proceedings of the Sixth ACM Symposium on Cloud Computing, 2015

