We stand with Ukraine

Ningxin Zheng

ORCID: 0009-0009-6449-8972

According to our database¹, Ningxin Zheng authored at least 27 papers between 2018 and 2024.

Collaborative distances:

Dijkstra number² of three.
Erdős number³ of four.

Bibliography

2024

Online Streaming Video Super-Resolution With Convolutional Look-Up Table.

IEEE Trans. Image Process., 2024

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference.

CoRR, 2024

FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion.

Chengquan Jiang

CoRR, 2024

Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation.

Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

2023

Online Video Super-Resolution With Convolutional Kernel Bypass Grafts.

IEEE Trans. Multim., 2023

Online Video Streaming Super-Resolution with Adaptive Look-Up Table Fusion.

CoRR, 2023

SparDA: Accelerating Dynamic Sparse Deep Neural Networks via Sparse-Dense Transformation.

CoRR, 2023

PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation.

Chengruidong Zhang

Proceedings of the 29th Symposium on Operating Systems Principles, 2023

Optimizing Dynamic Neural Networks with Brainstorm.

Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Efficient GPU Kernels for N:M-Sparse Weights in Deep Learning.

Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Toward QoS-Awareness and Improved Utilization of Spatial Multitasking GPUs.

IEEE Trans. Computers, 2022

Online Video Super-Resolution with Convolutional Kernel Bypass Graft.

CoRR, 2022

QoS-Aware Irregular Collaborative Inference for Improving Throughput of DNN Services.

Proceedings of the SC22: International Conference for High Performance Computing, 2022

SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute.

Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

Astraea: towards QoS-aware and resource-efficient multi-stage GPU services.

Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021

nn-METER: Towards Accurate Latency Prediction of DNN Inference on Diverse Edge Devices.

GetMobile Mob. Comput. Commun., 2021

Full-Cycle Energy Consumption Benchmark for Low-Carbon Computer Vision.

CoRR, 2021

Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction.

Proceedings of the International Conference for High Performance Computing, 2021

nn-Meter: towards accurate latency prediction of deep-learning model inference on diverse edge devices.

Proceedings of the MobiSys '21: The 19th Annual International Conference on Mobile Systems, Applications, and Services, Virtual Event, Wisconsin, USA, 24 June, 2021

CHARM: Collaborative Host and Accelerator Resource Management for GPU Datacenters.

Proceedings of the 39th IEEE International Conference on Computer Design, 2021

2020

Towards QoS-Aware and Resource-Efficient GPU Microservices Based on Spatial Multitasking GPUs In Datacenters.

CoRR, 2020

URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds.

Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

2019

URSA: Precise Capacity Planning and Contention-aware Scheduling for Public Clouds.

CoRR, 2019

POSTER: Precise Capacity Planning for Database Public Clouds.

Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018

CLIBE: Precise Cluster-Level I/O Bandwidth Enforcement in Distributed File System.

Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018