Xianwei Zhang

Orcid: 0000-0003-3507-4299

Affiliations:

Sun Yat-sen University, School of Computer Science and Engineering, Guangzhou, China
AMD Inc., Sunnyvale, CA, USA
University of Pittsburgh, Computer Science Department, Pittsburgh, PA, USA

According to our database, Xianwei Zhang authored at least 27 papers between 2013 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Mpache: Interaction Aware Multi-level Cache Bypassing on GPUs.

Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025

2024

APTMoE: Affinity-Aware Pipeline Tuning for MoE Models on Bandwidth-Constrained GPU Nodes.

Proceedings of the International Conference for High Performance Computing, 2024

mLOOP: Optimize Loop Unrolling in Compilation with a ML-based Approach.

Zhongchun Zheng

Yuan Wu

Xianwei Zhang

Proceedings of the International Conference on Networking, Architecture and Storage, 2024

MixPert: Optimizing Mixed-Precision Floating-Point Emulation on GPU Integer Tensor Cores.

Proceedings of the 25th ACM SIGPLAN/SIGBED International Conference on Languages, 2024

openLG: A Tunable and Efficient Open-source LSTM on GPUs.

Proceedings of the International Joint Conference on Neural Networks, 2024

SMILE: LLC-based Shared Memory Expansion to Improve GPU Thread Level Parallelism.

Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

2023

Hybrid MPI and CUDA paralleled finite volume unstructured CFD simulations on a multi-GPU system.

Future Gener. Comput. Syst., 2023

Hay: Enhancing GPU Sharing Performance With Two-Level Scheduling for Ray.

Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

KeSCo: Compiler-based Kernel Scheduling for Multi-task GPU Applications.

Proceedings of the 41st IEEE International Conference on Computer Design, 2023

2022

RollBin: reducing code-size via loop rerolling at binary level.

Proceedings of the LCTES '22: 23rd ACM SIGPLAN/SIGBED International Conference on Languages, 2022

moTuner: a compiler-based auto-tuning approach for mixed-precision operators.

Proceedings of the CF '22: 19th ACM International Conference on Computing Frontiers, Turin, Italy, May 17, 2022

RAISE: Efficient GPU Resource Management via Hybrid Scheduling.

Proceedings of the 22nd IEEE International Symposium on Cluster, 2022

2020

DELTA: Validate GPU Memory Profiling with Microbenchmarks.

Xianwei Zhang

Evgeny Shcherbakov

Proceedings of the MEMSYS 2020: The International Symposium on Memory Systems, 2020

2019

Optimizing GPU Cache Policies for MI Workloads.

CoRR, 2019

Autonomous Data-Race-Free GPU Testing.

Proceedings of the IEEE International Symposium on Workload Characterization, 2019

Optimizing GPU Cache Policies for MI Workloads.

Proceedings of the IEEE International Symposium on Workload Characterization, 2019

Boosting chipkill capability under retention-error induced reliability emergency.

Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

2018

Lost in Abstraction: Pitfalls of Analyzing GPUs at the Intermediate Language Level.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

2017

On the Restore Time Variations of Future DRAM Memory.

ACM Trans. Design Autom. Electr. Syst., 2017

DrMP: Mixed Precision-Aware DRAM for High Performance Approximate and Precise Computing.

Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016

AWARD: Approximation-aWAre Restore in Further Scaling DRAM.

Proceedings of the Second International Symposium on Memory Systems, 2016

Restore truncation for performance improvement in future DRAM systems.

Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

2015

Exploit common source-line to construct energy efficient domain wall memory based caches.

Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

TriState-SET: Proactive SET for improved performance of MLC phase change memories.

Xianwei Zhang

Youtao Zhang

Jun Yang

Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

DLB: Dynamic lane borrowing for improving bandwidth and performance in Hybrid Memory Cube.

Xianwei Zhang

Youtao Zhang

Jun Yang

Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

Exploiting DRAM restore time variations in deep sub-micron scaling.

Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

2013

WoM-SET: Low power proactive-SET-based PCM write using WoM code.

Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013
