Jie Zhao

ORCID: 0000-0003-2303-9736

Affiliations:
  • College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, China
  • School of Information, Renmin University of China, Beijing, China (former)
  • State Key Laboratory of Mathematical Engineering and Advanced Computing, Information Engineering University, Zhengzhou, Henan, China (former)
  • PSL Research University, Paris, France (former)


According to our database, Jie Zhao authored at least 26 papers between 2012 and 2024.

Bibliography

2024
A Holistic Approach to Automatic Mixed-Precision Code Generation and Tuning for Affine Programs.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

Enabling Tensor Language Model to Assist in Generating High-Performance Tensor Programs for Deep Learning.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

Arfa: An Agile Regime-Based Floating-Point Optimization Approach for Rounding Errors.
Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 2024

2023
Modeling the Interplay between Loop Tiling and Fusion in Optimizing Compilers Using Affine Relations.
ACM Trans. Comput. Syst., 2023

Effectively Scheduling Computational Graphs of Deep Neural Networks toward Their Domain-Specific Accelerators.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

SIRIUS: Harvesting Whole-Program Optimization Opportunities for DNNs.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

Eiffel: Inferring Input Ranges of Significant Floating-point Errors via Polynomial Extrapolation.
Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, 2023

2022
Apollo: Automatic Partition-based Operator Fusion through Layer by Layer Optimization.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

Automatically Generating High-performance Matrix Multiplication Kernels on the Latest Sunway Processor.
Proceedings of the 51st International Conference on Parallel Processing, 2022

Parallelizing Neural Network Models Effectively on GPU by Implementing Reductions Atomically.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021
OneFlow: Redesign the Distributed Deep Learning Framework from Scratch.
CoRR, 2021

AKG: automatic kernel generation for neural processing units using polyhedral transformations.
Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI '21), 2021

2020
Flextended Tiles: A Flexible Extension of Overlapped Tiles for Polyhedral Compilation.
ACM Trans. Archit. Code Optim., 2020

Optimizing the Memory Hierarchy by Compositing Automatic Transformations on Computations and Data.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

2019
WCCV: improving the vectorization of IF-statements with warp-coherent conditions.
Proceedings of the ACM International Conference on Supercomputing, 2019

2018
A Combined Language and Polyhedral Approach for Heterogeneous Parallelism. (Une Approche Combinée Langage-Polyédrique pour la Programmation Parallèle Hétérogène).
PhD thesis, 2018

K-DT: a formal system for the evaluation of linear data dependence testing techniques.
J. Supercomput., 2018

A Practical and Aggressive Loop Fission Technique.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2018

A polyhedral compilation framework for loops with dynamic data-dependent bounds.
Proceedings of the 27th International Conference on Compiler Construction, 2018

2017
Identifying superword level parallelism with extended directed dependence graph reachability.
Sci. China Inf. Sci., 2017

2016
Code Generation for Distributed-Memory Architectures.
Comput. J., 2016

2015
An improved nonlinear data dependence test.
J. Supercomput., 2015

2013
QP test: a dependence test for quadratic array subscripts.
IET Softw., 2013

2012
A Max-Plus Algebra Approach for Network-on-Chip End-to-End Delay Estimation.
Proceedings of the Eighth International Conference on Semantics, Knowledge and Grids, 2012

A Nested Loop Fusion Algorithm Based on Cost Analysis.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

A Nonlinear Array Subscripts Dependence Test.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

