Jidong Zhai
Orcid: 0000-0002-7656-6428
According to our database, Jidong Zhai authored at least 135 papers between 2009 and 2024.
Bibliography
2024
Optimizing I/O Performance Through Effective vCPU Scheduling Interference Management.
IEEE Trans. Parallel Distributed Syst., December, 2024
IEEE Trans. Parallel Distributed Syst., December, 2024
Efficient Inference for Pruned CNN Models on Mobile Devices With Holistic Sparsity Alignment.
IEEE Trans. Parallel Distributed Syst., November, 2024
IEEE Trans. Parallel Distributed Syst., July, 2024
IEEE Trans. Parallel Distributed Syst., June, 2024
Editorial for the special issue on programming models and system software for High-Performance Computing (HPC) environments.
CCF Trans. High Perform. Comput., June, 2024
IEEE Trans. Knowl. Data Eng., May, 2024
FreshGNN: Reducing Memory Access via Stable Historical Embeddings for Graph Neural Network Training.
Proc. VLDB Endow., February, 2024
CoRR, 2024
Proceedings of the 2024 USENIX Annual Technical Conference, 2024
PUZZLE: Efficiently Aligning Large Language Models through Light-Weight Context Switch.
Proceedings of the 2024 USENIX Annual Technical Conference, 2024
POSTER: Pattern-Aware Sparse Communication for Scalable Recommendation Model Training.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024
Proceedings of the Nineteenth European Conference on Computer Systems, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
IEEE Trans. Computers, December, 2023
Enabling Efficient Random Access to Hierarchically Compressed Text Data on Diverse GPU Platforms.
IEEE Trans. Parallel Distributed Syst., October, 2023
BladeDISC: Optimizing Dynamic Shape Machine Learning Workloads via Compiler Approach.
Proc. ACM Manag. Data, September, 2023
Critique of "A Parallel Framework for Constraint-Based Bayesian Network Learning via Markov Blanket Discovery" by SCC Team From Tsinghua University.
IEEE Trans. Parallel Distributed Syst., June, 2023
J. Comput. Sci. Technol., February, 2023
Special issue on new trends in high-performance computing: Software systems and applications.
Softw. Pract. Exp., 2023
Proc. ACM Manag. Data, 2023
PowerFusion: A Tensor Compiler with Explicit Data Movement Description and Instruction-level Graph IR.
CoRR, 2023
ReFresh: Reducing Memory Access from Exploiting Stable Historical Embeddings for Graph Neural Network Training.
CoRR, 2023
SmartMoE: Efficiently Training Sparsely-Activated Models through Combining Offline and Online Parallelization.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023
Proceedings of the International Conference for High Performance Computing, 2023
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Joint Geometrical and Statistical Domain Adaptation for Cross-domain Code Vulnerability Detection.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Unveiling the Black Box of PLMs with Semantic Anchors: Towards Interpretable Neural Semantic Parsing.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
Critique of "MemXCT: Memory-Centric X-Ray CT Reconstruction With Massive Parallelization" by SCC Team From Tsinghua University.
IEEE Trans. Parallel Distributed Syst., 2022
POCLib: A High-Performance Framework for Enabling Near Orthogonal Processing on Compression.
IEEE Trans. Parallel Distributed Syst., 2022
IEEE Trans. Parallel Distributed Syst., 2022
IEEE Trans. Parallel Distributed Syst., 2022
IEEE Trans. Parallel Distributed Syst., 2022
IEEE Trans. Parallel Distributed Syst., 2022
IEEE Trans. Knowl. Data Eng., 2022
J. Parallel Distributed Comput., 2022
Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing.
CoRR, 2022
GraphQ IR: Unifying Semantic Parsing of Graph Query Language with Intermediate Representation.
CoRR, 2022
CompressDB: Enabling Efficient Compressed Data Direct Processing for Various Databases.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022
Proceedings of the SC22: International Conference for High Performance Computing, 2022
Vapro: performance variance detection and diagnosis for production-run parallel applications.
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
PerFlow: a domain specific framework for automatic performance analysis of parallel applications.
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
FasterMoE: modeling and optimizing training of large-scale dynamic pre-trained models.
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
FreeTensor: a free-form DSL with holistic optimizations for irregular tensor programs.
Proceedings of the PLDI '22: 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, San Diego, CA, USA, June 13, 2022
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022
GraphQ IR: Unifying the Semantic Parsing of Graph Query Languages with One Intermediate Representation.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
AStitch: enabling a new multi-dimensional optimization space for memory-intensive ML training and inference on modern SIMT architectures.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022
Suppressing ZZ crosstalk of Quantum computers through pulse and scheduling co-optimization.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022
2021
Critique of "Planetary Normal Mode Computation: Parallel Algorithms, Performance, and Reproducibility" by SCC Team From Tsinghua University.
IEEE Trans. Parallel Distributed Syst., 2021
IEEE Trans. Parallel Distributed Syst., 2021
IEEE Trans. Parallel Distributed Syst., 2021
Automatic Irregularity-Aware Fine-Grained Workload Partitioning on Integrated Architectures.
IEEE Trans. Knowl. Data Eng., 2021
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections.
Proceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation, 2021
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021
Mitigating Crosstalk in Quantum Computers through Commutativity-Based Instruction Reordering.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021
Proceedings of the IEEE International Conference on Cluster Computing, 2021
2020
Int. J. Parallel Program., 2020
GraphPi: high performance graph pattern matching through effective redundancy elimination.
Proceedings of the International Conference for High Performance Computing, 2020
Proceedings of the International Conference for High Performance Computing, 2020
Identifying scalability bottlenecks for large-scale parallel programs with graph analysis.
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020
Proceedings of the Network and Parallel Computing, 2020
PewLSTM: Periodic LSTM with Weather-Aware Gating Mechanism for Parking Behavior Prediction.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020
Edge-Stream: a Stream Processing Approach for Distributed Applications on a Hierarchical Edge-computing System.
Proceedings of the 5th IEEE/ACM Symposium on Edge Computing, 2020
Memory-Centric Communication Mechanism for Real-time Autonomous Navigation Applications.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020
Proceedings of the 40th IEEE International Conference on Distributed Computing Systems, 2020
GOPipe: A Granularity-Oblivious Programming Framework for Pipelined Stencil Executions on GPU.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020
2019
IEEE Trans. Parallel Distributed Syst., 2019
Student Cluster Competition 2018, Team Tsinghua University: Reproducing performance of multi-physics simulations of the Tsunamigenic 2004 Sumatra megathrust earthquake on the Intel Skylake Architecture.
Parallel Comput., 2019
Guest Editorial: Special Issue on Network and Parallel Computing for Emerging Architectures and Applications.
Int. J. Parallel Program., 2019
Performance evaluation and analysis of sparse matrix and graph kernels on heterogeneous processors.
CCF Trans. High Perform. Comput., 2019
Spread-n-share: improving application performance and cluster throughput with resource-aware job placement.
Proceedings of the International Conference for High Performance Computing, 2019
Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation, 2019
Proceedings of the Network and Parallel Computing, 2019
Proceedings of the 17th USENIX Conference on File and Storage Technologies, 2019
HiWayLib: A Software Framework for Enabling High Performance Communications for Heterogeneous Pipeline Computations.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019
2018
IEEE Trans. Parallel Distributed Syst., 2018
J. Supercomput., 2018
Efficient Document Analytics on Compressed Data: Method, Challenges, Algorithms, Insights.
Proc. VLDB Endow., 2018
Student cluster competition 2017, team Tsinghua University: Reproducing vectorization of the tersoff multi-body potential on the Intel Skylake and NVIDIA Volta architectures.
Parallel Comput., 2018
Proceedings of the 2018 USENIX Annual Technical Conference, 2018
vSensor: leveraging fixed-workload snippets of programs for performance variance detection.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018
Zwift: A Programming Framework for High Performance Text Analytics on Compressed Data.
Proceedings of the 32nd International Conference on Supercomputing, 2018
2017
IEEE Trans. Parallel Distributed Syst., 2017
Proceedings of the International Conference for High Performance Computing, 2017
Self-Checkpoint: An In-Memory Checkpoint Method Using Less Space and Its Practice on Fault-Tolerant HPL.
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017
FinePar: irregularity-aware fine-grained workload partitioning on integrated architectures.
Proceedings of the 2017 International Symposium on Code Generation and Optimization, 2017
2016
Building Semi-Elastic Virtual Clusters for Cost-Effective HPC Cloud Resource Provisioning.
IEEE Trans. Parallel Distributed Syst., 2016
Performance Prediction for Large-Scale Parallel Applications Using Representative Replay.
IEEE Trans. Computers, 2016
Frontiers Comput. Sci., 2016
Characterizing and optimizing TPC-C workloads on large-scale systems using SSD arrays.
Sci. China Inf. Sci., 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016
2015
IEEE Trans. Parallel Distributed Syst., 2015
Optimizing seam carving on multi-GPU systems for real-time content-aware image resizing.
J. Supercomput., 2015
Proceedings of the 23rd IEEE International Symposium on Modeling, 2015
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015
2014
CYPRESS: Combining Static and Dynamic Analysis for Top-Down Communication Trace Compression.
Proceedings of the International Conference for High Performance Computing, 2014
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014
2013
Cost-effective cloud HPC resource provisioning by building semi-elastic virtual clusters.
Proceedings of the International Conference for High Performance Computing, 2013
Proceedings of the International Conference for High Performance Computing, 2013
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013
2012
Proceedings of the Job Scheduling Strategies for Parallel Processing, 2012
2011
IEEE Trans. Parallel Distributed Syst., 2011
Cloud versus in-house cluster: evaluating Amazon cluster compute instances for running MPI applications.
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011
One optimized I/O configuration per HPC application: leveraging the configurability of cloud.
Proceedings of the APSys '11 Asia Pacific Workshop on Systems, 2011
2010
PHANTOM: predicting performance of parallel applications on large-scale parallel machines using a single node.
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010
2009
Sci. China Ser. F Inf. Sci., 2009
FACT: fast communication trace collection for parallel applications through program slicing.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009
Proceedings of the Euro-Par 2009 Parallel Processing, 2009