Jue Wang

ORCID: 0000-0001-5573-9986

  • Zhejiang Dawning Information Technology Company, Ltd., Hangzhou, China
  • Chinese Academy of Sciences, Computer Network Information Center, Supercomputing Center, Beijing, China
  • University of Science and Technology Beijing, School of Information and Engineering, China (former)

According to our database, Jue Wang authored at least 50 papers between 2005 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







Acc-SpMM: Accelerating General-purpose Sparse Matrix-Matrix Multiplication with GPU Tensor Cores.
CoRR, January, 2025

Acc-SpMM: Accelerating General-purpose Sparse Matrix-Matrix Multiplication with GPU Tensor Cores.
Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2025

POSTER: ParGNN: Efficient Training for Large-Scale Graph Neural Network on GPU Clusters.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

Reinforcement Learning for Scientific Application: A Survey.
Proceedings of the Knowledge Science, Engineering and Management, 2024

Rationality of Thought Improves Reasoning in Large Language Models.
Proceedings of the Knowledge Science, Engineering and Management, 2024

Large-Scale Simulation of Structural Dynamics Computing on GPU Clusters.
Proceedings of the International Conference for High Performance Computing, 2023

ANT-MOC: Scalable Neutral Particle Transport Using 3D Method of Characteristics on Multi-GPU Systems.
Proceedings of the International Conference for High Performance Computing, 2023

A Scalable Hybrid Total FETI Method for Massively Parallel FEM Simulations.
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

A Sparse Matrix Optimization Method for Graph Neural Networks Training.
Proceedings of the Knowledge Science, Engineering and Management, 2023

A Graph Partitioning Algorithm Based on Graph Structure and Label Propagation for Citation Network Prediction.
Proceedings of the Knowledge Science, Engineering and Management, 2023

Updates and Experiences of VenusAI Platform.
Proceedings of the Artificial Intelligence - Third CAAI International Conference, 2023

Deployment and Comparison of Large Language Models Based on Virtual Cluster.
Proceedings of the Artificial Intelligence - Third CAAI International Conference, 2023

InParformer: Evolutionary Decomposition Transformers with Interactive Parallel Attention for Long-Term Time Series Forecasting.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

VenusAI: An artificial intelligence platform for scientific discovery on supercomputers.
J. Syst. Archit., 2022

OpenVenus: An Open Service Interface for HPC Environment Based on SLURM.
Proceedings of the Smart Computing and Communication - 7th International Conference, 2022

A Multi-level Attention-Based LSTM Network for Ultra-short-term Solar Power Forecast Using Meteorological Knowledge.
Proceedings of the Knowledge Science, Engineering and Management, 2022

Data-Driven Approach for Investigation of Irradiation Hardening Behavior of RAFM Steel.
Proceedings of the Knowledge Science, Engineering and Management, 2022

A Multiperiod Multiobjective Portfolio Selection Model With Fuzzy Random Returns for Large Scale Securities Data.
IEEE Trans. Fuzzy Syst., 2021

Defects Detection System of Medical Gloves Based on Deep Learning.
Proceedings of the Smart Computing and Communication - 6th International Conference, 2021

Sci-Base: A Resource Aggregation and Sharing Ecology for Software on Discovery Science.
Proceedings of the Smart Computing and Communication - 6th International Conference, 2021

Secure Shell Remote Access for Virtualized Computing Environment.
Proceedings of the Smart Computing and Communication - 6th International Conference, 2021

A Genetic Algorithm-Based Artificial Network Method for Material Feature Recombination.
Proceedings of the 6th IEEE International Conference on Smart Cloud, 2021

Distributed machine learning load balancing strategy in cloud computing services.
Wirel. Networks, 2020

Artificial Intelligence Platform for Mobile Service Computing.
J. Signal Process. Syst., 2019

Parameter Communication Consistency Model for Large-Scale Security Monitoring Based on Mobile Computing.
IEEE Access, 2019

Model Aggregation Method for Data Parallelism in Distributed Real-Time Machine Learning of Smart Sensing Equipment.
IEEE Access, 2019

An Adaptive Synchronous Parallel Strategy for Distributed Machine Learning.
IEEE Access, 2018

An Automatically Learning and Discovering Human Fishing Behaviors Scheme for CPSCN.
IEEE Access, 2018

Artificial Intelligence Platform for Heterogeneous Computing.
Proceedings of the Smart Computing and Communication - Third International Conference, 2018

Massively Scaling the Metal Microscopic Damage Simulation on Sunway TaihuLight Supercomputer.
Proceedings of the 47th International Conference on Parallel Processing, 2018

A Parameter Communication Optimization Strategy for Distributed Machine Learning in Sensors.
Sensors, 2017

High performance computing for advanced modeling and simulation of materials.
Comput. Phys. Commun., 2017

GPU implementation of the linear scaling three dimensional fragment method for large scale electronic structure calculations.
Comput. Phys. Commun., 2017

A Parallel Hybrid Intelligent Algorithm for Fuzzy Mean-CVaR Portfolio Model.
Proceedings of the 19th IEEE International Conference on High Performance Computing and Communications; 15th IEEE International Conference on Smart City; 3rd IEEE International Conference on Data Science and Systems, 2017

Auto tuning for new energy dispatch problem: A case study.
Future Gener. Comput. Syst., 2016

Efficient parallel implementation of incompressible pipe flow algorithm based on SIMPLE.
Concurr. Comput. Pract. Exp., 2016

Parallel simulation of high-dimensional American option pricing based on CPU versus MIC.
Concurr. Comput. Pract. Exp., 2015

ORTHRUS: a lightweighted block-level cloud storage system.
Clust. Comput., 2013

Message scheduling for array re-decomposition on distributed memory systems.
Future Gener. Comput. Syst., 2010

OpenMP compiler for distributed memory architectures.
Sci. China Inf. Sci., 2010

A Heuristic Rule of Partitioning Irregular Loop for Parallelizing Compilers.
Proceedings of the High Performance Computing and Applications, 2009

Automatic Transformation for Overlapping Communication and Computation.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2008

Contention-Free Communication Scheduling for Group Communication in Data Parallelism.
Proceedings of the On the Move to Meaningful Internet Systems 2007: CoopIS, 2007

Transforming the Adaptive Irregular Out-of-Core Applications for Hiding Communication and Disk I/O.
Proceedings of the On the Move to Meaningful Internet Systems 2007: CoopIS, 2007

OpenMP Extensions for Irregular Parallel Applications on Clusters.
Proceedings of the A Practical Programming Model for the Multi-Core Era, 2007

OpenMP Implementation of Parallel Linear Solver for Reservoir Simulation.
Proceedings of the A Practical Programming Model for the Multi-Core Era, 2007

Optimized scheduling for group communication in data parallelism.
Proceedings of the 2007 IEEE International Conference on Cluster Computing, 2007

Parallel iteration space alternate tiling Gauss-Seidel solver.
Proceedings of the 2007 IEEE International Conference on Cluster Computing, 2007

A New Parallel Gauss-Seidel Method by Iteration Space Alternate Tiling.
Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007

Extending OpenMP for Implementation of Multi-Paradigm and Multi-Grain Parallel Execution Model.
Proceedings of the International Conference on Parallel and Distributed Computing Systems, 2005
