Chao Yang
Orcid: 0000-0001-7426-6248
Affiliations:
- Peking University, Beijing, China
- Chinese Academy of Sciences, Institute of Software, Beijing, China (former)
According to our database, Chao Yang authored at least 94 papers between 2009 and 2024.
Links
Online presence:
- on zbmath.org
- on orcid.org
Bibliography
2024
J. Sci. Comput., December, 2024
AONN: An Adjoint-Oriented Neural Network Method for All-At-Once Solutions of Parametric Optimal Control Problems.
SIAM J. Sci. Comput., February, 2024
Nonlinearly Constrained Pressure Residual (NCPR) Algorithms for Fractured Reservoir Simulation.
SIAM J. Sci. Comput., February, 2024
Adaptive Space-Time Domain Decomposition for Multiphase Flow in Porous Media with Bound Constraints.
SIAM J. Sci. Comput., 2024
AONN-2: An adjoint-oriented neural network method for PDE-constrained shape optimization.
J. Comput. Phys., 2024
SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention.
CoRR, 2024
CoRR, 2024
HOSCF: Efficient decoupling algorithms for finding the best rank-one approximation of higher-order tensors.
CoRR, 2024
Uncovering Nested Data Parallelism and Data Reuse in DNN Computation with FractalTensor.
Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles, 2024
Adversarial Adaptive Sampling: Unify PINN and Optimal Transport for the Approximation of PDEs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
A Holistic Functionalization Approach to Optimizing Imperative Tensor Programs in Deep Learning.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024
Centauri: Enabling Efficient Scheduling for Communication-Computation Overlap in Large Model Training via Communication Partitioning.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
J. Comput. Phys., December, 2023
DAS-PINNs: A deep adaptive sampling method for solving high-dimensional partial differential equations.
J. Comput. Phys., March, 2023
Publisher Correction: xMath2.0: a high-performance extended math library for SW26010-Pro many-core processor.
CCF Trans. High Perform. Comput., March, 2023
xMath2.0: a high-performance extended math library for SW26010-Pro many-core processor.
CCF Trans. High Perform. Comput., March, 2023
a-Tucker: fast input-adaptive and matricization-free Tucker decomposition of higher-order tensors on GPUs.
CCF Trans. High Perform. Comput., March, 2023
Proceedings of the Eleventh International Conference on Learning Representations, 2023
2022
Parallel finite volume simulation of the spherical shell dynamo with pseudo-vacuum magnetic boundary conditions.
J. Comput. Phys., 2022
Scalable semismooth Newton methods with multilevel domain decomposition for subsurface flow and reactive transport in porous media.
J. Comput. Phys., 2022
Multilevel field-split preconditioners with domain decomposition for steady and unsteady flow problems.
Comput. Phys. Commun., 2022
Proceedings of the 51st International Conference on Parallel Processing, 2022
2021
IEEE Trans. Parallel Distributed Syst., 2021
Efficient Alternating Least Squares Algorithms for Low Multilinear Rank Approximation of Tensors.
J. Sci. Comput., 2021
Variational inequality transport model on the sphere by the active-set reduced-space algorithm.
Comput. Phys. Commun., 2021
CoRR, 2021
A rank-adaptive higher-order orthogonal iteration algorithm for truncated Tucker decomposition.
CoRR, 2021
AutoWM: a novel domain-specific tool for universal multi-/many-core accelerations of the WRF cloud microphysics.
Clust. Comput., 2021
2020
Enabling Highly Efficient Batched Matrix Multiplications on SW26010 Many-core Processor.
ACM Trans. Archit. Code Optim., 2020
SIAM J. Sci. Comput., 2020
J. Comput. Sci. Technol., 2020
Parallel multilevel restricted Schwarz preconditioners for implicit simulation of subsurface flows with Peng-Robinson equation of state.
J. Comput. Phys., 2020
a-Tucker: Input-Adaptive and Matricization-Free Tucker Decomposition for Dense Tensors on CPUs and GPUs.
CoRR, 2020
Efficient Alternating Least Squares Algorithms for Truncated HOSVD of Higher-Order Tensors.
CoRR, 2020
Clust. Comput., 2020
2019
IEEE Trans. Parallel Distributed Syst., 2019
Enabling Highly Efficient k-Means Computations on the SW26010 Many-Core Processor of Sunway TaihuLight.
J. Comput. Sci. Technol., 2019
A fully implicit constraint-preserving simulator for the black oil model of petroleum reservoirs.
J. Comput. Phys., 2019
Parallel reservoir simulators for fully implicit complementarity formulation of multicomponent compressible flows.
Comput. Phys. Commun., 2019
Parallel energy-stable phase field crystal simulations based on domain decomposition methods.
Comput. Phys. Commun., 2019
2018
PEPS++: Towards Extreme-Scale Simulations of Strongly Correlated Quantum Many-Particle Models on Sunway TaihuLight.
IEEE Trans. Parallel Distributed Syst., 2018
Extreme-Scale High-Order WENO Simulations of 3-D Detonation Wave with 10 Million Cores.
ACM Trans. Archit. Code Optim., 2018
Performance Optimization of the HPCG Benchmark on the Sunway TaihuLight Supercomputer.
ACM Trans. Archit. Code Optim., 2018
A Fast Sparse Triangular Solver for Structured-grid Problems on Sunway Many-core Processor SW26010.
Proceedings of the 47th International Conference on Parallel Processing, 2018
Extreme-Scale Realistic Stencil Computations on Sunway TaihuLight with Ten Million Cores.
Proceedings of the 18th IEEE/ACM International Symposium on Cluster, 2018
2017
J. Supercomput., 2017
IEEE Micro, 2017
Nonlinearly preconditioned semismooth Newton methods for variational inequality solution of two-phase flow in porous media.
J. Comput. Phys., 2017
A Multi-Perspective Method for Analysis of Cooperative Behaviors Among Industrial Devices of Smart Factory.
IEEE Access, 2017
A 3-Layer Method for Analysis of Cooperative Behaviors of Physical Devices in Cyber-Physical Systems.
Proceedings of the Wireless Algorithms, Systems, and Applications, 2017
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017
Proceedings of the 46th International Conference on Parallel Processing, 2017
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2017, 2017
2016
Active-Set Reduced-Space Methods with Nonlinear Elimination for Two-Phase Flow Problems in Porous Media.
SIAM J. Sci. Comput., 2016
A Nonlinearly Preconditioned Inexact Newton Algorithm for Steady State Lattice Boltzmann Equations.
SIAM J. Sci. Comput., 2016
Int. J. High Perform. Comput. Appl., 2016
Sci. China Inf. Sci., 2016
Proceedings of the International Conference for High Performance Computing, 2016
Accelerating the Simulation of Thermal Convection in the Earth's Outer Core on Tianhe-2.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016
Accelerating the 3D Euler atmospheric solver through heterogeneous CPU-GPU platforms.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016
Unleashing the performance potential of CPU-GPU platforms for the 3D atmospheric Euler solver.
Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016
2015
Solving the Global Atmospheric Equations through Heterogeneous Reconfigurable Platforms.
ACM Trans. Reconfigurable Technol. Syst., 2015
IEEE Trans. Computers, 2015
A parallel domain decomposition-based implicit method for the Cahn-Hilliard-Cook phase-field equation in 3D.
J. Comput. Phys., 2015
A multiscale algorithm for radiative heat transfer equation with rapidly oscillating coefficients.
Appl. Math. Comput., 2015
Pattern-Driven Hybrid Multi- and Many-Core Acceleration in the MPAS Shallow-Water Model.
Proceedings of the 44th International Conference on Parallel Processing, 2015
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015
Proceedings of the Cloud Computing - 6th International Conference, 2015
2014
A Scalable Fully Implicit Compressible Euler Solver for Mesoscale Nonhydrostatic Simulation of Atmospheric Flows.
SIAM J. Sci. Comput., 2014
Parallel Domain Decomposition Methods with Mixed Order Discretization for Fully Implicit Solution of Tracer Transport Problems on the Cubed-Sphere.
J. Sci. Comput., 2014
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014
Scaling and analyzing the stencil performance on multi-core and many-core architectures.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014
Proceedings of the Algorithms and Architectures for Parallel Processing, 2014
Proceedings of the 21st International Conference on High Performance Computing, 2014
A highly-efficient and green data flow engine for solving Euler atmospheric equations.
Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014
2013
Proceedings of the Domain Decomposition Methods in Science and Engineering XX, 2013
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013
Accelerating solvers for global atmospheric equations through mixed-precision data flow engine.
Proceedings of the 23rd International Conference on Field programmable Logic and Applications, 2013
Proceedings of the 21st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2013
2012
2011
Parallel multilevel methods for implicit solution of shallow water equations with nonsmooth topography on the cubed-sphere.
J. Comput. Phys., 2011
A parallel well-balanced finite volume method for shallow water equations with topography on the cubed-sphere.
J. Comput. Appl. Math., 2011
2010
A Fully Implicit Domain Decomposition Algorithm for Shallow Water Equations on the Cubed-Sphere.
SIAM J. Sci. Comput., 2010
Scalability Studies of an Implicit Shallow Water Solver for the Rossby-Haurwitz Problem.
Proceedings of the High Performance Computing for Computational Science - VECPAR 2010, 2010
Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications, 2010
2009
Proceedings of the High Performance Computing and Applications, 2009