Hari Sundar

Orcid: 0000-0001-9001-5107

According to our database, Hari Sundar authored at least 64 papers between 2005 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

On csauthors.net:


Automating GPU Scalability for Complex Scientific Models: Phonon Boltzmann Transport Equation.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Multi-discretization domain specific language and code generation for differential equations.
J. Comput. Sci., April, 2023

A projection-based, semi-implicit time-stepping approach for the Cahn-Hilliard Navier-Stokes equations on adaptive octree meshes.
J. Comput. Phys., February, 2023

Automating GPU Scalability for Complex Scientific Models: Phonon Boltzman Transport Equation.
CoRR, 2023

Generating Finite Element Codes combining Adaptive Octrees with Complex Geometries.
CoRR, 2023

An autoencoder compression approach for accelerating large-scale inverse problems.
CoRR, 2023

TANGO: A GPU optimized traceback approach for sequence alignment algorithms.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Scalable adaptive algorithms for next-generation multiphase flow simulations.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Scalable parallelization for the solution of phonon Boltzmann Transport Equation.
Proceedings of the 37th International Conference on Supercomputing, 2023

Scalable Local Timestepping on Octree Grids.
SIAM J. Sci. Comput., 2022

A fully-coupled framework for solving Cahn-Hilliard Navier-Stokes equations: Second-order, energy-stable numerical methods on adaptive octree based meshes.
Comput. Phys. Commun., 2022

Scalable adaptive algorithms for next-generation multiphase simulations.
CoRR, 2022

A GPU-Accelerated AMR Solver for Gravitational Wave Propagation.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

A scalable adaptive-matrix SPMV for heterogeneous architectures.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Finch: Domain Specific Language and Code Generation for Finite Element and Finite Volume in Julia.
Proceedings of the Computational Science - ICCS 2022, 2022

An open-source parallel code for computing the spectral fractional Laplacian on 3D complex geometry domains.
Comput. Phys. Commun., 2021

Industrial scale Large Eddy Simulations with adaptive octree meshes using immersogeometric analysis.
Comput. Math. Appl., 2021

Scalable adaptive PDE solvers in arbitrary domains.
Proceedings of the International Conference for High Performance Computing, 2021

Case study of SARS-CoV-2 transmission risk assessment in indoor environments using cloud computing resources.
Proceedings of the 2021 SC Workshops Supplementary Proceedings, 2021

A Compressed, Divide and Conquer Algorithm for Scalable Distributed Matrix-Matrix Multiplication.
Proceedings of the HPC Asia 2021: The International Conference on High Performance Computing in Asia-Pacific Region, 2021

Data-driven Mixed Precision Sparse Matrix Vector Multiplication for GPUs.
ACM Trans. Archit. Code Optim., 2020

Simulating two-phase flows with thermodynamically consistent energy stable Cahn-Hilliard Navier-Stokes equations on parallel adaptive octree based meshes.
J. Comput. Phys., 2020

Industrial scale large eddy simulations (LES) with adaptive octree meshes using immersogeometric analysis.
CoRR, 2020

A scalable framework for solving fractional diffusion equations.
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

Massively Parallel Simulations of Binary Black Hole Intermediate-Mass-Ratio Inspirals.
SIAM J. Sci. Comput., 2019

Solving PDEs in space-time: 4D tree-based adaptivity, mesh-free and matrix-free approaches.
Proceedings of the International Conference for High Performance Computing, 2019

A scalable framework for adaptive computational general relativity on heterogeneous clusters.
Proceedings of the ACM International Conference on Supercomputing, 2019

Scalable Lazy-update Multigrid Preconditioners.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

Towards Triangle Counting on GPU using Stable Radix binning.
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

Improving Performance and Scalability of Algebraic Multigrid through a Specialized MATVEC.
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

Utilizing GPU Parallelism to Improve Fast Spherical Harmonic Transforms.
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

Reproducing ParConnect for SC16.
Parallel Comput., 2017

Parallel Algorithms for the Computation of Cycles in Relative Neighborhood Graphs.
Proceedings of the 46th International Conference on Parallel Processing, 2017

A Scalable Hierarchical Semi-Separable Library for Heterogeneous Clusters.
Proceedings of the 46th International Conference on Parallel Processing, 2017

Efficient parallel streaming algorithms for large-scale inverse problems.
Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

Machine and Application Aware Partitioning for Adaptive Mesh Refinement Applications.
Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, 2017

FFT, FMM, or Multigrid? A comparative Study of State-Of-the-Art Poisson Solvers for Uniform and Nonuniform Grids in the Unit Cube.
SIAM J. Sci. Comput., 2016

Comparison of multigrid algorithms for high-order continuous finite element discretizations.
Numer. Linear Algebra Appl., 2015

A Nested Partitioning Algorithm for Adaptive Meshes on Heterogeneous Clusters.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

FFT, FMM, or MULTIGRID? A comparative study of state-of-the-art poisson solvers.
CoRR, 2014

Designing Scalable Out-of-core Sorting with Hybrid MPI+PGAS Programming Models.
Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, 2014

A Nested Partitioning Scheme for Parallel Heterogeneous Clusters.
CoRR, 2013

Algorithms for high-throughput disk-to-disk sorting.
Proceedings of the International Conference for High Performance Computing, 2013

HykSort: a new variant of hypercube quicksort on distributed memory architectures.
Proceedings of the International Conference on Supercomputing, 2013

Nonrigid 2D/3D Registration of Coronary Artery Models With Live Fluoroscopy for Guidance of Cardiac Interventions.
IEEE Trans. Medical Imaging, 2012

Parallel geometric-algebraic multigrid on unstructured forests of octrees.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Coronary arteries motion modeling on 2D x-ray images.
Proceedings of the Medical Imaging 2012: Image-Guided Procedures, 2012

A robust and accurate approach to automatic blood vessel detection and segmentation from angiography x-ray images using multistage random forests.
Proceedings of the Medical Imaging 2012: Computer-Aided Diagnosis, San Diego, 2012

Parallel Fast Gauss Transform.
Proceedings of the Conference on High Performance Computing Networking, 2010

Image-Based Respiratory Motion Compensation for Fluoroscopic Coronary Roadmapping.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention, 2010

An Efficient Graph-Based Deformable 2D/3D Registration Algorithm with Applications for Abdominal Aortic Aneurysm Interventions.
Proceedings of the Medical Imaging and Augmented Reality - 5th International Workshop, 2010

Automatic global vessel segmentation and catheter removal using local geometry information and vector field integration.
Proceedings of the 2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2010

Model-based respiratory motion compensation for image-guided cardiac interventions.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Estimating myocardial motion by 4D image warping.
Pattern Recognit., 2009

Curve-based 2D-3D registration of coronary vessels for image guided procedure.
Proceedings of the Medical Imaging 2009: Visualization, 2009

Automatic Image-Based Cardiac and Respiratory Cycle Synchronization and Gating of Image Sequences.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention, 2009

Biomechanically-Constrained 4D Estimation of Myocardial Motion.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention, 2009

Bottom-Up Construction and 2:1 Balance Refinement of Linear Octrees in Parallel.
SIAM J. Sci. Comput., 2008

Dendro: parallel algorithms for multigrid and AMR methods on 2:1 balanced octrees.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Low-constant parallel algorithms for finite element simulations using linear octrees.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Robust Computation of Mutual Information Using Spatially Adaptive Meshes.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention - MICCAI 2007, 10th International Conference, Brisbane, Australia, October 29, 2007

A novel 2D-3D registration algorithm for aligning fluoro images with 3D pre-op CT/MR images.
Proceedings of the Medical Imaging 2006: Visualization, 2006

Estimating myocardial fiber orientations by template warping.
Proceedings of the 2006 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2006

Consistent Estimation of Cardiac Motions by 4D Image Registration.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention, 2005
