Kamesh Madduri

ORCID: 0000-0003-4344-0957

According to our database, Kamesh Madduri authored at least 91 papers between 2004 and 2024.

Jet: Multilevel Graph Partitioning on Graphics Processing Units.
SIAM J. Sci. Comput., 2024

Efficient community detection in multilayer networks using boolean compositions.
Frontiers Big Data, 2024

GraphEx: A Graph-based Extraction Method for Advertiser Keyphrase Recommendation.
CoRR, 2024

Message from the HCW 2024 Steering Committee Co-Chairs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Graphite: A Graph-Based Extreme Multi-Label Short Text Classifier for Keyphrase Recommendation.
Proceedings of the ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain, 2024

Jet: Multilevel Graph Partitioning on GPUs.
CoRR, 2023

Multicore Algorithms for Graph Connectivity Problems.
Massive Graph Analytics, 2022

Partitioning Trillion-Edge Graphs.
Massive Graph Analytics, 2022

Convex Latent Effect Logit Model via Sparse and Low-rank Decomposition.
CoRR, 2021

Performance-Portable Graph Coarsening for Efficient Multilevel Graph Analysis.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

Scalable, Multi-Constraint, Complex-Objective Graph Partitioning.
IEEE Trans. Parallel Distributed Syst., 2020

Fast Spectral Graph Layout on Multicore Platforms.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers.
Int. J. High Perform. Comput. Appl., 2019

Consensus Ensemble System for Traffic Flow Prediction.
IEEE Trans. Intell. Transp. Syst., 2018

Efficient Online Hyperparameter Optimization for Kernel Ridge Regression with Applications to Traffic Time Series Prediction.
CoRR, 2018

Efficient Online Hyperparameter Learning for Traffic Flow Prediction.
Proceedings of the 21st International Conference on Intelligent Transportation Systems, 2018

Spectral Graph Drawing: Building Blocks and Performance Analysis.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Parallel Read Partitioning for Concurrent Assembly of Metagenomic Data.
Proceedings of the 25th IEEE International Conference on High Performance Computing, 2018

Graph-based visual analysis for large-scale hydrological modeling.
Inf. Vis., 2017

Distributed Graph Layout for Scalable Small-world Network Analysis.
CoRR, 2017

Distributed-Memory Breadth-First Search on Massive Graphs.
CoRR, 2017

Optimizing Word2Vec Performance on Multicore Systems.
Proceedings of the Seventh Workshop on Irregular Applications: Architectures and Algorithms, 2017

Analyzing Community Structure in Networks.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Order or Shuffle: Empirically Evaluating Vertex Order Impact on Parallel Graph Computations.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Partitioning Trillion-Edge Graphs in Minutes.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Parallel and Memory-Efficient Preprocessing for Metagenome Assembly.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Fast Parallel Graph Triad Census and Triangle Counting on Shared-Memory Platforms.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Parallel Particle-in-Cell Performance Optimization: A Case Study of Electrospray Simulation.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Parallel k-Core Decomposition on Multicore Platforms.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Congestion-aware memory management on NUMA platforms: A VMware ESXi case study.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

Parallel k-truss decomposition on multicore systems.
Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

Shared-Memory Graph Truss Decomposition.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

Tuning Heterogeneous Computing Platforms for Large-Scale Hydrology Data Management.
IEEE Trans. Parallel Distributed Syst., 2016

Complex Network Partitioning Using Label Propagation.
SIAM J. Sci. Comput., 2016

SPRITE: A Fast Parallel SNP Detection Pipeline.
Proceedings of the High Performance Computing - 31st International Conference, 2016

Extreme scale plasma turbulence simulations on top supercomputers worldwide.
Proceedings of the International Conference for High Performance Computing, 2016

A Case Study of Complex Graph Analysis in Distributed Memory: Implementation and Optimization.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Parallel color-coding.
Parallel Comput., 2015

High-Performance Graph Analytics on Manycore Processors.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

BFS and Coloring-Based Parallel Algorithms for Strongly Connected Components and Related Problems.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Complex Network Analysis Using Parallel Approximate Motif Counting.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Simple parallel biconnectivity algorithms for multicore platforms.
Proceedings of the 21st International Conference on High Performance Computing, 2014

PuLP: Scalable multi-objective multi-constraint partitioning for small-world networks.
Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014

A CPU-GPU Hybrid Implementation and Model-Driven Scheduling of the Fast Multipole Method.
Proceedings of the Seventh Workshop on General Purpose Processing Using GPUs, 2014

Analysis and optimization of gyrokinetic toroidal simulations on homogenous and heterogenous platforms.
Int. J. High Perform. Comput. Appl., 2013

Kinetic turbulence simulations at extreme scale on leadership-class systems.
Proceedings of the International Conference for High Performance Computing, 2013

Scalability Analysis of the Asynchronous, Master-Slave Borg Multiobjective Evolutionary Algorithm.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Fast Approximate Subgraph Counting and Enumeration.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

Parallel analysis of large graph-structured data in genomics and proteomics.
Proceedings of the IEEE 3rd International Conference on Computational Advances in Bio and Medical Sciences, 2013

Optimization of Parallel Particle-to-Grid Interpolation on Leading Multicore Platforms.
IEEE Trans. Parallel Distributed Syst., 2012

Brief announcement: towards a communication optimal fast multipole method and its implications at exascale.
Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures, 2012

Poster: Advances in Gyrokinetic Particle in Cell Simulation for Fusion Plasmas to Extreme Scale.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Advances in Gyrokinetic Particle in Cell Simulation for Fusion Plasmas to Extreme Scale.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

NUMA-aware graph mining techniques for performance and energy efficiency.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Graph partitioning for scalable distributed graph computations.
Proceedings of the Graph Partitioning and Graph Clustering, 2012

SNAP (Small-World Network Analysis and Partitioning) Framework.
Encyclopedia of Parallel Computing, 2011

Gyrokinetic particle-in-cell optimization on emerging multi- and manycore platforms.
Parallel Comput., 2011

Massive-Scale RDF Processing Using Compressed Bitmap Indexes.
Proceedings of the Scientific and Statistical Database Management, 2011

Gyrokinetic toroidal simulations on leading multi- and manycore HPC systems.
Proceedings of the Conference on High Performance Computing Networking, 2011

Parallel breadth-first search on distributed memory systems.
Proceedings of the Conference on High Performance Computing Networking, 2011

Cosmic microwave background map-making at the petascale and beyond.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

Large-Scale Network Analysis.
Graph Algorithms in the Language of Linear Algebra, 2011

A faster algorithm for the single source shortest path problem with few distinct positive lengths.
J. Discrete Algorithms, 2010

Two-level heaps: a new priority queue structure with applications to the single source shortest path problem.
Computing, 2010

Space-time tradeoffs in negative cycle detection - An empirical analysis of the Stressing Algorithm.
Appl. Math. Comput., 2010

Diagnosis, Tuning, and Redesign for Multicore Performance: A Case Study of the Fast Multipole Method.
Proceedings of the Conference on High Performance Computing Networking, 2010

Multi-level bitmap indexes for flash memory storage.
Proceedings of the Fourteenth International Database Engineering and Applications Symposium (IDEAS 2010), 2010

Combinatorial Algorithm Design on the Cell/B.E. Processor.
Scientific Computing with Multicore and Accelerators, 2010

Memory-efficient optimization of Gyrokinetic particle-to-grid interpolation for multicore processors.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

A faster parallel algorithm and efficient multithreaded implementations for evaluating betweenness centrality on massive datasets.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Compact graph representations and parallel connectivity algorithms for massive dynamic network analysis.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Improved Algorithms for Detecting Negative Cost Cycles in Undirected Graphs.
Proceedings of the Frontiers in Algorithmics, Third International Workshop, 2009

Efficient joins with compressed bitmap indexes.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

A high-performance framework for analyzing massive complex networks.
PhD thesis, 2008

A graph-theoretic analysis of the human protein-interaction network using multicore parallel algorithms.
Parallel Comput., 2008

A Randomized Queueless Algorithm for Breadth-First Search.
Int. J. Comput. Their Appl., 2008

SNAP, Small-world Network Analysis and Partitioning: An open-source parallel graph framework for the exploration of large-scale networks.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Design of Multithreaded Algorithms for Combinatorial Problems.
Handbook of Parallel Computing - Models, Algorithms and Applications, 2007

High performance combinatorial algorithm design on the Cell Broadband Engine processor.
Parallel Comput., 2007

Approximating Betweenness Centrality.
Proceedings of the Algorithms and Models for the Web-Graph, 5th International Workshop, 2007

Advanced Shortest Paths Algorithms on a Massively-Multithreaded Architecture.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

SWARM: A Parallel Programming Framework for Multicore Processors.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

On the Design and Analysis of Irregular Algorithms on the Cell Processor: A Case Study of List Ranking.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Accomplishing Approximate FCFS Fairness Without Queues.
Proceedings of the High Performance Computing, 2007

An Experimental Study of A Parallel Shortest Path Algorithm for Solving Large-Scale Graph Instances.
Proceedings of the Ninth Workshop on Algorithm Engineering and Experiments, 2007

Parallel Algorithms for Evaluating Centrality Indices in Real-world Networks.
Proceedings of the 2006 International Conference on Parallel Processing (ICPP 2006), 2006

Designing Multithreaded Algorithms for Breadth-First Search and st-connectivity on the Cray MTA-2.
Proceedings of the 2006 International Conference on Parallel Processing (ICPP 2006), 2006

Parallel Shortest Path Algorithms for Solving Large-Scale Instances.
Proceedings of the Shortest Path Problem, 2006

Design and Implementation of the HPCS Graph Analysis Benchmark on Symmetric Multiprocessors.
Proceedings of the High Performance Computing, 2005

PATRAM - a handwritten word processor for Indian languages.
Proceedings of the Ninth International Workshop on Frontiers in Handwriting Recognition, 2004

A Parallel State Assignment Algorithm for Finite State Machines.
Proceedings of the High Performance Computing, 2004
