Sameer Kumar

Affiliations:
  • IBM T. J. Watson Research Center, Yorktown Heights, NY, USA


According to our database, Sameer Kumar authored at least 58 papers between 2002 and 2021.

Bibliography

2021



2020




2019



2018
Efficient Training of Convolutional Neural Nets on Large Distributed Systems.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017
PowerAI DDL.
CoRR, 2017

MPI Acceleration of Image Classification: Are We Seeing the Resurgence of MPI in Solving Big Data Problems?
Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications, 2017

2016
Space Performance Tradeoffs in Compressing MPI Group Data Structures.
Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016

Optimization of Message Passing Services on POWER8 InfiniBand Clusters.
Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016

2015

2014
Optimization of MPI collective operations on the IBM Blue Gene/Q supercomputer.
Int. J. High Perform. Comput. Appl., 2014

Scalable MPI-3.0 RMA on the Blue Gene/Q Supercomputer.
Proceedings of the 21st European MPI Users' Group Meeting, 2014

2013
A divide and conquer strategy for scaling weather simulations with multiple regions of interest.
Sci. Program., 2013

IBM Blue Gene/Q system software stack.
IBM J. Res. Dev., 2013

Optimization of MPI_Allreduce on the Blue Gene/Q supercomputer.
Proceedings of the 20th European MPI Users' Group Meeting, 2013

2012
The IBM Blue Gene/Q Interconnection Fabric.
IEEE Micro, 2012

Looking under the hood of the IBM Blue Gene/Q network.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

PAMI: A Parallel Active Message Interface for the Blue Gene/Q Supercomputer.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Collective algorithms for sub-communicators.
Proceedings of the International Conference on Supercomputing, 2012

Performance Evaluation and Optimization of Nested High Resolution Weather Simulations.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

2011
MPI on Millions of Cores.
Parallel Process. Lett., 2011

Comparison of neuronal spike exchange methods on a Blue Gene/P supercomputer.
Frontiers Comput. Neurosci., 2011

The IBM Blue Gene/Q interconnection network and message unit.
Proceedings of the Conference on High Performance Computing Networking, 2011

Optimizing MPI Collectives Using Efficient Intra-node Communication Techniques over the Blue Gene/P Supercomputer.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

2010
Architecture of the Component Collective Messaging Interface.
Int. J. High Perform. Comput. Appl., 2010

Enabling Concurrent Multithreaded MPI Communication on Multicore Petascale Systems.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

Optimization of applications with non-blocking neighborhood collectives via multisends on the Blue Gene/P supercomputer.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Minimizing MPI Resource Contention in Multithreaded Multicore Environments.
Proceedings of the 2010 IEEE International Conference on Cluster Computing, 2010

2009
MPI on a Million Processors.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

MPI collective communications on the Blue Gene/P supercomputer: algorithms and optimizations.
Proceedings of the 23rd international conference on Supercomputing, 2009

Dynamic topology aware load balancing algorithms for molecular dynamics applications.
Proceedings of the 23rd international conference on Supercomputing, 2009

MPI Collective Communications on The Blue Gene/P Supercomputer: Algorithms and Optimizations.
Proceedings of the 17th IEEE Symposium on High Performance Interconnects, 2009

2008
Scalable molecular dynamics with NAMD on the IBM Blue Gene/L system.
IBM J. Res. Dev., 2008

Fine-grained parallelization of the Car-Parrinello ab initio molecular dynamics method on the IBM Blue Gene/L supercomputer.
IBM J. Res. Dev., 2008

Architecture of the Component Collective Messaging Interface.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

Overcoming scaling challenges in biomolecular simulations across multiple platforms.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Evaluating the effect of replacing CNK with Linux on the compute-nodes of Blue Gene/L.
Proceedings of the 22nd Annual International Conference on Supercomputing, 2008

The deep computing messaging framework: generalized scalable message passing on the Blue Gene/P supercomputer.
Proceedings of the 22nd Annual International Conference on Supercomputing, 2008

Optimization of All-to-All Communication on the Blue Gene/L Supercomputer.
Proceedings of the 2008 International Conference on Parallel Processing, 2008

2006
Scaling applications to massively parallel machines using Projections performance analysis tool.
Future Gener. Comput. Syst., 2006

Performance evaluation of adaptive MPI.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2006

Achieving strong scaling with NAMD on Blue Gene/L.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

2005
Optimizing Communication for Massively Parallel Processing.
PhD thesis, 2005

Improved Point-to-Point and Collective Communication Performance with Output-Queued High-Radix Routers.
Proceedings of the High Performance Computing, 2005

2004
Scalable fine-grained parallelization of plane-wave-based ab initio molecular dynamics for large supercomputers.
J. Comput. Chem., 2004

Opportunities and Challenges of Modern Communication Architectures: Case Study with QsNet.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Faucets: Efficient Resource Allocation on the Computational Grid.
Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004

Scaling All-to-All Multicast on Fat-tree Networks.
Proceedings of the 10th International Conference on Parallel and Distributed Systems, 2004

2003
A Framework for Collective Personalized Communication.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Scaling Molecular Dynamics to 3000 Processors with Projections: A Performance Analysis Case Study.
Proceedings of the Computational Science - ICCS 2003, 2003

2002
NAMD: biomolecular simulation on thousands of processors.
Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002

A Malleable-Job System for Timeshared Parallel Machines.
Proceedings of the 2nd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2002), 2002

