Alexander Heinecke
According to our database, Alexander Heinecke authored at least 72 papers between 2007 and 2024.
Bibliography
2024
Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
2023
Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures.
CoRR, 2023
2022
Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning and HPC Workloads.
Frontiers Appl. Math. Stat., 2022
IEEE Comput. Archit. Lett., 2022
Accelerating Deep Learning based Identification of Chromatin Accessibility from noisy ATAC-seq Data.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022
2021
ACM Trans. Archit. Code Optim., 2021
Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning Workloads.
CoRR, 2021
Proceedings of the International Conference for High Performance Computing, 2021
Tensor processing primitives: a programming abstraction for efficiency and portability in deep learning workloads.
Proceedings of the International Conference for High Performance Computing, 2021
2020
PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives.
CoRR, 2020
Performance study of sustained petascale direct numerical simulation on Cray XC40 systems.
Concurr. Comput. Pract. Exp., 2020
Proceedings of the International Conference for High Performance Computing, 2020
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020
2019
Supercomput. Front. Innov., 2019
Parallel Comput., 2019
Training Neural Machine Translation (NMT) Models using Tensor Train Decomposition on TensorFlow (T3F).
CoRR, 2019
Proceedings of the High Performance Computing - 34th International Conference, 2019
Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019
Proceedings of the 16th ACM International Conference on Computing Frontiers, 2019
Leveraging the bfloat16 Artificial Intelligence Datatype For Higher-Precision Computations.
Proceedings of the 26th IEEE Symposium on Computer Arithmetic, 2019
2018
Proceedings of the International Conference for High Performance Computing, 2018
Proceedings of the 6th International Conference on Learning Representations, 2018
2017
Proceedings of the High Performance Computing - 32nd International Conference, 2017
EDGE: Extreme Scale Fused Seismic Simulations with the Discontinuous Galerkin Method.
Proceedings of the High Performance Computing - 32nd International Conference, 2017
2016
Optimizations in a high-performance conjugate gradient benchmark for IA-based multi- and many-core processors.
Int. J. High Perform. Comput. Appl., 2016
Concurr. Comput. Pract. Exp., 2016
Proceedings of the High Performance Computing - 31st International Conference, 2016
Proceedings of the High Performance Computing - 31st International Conference, 2016
Proceedings of the International Conference for High Performance Computing, 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016
2015
Supercomputing for Molecular Dynamics Simulations - Handling Multi-Trillion Particles in Nanofluidics
Springer Briefs in Computer Science, Springer, ISBN: 978-3-319-17148-7, 2015
Beacon: Deployment and Application of Intel Xeon Phi Coprocessors for Scientific Computing.
Comput. Sci. Eng., 2015
Concurr. Comput. Pract. Exp., 2015
Proceedings of the High Performance Computing - 30th International Conference, 2015
Proceedings of the International Conference for High Performance Computing, 2015
Exploring Shared-Memory Optimizations for an Unstructured Mesh CFD Application on Modern Parallel Systems.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015
Optimized Force Calculation in Molecular Dynamics Simulations for the Intel Xeon Phi.
Proceedings of the Euro-Par 2015: Parallel Processing Workshops, 2015
2014
Boosting Scientific Computing Applications through Leveraging Data Parallel Architectures.
PhD thesis, 2014
CoRR, 2014
Concurr. Comput. Pract. Exp., 2014
Proceedings of the Supercomputing - 29th International Conference, 2014
Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices.
Proceedings of the International Conference for High Performance Computing, 2014
Petascale High Order Dynamic Rupture Earthquake Simulations on Heterogeneous Supercomputers.
Proceedings of the International Conference for High Performance Computing, 2014
Improving Communication Performance and Scalability of Native Applications on Intel Xeon Phi Coprocessor Clusters.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014
2013
Emerging Architectures Enable to Boost Massively Parallel Data Mining Using Adaptive Sparse Grids.
Int. J. Parallel Program., 2013
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013
Many-core architectures boost the pricing of basket options on adaptive sparse grids.
Proceedings of WHPCF'13: 6th Workshop on High Performance Computational Finance, 2013
Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013
Design and Implementation of the Linpack Benchmark for Single and Multi-node Systems Based on Intel® Xeon Phi Coprocessor.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013
Proceedings of the International Conference on High Performance Computing & Simulation, 2013
2012
J. Comput. Appl. Math., 2012
Int. J. Comput. Math., 2012
Comput. Sci. Eng., 2012
Proceedings of the 11th International Symposium on Parallel and Distributed Computing, 2012
HPCS 2012 panels: Panel I: Energy efficient systems in next generation high performance data and compute centers.
Proceedings of the 2012 International Conference on High Performance Computing & Simulation, 2012
Proceedings of the 2012 International Conference on High Performance Computing & Simulation, 2012
Proceedings of the Facing the Multicore-Challenge, 2012
Proceedings of the Computing Frontiers Conference, CF'12, 2012
2011
Proceedings of the 2011 International Conference on High Performance Computing & Simulation, 2011
Towards High-Performance Implementations of a Custom HPC Kernel Using Intel® Array Building Blocks.
Proceedings of the Facing the Multicore - Challenge II, 2011
Extending a Highly Parallel Data Mining Algorithm to the Intel® Many Integrated Core Architecture.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011
Proceedings of the 8th Conference on Computing Frontiers, 2011
2010
Porting existing cache-oblivious linear algebra HPC modules to Larrabee architecture.
Proceedings of the 7th Conference on Computing Frontiers, 2010
2007
Hardware-Oriented Implementation of Cache Oblivious Matrix Operations Based on Space-Filling Curves.
Proceedings of the Parallel Processing and Applied Mathematics, 2007