Eric Petit

SIAM J. Sci. Comput., 2024

Error Analysis of Sum-Product Algorithms under Stochastic Rounding.

CoRR, 2024

Verificarlo CI: continuous integration for numerical optimization and debugging.

CoRR, 2024

Deconstructing HPL-MxP Benchmark: A Numerical Perspective.

Proceedings of the Euro-Par 2024: Parallel Processing, 2024

2023

Stochastic Rounding Variance and Probabilistic Bounds: A New Approach.

SIAM J. Sci. Comput., October, 2023

Bounds on non-linear errors for variance computation with stochastic rounding.

CoRR, 2023

Lasa: Abstraction and Specialization for Productive and Performant Linear Algebra on FPGAs.

Proceedings of the 31st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2023

2022

A BF16 FMA is All You Need for DNN Training.

IEEE Trans. Emerg. Top. Comput., 2022

FASE: A Fast, Accurate and Seamless Emulator for Custom Numerical Formats.

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2022

The Positive Effects of Stochastic Rounding in Numerical Algorithms.

Proceedings of the 29th IEEE Symposium on Computer Arithmetic, 2022

2021

Confidence Intervals for Stochastic Arithmetic.

ACM Trans. Math. Softw., 2021

A Study of the Effects and Benefits of Custom-Precision Mathematical Libraries for HPC Codes.

Emeric Brun

IEEE Trans. Emerg. Top. Comput., 2021

Dynamically Adapting Floating-Point Precision to Accelerate Deep Neural Network Training.

Proceedings of the 20th IEEE International Conference on Machine Learning and Applications, 2021

Shadow computation with BFloat16 to estimate the numerical accuracy of summations.

Matei Istoan

Proceedings of the 28th IEEE Symposium on Computer Arithmetic, 2021

2020

Comparing perturbation models for evaluating stability of neuroimaging pipelines.

Gregory Kiar

Int. J. High Perform. Comput. Appl., 2020

Evaluating Mixed-Precision Arithmetic for 3D Generative Adversarial Networks to Simulate High Energy Physics Detectors.

Proceedings of the 19th IEEE International Conference on Machine Learning and Applications, 2020

Custom-Precision Mathematical Library Explorations for Code Profiling and Optimization.

Matei Istoan

Proceedings of the 27th IEEE Symposium on Computer Arithmetic, 2020

2019

Comparing Perturbation Models for Evaluating Stability of Post-Processing Pipelines in Neuroimaging.

Gregory Kiar

CoRR, 2019

Scalable Fast Multipole Method for Electromagnetic Simulations.

Proceedings of the Computational Science - ICCS 2019, 2019

Automatic Exploration of Reduced Floating-Point Representations in Iterative Methods.

Yohan Chatelain

Ghislain Lartigue

Proceedings of the Euro-Par 2019: Parallel Processing, 2019

2018

Asynchronous and multithreaded communications on irregular applications using vectorized divide and conquer approach.

Loïc Thébault

J. Parallel Distributed Comput., 2018

Scalable Work-Stealing Load-Balancer for HPC Distributed Memory Systems.

Clement Fontenaille

Proceedings of the Euro-Par 2018: Parallel Processing Workshops, 2018

VeriTracer: Context-enriched tracer for floating-point arithmetic analysis.

Yohan Chatelain

Proceedings of the 25th IEEE Symposium on Computer Arithmetic, 2018

2016

A software scheduling solution to avoid corrupted units on GPUs.

J. Parallel Distributed Comput., 2016

Verificarlo: Checking Floating Point Accuracy through Monte Carlo Arithmetic.

Christophe Denis

Proceedings of the 23rd IEEE Symposium on Computer Arithmetic, 2016

2015

CERE: LLVM-Based Codelet Extractor and REplayer for Piecewise Benchmarking and Optimization.

ACM Trans. Archit. Code Optim., 2015

2014

Task-Based Parallelization of Unstructured Meshes Assembly Using D&C Strategy.

Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

2013

Adaptive sampling for performance characterization of application kernels.

Asma Farjallah

William Jalby

Concurr. Comput. Pract. Exp., 2013

GPUburn: A system to test and mitigate GPU hardware failures.

Proceedings of the 2013 International Conference on Embedded Computer Systems: Architectures, 2013

Divide and Conquer Parallelization of Finite Element Method Assembly.

Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

Binary Instrumentation for Scalable Performance Measurement of OpenMP Applications.

Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

2012

ASK: Adaptive Sampling Kit for Performance Characterization.
