Towards Reproducible Blocked LU Factorization

Abstract: In this article, we address the problem of reproducibility of the blocked LU factorization on GPUs, which arises from cancellations and rounding errors in floating-point arithmetic. Thanks to the hierarchical structure of linear algebra libraries, the computations carried out within this operation can be expressed in terms of the Level-3 BLAS routines as well as the unblocked variant; the latter is, correspondingly, built upon the Level-1/2 BLAS kernels. In addition, we strengthen the numerical stability of the blocked LU factorization via partial row pivoting. We therefore propose a double-layer bottom-up approach for ensuring reproducibility of the blocked LU factorization and provide experimental results for its underlying blocks.
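The two-layer structure described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: it runs on the CPU, uses ordinary floating-point arithmetic, and makes no attempt at reproducibility. The function names (`lu_unblocked`, `lu_blocked`, `multiply_lu`) and the block size parameter `nb` are hypothetical, chosen only for this illustration. The sketch only shows where the unblocked panel factorization (Level-1/2 BLAS style work) and the triangular solve and trailing update (Level-3 BLAS style work) sit within a right-looking blocked LU with partial row pivoting.

```python
def lu_unblocked(A, p, k0, k1):
    """Factor the panel A[k0:, k0:k1] in place with partial row pivoting.

    Corresponds to the unblocked variant (Level-1/2 BLAS style operations).
    """
    n = len(A)
    for j in range(k0, k1):
        # Pivot search: row with the largest magnitude in column j.
        piv = max(range(j, n), key=lambda i: abs(A[i][j]))
        if A[piv][j] == 0.0:
            raise ZeroDivisionError("matrix is singular")
        if piv != j:
            # Swap whole rows so the interchange also reaches the
            # columns outside the current panel.
            A[j], A[piv] = A[piv], A[j]
            p[j], p[piv] = p[piv], p[j]
        for i in range(j + 1, n):
            A[i][j] /= A[j][j]                  # multiplier (entry of L)
            for c in range(j + 1, k1):          # update within the panel only
                A[i][c] -= A[i][j] * A[j][c]


def lu_blocked(A, nb=2):
    """Return (LU, p) with L and U packed in one matrix and P*A = L*U.

    Row i of P*A is row p[i] of the input. The input is not modified.
    """
    n = len(A)
    A = [row[:] for row in A]
    p = list(range(n))
    for k0 in range(0, n, nb):
        k1 = min(k0 + nb, n)
        # Layer 1: unblocked factorization of the current panel.
        lu_unblocked(A, p, k0, k1)
        # Layer 2a: triangular solve, U12 = inv(L11) * A12 (TRSM-like).
        for j in range(k1, n):
            for i in range(k0, k1):
                for t in range(k0, i):
                    A[i][j] -= A[i][t] * A[t][j]
        # Layer 2b: trailing update, A22 = A22 - L21 * U12 (GEMM-like).
        for i in range(k1, n):
            for j in range(k1, n):
                for t in range(k0, k1):
                    A[i][j] -= A[i][t] * A[t][j]
    return A, p


def multiply_lu(LU):
    """Multiply the packed unit-lower L and upper U factors back together."""
    n = len(LU)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for t in range(min(i, j) + 1):
                l_it = 1.0 if t == i else LU[i][t]   # unit diagonal of L
                s += l_it * LU[t][j]
            out[i][j] = s
    return out
```

Calling `lu_blocked(A, nb=2)` on a small square matrix and comparing `multiply_lu(LU)` against the rows of `A` reordered by `p` checks the factorization up to rounding error. The inner accumulation loops are the places where the order of floating-point operations affects the result, which is the kind of sensitivity the abstract refers to.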
Document type:
Conference paper
IPDPS 2017 - 31st IEEE International Parallel & Distributed Processing Symposium, May 2017, Orlando, United States

https://hal.archives-ouvertes.fr/hal-01456307
Contributor: Roman Iakymchuk
Submitted on: Wednesday, March 22, 2017 - 11:42:55
Last modified on: Monday, March 27, 2017 - 11:19:11

File

REPPAR-05.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01456307, version 2

Citation

Roman Iakymchuk, Enrique Quintana-Ortí, Erwin Laure, Stef Graillat. Towards Reproducible Blocked LU Factorization. IPDPS 2017 - 31st IEEE International Parallel & Distributed Processing Symposium, May 2017, Orlando, United States. <hal-01456307v2>

Metrics

Record views: 52
Document downloads: 17