Fault Resilient Domain Decomposition Preconditioner for PDEs


The move towards extreme-scale computing platforms challenges scientific simulations in many ways. Given recent trends in computer architecture development, legacy codes must be reformulated to cope with large amounts of communication, system faults, and the requirement of low memory usage per core. In this work, we develop a novel framework for solving PDEs via domain decomposition that reformulates the solution as a state of knowledge with a probabilistic interpretation. This reformulation provides resilience to potential faults without requiring fault detection, avoids unnecessary communication, and is well suited to rigorous uncertainty quantification studies that aim to improve the predictive fidelity of scientific models. We demonstrate the algorithm on one-dimensional PDE examples in which artificial faults are implemented as bit flips in the binary representation of subdomain solutions.
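
The bit-flip fault model mentioned above can be illustrated with a minimal sketch. It assumes subdomain solution values are stored as IEEE 754 double-precision numbers. The helper name `flip_bit` and the sample values are hypothetical, chosen only for illustration, and are not taken from the paper's implementation.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its 64-bit IEEE 754 encoding flipped.

    Bit 0 is the least significant mantissa bit; bit 63 is the sign bit.
    (Hypothetical helper, for illustration only.)
    """
    if not 0 <= bit < 64:
        raise ValueError("bit index must be in [0, 63]")
    # Reinterpret the double as an unsigned 64-bit integer.
    (as_int,) = struct.unpack("<Q", struct.pack("<d", value))
    # Flip the selected bit.
    as_int ^= 1 << bit
    # Reinterpret the modified integer as a double again.
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int))
    return flipped


# Illustrative subdomain solution values (assumed, not from the paper).
subdomain_solution = [0.0, 0.25, 0.5, 0.75, 1.0]

# Inject a single artificial fault into one entry.
corrupted = list(subdomain_solution)
corrupted[2] = flip_bit(corrupted[2], 63)  # flipping the sign bit negates the value
```

The effect of a flip depends strongly on which bit is hit: a low mantissa bit perturbs the value only at the level of rounding error, while an exponent bit can change it by many orders of magnitude or turn it into infinity.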

SIAM Journal on Scientific Computing