In this work, we define the concept of stiffness for residual neural networks (ResNets) relying on the fact that ResNets can be viewed as a discretization of an underlying neural ordinary differential equation (NODE). We then propose several metrics …
Engineering and applied science relies on computational experiments to rigorously study physical systems. The mathematical models used to probe these systems are highly complex, and sampling-intensive studies often require prohibitively many …
Data movement is considered the main performance concern for exascale, including both on-node memory and off-node network communication. Indeed, many application traces show significant time spent in MPI calls, potentially indicating that faster …
The development of scramjet engines is an important research area for advancing hypersonic and orbital flights. Progress towards optimal engine designs requires accurate and computationally affordable flow simulations, as well as uncertainty …
The development of scramjet engines is an important research area for advancing hypersonic and orbital flights. Progress towards optimal engine designs requires both accurate flow simulations as well as uncertainty quantification (UQ). However, …
We present a resilient task-based domain-decomposition preconditioner for partial differential equations (PDEs) built on top of User Level Fault Mitigation Message Passing Interface (ULFM-MPI). The algorithm reformulates the PDE as a sampling …
We present a resilient domain-decomposition preconditioner for partial differential equations (PDEs). The algorithm reformulates the PDE as a sampling problem, followed by a solution update through data manipulation that is resilient to both soft and …
We present a task-based domain-decomposition preconditioner for partial differential equations (PDEs) resilient to silent data corruption (SDC) and hard faults. The algorithm exploits a reformulation of the PDE as a sampling problem, followed by a …
We present a domain-decomposition-based pre-conditioner for the solution of partial differential equations (PDEs) that is resilient to both soft and hard faults. The algorithm is based on the following steps: first, the computational domain is split …