Surrogate-Based Uncertainty Quantification in Climate Models in presence of High Dimensional, Dependent Inputs and Nonsmooth Outputs


Uncertainty quantification capabilities have been boosted considerably by recent
advances in associated algorithms and software, as well as increased
computational capabilities. As a result, it has become possible to address
uncertainties in complex climate models more quantitatively. However, there
still remain numerous challenges when dealing with complex climate models. In
this work, we highlight and address some of these challenges, using the
Community Land Model (CLM) as the main benchmark system for algorithm development.

To begin with, climate models are computationally intensive. This rules out
pure Monte-Carlo algorithms for uncertainty estimation, since naive Monte-Carlo
approaches require too many sampled simulations for reasonable
accuracy. In this work, we build a computationally inexpensive surrogate
model to accelerate both forward and inverse uncertainty quantification (UQ) methods.
We apply Polynomial Chaos (PC) spectral expansions to build surrogate
relationships between output quantities and model parameters using as few
forward model simulations as possible.
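
As a minimal illustration of the surrogate idea, the sketch below fits a one-dimensional Legendre PC expansion to a small number of samples by least-squares regression. The function `toy_model`, the sample count, and the expansion order are hypothetical stand-ins chosen for illustration; they are not the CLM or the surrogate construction used in this work.

```python
import numpy as np
from numpy.polynomial import legendre

def toy_model(x):
    # Hypothetical stand-in for an expensive forward model (not the CLM).
    return np.exp(0.5 * x) + 0.1 * x**2

def fit_pc(x, y, order):
    # Columns are Legendre polynomials P_0..P_order evaluated at x in [-1, 1].
    A = legendre.legvander(x, order)
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs

def eval_pc(coeffs, x):
    # Evaluate the Legendre series with the fitted coefficients.
    return legendre.legval(x, coeffs)

rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=20)      # 20 "model runs"
coeffs = fit_pc(x_train, toy_model(x_train), order=5)

# Compare surrogate and model on a dense grid.
x_test = np.linspace(-1.0, 1.0, 101)
max_err = np.max(np.abs(eval_pc(coeffs, x_test) - toy_model(x_test)))
```

Once fitted, `eval_pc` costs only a polynomial evaluation, which is what makes a surrogate attractive when each forward run is expensive.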

Next, climate models typically suffer from the curse of dimensionality.
For example, the CLM depends on about 80 input parameters with somewhat
uncertain values. Representation of the input-output dependence requires
prohibitively many basis functions for spectral expansions. Moreover, to obtain
such a representation, one needs to sample an 80-dimensional space, which can at
best be sparsely covered. We apply Bayesian compressive sensing (BCS)
techniques in order to infer the best basis set for the PC surrogate model. BCS
performs particularly well in high-dimensional settings when model simulations are very sparse.
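
The basis-selection problem can be illustrated with a simpler, swapped-in technique. The sketch below uses greedy orthogonal matching pursuit, not BCS, to pick a few basis terms from a dictionary that has more columns than there are samples. The dictionary layout (univariate Legendre terms only), the dimensions, and the synthetic response are all assumptions made for illustration; whether the greedy selection finds the true terms depends on the sample draw.

```python
import numpy as np
from numpy.polynomial import legendre

def build_dictionary(X, order):
    # Constant column plus univariate Legendre terms P_1..P_order in each
    # input dimension (no cross terms, to keep the sketch small).
    cols = [np.ones(X.shape[0])]
    for j in range(X.shape[1]):
        cols.extend(legendre.legvander(X[:, j], order)[:, 1:].T)
    return np.column_stack(cols)

def omp(A, y, n_terms):
    # Greedy orthogonal matching pursuit: pick the column most correlated
    # with the residual, then refit all selected columns by least squares.
    norms = np.linalg.norm(A, axis=0)
    selected = []
    residual = y.copy()
    for _ in range(n_terms):
        scores = np.abs(A.T @ residual) / norms
        if selected:
            scores[selected] = -np.inf      # never pick a column twice
        selected.append(int(np.argmax(scores)))
        sol, *_ = np.linalg.lstsq(A[:, selected], y, rcond=None)
        residual = y - A[:, selected] @ sol
    coeffs = np.zeros(A.shape[1])
    coeffs[selected] = sol
    return coeffs, sorted(selected)

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 20))       # 30 samples, 20 inputs
A = build_dictionary(X, order=2)                # 41 candidate basis terms
# Synthetic sparse response: constant, P_1 in input 3, P_2 in input 7.
y = 1.0 + 2.0 * X[:, 3] - 1.5 * 0.5 * (3.0 * X[:, 7] ** 2 - 1.0)

coeffs, support = omp(A, y, n_terms=3)
residual_norm = np.linalg.norm(y - A @ coeffs)
```

BCS differs in that it places sparsity-inducing priors on the coefficients and infers which terms to keep, rather than fixing the number of terms in advance as this greedy sketch does.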

Furthermore, many climate models incorporate dependent uncertain parameters. In
this context, we apply the Rosenblatt transformation, mapping dependent
parameters into a computationally convenient set of independent variables. This
allows efficient parameter sampling even in the presence of dependencies.
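
A two-parameter example shows how the transformation works. The sketch below assumes a hypothetical pair (x1, x2) distributed uniformly on the triangle 0 <= x2 <= x1 <= 1, so that x2 depends on x1 through its upper bound. The Rosenblatt map uses the marginal CDF of x1 and the conditional CDF of x2 given x1; its inverse turns independent uniform draws into dependent samples. The constraint is an illustrative assumption, not one taken from the CLM.

```python
import math
import random

def rosenblatt(x1, x2):
    # Forward map: dependent (x1, x2) -> independent uniforms (u1, u2).
    u1 = x1 ** 2        # marginal CDF of x1 (density 2*x1 on [0, 1])
    u2 = x2 / x1        # conditional CDF of x2 given x1 (uniform on [0, x1])
    return u1, u2

def inverse_rosenblatt(u1, u2):
    # Inverse map: independent uniforms -> samples on the triangle.
    x1 = math.sqrt(u1)
    x2 = u2 * x1
    return x1, x2

rng = random.Random(0)
samples = [inverse_rosenblatt(rng.random(), rng.random()) for _ in range(1000)]

# Round trip through the inverse and forward maps.
u_back = rosenblatt(*inverse_rosenblatt(0.3, 0.7))
```

Sampling then happens entirely in the independent (u1, u2) space, where standard PC bases and sampling schemes apply directly.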

Finally, climate models can exhibit sharp transients as input parameters vary,
and the CLM in particular does. We therefore consider multi-cluster PC
representations, in which a spectral expansion is obtained within each
class of samples and the expansions are combined using classification techniques.
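
The multi-cluster idea can be sketched on a one-dimensional toy problem with a jump. In the sketch below, training samples are split into two classes by thresholding the output, a separate Legendre expansion is fitted in each class, and new inputs are assigned to a class by their nearest training sample. The toy response, the threshold, and the nearest-neighbor classifier are assumptions made for illustration; the abstract does not specify which clustering or classification methods are used.

```python
import numpy as np
from numpy.polynomial import legendre

def toy_model(x):
    # Hypothetical response with a jump at x = 0.2 (not the CLM).
    return np.where(x < 0.2, 0.1 * x, 1.0 + 0.5 * x**2)

rng = np.random.default_rng(1)
x_train = rng.uniform(-1.0, 1.0, size=60)
y_train = toy_model(x_train)

# Cluster the training samples by output value (assumed threshold of 0.5).
labels = (y_train > 0.5).astype(int)

# Fit one Legendre expansion per cluster by least squares.
order = 3
coeffs = {}
for k in (0, 1):
    xk, yk = x_train[labels == k], y_train[labels == k]
    coeffs[k] = np.linalg.lstsq(legendre.legvander(xk, order), yk, rcond=None)[0]

def predict(x):
    # Classify each input by its nearest training sample, then evaluate
    # the expansion that belongs to that class.
    x = np.atleast_1d(np.asarray(x, dtype=float))
    nearest = np.argmin(np.abs(x[:, None] - x_train[None, :]), axis=1)
    k = labels[nearest]
    out = np.empty_like(x)
    for c in (0, 1):
        mask = k == c
        out[mask] = legendre.legval(x[mask], coeffs[c])
    return out

x_test = np.array([-0.9, -0.5, 0.0, 0.5, 0.9])
err = np.max(np.abs(predict(x_test) - toy_model(x_test)))
```

A single global polynomial would oscillate around the jump, whereas each per-cluster expansion only has to represent a smooth piece of the response.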

The resulting multi-cluster PC surrogate model allows a global, variance-based
sensitivity analysis that can drastically reduce the input parameter space
dimensionality. Also, the PC surrogate model can be invoked, without much
computational overhead, in place of the full climate model, in optimization or
calibration studies that require prohibitively many forward model simulations. In
particular, we use PC surrogates to infer input parameter distributions given physical observations. At this stage, adaptive Markov chain Monte-Carlo
algorithms are used to explore the input parameter space efficiently.
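
The variance-based sensitivity analysis mentioned above follows directly from the PC coefficients, because the basis functions are orthogonal. The sketch below assumes independent inputs that are uniform on [-1, 1] and a Legendre basis, and computes main-effect and total-effect Sobol indices from a dictionary of multi-indices and coefficients. The function name `sobol_from_pc` and the example coefficients are hypothetical.

```python
from math import prod

def sobol_from_pc(terms, dim):
    # terms maps a multi-index (tuple of per-dimension Legendre orders)
    # to its PC coefficient.
    def norm_sq(alpha):
        # E[P_n(x)^2] = 1 / (2n + 1) for x uniform on [-1, 1].
        return prod(1.0 / (2 * n + 1) for n in alpha)

    # Variance contribution of every non-constant term.
    contrib = {a: c * c * norm_sq(a) for a, c in terms.items() if any(a)}
    variance = sum(contrib.values())

    # Main effect of input i: terms that involve input i and nothing else.
    main = [sum(v for a, v in contrib.items()
                if a[i] > 0 and sum(1 for n in a if n > 0) == 1) / variance
            for i in range(dim)]
    # Total effect of input i: every term that involves input i.
    total = [sum(v for a, v in contrib.items() if a[i] > 0) / variance
             for i in range(dim)]
    return main, total

# y = 1 + 2*x1 + 0.5*x2 + x1*x2 written in the Legendre basis.
terms = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 0.5, (1, 1): 1.0}
main, total = sobol_from_pc(terms, dim=2)
```

Inputs whose total-effect index is negligible are candidates for being fixed at nominal values, which is how such an analysis can reduce the effective input dimensionality.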

This work is supported by the U.S. Department of Energy, Office of Science,
Biological and Environmental Research, CSSEF (Climate Science for a Sustainable
Energy Future) program. Sandia National Laboratories is a multi-program
laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary
of Lockheed Martin Corporation, for the U.S. Department of Energy’s National
Nuclear Security Administration under contract DE-AC04-94AL85000.

Dec 5, 2012
AGU Fall Meeting 2012
San Francisco, CA