Title: | PEcAn Functions Used for Propagating and Partitioning Uncertainties in Ecological Forecasts and Reanalysis |
---|---|
Description: | The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PECAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. |
Authors: | David LeBauer, Mike Dietze, Xiaohui Feng, Dan Wang, Carl Davidson, Rob Kooper, Shawn Serbin |
Maintainer: | David LeBauer <[email protected]> |
License: | BSD_3_clause + file LICENSE |
Version: | 1.8.0.9000 |
Built: | 2024-11-20 22:23:32 UTC |
Source: | https://github.com/PecanProject/pecan |
Generates a vector of filenames to be used for PEcAn ensemble output files. All paths start from directory 'settings$outdir', which will be created if it does not exist.
ensemble.filename( settings, prefix = "ensemble.samples", suffix = "Rdata", all.var.yr = TRUE, ensemble.id = settings$ensemble$ensemble.id, variable = settings$ensemble$variable, start.year = settings$ensemble$start.year, end.year = settings$ensemble$end.year )
ensemble.filename( settings, prefix = "ensemble.samples", suffix = "Rdata", all.var.yr = TRUE, ensemble.id = settings$ensemble$ensemble.id, variable = settings$ensemble$variable, start.year = settings$ensemble$start.year, end.year = settings$ensemble$end.year )
settings |
list of PEcAn settings. |
prefix |
string to appear at the beginning of the filename |
suffix |
file extension: string to appear at the end of the filename |
all.var.yr |
logical: does ensemble include all vars and years? If FALSE, filename will include years and vars |
ensemble.id |
ensemble ID(s) |
variable |
variable(s) included in the ensemble. |
start.year , end.year
|
first and last year simulated. |
Typically used by passing only a settings object, but all values can be overridden for manual use.
If only a single variable or a subset of years are needed, the generated filename will identify these in the form If all vars and years are included, set 'all.yr.var' to TRUE to get a filename of the form 'prefix.ensemble_id.suffix'. All elements are recycled vectorwise.
a vector of filenames, each in the form '[settings$outdir]/[prefix].[ensemble.ID].[variable].[start.year].[end.year][suffix]'.
Ryan Kelly
Plots an ensemble time-series from PEcAn for the selected target variable
ensemble.ts(ensemble.ts, observations = NULL, window = 1, ...)
ensemble.ts(ensemble.ts, observations = NULL, window = 1, ...)
ensemble.ts |
ensemble timeseries to be plotted, as returned by [read.ensemble.ts()] |
observations |
observed data to plot over timeseries predictions |
window |
number of timepoints to average across |
... |
additional arguments, currently ignored |
nothing, generates an ensemble time-series plot
Michael Dietze, Ryan Kelly
Calculate parameters for heteroskedastic flux uncertainty
flux.uncertainty( measurement, QC = 0, flags = TRUE, bin.num = 10, transform = identity, minBin = 5, ... )
flux.uncertainty( measurement, QC = 0, flags = TRUE, bin.num = 10, transform = identity, minBin = 5, ... )
measurement |
= flux time-series |
QC |
= quality control flag time series (0 = best) |
flags |
= additional flags on flux filtering of PAIRS (length = 1/2 that of the time series, TRUE = use). |
bin.num |
= number of bins (default = 10) |
transform |
= transformation of magnitude (default = identity) |
minBin |
= minimum number of points allowed in a bin |
... |
additional arguments, currently ignored |
return.list List of parameters from the fitted uncertainty model
Mike Dietze, Carl Davidson
Get delta between sequential flux datapoints
get.change(measurement)
get.change(measurement)
measurement |
numeric vector |
Difference between consecutive measurements
Mike Dietze, Carl Davidson
Given a set of numbers (a numeric vector), this returns the set's coefficient of variance.
get.coef.var(set)
get.coef.var(set)
set |
numeric vector of trait values |
coeficient of variance
Given the sensitivity, samples, and outputs for a single trait, return elasticity
get.elasticity(sensitivity, samples, outputs)
get.elasticity(sensitivity, samples, outputs)
sensitivity |
univariate sensitivity of model to a parameter, can be calculated by |
samples |
samples from trait distribution |
outputs |
model output from ensemble runs |
elasticity = normalized sensitivity
Get parameter values used in ensemble
get.ensemble.samples( ensemble.size, pft.samples, env.samples, method = "uniform", param.names = NULL, ... )
get.ensemble.samples( ensemble.size, pft.samples, env.samples, method = "uniform", param.names = NULL, ... )
ensemble.size |
number of runs in model ensemble |
pft.samples |
random samples from parameter distribution, e.g. from a MCMC chain |
env.samples |
env samples |
method |
the method used to generate the ensemble samples. Random generators: uniform, uniform with latin hypercube permutation. Quasi-random generators: halton, sobol, torus. Random generation draws random variates whereas quasi-random generation is deterministic but well equidistributed. Default is uniform. For small ensemble size with relatively large parameter number (e.g ensemble size < 5 and # of traits > 5) use methods other than halton. |
param.names |
a list of parameter names that were fitted either by MA or PDA, important argument, if NULL parameters will be resampled independently |
... |
Other arguments passed on to the sampling method |
Returns a matrix of randomly or quasi-randomly sampled trait values to be assigned to traits over several model runs. given the number of model runs and a list of sample distributions for traits The model run is indexed first by model run, then by trait
matrix of (quasi-)random samples from trait distributions
David LeBauer, Istem Fer
Calculate distribution of function of a variable
get.gi.phii(splinefuns, trait.samples, maxn = NULL)
get.gi.phii(splinefuns, trait.samples, maxn = NULL)
splinefuns |
univariate spline functions created for each trait, e.g. by the |
trait.samples |
n x m matrix (or list with m vectors of length n) of n parameter sets, each with a sample from m traits |
maxn |
maximum number of parameter sets to evaluate |
Transform trait parameter through trait-specific univariate model emulator spline function
Given a matrix of parameter sets, calculates a matrix of model output
by applying appropriate spline function to each parameter.
Output is used with plot_variance_decomposition
and spline.ensemble
matrix of spline estimates of model output for each of n parameter sets
David LeBauer
Convert priors / MCMC samples to chains that can be sampled for model parameters
get.parameter.samples( settings, posterior.files = rep(NA, length(settings$pfts)), ens.sample.method = "uniform" )
get.parameter.samples( settings, posterior.files = rep(NA, length(settings$pfts)), ens.sample.method = "uniform" )
settings |
PEcAn settings object |
posterior.files |
list of filenames to read from |
ens.sample.method |
one of "halton", "sobol", "torus", "lhc", "uniform" |
David LeBauer, Shawn Serbin, Istem Fer
Output is placed in model output directory (settings$outdir).
get.results( settings, sa.ensemble.id = NULL, ens.ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL )
get.results( settings, sa.ensemble.id = NULL, ens.ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL )
settings |
list, read from settings file (xml) using |
sa.ensemble.id , ens.ensemble.id
|
ensemble IDs for the sensitivity
analysis and ensemble analysis.
If not provided, they are first looked up from 'settings',
then if not found they are not used and the most recent set of results
is read from |
variable |
variables to retrieve, as vector of names or expressions |
start.year , end.year
|
first and last years to retrieve |
David LeBauer, Shawn Serbin, Mike Dietze, Ryan Kelly
This function evaluates the sensitivity of a model to a parameter. This is done by evaluating the first derivative of the univariate spline estimate of the model response at the parameter median.
get.sensitivity(trait.samples, sa.splinefun)
get.sensitivity(trait.samples, sa.splinefun)
trait.samples |
parameter values to evaluate at their median |
sa.splinefun |
fitted spline function. Must take two arguments. |
numeric estimate of model sensitivity to parameter
Function for generating samples based on sampling method, parent or etc
input.ens.gen(settings, input, method = "sampling", parent_ids = NULL)
input.ens.gen(settings, input, method = "sampling", parent_ids = NULL)
settings |
list of PEcAn settings |
input |
name of input to sample, e.g. "met", "veg", "pss" |
method |
Method for sampling - For now looping or sampling with replacement is implemented |
parent_ids |
This is basically the order of the paths that the parent is sampled.See Details. |
If for example met was a parent and it's sampling method resulted in choosing the first, third and fourth samples, these are the ids that need to be sent as parent_ids to this function.
For a given input/tag in the pecan xml and a method, this function returns a list with $id showing the order of sampling and $samples with samples of that input.
## Not run: input.ens.gen(settings,"met","sampling")
## Not run: input.ens.gen(settings,"met","sampling")
Note that this calculates the 'excess kurtosis', which is defined as kurtosis - 3.
This statistic is used in the calculation of the standard deviation of sample variance
in the function sd.var
.
kurtosis(x)
kurtosis(x)
x |
vector of values |
numeric value of kurtosis
David LeBauer
NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm, 2011-06-20.
Plot fit for heteroskedastic flux uncertainty
plot_flux_uncertainty(f, ...)
plot_flux_uncertainty(f, ...)
f |
output of flux.uncertainty functions |
... |
optional graphical paramters |
Mike Dietze, Carl Davidson
Generates a plot using plot_sensitivity
for multiple traits.
plot_sensitivities( sensitivity.plot.inputs, prior.sensitivity.plot.inputs = NULL, ... )
plot_sensitivities( sensitivity.plot.inputs, prior.sensitivity.plot.inputs = NULL, ... )
sensitivity.plot.inputs |
inputs |
prior.sensitivity.plot.inputs |
priors |
... |
arguments passed to |
list of plots, one per trait
Plot univariate response of model output to a trait parameter.
plot_sensitivity( sa.sample, sa.spline, trait, y.range = c(0, 50), median.i = 4, prior.sa.sample = NULL, prior.sa.spline = NULL, fontsize = list(title = 12, axis = 8), linesize = 1, dotsize = 2 )
plot_sensitivity( sa.sample, sa.spline, trait, y.range = c(0, 50), median.i = 4, prior.sa.sample = NULL, prior.sa.spline = NULL, fontsize = list(title = 12, axis = 8), linesize = 1, dotsize = 2 )
sa.sample |
trait quantiles used in sensitivity analysis |
sa.spline |
spline function estimated from sensitivity analysis |
trait |
trait name for title |
y.range |
limits for y axis of plot |
median.i |
index of median value in sa.sample; |
prior.sa.sample |
similar to sa.sample, but for prior distribution. If given, plots sensitivity for prior run |
prior.sa.spline |
similar to sa.spline, but for prior trait distribution. |
fontsize |
(optional) list with three arguments that can be set to vary the fontsize of the title, axis labels, and axis title in the sensitivity plots |
linesize |
passed to ggplot to set line thickness |
dotsize |
passed to ggplot to set point size |
Plots for a single trait; called by plot_sensitivities
to plot sensitivity plots for multiple traits.
object of class ggplot
Plots variance decomposition tryptich
plot_variance_decomposition( plot.inputs, fontsize = list(title = 18, axis = 14) )
plot_variance_decomposition( plot.inputs, fontsize = list(title = 18, axis = 14) )
plot.inputs |
Output from a sensitivity analysis. Output must be of the form given by sensitivity.results$variance.decomposition.output in model output |
fontsize |
list specifying the font size of the titles and axes of the graph |
David LeBauer, Carl Davidson
x <- list(trait.labels = c('a', 'b', 'c'), coef.vars = c(a=1,b=0.5, c=0.1), elasticities = c(a=1,b=2,c=0.5), variances = c(a = 20, b=30, c = 10)) do.call(gridExtra::grid.arrange, c(plot_variance_decomposition(x), ncol = 4))
x <- list(trait.labels = c('a', 'b', 'c'), coef.vars = c(a=1,b=0.5, c=0.1), elasticities = c(a=1,b=2,c=0.5), variances = c(a = 20, b=30, c = 10)) do.call(gridExtra::grid.arrange, c(plot_variance_decomposition(x), ncol = 4))
Read Ameriflux L2 Data
read.ameriflux.L2(file.name, year)
read.ameriflux.L2(file.name, year)
file.name |
path to file |
year |
currently ignored |
Ameriflux L2 data read from file
Mike Dietze, Carl Davidson
Reads output from model ensemble
read.ensemble.output( ensemble.size, pecandir, outdir, start.year, end.year, variable, ens.run.ids = NULL )
read.ensemble.output( ensemble.size, pecandir, outdir, start.year, end.year, variable, ens.run.ids = NULL )
ensemble.size |
the number of ensemble members run |
pecandir |
specifies where pecan writes its configuration files |
outdir |
directory with model output to use in ensemble analysis |
start.year |
first year to include in ensemble analysis |
end.year |
last year to include in ensemble analysis |
variable |
target variables for ensemble analysis |
ens.run.ids |
dataframe. Must contain a column named "id" giving the run IDs to be read.
If NULL, will attempt to read IDs from a file named "samples.Rdata" in |
Reads output for an ensemble of length specified by ensemble.size
and bounded by start.year
and end.year
a list of ensemble model output
Ryan Kelly, David LeBauer, Rob Kooper
Reads ensemble time-series from PEcAn for the selected target variable
read.ensemble.ts( settings, ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL )
read.ensemble.ts( settings, ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL )
settings |
PEcAn settings object |
ensemble.id |
database ID, taken from settings if not specified |
variable |
variable name to process. Required |
start.year , end.year
|
taken from settings if not specified |
list
Michael Dietze, Ryan Kelly
Reads output of sensitivity analysis runs
read.sa.output( traits, quantiles, pecandir, outdir, pft.name = "", start.year, end.year, variable, sa.run.ids = NULL, per.pft = FALSE )
read.sa.output( traits, quantiles, pecandir, outdir, pft.name = "", start.year, end.year, variable, sa.run.ids = NULL, per.pft = FALSE )
traits |
model parameters included in the sensitivity analysis |
quantiles |
quantiles selected for sensitivity analysis |
pecandir |
specifies where pecan writes its configuration files |
outdir |
directory with model output to use in sensitivity analysis |
pft.name |
name of PFT used in sensitivity analysis (Optional) |
start.year |
first year to include in sensitivity analysis |
end.year |
last year to include in sensitivity analysis |
variable |
variables to be read from model output |
sa.run.ids |
list of run ids to read. If NULL, will look in 'pecandir' for a file named 'samples.Rdata' and read from that |
per.pft |
flag to determine whether we want SA on pft-specific variables |
dataframe with one col per quantile analysed and one row per trait, each cell is a list of AGB over time
Ryan Kelly, David LeBauer, Rob Kooper, Mike Dietze, Istem Fer
run ensemble.analysis
run.ensemble.analysis( settings, plot.timeseries = NA, ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL, ... )
run.ensemble.analysis( settings, plot.timeseries = NA, ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL, ... )
settings |
PEcAn settings object |
plot.timeseries |
if TRUE plots a modeled timeseries of target variable(s) with CIs |
ensemble.id |
database ID, taken from settings if not specified |
variable |
variable name to process, taken from settings if not specified |
start.year , end.year
|
taken from settings if not specified |
... |
additional arguments passed to [ensemble.ts()] |
nothing, creates ensemble plots as ensemble.analysis.pdf
David LeBauer, Shawn Serbin, Ryan Kelly
Runs the sensitivity analysis module on a finished run
run.sensitivity.analysis( settings, plot = TRUE, ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL, pfts = NULL, ... )
run.sensitivity.analysis( settings, plot = TRUE, ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL, pfts = NULL, ... )
settings |
a PEcAn settings object |
plot |
logical. Option to generate sensitivity analysis and variance decomposition plots (plot=TRUE) or to turn these plots off (plot=FALSE). |
ensemble.id |
ensemble ID |
variable |
which varibable(s) to do sensitivity analysis for. Defaults to all specified in 'settings' |
start.year |
defaults to what is specified in 'settings' |
end.year |
defaults to what is specified in 'settings' |
pfts |
a vector of PFT names found in 'settings' to run sensitivity analysis on |
... |
currently unused |
nothing, saves sensitivity.results
as
sensitivity.results.Rdata, sensitivity plots as sensitivityanalysis.pdf,
and variance decomposition 'popsicle plot' as variancedecomposition.pdf a
side effect (OPTIONAL)
David LeBauer, Shawn Serbin, Ryan Kelly
## Not run: library(PEcAn.settings) library(PEcAn.uncertainty) settings <- read.settings("path/to/pecan.xml") run.sensitivity.analysis(settings) ## End(Not run)
## Not run: library(PEcAn.settings) library(PEcAn.uncertainty) settings <- read.settings("path/to/pecan.xml") run.sensitivity.analysis(settings) ## End(Not run)
Apply get.results to each of a list of settings
runModule.get.results(settings)
runModule.get.results(settings)
settings |
a PEcAn |
get.results
Creates a spline function using the splinefun function that estimates univariate response of parameter input to model output
sa.splinefun(quantiles.input, quantiles.output)
sa.splinefun(quantiles.input, quantiles.output)
quantiles.input |
passed to 'x' argument of 'stats::splinefun' |
quantiles.output |
passed to 'y' argument of 'stats::splinefun' |
function
Uses the equation
sd.var(x)
sd.var(x)
x |
sample |
estimate of standard deviation of the sample variance
David LeBauer
Mood, Graybill, Boes 1974 'Introduction to the Theory of Statistics' 3rd ed. p 229; Casella and Berger 'Statistical Inference' p 364 ex. 7.45; 'Reference for Var(s^2)' CrossValidated http://stats.stackexchange.com/q/29905/1381, 'Calculating required sample size, precision of variance estimate' CrossValidated http://stats.stackexchange.com/q/7004/1381, 'Variance of Sample Variance?' Mathematics - Stack Exchange http://math.stackexchange.com/q/72975/3733
This function estimates the univariate responses of a model to a parameter for a set of traits, calculates the model sensitivity at the median, and performs a variance decomposition. This function results in a set of sensitivity plots (one per variable) and plot_variance_decomposition.
sensitivity.analysis(trait.samples, sa.samples, sa.output, outdir)
sensitivity.analysis(trait.samples, sa.samples, sa.output, outdir)
trait.samples |
list of vectors, one per trait, representing samples of the trait value, with length equal to the mcmc chain length. Samples are taken from either the prior distribution or meta-analysis results |
sa.samples |
data.frame with one column per trait and one row for the set of quantiles used in sensitivity analysis. |
sa.output |
list of data.frames, similar to sa.samples, except cells contain the results of a model run with that trait x quantile combination and all other traits held at their median value |
outdir |
directory to which plots are written |
results of sensitivity analysis
David LeBauer
## Not run: sensitivity.analysis( trait.samples[[pft$name]], sa.samples[[pft$name]], sa.agb[[pft$name]], pft$outdir ) ## End(Not run)
## Not run: sensitivity.analysis( trait.samples[[pft$name]], sa.samples[[pft$name]], sa.agb[[pft$name]], pft$outdir ) ## End(Not run)
Generate sensitivity analysis filenames
sensitivity.filename( settings, prefix = "sensitivity.samples", suffix = "Rdata", all.var.yr = TRUE, pft = NULL, ensemble.id = settings$sensitivity.analysis$ensemble.id, variable = settings$sensitivity.analysis$variable, start.year = settings$sensitivity.analysis$start.year, end.year = settings$sensitivity.analysis$end.year )
sensitivity.filename( settings, prefix = "sensitivity.samples", suffix = "Rdata", all.var.yr = TRUE, pft = NULL, ensemble.id = settings$sensitivity.analysis$ensemble.id, variable = settings$sensitivity.analysis$variable, start.year = settings$sensitivity.analysis$start.year, end.year = settings$sensitivity.analysis$end.year )
settings |
list of PEcAn settings. |
prefix |
string to appear at the beginning of the filename |
suffix |
file extension: string to appear at the end of the filename |
all.var.yr |
logical: does ensemble include all vars and years? If FALSE, filename will include years and vars |
pft |
name of PFT used for analysis. If NULL, assumes all PFTs in run are used and does not add them to the filename |
ensemble.id |
ensemble ID(s) |
variable |
variable(s) included in the ensemble. |
start.year , end.year
|
first and last year simulated. |
Generally uses values in settings, but can be overwritten for manual uses
a filename
Ryan Kelly
Estimate model output based on univariate splines
spline.ensemble(gi.phii, median)
spline.ensemble(gi.phii, median)
gi.phii |
matrix given as output from |
median |
median value around which variance will be calculated |
Accepts output from get.gi.phii (the matrix ) and produces
spline estimate of
for use in estimating closure term associated with
spline approximation
David LeBauer
Truncate spline at zero if P[x<0] < pnorm(-3) pnorm(-3) chosen as default value for min quantile because this is the default low end of range for the sensitivity analysis. This parameter could be determined based on minimum value in settings$sensitivity.analysis$quantiles
spline.truncate(x, min.quantile = stats::pnorm(-3))
spline.truncate(x, min.quantile = stats::pnorm(-3))
x |
vector |
min.quantile |
threshold quantile for testing lower bound on variable |
either x or a vector with values < 0 converted to zero
David LeBauer
set.seed(0) x <- c(rgamma(998,1,1), rnorm(10)) min(x) # -0.5238 min(PEcAn.uncertainty::spline.truncate(x))
set.seed(0) x <- c(rgamma(998,1,1), rnorm(10)) min(x) # -0.5238 min(PEcAn.uncertainty::spline.truncate(x))
Variance and SD(variance)
variance.stats(x)
variance.stats(x)
x |
numeric vector |
calculates variance and sd of variance using the var
from base R and sd.var
from PEcAn.
list with variance and sd of variance
David LeBauer
Writes config files for use in meta-analysis and returns a list of run ids. Given a pft.xml object, a list of lists as supplied by get.sa.samples, a name to distinguish the output files, and the directory to place the files.
write.ensemble.configs( defaults, ensemble.samples, settings, model, clean = FALSE, write.to.db = TRUE, restart = NULL, rename = FALSE )
write.ensemble.configs( defaults, ensemble.samples, settings, model, clean = FALSE, write.to.db = TRUE, restart = NULL, rename = FALSE )
defaults |
pft |
ensemble.samples |
list of lists supplied by get.ensemble.samples |
settings |
list of PEcAn settings |
model |
name of model to be run, e.g. "ED2" or "SIPNET" |
clean |
remove old output first? |
write.to.db |
logical: Record this run in BETY? |
restart |
In case this is a continuation of an old simulation. restart needs to be a list with name tags of runid, inputs, new.params (parameters), new.state (initial condition), ensemble.id (ensemble id), start.time and stop.time.See Details. |
rename |
Decide if we want to rename previous output files, for example convert from sipnet.out to sipnet.2020-07-16.out. |
The restart functionality is developed using model specific functions by calling write_restart.modelname function. First, you need to make sure that this function is already exist for your desired model.See here https://pecanproject.github.io/pecan-documentation/latest/pecan-models.html new state is a dataframe with a different column for each state variable. The number of the rows in this dataframe needs to be the same as the ensemble size. State variables that you can use for setting up the intial conditions differs for different models. You may check the documentation of the write_restart.modelname your model. The units for the state variables need to be in the PEcAn standard units which can be found in standard_vars. new.params also has similar structure to ensemble.samples which is sent as an argument.
list, containing $runs = data frame of runids, $ensemble.id = the ensemble ID for these runs and $samples with ids and samples used for each tag. Also writes sensitivity analysis configuration files as a side effect
David LeBauer, Carl Davidson, Hamze Dokoohaki
Writes config files for use in sensitivity analysis.
write.sa.configs( defaults, quantile.samples, settings, model, clean = FALSE, write.to.db = TRUE )
write.sa.configs( defaults, quantile.samples, settings, model, clean = FALSE, write.to.db = TRUE )
defaults |
named list with default parameter values |
quantile.samples |
list of lists supplied by get.sa.samples |
settings |
list of settings |
model |
name of model to be run |
clean |
logical: Delete any existing contents of the directory specified by |
write.to.db |
logical: Record this run to BETY? If TRUE, uses connection settings specified in |
list, containing $runs = data frame of runids, and $ensemble.id = the ensemble ID for these runs. Also writes sensitivity analysis configuration files as a side effect
David LeBauer, Carl Davidson