Package 'PEcAn.uncertainty'

Title: PEcAn Functions Used for Propagating and Partitioning Uncertainties in Ecological Forecasts and Reanalysis
Description: The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PECAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Authors: David LeBauer, Mike Dietze, Xiaohui Feng, Dan Wang, Carl Davidson, Rob Kooper, Shawn Serbin
Maintainer: David LeBauer <[email protected]>
License: BSD_3_clause + file LICENSE
Version: 1.9.0.9000
Built: 2026-06-04 19:47:47 UTC
Source: https://github.com/PecanProject/pecan

Help Index


Compute Sobol indices from a finished PEcAn run

Description

Loads model outputs from a Sobol ensemble, calculates summary statistics for a chosen variable, feeds them to sensitivity::tell(), and returns the updated Sobol object.

Usage

compute_sobol_indices(outdir, sobol_obj, var = "GPP", stat_fun = mean)

Arguments

outdir

PEcAn run output directory that contains runs.txt

sobol_obj

object produced by PEcAn.uncertainty::generate_joint_ensemble_design()

var

Variable name to summarise (default "GPP").

stat_fun

Summary statistic applied to var default mean .

Value

sobol_obj .


Generate ensemble filenames

Description

Generates a vector of filenames to be used for PEcAn ensemble output files. All paths start from directory 'settings$outdir', which will be created if it does not exist.

Usage

ensemble.filename(
  settings,
  prefix = "ensemble.samples",
  suffix = "Rdata",
  all.var.yr = TRUE,
  ensemble.id = settings$ensemble$ensemble.id,
  variable = settings$ensemble$variable,
  start.year = settings$ensemble$start.year,
  end.year = settings$ensemble$end.year
)

Arguments

settings

list of PEcAn settings.

prefix

string to appear at the beginning of the filename

suffix

file extension: string to appear at the end of the filename

all.var.yr

logical: does ensemble include all vars and years? If FALSE, filename will include years and vars

ensemble.id

ensemble ID(s)

variable

variable(s) included in the ensemble.

start.year, end.year

first and last year simulated.

Details

Typically used by passing only a settings object, but all values can be overridden for manual use.

If only a single variable or a subset of years are needed, the generated filename will identify these in the form If all vars and years are included, set 'all.yr.var' to TRUE to get a filename of the form 'prefix.ensemble_id.suffix'. All elements are recycled vectorwise.

Value

a vector of filenames, each in the form '[settings$outdir]/[prefix].[ensemble.ID].[variable].[start.year].[end.year][suffix]'.

Author(s)

Ryan Kelly


Plots an ensemble time-series from PEcAn for the selected target variable

Description

Plots an ensemble time-series from PEcAn for the selected target variable

Usage

ensemble.ts(ensemble.ts, observations = NULL, window = 1, ...)

Arguments

ensemble.ts

ensemble timeseries to be plotted, as returned by [read.ensemble.ts()]

observations

observed data to plot over timeseries predictions

window

number of timepoints to average across

...

additional arguments, currently ignored

Value

nothing, generates an ensemble time-series plot

Author(s)

Michael Dietze, Ryan Kelly


Calculate parameters for heteroskedastic flux uncertainty

Description

Calculate parameters for heteroskedastic flux uncertainty

Usage

flux.uncertainty(
  measurement,
  QC = 0,
  flags = TRUE,
  bin.num = 10,
  transform = identity,
  minBin = 5,
  ...
)

Arguments

measurement

= flux time-series

QC

= quality control flag time series (0 = best)

flags

= additional flags on flux filtering of PAIRS (length = 1/2 that of the time series, TRUE = use).

bin.num

= number of bins (default = 10)

transform

= transformation of magnitude (default = identity)

minBin

= minimum number of points allowed in a bin

...

additional arguments, currently ignored

Value

return.list List of parameters from the fitted uncertainty model

Author(s)

Mike Dietze, Carl Davidson


Generate joint ensemble design for parameter sampling Creates a joint ensemble design that maintains parameter correlations across all sites in a multi-site run. This function generates sample indices that are shared across sites to ensure consistent parameter sampling.

Description

Generate joint ensemble design for parameter sampling Creates a joint ensemble design that maintains parameter correlations across all sites in a multi-site run. This function generates sample indices that are shared across sites to ensure consistent parameter sampling.

Usage

generate_joint_ensemble_design(settings, ensemble_size, sobol = FALSE)

Arguments

settings

PEcAn settings object. This function directly uses:

  • settings$outdir - Output directory path for samples.Rdata

  • settings$pfts - List of PFTs (extracts posterior.files)

  • settings$ensemble$samplingspace - Input sampling configuration

  • settings$run$inputs - Input paths for each input type

When samples.Rdata doesn't exist, settings is passed to get.parameter.samples which additionally requires:

  • settings$ensemble - Ensemble configuration

  • settings$database$bety - Database connection (optional)

  • settings$host$name - Host name for dbfile.check (optional)

ensemble_size

Integer specifying the number of ensemble members. The input_design is generated once for the entire model run. You might want to recycle existing ensemble_samples when splitting larger runs into smaller jobs while keeping the same parameters.

sobol

Logical. If TRUE, returns a sensitivity::soboljansen object for Sobol sensitivity analysis.

Details

Note on internal dependencies

If samples.Rdata doesn't exist we call get.parameter.samples(), which loads parameter distributions.

In practice it: - uses pft$posterior.files directly when it is defined (an Rdata file with post.distns or prior.distns), - otherwise figures out an output directory from pft$outdir or, if needed, via pft$posteriorid in the database, - then looks in that directory for post.distns.Rdata, falling back to prior.distns.Rdata, - and, for MCMC posteriors, looks up trait.mcmc*.Rdata linked to the same posteriorid or a trait.mcmc.Rdata file in that directory.

Difference from generate_OAT_SA_design: This function samples inputs randomly or quasi-randomly, while generate_OAT_SA_design holds all non-parameter inputs constant to isolate parameter effects.

Value

A list containing ensemble samples and indices. If sobol = FALSE, returns list(X = design_matrix). If sobol = TRUE, returns a sensitivity::soboljansen() result object with the design matrix in $X plus additional components for Sobol index calculations.


Generate One-At-a-Time (OAT) input design for sensitivity analysis

Description

Creates an input design matrix for sensitivity analysis where non-parameter inputs (met, IC, soil, etc.) are held constant while parameters vary one-at-a-time across quantiles. This differs from ensemble design where all inputs vary together.

Usage

generate_OAT_SA_design(settings, sa_samples = NULL)

Arguments

settings

PEcAn settings object. See details for required elements.

sa_samples

Optional. Pre-loaded SA samples (named list with one element per PFT, each a matrix with quantiles as rows and traits as columns). If NULL (default), samples are generated via get.parameter.samples.

Details

## Settings requirements

This function directly uses:

  • settings$outdir - Output directory path for samples.Rdata

  • settings$pfts - List of PFTs (extracts posterior.files)

  • settings$ensemble$samplingspace - Input types to include in design

When sa_samples = NULL, settings is passed to get.parameter.samples which additionally requires:

  • settings$sensitivity.analysis - SA quantile configuration

  • settings$database$bety - Database connection (optional)

  • settings$host$name - Host name for dbfile.check (optional)

## OAT design logic For sensitivity analysis, we must isolate the effect of each parameter by holding all other inputs constant. The param column contains sequential indices (1, 2, 3, ...) matching the SA run order in write.sa.configs. All other columns (met, ic, soil, etc.) are set to 1, meaning the first input file is always used.

Note on internal dependencies

If sa_samples is NULL we hand off to get.parameter.samples(), which does the work of finding and loading parameter distributions.

In practice it: - uses pft$posterior.files directly when it is defined (an Rdata file with post.distns or prior.distns), - otherwise figures out an output directory from pft$outdir or, if needed, via pft$posteriorid in the database, - then looks in that directory for post.distns.Rdata, falling back to prior.distns.Rdata, - and, for MCMC posteriors, looks up trait.mcmc*.Rdata linked to the same posteriorid or a trait.mcmc.Rdata file in that directory.

Value

list with component X: a data.frame with columns for each input type and one row per SA run. Non-parameter columns are all 1 (constant).

Author(s)

Akash B V

Examples

## Not run: 
# Generate SA design for a multi-site run
sa_design <- generate_OAT_SA_design(settings)

# View the design matrix
print(sa_design$X)
#   param met ic soil
# 1     1   1  1    1   # Median run
# 2     2   1  1    1   # trait1 @ q=2.3%
# 3     3   1  1    1   # trait1 @ q=15.9%
# 4     4   1  1    1   # trait1 @ q=84.1%
# ...

# With pre-loaded sa_samples (skips get.parameter.samples call)
load("samples.Rdata")
sa_design <- generate_OAT_SA_design(settings, sa_samples = sa.samples)

## End(Not run)

Generate parameter samples from priors and MCMC posteriors (pure function)

Description

This is the pure computation core extracted from get.parameter.samples(). It takes pre-loaded R objects as input and returns all computed samples directly, with no file I/O and no database calls.

Usage

get_parameter_samples(
  pft_names,
  prior_distns_list,
  trait_mcmc_list = NULL,
  ensemble.size = 1,
  ens.sample.method = "uniform",
  sa_quantiles = NULL,
  do_ensemble = TRUE,
  independent = TRUE
)

Arguments

pft_names

character vector of PFT names

prior_distns_list

list of data frames (one per PFT), each with columns: distn, parama, paramb, n. Row names are trait names.

trait_mcmc_list

list of trait MCMC results (one per PFT). Each element should be a named list of coda::mcmc.list objects, or NULL if no MCMC results exist for that PFT.

ensemble.size

integer. Number of runs in model ensemble.

ens.sample.method

character. One of "halton", "sobol", "torus", "lhc", "uniform".

sa_quantiles

numeric vector of quantiles for sensitivity analysis, or NULL to skip SA sampling.

do_ensemble

logical. Whether to generate ensemble samples.

independent

logical. TRUE means parameters were fitted individually (standard MA) and can be resampled independently. FALSE means they were fitted together (e.g. PDA) and correlations should be preserved.

Details

For the original disk-based wrapper, see get.parameter.samples.

Value

named list with five elements:

trait.samples

Named list by PFT, then by trait. Each leaf is a numeric vector of parameter samples.

sa.samples

Sensitivity analysis samples (list), or empty list if sa_quantiles is NULL.

ensemble.samples

Ensemble samples (list), or empty list if do_ensemble is FALSE.

runs.samples

Reserved for future use (empty list).

env.samples

Environmental samples (empty list, populated by downstream code if needed).

Author(s)

David LeBauer, Shawn Serbin, Istem Fer, Om Kapale

See Also

get.parameter.samples for the backward-compatible wrapper that handles file I/O and database lookups.

Examples

## Not run: 
priors <- data.frame(
  distn  = "norm",
  parama = 20,
  paramb = 5,
  n      = 50,
  row.names = "SLA",
  stringsAsFactors = FALSE
)

result <- get_parameter_samples(
  pft_names         = "temperate.Hardwood",
  prior_distns_list = list(priors),
  ensemble.size     = 100,
  do_ensemble       = TRUE
)

str(result$trait.samples)
str(result$ensemble.samples)

## End(Not run)

Get delta between sequential flux datapoints

Description

Get delta between sequential flux datapoints

Usage

get.change(measurement)

Arguments

measurement

numeric vector

Value

Difference between consecutive measurements

Author(s)

Mike Dietze, Carl Davidson


Get coefficient of variance

Description

Given a set of numbers (a numeric vector), this returns the set's coefficient of variance.

Usage

get.coef.var(set)

Arguments

set

numeric vector of trait values

Value

coeficient of variance


Generic function for the elasticity

Description

Given the sensitivity, samples, and outputs for a single trait, return elasticity

Usage

get.elasticity(sensitivity, samples, outputs)

Arguments

sensitivity

univariate sensitivity of model to a parameter, can be calculated by get.sensitivity

samples

samples from trait distribution

outputs

model output from ensemble runs

Value

elasticity = normalized sensitivity


Get parameter values used in ensemble

Description

Returns a matrix of randomly or quasi-randomly sampled trait values to be assigned to traits over several model runs. given the number of model runs and a list of sample distributions for traits The model run is indexed first by model run, then by trait

Usage

get.ensemble.samples(
  ensemble.size,
  pft.samples,
  env.samples,
  method = "random",
  param.names = NULL,
  ...
)

Arguments

ensemble.size

number of runs in model ensemble

pft.samples

random samples from parameter distribution, e.g. from a MCMC chain

env.samples

env samples

method

the method used to generate the ensemble samples. Random generators: uniform, uniform with latin hypercube permutation. Quasi-random generators: halton, sobol, torus. Random generation draws random variates whereas quasi-random generation is deterministic but well equidistributed. Default is uniform. For small ensemble size with relatively large parameter number (e.g ensemble size < 5 and # of traits > 5) use methods other than halton.

param.names

a list of parameter names that were fitted either by MA or PDA, important argument, if NULL parameters will be resampled independently

...

Other arguments passed on to the sampling method

Value

matrix of (quasi-)random samples from trait distributions

Author(s)

David LeBauer, Istem Fer


Get g_i(phi_i)

Description

Calculate distribution of function of a variable

Usage

get.gi.phii(splinefuns, trait.samples, maxn = NULL)

Arguments

splinefuns

univariate spline functions created for each trait, e.g. by the sensitivity.analysis function.

trait.samples

n x m matrix (or list with m vectors of length n) of n parameter sets, each with a sample from m traits

maxn

maximum number of parameter sets to evaluate

Details

Transform trait parameter through trait-specific univariate model emulator spline function Given a matrix of parameter sets, calculates a matrix of model output by applying appropriate spline function to each parameter. Output is used with plot_variance_decomposition and spline.ensemble

Value

matrix of spline estimates of model output for each of n parameter sets

Author(s)

David LeBauer


Convert priors / MCMC samples to parameter sample chains

Description

Loads posterior distributions and MCMC chain results from disk, generates parameter samples for ensemble and sensitivity analysis runs, and optionally saves results to samples.Rdata. This is the backward-compatible wrapper that delegates computation to get_parameter_samples.

Usage

get.parameter.samples(
  settings,
  ensemble.size = 1,
  posterior.files = rep(NA, length(settings$pfts)),
  ens.sample.method = "uniform",
  outdir = settings$outdir
)

Arguments

settings

PEcAn settings object

ensemble.size

number of runs in model ensemble

posterior.files

list of filenames to read from

ens.sample.method

one of "halton", "sobol", "torus", "lhc", "uniform"

outdir

character path; directory to write samples.Rdata to for provenance. Defaults to settings$outdir, preserving the legacy always-save behaviour for existing callers (none of which pass it). Pass outdir = NULL to skip the save entirely; the pure get_parameter_samples() never writes to disk regardless.

Details

Upstream contract (reads from each PFT's outdir):

post.distns.Rdata or prior.distns.Rdata

Posterior (or prior) distribution summaries produced by run.meta.analysis.pft. A data frame with columns distn, parama, paramb, n.

trait.mcmc.Rdata

(Optional) MCMC chain samples from the meta-analysis. Named list of mcmc.list objects, one per trait. If present, samples are drawn from the chains directly; otherwise, independent samples are drawn from post.distns.

File-based side effects (saved to settings$outdir):

samples.Rdata

When outdir is non-NULL (the default), bundles 5 objects:

  • trait.samples — Named list (PFT -> trait -> numeric vector of length iterations). Raw MCMC or prior-sampled values.

  • sa.samples — Named list (PFT -> matrix[n_quantiles x n_traits]). Quantile-based samples for sensitivity analysis.

  • ensemble.samples — Named list (PFT -> data frame[ensemble.size x n_traits]). Subsampled parameter sets for ensemble runs.

  • env.samples — Currently empty list (reserved for environmental samples).

  • runs.samples — Currently empty list (reserved for run metadata).

Downstream contract: samples.Rdata is loaded by run.write.configs (in PEcAn.workflow) to generate model configuration files. It is also loaded by get.results and run.sensitivity.analysis to retrieve sample metadata for post-processing. This implicit file-based coupling is a refactoring target.

Value

Named list with 5 elements: trait.samples, sa.samples, ensemble.samples, runs.samples, env.samples. Returned invisibly.

Author(s)

David LeBauer, Shawn Serbin, Istem Fer, Om Kapale


Read model output and save parsed results for sensitivity and ensemble analyses

Description

Loads sample metadata and run IDs from disk, reads model output for each run, and saves parsed results for downstream analysis by run.sensitivity.analysis and run.ensemble.analysis.

Usage

get.results(
  settings,
  sa.ensemble.id = NULL,
  ens.ensemble.id = NULL,
  variable = NULL,
  start.year = NULL,
  end.year = NULL
)

Arguments

settings

list, read from settings file (xml) using PEcAn.settings::read.settings()

sa.ensemble.id, ens.ensemble.id

ensemble IDs for the sensitivity analysis and ensemble analysis. If not provided, they are first looked up from settings, then if not found they are not used and the most recent set of results is read from samples.Rdata in directory settings$outdir

variable

variables to retrieve, as vector of names or expressions

start.year, end.year

first and last years to retrieve

Details

Output is placed in model output directory (settings$outdir).

Upstream contract (reads from settings$outdir):

⁠sensitivity.samples.<id>.Rdata⁠ or samples.Rdata

Loaded to obtain sa.run.ids, pft.names, trait.names, and sa.samples for sensitivity analysis. Produced by run.write.configs.

⁠ensemble.samples.<id>.Rdata⁠ or samples.Rdata

Loaded to obtain ens.run.ids, pft.names, and trait.names for ensemble analysis. Produced by run.write.configs.

File-based side effects (saved to settings$outdir):

⁠sensitivity.output.<var>.<years>.<id>.Rdata⁠

Contains sensitivity.output: a named list (PFT -> matrix[n_quantiles x n_traits]) of model output values corresponding to each SA run. One file per variable/year combination.

⁠ensemble.output.<var>.<years>.<id>.Rdata⁠

Contains ensemble.output: a numeric vector of model output values, one per ensemble member. One file per variable/year combination.

Downstream contract: sensitivity.output.*.Rdata is loaded by run.sensitivity.analysis. ensemble.output.*.Rdata is loaded by run.ensemble.analysis. Both use these files to compute diagnostics and plots. This implicit file-based coupling is a refactoring target.

Value

Nothing (called for side effects). Saves sensitivity.output and/or ensemble.output .Rdata files to settings$outdir.

Author(s)

David LeBauer, Shawn Serbin, Mike Dietze, Ryan Kelly


Calculate the sensitivity of a function at the median

Description

This function evaluates the sensitivity of a model to a parameter. This is done by evaluating the first derivative of the univariate spline estimate of the model response at the parameter median.

Usage

get.sensitivity(trait.samples, sa.splinefun)

Arguments

trait.samples

parameter values to evaluate at their median

sa.splinefun

fitted spline function. Must take two arguments.

Value

numeric estimate of model sensitivity to parameter


Generate an ensemble of samples from one input

Description

Given an input containing paths to multiple files, this function samples from the given paths to make an ensemble of the requested size. If method is 'sampling' they are sampled with replacement from the full set; if method is 'looping' they are taken sequentially with the sequence repeating as needed.

Usage

input.ens.gen(
  settings,
  ensemble_size,
  input,
  method = "sampling",
  parent_ids = NULL,
  bad_parent_action = c("error", "resample")
)

Arguments

settings

list of PEcAn settings

ensemble_size

size of ensemble

input

name of input to sample, e.g. "met", "veg", "pss"

method

Method for sampling - For now looping or sampling with replacement is implemented

parent_ids

integer vector of indices to be used as-is

bad_parent_action

How to handle entries in 'parent_id' that are not valid indices into the paths given for this input. See details

Details

If 'parent_ids' is provided, it is used as the indices for the current input. Use this when you have multiple inputs whose files should be sampled as a coordinated set. For example if your soil moisture files are matched to the rainfall amounts in your weather files, generate the met ensemble first ('met_ids <- input.ens.gen(..., input = "met", parent_ids = NULL)') and then pass those as parents to the soil moisture ensemble: 'input.ens.gen(..., input = "soilmoist", parent_ids = met_ids)'.

If any indices in 'parent_ids' are larger than the length of the input being sampled (e.g. '53' when the input only has 25 paths), the result is determined by 'bad_parent_action': "error" stops execution and requests the user to fix their inputs, "resample" uses the valid parents as-is and replaces the invalid parents via method = "sampling". Note that method "resample" is provided for backwards compatibility; we strongly recommend validating parent-child alignment before calling 'input.ens.gen', ideally in the same process that generates your parent ids.

Value

A list with elements 'id' showing the order of sampling and 'samples' with the paths selected from those indices.

Examples

## Not run: 
  settings <- PEcAn.settings::read.settings("pecan.xml")
  input.ens.gen(
    settings,
    ensemble_size = 50,
    input = "met",
    method = "sampling"
  )

## End(Not run)

Calculates the excess kurtosis of a vector

Description

Note that this calculates the 'excess kurtosis', which is defined as kurtosis - 3. This statistic is used in the calculation of the standard deviation of sample variance in the function sd.var.

Usage

kurtosis(x)

Arguments

x

vector of values

Value

numeric value of kurtosis

Author(s)

David LeBauer

References

NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm, 2011-06-20.


Plot fit for heteroskedastic flux uncertainty

Description

Plot fit for heteroskedastic flux uncertainty

Usage

plot_flux_uncertainty(f, ...)

Arguments

f

output of flux.uncertainty functions

...

optional graphical paramters

Author(s)

Mike Dietze, Carl Davidson


Plot functions and quantiles used in sensitivity analysis

Description

Generates a plot using plot_sensitivity for multiple traits.

Usage

plot_sensitivities(
  sensitivity.plot.inputs,
  prior.sensitivity.plot.inputs = NULL,
  ...
)

Arguments

sensitivity.plot.inputs

inputs

prior.sensitivity.plot.inputs

priors

...

arguments passed to plot_sensitivity

Value

list of plots, one per trait


Sensitivity plot

Description

Plot univariate response of model output to a trait parameter.

Usage

plot_sensitivity(
  sa.sample,
  sa.spline,
  trait,
  y.range = c(0, 50),
  median.i = 4,
  prior.sa.sample = NULL,
  prior.sa.spline = NULL,
  fontsize = list(title = 12, axis = 8),
  linesize = 1,
  dotsize = 2
)

Arguments

sa.sample

trait quantiles used in sensitivity analysis

sa.spline

spline function estimated from sensitivity analysis

trait

trait name for title

y.range

limits for y axis of plot

median.i

index of median value in sa.sample; median.i == which(as.numeric(rownames(sa.sample)) == 50)

prior.sa.sample

similar to sa.sample, but for prior distribution. If given, plots sensitivity for prior run

prior.sa.spline

similar to sa.spline, but for prior trait distribution.

fontsize

(optional) list with three arguments that can be set to vary the fontsize of the title, axis labels, and axis title in the sensitivity plots

linesize

passed to ggplot to set line thickness

dotsize

passed to ggplot to set point size

Details

Plots for a single trait; called by plot_sensitivities to plot sensitivity plots for multiple traits.

Value

object of class ggplot


Variance Decomposition Plots

Description

Plots variance decomposition tryptich: CV, elasticity, variance

Usage

plot_variance_decomposition(
  plot.inputs,
  fontsize = list(title = 18, axis = 14),
  order_by = c("Variance", "Elasticity", "CV (%)", "rowname", "none")
)

Arguments

plot.inputs

Output from a sensitivity analysis. Output must be of the form given by sensitivity.results$variance.decomposition.output in model output

fontsize

list specifying the font size of the titles and axes of the graph

order_by

Result column to sort by, or "none" to retain input order

Value

ggplot object

Author(s)

David LeBauer, Carl Davidson, Chris Black

Examples

x <- list(trait.labels = c('a', 'b', 'c'),
          coef.vars = c(a=1,b=0.5, c=0.1),
          elasticities = c(a=1,b=2,c=0.5),
          variances    = c(a = 20, b=30, c = 10))
plot_variance_decomposition(x)
plot_variance_decomposition(x, order_by = "rowname")

Read Ameriflux L2 Data

Description

Read Ameriflux L2 Data

Usage

read.ameriflux.L2(file.name, year)

Arguments

file.name

path to file

year

currently ignored

Value

Ameriflux L2 data read from file

Author(s)

Mike Dietze, Carl Davidson


Reads output from model ensemble

Description

Reads output for an ensemble of length specified by ensemble.size and bounded by start.year and end.year

Usage

read.ensemble.output(
  ensemble.size,
  pecandir,
  outdir,
  start.year,
  end.year,
  variable,
  ens.run.ids = NULL
)

Arguments

ensemble.size

the number of ensemble members run

pecandir

specifies where pecan writes its configuration files

outdir

directory with model output to use in ensemble analysis

start.year

first year to include in ensemble analysis

end.year

last year to include in ensemble analysis

variable

target variables for ensemble analysis

ens.run.ids

dataframe. Must contain a column named "id" giving the run IDs to be read. If NULL, will attempt to read IDs from a file named "samples.Rdata" in pecandir

Value

a list of ensemble model output

Author(s)

Ryan Kelly, David LeBauer, Rob Kooper


Reads ensemble time-series from PEcAn for the selected target variable

Description

Reads ensemble time-series from PEcAn for the selected target variable

Usage

read.ensemble.ts(
  settings,
  ensemble.id = NULL,
  variable = NULL,
  start.year = NULL,
  end.year = NULL
)

Arguments

settings

PEcAn settings object

ensemble.id

database ID, taken from settings if not specified

variable

variable name to process. Required

start.year, end.year

taken from settings if not specified

Value

list

Author(s)

Michael Dietze, Ryan Kelly


Reads output of sensitivity analysis runs

Description

Reads output of sensitivity analysis runs

Usage

read.sa.output(
  traits,
  quantiles,
  pecandir,
  outdir,
  pft.name = "",
  start.year,
  end.year,
  variable,
  sa.run.ids = NULL,
  per.pft = FALSE
)

Arguments

traits

model parameters included in the sensitivity analysis

quantiles

quantiles selected for sensitivity analysis

pecandir

specifies where pecan writes its configuration files

outdir

directory with model output to use in sensitivity analysis

pft.name

name of PFT used in sensitivity analysis (Optional)

start.year

first year to include in sensitivity analysis

end.year

last year to include in sensitivity analysis

variable

variables to be read from model output

sa.run.ids

list of run ids to read. If NULL, will look in 'pecandir' for a file named 'samples.Rdata' and read from that

per.pft

flag to determine whether we want SA on pft-specific variables

Value

dataframe with one col per quantile analysed and one row per trait, each cell is a list of AGB over time

Author(s)

Ryan Kelly, David LeBauer, Rob Kooper, Mike Dietze, Istem Fer, Akash B V


Run ensemble analysis on finished model runs

Description

Loads parsed ensemble model output from disk, computes summary statistics (mean, median, quantiles), and generates diagnostic plots (histogram, boxplot, and optionally time series).

Usage

run.ensemble.analysis(
  settings,
  plot.timeseries = NA,
  ensemble.id = NULL,
  variable = NULL,
  start.year = NULL,
  end.year = NULL,
  ...
)

Arguments

settings

PEcAn settings object

plot.timeseries

if TRUE plots a modeled timeseries of target variable(s) with CIs

ensemble.id

database ID, taken from settings if not specified

variable

variable name to process, taken from settings if not specified

start.year, end.year

taken from settings if not specified

...

additional arguments passed to ensemble.ts

Details

Upstream contract (reads from settings$outdir):

⁠ensemble.output.<var>.<years>.<id>.Rdata⁠

Produced by get.results. Contains ensemble.output: a numeric vector of model output values, one per ensemble member.

File-based side effects (saved to settings$outdir):

⁠ensemble.analysis.<var>.<years>.<id>.pdf⁠

Histogram and boxplot of ensemble output distribution.

⁠ensemble.ts.<var>.<years>.<id>.pdf⁠

(Optional) Time series plot with confidence intervals. Generated when plot.timeseries is not NA.

⁠ensemble.ts.analysis.<var>.<years>.<id>.Rdata⁠

(Optional) Contains ensemble.ts.analysis: time series summary statistics. Saved when plot.timeseries is not NA.

Note: This is a terminal step — nothing downstream loads these files programmatically. The results are consumed by visualization or user inspection.

Value

Nothing (called for side effects). Creates ensemble plots as PDF files and optionally saves time series analysis results.

Author(s)

David LeBauer, Shawn Serbin, Ryan Kelly


Run sensitivity analysis on finished model runs

Description

Loads parameter samples and parsed model output from disk, computes sensitivity coefficients and variance decomposition for each PFT/trait, and saves results and diagnostic plots.

Usage

run.sensitivity.analysis(
  settings,
  plot = TRUE,
  ensemble.id = NULL,
  variable = NULL,
  start.year = NULL,
  end.year = NULL,
  pfts = NULL,
  ...
)

Arguments

settings

a PEcAn settings object

plot

logical. Option to generate sensitivity analysis and variance decomposition plots (plot = TRUE) or to turn these plots off (plot = FALSE).

ensemble.id

ensemble ID

variable

which variable(s) to do sensitivity analysis for. Defaults to all specified in settings

start.year

defaults to what is specified in settings

end.year

defaults to what is specified in settings

pfts

a vector of PFT names found in settings to run sensitivity analysis on

...

currently unused

Details

Upstream contract (reads from settings$outdir):

samples.Rdata

Produced by get.parameter.samples. Loaded to obtain trait.samples, trait.names, pft.names, and sa.run.ids.

⁠sensitivity.samples.<id>.Rdata⁠

Produced by run.write.configs. If present, overrides sa.run.ids and sa.samples from samples.Rdata.

⁠sensitivity.output.<var>.<years>.<id>.Rdata⁠

Produced by get.results. Contains sensitivity.output: a named list (PFT -> matrix) of model outputs for each SA run.

File-based side effects (saved to settings$outdir):

⁠sensitivity.results.<var>.<years>.<id>.Rdata⁠

Contains sensitivity.results: a named list (PFT -> list) with elements sensitivity.output (partial derivatives) and variance.decomposition.output (elasticities and partial variances).

⁠sensitivityanalysis.<pft>.<var>.<years>.<id>.pdf⁠

Sensitivity analysis diagnostic plots (one per PFT). Generated when plot = TRUE.

⁠variance.decomposition.<pft>.<var>.<years>.<id>.pdf⁠

Variance decomposition "popsicle" plots (one per PFT). Generated when plot = TRUE.

Note: This is a terminal step in the workflow — nothing downstream loads sensitivity.results.*.Rdata programmatically. The results are consumed by visualization or user inspection.

Value

Nothing (called for side effects). Saves sensitivity.results as sensitivity.results.*.Rdata and optional PDF plots.

Author(s)

David LeBauer, Shawn Serbin, Ryan Kelly

Examples

## Not run: 
library(PEcAn.settings)
library(PEcAn.uncertainty)
settings <- read.settings("path/to/pecan.xml")
run.sensitivity.analysis(settings)

## End(Not run)

Apply get.results to each of a list of settings

Description

Apply get.results to each of a list of settings

Usage

runModule.get.results(settings)

Arguments

settings

a PEcAn Settings or MultiSettings object

See Also

get.results


Run ensemble analyses across all sites in settings

Description

Run ensemble analyses across all sites in settings

Usage

runModule.run.ensemble.analysis(settings, ...)

Arguments

settings

PEcAn settings object

...

arguments passed on to run.ensemble.analysis


Run sensitivity analysis for every Settings in a MultiSettings

Description

Caution: Not yet working for multisite settings. It will _run_ the analysis for all sites, but each site will overwrite the result from the previous one.

Usage

runModule.run.sensitivity.analysis(settings, ...)

Arguments

settings

PEcan settings object

...

additional arguments passed on to 'run.sensitivity.analysis'


Spline estimate of univariate relationship between parameter value and model output

Description

Creates a spline function using the splinefun function that estimates univariate response of parameter input to model output

Usage

sa.splinefun(quantiles.input, quantiles.output)

Arguments

quantiles.input

passed to 'x' argument of 'stats::splinefun'

quantiles.output

passed to 'y' argument of 'stats::splinefun'

Value

function


Calculates the standard deviation of the variance estimate

Description

Uses the equation σ4(2n1+κn)\sigma^4\left(\frac{2}{n-1}+\frac{\kappa}{n}\right)

Usage

sd.var(x)

Arguments

x

sample

Value

estimate of standard deviation of the sample variance

Author(s)

David LeBauer

References

Mood, Graybill, Boes 1974 'Introduction to the Theory of Statistics' 3rd ed. p 229; Casella and Berger 'Statistical Inference' p 364 ex. 7.45; 'Reference for Var(s^2)' CrossValidated http://stats.stackexchange.com/q/29905/1381, 'Calculating required sample size, precision of variance estimate' CrossValidated http://stats.stackexchange.com/q/7004/1381, 'Variance of Sample Variance?' Mathematics - Stack Exchange http://math.stackexchange.com/q/72975/3733


Performs univariate sensitivity analysis and variance decomposition

Description

This function estimates the univariate responses of a model to a parameter for a set of traits, calculates the model sensitivity at the median, and performs a variance decomposition. This function results in a set of ensitivity plots (one per variable) and plot_variance_decomposition.

Usage

sensitivity.analysis(trait.samples, sa.samples, sa.output, outdir)

Arguments

trait.samples

list of vectors, one per trait, representing samples of the trait value, with length equal to the mcmc chain length. Samples are taken from either the prior distribution or meta-analysis results

sa.samples

data.frame with one column per trait and one row for the set of quantiles used in sensitivity analysis.

sa.output

list of data.frames, similar to sa.samples, except cells contain the results of a model run with that trait x quantile combination and all other traits held at their median value

outdir

directory to which plots are written

Value

results of sensitivity analysis

Author(s)

David LeBauer

Examples

## Not run: 
sensitivity.analysis(
 trait.samples[[pft$name]],
 sa.samples[[pft$name]],
 sa.agb[[pft$name]],
 pft$outdir
)

## End(Not run)

Generate sensitivity analysis filenames

Description

Generate sensitivity analysis filenames

Usage

sensitivity.filename(
  settings,
  prefix = "sensitivity.samples",
  suffix = "Rdata",
  all.var.yr = TRUE,
  pft = NULL,
  ensemble.id = settings$sensitivity.analysis$ensemble.id,
  variable = settings$sensitivity.analysis$variable,
  start.year = settings$sensitivity.analysis$start.year,
  end.year = settings$sensitivity.analysis$end.year
)

Arguments

settings

list of PEcAn settings.

prefix

string to appear at the beginning of the filename

suffix

file extension: string to appear at the end of the filename

all.var.yr

logical: does ensemble include all vars and years? If FALSE, filename will include years and vars

pft

name of PFT used for analysis. If NULL, assumes all PFTs in run are used and does not add them to the filename

ensemble.id

ensemble ID(s)

variable

variable(s) included in the ensemble.

start.year, end.year

first and last year simulated.

Details

Generally uses values in settings, but can be overwritten for manual uses

Value

a filename

Author(s)

Ryan Kelly


Spline Ensemble

Description

Estimate model output based on univariate splines

Usage

spline.ensemble(gi.phii, median)

Arguments

gi.phii

matrix given as output from get.gi.phii

median

median value around which variance will be calculated

Details

Accepts output from get.gi.phii (the matrix g(ϕi)g(\phi_i)) and produces spline estimate of f(ϕ)f(\phi) for use in estimating closure term associated with spline approximation

Author(s)

David LeBauer


Truncate spline at zero if...

Description

Truncate spline at zero if P[x<0] < pnorm(-3) pnorm(-3) chosen as default value for min quantile because this is the default low end of range for the sensitivity analysis. This parameter could be determined based on minimum value in settings$sensitivity.analysis$quantiles

Usage

spline.truncate(x, min.quantile = stats::pnorm(-3))

Arguments

x

vector

min.quantile

threshold quantile for testing lower bound on variable

Value

either x or a vector with values < 0 converted to zero

Author(s)

David LeBauer

Examples

set.seed(0)
x <- c(rgamma(998,1,1), rnorm(10))
min(x) # -0.5238
min(PEcAn.uncertainty::spline.truncate(x))

variance statistics

Description

Variance and SD(variance)

Usage

variance.stats(x)

Arguments

x

numeric vector

Details

calculates variance and sd of variance using the var from base R and sd.var from PEcAn.

Value

list with variance and sd of variance

Author(s)

David LeBauer


Write ensemble config files

Description

Writes config files for use in meta-analysis and returns a list of run ids. Given a pft.xml object, a list of lists as supplied by get.sa.samples, a name to distinguish the output files, and the directory to place the files.

Usage

write.ensemble.configs(
  input_design,
  ensemble.size,
  defaults,
  ensemble.samples,
  settings,
  model,
  clean = FALSE,
  write.to.db = TRUE,
  restart = NULL,
  rename = FALSE
)

Arguments

input_design

design matrix describing sampled inputs (see 'run.write.configs()'). Columns named after 'settings$run$inputs' tags give 1-based indices into each input's 'path' list and rows follow run order. Requires 'nrow(input_design) >= ensemble.size'; extra rows are ignored.

ensemble.size

size of ensemble

defaults

pft

ensemble.samples

list of lists supplied by get.ensemble.samples

settings

list of PEcAn settings

model

name of model to be run, e.g. "ED2" or "SIPNET"

clean

remove old output first?

write.to.db

logical: Record this run in BETY?

restart

In case this is a continuation of an old simulation. Needs to be a list with name tags of runid, inputs, new.params (parameters), new.state (initial condition), ensemble.id (ensemble id), start.time and stop.time. See Details.

rename

Decide if we want to rename previous output files, for example convert from sipnet.out to sipnet.2020-07-16.out.

Details

The restart functionality is developed using model specific functions by calling write_restart.modelname function. First, you need to make sure that this function is already exist for your desired model. See here https://pecanproject.github.io/pecan-documentation/latest/pecan-models.html. new state is a dataframe with a different column for each state variable. The number of the rows in this dataframe needs to be the same as the ensemble size. The state variables that you can use for setting up initial conditions are model specific. Check the documentation of the write_restart.<modelname> function for the model you are using. The units for the state variables need to be in the PEcAn standard units which can be found in standard_vars. new.params also has similar structure to ensemble.samples which is sent as an argument.

Value

list, containing $runs = data frame of runids, $ensemble.id = the ensemble ID for these runs, and $samples with ids and samples used for each tag. Also writes sensitivity analysis configuration files as a side effect

Author(s)

David LeBauer, Carl Davidson, Hamze Dokoohaki


Write sensitivity analysis config files

Description

Writes config files for use in sensitivity analysis.

Usage

write.sa.configs(
  defaults,
  quantile.samples,
  settings,
  model,
  clean = FALSE,
  write.to.db = TRUE,
  input_design = NULL
)

Arguments

defaults

named list with default parameter values

quantile.samples

list of lists supplied by get.sa.samples

settings

list of settings

model

name of model to be run

clean

logical: Delete any existing contents of the directory specified by settings$rundir before writing to it?

write.to.db

logical: Record this run to BETY? If TRUE, uses connection settings specified in settings$database

input_design

data.frame coordinating input files across runs

Value

list, containing $runs = data frame of runids, and $ensemble.id = the ensemble ID for these runs. Also writes sensitivity analysis configuration files as a side effect

Author(s)

David LeBauer, Carl Davidson, Akash B V