| Title: | PEcAn Functions Used for Propagating and Partitioning Uncertainties in Ecological Forecasts and Reanalysis |
|---|---|
| Description: | The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PECAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. |
| Authors: | David LeBauer, Mike Dietze, Xiaohui Feng, Dan Wang, Carl Davidson, Rob Kooper, Shawn Serbin |
| Maintainer: | David LeBauer <[email protected]> |
| License: | BSD_3_clause + file LICENSE |
| Version: | 1.9.0.9000 |
| Built: | 2026-06-04 19:47:47 UTC |
| Source: | https://github.com/PecanProject/pecan |
Loads model outputs from a Sobol ensemble, calculates summary
statistics for a chosen variable, feeds them to sensitivity::tell(),
and returns the updated Sobol object.
compute_sobol_indices(outdir, sobol_obj, var = "GPP", stat_fun = mean)compute_sobol_indices(outdir, sobol_obj, var = "GPP", stat_fun = mean)
outdir |
PEcAn run output directory that contains runs.txt |
sobol_obj |
object produced by PEcAn.uncertainty::generate_joint_ensemble_design() |
var |
Variable name to summarise (default "GPP"). |
stat_fun |
Summary statistic applied to var default mean . |
sobol_obj .
Generates a vector of filenames to be used for PEcAn ensemble output files. All paths start from directory 'settings$outdir', which will be created if it does not exist.
ensemble.filename( settings, prefix = "ensemble.samples", suffix = "Rdata", all.var.yr = TRUE, ensemble.id = settings$ensemble$ensemble.id, variable = settings$ensemble$variable, start.year = settings$ensemble$start.year, end.year = settings$ensemble$end.year )ensemble.filename( settings, prefix = "ensemble.samples", suffix = "Rdata", all.var.yr = TRUE, ensemble.id = settings$ensemble$ensemble.id, variable = settings$ensemble$variable, start.year = settings$ensemble$start.year, end.year = settings$ensemble$end.year )
settings |
list of PEcAn settings. |
prefix |
string to appear at the beginning of the filename |
suffix |
file extension: string to appear at the end of the filename |
all.var.yr |
logical: does ensemble include all vars and years? If FALSE, filename will include years and vars |
ensemble.id |
ensemble ID(s) |
variable |
variable(s) included in the ensemble. |
start.year, end.year
|
first and last year simulated. |
Typically used by passing only a settings object, but all values can be overridden for manual use.
If only a single variable or a subset of years are needed, the generated filename will identify these in the form If all vars and years are included, set 'all.yr.var' to TRUE to get a filename of the form 'prefix.ensemble_id.suffix'. All elements are recycled vectorwise.
a vector of filenames, each in the form '[settings$outdir]/[prefix].[ensemble.ID].[variable].[start.year].[end.year][suffix]'.
Ryan Kelly
Plots an ensemble time-series from PEcAn for the selected target variable
ensemble.ts(ensemble.ts, observations = NULL, window = 1, ...)ensemble.ts(ensemble.ts, observations = NULL, window = 1, ...)
ensemble.ts |
ensemble timeseries to be plotted, as returned by [read.ensemble.ts()] |
observations |
observed data to plot over timeseries predictions |
window |
number of timepoints to average across |
... |
additional arguments, currently ignored |
nothing, generates an ensemble time-series plot
Michael Dietze, Ryan Kelly
Calculate parameters for heteroskedastic flux uncertainty
flux.uncertainty( measurement, QC = 0, flags = TRUE, bin.num = 10, transform = identity, minBin = 5, ... )flux.uncertainty( measurement, QC = 0, flags = TRUE, bin.num = 10, transform = identity, minBin = 5, ... )
measurement |
= flux time-series |
QC |
= quality control flag time series (0 = best) |
flags |
= additional flags on flux filtering of PAIRS (length = 1/2 that of the time series, TRUE = use). |
bin.num |
= number of bins (default = 10) |
transform |
= transformation of magnitude (default = identity) |
minBin |
= minimum number of points allowed in a bin |
... |
additional arguments, currently ignored |
return.list List of parameters from the fitted uncertainty model
Mike Dietze, Carl Davidson
Generate joint ensemble design for parameter sampling Creates a joint ensemble design that maintains parameter correlations across all sites in a multi-site run. This function generates sample indices that are shared across sites to ensure consistent parameter sampling.
generate_joint_ensemble_design(settings, ensemble_size, sobol = FALSE)generate_joint_ensemble_design(settings, ensemble_size, sobol = FALSE)
settings |
PEcAn settings object. This function directly uses:
When samples.Rdata doesn't exist, settings is passed to
|
ensemble_size |
Integer specifying the number of ensemble members. The input_design is generated once for the entire model run. You might want to recycle existing ensemble_samples when splitting larger runs into smaller jobs while keeping the same parameters. |
sobol |
Logical. If TRUE, returns a |
Note on internal dependencies
If samples.Rdata doesn't exist we call get.parameter.samples(), which loads parameter distributions.
In practice it: - uses pft$posterior.files directly when it is defined (an Rdata file with post.distns or prior.distns), - otherwise figures out an output directory from pft$outdir or, if needed, via pft$posteriorid in the database, - then looks in that directory for post.distns.Rdata, falling back to prior.distns.Rdata, - and, for MCMC posteriors, looks up trait.mcmc*.Rdata linked to the same posteriorid or a trait.mcmc.Rdata file in that directory.
Difference from generate_OAT_SA_design: This function samples inputs randomly or quasi-randomly, while generate_OAT_SA_design holds all non-parameter inputs constant to isolate parameter effects.
A list containing ensemble samples and indices.
If sobol = FALSE, returns list(X = design_matrix).
If sobol = TRUE, returns a sensitivity::soboljansen()
result object with the design matrix in $X plus additional
components for Sobol index calculations.
Creates an input design matrix for sensitivity analysis where non-parameter inputs (met, IC, soil, etc.) are held constant while parameters vary one-at-a-time across quantiles. This differs from ensemble design where all inputs vary together.
generate_OAT_SA_design(settings, sa_samples = NULL)generate_OAT_SA_design(settings, sa_samples = NULL)
settings |
PEcAn settings object. See details for required elements. |
sa_samples |
Optional. Pre-loaded SA samples (named list with one
element per PFT, each a matrix with quantiles as rows and traits as columns).
If NULL (default), samples are generated via |
## Settings requirements
This function directly uses:
settings$outdir - Output directory path for samples.Rdata
settings$pfts - List of PFTs (extracts posterior.files)
settings$ensemble$samplingspace - Input types to include in design
When sa_samples = NULL, settings is passed to
get.parameter.samples which additionally requires:
settings$sensitivity.analysis - SA quantile configuration
settings$database$bety - Database connection (optional)
settings$host$name - Host name for dbfile.check (optional)
## OAT design logic
For sensitivity analysis, we must isolate the effect of each
parameter by holding all other inputs constant. The param column contains
sequential indices (1, 2, 3, ...) matching the SA run order in
write.sa.configs. All other columns (met, ic, soil, etc.) are set to 1,
meaning the first input file is always used.
Note on internal dependencies
If sa_samples is NULL we hand off to get.parameter.samples(), which does the work of finding and loading parameter distributions.
In practice it: - uses pft$posterior.files directly when it is defined (an Rdata file with post.distns or prior.distns), - otherwise figures out an output directory from pft$outdir or, if needed, via pft$posteriorid in the database, - then looks in that directory for post.distns.Rdata, falling back to prior.distns.Rdata, - and, for MCMC posteriors, looks up trait.mcmc*.Rdata linked to the same posteriorid or a trait.mcmc.Rdata file in that directory.
list with component X: a data.frame with columns for each input type and one row per SA run. Non-parameter columns are all 1 (constant).
Akash B V
## Not run: # Generate SA design for a multi-site run sa_design <- generate_OAT_SA_design(settings) # View the design matrix print(sa_design$X) # param met ic soil # 1 1 1 1 1 # Median run # 2 2 1 1 1 # trait1 @ q=2.3% # 3 3 1 1 1 # trait1 @ q=15.9% # 4 4 1 1 1 # trait1 @ q=84.1% # ... # With pre-loaded sa_samples (skips get.parameter.samples call) load("samples.Rdata") sa_design <- generate_OAT_SA_design(settings, sa_samples = sa.samples) ## End(Not run)## Not run: # Generate SA design for a multi-site run sa_design <- generate_OAT_SA_design(settings) # View the design matrix print(sa_design$X) # param met ic soil # 1 1 1 1 1 # Median run # 2 2 1 1 1 # trait1 @ q=2.3% # 3 3 1 1 1 # trait1 @ q=15.9% # 4 4 1 1 1 # trait1 @ q=84.1% # ... # With pre-loaded sa_samples (skips get.parameter.samples call) load("samples.Rdata") sa_design <- generate_OAT_SA_design(settings, sa_samples = sa.samples) ## End(Not run)
This is the pure computation core extracted from get.parameter.samples().
It takes pre-loaded R objects as input and returns all computed samples
directly, with no file I/O and no database calls.
get_parameter_samples( pft_names, prior_distns_list, trait_mcmc_list = NULL, ensemble.size = 1, ens.sample.method = "uniform", sa_quantiles = NULL, do_ensemble = TRUE, independent = TRUE )get_parameter_samples( pft_names, prior_distns_list, trait_mcmc_list = NULL, ensemble.size = 1, ens.sample.method = "uniform", sa_quantiles = NULL, do_ensemble = TRUE, independent = TRUE )
pft_names |
character vector of PFT names |
prior_distns_list |
list of data frames (one per PFT), each with columns: distn, parama, paramb, n. Row names are trait names. |
trait_mcmc_list |
list of trait MCMC results (one per PFT).
Each element should be a named list of |
ensemble.size |
integer. Number of runs in model ensemble. |
ens.sample.method |
character. One of "halton", "sobol", "torus", "lhc", "uniform". |
sa_quantiles |
numeric vector of quantiles for sensitivity analysis, or NULL to skip SA sampling. |
do_ensemble |
logical. Whether to generate ensemble samples. |
independent |
logical. TRUE means parameters were fitted individually (standard MA) and can be resampled independently. FALSE means they were fitted together (e.g. PDA) and correlations should be preserved. |
For the original disk-based wrapper, see get.parameter.samples.
named list with five elements:
Named list by PFT, then by trait. Each leaf is a numeric vector of parameter samples.
Sensitivity analysis samples (list), or empty list
if sa_quantiles is NULL.
Ensemble samples (list), or empty list if
do_ensemble is FALSE.
Reserved for future use (empty list).
Environmental samples (empty list, populated by downstream code if needed).
David LeBauer, Shawn Serbin, Istem Fer, Om Kapale
get.parameter.samples for the backward-compatible
wrapper that handles file I/O and database lookups.
## Not run: priors <- data.frame( distn = "norm", parama = 20, paramb = 5, n = 50, row.names = "SLA", stringsAsFactors = FALSE ) result <- get_parameter_samples( pft_names = "temperate.Hardwood", prior_distns_list = list(priors), ensemble.size = 100, do_ensemble = TRUE ) str(result$trait.samples) str(result$ensemble.samples) ## End(Not run)## Not run: priors <- data.frame( distn = "norm", parama = 20, paramb = 5, n = 50, row.names = "SLA", stringsAsFactors = FALSE ) result <- get_parameter_samples( pft_names = "temperate.Hardwood", prior_distns_list = list(priors), ensemble.size = 100, do_ensemble = TRUE ) str(result$trait.samples) str(result$ensemble.samples) ## End(Not run)
Get delta between sequential flux datapoints
get.change(measurement)get.change(measurement)
measurement |
numeric vector |
Difference between consecutive measurements
Mike Dietze, Carl Davidson
Given a set of numbers (a numeric vector), this returns the set's coefficient of variance.
get.coef.var(set)get.coef.var(set)
set |
numeric vector of trait values |
coeficient of variance
Given the sensitivity, samples, and outputs for a single trait, return elasticity
get.elasticity(sensitivity, samples, outputs)get.elasticity(sensitivity, samples, outputs)
sensitivity |
univariate sensitivity of model to a parameter,
can be calculated by |
samples |
samples from trait distribution |
outputs |
model output from ensemble runs |
elasticity = normalized sensitivity
Returns a matrix of randomly or quasi-randomly sampled trait values to be assigned to traits over several model runs. given the number of model runs and a list of sample distributions for traits The model run is indexed first by model run, then by trait
get.ensemble.samples( ensemble.size, pft.samples, env.samples, method = "random", param.names = NULL, ... )get.ensemble.samples( ensemble.size, pft.samples, env.samples, method = "random", param.names = NULL, ... )
ensemble.size |
number of runs in model ensemble |
pft.samples |
random samples from parameter distribution, e.g. from a MCMC chain |
env.samples |
env samples |
method |
the method used to generate the ensemble samples. Random generators: uniform, uniform with latin hypercube permutation. Quasi-random generators: halton, sobol, torus. Random generation draws random variates whereas quasi-random generation is deterministic but well equidistributed. Default is uniform. For small ensemble size with relatively large parameter number (e.g ensemble size < 5 and # of traits > 5) use methods other than halton. |
param.names |
a list of parameter names that were fitted either by MA or PDA, important argument, if NULL parameters will be resampled independently |
... |
Other arguments passed on to the sampling method |
matrix of (quasi-)random samples from trait distributions
David LeBauer, Istem Fer
Calculate distribution of function of a variable
get.gi.phii(splinefuns, trait.samples, maxn = NULL)get.gi.phii(splinefuns, trait.samples, maxn = NULL)
splinefuns |
univariate spline functions created for each trait, e.g. by the |
trait.samples |
n x m matrix (or list with m vectors of length n) of n parameter sets, each with a sample from m traits |
maxn |
maximum number of parameter sets to evaluate |
Transform trait parameter through trait-specific univariate model emulator spline function
Given a matrix of parameter sets, calculates a matrix of model output
by applying appropriate spline function to each parameter.
Output is used with plot_variance_decomposition
and spline.ensemble
matrix of spline estimates of model output for each of n parameter sets
David LeBauer
Loads posterior distributions and MCMC chain results from disk, generates
parameter samples for ensemble and sensitivity analysis runs, and optionally
saves results to samples.Rdata. This is the backward-compatible wrapper
that delegates computation to get_parameter_samples.
get.parameter.samples( settings, ensemble.size = 1, posterior.files = rep(NA, length(settings$pfts)), ens.sample.method = "uniform", outdir = settings$outdir )get.parameter.samples( settings, ensemble.size = 1, posterior.files = rep(NA, length(settings$pfts)), ens.sample.method = "uniform", outdir = settings$outdir )
settings |
PEcAn settings object |
ensemble.size |
number of runs in model ensemble |
posterior.files |
list of filenames to read from |
ens.sample.method |
one of |
outdir |
character path; directory to write |
Upstream contract (reads from each PFT's outdir):
post.distns.Rdata or prior.distns.Rdata
Posterior (or prior)
distribution summaries produced by run.meta.analysis.pft. A data
frame with columns distn, parama, paramb, n.
trait.mcmc.Rdata(Optional) MCMC chain samples from the
meta-analysis. Named list of mcmc.list objects, one per trait.
If present, samples are drawn from the chains directly; otherwise,
independent samples are drawn from post.distns.
File-based side effects (saved to settings$outdir):
samples.RdataWhen outdir is non-NULL (the default), bundles 5 objects:
trait.samples — Named list (PFT -> trait -> numeric vector of
length iterations). Raw MCMC or prior-sampled values.
sa.samples — Named list (PFT -> matrix[n_quantiles x
n_traits]). Quantile-based samples for sensitivity analysis.
ensemble.samples — Named list (PFT -> data frame[ensemble.size
x n_traits]). Subsampled parameter sets for ensemble runs.
env.samples — Currently empty list (reserved for
environmental samples).
runs.samples — Currently empty list (reserved for run
metadata).
Downstream contract: samples.Rdata is loaded by run.write.configs
(in PEcAn.workflow) to generate model configuration files. It is also
loaded by get.results and run.sensitivity.analysis to retrieve
sample metadata for post-processing. This implicit file-based coupling is
a refactoring target.
Named list with 5 elements: trait.samples, sa.samples,
ensemble.samples, runs.samples, env.samples. Returned invisibly.
David LeBauer, Shawn Serbin, Istem Fer, Om Kapale
Loads sample metadata and run IDs from disk, reads model output for each
run, and saves parsed results for downstream analysis by
run.sensitivity.analysis and run.ensemble.analysis.
get.results( settings, sa.ensemble.id = NULL, ens.ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL )get.results( settings, sa.ensemble.id = NULL, ens.ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL )
settings |
list, read from settings file (xml) using
|
sa.ensemble.id, ens.ensemble.id
|
ensemble IDs for the sensitivity
analysis and ensemble analysis.
If not provided, they are first looked up from |
variable |
variables to retrieve, as vector of names or expressions |
start.year, end.year
|
first and last years to retrieve |
Output is placed in model output directory (settings$outdir).
Upstream contract (reads from settings$outdir):
sensitivity.samples.<id>.Rdata or samples.Rdata
Loaded to
obtain sa.run.ids, pft.names, trait.names, and sa.samples for
sensitivity analysis. Produced by run.write.configs.
ensemble.samples.<id>.Rdata or samples.Rdata
Loaded to obtain
ens.run.ids, pft.names, and trait.names for ensemble analysis.
Produced by run.write.configs.
File-based side effects (saved to settings$outdir):
sensitivity.output.<var>.<years>.<id>.RdataContains
sensitivity.output: a named list (PFT -> matrix[n_quantiles x
n_traits]) of model output values corresponding to each SA run.
One file per variable/year combination.
ensemble.output.<var>.<years>.<id>.RdataContains
ensemble.output: a numeric vector of model output values, one per
ensemble member. One file per variable/year combination.
Downstream contract: sensitivity.output.*.Rdata is loaded by
run.sensitivity.analysis. ensemble.output.*.Rdata is loaded by
run.ensemble.analysis. Both use these files to compute diagnostics
and plots. This implicit file-based coupling is a refactoring target.
Nothing (called for side effects). Saves sensitivity.output and/or
ensemble.output .Rdata files to settings$outdir.
David LeBauer, Shawn Serbin, Mike Dietze, Ryan Kelly
This function evaluates the sensitivity of a model to a parameter. This is done by evaluating the first derivative of the univariate spline estimate of the model response at the parameter median.
get.sensitivity(trait.samples, sa.splinefun)get.sensitivity(trait.samples, sa.splinefun)
trait.samples |
parameter values to evaluate at their median |
sa.splinefun |
fitted spline function. Must take two arguments. |
numeric estimate of model sensitivity to parameter
Given an input containing paths to multiple files, this function samples from the given paths to make an ensemble of the requested size. If method is 'sampling' they are sampled with replacement from the full set; if method is 'looping' they are taken sequentially with the sequence repeating as needed.
input.ens.gen( settings, ensemble_size, input, method = "sampling", parent_ids = NULL, bad_parent_action = c("error", "resample") )input.ens.gen( settings, ensemble_size, input, method = "sampling", parent_ids = NULL, bad_parent_action = c("error", "resample") )
settings |
list of PEcAn settings |
ensemble_size |
size of ensemble |
input |
name of input to sample, e.g. "met", "veg", "pss" |
method |
Method for sampling - For now looping or sampling with replacement is implemented |
parent_ids |
integer vector of indices to be used as-is |
bad_parent_action |
How to handle entries in 'parent_id' that are not valid indices into the paths given for this input. See details |
If 'parent_ids' is provided, it is used as the indices for the current input. Use this when you have multiple inputs whose files should be sampled as a coordinated set. For example if your soil moisture files are matched to the rainfall amounts in your weather files, generate the met ensemble first ('met_ids <- input.ens.gen(..., input = "met", parent_ids = NULL)') and then pass those as parents to the soil moisture ensemble: 'input.ens.gen(..., input = "soilmoist", parent_ids = met_ids)'.
If any indices in 'parent_ids' are larger than the length of the input being sampled (e.g. '53' when the input only has 25 paths), the result is determined by 'bad_parent_action': "error" stops execution and requests the user to fix their inputs, "resample" uses the valid parents as-is and replaces the invalid parents via method = "sampling". Note that method "resample" is provided for backwards compatibility; we strongly recommend validating parent-child alignment before calling 'input.ens.gen', ideally in the same process that generates your parent ids.
A list with elements 'id' showing the order of sampling and 'samples' with the paths selected from those indices.
## Not run: settings <- PEcAn.settings::read.settings("pecan.xml") input.ens.gen( settings, ensemble_size = 50, input = "met", method = "sampling" ) ## End(Not run)## Not run: settings <- PEcAn.settings::read.settings("pecan.xml") input.ens.gen( settings, ensemble_size = 50, input = "met", method = "sampling" ) ## End(Not run)
Note that this calculates the 'excess kurtosis',
which is defined as kurtosis - 3.
This statistic is used in the calculation of the standard deviation of
sample variance in the function sd.var.
kurtosis(x)kurtosis(x)
x |
vector of values |
numeric value of kurtosis
David LeBauer
NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm, 2011-06-20.
Plot fit for heteroskedastic flux uncertainty
plot_flux_uncertainty(f, ...)plot_flux_uncertainty(f, ...)
f |
output of flux.uncertainty functions |
... |
optional graphical paramters |
Mike Dietze, Carl Davidson
Generates a plot using plot_sensitivity for multiple traits.
plot_sensitivities( sensitivity.plot.inputs, prior.sensitivity.plot.inputs = NULL, ... )plot_sensitivities( sensitivity.plot.inputs, prior.sensitivity.plot.inputs = NULL, ... )
sensitivity.plot.inputs |
inputs |
prior.sensitivity.plot.inputs |
priors |
... |
arguments passed to |
list of plots, one per trait
Plot univariate response of model output to a trait parameter.
plot_sensitivity( sa.sample, sa.spline, trait, y.range = c(0, 50), median.i = 4, prior.sa.sample = NULL, prior.sa.spline = NULL, fontsize = list(title = 12, axis = 8), linesize = 1, dotsize = 2 )plot_sensitivity( sa.sample, sa.spline, trait, y.range = c(0, 50), median.i = 4, prior.sa.sample = NULL, prior.sa.spline = NULL, fontsize = list(title = 12, axis = 8), linesize = 1, dotsize = 2 )
sa.sample |
trait quantiles used in sensitivity analysis |
sa.spline |
spline function estimated from sensitivity analysis |
trait |
trait name for title |
y.range |
limits for y axis of plot |
median.i |
index of median value in sa.sample; |
prior.sa.sample |
similar to sa.sample, but for prior distribution. If given, plots sensitivity for prior run |
prior.sa.spline |
similar to sa.spline, but for prior trait distribution. |
fontsize |
(optional) list with three arguments that can be set to vary the fontsize of the title, axis labels, and axis title in the sensitivity plots |
linesize |
passed to ggplot to set line thickness |
dotsize |
passed to ggplot to set point size |
Plots for a single trait; called by plot_sensitivities
to plot sensitivity plots for multiple traits.
object of class ggplot
Plots variance decomposition tryptich: CV, elasticity, variance
plot_variance_decomposition( plot.inputs, fontsize = list(title = 18, axis = 14), order_by = c("Variance", "Elasticity", "CV (%)", "rowname", "none") )plot_variance_decomposition( plot.inputs, fontsize = list(title = 18, axis = 14), order_by = c("Variance", "Elasticity", "CV (%)", "rowname", "none") )
plot.inputs |
Output from a sensitivity analysis. Output must be of the form given by sensitivity.results$variance.decomposition.output in model output |
fontsize |
list specifying the font size of the titles and axes of the graph |
order_by |
Result column to sort by, or "none" to retain input order |
ggplot object
David LeBauer, Carl Davidson, Chris Black
x <- list(trait.labels = c('a', 'b', 'c'), coef.vars = c(a=1,b=0.5, c=0.1), elasticities = c(a=1,b=2,c=0.5), variances = c(a = 20, b=30, c = 10)) plot_variance_decomposition(x) plot_variance_decomposition(x, order_by = "rowname")x <- list(trait.labels = c('a', 'b', 'c'), coef.vars = c(a=1,b=0.5, c=0.1), elasticities = c(a=1,b=2,c=0.5), variances = c(a = 20, b=30, c = 10)) plot_variance_decomposition(x) plot_variance_decomposition(x, order_by = "rowname")
Read Ameriflux L2 Data
read.ameriflux.L2(file.name, year)read.ameriflux.L2(file.name, year)
file.name |
path to file |
year |
currently ignored |
Ameriflux L2 data read from file
Mike Dietze, Carl Davidson
Reads output for an ensemble of length specified by ensemble.size and bounded by start.year
and end.year
read.ensemble.output( ensemble.size, pecandir, outdir, start.year, end.year, variable, ens.run.ids = NULL )read.ensemble.output( ensemble.size, pecandir, outdir, start.year, end.year, variable, ens.run.ids = NULL )
ensemble.size |
the number of ensemble members run |
pecandir |
specifies where pecan writes its configuration files |
outdir |
directory with model output to use in ensemble analysis |
start.year |
first year to include in ensemble analysis |
end.year |
last year to include in ensemble analysis |
variable |
target variables for ensemble analysis |
ens.run.ids |
dataframe. Must contain a column named "id" giving the run IDs to be read.
If NULL, will attempt to read IDs from a file named "samples.Rdata" in |
a list of ensemble model output
Ryan Kelly, David LeBauer, Rob Kooper
Reads ensemble time-series from PEcAn for the selected target variable
read.ensemble.ts( settings, ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL )read.ensemble.ts( settings, ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL )
settings |
PEcAn settings object |
ensemble.id |
database ID, taken from settings if not specified |
variable |
variable name to process. Required |
start.year, end.year
|
taken from settings if not specified |
list
Michael Dietze, Ryan Kelly
Reads output of sensitivity analysis runs
read.sa.output( traits, quantiles, pecandir, outdir, pft.name = "", start.year, end.year, variable, sa.run.ids = NULL, per.pft = FALSE )read.sa.output( traits, quantiles, pecandir, outdir, pft.name = "", start.year, end.year, variable, sa.run.ids = NULL, per.pft = FALSE )
traits |
model parameters included in the sensitivity analysis |
quantiles |
quantiles selected for sensitivity analysis |
pecandir |
specifies where pecan writes its configuration files |
outdir |
directory with model output to use in sensitivity analysis |
pft.name |
name of PFT used in sensitivity analysis (Optional) |
start.year |
first year to include in sensitivity analysis |
end.year |
last year to include in sensitivity analysis |
variable |
variables to be read from model output |
sa.run.ids |
list of run ids to read. If NULL, will look in 'pecandir' for a file named 'samples.Rdata' and read from that |
per.pft |
flag to determine whether we want SA on pft-specific variables |
dataframe with one col per quantile analysed and one row per trait, each cell is a list of AGB over time
Ryan Kelly, David LeBauer, Rob Kooper, Mike Dietze, Istem Fer, Akash B V
Loads parsed ensemble model output from disk, computes summary statistics (mean, median, quantiles), and generates diagnostic plots (histogram, boxplot, and optionally time series).
run.ensemble.analysis( settings, plot.timeseries = NA, ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL, ... )run.ensemble.analysis( settings, plot.timeseries = NA, ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL, ... )
settings |
PEcAn settings object |
plot.timeseries |
if TRUE plots a modeled timeseries of target variable(s) with CIs |
ensemble.id |
database ID, taken from settings if not specified |
variable |
variable name to process, taken from settings if not specified |
start.year, end.year
|
taken from settings if not specified |
... |
additional arguments passed to |
Upstream contract (reads from settings$outdir):
ensemble.output.<var>.<years>.<id>.RdataProduced by
get.results. Contains ensemble.output: a numeric vector of
model output values, one per ensemble member.
File-based side effects (saved to settings$outdir):
ensemble.analysis.<var>.<years>.<id>.pdfHistogram and boxplot of ensemble output distribution.
ensemble.ts.<var>.<years>.<id>.pdf(Optional) Time series plot
with confidence intervals. Generated when plot.timeseries is not
NA.
ensemble.ts.analysis.<var>.<years>.<id>.Rdata(Optional)
Contains ensemble.ts.analysis: time series summary statistics.
Saved when plot.timeseries is not NA.
Note: This is a terminal step — nothing downstream loads these files programmatically. The results are consumed by visualization or user inspection.
Nothing (called for side effects). Creates ensemble plots as PDF files and optionally saves time series analysis results.
David LeBauer, Shawn Serbin, Ryan Kelly
Loads parameter samples and parsed model output from disk, computes sensitivity coefficients and variance decomposition for each PFT/trait, and saves results and diagnostic plots.
run.sensitivity.analysis( settings, plot = TRUE, ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL, pfts = NULL, ... )run.sensitivity.analysis( settings, plot = TRUE, ensemble.id = NULL, variable = NULL, start.year = NULL, end.year = NULL, pfts = NULL, ... )
settings |
a PEcAn settings object |
plot |
logical. Option to generate sensitivity analysis and variance
decomposition plots ( |
ensemble.id |
ensemble ID |
variable |
which variable(s) to do sensitivity analysis for. Defaults
to all specified in |
start.year |
defaults to what is specified in |
end.year |
defaults to what is specified in |
pfts |
a vector of PFT names found in |
... |
currently unused |
Upstream contract (reads from settings$outdir):
samples.RdataProduced by get.parameter.samples. Loaded to
obtain trait.samples, trait.names, pft.names, and sa.run.ids.
sensitivity.samples.<id>.RdataProduced by run.write.configs.
If present, overrides sa.run.ids and sa.samples from
samples.Rdata.
sensitivity.output.<var>.<years>.<id>.RdataProduced by
get.results. Contains sensitivity.output: a named list
(PFT -> matrix) of model outputs for each SA run.
File-based side effects (saved to settings$outdir):
sensitivity.results.<var>.<years>.<id>.RdataContains
sensitivity.results: a named list (PFT -> list) with elements
sensitivity.output (partial derivatives) and
variance.decomposition.output (elasticities and partial variances).
sensitivityanalysis.<pft>.<var>.<years>.<id>.pdfSensitivity
analysis diagnostic plots (one per PFT). Generated when plot = TRUE.
variance.decomposition.<pft>.<var>.<years>.<id>.pdfVariance
decomposition "popsicle" plots (one per PFT). Generated when
plot = TRUE.
Note: This is a terminal step in the workflow — nothing downstream
loads sensitivity.results.*.Rdata programmatically. The results are
consumed by visualization or user inspection.
Nothing (called for side effects). Saves sensitivity.results as
sensitivity.results.*.Rdata and optional PDF plots.
David LeBauer, Shawn Serbin, Ryan Kelly
## Not run: library(PEcAn.settings) library(PEcAn.uncertainty) settings <- read.settings("path/to/pecan.xml") run.sensitivity.analysis(settings) ## End(Not run)## Not run: library(PEcAn.settings) library(PEcAn.uncertainty) settings <- read.settings("path/to/pecan.xml") run.sensitivity.analysis(settings) ## End(Not run)
Apply get.results to each of a list of settings
runModule.get.results(settings)runModule.get.results(settings)
settings |
a PEcAn |
get.results
Run ensemble analyses across all sites in settings
runModule.run.ensemble.analysis(settings, ...)runModule.run.ensemble.analysis(settings, ...)
settings |
PEcAn settings object |
... |
arguments passed on to run.ensemble.analysis |
Caution: Not yet working for multisite settings. It will _run_ the analysis for all sites, but each site will overwrite the result from the previous one.
runModule.run.sensitivity.analysis(settings, ...)runModule.run.sensitivity.analysis(settings, ...)
settings |
PEcan settings object |
... |
additional arguments passed on to 'run.sensitivity.analysis' |
Creates a spline function using the splinefun function that estimates univariate response of parameter input to model output
sa.splinefun(quantiles.input, quantiles.output)sa.splinefun(quantiles.input, quantiles.output)
quantiles.input |
passed to 'x' argument of 'stats::splinefun' |
quantiles.output |
passed to 'y' argument of 'stats::splinefun' |
function
Uses the equation
sd.var(x)sd.var(x)
x |
sample |
estimate of standard deviation of the sample variance
David LeBauer
Mood, Graybill, Boes 1974 'Introduction to the Theory of Statistics' 3rd ed. p 229; Casella and Berger 'Statistical Inference' p 364 ex. 7.45; 'Reference for Var(s^2)' CrossValidated http://stats.stackexchange.com/q/29905/1381, 'Calculating required sample size, precision of variance estimate' CrossValidated http://stats.stackexchange.com/q/7004/1381, 'Variance of Sample Variance?' Mathematics - Stack Exchange http://math.stackexchange.com/q/72975/3733
This function estimates the univariate responses of a model to a parameter for a set of traits, calculates the model sensitivity at the median, and performs a variance decomposition. This function results in a set of ensitivity plots (one per variable) and plot_variance_decomposition.
sensitivity.analysis(trait.samples, sa.samples, sa.output, outdir)sensitivity.analysis(trait.samples, sa.samples, sa.output, outdir)
trait.samples |
list of vectors, one per trait, representing samples of the trait value, with length equal to the mcmc chain length. Samples are taken from either the prior distribution or meta-analysis results |
sa.samples |
data.frame with one column per trait and one row for the set of quantiles used in sensitivity analysis. |
sa.output |
list of data.frames, similar to sa.samples, except cells contain the results of a model run with that trait x quantile combination and all other traits held at their median value |
outdir |
directory to which plots are written |
results of sensitivity analysis
David LeBauer
## Not run: sensitivity.analysis( trait.samples[[pft$name]], sa.samples[[pft$name]], sa.agb[[pft$name]], pft$outdir ) ## End(Not run)## Not run: sensitivity.analysis( trait.samples[[pft$name]], sa.samples[[pft$name]], sa.agb[[pft$name]], pft$outdir ) ## End(Not run)
Generate sensitivity analysis filenames
sensitivity.filename( settings, prefix = "sensitivity.samples", suffix = "Rdata", all.var.yr = TRUE, pft = NULL, ensemble.id = settings$sensitivity.analysis$ensemble.id, variable = settings$sensitivity.analysis$variable, start.year = settings$sensitivity.analysis$start.year, end.year = settings$sensitivity.analysis$end.year )sensitivity.filename( settings, prefix = "sensitivity.samples", suffix = "Rdata", all.var.yr = TRUE, pft = NULL, ensemble.id = settings$sensitivity.analysis$ensemble.id, variable = settings$sensitivity.analysis$variable, start.year = settings$sensitivity.analysis$start.year, end.year = settings$sensitivity.analysis$end.year )
settings |
list of PEcAn settings. |
prefix |
string to appear at the beginning of the filename |
suffix |
file extension: string to appear at the end of the filename |
all.var.yr |
logical: does ensemble include all vars and years? If FALSE, filename will include years and vars |
pft |
name of PFT used for analysis. If NULL, assumes all PFTs in run are used and does not add them to the filename |
ensemble.id |
ensemble ID(s) |
variable |
variable(s) included in the ensemble. |
start.year, end.year
|
first and last year simulated. |
Generally uses values in settings, but can be overwritten for manual uses
a filename
Ryan Kelly
Estimate model output based on univariate splines
spline.ensemble(gi.phii, median)spline.ensemble(gi.phii, median)
gi.phii |
matrix given as output from |
median |
median value around which variance will be calculated |
Accepts output from get.gi.phii (the matrix ) and produces
spline estimate of for use in estimating closure term associated with
spline approximation
David LeBauer
Truncate spline at zero if P[x<0] < pnorm(-3) pnorm(-3) chosen as default value for min quantile because this is the default low end of range for the sensitivity analysis. This parameter could be determined based on minimum value in settings$sensitivity.analysis$quantiles
spline.truncate(x, min.quantile = stats::pnorm(-3))spline.truncate(x, min.quantile = stats::pnorm(-3))
x |
vector |
min.quantile |
threshold quantile for testing lower bound on variable |
either x or a vector with values < 0 converted to zero
David LeBauer
set.seed(0) x <- c(rgamma(998,1,1), rnorm(10)) min(x) # -0.5238 min(PEcAn.uncertainty::spline.truncate(x))set.seed(0) x <- c(rgamma(998,1,1), rnorm(10)) min(x) # -0.5238 min(PEcAn.uncertainty::spline.truncate(x))
Variance and SD(variance)
variance.stats(x)variance.stats(x)
x |
numeric vector |
calculates variance and sd of variance using the var from base R and sd.var from PEcAn.
list with variance and sd of variance
David LeBauer
Writes config files for use in meta-analysis and returns a list of run ids. Given a pft.xml object, a list of lists as supplied by get.sa.samples, a name to distinguish the output files, and the directory to place the files.
write.ensemble.configs( input_design, ensemble.size, defaults, ensemble.samples, settings, model, clean = FALSE, write.to.db = TRUE, restart = NULL, rename = FALSE )write.ensemble.configs( input_design, ensemble.size, defaults, ensemble.samples, settings, model, clean = FALSE, write.to.db = TRUE, restart = NULL, rename = FALSE )
input_design |
design matrix describing sampled inputs (see 'run.write.configs()'). Columns named after 'settings$run$inputs' tags give 1-based indices into each input's 'path' list and rows follow run order. Requires 'nrow(input_design) >= ensemble.size'; extra rows are ignored. |
ensemble.size |
size of ensemble |
defaults |
pft |
ensemble.samples |
list of lists supplied by get.ensemble.samples |
settings |
list of PEcAn settings |
model |
name of model to be run, e.g. "ED2" or "SIPNET" |
clean |
remove old output first? |
write.to.db |
logical: Record this run in BETY? |
restart |
In case this is a continuation of an old simulation. Needs to be a list with name tags of runid, inputs, new.params (parameters), new.state (initial condition), ensemble.id (ensemble id), start.time and stop.time. See Details. |
rename |
Decide if we want to rename previous output files, for example convert from sipnet.out to sipnet.2020-07-16.out. |
The restart functionality is developed using model specific functions by calling write_restart.modelname function. First, you need to make sure that this function is already exist for your desired model. See here https://pecanproject.github.io/pecan-documentation/latest/pecan-models.html. new state is a dataframe with a different column for each state variable. The number of the rows in this dataframe needs to be the same as the ensemble size. The state variables that you can use for setting up initial conditions are model specific. Check the documentation of the write_restart.<modelname> function for the model you are using. The units for the state variables need to be in the PEcAn standard units which can be found in standard_vars. new.params also has similar structure to ensemble.samples which is sent as an argument.
list, containing $runs = data frame of runids, $ensemble.id = the ensemble ID for these runs, and $samples with ids and samples used for each tag. Also writes sensitivity analysis configuration files as a side effect
David LeBauer, Carl Davidson, Hamze Dokoohaki
Writes config files for use in sensitivity analysis.
write.sa.configs( defaults, quantile.samples, settings, model, clean = FALSE, write.to.db = TRUE, input_design = NULL )write.sa.configs( defaults, quantile.samples, settings, model, clean = FALSE, write.to.db = TRUE, input_design = NULL )
defaults |
named list with default parameter values |
quantile.samples |
list of lists supplied by get.sa.samples |
settings |
list of settings |
model |
name of model to be run |
clean |
logical: Delete any existing contents of the directory specified
by |
write.to.db |
logical: Record this run to BETY? If TRUE, uses connection
settings specified in |
input_design |
data.frame coordinating input files across runs |
list, containing $runs = data frame of runids, and $ensemble.id = the ensemble ID for these runs. Also writes sensitivity analysis configuration files as a side effect
David LeBauer, Carl Davidson, Akash B V