| Title: | PEcAn Functions Used for Meta-Analysis |
|---|---|
| Description: | The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PECAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. The PEcAn.MA package contains the functions used in the Bayesian meta-analysis of trait data. |
| Authors: | Mike Dietze [aut], David LeBauer [aut, cre], Xiaohui Feng [aut], Dan Wang [aut], Carl Davidson [aut], Rob Kooper [aut], Shawn Serbin [aut], Shashank Singh [aut], University of Illinois, NCSA [cph] |
| Maintainer: | David LeBauer <[email protected]> |
| License: | BSD_3_clause + file LICENSE |
| Version: | 1.7.5.9000 |
| Built: | 2026-06-05 14:57:07 UTC |
| Source: | https://github.com/PecanProject/pecan |
Approximate the posterior MCMC with a closed form pdf
approx.posterior( trait.mcmc, priors, trait.data = NULL, outdir = NULL, filename.flag = "" )approx.posterior( trait.mcmc, priors, trait.data = NULL, outdir = NULL, filename.flag = "" )
trait.mcmc |
meta analysis outputs |
priors |
dataframe of priors used in meta analysis |
trait.data |
data used in meta-analysis (used for plotting) |
outdir |
directory in which to plot results |
filename.flag |
text to be included in the posteriors.pdf filename to make unique |
returns priors where posterior MCMC are missing
NOTE: this function is similar to PEcAn.priors::fit.dist
posteriors data frame, similar to priors, but with closed form pdfs fit to meta-analysis results
David LeBauer, Carl Davidson, Mike Dietze
## Not run: data('trait.mcmc', package = 'PEcAn.utils') data('prior.distns', package = 'PEcAn.utils') approx.posterior(trait.mcmc, priors = prior.distns) ## End(Not run)## Not run: data('trait.mcmc', package = 'PEcAn.utils') data('prior.distns', package = 'PEcAn.utils') approx.posterior(trait.mcmc, priors = prior.distns) ## End(Not run)
Assigns all control treatments the same value, then assigns unique treatments within each site. Each site is required to have a control treatment. The algorithm (incorrectly) assumes that each site has a unique set of experimental treatments. This assumption is required by the data in BETYdb that does not always consistently name treatments or quantity them in the managements table. Also it avoids having the need to estimate treatment by site interactions in the meta analysis model. This model uses data in the control treatment to estimate model parameters so the impact of the assumption is minimal.
assign_treatments(data)assign_treatments(data)
data |
input data |
dataframe with sequential treatments
David LeBauer, Carl Davidson, Alexey Shiklomanov
Check that a data value is consistent with its prior
check_consistent(point, prior, p_error = 5e-04, p_warning = 0.025)check_consistent(point, prior, p_error = 5e-04, p_warning = 0.025)
point |
(numeric) Data value to check |
prior |
list of distn, parama, paramb |
p_error |
(numeric) Probability value outside of which we raise an error |
p_warning |
(numeric) Probability value outside of which we raise a warning |
(c(no_error =
Convert queried data to format required by JAGS meta-analysis model
jagify(result, use_ghs = TRUE)jagify(result, use_ghs = TRUE)
result |
input trait data |
use_ghs |
(Logical) If |
result transformed to meet requirements of PEcAn meta-analysis model
David LeBauer
Complete meta-analysis workflow for a single plant functional type (PFT)
meta_analysis_standalone( trait_data, priors, iterations, outdir = file.path(tempdir(), "pecan-meta-analysis"), pft_name = NA_character_, random = TRUE, threshold = 1.2, use_ghs = TRUE, gamma_tau = 0.01 )meta_analysis_standalone( trait_data, priors, iterations, outdir = file.path(tempdir(), "pecan-meta-analysis"), pft_name = NA_character_, random = TRUE, threshold = 1.2, use_ghs = TRUE, gamma_tau = 0.01 )
trait_data |
(list) Named list of trait data. List item names must be
trait names (consistent with |
priors |
(list) Named list of priors |
iterations |
(integer) Number of sampler iterations for MCMC analysis |
outdir |
(character; default = |
pft_name |
(character; default = NA) Name of PFT (for logging purposes). |
random |
(boolean; default = TRUE) Should random effects be used? |
threshold |
Gelman-Rubin convergence diagnostic (MGPRF) default = 1.2 following Bolker 2008 Ecological Models and Data in R |
use_ghs |
(boolean; default = TRUE) If TRUE, do not exclude greenhouse data |
gamma_tau |
(numeric; default = 0.01) Prior on gamma tau parameter |
(list) List of trait meta-analysis results, including:
trait.mcmc: MCMC samples
post.distns: Posterior distributions
jagged.data: "JAGS-ified" input data (after GHG screen, if applied)
compare point to prior distribution
p.point.in.prior(point, prior)p.point.in.prior(point, prior)
point |
quantile of given prior to return |
prior |
list of distn, parama, paramb |
used to compare data to prior, meta analysis posterior to prior
result of p<distn>(point, parama, paramb)
David LeBauer
Runs hierarchical meta-analysis of plant trait data
pecan.ma( trait.data, prior.distns, taupriors, j.iter, outdir, random = FALSE, overdispersed = TRUE, logfile = file.path(outdir, "meta-analysis.log"), verbose = TRUE )pecan.ma( trait.data, prior.distns, taupriors, j.iter, outdir, random = FALSE, overdispersed = TRUE, logfile = file.path(outdir, "meta-analysis.log"), verbose = TRUE )
trait.data |
list of |
prior.distns |
|
taupriors |
priors on variance parameters, can be scaled as needed with data mean |
j.iter |
number of MCMC samples |
outdir |
output directory |
random |
use random effects, |
overdispersed |
|
logfile |
Path to file for sinking meta analysis output. If
|
verbose |
Logical. If |
data |
data frame generated by jagify function with indexed values for greenhouse, treatment, and site (ghs, trt, site) as well as Y, SE, and n for each observation or summary statistic. |
pecan.ma runs a hierarchical Bayesian meta-analytical model.
This model combines prior information with data from studies on the particular species or group of interest.
Data that is incorporated into the meta-analysis include the mean (Y), sample size (n),
and precision (obs.prec).
Where a set of data includes more than one level of treatment, comes from more than one site,
or comes from both field and greenhouse studies, these variables are included as random (treatment, site)
or fixed (greenhouse) effects.
The pecan.ma function writes a model for each specific data set and prior using the write.ma.model()
function to modify the ma.model.template.bug generic model.
four chains with 5000 total samples from posterior
David LeBauer, Michael C. Dietze, Alexey Shiklomanov
## Not run: # Setup con <- PEcAn.DB::db.open(list(host = "postgres", user = "bety", password = "carya")) pft <- "temperate.Early_Hardwood" pft_id <- PEcAn.DB::db.query("SELECT id FROM pfts WHERE name = $1", con, values = list(pft))[[1]] traits <- c("SLA", "Vcmax") trait_string <- paste(shQuote(traits), collapse = ",") # Load traits and priors from BETY species <- PEcAn.DB::query.pft_species(pft, con = con) trait.data <- PEcAn.DB::query.traits(species[["id"]], c("SLA", "Vcmax"), con = con) prior.distns <- PEcAn.DB::query.priors(pft_id, trait_string, con = con) # Pre-process data jagged.data <- lapply(trait.data, PEcAn.MA::jagify) taupriors <- list(tauA = 0.01, tauB = c(SLA = 1000, Vcmax = 1000)) result <- pecan.ma(jagged.data, prior.distns, taupriors, j.iter = 5000, outdir = tempdir()) ## End(Not run)## Not run: # Setup con <- PEcAn.DB::db.open(list(host = "postgres", user = "bety", password = "carya")) pft <- "temperate.Early_Hardwood" pft_id <- PEcAn.DB::db.query("SELECT id FROM pfts WHERE name = $1", con, values = list(pft))[[1]] traits <- c("SLA", "Vcmax") trait_string <- paste(shQuote(traits), collapse = ",") # Load traits and priors from BETY species <- PEcAn.DB::query.pft_species(pft, con = con) trait.data <- PEcAn.DB::query.traits(species[["id"]], c("SLA", "Vcmax"), con = con) prior.distns <- PEcAn.DB::query.priors(pft_id, trait_string, con = con) # Pre-process data jagged.data <- lapply(trait.data, PEcAn.MA::jagify) taupriors <- list(tauA = 0.01, tauB = c(SLA = 1000, Vcmax = 1000)) result <- pecan.ma(jagged.data, prior.distns, taupriors, j.iter = 5000, outdir = tempdir()) ## End(Not run)
Generate summary statistics and diagnostics for PEcAn meta.analysis
pecan.ma.summary(mcmc.object, pft, outdir, threshold = 1.2, gg = FALSE)pecan.ma.summary(mcmc.object, pft, outdir, threshold = 1.2, gg = FALSE)
mcmc.object |
JAGS mcmc output from |
pft |
plant functional type |
outdir |
output directory |
threshold |
Gelman-Rubin convergence diagnostic (MGPRF) default = 1.2 following Bolker 2008 Ecological Models and Data in R |
gg |
produce extra diagnostic plots using the "ggmcmc" package? Caution: very slow! |
David LeBauer, Shawn Serbin
## Not run: summary <- pecan.ma.summary( trait.mcmc, settings$pfts$pft, settings$outdir, settings$meta.analysis$threshold) ## End(Not run)## Not run: summary <- pecan.ma.summary( trait.mcmc, settings$pfts$pft, settings$outdir, settings$meta.analysis$threshold) ## End(Not run)
Renames the variables within output data frame trait.data
rename_jags_columns(data)rename_jags_columns(data)
data |
data frame to with variables to rename |
David LeBauer
used with jagify;
Run meta-analysis across all PFTs
run.meta.analysis( pfts, iterations, random = TRUE, threshold = 1.2, dbfiles, database, use_ghs = TRUE, update = FALSE )run.meta.analysis( pfts, iterations, random = TRUE, threshold = 1.2, dbfiles, database, use_ghs = TRUE, update = FALSE )
pfts |
the list of pfts to get traits for |
iterations |
(integer) Number of sampler iterations for MCMC analysis |
random |
(boolean; default = TRUE) Should random effects be used? |
threshold |
Gelman-Rubin convergence diagnostic, passed on to
|
dbfiles |
(character) directory where previous results are found |
database |
database connection parameters |
use_ghs |
(boolean; default = TRUE) If TRUE, do not exclude greenhouse data |
update |
logical: Rerun the meta-analysis if result files already exist? |
nothing, as side effect saves trait.mcmc created by
pecan.ma and post.distns created by
approx.posterior(trait.mcmc, ...) to trait.mcmc.Rdata
and post.distns.Rdata, respectively
Shawn Serbin, David LeBauer
"Workflow" version of run.meta.analysis.pft #' Run Bayesian meta-analysis for a single PFT (file-based wrapper)
run.meta.analysis.pft( pft, iterations, random = TRUE, threshold = 1.2, dbfiles, dbcon, use_ghs = TRUE, update = FALSE )run.meta.analysis.pft( pft, iterations, random = TRUE, threshold = 1.2, dbfiles, dbcon, use_ghs = TRUE, update = FALSE )
pft |
(list) PFT list object, as defined in settings. Must include the
following: |
iterations |
(integer) Number of sampler iterations for MCMC analysis |
random |
(boolean; default = TRUE) Should random effects be used? |
threshold |
Gelman-Rubin convergence diagnostic (MGPRF) default = 1.2 following Bolker 2008 Ecological Models and Data in R |
dbfiles |
(character) directory where previous results are found |
dbcon |
(DBI connection object) BETY database connection object |
use_ghs |
(boolean; default = TRUE) If TRUE, do not exclude greenhouse data |
update |
(boolean; default = FALSE) If |
Upstream contract (reads from pft$outdir):
trait.data.RdataNamed list of data frames produced by
get.trait.data.pft. Loaded into trait_env$trait.data.
prior.distns.RdataData frame of prior distributions produced by
get.trait.data.pft. Loaded into prior_env$prior.distns.
File-based side effects (saved to pft$outdir):
trait.mcmc.RdataContains trait.mcmc: a named list of
mcmc.list objects (one per trait) with posterior MCMC samples from
JAGS. Each element has columns beta.o (overall mean) and optionally
sd.o (overall SD).
post.distns.MA.RdataContains post.distns: a data frame with
one row per trait and columns distn, parama, paramb, n
summarizing the fitted posterior distribution.
post.distns.RdataSymlink to post.distns.MA.Rdata.
jagged.data.RdataContains jagged.data: a named list of data
frames (one per trait) formatted for use in the JAGS meta-analysis
model (see jagify).
Downstream contract: The files trait.mcmc.Rdata and
post.distns.Rdata are expected by get.parameter.samples (in
PEcAn.uncertainty), which loads them to generate ensemble and sensitivity
analysis samples.
Note: The core computation is performed by meta_analysis_standalone,
which accepts and returns R objects directly — see its documentation for
the pure-function interface.
The pft list (invisibly), or NA if no trait data are available.
The returned pft list is a named list with the following elements:
name(character) PFT name, e.g. "temperate.deciduous".
outdir(character) Path to directory where output files are stored (trait data, priors, posteriors, MCMC samples).
posteriorid(integer) Row ID of the posterior record in
BETYdb's posteriors table.
constants(named list, optional) Trait values to treat as fixed constants, bypassing the meta-analysis.
The function's primary outputs are communicated through files saved in
pft$outdir.
Run meta-analysis on all PFTs in a (list of) PEcAn settings
runModule.run.meta.analysis(settings)runModule.run.meta.analysis(settings)
settings |
a PEcAn settings or MultiSettings object |
list of PFTs, invisibly;
saves MA results to settings$pft$outdir as a side effect
Individual Meta-analysis
single.MA( data, j.chains, j.iter, tauA, tauB, prior, jag.model.file, overdispersed = TRUE )single.MA( data, j.chains, j.iter, tauA, tauB, prior, jag.model.file, overdispersed = TRUE )
data |
data frame generated by jagify function with indexed values for greenhouse, treatment, and site (ghs, trt, site) as well as Y, SE, and n for each observation or summary statistic. |
j.chains |
number of chains in meta-analysis |
j.iter |
number of mcmc samples |
tauA |
prior on variance parameters |
tauB |
prior on variance parameters |
prior |
data.frame with columns named 'distn', 'parama', 'paramb'
e.g. |
jag.model.file |
file to which model will be written |
overdispersed |
if TRUE (default), chains start at overdispersed locations in parameter space (recommended) |
Individual meta-analysis for a specific trait and PFT is run by the function single.MA. This will allow power analysis to run repeated MA outside of the full loop over traits and PFTs.
jags.out, an mcmc.object with results of meta-analysis
David LeBauer, Michael C. Dietze
Transform NA values in data exported from BETYdb
transform_nas(data)transform_nas(data)
data |
input data |
A data frame NAs sensibly replaced
Convert template ma.model.template.R to a JAGS model.
write.ma.model( modelfile, outfile, reg.model, pr.dist, pr.param.a, pr.param.b, n, trt.n, site.n, ghs.n, tauA, tauB )write.ma.model( modelfile, outfile, reg.model, pr.dist, pr.param.a, pr.param.b, n, trt.n, site.n, ghs.n, tauA, tauB )
modelfile |
model template file (ma.model.template.R) |
outfile |
file name of model created |
reg.model |
structure of regression model |
pr.dist |
A string representing the root distribution name used by R, e.g. 'norm', 'lnorm', 'gamma', 'beta', etc. |
pr.param.a |
first parameter value accepted by |
pr.param.b |
second parameter value accepted by |
n |
number of observations in data |
trt.n |
number of distinct treatments in data |
site.n |
number of distinct sites in data |
ghs.n |
= 1 if only non-greenhouse or greenhouse studies included, 2 if both |
tauA |
parameter a for gamma prior on precision |
tauB |
parameter b for gamma prior on precision |
Writes a meta-analysis model based on available data and prior specification.
Inspired by the R2WinBUGS::write.model by Jouni Kerman and Uwe Ligges.
Nothing, but as a side effect, the model is written
David LeBauer and Mike Dietze.