Package 'PEcAn.MA'

Title: PEcAn Functions Used for Meta-Analysis
Description: The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PECAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. The PEcAn.MA package contains the functions used in the Bayesian meta-analysis of trait data.
Authors: Mike Dietze [aut], David LeBauer [aut, cre], Xiaohui Feng [aut], Dan Wang [aut], Carl Davidson [aut], Rob Kooper [aut], Shawn Serbin [aut], Shashank Singh [aut], University of Illinois, NCSA [cph]
Maintainer: David LeBauer <[email protected]>
License: BSD_3_clause + file LICENSE
Version: 1.7.5.9000
Built: 2026-06-05 14:57:07 UTC
Source: https://github.com/PecanProject/pecan

Help Index


Approximate posterior

Description

Approximate the posterior MCMC with a closed form pdf

Usage

approx.posterior(
  trait.mcmc,
  priors,
  trait.data = NULL,
  outdir = NULL,
  filename.flag = ""
)

Arguments

trait.mcmc

meta analysis outputs

priors

dataframe of priors used in meta analysis

trait.data

data used in meta-analysis (used for plotting)

outdir

directory in which to plot results

filename.flag

text to be included in the posteriors.pdf filename to make unique

Details

returns priors where posterior MCMC are missing

NOTE: this function is similar to PEcAn.priors::fit.dist

Value

posteriors data frame, similar to priors, but with closed form pdfs fit to meta-analysis results

Author(s)

David LeBauer, Carl Davidson, Mike Dietze

Examples

## Not run: 
  data('trait.mcmc', package = 'PEcAn.utils')
  data('prior.distns', package = 'PEcAn.utils')
  approx.posterior(trait.mcmc, priors = prior.distns)

## End(Not run)

Change treatments to sequential integers

Description

Assigns all control treatments the same value, then assigns unique treatments within each site. Each site is required to have a control treatment. The algorithm (incorrectly) assumes that each site has a unique set of experimental treatments. This assumption is required by the data in BETYdb that does not always consistently name treatments or quantity them in the managements table. Also it avoids having the need to estimate treatment by site interactions in the meta analysis model. This model uses data in the control treatment to estimate model parameters so the impact of the assumption is minimal.

Usage

assign_treatments(data)

Arguments

data

input data

Value

dataframe with sequential treatments

Author(s)

David LeBauer, Carl Davidson, Alexey Shiklomanov


Check that a data value is consistent with its prior

Description

Check that a data value is consistent with its prior

Usage

check_consistent(point, prior, p_error = 5e-04, p_warning = 0.025)

Arguments

point

(numeric) Data value to check

prior

list of distn, parama, paramb

p_error

(numeric) Probability value outside of which we raise an error

p_warning

(numeric) Probability value outside of which we raise a warning

Value

(c(no_error = , no_warning = ))


Prepare trait data for JAGS meta-analysis

Description

Convert queried data to format required by JAGS meta-analysis model

Usage

jagify(result, use_ghs = TRUE)

Arguments

result

input trait data

use_ghs

(Logical) If FALSE, exclude all greenhouse data. If TRUE, use all data, including greenhouse data.

Value

result transformed to meet requirements of PEcAn meta-analysis model

Author(s)

David LeBauer


Complete meta-analysis workflow for a single plant functional type (PFT)

Description

Complete meta-analysis workflow for a single plant functional type (PFT)

Usage

meta_analysis_standalone(
  trait_data,
  priors,
  iterations,
  outdir = file.path(tempdir(), "pecan-meta-analysis"),
  pft_name = NA_character_,
  random = TRUE,
  threshold = 1.2,
  use_ghs = TRUE,
  gamma_tau = 0.01
)

Arguments

trait_data

(list) Named list of trait data. List item names must be trait names (consistent with priors argument). List values are data.frames with the following required columns: name, mean statname, stat, greenhouse, n, site_id, specie_id, citation_id, cultivar_id, date, time, control

priors

(list) Named list of priors

iterations

(integer) Number of sampler iterations for MCMC analysis

outdir

(character; default = tempdir() / "pecan-meta-analysis") Path to directory where outputs will be stored.

pft_name

(character; default = NA) Name of PFT (for logging purposes).

random

(boolean; default = TRUE) Should random effects be used?

threshold

Gelman-Rubin convergence diagnostic (MGPRF) default = 1.2 following Bolker 2008 Ecological Models and Data in R

use_ghs

(boolean; default = TRUE) If TRUE, do not exclude greenhouse data

gamma_tau

(numeric; default = 0.01) Prior on gamma tau parameter

Value

(list) List of trait meta-analysis results, including:

  • trait.mcmc: MCMC samples

  • post.distns: Posterior distributions

  • jagged.data: "JAGS-ified" input data (after GHG screen, if applied)


find quantile of point within prior distribution

Description

compare point to prior distribution

Usage

p.point.in.prior(point, prior)

Arguments

point

quantile of given prior to return

prior

list of distn, parama, paramb

Details

used to compare data to prior, meta analysis posterior to prior

Value

result of ⁠p<distn>(point, parama, paramb)⁠

Author(s)

David LeBauer


Trait Meta-analysis

Description

Runs hierarchical meta-analysis of plant trait data

Usage

pecan.ma(
  trait.data,
  prior.distns,
  taupriors,
  j.iter,
  outdir,
  random = FALSE,
  overdispersed = TRUE,
  logfile = file.path(outdir, "meta-analysis.log"),
  verbose = TRUE
)

Arguments

trait.data

list of data.frames, one per trait for which data is available, generated by call to PEcAn.DB::query.traits(), and post-processed by jagify().

prior.distns

data.frame of prior distributions generated by call to PEcAn.DB::query.priors()

taupriors

priors on variance parameters, can be scaled as needed with data mean

j.iter

number of MCMC samples

outdir

output directory

random

use random effects, FALSE by default

overdispersed

TRUE by default, if set to FALSE, data mean will be used as starting point for MCMC chains (use with caution)

logfile

Path to file for sinking meta analysis output. If NULL, only print output to console.

verbose

Logical. If TRUE (default), print progress messages.

data

data frame generated by jagify function with indexed values for greenhouse, treatment, and site (ghs, trt, site) as well as Y, SE, and n for each observation or summary statistic.

Details

pecan.ma runs a hierarchical Bayesian meta-analytical model. This model combines prior information with data from studies on the particular species or group of interest. Data that is incorporated into the meta-analysis include the mean (Y), sample size (n), and precision (obs.prec). Where a set of data includes more than one level of treatment, comes from more than one site, or comes from both field and greenhouse studies, these variables are included as random (treatment, site) or fixed (greenhouse) effects. The pecan.ma function writes a model for each specific data set and prior using the write.ma.model() function to modify the ma.model.template.bug generic model.

Value

four chains with 5000 total samples from posterior

Author(s)

David LeBauer, Michael C. Dietze, Alexey Shiklomanov

Examples

## Not run: 
  # Setup
  con <- PEcAn.DB::db.open(list(host = "postgres", user = "bety", password = "carya"))
  pft <- "temperate.Early_Hardwood"
  pft_id <- PEcAn.DB::db.query("SELECT id FROM pfts WHERE name = $1", con,
  values = list(pft))[[1]]
  traits <- c("SLA", "Vcmax")
  trait_string <- paste(shQuote(traits), collapse = ",")
  
  # Load traits and priors from BETY
  species <- PEcAn.DB::query.pft_species(pft, con = con)
  trait.data <- PEcAn.DB::query.traits(species[["id"]], c("SLA", "Vcmax"), con = con)
  prior.distns <- PEcAn.DB::query.priors(pft_id, trait_string, con = con)
  
  # Pre-process data
  jagged.data <- lapply(trait.data, PEcAn.MA::jagify)
  taupriors <- list(tauA = 0.01,
  tauB = c(SLA = 1000, Vcmax = 1000))
  result <- pecan.ma(jagged.data, prior.distns, taupriors,
                     j.iter = 5000, outdir = tempdir())

## End(Not run)

Generate summary statistics and diagnostics for PEcAn meta.analysis

Description

Generate summary statistics and diagnostics for PEcAn meta.analysis

Usage

pecan.ma.summary(mcmc.object, pft, outdir, threshold = 1.2, gg = FALSE)

Arguments

mcmc.object

JAGS mcmc output from pecan.ma

pft

plant functional type

outdir

output directory

threshold

Gelman-Rubin convergence diagnostic (MGPRF) default = 1.2 following Bolker 2008 Ecological Models and Data in R

gg

produce extra diagnostic plots using the "ggmcmc" package? Caution: very slow!

Author(s)

David LeBauer, Shawn Serbin

Examples

## Not run: 
summary <- pecan.ma.summary(
 trait.mcmc,
 settings$pfts$pft,
 settings$outdir,
 settings$meta.analysis$threshold)

## End(Not run)

Renames the variables within output data frame trait.data

Description

Renames the variables within output data frame trait.data

Usage

rename_jags_columns(data)

Arguments

data

data frame to with variables to rename

Author(s)

David LeBauer

See Also

used with jagify;


Run meta-analysis across all PFTs

Description

Run meta-analysis across all PFTs

Usage

run.meta.analysis(
  pfts,
  iterations,
  random = TRUE,
  threshold = 1.2,
  dbfiles,
  database,
  use_ghs = TRUE,
  update = FALSE
)

Arguments

pfts

the list of pfts to get traits for

iterations

(integer) Number of sampler iterations for MCMC analysis

random

(boolean; default = TRUE) Should random effects be used?

threshold

Gelman-Rubin convergence diagnostic, passed on to pecan.ma.summary

dbfiles

(character) directory where previous results are found

database

database connection parameters

use_ghs

(boolean; default = TRUE) If TRUE, do not exclude greenhouse data

update

logical: Rerun the meta-analysis if result files already exist?

Value

nothing, as side effect saves trait.mcmc created by pecan.ma and post.distns created by approx.posterior(trait.mcmc, ...) to trait.mcmc.Rdata and post.distns.Rdata, respectively

Author(s)

Shawn Serbin, David LeBauer


"Workflow" version of run.meta.analysis.pft #' Run Bayesian meta-analysis for a single PFT (file-based wrapper)

Description

"Workflow" version of run.meta.analysis.pft #' Run Bayesian meta-analysis for a single PFT (file-based wrapper)

Usage

run.meta.analysis.pft(
  pft,
  iterations,
  random = TRUE,
  threshold = 1.2,
  dbfiles,
  dbcon,
  use_ghs = TRUE,
  update = FALSE
)

Arguments

pft

(list) PFT list object, as defined in settings. Must include the following: outdir, name, posteriorid

iterations

(integer) Number of sampler iterations for MCMC analysis

random

(boolean; default = TRUE) Should random effects be used?

threshold

Gelman-Rubin convergence diagnostic (MGPRF) default = 1.2 following Bolker 2008 Ecological Models and Data in R

dbfiles

(character) directory where previous results are found

dbcon

(DBI connection object) BETY database connection object

use_ghs

(boolean; default = TRUE) If TRUE, do not exclude greenhouse data

update

(boolean; default = FALSE) If TRUE, replace existing posteriors with new ones

Details

Upstream contract (reads from pft$outdir):

trait.data.Rdata

Named list of data frames produced by get.trait.data.pft. Loaded into trait_env$trait.data.

prior.distns.Rdata

Data frame of prior distributions produced by get.trait.data.pft. Loaded into prior_env$prior.distns.

File-based side effects (saved to pft$outdir):

trait.mcmc.Rdata

Contains trait.mcmc: a named list of mcmc.list objects (one per trait) with posterior MCMC samples from JAGS. Each element has columns beta.o (overall mean) and optionally sd.o (overall SD).

post.distns.MA.Rdata

Contains post.distns: a data frame with one row per trait and columns distn, parama, paramb, n summarizing the fitted posterior distribution.

post.distns.Rdata

Symlink to post.distns.MA.Rdata.

jagged.data.Rdata

Contains jagged.data: a named list of data frames (one per trait) formatted for use in the JAGS meta-analysis model (see jagify).

Downstream contract: The files trait.mcmc.Rdata and post.distns.Rdata are expected by get.parameter.samples (in PEcAn.uncertainty), which loads them to generate ensemble and sensitivity analysis samples.

Note: The core computation is performed by meta_analysis_standalone, which accepts and returns R objects directly — see its documentation for the pure-function interface.

Value

The pft list (invisibly), or NA if no trait data are available. The returned pft list is a named list with the following elements:

name

(character) PFT name, e.g. "temperate.deciduous".

outdir

(character) Path to directory where output files are stored (trait data, priors, posteriors, MCMC samples).

posteriorid

(integer) Row ID of the posterior record in BETYdb's posteriors table.

constants

(named list, optional) Trait values to treat as fixed constants, bypassing the meta-analysis.

The function's primary outputs are communicated through files saved in pft$outdir.


Run meta-analysis on all PFTs in a (list of) PEcAn settings

Description

Run meta-analysis on all PFTs in a (list of) PEcAn settings

Usage

runModule.run.meta.analysis(settings)

Arguments

settings

a PEcAn settings or MultiSettings object

Value

list of PFTs, invisibly; saves MA results to settings$pft$outdir as a side effect


Single MA

Description

Individual Meta-analysis

Usage

single.MA(
  data,
  j.chains,
  j.iter,
  tauA,
  tauB,
  prior,
  jag.model.file,
  overdispersed = TRUE
)

Arguments

data

data frame generated by jagify function with indexed values for greenhouse, treatment, and site (ghs, trt, site) as well as Y, SE, and n for each observation or summary statistic.

j.chains

number of chains in meta-analysis

j.iter

number of mcmc samples

tauA

prior on variance parameters

tauB

prior on variance parameters

prior

data.frame with columns named 'distn', 'parama', 'paramb' e.g. prior <- data.frame(distn = 'weibull', parama = 0.5, paramb = 10, n = 1)

jag.model.file

file to which model will be written

overdispersed

if TRUE (default), chains start at overdispersed locations in parameter space (recommended)

Details

Individual meta-analysis for a specific trait and PFT is run by the function single.MA. This will allow power analysis to run repeated MA outside of the full loop over traits and PFTs.

Value

jags.out, an mcmc.object with results of meta-analysis

Author(s)

David LeBauer, Michael C. Dietze


Function to remove NA values from database queries

Description

Transform NA values in data exported from BETYdb

Usage

transform_nas(data)

Arguments

data

input data

Value

A data frame NAs sensibly replaced


write.ma.model

Description

Convert template ma.model.template.R to a JAGS model.

Usage

write.ma.model(
  modelfile,
  outfile,
  reg.model,
  pr.dist,
  pr.param.a,
  pr.param.b,
  n,
  trt.n,
  site.n,
  ghs.n,
  tauA,
  tauB
)

Arguments

modelfile

model template file (ma.model.template.R)

outfile

file name of model created

reg.model

structure of regression model

pr.dist

A string representing the root distribution name used by R, e.g. 'norm', 'lnorm', 'gamma', 'beta', etc.

pr.param.a

first parameter value accepted by pr.dist

pr.param.b

second parameter value accepted by pr.dist

n

number of observations in data

trt.n

number of distinct treatments in data

site.n

number of distinct sites in data

ghs.n

= 1 if only non-greenhouse or greenhouse studies included, 2 if both

tauA

parameter a for gamma prior on precision

tauB

parameter b for gamma prior on precision

Details

Writes a meta-analysis model based on available data and prior specification. Inspired by the R2WinBUGS::write.model by Jouni Kerman and Uwe Ligges.

Value

Nothing, but as a side effect, the model is written

Author(s)

David LeBauer and Mike Dietze.