--- title: "Basic meta-analysis example" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Basic meta-analysis example} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r} library("PEcAn.MA") library("tibble", include.only = "tribble") set.seed(20251222) ``` An example of a prior distributions table, demonstrating the expected format. ```{r} prior_distns <- tribble( ~row_name , ~distn , ~parama , ~paramb , ~n , #nolint "growth_resp_factor" , "beta" , 2.63 , 6.52 , 0 , #nolint "leaf_turnover_rate" , "weibull" , 1.37 , 1.43 , 363 , #nolint "root_respiration_rate" , "unif" , 0.00 , 100.00 , NA , #nolint "root_turnover_rate" , "unif" , 0.00 , 10.00 , NA , #nolint "Amax" , "unif" , 0.00 , 40.00 , NA , #nolint "leaf_respiration_rate_m2" , "weibull" , 2.00 , 6.00 , NA , #nolint "SLA" , "lnorm" , 1.89 , 0.61 , 455 , #nolint "leafC" , "norm" , 50.60 , 1.32 , 291 , #nolint "Vm_low_temp" , "norm" , 0.00 , 3.00 , NA , #nolint "AmaxFrac" , "unif" , 0.60 , 0.90 , NA , #nolint "psnTOpt" , "unif" , 5.00 , 40.00 , NA , #nolint "stem_respiration_rate" , "unif" , 0.00 , 100.00 , NA , #nolint "extinction_coefficient" , "unif" , 0.38 , 0.62 , NA , #nolint "half_saturation_PAR" , "unif" , 4.00 , 27.00 , NA , #nolint "dVPDSlope" , "unif" , 0.01 , 0.25 , NA , #nolint "dVpdExp" , "unif" , 1.00 , 3.00 , NA , #nolint "veg_respiration_Q10" , "unif" , 1.40 , 2.60 , NA , #nolint "fine_root_respiration_Q10" , "unif" , 1.40 , 5.00 , NA , #nolint "coarse_root_respiration_Q10" , "unif" , 1.40 , 5.00 , NA , #nolint ) ``` `PEcAn.MA` expects the code to be a base R `data.frame` with trait names as row names. ```{r} priors <- as.data.frame(prior_distns) rownames(priors) <- priors[["row_name"]] priors[["row_name"]] <- NULL ``` The format of the `trait_data` is a named list of `data.frame`s, with each `data.frame` containing data for the corresponding trait. PEcAn database queries return all the columns the meta-analysis needs, in the expected format. If you are providing your own trait data, you will need the following columns: - `name` (character) -- A string description of the trait - `mean` (numeric) -- The mean value of the trait measurement (or the only value, if only one value is given) - `statname` (character) -- error statistic type (e.g., "SE" for standard error) - `stat` (numeric) -- value of error statistic - `n` (integer or NA) -- sample size - `greenhouse` (boolean) -- `TRUE` if from a greenhouse; `FALSE` if not (e.g., natural setting) - `control` (boolean) -- If `TRUE`, this is the "control" part of an experiment (or there is no experiment). If `FALSE`, this is the treatment. - `site_id` (integer) -- site ID (used for grouping) - `specie_id` (integer) -- species ID (as above) - `citation_id` (integer) -- citation ID (as above) - `treatment_id` (integer) -- treatment ID (as above) - `cultivar_id` (integer / NA) -- cultivar ID (as above) - `date` (datetime / NA) -- date of trait observation (as above) - `time` (datetime / NA) -- time of trait observation (as above) To avoid having to load custom data, here is some simulated data that fits these criteria. ```{r} simulate_data <- function(n_rows, mean_lo, mean_hi, se_lo, se_hi) { return(data.frame( name = sprintf("t%02d", seq_len(n_rows)), mean = runif(n_rows, mean_lo, mean_hi), statname = "SE", stat = suppressWarnings(runif(n_rows, se_lo, se_hi)), n = sample(1:100, size = n_rows, replace = TRUE), greenhouse = sample( c(TRUE, FALSE), size = n_rows, replace = TRUE, prob = c(0.1, 0.9) ), control = TRUE, site_id = sample(1:10, size = n_rows, replace = TRUE), specie_id = sample(1:4, size = n_rows, replace = TRUE), citation_id = sample(1:8, size = n_rows, replace = TRUE), treatment_id = sample(1:3, size = n_rows, replace = TRUE), cultivar_id = NA, date = NA, time = NA )) } n_rows <- 40 trait_data <- list() trait_data[["Amax"]] <- simulate_data(n_rows, 4.0, 20.0, 0.13, 4.0) trait_data[["SLA"]] <- simulate_data(n_rows, 3.0, 15.0, 0.06, 1.7) trait_data[["leaf_turnover_rate"]] <- simulate_data(n_rows, 0.1, 0.5, 0.01, 0.05) ``` Run the meta analysis, performing prior and posterior checks and summarizing the results. ```{r} ma_result <- meta_analysis_standalone(trait_data, priors, iterations = 1000) ``` This returns three things: ```{r} names(ma_result) ``` (1) The posterior distributions... ```{r} ma_result[["post.distns"]] ``` ...(2) the full MCMC samples (as a list)... ```{r} tail(ma_result[["trait.mcmc"]][["Amax"]][[1]]) ``` ...(3) and the "JAGS-ified" trait data. ```{r} head(ma_result[["jagged.data"]][["Amax"]]) ```