# package names
packages<-c("readr","ggplot2","dplyr", "here", "mcp", "lubridate", "knitr", "ezknitr", "loo", "flextable")
# install any packages not previously installed
installed_packages<-packages %in% rownames(installed.packages())
if(any(installed_packages == FALSE)){
install.packages(packages[!installed_packages])
}
# load packages
invisible(lapply(packages, library, character.only = TRUE))
Load in data
df<-read_csv(here("DRUM_materials/data_files_for_DRUM/moose.csv"))
head(df)
## # A tibble: 6 x 2
## yday nsd
## <dbl> <dbl>
## 1 121 0
## 2 121 22986
## 3 121 31330
## 4 121 29005
## 5 121 51125
## 6 121 50098
Take an average NSD value per day
df<-df %>%
group_by(yday) %>%
mutate(daily_NSD=mean(nsd))
df$daily_sqrt_NSD_km<-sqrt(df$daily_NSD)/1000
m1<-list(
daily_sqrt_NSD_km~1,
~1)
f1<-mcp(m1, df, par_x = "yday")
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 684
## Unobserved stochastic nodes: 4
## Total graph size: 1698
##
## Initializing model
m2<-list(
daily_sqrt_NSD_km~1,
~1+sigma(1))
f2<-mcp(m2, df, par_x="yday")
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 684
## Unobserved stochastic nodes: 5
## Total graph size: 1881
##
## Initializing model
Cross-validation
f1$loo<-loo(f1)
f2$loo<-loo(f2)
loos<-loo_compare(f1$loo, f2$loo)
loo_df<-as.data.frame(loos[1:2, 1:2])
loo_df[,1:2]<-lapply(loo_df[,1:2], round, 3)
loo_df<-cbind.data.frame('Model Syntax'=c("Change in Mean and Variance", "Change in Mean"), loo_df)
loo_df %>% flextable() %>%
align(part = "all") %>% # left align
set_header_labels(values=list("Model Syntax" = "Model Syntax",
"elpd_diff" = "ELPD Difference",
"se_diff" = "SE of Difference")) %>%
#set_caption(caption = "Add caption for Table 1 here") %>%
# font(fontname = "Calibri (Body)", part = "all") %>%
fontsize(size = 12, part = "body") %>%
theme_booktabs() %>% # default theme
border_outer() %>%
vline(part="header") %>%
bg(part="header", bg="#dbdbdb") %>%
bold(part="header") %>%
autofit()
Model Syntax | ELPD Difference | SE of Difference |
Change in Mean and Variance | 0 | 0.0 |
Change in Mean | -285 | 15.8 |
Second model with a change in both mean and variance is the better fit.
Plot fitted model over data.
plot(f2, q_fit = T, q_predict = T)+
xlab("Julian Date")+ylab("Displacement (in km)")+
theme_bw(base_size = 16)+
theme(plot.title = element_text(face="bold"))+
geom_rect(aes(xmin=132, xmax=133, ymin=0, ymax=Inf), fill="green",alpha=0.005, size=0.1)+
theme(plot.title=element_text(hjust=0.5))
Displacement (distance from each location to the location at the start of the observation period) for a pregnant moose in Minnesota. The green area is the timeperiod identified by researchers as immediately preceding parturition. Grey lines represent 25 draws from the posterior distribution of the mean displacement. Red dotted lines depict 95% credible intervals for the mean displacement and green dotted lines depict 95% prediction intervals. The posterior distribution for the change point is shown in blue on the x-axis.
Posterior predictive checks, a common method used to test goodness of fit, generate data from the fitted model by simulating from the posterior predictive distribution to visualize how well a model matches the observed values. Using the fitted model for moose movement, we demonstrate a graphical method to compare y, the observed response dataset, and y^{rep}, replicated datasets simulated from the fitted model. The two peaks at 3 km and 12 km correspond to the two intercepts fit by the model and overall, the observed data conform well to the replicated datasets.
pp_check(f2)+ggtitle("Kernel Density Posterior Predictive Check")+
ylab("Density")+xlab("Displacement (in km)")+
theme_bw(base_size = 18)+
theme(plot.title = element_text(face="bold"))+
theme(plot.title = element_text(hjust=0.5))
A kernel density posterior predictive check compares the distribution of observed outcomes (displacement in kilometers by a pregnant moose), shown in the black line, against 50 kernel densities of replicated datasets produced by the fitted model, each shown as a light blue line.
sessionInfo()
## R version 4.0.2 (2020-06-22)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19043)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] flextable_0.6.8 loo_2.4.1 ezknitr_0.6 knitr_1.30
## [5] lubridate_1.7.9 mcp_0.3.0.9000 here_0.1 dplyr_1.0.2
## [9] ggplot2_3.3.2 readr_1.3.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.8 lattice_0.20-41 tidyr_1.1.2
## [4] assertthat_0.2.1 rprojroot_1.3-2 digest_0.6.28
## [7] utf8_1.1.4 plyr_1.8.6 R6_2.4.1
## [10] ggridges_0.5.3 backports_1.1.10 HDInterval_0.2.2
## [13] evaluate_0.14 coda_0.19-3 pillar_1.4.6
## [16] tidybayes_2.3.1 gdtools_0.2.2 rlang_0.4.10
## [19] uuid_0.1-4 data.table_1.13.0 jquerylib_0.1.4
## [22] rjags_4-10 checkmate_2.0.0 rmarkdown_2.11
## [25] labeling_0.3 stringr_1.4.0 munsell_0.5.0
## [28] compiler_4.0.2 xfun_0.26 pkgconfig_2.0.3
## [31] systemfonts_0.2.3 base64enc_0.1-3 htmltools_0.5.2
## [34] tidyselect_1.1.0 tibble_3.0.3 arrayhelpers_1.1-0
## [37] matrixStats_0.57.0 fansi_0.4.1 crayon_1.3.4
## [40] withr_2.3.0 distributional_0.2.2 ggdist_2.4.0
## [43] grid_4.0.2 jsonlite_1.7.1 gtable_0.3.0
## [46] lifecycle_0.2.0 magrittr_2.0.1 scales_1.1.1
## [49] zip_2.1.1 cli_2.0.2 stringi_1.5.3
## [52] reshape2_1.4.4 farver_2.0.3 xml2_1.3.2
## [55] bslib_0.3.1 ellipsis_0.3.1 generics_0.0.2
## [58] vctrs_0.3.4 tools_4.0.2 forcats_0.5.0
## [61] svUnit_1.0.3 glue_1.4.2 officer_0.4.0
## [64] purrr_0.3.4 hms_0.5.3 parallel_4.0.2
## [67] fastmap_1.1.0 yaml_2.2.1 colorspace_1.4-1
## [70] bayesplot_1.8.0 patchwork_1.1.1 sass_0.4.0