moose_example

# package names
packages<-c("readr","ggplot2","dplyr", "here", "mcp", "lubridate", "knitr", "ezknitr", "loo", "flextable")

# install any packages not previously installed
installed_packages<-packages %in% rownames(installed.packages())
if(any(installed_packages == FALSE)){
  install.packages(packages[!installed_packages])
}

# load packages
invisible(lapply(packages, library, character.only = TRUE))

Load in data

df<-read_csv(here("DRUM_materials/data_files_for_DRUM/moose.csv"))
head(df)

## # A tibble: 6 x 2
##    yday   nsd
##   <dbl> <dbl>
## 1   121     0
## 2   121 22986
## 3   121 31330
## 4   121 29005
## 5   121 51125
## 6   121 50098

Take an average NSD value per day

df<-df %>% 
  group_by(yday) %>% 
  mutate(daily_NSD=mean(nsd))

df$daily_sqrt_NSD_km<-sqrt(df$daily_NSD)/1000

m1<-list(
  daily_sqrt_NSD_km~1,
                   ~1)

f1<-mcp(m1, df, par_x = "yday")

## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 684
##    Unobserved stochastic nodes: 4
##    Total graph size: 1698
## 
## Initializing model

m2<-list(
  daily_sqrt_NSD_km~1,
                   ~1+sigma(1))

f2<-mcp(m2, df, par_x="yday")

## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 684
##    Unobserved stochastic nodes: 5
##    Total graph size: 1881
## 
## Initializing model

Cross-validation

f1$loo<-loo(f1)
f2$loo<-loo(f2)

loos<-loo_compare(f1$loo, f2$loo)
loo_df<-as.data.frame(loos[1:2, 1:2])
loo_df[,1:2]<-lapply(loo_df[,1:2], round, 3)
loo_df<-cbind.data.frame('Model Syntax'=c("Change in Mean and Variance", "Change in Mean"), loo_df)

loo_df %>%  flextable() %>% 
  align(part = "all") %>% # left align
  set_header_labels(values=list("Model Syntax" = "Model Syntax", 
                                "elpd_diff" = "ELPD Difference", 
                                "se_diff" = "SE of Difference")) %>% 
  #set_caption(caption = "Add caption for Table 1 here") %>% 
  # font(fontname = "Calibri (Body)", part = "all") %>%
  fontsize(size = 12, part = "body") %>% 
  theme_booktabs() %>% # default theme
  border_outer() %>% 
  vline(part="header") %>% 
  bg(part="header", bg="#dbdbdb") %>% 
  bold(part="header") %>% 
  autofit()

Model Syntax	ELPD Difference	SE of Difference
Change in Mean and Variance	0	0.0
Change in Mean	-285	15.8

Second model with a change in both mean and variance is the better fit.

Plot fitted model over data.

plot(f2, q_fit = T, q_predict = T)+
   xlab("Julian Date")+ylab("Displacement (in km)")+
   theme_bw(base_size = 16)+
   theme(plot.title = element_text(face="bold"))+
   geom_rect(aes(xmin=132, xmax=133, ymin=0, ymax=Inf), fill="green",alpha=0.005, size=0.1)+
   theme(plot.title=element_text(hjust=0.5))

Displacement (distance from each location to the location at the start of the observation period) for a pregnant moose in Minnesota. The green area is the timeperiod identified by researchers as immediately preceding parturition. Grey lines represent 25 draws from the posterior distribution of the mean displacement. Red dotted lines depict 95% credible intervals for the mean displacement and green dotted lines depict 95% prediction intervals. The posterior distribution for the change point is shown in blue on the x-axis.

Posterior predictive check

Posterior predictive checks, a common method used to test goodness of fit, generate data from the fitted model by simulating from the posterior predictive distribution to visualize how well a model matches the observed values. Using the fitted model for moose movement, we demonstrate a graphical method to compare y, the observed response dataset, and y^{rep}, replicated datasets simulated from the fitted model. The two peaks at 3 km and 12 km correspond to the two intercepts fit by the model and overall, the observed data conform well to the replicated datasets.

pp_check(f2)+ggtitle("Kernel Density Posterior Predictive Check")+
  ylab("Density")+xlab("Displacement (in km)")+
  theme_bw(base_size = 18)+
  theme(plot.title = element_text(face="bold"))+
  theme(plot.title = element_text(hjust=0.5))

A kernel density posterior predictive check compares the distribution of observed outcomes (displacement in kilometers by a pregnant moose), shown in the black line, against 50 kernel densities of replicated datasets produced by the fitted model, each shown as a light blue line.

sessionInfo()

## R version 4.0.2 (2020-06-22)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19043)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] flextable_0.6.8 loo_2.4.1       ezknitr_0.6     knitr_1.30     
##  [5] lubridate_1.7.9 mcp_0.3.0.9000  here_0.1        dplyr_1.0.2    
##  [9] ggplot2_3.3.2   readr_1.3.1    
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.8           lattice_0.20-41      tidyr_1.1.2         
##  [4] assertthat_0.2.1     rprojroot_1.3-2      digest_0.6.28       
##  [7] utf8_1.1.4           plyr_1.8.6           R6_2.4.1            
## [10] ggridges_0.5.3       backports_1.1.10     HDInterval_0.2.2    
## [13] evaluate_0.14        coda_0.19-3          pillar_1.4.6        
## [16] tidybayes_2.3.1      gdtools_0.2.2        rlang_0.4.10        
## [19] uuid_0.1-4           data.table_1.13.0    jquerylib_0.1.4     
## [22] rjags_4-10           checkmate_2.0.0      rmarkdown_2.11      
## [25] labeling_0.3         stringr_1.4.0        munsell_0.5.0       
## [28] compiler_4.0.2       xfun_0.26            pkgconfig_2.0.3     
## [31] systemfonts_0.2.3    base64enc_0.1-3      htmltools_0.5.2     
## [34] tidyselect_1.1.0     tibble_3.0.3         arrayhelpers_1.1-0  
## [37] matrixStats_0.57.0   fansi_0.4.1          crayon_1.3.4        
## [40] withr_2.3.0          distributional_0.2.2 ggdist_2.4.0        
## [43] grid_4.0.2           jsonlite_1.7.1       gtable_0.3.0        
## [46] lifecycle_0.2.0      magrittr_2.0.1       scales_1.1.1        
## [49] zip_2.1.1            cli_2.0.2            stringi_1.5.3       
## [52] reshape2_1.4.4       farver_2.0.3         xml2_1.3.2          
## [55] bslib_0.3.1          ellipsis_0.3.1       generics_0.0.2      
## [58] vctrs_0.3.4          tools_4.0.2          forcats_0.5.0       
## [61] svUnit_1.0.3         glue_1.4.2           officer_0.4.0       
## [64] purrr_0.3.4          hms_0.5.3            parallel_4.0.2      
## [67] fastmap_1.1.0        yaml_2.2.1           colorspace_1.4-1    
## [70] bayesplot_1.8.0      patchwork_1.1.1      sass_0.4.0

moose_example

David

3/11/2022

Posterior predictive check