Purpose

The purpose of this document is to illustrate how a simple RSF and SSF with random effects can be fitted to tracking data. We use a data set containing fisher locations from:

Load libraries and prepare data

library(glmmTMB)
library(INLA)
library(tidyverse)
library(raster)
library(survival)
library(TwoStepCLogit)
library(amt)

First load the fisher data (this is the data as it was downloaded from movebank). We simplified the fisher data slightly, by running the following code prior to upload the data (to save space). This code is not needed, unless you download the original data from: https://www.datarepository.movebank.org/handle/10255/move.330.

dat <- read_csv("Martes pennanti LaPoint New York.csv") %>% 
  filter(!is.na(`location-lat`)) %>% 
  select(x = `location-long`, y = `location-lat`, 
         t = `timestamp`, id = `tag-local-identifier`) %>% 
  filter(id %in% c(1465, 1466, 1072, 1078, 1016, 1469))
write_csv(dat, "Fisher_analysis/fisher_data.csv")

Now lets start by reading in the simplified fisher data

dat <- read_csv("fisher_data.csv")

Include sex of each animal and create tracks with an appropriate coordinate reference system using the amt package

dat_all <- dat %>% nest(-id) 
dat_all$sex <- c("f", "f", "f", "m", "m", "m")
dat_all <- dat_all %>% 
  mutate(trk = map(data, function(d) {
    make_track(d, x, y, t, crs = sp::CRS("+init=epsg:4326")) %>% 
      transform_coords(sp::CRS("+init=epsg:5070"))
  }))

Summarize sampling rates,

dat_all %>% mutate(sr = lapply(trk, summarize_sampling_rate)) %>% 
  select(id, sr) %>% unnest
## # A tibble: 6 x 10
##      id   min    q1 median  mean    q3   max    sd     n unit 
##   <int> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <int> <chr>
## 1  1072 3.92   9.8   10.0  20.7  10.3  1650.  95.2  1348 min  
## 2  1465 0.4    1.97   2.03  8.93  3.98 1080.  48.1  3003 min  
## 3  1466 0.417  1.97   2.07 15.7   4.08 2167. 104.   1500 min  
## 4  1078 1.35   9.78  10.0  21.7  10.3  2111.  87.2  1637 min  
## 5  1469 0.417  1.97   2.17 13.3   5.42 2889.  90.2  2435 min  
## 6  1016 0.1    1.93   2.03  8.04  2.57 1209.  44.0  8957 min

10 minutes seems to appropriate for all animals. Resample the track to 10 minutes with a tolerance of 2 minutes.

dat1 <- dat_all %>% mutate(dat_clean = map(trk, ~ {
  .x %>% track_resample(rate = minutes(10), tolerance = seconds(120))
  }))

Read in the landuse raster and reclassify to two categories (wet forests and other).

landuse <- raster("landuse_study_area.tif")
wet_forests <- landuse %in% c(90, 95)
names(wet_forests) <- "forest"

Resource Selection Functions (RSF)

Data development for RSF

Now start with an RSF by creating random points per animal and extract the covariates for the observed and random points.

dat_rsf <- dat1 %>% mutate(rp = map(dat_clean, ~ .x %>% random_points() %>% 
      extract_covariates(wet_forests))) %>% 
  select(id, rp) %>%  unnest()

Change id column, to 1:6

dat_rsf$id <- as.numeric(factor(dat_rsf$id))

Make response numeric (required for INLA)

dat_rsf$y <- as.numeric(dat_rsf$case_)

We use a weighted likelihood for to fit the RSF. To this end, we need to create a variable for the weights, where used points (case_ = TRUE) keep weight 1, and available points (case_ = FALSE) obtain a large weight \(W\) (here \(W=1000\)):

dat_rsf$weight <- 1000^(1 - dat_rsf$case_)

Mixed RSFs

glmmTMB()

As explained in the manuscript (Section 3.4), we recommend to manually fix the variance of the random intercept at a large value. This can be done in glmmTMB() by first setting up the model, but do not yet fit it:

fisher.tmp <- glmmTMB(case_ ~ forest + (1|id) + (0 + forest |id) , family=binomial(), data = dat_rsf,
                         doFit=FALSE, weights = weight)

Then fix the standard deviation of the first random term, which is the (1|id) component in the above model equation. We use \(\sigma=10^3\), which corresponds to a variance of \(10^6\):

fisher.tmp$parameters$theta[1] <- log(1e3)

We need to tell glmmTMB not to change the first entry of the vector of variances, and give all other variances another indicator to make sure they can be freely estimated:

fisher.tmp$mapArg <- list(theta=factor(c(NA, 1)))

Then fit the model and look at the results:

fisher.rsf <- glmmTMB:::fitTMB(fisher.tmp)
summary(fisher.rsf)
##  Family: binomial  ( logit )
## Formula:          case_ ~ forest + (1 | id) + (0 + forest | id)
## Data: dat_rsf
## Weights: weight
## 
##      AIC      BIC   logLik deviance df.resid 
## 167133.4 167161.6 -83563.7 167127.4    90978 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance  Std.Dev. 
##  id     (Intercept) 1.000e+06 1000.0000
##  id.1   forest      2.569e-01    0.5068
## Number of obs: 90981, groups:  id, 6
## 
## Conditional model:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -0.000185 408.280851   0.000        1    
## forest        0.902882   0.209202   4.316 1.59e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

INLA

Let us now carry the analysis with random intercept \(\mathsf{N}(0,\sigma_{id}^2)\) and fixed variance \(\sigma_{id}^2=10^6\) using INLA. A peculiarity of INLA is that the same variable cannot be used more than once. So for ID we need to generate a new (but identical) variable

dat_rsf$id1 <-dat_rsf$id

For the fixed effects we use the INLA (default) priors \(\beta \sim \mathsf{N}(0,\sigma_\beta^2)\) with \(\sigma_\beta^2=10^4\). The precisions of the priors are thus set to:

prec.beta.forest  <- 1e-4  

We now store the INLA formula with the fixed effects forest, plus two random effects, namely one for the individual-specific intercept and one for the individual-specific slope for forest. Note that the precision (thus \(1/\sigma^2\)) for id is fixed (fixed=TRUE) at the value of \(10^{-6}\) (thus the variance is fixed at \(10^6\)). The other precision is given a PC(1,0.05) prior:

formula.inla <- y ~  forest + 
  f(id, model="iid", hyper=list(theta = list(initial=log(1e-6),fixed=TRUE))) +
  f(id1,forest,values=1:6,model="iid",
    hyper=list(theta=list(initial=log(1),fixed=F,prior="pc.prec",param=c(1,0.05)))) 

The actual INLA call is then given as follows:

inla.setOption(enable.inla.argument.weights=TRUE)
fisher.inla  <- inla(formula.inla, family ="binomial", data=dat_rsf, weights=dat_rsf$weight,
                        control.fixed = list(
                          mean = 0,
                          prec = list(forest = prec.beta.forest)
                       )
)

The summary for the posterior distribution of the fixed effects is given as follows:

fisher.inla$summary.fixed
##                   mean          sd   0.025quant   0.5quant 0.975quant
## (Intercept) -9.4834515 418.0462496 -830.2491981 -9.4952241 810.597320
## forest       0.9023449   0.2522929    0.3916488  0.9030925   1.408816
##                   mode          kld
## (Intercept) -9.4834515 2.009582e-14
## forest       0.9046724 5.460981e-04

Since variances are parameterized and treated as precisions, the summary of the respective posterior distributions is given for the precisions:

fisher.inla$summary.hyperpar
##                      mean       sd 0.025quant 0.5quant 0.975quant     mode
## Precision for id1 3.57532 2.043031  0.8818724 3.172212   8.681721 3.177542

Source R functions for calculating posterior means and medians of the precisions.

source("inla_emarginal.R")
source("inla_mmarginal.R")
inla_emarginal(fisher.inla)
## Mean of variance for id1 
##                0.3826514
inla_mmarginal(fisher.inla)
## Mode of variance for id1 
##                0.3139998

Step-Selection Function (SSF)

Data development for step-selection function

First we need to prepare the data. We need to pair each observed point with 10 random points and extract the covariate value at the end point of each step.

dat_ssf <- dat1 %>% 
  mutate(stps = map(dat_clean, ~ .x %>% steps_by_burst() %>% 
                      random_steps() %>% extract_covariates(wet_forests))) %>% 
  select(id, stps) %>% unnest() %>% 
  mutate(
    y = as.numeric(case_),
    id = as.numeric(factor(id)), 
    step_id = paste0(id, step_id_, sep = "-"))
dat_ssf
## # A tibble: 58,608 x 16
##       id burst_ step_id_ case_    x1_    y1_    x2_    y2_
##    <dbl>  <dbl>    <int> <lgl>  <dbl>  <dbl>  <dbl>  <dbl>
##  1     2      4        1 TRUE  1.78e6 2.41e6 1.78e6 2.41e6
##  2     2      4        1 FALSE 1.78e6 2.41e6 1.78e6 2.41e6
##  3     2      4        1 FALSE 1.78e6 2.41e6 1.78e6 2.41e6
##  4     2      4        1 FALSE 1.78e6 2.41e6 1.78e6 2.41e6
##  5     2      4        1 FALSE 1.78e6 2.41e6 1.78e6 2.41e6
##  6     2      4        1 FALSE 1.78e6 2.41e6 1.78e6 2.41e6
##  7     2      4        1 FALSE 1.78e6 2.41e6 1.78e6 2.41e6
##  8     2      4        1 FALSE 1.78e6 2.41e6 1.78e6 2.41e6
##  9     2      4        1 FALSE 1.78e6 2.41e6 1.78e6 2.41e6
## 10     2      4        1 FALSE 1.78e6 2.41e6 1.78e6 2.41e6
## # … with 58,598 more rows, and 8 more variables: t1_ <dttm>, t2_ <dttm>,
## #   dt_ <drtn>, sl_ <dbl>, ta_ <dbl>, forest <dbl>, y <dbl>, step_id <chr>

Mixed SSFs

2StepCLogit

The two-step procedure with independent random effect (D=“UN(1)”):

r.Twostep <-  Ts.estim(formula = y ~ forest + strata(step_id) + 
                     cluster(id), data = dat_ssf, random = ~ forest,
                   all.m.1=F, D="UN(1)") 

Slope estimates and standard errors

r.Twostep$beta
##    forest 
## 0.4943116
r.Twostep$se
##    forest 
## 0.1427203

Variance estimates

r.Twostep$D
##           forest
## forest 0.1070722

glmmTMB

Now the same model using glmmTMB(). Note that we do not need an overall intercept in this model, because the stratum-specific intercepts are (almost) freely estimated due to the large, fixed variance. Again start to set up the model without fitting the model:

TMBStruc <- glmmTMB(y ~ -1 + forest + (1|step_id) + 
                     (0 + forest | id),
                   family=poisson, data = dat_ssf, doFit=FALSE) 

Set the value of the standard deviation of the first random effect (here (1|step_id)):

TMBStruc$parameters$theta[1] <- log(1e3) 

Tell glmmTMB not to change the first standard deviation, all other values are freely estimated (and are different from each other)

TMBStruc$mapArg <- list(theta=factor(c(NA,1)))

Fit the model and look at the summary:

glmm.TMB.random <- glmmTMB:::fitTMB(TMBStruc)
summary(glmm.TMB.random)
##  Family: poisson  ( log )
## Formula:          y ~ -1 + forest + (1 | step_id) + (0 + forest | id)
## Data: dat_ssf
## 
##      AIC      BIC   logLik deviance df.resid 
## 109683.5 109701.5 -54839.8 109679.5    58606 
## 
## Random effects:
## 
## Conditional model:
##  Groups  Name        Variance  Std.Dev. 
##  step_id (Intercept) 1.000e+06 1000.0000
##  id      forest      8.737e-02    0.2956
## Number of obs: 58608, groups:  step_id, 5328; id, 6
## 
## Conditional model:
##        Estimate Std. Error z value Pr(>|z|)    
## forest   0.4937     0.1306    3.78 0.000157 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

95% CIs for fixed and random effects (standard deviations) are obtained via the confint() function:

confint(glmm.TMB.random)
##                                         2.5 %       97.5 %     Estimate
## cond.forest                         0.2377399    0.7497029    0.4937214
## step_id.cond.Std.Dev.(Intercept) 1000.0000000 1000.0000000 1000.0000000

Note: It is currently a problem of glmmTMB that the confint() function only shows a table of the length equal to the number of parameters estimated. As the variance for str_ID was not estimated but fixed to 10^6, but is still listed, the last variance component (here the one for (0 + forest | id)) is not shown. This can be solved by moving the component (1|step_id) to the last position in the formula, and then replace the respective lines by

TMBStruc$parameters$theta[1] = log(1e3) 
TMBStruc$mapArg = list(theta=factor(c(1, NA)))

in the above code.

INLA

Start by setting the precision for the priors of slope coefficients (here for forest).

prec.beta.forest <- 1e-4 

In the model formula for INLA, we set the stratum-specific intercept variance to \(10^6\) (or rather: the precision to \(10^{−6}\)) by fixing it (fixed=T) to an intitial value. The other precision is given a PC(1,0.05) prior:

formula.random <- y ~  -1 + forest +
  f(step_id,model="iid",hyper=list(theta = list(initial=log(1e-6),fixed=T))) +
  f(id, forest, values=1:6,model="iid",
    hyper=list(theta=list(initial=log(1),fixed=F,prior="pc.prec",param=c(1,0.05)))) 

Fit the model

fisher.inla.ssf <- inla(formula.random, family ="Poisson", data=dat_ssf, 
                      control.fixed = list(
                        mean = 0,
                        prec = list(default = prec.beta.forest)
                      )
)

The summary for the posterior distribution of the fixed effects:

fisher.inla.ssf$summary.fixed 
##             mean        sd 0.025quant  0.5quant 0.975quant      mode
## forest 0.4942391 0.1688319  0.1547562 0.4936275  0.8361327 0.4923019
##                 kld
## forest 0.0003256006

Since variances are parameterized and treated as precisions, the summary of the respective posterior distributions is given for the precisions:

fisher.inla.ssf$summary.hyperpar
##                      mean       sd 0.025quant 0.5quant 0.975quant     mode
## Precision for id 10.11578 7.516201   2.024617  8.25162    31.6272 5.936236

Posterior mean and mode are obtained as

inla_emarginal(fisher.inla.ssf)
## Mean of variance for id 
##               0.1572805
inla_mmarginal(fisher.inla.ssf)
## Mode of variance for id 
##              0.08441722

Session Info

devtools::session_info()
##  setting  value                       
##  version  R version 3.4.4 (2018-03-15)
##  system   x86_64, linux-gnu           
##  ui       X11                         
##  language de_CH:de                    
##  collate  de_CH.UTF-8                 
##  tz       Europe/Zurich               
##  date     2019-06-17                  
## 
##  package       * version  date       source        
##  amt           * 0.0.4.0  2018-04-21 CRAN (R 3.4.4)
##  assertthat      0.2.1    2019-03-21 cran (@0.2.1) 
##  backports       1.1.4    2019-04-10 cran (@1.1.4) 
##  base          * 3.4.4    2018-03-16 local         
##  boot            1.3-20   2017-07-30 CRAN (R 3.4.1)
##  broom           0.4.2    2017-02-13 CRAN (R 3.4.0)
##  cellranger      1.1.0    2016-07-27 CRAN (R 3.4.0)
##  circular        0.4-93   2017-06-29 cran (@0.4-93)
##  cli             1.1.0    2019-03-19 cran (@1.1.0) 
##  codetools       0.2-15   2016-10-05 CRAN (R 3.3.1)
##  colorspace      1.3-2    2016-12-14 CRAN (R 3.4.0)
##  compiler        3.4.4    2018-03-16 local         
##  crayon          1.3.4    2017-09-16 CRAN (R 3.4.2)
##  datasets      * 3.4.4    2018-03-16 local         
##  Deriv           3.8.5    2018-06-11 CRAN (R 3.4.4)
##  devtools        1.13.4   2017-11-09 CRAN (R 3.4.2)
##  digest          0.6.18   2018-10-10 cran (@0.6.18)
##  dplyr         * 0.8.0.1  2019-02-15 CRAN (R 3.4.4)
##  evaluate        0.10.1   2017-06-24 CRAN (R 3.4.1)
##  fansi           0.4.0    2018-10-05 cran (@0.4.0) 
##  fitdistrplus    1.0-9    2017-03-24 cran (@1.0-9) 
##  forcats       * 0.2.0    2017-01-23 CRAN (R 3.4.0)
##  foreign         0.8-70   2018-04-23 CRAN (R 3.4.4)
##  ggplot2       * 3.1.1    2019-04-07 cran (@3.1.1) 
##  glmmTMB       * 0.2.3    2019-01-11 CRAN (R 3.4.4)
##  glue            1.3.1    2019-03-12 cran (@1.3.1) 
##  graphics      * 3.4.4    2018-03-16 local         
##  grDevices     * 3.4.4    2018-03-16 local         
##  grid            3.4.4    2018-03-16 local         
##  gtable          0.3.0    2019-03-25 cran (@0.3.0) 
##  haven           1.1.0    2017-07-09 CRAN (R 3.4.2)
##  hms             0.3      2016-11-22 CRAN (R 3.4.0)
##  htmltools       0.3.6    2017-04-28 CRAN (R 3.4.0)
##  httr            1.3.1    2017-08-20 CRAN (R 3.4.2)
##  INLA          * 18.07.12 2019-05-17 local         
##  jsonlite        1.5      2017-06-01 cran (@1.5)   
##  knitr           1.17     2017-08-10 CRAN (R 3.4.1)
##  lattice         0.20-35  2017-03-25 CRAN (R 3.4.0)
##  lazyeval        0.2.2    2019-03-15 cran (@0.2.2) 
##  lme4            1.1-20   2019-02-04 cran (@1.1-20)
##  lubridate       1.7.1    2017-11-03 CRAN (R 3.4.2)
##  magrittr        1.5      2014-11-22 CRAN (R 3.4.0)
##  MASS            7.3-47   2017-04-21 CRAN (R 3.4.2)
##  Matrix        * 1.2-14   2018-04-09 CRAN (R 3.4.4)
##  MatrixModels    0.4-1    2015-08-22 CRAN (R 3.4.0)
##  memoise         1.1.0    2017-04-21 CRAN (R 3.4.2)
##  methods       * 3.4.4    2018-03-16 local         
##  minqa           1.2.4    2014-10-09 CRAN (R 3.4.0)
##  mnormt          1.5-5    2016-10-15 CRAN (R 3.4.0)
##  modelr          0.1.1    2017-07-24 CRAN (R 3.4.2)
##  munsell         0.5.0    2018-06-12 cran (@0.5.0) 
##  mvtnorm         1.0-6    2017-03-02 CRAN (R 3.4.0)
##  nlme            3.1-137  2018-04-07 CRAN (R 3.4.4)
##  nloptr          1.2.1    2018-10-03 cran (@1.2.1) 
##  parallel        3.4.4    2018-03-16 local         
##  pillar          1.4.0    2019-05-11 cran (@1.4.0) 
##  pkgconfig       2.0.2    2018-08-16 cran (@2.0.2) 
##  plyr            1.8.4    2016-06-08 CRAN (R 3.4.2)
##  psych           1.8.4    2018-05-06 CRAN (R 3.4.4)
##  purrr         * 0.2.5    2018-05-29 cran (@0.2.5) 
##  R6              2.4.0    2019-02-14 cran (@2.4.0) 
##  raster        * 2.6-7    2017-11-13 CRAN (R 3.4.2)
##  Rcpp            1.0.1    2019-03-17 cran (@1.0.1) 
##  readr         * 1.1.1    2017-05-16 CRAN (R 3.4.2)
##  readxl          1.0.0    2017-04-18 CRAN (R 3.4.0)
##  reshape2        1.4.3    2017-12-11 cran (@1.4.3) 
##  rgdal           1.2-15   2017-10-30 CRAN (R 3.4.2)
##  rgeos           0.3-26   2017-10-31 CRAN (R 3.4.2)
##  rlang           0.3.4    2019-04-07 cran (@0.3.4) 
##  rmarkdown       1.7      2017-11-10 CRAN (R 3.4.2)
##  rprojroot       1.3-2    2018-01-03 cran (@1.3-2) 
##  rstudioapi      0.7      2017-09-07 CRAN (R 3.4.2)
##  rvest           0.3.2    2016-06-17 CRAN (R 3.4.0)
##  scales          1.0.0    2018-08-09 cran (@1.0.0) 
##  sp            * 1.2-5    2017-06-29 cran (@1.2-5) 
##  splines         3.4.4    2018-03-16 local         
##  stats         * 3.4.4    2018-03-16 local         
##  stringi         1.2.4    2018-07-20 cran (@1.2.4) 
##  stringr       * 1.3.1    2018-05-10 cran (@1.3.1) 
##  survival      * 2.41-3   2017-04-04 CRAN (R 3.4.0)
##  tibble        * 2.1.1    2019-03-16 cran (@2.1.1) 
##  tidyr         * 0.7.2    2017-10-16 cran (@0.7.2) 
##  tidyselect      0.2.5    2018-10-11 CRAN (R 3.4.4)
##  tidyverse     * 1.2.1    2017-11-14 CRAN (R 3.4.3)
##  TMB             1.7.14   2018-06-23 CRAN (R 3.4.4)
##  tools           3.4.4    2018-03-16 local         
##  TwoStepCLogit * 1.2.5    2016-03-21 CRAN (R 3.4.0)
##  utf8            1.1.4    2018-05-24 cran (@1.1.4) 
##  utils         * 3.4.4    2018-03-16 local         
##  vctrs           0.1.0    2018-11-29 cran (@0.1.0) 
##  withr           2.1.2    2018-03-15 CRAN (R 3.4.4)
##  xml2            1.2.0    2018-01-24 cran (@1.2.0) 
##  yaml            2.1.14   2016-11-12 CRAN (R 3.4.1)
##  zeallot         0.1.0    2018-01-28 cran (@0.1.0)