Density surface models

David L Miller (*CREEM, University of St Andrews*)

Cornell University

5 September 2014

Spatial modelling

- How many animals are there?
- How do environmental factors affect animals?
- Where are the animals?
- How certain are we about the above?

- Want to model counts
- uncertain detection
- availability issues

- Counts are functions of covariates
- Want to select the covariates
- Covariate effects must be flexible

- Spatially explicit models
- Uncertainty quantification

Density surface models

- Take count data
- Adjust/account for detectability/availability
- Fit a GAM w. environmental explanatory variables
- Perform inference on GAM

- Strip transects
- Line transects
- Point transects
- (more, depending on how strict you are with terminology)

Detectability

- Essentially want to compute a “correction factor”
- Model \(\mathbb{P} \left[ \text{animal detected } \vert \text{ animal at distance } y\right] = g(y;\boldsymbol{\theta})\)
- Calculate the average probability of detection:

\[ P_a = \frac{1}{w} \int_0^w g(y;\boldsymbol{\theta}) \text{d}y \]

- Horvitz-Thompson-type estimators \(\Rightarrow \hat{N}\)
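As a rough illustration of the steps above, the sketch below evaluates \(P_a\) numerically and plugs it into a simple Horvitz-Thompson-type estimate. It assumes a half-normal detection function \(g(y;\sigma)=\exp(-y^2/2\sigma^2)\), and the truncation distance, scale parameter and number of detections are made-up values, not taken from either case study. It is written in plain Python rather than with the R packages discussed later.

```python
import math

def half_normal(y, sigma):
    """Half-normal detection function g(y; sigma) (an assumed form)."""
    return math.exp(-y * y / (2.0 * sigma * sigma))

def average_detection_probability(g, w, n_steps=10_000):
    """P_a = (1/w) * integral_0^w g(y) dy, using the midpoint rule."""
    h = w / n_steps
    total = sum(g((i + 0.5) * h) for i in range(n_steps))
    return total * h / w

# Hypothetical values, for illustration only
w = 100.0        # truncation distance
sigma = 40.0     # half-normal scale parameter
n_detected = 60  # number of animals detected within w

p_a = average_detection_probability(lambda y: half_normal(y, sigma), w)

# The half-normal integral has a closed form, used here as a check
p_a_exact = (sigma * math.sqrt(math.pi / 2.0)
             * math.erf(w / (sigma * math.sqrt(2.0))) / w)

# Horvitz-Thompson-type estimate of abundance in the covered area
n_hat_covered = n_detected / p_a

print(f"P_a (numeric) = {p_a:.4f}, P_a (exact) = {p_a_exact:.4f}")
print(f"Estimated animals in covered area = {n_hat_covered:.1f}")
```

Scaling `n_hat_covered` up to the whole study region would additionally need the ratio of total area to covered area, which is left out here.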

- Extend \(\mathbb{P} \left[ \text{animal detected } \vert \text{ animal at distance } y, \text{ observed covariates}\right] = g(y, \mathbf{z};\boldsymbol{\theta})\)
- Double observer (account for \(g(0)<1\))
- Detection function formulations
- Measurement error

Figure from Marques et al (2007), The Auk

Spatially explicit models

If we are modelling counts:

\[ \mathbb{E}(n_j) = A_j\exp \left\{ \beta_0 + \sum_k f_k(z_{jk}) \right\} \]

- \(n_j\) has some count distribution (quasi-Poisson, Tweedie, negative binomial)
- \(A_j\) is area of segment
- \(f_k\) are *smooth* functions (splines \(\Rightarrow f_k(x)=\sum_l \beta_l b_l(x)\))
- \(f_k\) can just be fixed effects \(\Rightarrow\) GLM
- Add in random effects, correlation structures \(\Rightarrow\) GAMM
- Wood (2006) is a good intro book
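To make the basis expansion \(f_k(x)=\sum_l \beta_l b_l(x)\) and the area offset concrete, here is a minimal sketch that evaluates \(\mathbb{E}(n_j)\) for one segment. It uses piecewise-linear "tent" functions as a stand-in for a real spline basis, and the knots, coefficients, area and covariate value are all hypothetical.

```python
import math

def tent_basis(x, knots):
    """Evaluate piecewise-linear 'tent' basis functions b_l(x), one per knot."""
    values = []
    for l, k in enumerate(knots):
        left = knots[l - 1] if l > 0 else None
        right = knots[l + 1] if l < len(knots) - 1 else None
        if left is not None and left <= x <= k:
            values.append((x - left) / (k - left))    # rising edge of the tent
        elif right is not None and k <= x <= right:
            values.append((right - x) / (right - k))  # falling edge of the tent
        else:
            values.append(0.0)
    return values

def smooth(x, knots, beta):
    """f(x) = sum_l beta_l * b_l(x)."""
    return sum(b * c for b, c in zip(tent_basis(x, knots), beta))

def expected_count(area, beta0, smooth_values):
    """E(n_j) = A_j * exp(beta_0 + sum_k f_k(z_jk))."""
    return area * math.exp(beta0 + sum(smooth_values))

# Hypothetical smooth of depth with three knots
depth_knots = [0.0, 50.0, 100.0]
depth_beta = [0.0, 0.8, -0.4]

f_depth = smooth(25.0, depth_knots, depth_beta)  # halfway between the first two knots
mu = expected_count(area=2.0, beta0=-1.0, smooth_values=[f_depth])

print(f"f(depth=25) = {f_depth:.3f}, E(n_j) = {mu:.3f}")
```

Setting all but one coefficient per term to a fixed linear form recovers the GLM case mentioned above; in practice the coefficients are estimated rather than chosen by hand.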

Minimise the distance between data and model while also minimising a wiggliness penalty:

\[ \lambda_k \int_\Omega \left( \frac{\partial^2 f_k(z_k)}{\partial z_k^2} \right)^2 \text{d}z_k \]
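One common way to write this trade-off, following the penalised likelihood setup in Wood (2006), is sketched below. The wiggliness penalty is the integrated squared second derivative of each smooth, which becomes a quadratic form in the coefficients once \(f_k\) is written in its basis. The symbols \(l\) (log-likelihood), \(\boldsymbol{\beta}_k\) (coefficients of \(f_k\)) and \(\mathbf{S}_k\) (penalty matrix) are notation assumed here, not taken from the slides.

\[ \hat{\boldsymbol{\beta}} = \underset{\boldsymbol{\beta}}{\text{argmin}} \left\{ -l(\boldsymbol{\beta}) + \sum_k \lambda_k \boldsymbol{\beta}_k^\text{T} \mathbf{S}_k \boldsymbol{\beta}_k \right\}, \qquad \boldsymbol{\beta}_k^\text{T} \mathbf{S}_k \boldsymbol{\beta}_k = \int_\Omega \left( \frac{\partial^2 f_k(z_k)}{\partial z_k^2} \right)^2 \text{d}z_k \]

\[ \left[ \mathbf{S}_k \right]_{lm} = \int_\Omega \frac{\partial^2 b_l(z_k)}{\partial z_k^2} \frac{\partial^2 b_m(z_k)}{\partial z_k^2} \text{d}z_k \]

Large \(\lambda_k\) pushes \(f_k\) towards a straight line; \(\lambda_k = 0\) leaves it unpenalised.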

\[ \mathbb{E}(n_j) = A_j \hat{p}_j \exp \left\{ \beta_0 + \sum_k f_k(z_{jk}) \right\} \]

\[ \hat{n}_j = \sum_{i \text{ in segment } j} \frac{s_i}{\hat{p}_i} \]

\[ \mathbb{E}(\hat{n}_j) = A_j \exp \left\{ \beta_0 + \sum_k f_k(z_{jk}) \right\} \]
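The estimated per-segment counts \(\hat{n}_j\) can be sketched as follows, reading \(s_i\) as the size of observed group \(i\) and \(\hat{p}_i\) as its estimated detection probability (the usual interpretation, though the slides do not spell it out). The observations and segment identifiers are invented for illustration.

```python
# Hypothetical observations: (segment id, group size s_i, detection probability p_i)
observations = [
    (1, 1, 0.5),
    (1, 2, 0.4),
    (3, 3, 0.6),
]

def estimated_counts(observations, segment_ids):
    """n_hat_j = sum over observations i in segment j of s_i / p_i."""
    n_hat = {j: 0.0 for j in segment_ids}  # segments with no detections stay at 0
    for segment, group_size, p_detect in observations:
        n_hat[segment] += group_size / p_detect
    return n_hat

n_hat = estimated_counts(observations, segment_ids=[1, 2, 3])

for segment, value in sorted(n_hat.items()):
    print(f"segment {segment}: n_hat = {value:.2f}")
```

These \(\hat{n}_j\) then serve as the response in the last model above, with only the segment area \(A_j\) as the offset; segments without detections contribute zeros and must be kept in the data.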

Case study I - Seabirds in RI waters

- Wind development in RI/MA waters
- Map of usage
- Estimate uncertainty
- Combine maps (Zonation)

Photo by jackanapes on flickr (CC BY-NC-ND)

- Availability
- correction factor from previous experimental work
- \(p_j \times \mathbb{P}(\text{available for detection})\)
- (Recent work by David Borchers may be useful)

- Term selection by approximate \(p\)-values
- Covariates are collinear (curvilinear)

Case study II - black bears in Alaska

- Area of 26,482 km\(^2\) (~ size of VT/MA)
- Double observer surveys using Piper Super Cubs
- 1238 transects of 35 km each, 2001-2003

- Surveys in Spring, bears are there, but not too much foliage
- Generally search uphill
- Double observer (Borchers et al, 2006)
- Curtain between pilot and observer; light system
- Go off transect and circle to ID

- Truncate at 22m and 450m, leaving 351 groups (out of ~44,000 segments)
- Group size 1-3 (lone bears, sow w. cubs)
- 1402m elevational cutoff

- MRDS estimate: ~1500 black bears
- DSM estimate: ~1200 black bears (968 - 1635, CV ~13%)
- Not a *huge* difference, so why bother?

- Flexible spatial models
- GLMs + random effects + smooths + other extras
- autocorrelation can be modelled

- Large areas, makes sense
- Spatial component is v. helpful for managers
- Two-stage models can be useful!
- Estimating temporal trends

- Distance for Windows
- Easy to use Windows software
- Len Thomas, Eric Rexstad, Laura Marshall

- `Distance` R package
- Simple way to fit detection functions
- Me!

- `mrds` R package
- More complex analyses - double observer surveys
- Jeff Laake, me

- `dsm` package
- Design “inspired by” (“stolen from”) `mgcv`
- Easy to build simple models, possible to build complex ones

Syntax example:

`model <- dsm(count ~ s(x,k=10) + s(depth,k=6), detection.function, segment.data, observation.data, family=negbin(theta=0.1))`

Utility functions: variance estimation, plotting, prediction etc

- Rhode Island
- Kris Winiarski, Peter Paton, Scott McWilliams
- Funding from the State of Rhode Island Ocean Special Area Management Plan

- Alaska
- Earl Becker, Becky Strauch, Mike Litzen, Dave Filkill
- Funding from Alaska Department of Fish and Game

Talk available at

http://converged.yt/talks/cornell-dsm/talk.html


- Borchers, DL, JL Laake, C Southwell, and CGM Paxton. Accommodating Unmodeled Heterogeneity in Double‐Observer Distance Sampling Surveys. Biometrics 62, no. 2 (2006): 372–378.
- Miller, DL, ML Burt, EA Rexstad and L Thomas. Spatial Models for Distance Sampling Data: Recent Developments and Future Directions. Methods in Ecology and Evolution 4, no. 11 (2013): 1001–1010.
- Winiarski, KJ, ML Burt, EA Rexstad, DL Miller, CL Trocki, PWC Paton, and SR McWilliams. Integrating Aerial and Ship Surveys of Marine Birds Into a Combined Density Surface Model: a Case Study of Wintering Common Loons. The Condor 116, no. 2 (2014): 149–161.
- Winiarski, KJ, DL Miller, PWC Paton, and SR McWilliams. A Spatial Conservation Prioritization Approach for Protecting Marine Birds Given Proposed Offshore Wind Energy Development. Biological Conservation 169 (2014): 79–88.

- Goodness of fit testing
- Dunn, PK, and GK Smyth. Randomized Quantile Residuals. Journal of Computational and Graphical Statistics 5, no. 3 (1996): 236–244.
- Back transform for **exactly** Normal residuals
- Fewer problems with artefacts
- (Thanks to Natalie Kelly at CSIRO for the tip)
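A minimal sketch of the randomised quantile residual idea from Dunn and Smyth (1996): for a discrete response, draw a uniform value between the fitted CDF just below and at the observed count, then map it through the standard Normal quantile function. The example assumes a Poisson response with a known mean for simplicity (the models in this talk use other count distributions), and the simulated data and helper names are purely illustrative.

```python
import math
import random
import statistics
from statistics import NormalDist

def poisson_cdf(k, mu):
    """P(Y <= k) for Y ~ Poisson(mu); returns 0 for k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return min(total, 1.0)

def randomized_quantile_residual(y, mu, rng):
    """Draw u ~ Uniform(F(y - 1), F(y)) and return the Normal quantile of u."""
    u = rng.uniform(poisson_cdf(y - 1, mu), poisson_cdf(y, mu))
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return NormalDist().inv_cdf(u)

def sample_poisson(mu, rng):
    """Knuth's multiplication method; adequate for small mu."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(1)
mu = 3.0  # hypothetical fitted mean, identical for every observation
counts = [sample_poisson(mu, rng) for _ in range(2000)]
residuals = [randomized_quantile_residual(y, mu, rng) for y in counts]

resid_mean = statistics.mean(residuals)
resid_sd = statistics.pstdev(residuals)
print(f"mean = {resid_mean:.3f}, sd = {resid_sd:.3f}")
```

When the model is correct the residuals should look like a standard Normal sample, so the banding that discreteness produces in ordinary residual plots disappears. Because of the random draw, residuals differ slightly from run to run unless the seed is fixed.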

`gam.check`

`rqgam.check`