David L Miller (CREEM, University of St Andrews)

Duke University, 13 February 2014

- Statistician by training (St Andrews)
- PhD University of Bath, w. Simon Wood
- Postdoc, University of Rhode Island
- Research fellow at CREEM
- Developer of distance sampling software

Spatial modelling

- Relate covariates to animal abundance
- Estimate abundance in a spatially explicit way
- Calculate uncertainty
- Interpretability to biologists/ecologists
- Often using mixed/historical data

- How to model covariate effects?
- Model (term) selection
- Response distribution
- Uncertain detection
- Availability
- Autocorrelation

Density surface models

Detection

- Distance sampling!
- Fit detection functions
- Estimate \(\mathbb{P}(\text{ detection } | \text{ object at distance } x) = g(x)\)
- Calculate average detection probability = \(\frac{1}{w}\int_0^w g(x) \text{ d}x\) (where \(w\) is truncation distance)
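As a rough illustration of the average detection probability above, the sketch below assumes a half-normal detection function, \(g(x) = \exp(-x^2 / 2\sigma^2)\). That is one common choice, not necessarily the one used in the analyses in this talk. The values of `sigma` and `w` and the function names are made up for the example, and it is plain Python, not the `Distance` package.

```python
import math

def half_normal(x, sigma):
    """Assumed detection function g(x) = exp(-x^2 / (2 sigma^2)), so g(0) = 1."""
    return math.exp(-x * x / (2.0 * sigma * sigma))

def average_detection_probability(g, w, n_steps=10_000):
    """Approximate (1/w) * integral_0^w g(x) dx with the midpoint rule."""
    h = w / n_steps
    total = sum(g((i + 0.5) * h) for i in range(n_steps))
    return total * h / w

# Hypothetical values: scale sigma and truncation distance w, both in metres.
sigma, w = 150.0, 450.0
p_hat = average_detection_probability(lambda x: half_normal(x, sigma), w)

# Closed form for the half-normal case, used only to check the numerical integral.
p_exact = sigma * math.sqrt(math.pi / 2.0) * math.erf(w / (sigma * math.sqrt(2.0))) / w
print(round(p_hat, 4), round(p_exact, 4))
```

Any other \(g(x)\) can be passed to `average_detection_probability` in place of the half-normal; only the closed-form check is specific to that assumption.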

- Lots of other stuff going on here!
- Covariates that affect detectability
- Double observer (\(g(0)<1\))
- Detection function formulations

- Distance for Windows (6.2 out soon!)
- Easy to use Windows software
- Len Thomas, Eric Rexstad, Laura Marshall

- `Distance` R package
- Simple way to fit detection functions
- Me!

- `mrds` R package
- More complex analyses: double observer surveys
- Jeff Laake, me

Generalized additive models

If we are modelling counts:

\[ \mathbb{E}(n_j) = \exp \left\{ \beta_0 + \sum_k f_k(z_{jk}) \right\} \]

- \(n_j\) has some count distribution (quasi-Poisson, Tweedie, negative binomial)
- \(f_k\) are *smooth* functions (splines \(\Rightarrow f_k(x)=\sum_l \beta_l b_l(x)\))
- \(f_k\) can just be fixed effects \(\Rightarrow\) GLM
- Add in random effects, correlation structures \(\Rightarrow\) GAMM
- Wood (2006) is a good intro book
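To make the basis expansion \(f_k(x)=\sum_l \beta_l b_l(x)\) concrete, here is a minimal sketch with one smooth term and a log link. It uses a deliberately crude linear spline basis (truncated lines at a few knots); `mgcv` offers much richer bases. The knots, coefficients and the `depth` covariate values are hypothetical, and in a real model the coefficients are estimated from data.

```python
import math

def linear_spline_basis(x, knots):
    """Basis functions b_l(x): x itself plus one truncated line (x - knot)_+ per knot."""
    return [x] + [max(0.0, x - k) for k in knots]

def smooth(x, beta, knots):
    """f(x) = sum_l beta_l * b_l(x)."""
    return sum(b * v for b, v in zip(beta, linear_spline_basis(x, knots)))

def expected_count(x, beta0, beta, knots):
    """E(n) = exp(beta0 + f(x)) for a single smooth term under a log link."""
    return math.exp(beta0 + smooth(x, beta, knots))

# Hypothetical knots (in metres of depth) and coefficients.
knots = [20.0, 40.0]
beta = [0.05, -0.08, -0.02]
beta0 = -1.0

for depth in (10.0, 30.0, 50.0):
    print(depth, round(expected_count(depth, beta0, beta, knots), 4))
```

Setting every coefficient after the first to zero leaves a single linear term, which is the GLM special case mentioned in the list above.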

Minimise distance between data and model while also minimising the wiggliness penalty:

\[ \lambda_k \int_\Omega \left( \frac{\partial^2 f_k(z_k)}{\partial z_k^2} \right)^2 \text{ d}z_k \]
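The usual wiggliness penalty in Wood (2006) is the integrated squared second derivative of the smooth, scaled by \(\lambda_k\). The sketch below approximates that integral numerically for three made-up functions, to show that a straight line costs nothing and a wigglier curve costs more. The function name `wiggliness`, the test functions and the interval are assumptions for illustration; this is not how `mgcv` evaluates the penalty.

```python
import math

def wiggliness(f, a, b, n_steps=2000):
    """Approximate integral_a^b (f''(z))^2 dz.

    f'' comes from a central finite difference, the integral from the midpoint rule.
    """
    h = (b - a) / n_steps
    total = 0.0
    for i in range(n_steps):
        z = a + (i + 0.5) * h
        second_deriv = (f(z - h) - 2.0 * f(z) + f(z + h)) / (h * h)
        total += second_deriv ** 2
    return total * h

def straight_line(z):
    return 1.0 + 2.0 * z

def gentle(z):
    return math.sin(z)

def wiggly(z):
    return math.sin(3.0 * z)

for name, f in [("line", straight_line), ("sin(z)", gentle), ("sin(3z)", wiggly)]:
    print(name, round(wiggliness(f, 0.0, math.pi), 3))
```

Multiplying these values by a larger \(\lambda_k\) makes wiggly fits more expensive, which pushes the estimate of \(f_k\) towards a straight line.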

\[ \mathbb{E}(n_j) = A_j \hat{p}_j \exp \left\{ \beta_0 + \sum_k f_k(z_{jk}) \right\} \]

\[ \mathbb{E}(\hat{n}_j) = A_j \exp \left\{ \beta_0 + \sum_k f_k(z_{jk}) \right\} \]

\[ \hat{n}_j = \sum_{i \text{ in segment } j} \frac{s_i}{\hat{p}_i} \]
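The per-segment estimate \(\hat{n}_j\) is a Horvitz-Thompson-style sum: each observed group of size \(s_i\) is divided by its estimated detection probability \(\hat{p}_i\). A minimal sketch with made-up observations follows; the tuple layout and the function name are assumptions for the example, not the data format used by `dsm`.

```python
from collections import defaultdict

def horvitz_thompson_by_segment(observations):
    """Return {segment: n_hat}, with n_hat_j = sum of s_i / p_hat_i over observations in segment j."""
    n_hat = defaultdict(float)
    for segment, group_size, p_detect in observations:
        n_hat[segment] += group_size / p_detect
    return dict(n_hat)

# Hypothetical observations: (segment id, group size s_i, estimated detection probability p_hat_i)
observations = [
    ("seg1", 2, 0.5),
    ("seg1", 1, 0.25),
    ("seg2", 3, 0.75),
]
print(horvitz_thompson_by_segment(observations))  # {'seg1': 8.0, 'seg2': 4.0}
```

Segments with no observations do not appear in the result; in a model they would enter with \(\hat{n}_j = 0\).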

- `dsm` R package
- Design “inspired by” (“stolen from”) `mgcv`
- Easy to build simple models, possible to build complex ones

Syntax example:

`model <- dsm(count ~ s(x,k=10) + s(depth,k=6), detection.function, segment.data, observation.data, family=negbin(theta=0.1))`

Utility functions: variance estimation, plotting, prediction etc

Case study I - Seabirds in RI waters

- Wind development in RI/MA waters
- Map of usage
- Estimate uncertainty
- Combine maps (Zonation)

Photo by jackanapes on flickr (CC BY-NC-ND)

- Availability
- correction factor from previous experimental work
- \(p_j \times \mathbb{P}(\text{available for detection})\)
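The availability correction simply multiplies the detection probability by the probability that an animal is available to be seen. A small sketch with hypothetical numbers (the count, both probabilities and the function name are made up for illustration):

```python
def corrected_abundance(count, p_detect, p_available):
    """Scale a raw count by the probability an animal is both available and detected."""
    p_effective = p_detect * p_available
    if not 0.0 < p_effective <= 1.0:
        raise ValueError("effective probability must be in (0, 1]")
    return count / p_effective

# Hypothetical: 12 birds counted, detection probability 0.6, available 80% of the time.
print(round(corrected_abundance(12, 0.6, 0.8), 2))
```

Ignoring availability (setting `p_available` to 1) would give a smaller estimate, so the correction matters most for animals that are often unavailable for detection.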

- Term selection by approximate \(p\)-values
- Covariates are collinear (curvilinear)
- `select`: extra penalty
- REML: better optimisation objective

From Fig. 1 of Wood (2011)

Case study II - black bears in Alaska

- Area of 26,482 km\(^2\) (~ size of VT/MA)
- Double observer surveys using Piper Super Cubs
- 1238 transects of 35 km, 2001-2003

- Surveys in Spring, bears are there, but not too much foliage
- Generally search uphill
- Double observer (Borchers et al, 2006)
- Curtain between pilot and observer; light system
- Go off transect and circle to ID

- Truncate at 22m and 450m, leaving 351 groups (out of ~44,000 segments)
- Group size 1-3 (lone bears, sow w. cubs)
- 1402m elevational cutoff

- bivariate smooth of location
- smooth of elevation
- bivariate smooth of slope and aspect

- MRDS estimate: ~1500 black bears
- DSM estimate: ~1200 black bears (968 - 1635, CV ~13%)
- Not a *huge* difference, so why bother?

- Flexible spatial models
- GLMs + random effects + smooths + other extras
- autocorrelation can be modelled

- Large areas, makes sense
- Spatial component is v. helpful for managers
- Two-stage models can be useful!
- Estimating temporal trends

- Borchers, DL, JL Laake, C Southwell, and CGM Paxton. Accommodating Unmodeled Heterogeneity in Double‐Observer Distance Sampling Surveys. Biometrics 62, no. 2 (2006): 372–378.
- Miller, DL, ML Burt, EA Rexstad and L Thomas. Spatial Models for Distance Sampling Data: Recent Developments and Future Directions. Methods in Ecology and Evolution 4, no. 11 (2013): 1001–1010.
- Winiarski, KJ, ML Burt, EA Rexstad, DL Miller, CL Trocki, PWC Paton, and SR McWilliams. Integrating Aerial and Ship Surveys of Marine Birds Into a Combined Density Surface Model: a Case Study of Wintering Common Loons. The Condor 116, no. 2 (2014): 149–161.
- Winiarski, KJ, DL Miller, PWC Paton, and SR McWilliams. A Spatial Conservation Prioritization Approach for Protecting Marine Birds Given Proposed Offshore Wind Energy Development. Biological Conservation 169 (2014): 79–88.

Talk available at http://dill.github.com/talks/duke-dsm/talk.html

- Rhode Island: Kris Winiarski, Peter Paton, Scott McWilliams
- Alaska: Earl Becker, Becky Strauch, Mike Litzen, Dave Filkill
- Elsewhere: Mark Bravington, Natalie Kelly, Eric Rexstad, Louise Burt, Len Thomas, Steve Buckland

- Goodness of fit testing
- Dunn, PK, and GK Smyth. Randomized Quantile Residuals. Journal of Computational and Graphical Statistics 5, no. 3 (1996): 236–244.
- Back transform for **exactly** Normal residuals
- Fewer problems with artefacts
- (Thanks to Natalie Kelly at CSIRO for the tip)
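The idea in Dunn and Smyth (1996) can be sketched for a count response: draw a uniform value between the fitted cdf just below and at the observed count, then push it through the inverse Normal cdf. The example below assumes a Poisson response purely to keep the cdf simple (the models in this talk use other count distributions), and the counts and fitted means are made up. It is a plain Python sketch, not the code behind `rqgam.check`.

```python
import math
import random
from statistics import NormalDist

def poisson_cdf(y, mu):
    """P(Y <= y) for Y ~ Poisson(mu); 0 when y < 0."""
    if y < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for k in range(1, y + 1):
        term *= mu / k
        total += term
    return min(total, 1.0)

def randomized_quantile_residual(y, mu, rng):
    """Draw u uniformly between F(y - 1) and F(y), then map it to the Normal scale."""
    lower, upper = poisson_cdf(y - 1, mu), poisson_cdf(y, mu)
    u = rng.uniform(lower, upper)
    # Keep u strictly inside (0, 1) so the inverse Normal cdf is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return NormalDist().inv_cdf(u)

rng = random.Random(1)                   # fixed seed so the sketch is repeatable
counts = [0, 0, 1, 3, 0, 2]              # hypothetical observed counts
fitted = [0.4, 0.7, 1.2, 2.5, 0.3, 1.8]  # hypothetical fitted means
residuals = [randomized_quantile_residual(y, mu, rng) for y, mu in zip(counts, fitted)]
print([round(r, 2) for r in residuals])
```

The randomisation spreads each discrete count over an interval, which is what removes the banding artefacts seen in ordinary residual plots for count data with many zeros.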

`gam.check`

`rqgam.check`