Spatial modelling of distance sampling surveys

David L Miller
CREEM, University of St Andrews

British Trust for Ornithology
Thetford, UK
26 June 2015






Ecological questions

How many animals are there?

Where are all the animals?

Why are they there?

Density surface models

(Spatial models that account for detectability)


(…and more)

\(\geq 2\)-stage models




Hedley and Buckland (2004). Miller et al (2014).

Spatially explicit models

Data setup

Ursus from PhyloPic.

Two options for response

\(n_j\)

 

\(\hat{n}_j\)

\[ \hat{n}_j = \sum_{i \text{ in segment } j} \frac{s_i}{\hat{p}_i} \]

Generalized additive models (in two pages) (I)

If we are modelling counts:

\[ \mathbb{E}(\hat{n}_j) = A_j\exp \left\{ \beta_0 + \sum_k f_k(z_{jk}) \right\} \]

Generalized additive models (in two pages) (II)

Minimise distance between data and model while minimizing:

\[ \lambda_k \int_\Omega \frac{\partial^2 f_k(z_k)}{\partial z_k^2} \text{ d}z_k \]

“just wiggly enough”

Fitting via REML, see Wood (2011).

Response distributions

Model selection

Software:
dsm package

The dsm package

install.packages(devtools)
# get the latest version
devtools::install_github("DistanceDevelopment/dsm")

dsm syntax

Case study:
Rhode Island seabirds

RI seabirds

RI transects

RI seabirds - Aims

Photo by jackanapes on flickr (CC BY-NC-ND)

Load the data

See also github.com/dill/MEPS-analysis

load(url("http://converged.yt/talks/BTO-dsm/loon-data.RData"))

Segment data

head(seg)
##     Sample.Label SeasonYear       Date Surveyed. Wind.speed Wind.direction
## 1 101-10/12/2010 Winter1011 10/12/2010       Yes         <5              N
## 3 101-09/12/2011 Winter1112 09/12/2011       Yes      10-15             SW
## 4 101-28/01/2012 Winter1112 28/01/2012       Yes       5-10             SW
## 5 102-10/12/2010 Winter1011 10/12/2010       Yes         <5              N
## 7 102-28/01/2012 Winter1112 28/01/2012       Yes       5-10             SW
## 8 102-09/12/2011 Winter1112 09/12/2011       Yes      10-15             SW
##         Weather Visibility Waveheight Whitecaps Observer_Left
## 1 Partly Cloudy         >5         <1   Minimal            KW
## 3 Partly Cloudy         >5        2-3   Minimal            KW
## 4         Sunny         >5        1-2   Minimal            KW
## 5 Partly Cloudy         >5         <1   Minimal            KW
## 7         Sunny         >5        1-2   Minimal            KW
## 8 Partly Cloudy         >5        2-3   Minimal            KW
##   Observer_Right Onsurvey_Left Onsurvey_Right Glare_Left Glare_Right
## 1             JV           Yes            Yes   Moderate        None
## 3             EJ           Yes            Yes       None     Minimal
## 4             EJ           Yes            Yes       None        None
## 5             JV           Yes            Yes   Moderate        None
## 7             EJ           Yes            Yes       None        None
## 8             EJ           Yes            Yes       None     Minimal
##   Observerconditions_Left Observerconditions_Right Comments Season Year
## 1                    Good                Excellent          Winter 2010
## 3                    Good                     Good          Winter 2011
## 4               Excellent                Excellent          Winter 2012
## 5                    Good                Excellent          Winter 2010
## 7               Excellent                Excellent          Winter 2012
## 8                    Good                     Good          Winter 2011
##       gchl eff.sides        tx       ty        bx        by         x
## 1 3.568708         1 -44.23136 14.42102 -44.09166 12.153487 -44.16151
## 3 5.528956         1 -44.23136 14.42102 -44.09166 12.153487 -44.16151
## 4 5.528956         1 -44.23136 14.42102 -44.09166 12.153487 -44.16151
## 5 3.179644         1 -44.09166 12.15349 -43.95196  9.885946 -44.02181
## 7 4.356627         1 -44.09166 12.15349 -43.95196  9.885946 -44.02181
## 8 4.356627         1 -44.09166 12.15349 -43.95196  9.885946 -44.02181
##          y   Effort     depth    depthm    slope roughness phimedian
## 1 13.28725 2269.998  -66.9324 -20.83643 1.131463  1.352299  1.057448
## 3 13.28725 2269.998  -66.9324 -20.83643 1.131463  1.352299  1.057448
## 4 13.28725 2269.998  -66.9324 -20.83643 1.131463  1.352299  1.057448
## 5 11.01972 2269.998 -129.5995 -39.91604 0.651089  0.522652  1.872487
## 7 11.01972 2269.998 -129.5995 -39.91604 0.651089  0.522652  1.872487
## 8 11.01972 2269.998 -129.5995 -39.91604 0.651089  0.522652  1.872487
##   distancelandkm gchl_winter gchl_spring gchl_summer gchl_fall      fcpi
## 1       1.755095    4.441985    8.242430   15.327192  6.308177 0.5833333
## 3       1.755095    4.441985    8.242430   15.327192  6.308177 0.5833333
## 4       1.755095    4.441985    8.242430   15.327192  6.308177 0.5833333
## 5       3.875946    3.721898    6.793428    7.609529  5.299252 0.4666667
## 7       3.875946    3.721898    6.793428    7.609529  5.299252 0.4666667
## 8       3.875946    3.721898    6.793428    7.609529  5.299252 0.4666667
##   gchl_long Transect.Label
## 1  8.449691              1
## 3  8.449691              1
## 4  8.449691              1
## 5  6.066381              1
## 7  6.066381              1
## 8  6.066381              1

Observation data

head(obs.loons)
##           Date     Time Species Group Location Bin Observer Length size
## 106 17/02/2011 18991230    COLO  Loon On Water   A       JV   7448    1
## 301 28/01/2012 18991230    COLO  Loon On Water   A    Ellen   7448    1
## 446 23/02/2011 18991230    COLO  Loon On Water   A       KW   7448    1
## 457 07/02/2011 18991230    COLO  Loon On Water   A       KW   7448    1
## 814 23/02/2011 18991230    COLO  Loon On Water   B       JV   7448    1
## 830 09/01/2012 18991230    COLO  Loon On Water   A       KW   7448    1
##     object          x         y distbegin distend distance Season Year
## 106    106 -17.525139 -33.33390        44     163      119 Winter 2011
## 301    301 -10.015215 -27.73880        44     163      119 Winter 2012
## 446    446  -3.793853 -25.40426        44     163      119 Winter 2011
## 457    457 -10.447177 -25.32420        44     163      119 Winter 2011
## 814    814 -14.415200 -21.22522       164     432      268 Winter 2011
## 830    830  27.030791 -21.05565        44     163      119 Winter 2012
##     SeasonYear Transect.Label    Sample.Label
## 106 Winter1011              8  819-17/02/2011
## 301 Winter1112             10 1022-28/01/2012
## 446 Winter1011             12 1222-23/02/2011
## 457 Winter1011             10 1021-07/02/2011
## 814 Winter1011              9  919-23/02/2011
## 830 Winter1112             21 2122-09/01/2012

Fit a detection function

library(Distance)
## Loading required package: mrds
## This is mrds 2.1.13
## Built: R 3.2.0; ; 2015-06-24 16:41:23 UTC; unix
hr.df.size <- ds(obs.loons, formula=~size, key="hr",
                 truncation=list(right=1000, left=44))
## Columns "distbegin" and "distend" in data: performing a binned analysis...
## Cannot perform AIC adjustment term selection when covariates are used.
## Fitting hazard-rate key function
## AIC= 1580.124
## No survey area information supplied, only estimating detection function.

Fitted detection function

plot(hr.df.size)

Plotting observations

Plotting observations per season-year

Spatial covariates - Chlorophyll

Spatial covariates - Depth

Spatial model

library(dsm)
## Loading required package: mgcv
## Loading required package: nlme
## This is mgcv 1.8-6. For overview type 'help("mgcv-package")'.
## This is dsm 2.2.8
## Built: R 3.2.0; ; 2015-06-25 13:24:11 UTC; unix
loon.model.hr <- dsm(Nhat~s(gchl_long, k=10)+
                          s(depthm, k=10)+
                          s(y, k=10),
                     hr.df.size, seg, obs.loons,
                     family=nb(), availability=0.7,
                     method="REML")

dsm summary

summary(loon.model.hr)
## 
## Family: Negative Binomial(0.105) 
## Link function: log 
## 
## Formula:
## Nhat ~ s(gchl_long, k = 10) + s(depthm, k = 10) + s(y, k = 10) + 
##     offset(off.set)
## 
## Parametric coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -14.62165    0.07504  -194.9   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df Chi.sq  p-value    
## s(gchl_long) 4.347  5.357  68.39 6.51e-13 ***
## s(depthm)    4.111  5.160  28.08 4.41e-05 ***
## s(y)         4.125  5.147  42.56 6.20e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.077   Deviance explained = 29.7%
## -REML = 3183.2  Scale est. = 1         n = 2067

Plotting smooth terms

plot(loon.model.hr, pages=1)

Model checking

Residual checking

gam.check(loon.model.hr)

Randomised quantile residuals

rqgam.check

rqgam.check(loon.model.hr)

Inference

Prediction grid

head(pred)
##   OBJECTID_1 Join_Cou_1 TARGET_F_1 LONGITUDE LATITUDE roughness phimedian
## 0          1          1          0 -71.68000 40.92203  0.082416  2.223548
## 1          2          1          1 -71.65626 40.92207  0.050347  2.340658
## 2          3          1          2 -71.63251 40.92209  0.056074  2.440206
## 3          4          1          3 -71.60877 40.92212  0.042995  2.498072
## 4          5          1          4 -71.58502 40.92214  0.036595  2.540494
## 5          6          1          5 -71.56128 40.92215  0.038354  2.552859
##   latitude_2 longitude1         x         y distancelandkm cellaream
## 0     40.925   -71.6750 -28.56647 -27.57289       71.51344     4e+06
## 1     40.925   -71.6625 -26.57151 -27.56911       76.54190     4e+06
## 2     40.925   -71.6375 -24.57647 -27.56578       81.77255     4e+06
## 3     40.925   -71.6125 -22.58151 -27.56300       81.74481     4e+06
## 4     40.925   -71.5875 -20.58648 -27.56077       81.65078     4e+06
## 5     40.925   -71.5625 -18.59145 -27.55911       82.07670     4e+06
##   chl_fall chl_winter chl_spring chl_summer gchl_winter gchl_spring
## 0 2.449310   2.251554   2.457058   1.263831    2.098199    2.337756
## 1 2.456197   2.339274   2.354573   1.281547    2.182396    2.233763
## 2 2.149591   2.934982   2.307914   1.075004    2.540287    2.192996
## 3 2.136389   2.824368   2.576211   0.887268    2.430343    2.412229
## 4 1.814823   2.608232   2.541746   0.872087    2.357794    2.397891
## 5 2.518508   2.439878   2.220565   0.851456    2.329046    2.083287
##   gchl_summer gchl_fall    depthm     depth    slope       fcpi gchl_long
## 0   1.1652222  2.275435 -50.41504 -165.4037 0.127724 0.03333333  1.647738
## 1   1.1777266  2.264136 -50.42341 -165.4311 0.033266 0.03333333  1.607348
## 2   1.0211105  2.099305 -51.29684 -168.2967 0.058634 0.01666667  1.542477
## 3   0.8770048  2.075592 -52.78529 -173.1801 0.020804 0.04166667  1.569906
## 4   0.8577625  1.777506 -53.22737 -174.6305 0.067070 0.02500000  1.548057
## 5   0.8364021  2.257372 -54.21326 -177.8650 0.061565 0.02500000  1.458858
##   width height
## 0     2      2
## 1     2      2
## 2     2      2
## 3     2      2
## 4     2      2
## 5     2      2

Estimating abundance

sum(predict(loon.model.hr, newdata=pred, off.set=pred$cellaream))
## [1] 5844.445

Mapping abundance

Estimating uncertainty

summary(dsm.var.gam(loon.model.hr, pred, off.set=pred$cellaream))
## Summary of uncertainty in a density surface model calculated
##  analytically for GAM, with delta method
## 
## Approximate asymptotic confidence interval:
##       5%     Mean      95% 
## 4419.042 5844.445 7729.626 
## (Using delta method)
## 
## Point estimate                 : 5844.445 
## Standard error                 : 809.3481 
## CV of detection function       : 0.03710214 
## CV from GAM                    : 0.1384816 
## Total coefficient of variation : 0.1434

Mapping uncertainty

plot(dsm.var.gam(loon.model.hr, split(pred, 1:nrow(pred)),
                 off.set=pred$cellaream),
                 observations=FALSE,
                 gg.grad=scale_fill_gradient())

plot of chunk unnamed-chunk-20

Conclusions

Conclusion

Acknowledgements

Funding from Rhode Island Department of Natural Resources.

Thanks!

Slides available at
http://converged.yt/talks/BTO-dsm/talk.html

References

Appendices

Appendix - REML

Appendix - Smoothing in less awkward regions

Miller and Kelly (in prep)

Appendix - Concurvity

\[ \text{Altitude} = f(x,y) + \epsilon \quad \text{or} \quad \text{Chlorophyll A} = f(\text{SST}) + \epsilon \]

Autocorrelation