In the last chapter we looked at how we could model the probability of an animal being detected using the distances to the animals from the transect. A little thought leads us to believe that in addition to distance, other factors will influence the detection probability. For example: a pod of 50 whales will be much easier to spot than a single whale; it’s harder to see birds early in the morning when it’s dark and misty; and it’s much more difficult to see (or do) anything when the sea is extremely choppy. We’d like to take these factors into account when we’re building our detection function, to ensure that we can model the detection process as accurately as possible.

Distance sampling methods have a property known as *pooling robustness* which means that if we assume that the heterogeneity in detectability is relatively small (i.e. there’s not much variation between animals in how detectable they are), then estimates of abundance will not be biased (though they might have high variance; Burnham et al. (2004), Section 11.12). This means that usually we do not need to worry about bias but if we can model the heterogeneity in detectability then we can improve the precision of our abundance estimates (Marques et al., 2007).

The plot indicates that there is some effect of both of these covariates on the distances that we can observe dolphins at. As one might expect, the rougher the sea (i.e. higher Beaufort), the shorter distance one observes animals at (though there is an odd spike at sea state 2; left panel). For the group size, we see an increase in observed distance as the size increases (solid line is a linear fit through the data, the dashed line is a LOESS smooth (Cleveland, Grosse and Shyu, 1991) which highlights the steep increase at smaller group sizes (right panel). It’s worth bearing in mind that sea state and group size both affect the observed distances in different ways and the above plots are only taking two dimensional slices through their effects (e.g. small groups in choppy seas are probably not very visible but larger groups may be) – so the above plots shouldn’t be over-interpreted.

The above plots plus some simple physical reasoning lead us to believe that the covariates have an effect on the detection probability – how do we include the covariates in our models though?

In the half-normal detection function we looked at in the last chapter, we saw that the scale parameter (*σ*) was the only parameter of the model. Decreasing *σ* made the detection function go to zero more quickly. We might expect that the covariates had a similar effect: animals are visible at larger distances when their groups are bigger: changing the rate of the drop-off in the detection function. If we are to embed the covariate effects in the scale parameter, we could re-write the scale as:

$$
\sigma_i = \exp\left( \beta_0 + \sum_{k=1}^K \beta_k z_{ik} \right)
$$

now we’re not estimating the scale parameters, but rather the *β*_{k} parameters. *β*_{0} represents a kind of “intercept” (in the sense that it’s the value that the scale parameter will take when all covariates are zero; *β*_{k} for *k* = 1, …, *K* give the parameters to be estimates for the *K* covariates, which for a given observation, *i* are *z*_{ik}. Usually *K* is relatively small (we usually only include a couple of covariates in our model).

Now we’ve seen how covariates can be justified and included from a mathematical perspective, we can see how to actually fit such a model in `Distance2`

. Let’s start by fitting a model with group size as a covariate^{2} :

```
hn.dolphins.size <- ds(mexdolphins, truncation=8000, model=df(scale=~size))
summary(hn.dolphins.size)
```

```
## Summary of fitted detection function
## Transect type : line
## Number of observations : 47
## Distance range : 0 - 8000
## AIC : 836.0817
##
## Detection function : Half-normal
##
## Detection function parameters
## Estimate SE
## (Intercept) 6.91408348 0.5091799
## size 0.03947174 0.0193140
##
## Kolmogorov-Smirnov p-value : 0.8957201
## Cramer-von Mises p-value : 0.8579195
##
## Estimate SE CV
## Average p 0.6601482 0.1228513 0.1860966
```

We can see that the summary now includes an additional row of parameters, giving the effect of the parameter for group size. One way to investigate the effect of adding the size covariate graphically by plotting the fitted model:

`plot(hn.dolphins.size)`

The right panel of the plot shows that larger groups are easier to detect than smaller ones.

Methods for including covariates in distance sampling detection functions have a long history (Beavers and Ramsey (1998) is an early contribution to this area). The beginning of a rigorous approach to such models is Marques and Buckland (2003), though Marques et al. (2007) is somewhat more accessible.

For the half-normal model we obtained an average detectability of 0.7231, where as now the average probability of detection is 0.6601. Though this is the *average* (across observed covariates)^{3} detectability, for a given group size we have a different value of the detectability, mathematically we can now view the probability of detection and the detection function as functions of the covariates^{4}. So now, for each observed value of a covariate we have a probability of detection. Mathematically:

$$
\hat{p_i} = \frac{1}{w}\int_0^w g(x; \boldsymbol{\beta}, \mathbf{z}_i) \text{d}x,
$$

we can look at the values of $\hat{p_i}$ for our dolphin model by calling the `predict`

method:

`predict(hn.dolphins.size)`

```
## [1] 0.3609790 0.9999242 0.9994546 0.9724441 0.8290257 0.7652063 0.6872192
## [8] 0.9999997 0.9960868 0.5988589 0.9999242 1.0000000 0.9999985 0.9999985
## [15] 1.0000000 0.9999242 0.9999242 0.3470822 0.3208042 0.2850027 0.3470822
## [22] 0.7652063 0.6872192 0.7652063 0.3470822 0.3084022 0.4216515 0.9999895
## [29] 0.5074922 0.9960868 0.9987996 0.4216515 0.6872192 0.9148433 0.8052838
## [36] 0.9987996 0.9411026 0.3470822 0.6872192 0.9999895 0.8290257 0.9991908
## [43] 0.9999985 0.9960868 0.9873138 0.9873138 0.9724441
```

where probabilities are in order of the observations. It’s easier to look at the quantiles:

`quantile(predict(hn.dolphins.size))`

```
## 0% 25% 50% 75% 100%
## 0.2850027 0.6430390 0.9411026 0.9996894 1.0000000
```

in both preceding results we see there is a wide range of probabilities of detection. These differences will become important later in Abdundance estimation.

We can look again at the question asked at the start of this chapter and ask how does group size affect probability of detection?

`plot(mexdolphins$size[!is.na(mexdolphins$size)], predict(hn.dolphins.size), xlab="Group size", ylab="Probability of detection")`

We can add the Beaufort sea state covariate into the model, alongside group size by simply adding it into the formula. Before doing so we ensure that sea state is a factor covariate:

```
mexdolphins$beaufort <- as.factor(mexdolphins$beaufort)
hn.dolphins.size.beaufort <- ds(mexdolphins, truncation=8000, model=df(scale=~size+beaufort))
```

```
## Warning in ds(mexdolphins, truncation = 8000, model = df(scale = ~size + :
## Unable to invert Hessian: using pseduo-inverse
```

`summary(hn.dolphins.size.beaufort)`

```
## Summary of fitted detection function
## Transect type : line
## Number of observations : 47
## Distance range : 0 - 8000
## AIC : 838.1266
##
## Detection function : Half-normal
##
## Detection function parameters
## Estimate SE
## (Intercept) 6.59546436 1.330671e+00
## size 0.04059144 3.299831e-02
## beaufort2 8.29262592 9.043820e+06
## beaufort3 0.51333594 8.863909e-01
## beaufort4 0.42457606 1.364223e+00
## beaufort5 -0.85601876 8.281355e-01
##
## Kolmogorov-Smirnov p-value : 0.8545092
## Cramer-von Mises p-value : 0.908566
##
## Estimate SE CV
## Average p 0.5717546 0.1711626 0.2993637
```

As Beaufort sea state is included as a factor covariate, which gives us a number of extra parameters in the model. We can see the effect of adding the covariate in a plot:

`plot(hn.dolphins.size.beaufort)`

The middle plot above shows that there is little difference in the lower sea states, but a considerable difference once we get to sea state 5, when we see a much faster decrease in probability of detection.

Just as when deciding between the half-normal and hazard-rate detection functions in the previous chapter, we can use the AIC to evaluate relative model fit and goodness of fit testing/Q-Q plots to evaluate absolute model fit.

We can see some improvement in the Q-Q plot results between the model with and without the group size covariate in the following plot:We can compare the models that we’ve fitted so far in a table, but we first fit a model with only Beaufort sea state as a covariate, so complete the set of model combinations:

`hn.dolphins.beaufort <- ds(mexdolphins, truncation=8000, model=df(scale=~beaufort))`

```
## Warning in ds(mexdolphins, truncation = 8000, model = df(scale =
## ~beaufort)): Unable to invert Hessian: using pseduo-inverse
```

`summary(hn.dolphins.beaufort)`

```
## Summary of fitted detection function
## Transect type : line
## Number of observations : 47
## Distance range : 0 - 8000
## AIC : 846.0192
##
## Detection function : Half-normal
##
## Detection function parameters
## Estimate SE
## (Intercept) 8.1816931 2.576122e-01
## beaufort2 8.7871511 1.225235e+07
## beaufort3 0.3839194 4.283888e-01
## beaufort4 -0.0619388 4.621588e-01
## beaufort5 0.3542033 5.984973e-01
##
## Kolmogorov-Smirnov p-value : 0.4006785
## Cramer-von Mises p-value : 0.5706475
##
## Estimate SE CV
## Average p 0.6693629 0.1268596 0.1895229
```

We can now build a table^{5} :

Model | Formula | # pars | P_{a} |
CV(P_{a}) |
AIC | |
---|---|---|---|---|---|---|

2 | Half-normal | ~size | 2 | 0.6601482 | 0.1860966 | 836.0817 |

4 | Half-normal | ~size + beaufort | 6 | 0.5717546 | 0.2993637 | 838.1266 |

1 | Half-normal | ~1 | 1 | 0.7230935 | 0.1895098 | 842.3143 |

3 | Half-normal | ~beaufort | 5 | 0.6693629 | 0.1895229 | 846.0192 |

Using the table, we can see that the best model by AIC is the one with only size included as a covariate. Also note that the AIC-best model also has the lowest coefficient of variation (CV(*P*_{a}); in general we expect covariate models to be more precise Marques et al. (2007)).

In this chapter we’ve seen how adding covariates to the detection function can account for heterogeneity in the detection probability. Covariate models make intuitive sense too, since we do not expect all individuals (or groups) in the population to have the same detectability at a given distance. This intuitive interpretation allows for direct interpretation of covariates, which may be biologically interesting. The ability to include covariates in an analysis will be particularly useful in Adundance estimation, where we will investigate stratification of abundance estimates by covariate.

Beavers, S. and Ramsey, F. (1998) Detectability analysis in transect surveys. *Journal of Wildlife Management*, 948–957.

Burnham, K.P., Buckland, S.T., Laake, J.L., Borchers, D.L., Marques, T.A., Bishop, J.R., et al. (2004) Further topics in distance sampling. *Advanced distance sampling* (eds S.T. Buckland), D.R. Anderson), K.P. Burnham), J.L. Laake), D.L. Borchers), & L. Thomas), Oxford University Press.

Cleveland, W.S., Grosse, E. and Shyu, M.W. (1991) Local regression models. *Statistical models in s* (eds J.M. Chambers), & T.J. Hastie), pp. 309–376. Pacific Grove: Wadsworth; Brooks.

Marques, F. and Buckland, S.T. (2003) Incorporating covariates into standard line transect analyses. *Biometrics*, **59**, 924–935.

Marques, T.A., Thomas, L., Fancy, S. and Buckland, S.T. (2007) Improving estimates of bird density using multiple-covariate distance sampling. *The Auk*, **124**, 1229–1243.

The scale actually goes up to 12 (“hurricane”) and beyond, see https://en.wikipedia.org/wiki/Beaufort_scale and http://www.metoffice.gov.uk/media/pdf/4/4/Fact_Sheet_No._6_-_Beaufort_Scale.pdf for more information. One may often find that Beaufort sea state is recorded as decimals (state of 4.5 for example), it’s tricky to work out what to do with these and is worth consulting those who collected the data.↩

Note that a covariate called

`size`

has a special meaning in`Distance2`

as it is used in the estimation of abundance via the Horvitz-Thompson estimator (see Adundance estimation).↩Note that when we refer to

*P*_{a}as the “average detection probability”. The average is calculated over the*observations*. We used the same term for*P*_{a}in the previous chapter when talking about non-covariate models to be consistent with what is reported in`Distance2`

. See Abundance estimation for more information on how this is calculated.↩Strictly speaking we can write the probability of detection as a function of the covariates: $\hat{p_i}(z_i)$, though the shorter

*p*_{i}form is used to save space. Hopefully the subscript will be a sufficient reminder.↩More details on how this was done in Improving the fit of detection functions.↩