Looking back at the first chapter, Introduction to distance sampling, we saw some simulations of animal populations. In these examples animals were distributed roughly uniformly across the survey area. In reality, the location of animals in space is driven by a variety of factors: climate, prey availability, ease of movement and more all play a role.

Taking another look at the pantropical spotted dolphins in the Gulf of Mexico, recall that the data were collected with spatial information (the location of each observation and the bathymetry at those points). We can investigate the relationship between these covariates and the observed counts by plotting the data in two ways. First, we simply look at histograms of counts with respect to covariate values:

We can see from the histograms that there are distinct peaks at particular values and avoidance of other values. For example, from this crude exploratory analysis we see that the dolphins tend to be observed near the centre of the survey area and appear to avoid shallow waters.
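As a rough sketch of how histograms like these can be built, the base R code below sums counts within bins of a covariate. The data frame `obs`, its columns `depth` and `count`, and the bin width are all hypothetical and simulated for illustration; they are not the dolphin data.

```r
# Simulated (hypothetical) observations: one row per detected group
set.seed(1)
obs <- data.frame(
  depth = runif(50, min = 100, max = 3000),  # depth (m) at each observation
  count = rpois(50, lambda = 5) + 1          # group size, at least 1
)

# Bin the covariate, then sum the counts that fall in each bin
breaks <- seq(0, 3000, by = 500)
obs$bin <- cut(obs$depth, breaks = breaks)
count_by_depth <- tapply(obs$count, obs$bin, sum)
count_by_depth[is.na(count_by_depth)] <- 0   # empty bins hold no animals

# Bar height is the total count observed in each depth bin
barplot(count_by_depth, xlab = "Depth (m)", ylab = "Count", las = 2)
```

Summing counts per bin, rather than tallying rows, means that a single large group contributes its full size to the bar.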

The one-dimensional slices offered by the histograms are useful, but don’t tell the full story about what’s happening in these data. So, we plot the observations over a map of the depth values:

Also plotted (in red) are the transect lines that the Oregon II travelled. This shows that the survey had relatively good coverage of the survey area in space, but poorer coverage in terms of the depth covariate.
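A map of this kind can be sketched in base R by drawing the covariate surface first and then overlaying the transects and observations. Everything in the example below is an assumption made for illustration: the grid, the depth surface, the transect positions and the observations are all simulated.

```r
# Hypothetical survey region on a 50 x 50 grid; depth increases with y
x <- seq(0, 100, length.out = 50)
y <- seq(0, 100, length.out = 50)
depth <- outer(x, y, function(x, y) 50 + 25 * y + 5 * sin(x / 10))

# Hypothetical north-south transect lines, with observations along them
transect_x <- seq(10, 90, by = 20)
set.seed(2)
obs <- data.frame(
  x = sample(transect_x, 30, replace = TRUE),
  y = runif(30, min = 0, max = 100),
  count = rpois(30, lambda = 5) + 1
)

# Depth surface first, then transects (red), then observations,
# with symbol size increasing with the observed count
image(x, y, depth, col = terrain.colors(20),
      xlab = "Easting", ylab = "Northing")
segments(transect_x, 0, transect_x, 100, col = "red")
points(obs$x, obs$y, cex = sqrt(obs$count) / 2, pch = 1)
```

Comparing the range of `depth` under the red lines with its range over the whole grid is a quick way to judge coverage in covariate space as well as in geographic space.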

From the above plots we can see that there is some relationship between the observed counts of pantropical spotted dolphins and the covariates we’ve collected on location and depth.

The aim of this section is to talk about how to model this relationship. Before we go into the details of the models we’ll use, let’s first think about why one might want to do such an analysis.

Why go through all the fuss?

There are a number of reasons to model the distributions of biological populations explicitly in space:

• First and foremost, as stated above, animals do not generally have uniform distributions in space and their locations are largely dictated by biotic and abiotic environmental variables. This becomes problematic for abundance estimation as the study area becomes bigger or the relationship between distribution and environmental factors becomes more complex. In order to create realistic models we should take into account as many factors as possible that influence the distribution and abundance of the population in question.
• Since we then model abundance on a smaller spatial scale (than when using the Horvitz-Thompson estimator we saw in Estimating abundance), we can also expect more precise estimates of abundance.
• Modelling abundance as a function of not only spatial location, but also other environmental covariates allows us to make ecological inference about the population – what attracts a population to an area, what repels it?
• Non-quantitatively-minded people find looking at maps much more compelling than simply a number (or worse, a table of numbers¹). Displaying maps to a non-statistical audience can be an effective way to get their attention and have them engage with the modelling process (especially in terms of model checking and criticism, which can often seem like magic to non-statisticians).

Often the first and last reasons above dominate people’s motivation for building spatially explicit models: they need to know where animals are and they want this information to be accessible to others.
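For comparison with the second reason above, a Horvitz-Thompson-like estimate gives a single number for the whole study area. The sketch below uses the usual form of that estimator (each observed group inflated by its estimated detection probability, then scaled from covered area to study area). All values and variable names are hypothetical, and the notation may differ from that used in Estimating abundance.

```r
# Horvitz-Thompson-like abundance estimate (all numbers hypothetical)
A <- 5000                              # study area (km^2)
a <- 400                               # area covered by the transects (km^2)
s <- c(12, 3, 40, 7, 25)               # observed group sizes
p_hat <- c(0.6, 0.5, 0.8, 0.55, 0.7)   # estimated detection probabilities

# Inflate each group by 1 / p_hat to account for groups that were missed,
# then scale up from the covered area to the whole study area
N_covered <- sum(s / p_hat)
N_hat <- (A / a) * N_covered
N_hat
```

The result says nothing about where in the study area the animals are, which is the gap that a spatially explicit model is meant to fill.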

Model-based analysis

The major contrast between the approach detailed in this chapter and that of the previous chapters is that we now consider model-based inference about the biological populations in question rather than design-based inference. This has some advantages and some disadvantages. A spatially-explicit model can explain the between-transect variation (which is often a large component of the variance in design-based estimates) and so using a model-based approach can lead to smaller variance in estimates of abundance than design-based estimates.
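The idea that a model can absorb between-transect variation can be illustrated with a small simulation. The sketch below uses a simple Poisson GLM as a stand-in, not the models used in later chapters, and every quantity in it (transect depths, effort, the strength of the depth effect) is an assumption made for illustration.

```r
set.seed(3)
n_transects <- 20
depth <- seq(100, 2000, length.out = n_transects)  # hypothetical mean depth (m)
effort <- rep(10, n_transects)                     # hypothetical line length (km)

# Simulated counts per transect: density increases with depth
count <- rpois(n_transects, lambda = effort * exp(-1 + 0.001 * depth))

# Ignoring the covariate: a single encounter rate for all transects
m0 <- glm(count ~ 1, family = poisson, offset = log(effort))

# Stand-in for a spatially explicit model: depth explains the variation
m1 <- glm(count ~ depth, family = poisson, offset = log(effort))

# Residual deviance falls once the between-transect variation is modelled
c(constant = deviance(m0), with_depth = deviance(m1))
```

In the first model all differences between transects are treated as unexplained noise; in the second, much of that variation is attributed to depth, leaving less residual variation to inflate the uncertainty.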

• more here coverage prob., incidental surveys etc

Recap

This chapter looked briefly at the Gulf of Mexico data again, showing that there are spatial elements to the data. If abundance is non-uniform with respect to spatial or environmental covariates we should model this variation to ensure the most precise estimates of abundance. The next few chapters will explain how this is possible using the R package dsm.

• Elith and Leathwick (2009) provide a review of species distribution modelling, including history and broad conceptual overview.
• Warren (2012), McInerny and Etienne (2013) and Sillero (2011) discuss the nomenclature of the spatial modelling of animal distributions (sometimes called “niche modelling” or “species distribution modelling”).

References

Elith, J. and Leathwick, J.R. (2009) Species Distribution Models: Ecological Explanation and Prediction Across Space and Time. Annual Review of Ecology, Evolution, and Systematics, 40, 677–697.

Gelman, A. (2011) Why Tables Are Really Much Better Than Graphs. Journal of Computational and Graphical Statistics, 20, 3–7.

McInerny, G.J. and Etienne, R.S. (2013) ‘Niche’ or ‘distribution’ modelling? A response to Warren. Trends in Ecology & Evolution, 28, 191–192.

Sillero, N. (2011) What does ecological modelling model? A proposed classification of ecological niche models based on their underlying methods. Ecological Modelling, 222, 1343–1346.

Warren, D.L. (2012) In defense of ‘niche modeling’. Trends in Ecology & Evolution, 27, 497–500.

1. For more (light-hearted) discussion on graphs vs. tables see Gelman (2011).