Zebra

In this example you will load in some zebra location data and determine how their ranges overlap.

Data

The data originated from MoveBank

But I have cleaned it slightly for this workshop. Read the data file in - it is a shapefile of point data.

require(rgdal)
zebras = readOGR("./Data", "zebra")
## OGR data source with driver: ESRI Shapefile 
## Source: "./Data", layer: "zebra"
## with 34034 features and 3 fields
## Feature type: wkbPoint with 2 dimensions
summary(zebras)
## Object of class SpatialPointsDataFrame
## Coordinates:
##               min     max
## coords.x1 36.6111 37.0311
## coords.x2  0.2257  0.7166
## Is projected: FALSE 
## proj4string :
## [+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0]
## Number of points: 34034
## Data attributes:
##                timestamp                 species        individual   
##  2007-06-17 06:00:00:    8   Equus burchellii: 5865   GZ4    :11233  
##  2007-06-17 07:00:00:    8   Equus grevyi    :28169   GZ2    : 6155  
##  2007-06-17 08:00:00:    8                            PZ2    : 5055  
##  2007-06-17 09:00:00:    8                            GZ7    : 4198  
##  2007-06-17 10:00:00:    8                            GZ3    : 2975  
##  2007-06-17 11:00:00:    8                            GZ1    : 1773  
##  (Other)            :33986                            (Other): 2645
table(zebras$individual)
## 
##   GZ1   GZ2   GZ3   GZ4   GZ5   GZ6   GZ7   PZ1   PZ2 
##  1773  6155  2975 11233   534  1301  4198   810  5055

This tells us we have records for two species and 9 individuals - 7 Grevy's Zebra, and two Plains (Burchelli) zebra. We also see it is in lat-long projection. Let's change to Google Mercator and get a background image.

zebras = spTransform(zebras, CRS("+init=epsg:3857"))
summary(zebras)
## Object of class SpatialPointsDataFrame
## Coordinates:
##               min     max
## coords.x1 4075530 4122278
## coords.x2   25125   79776
## Is projected: TRUE 
## proj4string :
## [+init=epsg:3857 +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0
## +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null
## +no_defs]
## Number of points: 34034
## Data attributes:
##                timestamp                 species        individual   
##  2007-06-17 06:00:00:    8   Equus burchellii: 5865   GZ4    :11233  
##  2007-06-17 07:00:00:    8   Equus grevyi    :28169   GZ2    : 6155  
##  2007-06-17 08:00:00:    8                            PZ2    : 5055  
##  2007-06-17 09:00:00:    8                            GZ7    : 4198  
##  2007-06-17 10:00:00:    8                            GZ3    : 2975  
##  2007-06-17 11:00:00:    8                            GZ1    : 1773  
##  (Other)            :33986                            (Other): 2645
require(dismo)
bg = gmap(extent(zebras))

Now we can plot the zebra locations

plot(bg)
plot(zebras, pch = 19, col = "#00000020", add = TRUE)
plot of chunk plotzmap

Now, lets choose nine colours and map them. The ColorBrewer system provides good colours for graphics, so we'll use 9 from one of its categorical colour palettes.

require(RColorBrewer)
## Loading required package: RColorBrewer
palette(brewer.pal(9, "Set1"))
plot(bg)
plot(zebras, col = zebras$individual, add = TRUE)
legend("left", fill = brewer.pal(9, "Set1"), legend = levels(zebras$individual), 
    bg = "white", inset = 0.1)
plot of chunk unnamed-chunk-3

Range Estimation

First, lets try plotting the bounding box of each zebra.

  • Split the SpatialPointsDataFrame into nine SPDFs based on the identifier
  • Use a nice palette
  • Plot the background map
  • Add the zebra point
  • Loop over the split list and plot the bounding boxes using the extent function from the raster package
splitz = split(zebras, zebras$individual)

palette(brewer.pal(9, "Set1"))

plot(bg)
plot(zebras, col = "black", cex = 0.5, add = TRUE)

require(raster)
for (i in 1:9) plot(extent(splitz[[i]]), add = TRUE, col = i, lwd = 5)
title("Bounding box ranges")
plot of chunk unnamed-chunk-4

Bounding boxes are a poor estimate of an animal's range. Lets try the convex hull. This is the smallest convex polygon that encloses a set of points. For example:

plot of chunk chull

So let's compute the convex hull for each zebra and plot. The procedure is:

  • The rgeos package for its gConvexHull function.
  • A loop over each of the sets of points
  • Save each convex hull in a list
  • Loop over the hulls to plot them in colour.
palette(brewer.pal(9, "Set1"))
require(rgeos)

chulls = list()
for (i in 1:9) {
    chulls[[i]] = gConvexHull(splitz[[i]])
}
names(chulls) = names(splitz)

plot(bg)
plot(zebras, col = "black", cex = 0.5, add = TRUE)
for (i in 1:9) {
    plot(chulls[[i]], border = i, lwd = 4, add = TRUE)
}
plot of chunk unnamed-chunk-5

Now some analysis. I want to see how much of each zebra's range overlaps with the other zebras. I can compute the area of a spatial object using gArea and compute intersections of spatial objects using gIntersection, both from the rgeos package.

The output is going to be a 9x9 matrix. Each row is the percentage of that zebra's range that overlaps with the zebra in that column. Obviously the diagonal will be 100% since every range is an exact overlap with itself.

# empty matrix to fill
zm = matrix(0, 9, 9)
# loop over this zebra
for (i in 1:9) {
    # compute area of its range
    areaI = gArea(chulls[[i]])
    # loop over other zebra
    for (j in 1:9) {
        # find intersection region
        overlap = gIntersection(chulls[[i]], chulls[[j]])
        # check if regions are distinct
        if (is.null(overlap)) {
            zm[i, j] = 0
        } else {
            # compute percentage of this zebra's area
            zm[i, j] = round(100 * gArea(overlap)/areaI)
        }
    }
}
# make nice labels
rownames(zm) = names(splitz)
colnames(zm) = names(splitz)
zm
##     GZ1 GZ2 GZ3 GZ4 GZ5 GZ6 GZ7 PZ1 PZ2
## GZ1 100 100   0   0  74  35  48   0  18
## GZ2  28 100   0   0  25  12  15   0   7
## GZ3   0   0 100 100   0   0   0  81   0
## GZ4   0   0  20 100   0   0   0  34   0
## GZ5  81  97   0   0 100  44  47   0  21
## GZ6  78  99   0   0  91 100  45   0  48
## GZ7  90 100   0   0  81  37 100   0   2
## PZ1   0   0  25  52   0   0   0 100   0
## PZ2  20  27   0   0  22  25   1   0 100

As an exercise, turn that code into a function. Think about the inputs and outputs.

Conclusions

What does this matrix tell us? The first zebra, GZ1, has a range that is enclosed totally within GZ2, but is completely different to GZ3 and GZ4.

The two plains zebras, PZ1 and PZ2, have non-overlapping ranges. We can plot these conclusions to check, and also label the ranges at the region centroid.

plot(bg)
plot(chulls[["GZ1"]], add = TRUE, col = "#00FF0080")
plot(chulls[["GZ2"]], add = TRUE, col = "#FF00FF80")
plot(chulls[["PZ1"]], add = TRUE, col = "#FF000080")
plot(chulls[["PZ2"]], add = TRUE, col = "#0000FF80")
text(coordinates(chulls[["GZ1"]]), "GZ1")
text(coordinates(chulls[["GZ2"]]), "GZ2")
text(coordinates(chulls[["PZ2"]]), "PZ2")
text(coordinates(chulls[["PZ1"]]), "PZ1")
plot of chunk zebrahulls

Merging All The Ranges

We can combine regions using the gUnion function. It takes two arguments and returns a merged object. We have 9 regions to merge. One way to do that would be a number of nested calls like this:

allRanges = gUnion(chulls[[1]], gUnion(chulls[[2]], gUnion(chulls[[3]], gUnion(chulls[[4]], 
    gUnion(chulls[[5]], gUnion(chulls[[6]], gUnion(chulls[[7]], gUnion(chulls[[8]], 
        chulls[[9]]))))))))

Luckily R has an easier way to do this, using the Reduce function:

allRanges = Reduce(gUnion, chulls)
plot(bg)
plot(allRanges, add = TRUE)
plot of chunk unnamed-chunk-7

This shows us that the zebras' ranges fall into two non-overlapping ranges

If we rearrange the matrix we can see the division into two communities more clearly:

zm[c(1, 2, 5, 6, 7, 9, 3, 4, 8), c(1, 2, 5, 6, 7, 9, 3, 4, 8)]
##     GZ1 GZ2 GZ5 GZ6 GZ7 PZ2 GZ3 GZ4 PZ1
## GZ1 100 100  74  35  48  18   0   0   0
## GZ2  28 100  25  12  15   7   0   0   0
## GZ5  81  97 100  44  47  21   0   0   0
## GZ6  78  99  91 100  45  48   0   0   0
## GZ7  90 100  81  37 100   2   0   0   0
## PZ2  20  27  22  25   1 100   0   0   0
## GZ3   0   0   0   0   0   0 100 100  81
## GZ4   0   0   0   0   0   0  20 100  34
## PZ1   0   0   0   0   0   0  25  52 100

Exercises

Plotting By Species

The $species field of the zebras data frame gives the species. Make some maps based on species.

plot of chunk unnamed-chunk-9

Alpha Shapes

You could also use alpha shapes, which are concave, but require a scale parameter, alpha. Also, if the input has coincident points, the algorithm can't cope. To work round that we add a tiny bit of 'jitter' to the coordinates.

Here's some alpha shapes for the first zebra:

require(alphahull)
par(mfrow = c(1, 3))
plot(ashape(jitter(coordinates(splitz[[1]])), alpha = 10000), asp = 1)
plot(ashape(jitter(coordinates(splitz[[1]])), alpha = 5000), asp = 1)
plot(ashape(jitter(coordinates(splitz[[1]])), alpha = 1000), asp = 1)
plot of chunk alphahull
par(mfrow = c(1, 1))

See what you get for some of the other zebras, and try and produce this map:

plot of chunk unnamed-chunk-10