# Functions

Functions are important in R - its a functional language

## Usage

A function has arguments, returns a result.

```> foo(1, 2, 3)
```
``` 5
```

## Simple argument matching

```> args(foo)
```
```function (x, a, b = 1)
NULL
```
```> foo()
```
```Error: 'a' is missing
```
```> foo(1)
```
```Error: 'a' is missing
```
```> foo(1, 2)
```
``` 3
```
```> foo(1, 2, 3)
```
``` 5
```
```> foo(1, 2, 3, 4)
```
```Error: unused argument(s) (4)
```

## Names

```> foo(1, 2, 3)
```
``` 5
```
```> foo(b = 3, a = 2, x = 1)
```
``` 5
```

## Seeing the source

```> foo
```
```function(x,a,b=1){a*x+b*x}
```
```> sqrt
```
```function (x)  .Primitive("sqrt")
```

```> foo = function(x, a, b = 1) {
+     a * x + b * x
+ }
> foo(1, 2, 3)
```
``` 5
```
```> zoo = foo
> zoo(1, 2, 3)
```
``` 5
```
```> rm(zoo)
> sqrt = foo
> sqrt(2)
```
```Error: 'a' is missing
```
```> rm(sqrt)
```

## Vector arguments

```> foo(1:10, b = 3, a = 2)
```
```   5 10 15 20 25 30 35 40 45 50
```
```> foo(1:10, b = c(0, 2), a = 2)
```
```   2  8  6 16 10 24 14 32 18 40
```
 x 1 2 3 4 5 6 7 8 9 10 a 2 2 2 2 2 2 2 2 2 2 b 0 2 0 2 0 2 0 2 0 2
```> foo(1:9, b = c(2, 3), a = 2)
```
```Warning: longer object length is not a multiple of
shorter object length
```
```  4 10 12 20 20 30 28 40 36
```
 x 1 2 3 4 5 6 7 8 9 a 2 2 2 2 2 2 2 2 2 b 2 3 2 3 2 3 2 3 2

## Help!

`help(sqrt)` or `?sqrt`
```MathFun                  package:base                  R Documentation

Miscellaneous Mathematical Functions

Description:

These functions compute miscellaneous mathematical functions.  The
naming follows the standard for computer languages such as C or
Fortran.

Usage:

abs(x)
sqrt(x)

Arguments:

x: a numeric or ‘complex’ vector or array.

Details:
....
References:
....
....
Examples:
require(stats) # for spline
require(graphics)
xx <- -9:9
plot(xx, sqrt(abs(xx)),  col = "red")
lines(spline(xx, sqrt(abs(xx)), n=101), col = "pink")

```
```> example(sqrt)
```
```sqrt> require(stats) # for spline

sqrt> require(graphics)

sqrt> xx <- -9:9

sqrt> plot(xx, sqrt(abs(xx)),  col = "red")
``` ```sqrt> lines(spline(xx, sqrt(abs(xx)), n=101), col = "pink")
```

## More Help

### From R

```> help("thing")
> help.search("thing")
> RSiteSearch("thing")
> help.start()
```

### Online

RDocumentation.org ## Its all functions

```> x=3;y=4
> "+"("^"(x,2),"^"(y,2))
```
``` 25
```
```> x^2 + y^2
```
``` 25
```

## Scope

```> x = 2
> foo = function(z) {
+     z + x
+ }
> foo(10)
```
``` 12
```
```> x = 3
> foo(10)
```
``` 13
```

```> x
```
``` 3
```
```> foo = function(z) {
+     x = 99
+     z + x
+ }
> foo(10)
```
``` 109
```
```> x
```
``` 3
```

global x unaffected - this is good!

## To Change Something, Return It

```> d = data.frame(age = 1:10)
> classify = function(df) {
+     df\$old = df\$age > 5
+ }
> classify(d)  # doesn't change d
```

modify, return, assign

```> classify = function(df) {
+     df\$old = df\$age > 5
+     return(df)
+ }
> d = classify(d)
```

'pass by value'

## Packages (and libraries)

```> require(ggplot2)
```
```Loading required package: ggplot2
```
```Loading required package: methods
```
```> ggplot(quakes) + geom_point(aes(x = long, y = lat, color = mag))
``` ```> search()
```
```  ".GlobalEnv"        "package:ggplot2"
 "package:methods"   "package:knitr"
 "package:stats"     "package:graphics"
 "package:grDevices" "package:utils"
 "package:base"
```
```> ls("package:datasets", pat = "^[pq].*")
```
``` "precip"     "presidents" "pressure"   "quakes"
```
```> summary(quakes)
```
```      lat             long         depth
Min.   :-38.6   Min.   :166   Min.   : 40
1st Qu.:-23.5   1st Qu.:180   1st Qu.: 99
Median :-20.3   Median :181   Median :247
Mean   :-20.6   Mean   :179   Mean   :311
3rd Qu.:-17.6   3rd Qu.:183   3rd Qu.:543
Max.   :-10.7   Max.   :188   Max.   :680
mag          stations
Min.   :4.00   Min.   : 10.0
1st Qu.:4.30   1st Qu.: 18.0
Median :4.60   Median : 27.0
Mean   :4.62   Mean   : 33.4
3rd Qu.:4.90   3rd Qu.: 42.0
Max.   :6.40   Max.   :132.0
```
`help(quakes)`
```Locations of Earthquakes off Fiji

Description:

The data set give the locations of 1000 seismic events of MB >
4.0.  The events occurred in a cube near Fiji since 1964.

Format:

A data frame with 1000 observations on 5 variables.

[,1]  lat       numeric  Latitude of event
[,2]  long      numeric  Longitude
[,3]  depth     numeric  Depth (km)
[,4]  mag       numeric  Richter Magnitude
[,5]  stations  numeric  Number of stations reporting
```

## Flow

If everything is a function, how do you control flow?

```> mean(quakes\$mag)  # no need to loop
```
``` 4.62
```
```> m = matrix(1:12, 3, 4)
> m
```
```     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
```
```> apply(m, 1, sum)  # apply a function over rows...
```
``` 22 26 30
```
```> apply(m, 2, sum)  # ...or columns
```
```  6 15 24 33
```

## For loops

```> for (i in 1:5) {
+     print(i * 2)
+ }
```
``` 2
 4
 6
 8
 10
```

## If...

Conditional execution

```> testme = function(x) {
+     if (x > 0.5) {
+         return("yes")
+     } else {
+         return("no")
+     }
+ }
> testme(0)
```
``` "no"
```
```> testme(1)
```
``` "yes"
```
```> testme(c(0.2, 0.9))
```
```Warning: the condition has length > 1 and only the
first element will be used
```
``` "no"
```
```> testme = function(x) {
+     ifelse(x > 0.5, "yes", "no")
+ }
> testme(c(0.2, 0.9, 0.05, 1.2))
```
``` "no"  "yes" "no"  "yes"
```

## If you must...

Compute the mean magnitude for quakes below 500km...

```> magsum = 0
> ndeep = 0
> for (i in 1:nrow(quakes)) {
+     if (quakes[i, "depth"] > 500) {
+         magsum = magsum + quakes[i, "mag"]
+         ndeep = ndeep + 1
+     }
+ }
> magsum/ndeep
```
``` 4.55
```

BUT much better to do

```> mean(quakes\$mag[quakes\$depth > 500])
```
``` 4.55
```

Break that down to see how it works:

```> head(quakes\$depth)
```
``` 562 650  42 626 649 195
```
```> head(quakes\$mag)
```
``` 4.8 4.2 5.4 4.1 4.0 4.0
```
```> head(quakes\$depth > 500)
```
```  TRUE  TRUE FALSE  TRUE  TRUE FALSE
```
```> head(quakes\$mag[quakes\$depth > 500])
```
``` 4.8 4.2 4.1 4.0 4.3 4.4
```

## Plots

```> par(mfrow = c(2, 2))  # 2x2 plots
> hist(quakes\$depth)
> hist(quakes\$mag)
> plot(quakes\$mag, quakes\$depth)
> plot(quakes\$lat, quakes\$long, pch = 19, col = "red")
``` ```> par(mfrow = c(1, 1))
> plot(quakes)
``` ## Objects

A way of creating complex objects with complex behaviour...

Let's fit a linear model

```> lm.stack <- lm(stack.loss ~ stack.x)
```

What is `lm.stack`?

```> class(lm.stack)
```
``` "lm"
```

We say this is an object of class 'lm'. What can we do with it?

```> lm.stack
```
```Call:
lm(formula = stack.loss ~ stack.x)

Coefficients:
(Intercept)    stack.xAir.Flow
-39.920              0.716
stack.xWater.Temp  stack.xAcid.Conc.
1.295             -0.152
```
```> summary(lm.stack)
```
```Call:
lm(formula = stack.loss ~ stack.x)

Residuals:
Min     1Q Median     3Q    Max
-7.238 -1.712 -0.455  2.361  5.698

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)        -39.920     11.896   -3.36   0.0038
stack.xAir.Flow      0.716      0.135    5.31  5.8e-05
stack.xWater.Temp    1.295      0.368    3.52   0.0026
stack.xAcid.Conc.   -0.152      0.156   -0.97   0.3440

(Intercept)       **
stack.xAir.Flow   ***
stack.xWater.Temp **
stack.xAcid.Conc.
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.24 on 17 degrees of freedom
Multiple R-squared: 0.914,	Adjusted R-squared: 0.898
F-statistic: 59.9 on 3 and 17 DF,  p-value: 3.02e-09
```
```> par(mfrow = c(2, 2))
> plot(lm.stack)
``` ```> par(mfrow = c(1, 1))
```

Functions that are defined on objects are called 'methods'. They will usually operate differently on different classes of objects.

We've already seen examples of this

• `summary(quakes)` vs `summary(lm.stack)`
• `plot(quakes)` vs `plot(quakes\$mag,quakes\$depth)` vs `plot(lm.stack)`