Functional programming in R

David L Miller

St Andrews useR group, 12 December 2013

If we wish to count lines of code, we should not regard them as “lines produced” but as “lines spent”

Edsger W. Dijkstra

This is not a talk about writing fast code

This is a talk about writing code for humans to read

This is a talk about writing code for humans to maintain

This is a talk about writing less code

Why is programming terrible?

Obfuscation

Duplication

What is “functional” programming?

Getting closer to the mathematics

sum(x-mean(x))^2

looks like

\(\left(\sum_{i=1}^n \left(x_i-\frac{\sum_{j=1}^n x_j}{n}\right)\right)^2\)

Compare this to:

x.mean <- mean(x)

result <- 0

for(i in seq_along(x)){
  result <- result + x[i]-x.mean
}

result <- result^2

Focus is on what we do, not on how it is implemented

Four examples of functional ideas

You’re probably already doing this…

data.mean <- apply(data, 1, mean)

(A function as an argument to a function)

Instead of

data.mean <- numeric(nrow(data))

for(i in seq_len(nrow(data))){
  data.mean[i] <- mean(data[i,])
}
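
A small check that the two versions agree. The 3 x 4 matrix here is made up purely for illustration; the talk does not say what data contains.

```r
# Hypothetical 3 x 4 matrix, used only for illustration
data <- matrix(1:12, nrow = 3)

# Functional version: mean is passed as an argument to apply
via.apply <- apply(data, 1, mean)

# Loop version: preallocate, then fill in one row at a time
via.loop <- numeric(nrow(data))
for(i in seq_len(nrow(data))){
  via.loop[i] <- mean(data[i,])
}

# Both approaches should give the same row means
all.equal(via.apply, via.loop)
```

Same answer, but the apply version says what is computed in one line and has no index bookkeeping to get wrong.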

What about this?

apply(data, 1, function(x) sum(x^2))

(Anonymous functions)

Instead of

sum.sq <- function(x) sum(x^2)
apply(data, 1, sum.sq)

What about?

my.stats <- list(mean,sd,median)

lapply(my.stats,function(f,x) apply(x,1,f), x=data)

(List of functions; an example of a functional)

Rather than

apply(data,1,mean)
apply(data,1,sd)
apply(data,1,median)
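
A possible variation on the list-of-functions idea, assuming a small made-up matrix: naming the list elements and using sapply gives one labelled column per statistic. This is a sketch, not the version shown in the talk.

```r
# Hypothetical 2 x 4 matrix, used only for illustration
data <- matrix(c(2, 4, 4, 6,
                 1, 3, 5, 9), nrow = 2, byrow = TRUE)

# Named list of functions: the names become column labels
my.stats <- list(mean = mean, sd = sd, median = median)

# Apply every statistic to every row; sapply binds the results into a matrix
row.stats <- sapply(my.stats, function(f) apply(data, 1, f))

row.stats
```

Adding another statistic means adding one entry to my.stats, not another copy of the apply line.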

A little more esoteric

sum.poly <- function(order){
  function(x) sum(x^order)
}

sum.sq <- sum.poly(2)

apply(data, 1, sum.sq)

(Closure)

Rather than

sum.sq <- function(x) sum(x^2)

apply(data, 1, sum.sq)

First class functions

“First class” functions can be…
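
Going by the four examples earlier in the talk, a sketch of what treating functions as ordinary values looks like in R. The names square, ops and compose are hypothetical, introduced only for this illustration.

```r
# Assigned to a name, like any other value
square <- function(x) x^2

# Stored in a list
ops <- list(sq = square, root = sqrt)

# Passed as an argument to another function
roots <- sapply(c(1, 4, 9), ops$root)

# Returned from a function (compose is a hypothetical helper)
compose <- function(f, g) function(x) g(f(x))
sq.then.root <- compose(square, sqrt)

sq.then.root(-3)  # sqrt((-3)^2), i.e. 3
```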

Closures

Okay, what happened in that last example?

sum.poly <- function(order){
  function(x) sum(x^order)
}

sum.sq <- sum.poly(2)

apply(data, 1, sum.sq)

sum.sq encloses the environment created by the call sum.poly(2), in which order is 2

> as.list(environment(sum.sq))
$order
[1] 2
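
To make the enclosing environment more concrete, a short sketch: each call to sum.poly creates a fresh environment, so two functions built by the same factory keep their own value of order. sum.cube is a name added here for illustration.

```r
sum.poly <- function(order){
  function(x) sum(x^order)
}

# Two closures from the same factory, each with its own environment
sum.sq   <- sum.poly(2)
sum.cube <- sum.poly(3)

sum.sq(1:3)    # 1 + 4 + 9
sum.cube(1:3)  # 1 + 8 + 27

# Each closure remembers the order it was created with
environment(sum.sq)$order
environment(sum.cube)$order
```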

“Function factories”

When is this actually useful?

Likelihoods

ll.point <- function(pars,data,c.vec){
  ll(pars,data)*c.vec
}

ll.line <- function(pars,data){
  ll(pars,data)
}

obj.fcn <- function(pars,data,c.vec,transect){
  if(transect=="point"){
    ll.point(pars,data,c.vec)
  }else if(transect=="line"){
    ll.line(pars,data)
  }else{
    stop("transect must be 'point' or 'line'")
  }
}

Compare to

obj.fcn <- function(transect){
  function(pars,data,c.vec){
    # function stuff
    if(transect=="point"){
      # ...
    }else if(transect=="line"){
      # ...
    }else{
      # ...
    }
  }
}
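
One way the factory could be pushed a little further is to choose the branch once, when the objective function is built, so the returned function contains no if at all. This is a sketch under assumptions: ll below is a hypothetical stand-in (the talk does not define it), and make.obj.fcn is a name made up for this example.

```r
# Hypothetical stand-in for the talk's ll(); not a real likelihood
ll <- function(pars, data) dnorm(data, mean = pars[1], sd = pars[2])

# Factory: validate transect once, then return the matching function
make.obj.fcn <- function(transect){
  # match.arg stops with an informative error for anything else
  transect <- match.arg(transect, c("point", "line"))
  switch(transect,
    point = function(pars, data, c.vec) ll(pars, data) * c.vec,
    line  = function(pars, data, c.vec = NULL) ll(pars, data))
}

obj.point <- make.obj.fcn("point")
obj.line  <- make.obj.fcn("line")

obj.point(c(0, 1), data = c(0, 1), c.vec = c(2, 2))
obj.line(c(0, 1), data = c(0, 1))
```

A bad transect value now fails when the function is created, not part-way through an optimisation.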

When are closures useful?

Functionals

Functions that take functions and return numbers
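
A minimal sketch of a functional in this sense: numerical integration takes a function and returns a single number. midpoint.integral is a hypothetical helper written for this illustration; integrate is the base R equivalent.

```r
# Hand-rolled functional: takes a function f, returns one number
# (midpoint rule on n equal-width slices)
midpoint.integral <- function(f, lower, upper, n = 1000){
  h <- (upper - lower) / n
  mids <- lower + h * (seq_len(n) - 0.5)
  sum(f(mids)) * h
}

# Integral of x^2 over [0, 1]; the exact answer is 1/3
approx.area <- midpoint.integral(function(x) x^2, 0, 1)

# Base R's integrate is also a functional
base.area <- integrate(function(x) x^2, 0, 1)$value
```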

A functional functionals example

We want row means of a data.frame, but only over a range of columns, and the range varies from row to row.

mean(unlist(data[1,2:9]))
mean(unlist(data[2,4:11]))
...

Map(f,...) takes lists and applies f to their elements in sync

indices <- list(3:9, c(1,2), 9:12)

Map(function(x,ind) mean(unlist(x[ind])),
    split(data,seq_len(nrow(data))),
    indices)
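
A runnable version of the idea on a made-up data.frame with three rows and four columns. Because each piece returned by split is a one-row data.frame, this sketch calls unlist to turn the selected columns into a numeric vector before taking the mean.

```r
# Hypothetical data.frame, used only for illustration
data <- data.frame(a = c(1, 2, 3),
                   b = c(4, 5, 6),
                   c = c(7, 8, 9),
                   d = c(10, 11, 12))

# One set of column indices per row
indices <- list(1:2, 2:4, c(1, 4))

# Walk over rows and index sets in step
row.means <- Map(function(x, ind) mean(unlist(x[ind])),
                 split(data, seq_len(nrow(data))),
                 indices)

# Row 1 uses columns a and b, row 2 uses b to d, row 3 uses a and d
unlist(row.means)
```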

State

States and side effects

> f(2)
[1] 4

But…

> g(3)
> f(2)
[1] 6

Um.
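
The talk does not show f and g, so the definitions below are an assumption: one possible pair that reproduces the output above. f quietly depends on a global variable and g quietly changes it.

```r
k <- 2                      # hidden global state

f <- function(x) x * k      # result depends on k, not just on x

g <- function(x){
  k <<- x                   # side effect: overwrites the global k
  invisible(NULL)
}

f(2)   # 2 * 2
g(3)   # prints nothing, but k is now 3
f(2)   # 2 * 3: same call, different answer

# A version without hidden state: everything it needs is an argument
f.pure <- function(x, k) x * k
f.pure(2, 2)
```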

Sometimes state is useful

new_counter <- function() {
  i <- 0
  function() {
    i <<- i + 1
    i
  }
}
counter_one <- new_counter()
counter_two <- new_counter()
> counter_one()
[1] 1
> counter_one()
[1] 2
> counter_two()
[1] 1

Recap

References