Functional programming in R

St Andrews useR group, 12 December 2013

If we wish to count lines of code, we should not regard them as “lines produced” but as “lines spent”

Edsger W. Dijkstra
This is not a talk about writing fast code
This is a talk about writing code for humans to maintain
This is a talk about writing less code

Why is programming terrible?

Obfuscation

• bending (someone else’s) functions to your will
• complex flow control ({ and } tyranny)
• “what does this function do?”
• “quick” fixes

Duplication

• write 10 times, modify 9 times
• copy and paste coding
• typos (failure of duplication)
What is “functional” programming?

Getting closer to the mathematics

sum(x-mean(x))^2

looks like

$$\left(\sum_{i=1}^n \left(x_i-\frac{\sum_{j=1}^n x_j}{n}\right)\right)^2$$

Compare this to:

x.mean <- mean(x)

result <- 0

for(i in seq_along(x)){  # seq_along() is safe when x is empty
  result <- result + x[i]-x.mean
}

result <- result^2
The focus is on what we do, not on how it is implemented
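As a quick check that the two versions above really compute the same thing, here is a minimal sketch. The vector x is made-up illustrative data, and the names vectorised and looped are introduced only for this comparison.

```r
# Illustrative data (an assumption; any numeric vector would do)
x <- c(2, 4, 4, 5, 10)

# Vectorised version: reads like the formula
vectorised <- sum(x - mean(x))^2

# Loop version: the same computation spelled out step by step
x.mean <- mean(x)
looped <- 0
for (i in seq_along(x)) {
  looped <- looped + x[i] - x.mean
}
looped <- looped^2

# TRUE when the two results agree up to floating point tolerance
all.equal(vectorised, looped)
```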
Four examples of functional ideas

data.mean <- apply(data, 1, mean)

(A function as an argument to a function)

data.mean <- numeric(nrow(data))

for(i in seq_len(nrow(data))){  # seq_len() is safe when data has no rows
  data.mean[i] <- mean(data[i,])
}
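To see that the apply() call and the explicit loop agree, the sketch below runs both on a small matrix. The matrix and the names via.apply and via.loop are assumptions made for illustration, not part of the talk's data.

```r
# Illustrative 3 x 4 matrix (an assumption)
data <- matrix(1:12, nrow = 3)

# Functional version: mean is passed to apply() as an argument
via.apply <- apply(data, 1, mean)

# Loop version: preallocate, then fill in one row at a time
via.loop <- numeric(nrow(data))
for (i in seq_len(nrow(data))) {
  via.loop[i] <- mean(data[i, ])
}

all.equal(via.apply, via.loop)
```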

apply(data, 1, function(x) sum(x^2))

(Anonymous functions)

sum.sq <- function(x) sum(x^2)
apply(data, 1, sum.sq)

my.stats <- list(mean,sd,median)

lapply(my.stats,function(f,x) apply(x,1,f), x=data)

(list of functions, example of functional)

Rather than

apply(data,1,mean)
apply(data,1,sd)
apply(data,1,median)
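A runnable variation on the list-of-functions idea, assuming a small made-up matrix. Naming the list elements is an addition here (the original list is unnamed); it makes the results easier to pick out afterwards.

```r
# Illustrative data (an assumption): 3 rows, 3 columns
data <- matrix(c(1, 2, 3,
                 4, 5, 6,
                 7, 8, 10), nrow = 3, byrow = TRUE)

# Functions stored in a named list
my.stats <- list(mean = mean, sd = sd, median = median)

# Apply each statistic to every row; the result is a list with the same names
row.stats <- lapply(my.stats, function(f, x) apply(x, 1, f), x = data)

row.stats$median
```

Adding another statistic then means adding one element to my.stats rather than another apply() line.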

A little more esoteric

sum.poly <- function(order){
  function(x) sum(x^order)
}

sum.sq <- sum.poly(2)

apply(data, 1, sum.sq)

(Closure)

Rather than

sum.sq <- function(x) sum(x^2)

apply(data, 1, sum.sq)
First class functions

“First class” functions can be…

• passed as arguments to other functions
• anonymous
• assigned to variables and stored in lists
• returned from other functions
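The four properties listed above can each be shown in a line or two. This is a small sketch; the names square, ops, adder and add.two are invented for the example.

```r
# Assigned to a variable
square <- function(x) x^2

# Stored in a list (the second element is an anonymous function)
ops <- list(square = square, negate = function(x) -x)

# Passed as an argument to another function
squares <- sapply(1:3, square)

# Returned from another function
adder <- function(n) function(x) x + n
add.two <- adder(2)

squares
ops$negate(4)
add.two(5)
```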
Closures

Okay, what happened in that last example?

sum.poly <- function(order){
  function(x) sum(x^order)
}

sum.sq <- sum.poly(2)

apply(data, 1, sum.sq)

sum.sq encloses the environment created by the call sum.poly(2), in which order is 2

> as.list(environment(sum.sq))
$order
[1] 2
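Each call to the factory creates a fresh environment, so two closures built from the same factory do not share order. A minimal sketch follows; sum.cube is a name added for illustration, and force() is an optional safeguard that evaluates the argument when the closure is created.

```r
sum.poly <- function(order) {
  force(order)               # evaluate the argument now rather than lazily
  function(x) sum(x^order)
}

sum.sq   <- sum.poly(2)
sum.cube <- sum.poly(3)

sum.sq(1:3)                  # 1 + 4 + 9
sum.cube(1:3)                # 1 + 8 + 27

# Each closure carries its own copy of order
environment(sum.sq)$order
environment(sum.cube)$order
```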

“Function factories”

When is this actually useful?

Likelihoods

ll.point <- function(pars,data,c.vec){
  ll(pars,data)*c.vec
}

ll.line <- function(pars,data){
  ll(pars,data)
}

obj.fcn <- function(pars,data,c.vec,transect){
  if(transect=="point"){
    ll.point(pars,data,c.vec)
  }else if(transect=="line"){
    ll.line(pars,data)
  }else{
    # fail with an informative message rather than a bare stop()
    stop("unknown transect type: ", transect)
  }
}

Compare to

obj.fcn <- function(transect){
  function(pars,data,c.vec){
    # function stuff
    if(transect=="point"){
      # ...
    }else if(transect=="line"){
      # ...
    }else{
      # ...
    }
  }
}
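One possible runnable variation on this factory is sketched below. It is not the talk's actual likelihood code: ll() here is a hypothetical stand-in built on dnorm(), and switch() replaces the if/else chain so that the transect type is checked once, when the objective function is built, instead of on every call.

```r
# Hypothetical stand-in for ll(); the real likelihood is not shown in the talk
ll <- function(pars, data) dnorm(data, mean = pars[1], sd = pars[2])

obj.fcn <- function(transect) {
  # Choose the objective once; an unknown type fails immediately
  switch(transect,
    point = function(pars, data, c.vec) ll(pars, data) * c.vec,
    line  = function(pars, data, c.vec = NULL) ll(pars, data),
    stop("unknown transect type: ", transect)
  )
}

point.obj <- obj.fcn("point")
line.obj  <- obj.fcn("line")

# Both objectives share one calling convention
pars  <- c(0, 1)
data  <- c(-1, 0, 1)
c.vec <- c(1, 2, 3)
point.obj(pars, data, c.vec)
line.obj(pars, data)
```

The caller (an optimiser, say) then only ever sees a function of pars, data and c.vec, and never has to pass transect around.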

When are closures useful?

• setting an option once
• many if()s
• writing “wrappers”
• function composition
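Two of these uses, wrappers and composition, fit in a few lines. This is a sketch only: compose() and na.safe() are helpers invented here, not base R functions.

```r
# Function composition: build f(g(...)) from f and g
compose <- function(f, g) function(...) f(g(...))

# Root mean square as "sqrt after mean of squares"
rms <- compose(sqrt, function(x) mean(x^2))
rms(c(3, 4))

# A wrapper that sets an option once: always drop missing values
na.safe <- function(f) function(x, ...) f(x, na.rm = TRUE, ...)

safe.mean <- na.safe(mean)
safe.mean(c(1, NA, 3))
```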
Functionals

Functions that take functions as arguments and return data (a vector or a list)

• remove for loops
• in simple situations, use apply and co.
• what about more complicated situations?
• split-apply-combine pattern
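The split-apply-combine pattern from the last bullet can be written with base R functionals alone. A minimal sketch, assuming a made-up data frame df with a grouping column and a value column:

```r
# Illustrative data (an assumption)
df <- data.frame(group = c("a", "b", "a", "b", "a"),
                 value = c(1, 2, 3, 4, 5))

pieces   <- split(df$value, df$group)  # split: one vector per group
means    <- lapply(pieces, mean)       # apply: the same function to each piece
combined <- unlist(means)              # combine: back to a named vector

combined
```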

A functional functionals example

We want row means of a data.frame, but only over a range of columns, and the range varies from row to row.

mean(unlist(data[1,2:9]))   # unlist(): mean() of a data.frame row gives NA
mean(unlist(data[2,4:11]))
...

Map(f,...) takes lists and applies f to their elements in sync

indices <- list(3:9, c(1,2), 9:12)

# unlist() turns the one-row data.frame slice into a numeric vector for mean()
Map(function(x,ind) mean(unlist(x[ind])),
    split(data,seq_len(nrow(data))),
    indices)
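To try the Map() approach end to end, the sketch below builds a small data frame with three rows, one per element of indices. The data frame is invented for illustration; only the index list is taken from the slide.

```r
# Illustrative data (an assumption): 3 rows, 12 columns,
# where the entry in row i, column j is i + 3 * (j - 1)
data <- as.data.frame(matrix(1:36, nrow = 3))

# One set of column indices per row
indices <- list(3:9, c(1, 2), 9:12)

# Split into one-row data frames, then walk rows and indices in step
rows <- split(data, seq_len(nrow(data)))
row.means <- Map(function(x, ind) mean(unlist(x[ind])), rows, indices)

unlist(row.means)
```

Map() returns a list, so unlist() is used at the end to get a plain vector of means.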
State

States and side effects

> f(2)
[1] 4

But…

> g(3)
> f(2)
[1] 6

Um.

States and side effects

• g is changing “state” in the program
• functional programming is a generalisation of “globals are bad”
• programs that don’t rely on state are easier to debug
• programs that don’t rely on state are easier to parallelise
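The slides do not show how f and g are defined. One hypothetical pair that would reproduce the behaviour is sketched here, with a shared variable k (a name invented for this example) that f reads and g silently overwrites.

```r
k <- 2                      # shared state (hypothetical)

f <- function(x) x * k      # the result depends on k, not only on x

g <- function(v) {
  k <<- v                   # side effect: overwrite k outside the function
  invisible(v)              # return invisibly, so nothing prints at the prompt
}

before <- f(2)              # uses k = 2
g(3)
after <- f(2)               # same call, different answer: k is now 3

c(before, after)
```

Nothing in the call f(2) hints that its value depends on whether g has run, which is what makes this kind of code hard to debug.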

Sometimes state is useful

new_counter <- function() {
  i <- 0
  function() {
    i <<- i + 1
    i
  }
}
counter_one <- new_counter()
counter_two <- new_counter()
> counter_one()
[1] 1
> counter_one()
[1] 2
> counter_two()
[1] 1

Recap

• Functional ideas can:
  • make programs shorter
  • make programs read more like the mathematics they implement
  • remove messy control flow
  • avoid implementing an object system for small problems
• closures: take data and return a function
• functionals: take functions and return data
• you can do some pretty weird/cool stuff with functions in R