Are apply-style functions in R, say sapply(), any faster than for loops? We can gain some insight using the microbenchmark package.

If this question worries you, then just relax. If you have code that you would like to make faster, then the only thing to do is profile it. Hacking code up for performance before you need it is premature optimization, the root of all evil ;)
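As a minimal sketch of what profiling can look like, the snippet below uses base R's Rprof() and summaryRprof() from the utils package. The function slow_step() and its input size are hypothetical stand-ins for whatever code you actually want to speed up.

```r
# Hypothetical slow function: grows a vector one element at a time
slow_step <- function(n) {
    out <- numeric(0)
    for (i in seq_len(n)) {
        out <- c(out, sqrt(i))  # reallocates and copies on every iteration
    }
    out
}

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file)              # start the sampling profiler
res <- slow_step(30000L)
Rprof(NULL)                   # stop profiling

# Time spent in each function, sorted by self time
prof <- summaryRprof(prof_file)
head(prof$by.self)
```

The table shows where the time actually goes, which is usually a better guide than guessing.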

Rules of thumb

Write vectorized R code when possible, i.e. y <- f(x). This single line of code completely expresses the desired computation.
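To make the rule concrete, here is a small sketch contrasting a vectorized call with the element-by-element loop it replaces. The function g() and the vector x_small are made-up names for illustration only.

```r
# Hypothetical function built only from vectorized operations
g <- function(x) sqrt(x) + 1
x_small <- c(0, 1, 4, 9)

# Vectorized: one call handles every element
y_vec <- g(x_small)

# Equivalent loop: same result, more code
y_loop <- numeric(length(x_small))
for (i in seq_along(x_small)) {
    y_loop[i] <- g(x_small[i])
}

identical(y_vec, y_loop)
```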

If the function doesn’t operate on vectors, or the data is a list, then use a higher-order function such as sapply(), lapply(), etc. These typically capture the intent of the code more succinctly than a for loop. I also like them because they’re easier to parallelize.
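A short sketch of this second rule, assuming a made-up list called samples. It also shows why parallelizing is easy: mclapply() from the parallel package takes the same first two arguments as lapply(), so the call barely changes.

```r
library(parallel)

# Hypothetical data: a list whose elements have different lengths
samples <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

means_list <- lapply(samples, mean)   # always returns a list
means_vec  <- sapply(samples, mean)   # simplifies to a named vector here

# Same call shape as lapply(). mc.cores = 1L keeps this runnable everywhere
# (forking is not available on Windows); raise it on Unix-like systems.
means_par <- mclapply(samples, mean, mc.cores = 1L)
```

Rewriting the equivalent for loop to run in parallel would take noticeably more restructuring.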

Below is a little experiment:


library(microbenchmark)

n = 10000L
x = seq(from = 0, to = pi, length.out = n)

# Some arbitrary vectorized work: chain log() and exp(), then return the result
f = function(x)
{
    out = log(0.43 * x + 1)
    out = exp(out)
    out
}


baseline = median(microbenchmark(

    y <- f(x)

)$time)


apply_time = median(microbenchmark(

    y2 <- sapply(x, f)

)$time)


for_time = median(microbenchmark({

    # Preallocate the result list, fill it element by element, then flatten
    y3 <- vector(mode = "list", length = n)
    for(i in seq_len(n)){
        y3[[i]] = f(x[i])
    }
    y3 <- as.numeric(y3)

})$time)


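# How many times slower is sapply() than the vectorized call?
apply_time / baseline

# How many times slower is the for() loop than the vectorized call?
for_time / baseline

# How many times slower is the for() loop than sapply()?
for_time / apply_time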

Note the complexity of the implementations as measured by lines of code. The vectorized and sapply() versions each take only one line, while the for() loop takes four or five.

Results

The sapply() was faster than the for() loop, but how much faster depends on the value of n. For n = 100, sapply() is 15 times slower than the vectorized version, and for() is 23 times slower than sapply()!

For n = 10000, sapply() is 19 times slower than the vectorized version, and for() is only 1.16 times slower than sapply().

As the number of elements grows much past 1000, we see less speed benefit from choosing sapply() over for(). Choose sapply() for other reasons, such as clearer intent and easier parallelization.
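To see how the ratios move with n on your own machine, the experiment can be wrapped in a function. This is only a sketch: ratios() and for_version() are hypothetical helper names, f is written here as a single chained expression (an assumption about the intended computation), and times = 20L is an arbitrary choice to keep the run short. Your numbers will differ from the ones quoted above.

```r
library(microbenchmark)

# Assumed form of the test function: log followed by exp
f <- function(x) exp(log(0.43 * x + 1))

# The for() loop version, wrapped so it can be timed as one call
for_version <- function(x) {
    out <- vector(mode = "list", length = length(x))
    for (i in seq_along(x)) {
        out[[i]] <- f(x[i])
    }
    as.numeric(out)
}

# Median timing ratios for a given input size n
ratios <- function(n, times = 20L) {
    x <- seq(from = 0, to = pi, length.out = n)
    baseline   <- median(microbenchmark(f(x), times = times)$time)
    apply_time <- median(microbenchmark(sapply(x, f), times = times)$time)
    for_time   <- median(microbenchmark(for_version(x), times = times)$time)
    c(n = n,
      sapply_vs_vectorized = apply_time / baseline,
      for_vs_sapply = for_time / apply_time)
}

# One row per input size
results <- t(sapply(c(100L, 1000L, 10000L), ratios))
results
```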

Main Point: Both sapply() and for() loops are much slower than vectorized code.