Software design, Technology

A small lesson in Scala

For fun, I’ve been working through the recent Coursera course on functional programming with Scala, run by Martin Odersky, Scala’s creator. It’s been a good ride, but towards the end I started having a new feeling — a real sense of wonder — and I want to share that.

This shows a bit of the power of functional programming (FP), with an example he used quite late on in the course. It does require some background, which I will provide. The main code is from the course, so all credit is due there.

By the way, I’m going to simplify the Scala slightly by removing all type declarations, so as to minimise new information and keep the focus on the topic in hand. I’m also going to mix up Scala and FP a bit. I know they’re different; I’ll be using Scala as an example of FP, to make it a bit more concrete. Other functional programming languages are available.

Maps and filters

One of the great features of FP is its handling of lists through maps and filters. So if we have a list of integers, mynums, and we want to double each one, then we can map it to a function which doubles an integer:

mynums map (_ * 2)

If we have a list of integers and only want to select those that are greater than 100 then we can use a filter:

mynums filter (_ > 100)

Nothing dramatic yet, but at least we’re all on the same page. Some things are worth noting, though. Yes, plenty of languages have mapping and filtering functions. But if you have a language that treats functions as first class entities then it’s easier to work with them. And yes, you can do this with an iterator (e.g. a for loop in Java), but one of the points of FP is that you don’t need to express the mechanics as much as the intent. In that sense you’re working with a declarative programming language — you tell it what you want, it works out how to do it.

Lazy evaluation

This is the concept of evaluating something only when it’s needed. If we consider our last example of a list of integers, then mynums map (_ * 2) would run through every element of the list and do the multiplication. As someone with a history outside FP languages I’d expect nothing less. You might, also. If mynums happens to be very long then that line will take a long time to execute, and I wouldn’t give it a second thought — that’s the price I pay for dealing with big data structures.

But suppose we just wanted to pick out the first element of the list that was greater than 100 after being doubled. To get that number our logic would look like this, using the head function, which picks the first item off a list:

mynums map (_ * 2) filter (_ > 100) head

In this case running through every element could be wasteful — potentially very wasteful. If the sought-after element was third in a list of 10,000 then all 10,000 elements would be mapped and filtered.

Lazy evaluation avoids that wastefulness. Scala has a data structure called a Stream, which is just like a List except that any elements after the head are evaluated only when needed. So if mynums in the above example was a Stream of integers rather than a List then its contents would be evaluated only as far as was necessary to get to the first element that’s greater than 100 after being doubled. If it meant only going as far as the third element out of 10,000 then only the first three elements would be evaluated.

We can see that a bit more starkly in the following example, assuming mynums is a Stream:

val doublednums = mynums map (_ * 2)
// ...Do some other stuff here...
val bignum = doublednums filter (_ > 100) head

Here, the potentially large sequence is carried around for a while before any evaluation takes place, which only occurs when we need to work out bignum. It’s powerful stuff.

Once again: yes, of course you can achieve these things in more imperative languages such as Java, but in those cases you need to be very explicit about the mechanics, and the syntax is likely to be a lot more cumbersome.

Lazy evaluation: A simple example

At this point I hope you’ve felt a bit impressed by this capability, just as I was. But you may be wondering if lazy evaluation of streams is so useful. After all, if you’ve got a stream of 10,000 items then surely you had to generate those 10,000 items in the first place. So it’s not as if everything has been saved.

Well, not necessarily. By the power of recursion, here is an infinite sequence of integers starting from a given integer n. It uses the #:: operator, where a #:: b means “the stream which is element a put in front of the stream b”. So if a is 1 and b is the stream 2, 3, 4, then a #:: b is the stream 1, 2, 3, 4:

def from(n) = n #:: from(n+1)

Now we can use this to define the infinite stream of all natural numbers, carry it around in a single value, and perform simple operations on it:

val natural_nums = from(1)
val tenth_natural_num = natural_nums(9)    // Indexing starts at 0
val multiples_of_four = natural_nums map (_ * 4)
val tenth_multiple_of_four = multiples_of_four(9)

A more informative example

That’s all good stuff. I hope you were also impressed with the creation of an infinite streams, just as I was. But to be honest, we probably don’t need all the power of a modern functional programming language to work out the 10th multiple of four. So here’s something a bit more interesting.

This is the sieve of Eratosthenes, a classic way to work out prime numbers:

  1. Take all the integers starting from two: 2, 3, 4, 5, 6, and so on.
  2. Look at the first number (it’s 2) and remove all multiples of it. So remove 4, 6, 8, and so on. We’re left with 2, 3, 5, 7, 9, 11, and so on.
  3. Now look at the next number (it’s a 3) and remove all multiples of that. So we remove 6, 9, 12, 15, etc. Actually some of them have already gone, but no matter. We’re left with 2, 3, 5, 7, 11, 13, 17.
  4. And we continue. Next with 5, then 7, and so on. The numbers that are left, after our unending work, are the prime numbers.

Now if I were to implement this algorithm in a straightforward way in an imperative language like Java I’d have one of two problems. Either I’d have to manage a very large list of integers, and continually sweep through, removing items. This is logical, but does depend on me getting the size of the list right — not too big that it’s wasteful, and not too small that I can’t do whatever it is I want with it. Or else I’d be implementing something that isn’t quite Eratosthenes’ algorithm.

But with an FP approach we get the best of both worlds. We can express the algorithm clearly, and manage an infinite sequence efficiently. The code looks like this. We use nums as a stream of integers. nums.head gets its first element, and nums.tail gets the rest of the stream:

// Function to sieve any stream of integers
def sieve(nums) =
  nums.head #:: sieve(nums.tail filter (_ % nums.head != 0))

// All the primes
val primes = sieve(from(2))

And at this point the value primes carries all the prime numbers, as calculated by the sieve of Eratosthenes. But even though it’s infinitely long and requires infinite calculations it’s not realised until we start making demands on it. So we might ask for the first 12 primes:

primes.take(12).toList

which will give us

List(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

Note that the toList function call is needed, because without that we would still have an unrealised of primes plus the restriction to only take the first 12.

The program also has the advantage that it’s a fraction of the size of (at least what I’d write as) the corresponding Java code.

All the above still requires some brain-work. It’s a very different way of thinking about things. But this example alone really made me see Scala, and functional programming, as something genuinely exciting. I hope I’ve managed to convey some of that excitement here.

Advertisements

Discussion

One thought on “A small lesson in Scala

  1. See paragraph 3.5.2 in SICP (http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-24.html#%_sec_3.5.2) where the same example (sieve) is provided. Of course in this masterpiece Scheme is used instead of Scala (Scala or even Java didn’t exist at the time) but the authors show even how to implement lazy streams…

    Posted by sgsfak | 8 November 2012, 10:47 am