5.3 Day 2: Clipping Bushes and Other New Tricks

In Edward Scissorhands, a magical moment happens when Edward realizes that he’s come far from the house on the hill and his unique abilities may give him a special place in the existing society.

Anyone with an eye for programming language history has seen this fable played out before. When the object-oriented paradigm was new, the masses could not accept Smalltalk because the paradigm was too new. We needed a language that would let them continue to do procedural programming and experiment with object-oriented ideas. With C++, the new object-oriented tricks could live safely beside the existing C procedural features. The result was that people could start using the new tricks in an old context.

Now, it’s time to put Scala through its paces as a functional language. Some of this will seem awkward at first, but the ideas are powerful and important. They will form the foundation for the concurrency constructs you’ll see later in day 3. Let’s start from the beginning, with a simple function:

  scala> def double(x:Int):Int = x * 2
  double: (Int)Int
 
  scala> double(4)
  res0: Int = 8

Defining a function looks a whole lot like it does with Ruby. The def keyword defines both a function and a method. The parameters and their types come next. After that, you can specify an optional return type. Scala can often infer the return type.

To invoke the function, just use the name and the argument list. Notice that unlike Ruby, the parentheses are not optional in this context.

This is a one-line method definition. You can also specify a method definition in block form:

  scala> def double(x:Int):Int = {
  | x * 2
  | }
  double: (Int)Int
 
  scala> double(6)
  res3: Int = 12

That = after the Int return type is mandatory. Forgetting it will cause you trouble. These are the major forms of function declarations. You’ll see minor variations, such as omitting parameters, but these are the forms you’ll see most often.

Let’s move on to the variables that you’ll use within a function. You’ll want to pay careful attention to the life cycle of the variable if you want to learn the pure functional programming model.

var versus val

Scala is based on the Java virtual machine and has a tight relationship with Java. In some ways, these design goals limit the language. In other ways, Scala can take advantage of the last fifteen or twenty years of programming language development. You’ll see an increased emphasis on making Scala friendly for concurrent programming. But all the concurrency features in the world won’t help you if you don’t follow basic design principles. Mutable state is bad. When you declare variables, you should make them immutable whenever you can to avoid conflicting state. In Java, that means using the final keyword. In Scala, immutable means using val instead of var:

  scala> var mutable = "I am mutable"
  mutable: java.lang.String = I am mutable
 
  scala> mutable = "Touch me, change me..."
  mutable: java.lang.String = Touch me, change me...
 
  scala> val immutable = "I am not mutable"
  immutable: java.lang.String = I am not mutable
 
  scala> immutable = "Can't touch this"
  <console>:5: error: reassignment to val
  immutable = "Can't touch this"
  ^

So, var values are mutable; val values are not. In the console, as a convenience, you can redefine a variable several times even if you use val. Once you step outside of the console, redefining a val will generate an error.

In some ways, Scala had to introduce the var-style variables to support the traditional imperative programming style, but while you’re learning Scala, it’s best to avoid var when you can for better concurrency. This basic design philosophy is the key element that differentiates functional programming from object-oriented programming: mutable state limits concurrency.

Let’s move on to some of my favorite areas within functional languages, dealing with collections.

Collections

Functional languages have a long history of spectacularly useful features for collections. One of the earliest functional languages, Lisp, was built around the idea of dealing with lists. The very name stands for LISt Processing. Functional languages make it easy to build complex structures that contain data and code. Scala’s primary collections are lists, sets, and maps.

Lists

As with most functional languages, the bread-and-butter data structure is the list. Scala’s lists, of type List, are ordered collections of like things with random access. Enter these lists into the console:

  scala> List(1, 2, 3)
  res4: List[Int] = List(1, 2, 3)

Notice the first return value: List[Int] = List(1, 2, 3). This value not only shows the type of the overall list but also shows the type of the data structures within the list. A list of Strings looks like this:

  scala> List("one", "two", "three")
  res5: List[java.lang.String] = List(one, two, three)

If you’re seeing a little Java influence here, you’re right. Java has a feature called Generics that allows you to type the items within a data structure like a list or array. Let’s see what happens when you have a list combining Strings and Ints:

  scala> List("one", "two", 3)
  res6: List[Any] = List(one, two, 3)

You get the data type Any, which is the catchall data type for Scala. Here’s how you’d access an item of a list:

  scala> List("one", "two", 3)(2)
  res7: Any = 3
 
  scala> List("one", "two", 3)(4)
  java.util.NoSuchElementException: head of empty list
  at scala.Nil$.head(List.scala:1365)
  at scala.Nil$.head(List.scala:1362)
  at scala.List.apply(List.scala:800)
  at .<init>(<console>:5)
  at .<clinit>(<console>)
  at RequestResult$.<init>(<console>:3)
  at RequestResult$.<clinit>(<console>)
  at RequestResult$result(<console>)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Met...

You use the () operator. List access is a function, so you use () instead of []. Scala’s index for list starts with 0, as it does with Java and Ruby. Unlike Ruby, accessing an item out of range will throw an exception. You can try to index with a negative number. Earlier versions of Scala return the first element:

  scala> List("one", "two", 3)(-1)
  res9: Any = one
 
  scala> List("one", "two", 3)(-2)
  res10: Any = one
 
  scala> List("one", "two", 3)(-3)
  res11: Any = one

Since that behavior is a little inconsistent with the NoSuchElement exception for an index that’s too large, version 2.8.0 corrects that behavior, returning java.lang.IndexOutOfBoundsException.

One final note. Nil in Scala is an empty list:

  scala> Nil
  res33: Nil.type = List()

We’ll use this list as a basic building block when we cover code blocks, but for now, bear with me. I’m going to introduce a couple of other types of collections first.

Sets

A set is like a list, but sets do not have any explicit order. You specify a set with the Set keyword:

  scala> val animals = Set("lions", "tigers", "bears")
  animals: scala.collection.immutable.Set[java.lang.String] =
  Set(lions, tigers, bears)

Adding or subtracting from that set is easy:

  scala> animals + "armadillos"
  res25: scala.collection.immutable.Set[java.lang.String] =
  Set(lions, tigers, bears, armadillos)
 
  scala> animals - "tigers"
  res26: scala.collection.immutable.Set[java.lang.String] = Set(lions, bears)
 
  scala> animals + Set("armadillos", "raccoons")
  <console>:6: error: type mismatch;
  found : scala.collection.immutable.Set[java.lang.String]
  required: java.lang.String
  animals + Set("armadillos", "raccoons")
  ^

Keep in mind that set operations are not destructive. Each set operation builds a new set rather than modifying the old ones. By default, sets are immutable. As you can see, adding or removing a single element is a piece of cake, but you can’t use the + or - to combine sets, as you would in Ruby. In Scala, you want to use ++ and -- for set union and set difference:

  scala> animals ++ Set("armadillos", "raccoons")
  res28: scala.collection.immutable.Set[java.lang.String] =
  Set(bears, tigers, armadillos, raccoons, lions)
 
  scala> animals -- Set("lions", "bears")
  res29: scala.collection.immutable.Set[java.lang.String] = Set(tigers)

You can also perform set intersection (elements in two sets that are the same) with ** [11]:

  scala> animals ** Set("armadillos", "raccoons", "lions", "tigers")
  res1: scala.collection.immutable.Set[java.lang.String] = Set(lions, tigers)

Unlike a List, a Set is independent of order. This rule will mean that equality for sets and lists is different:

  scala> Set(1, 2, 3) == Set(3, 2, 1)
  res36: Boolean = true
 
  scala> List(1, 2, 3) == List(3, 2, 1)
  res37: Boolean = false

That’s enough set manipulation for now. Let’s move on to maps.

Maps

A Map is a key-value pair, like a Ruby Hash. The syntax should be familiar to you:

  scala> val ordinals = Map(0 -> "zero", 1 -> "one", 2 -> "two")
  ordinals: scala.collection.immutable.Map[Int,java.lang.String] =
  Map(0 -> zero, 1 -> one, 2 -> two)
 
  scala> ordinals(2)
  res41: java.lang.String = two

Like a Scala List or Set, you specify a Map with the Map keyword. You separate the elements of the map with the -> operator. You just used some syntactic sugar that makes it easy to create a Scala map. Let’s use another form of the hash map and specify the types of the key and value:

  scala> import scala.collection.mutable.HashMap
  import scala.collection.mutable.HashMap
 
  scala> val map = new HashMap[Int, String]
  map: scala.collection.mutable.HashMap[Int,String] = Map()
 
  scala> map += 4 -> "four"
 
  scala> map += 8 -> "eight"
 
  scala> map
  res2: scala.collection.mutable.HashMap[Int,String] =
  Map(4 -> four, 8 -> eight)

First, we import the Scala libraries for a mutable HashMap. That means the values within the hash map can change. Next, we declare an immutable variable called map. That means that the reference to the map cannot change. Notice that we’re also specifying the types of the key-value pairs. Finally, we add some key-value pairs and return the result.

Here’s what would happen if you specified the wrong types:

  scala> map += "zero" -> 0
  <console>:7: error: overloaded method value += with alternatives (Int)map.MapTo
  <and> ((Int, String))Unit cannot be applied to ((java.lang.String, Int))
  map += "zero" -> 0
  ^

As expected, you get a typing error. The type constraints are enforced where possible at compile time but also at run time. So now that you’ve seen the basics for collections, let’s dive into some of the finer details.

Any and Nothing

Before we move on to anonymous functions, let’s talk a little bit more about the class hierarchy in Scala. When you’re using Scala with Java, you will often be more concerned about the Java class hierarchy. Still, you should know a little bit about the Scala types. Any is the root class in the Scala class hierarchy. It’s often confusing, but know that any Scala type will inherit from Any.

Similarly, Nothing is a subtype of every type. That way, a function, say for a collection, can return Nothing and conform to the return value for the given function. It is all laid out in Figure 6, Any and Nothing . Everything inherits from Any, and Nothing inherits from everything.

There are a few different nuances when you’re dealing with nil concepts. Null is a Trait, and null is an instance of it that works like Java’s null, meaning an empty value. An empty collection is Nil. By contrast, Nothing is a trait that is a subtype of everything. Nothing has no instance, so you can’t dereference it like Null. For example, a method that throws an Exception has the return type Nothing, meaning no value at all.

Keep those rules in the back of your mind, and you’ll be fine. Now, you’re ready to do a little bit more with collections using higher-order functions.

images/scala.any.png

Figure 6. Any and Nothing

Collections and Functions

As we start on languages that have a stronger functional foundation, I want to formalize some of the concepts we’ve been working with all along. The first such concept is higher-order functions.

As with Ruby and Io, Scala collections get a whole lot more interesting with higher-order functions. Just as Ruby used each and Io used foreach, Scala will let you pass functions into foreach. The underlying concept that you’ve been using all along is the higher-order function. In layman’s terms, a higher-order function is one that produces or consumes functions. More specifically, a higher-order function is one that takes other functions as input parameters or returns functions as output. Composing functions that use other functions in this way is a critical concept for the functional family of languages and one that will shape the way you code in other languages as well.

Scala has powerful support for higher-order functions. We don’t have time to look at some of the advanced topics such as partially applied functions or currying, but we will learn to pass simple functions, often called code blocks, as parameters into collections. You can take a function and assign it to any variable or parameter. You can pass them into functions and return them from functions. We’re going to focus on anonymous functions as input parameters to a few of the more interesting methods on collections.

foreach

The first function we’re going to examine is foreach, the iteration workhorse in Scala. As with Io, the foreach method on a collection takes a code block as a parameter. In Scala, you’ll express that code block in the form variableName => yourCode like this:

  scala> val list = List("frodo", "samwise", "pippin")
  list: List[java.lang.String] = List(frodo, samwise, pippin)
 
  scala> list.foreach(hobbit => println(hobbit))
  frodo
  samwise
  pippin

hobbit => println(hobbit) is an anonymous function, meaning a function without a name. The declaration has the arguments to the left of the =>, and the code is to the right. foreach calls the anonymous function, passing in each element of the list as an input parameter. As you might have guessed, you can use the same technique for sets and maps too, though the order won’t be guaranteed:

  val hobbits = Set("frodo", "samwise", "pippin")
  hobbits: scala.collection.immutable.Set[java.lang.String] =
  Set(frodo, samwise, pippin)
 
  scala> hobbits.foreach(hobbit => println(hobbit))
  frodo
  samwise
  pippin
 
  scala> val hobbits = Map("frodo" -> "hobbit",
  "samwise" -> "hobbit", "pippin" -> "hobbit")
  hobbits: scala.collection.immutable.Map[java.lang.String,java.lang.String] =
  Map(frodo -> hobbit, samwise -> hobbit, pippin -> hobbit)
 
 
  scala> hobbits.foreach(hobbit => println(hobbit))
  (frodo,hobbit)
  (samwise,hobbit)
  (pippin,hobbit)

Of course, maps will return tuples instead of elements. As you recall, you can access either end of the tuple, like this:

  scala> hobbits.foreach(hobbit => println(hobbit._1))
  frodo
  samwise
  pippin
 
  scala> hobbits.foreach(hobbit => println(hobbit._2))
  hobbit
  hobbit
  hobbit

With these anonymous functions, you can do far more than just iterate. I’m going to walk you through some basics and then a few of the other interesting ways Scala uses functions in conjunction with collections.

More List Methods

I’m going to take a brief diversion to introduce a few more methods on List. These basic methods provide the features you’ll need to do manual iteration or recursion over lists. First, here are the methods to test for the empty state or check the size:

  scala> list
  res23: List[java.lang.String] = List(frodo, samwise, pippin)
 
  scala> list.isEmpty
  res24: Boolean = false
 
  scala> Nil.isEmpty
  res25: Boolean = true
 
  scala> list.length
  res27: Int = 3
 
  scala> list.size
  res28: Int = 3

Notice that you can check the size of a list with both length and size. Also, remember that the implementation of Nil is an empty list. As with Prolog, it’s useful to be able to grab the head and tail of a list for recursion.

  scala> list.head
  res34: java.lang.String = frodo
 
  scala> list.tail
  res35: List[java.lang.String] = List(samwise, pippin)
 
  scala> list.last
  res36: java.lang.String = pippin
 
  scala> list.init
  res37: List[java.lang.String] = List(frodo, samwise)

There’s a surprise. You can use head and tail to recurse head first, or last and init to recurse tail first. We’ll do a little more with recursion later. Let’s wrap up the basics with a few interesting convenience methods:

  scala> list.reverse
  res29: List[java.lang.String] = List(pippin, samwise, frodo)
 
  scala> list.drop(1)
  res30: List[java.lang.String] = List(samwise, pippin)
 
  scala> list
  res31: List[java.lang.String] = List(frodo, samwise, pippin)
 
  scala> list.drop(2)
  res32: List[java.lang.String] = List(pippin)

These do just about what you’d expect. reverse returns the list with inverted ordering, and drop(n) returns the list with the first n elements removed, without modifying the original list.

count, map, filter, and Others

As with Ruby, Scala has many other functions that manipulate lists in various ways. You can filter the list to match a given condition, sort a list using whatever criteria you want, create other lists using each element as an input, and create aggregate values:

  scala> val words = List("peg", "al", "bud", "kelly")
  words: List[java.lang.String] = List(peg, al, bud, kelly)
 
  scala> words.count(word => word.size > 2)
  res43: Int = 3
 
  scala> words.filter(word => word.size > 2)
  res44: List[java.lang.String] = List(peg, bud, kelly)
 
  scala> words.map(word => word.size)
  res45: List[Int] = List(3, 2, 3, 5)
 
 
  scala> words.forall(word => word.size > 1)
  res46: Boolean = true
 
  scala> words.exists(word => word.size > 4)
  res47: Boolean = true
 
  scala> words.exists(word => word.size > 5)
  res48: Boolean = false

We start with a Scala list. Then, we count all the words with a size greater than two. count will call the code block word => word.size > 2, evaluating the expression word.size > 2 for each element in the list. The count method counts all the true expressions.

In the same way, words.filter(word => word.size > 2) returns a list of all words that have a size greater than two, much like Ruby’s select. Using the same pattern, map builds a list of the sizes of all the words in the list, forall returns true if the code block returns true for all items in the set, and exists returns true if the code block returns true for any item in the set.

Sometimes, you can generalize a feature using code blocks to make something more powerful. For example, you may want to sort in the traditional way:

  scala> words.sort((s, t) => s.charAt(0).toLowerCase < t.charAt(0).toLowerCase)
  res49: List[java.lang.String] = List(al, bud, kelly, peg)

This code uses a code block that takes two parameters, s and t. Using sort,[12] you can compare the two arguments any way you want. In the previous code, we convert the characters to lowercase[13] and compare them. That will yield a case-insensitive search. We can also use the same method to sort the list by the size of the words:

  scala> words.sort((s, t) => s.size < t.size)
  res50: List[java.lang.String] = List(al, bud, peg, kelly)

By using a code block, we can sort[14] based on any policy that we want. Let’s take a look at a more complex example, foldLeft.

foldLeft

The foldLeft method in Scala is much like the inject method in Ruby. You’ll supply an initial value and a code block. foldLeft will pass to the code block each element of the array and another value. The second value is either the initial value (for the first invocation) or the result from the code block (for subsequent invocations). There are two versions of the method. The first version, /:, is an operator with initialValue /: codeBlock. Here’s the method in action:

  scala> val list = List(1, 2, 3)
  list: List[Int] = List(1, 2, 3)
 
  scala> val sum = (0 /: list) {(sum, i) => sum + i}
  sum: Int = 6

We walked through this sequence for Ruby, but it may help you to see it again. Here’s how it works:

The syntax of the other version of foldLeft will seem strange to you. It uses a concept called currying. Functional languages use currying to transform a function with multiple parameters to several functions with their own parameter lists. We’ll see more currying in Chapter 8, Haskell. Just understand that what’s going on under the covers is a composition of functions rather than a single function. Though the mechanics and syntax are different, the result is exactly the same:

  scala> val list = List(1, 2, 3)
  list: List[Int] = List(1, 2, 3)
 
  scala> list.foldLeft(0)((sum, value) => sum + value)
  res54: Int = 6

Notice that the function call list.foldLeft(0)((sum, value) => sum + value) has two parameter lists. That’s the currying concept that I mentioned earlier. You’ll see versions of this method with all the rest of the languages in this book.

What We Learned in Day 2

Day 1 was encumbered with working through the object-oriented features that you already know. Day 2 introduced Scala’s primary reason for being: functional programming.

We started with a basic function. Scala has flexible syntax with function definitions. The compiler can often infer the return type, the function body has one-line and code-block forms, and the parameter list can vary.

Next, we looked at various collections. Scala supports three: lists, maps, and sets. A set is a collection of objects. A list is an ordered collection. Finally, maps are key-value pairs. As with Ruby, you saw the powerful combinations of code blocks and collections of various kinds. We looked at some collection APIs that are indicative of functional programming paradigms.

For lists, we could also use Lisp-style first and rest methods, just like Prolog, to return the first element of the list or the rest. We also used count, isEmpty, and first methods for obvious purposes. But the most powerful methods took function blocks.

We iterated with foreach and used filter to selectively return various elements of the lists. We also learned to use foldLeft to accumulate results as we iterated through a collection to do things such as keeping a running total.

Much of functional programming is learning to manipulate collections with higher-level constructs instead of Java-style iteration. We will put these skills through their paces in day 3, when we will learn to use concurrency, do some XML, and work a simple practical example. Stay tuned.

Day 2 Self-Study

Now that we’ve gotten deeper into Scala, you’re starting to see some of its functional aspects. Whenever you deal with functions, the collections are a great place to start. These exercises will let you use some of the collections, as well as some functions.

Find:

Do: