So, before getting into the problem, I'll make a brief detour. In functional languages, there is a concept of "strict functions" and "non-strict functions". Informally, a "strict function" is one which always evaluates its argument. For programmers from a non-functional background, that's most of the functions they ever saw, but not all. For instance, the trinary operator, "test ? f1 : f2", does not evaluate all its arguments, f1 and f2, even though it always evaluate at least one of them.
Now, to see where that can lead to trouble, let's first see an example of non-strictness in action, with Scala 2.7:
scala> object X { | val x = -10 to 10 map (2 / _) | } defined module X scala> object Y { | val y = List.range(-10, 10) map (2 / _) | } defined module Y scala> Y.y(0) java.lang.ArithmeticException: / by zero at Y$$anonfun$1.apply(<console>:5) at Y$$anonfun$1.apply(<console>:5) at scala.List.map(List.scala:812) at Y$.<init>(<console>:5) at Y$.<clinit>(<console>) at .<init>(<console>:6) at .<clinit>(<console>) at RequestResult$.<init>(<console>:3) at RequestResult$.<clinit>(<console>) at RequestResult$result(<console>) at sun.reflect.Nativ... scala> X.x(0) res2: Int = 0
We see an error happens when the map on the list is applied, but not when the map on the range is applied. That's because the function "map" for Range (the result of "x to y" or "x until y" expressions) is non-strict. I can get the value for any part of that Range, except the eleventh element, without throwing an exception.
But let say we have a function expensiveComputation, which takes quite a while to process. What happens, then, if I do this:
scala> import scala.actors.Futures._ import scala.actors.Futures._ scala> val m = 1 to 1000 map (i => future(expensiveComputation(i))) m: RandomAccessSeq.Projection[scala.actors.Future[Int]] = RangeM(<function>, <function>, <function>, <function>, <function>, <function>, <function>, <function>, <function>, <function>, <function>, <fu nction>, <function>, <function>, <function>, <function>, <function>, <function>, <function>, <functi on>, <function>, <function>, <function>, <function>, <function>, <function>, <function>...
If you haven't seen it before, the "future" function uses threads to compute the function you passed to it. Your program can go on to do other stuff, while all that work is being done concurrently.
Except, of course, that's not what's happening. No computation is being done, because Range's non-strict map hasn't evaluated its arguments. In fact, each time I call "apply" to get the result of the computation, the function I passed to map (the future) will be computed again! For example:
scala> def expensiveComputation(n: Int) = { | println("Starting expensive computation") | val x = n * 2 | println("Finished expensive computation") | x | } expensiveComputation: (Int)Int scala> val m = 1 to 10 map (i => future(expensiveComputation(i))) (output from "m.toString" skipped) scala> m(0).apply Starting expensive computation Finished expensive computation res12: Int = 2 scala> m(0).apply Starting expensive computation Finished expensive computation res13: Int = 2
To see where that can be a problem to beginners, take a look at the code in this question at Stack Overflow:
def ttest() = { val threads = for (i <- 1 to 5) yield new Thread() { override def run() { println("going to sleep") Thread.sleep(1000) println("awake now") } } threads.foreach(t => t.start()) threads.foreach(t => t.join()) println("all done") }
Can you imagine what happens?
Well, gladly, Scala 2.8 should have a strict Range, so newcomers will be spared the trouble and lost debugging time. Here's what this will look like with Scala 2.8:
scala> val m = 1 to 10 map (i => future(expensiveComputation(i))) Starting expensive computation Finished expensive computation Starting expensive computation Starting expensive computation Finished expensive computation Starting expensive computation Finished expensive computation Starting expensive computation Finished expensive computation Starting expensive computation Finished expensive computation Starting expensive computation Finished expensive computation Starting expensive computation Finished expensive computation Starting expensive computation m: scala.collection.immutable.Vector[scala.actors.Future[Int]] = NewVectorImpl(<function0>, <function0>, <function0>, <f unction0>, <function0>, <function0>, <function0>, <function0>, <function0>, <function0>) Finished expensive computation Starting expensive computation Finished expensive computation scala> Finished expensive computation scala> m(0).apply res11: Int = 2 scala> m(0).apply res12: Int = 2
Also, if you try out the first code snippet on Scala 2.8, "x.x(0)" will also throw an exception. Now, if we want the previous behavior, we can just call the "view" method. It will work not only on Range, but on pretty much all standard collections, as it is defined on the class IterableLike.
scala> val m = (1 to 10 view) map (i => future(expensiveComputation(i))) m: scala.collection.SeqView[scala.actors.Future[Int],Seq[_]] = SeqViewM(...) scala> m.toList Starting expensive computation Finished expensive computation Starting expensive computation Finished expensive computation Starting expensive computation Finished expensive computation Starting expensive computation Starting expensive computation Starting expensive computation Starting expensive computation Finished expensive computation Starting expensive computation Finished expensive computation Finished expensive computation Finished expensive computation Finished expensive computation Starting expensive computation Starting expensive computation Finished expensive computation res5: List[scala.actors.Future[Int]] = List(, , , , , , , , , ) Finished expensive computation scala> m(0).apply Starting expensive computation Finished expensive computation res6: Int = 2 scala> m(0).apply Starting expensive computation Finished expensive computation res7: Int = 2
And, for the first example:
scala> object X { | val x = (-10 to 10 view) map (2 / _) | } defined module X scala> object Y { | val y = List.range(-10, 10).view map (2 / _) | } defined module Y scala> Y.y(0) res0: Int = 0 scala> X.x(0) res1: Int = 0
And, speaking of Scala 2.8 changes, there is one that is also likely to reduce confusion among newcomers, and is related to what we spoke of. It's related to how guards on for-comprehensions work. For instance, take a look at this code:
scala> def find(n: Int, l: List[Int]) = { | var found = false | for (el <- l; if !found) found = el == n | found | } find: (Int,List[Int])Boolean scala> find(5, List.range(1, 10)) res14: Boolean = false
This code doesn't work as expected because of how for-comprehensions are translated:
l.filter(el => !found).foreach(el => found = el == n)
Since "filter" for List is a strict method, this just doesn't work, as "filter" evaluates before "foreach". However, Range's "filter" on Scala 2.7 is non-strict. Does it work? Yes:
scala> def findR(n: Int, l: Range) = { | var found = false | for (el <- l; if !found) found = el == n | found | } findR: (Int,Range)Boolean scala> findR(5, 1 to 10) res17: Boolean = true
But, back to the first example, as of Scala 2.8 the for-comprehension code will translate like this:
l.withFilter(el => !found).foreach(el => found = el == n)
This new function, "withFilter", will be non-strict for standard Scala collections. There's no way, however, to ensure that third-party collections will preserve the non-strictness. Hopefully, people will act in a sensible manner! :-) At any rate, the "find" function, as written, will work on Scala 2.8.
I hope this helps people working with Scala 2.7, and hope even more for Scala 2.8 to arrive soon! As a parting thought, though, I'd like to briefly speak of "<-", seen in for-comprehensions. I have seen people calling it many things. Martin Odersky's (et al) Programming in Scala book, however, calls it "in". So "for (el <- l)" ought to read "for el in l". So, if you weren't sure what to call it, now you know! Happy coding!
In the first code snippet, which version of Scala are you running that under? I just downloaded the 10/28 build of 2.8 and X.x(0) is causing a ArithException.
ReplyDeleteScala 2.7.4. Yes, on Scala 2.8 it will cause errors, because Range became strict on Scala 2.8 -- why is the reason for this blog, in fact. I'll make a note about the Scala version, to be clear.
ReplyDeleteThanks for the answer. Should have read the rest of the blog before asking questions w/ obvious answers. BTW: I really enjoy your blog entries. Keep them coming.
ReplyDeleteIn your X and Y examples you probably mean calling X.x() and Y.y() on something other than zero, because regardless of range strictness if you call these functions on 0 you get division on zero.
ReplyDeleteYou get division on X.x(10) and Y.y(10).
ReplyDeleteI mean, you get division by zero on X.x(10) and Y.y(10), as the range goes from -10 (index 0) to 10 (index 20).
ReplyDeleteDaniel, the view example seems to have a copy-paste-#fail in the output (if it is supposed to be 2.8)
ReplyDelete//Örjan
Örjan,
ReplyDeleteThanks for reporting it. I corrected the example I think you were speaking of (one which displayed wrong output anyway). Scala 2.8.0 had not been released yet, and some changes in the REPL made that example look different -- though the explanations on how the code is supposed to work are still correct.
I also put in a note at the beginning, since this article is effectively useless nowadays. Newcomers are using Scala 2.8, and people using Scala 2.7 are rarely newcomers anymore. :-)
thank you for sharing useful post.
ReplyDeleteweb programming tutorial
welookups