tag:blogger.com,1999:blog-63739638293406325292024-03-13T08:15:26.603-03:00Algorithmically challengedRandom thoughts of an IT worker in the stone age of computer science.Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.comBlogger64125tag:blogger.com,1999:blog-6373963829340632529.post-25316977109785132014-07-04T23:57:00.001-03:002014-07-04T23:57:28.491-03:00File generation with SBTSomeone asked me a question on IRC about file generation with SBT. I pointed out <a href="http://www.scala-sbt.org/0.13.5/docs/Detailed-Topics/Classpaths.html#unmanaged-v-managed" target="_blank">this link</a> on the SBT documentation, and tried to briefly explain how it worked, but the subject got a little too long for IRC, so I thought I might make a blog post out of it. Good thing too, because there are some errors in that page.<br />
<br />
Anyway, let's start. The goal here is that, when you compile a project, some source files are going to be generated by code, and then compiled together with the other ones you wrote. The person wanted the generator to have tests -- for such, I recommend writing an SBT plugin. I won't go further into that, and just explain the basic mechanism for generating source files.<br />
<br />
If you <span style="font-family: Courier New, Courier, monospace;">inspect sourceGenerators</span>, the setting mentioned by the SBT page, you'll see the follow description:<br />
<br />
<br />
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">[info] Setting: scala.collection.Seq[sbt.Task[scala.collection.Seq[java.io.File]]]</span></div>
<div class="p1">
<br /></div>
<div class="p1">
That means it is a setting (that is, it's value is fixed by the configuration file). The setting contains a sequence, which means you can have more than one source generator. This sequence contains tasks, so each generator is a task, and <i>that</i> means they will be evaluated every time they get executed. The task must return a sequence of files, which I assumed, correctly, to be the list of files that were generated.</div>
<div class="p1">
<br /></div>
<div class="p1">
Now, you'll also see further down this information:</div>
<div class="p1">
<br /></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">[info] Reverse dependencies:</span></div>
<div class="p1">
</div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">[info] <span class="Apple-tab-span"> </span>root/compile:managedSources</span></div>
<div class="p1">
<br /></div>
<div class="p1">
That means it is <span style="font-family: Courier New, Courier, monospace;">managedSources</span> that uses <span style="font-family: Courier New, Courier, monospace;">sourceGenerators</span>. And <span style="font-family: Courier New, Courier, monospace;">inspect uses managedSources</span> shows this:</div>
<div class="p1">
<br /></div>
<div class="p1">
</div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">[info] <span class="Apple-tab-span"> </span>root/compile:sources</span></div>
<div class="p1">
<br /></div>
<div class="p1">
In other words, whenever you compile, any source generators you have defined will be run. You can see as well that this is defined not only for <span style="font-family: Courier New, Courier, monospace;">compile</span>, but also for <span style="font-family: Courier New, Courier, monospace;">test</span> or any other compilation task you may have (I also have <span style="font-family: Courier New, Courier, monospace;">it:compile</span>, for example).</div>
<div class="p1">
<br /></div>
<div class="p1">
So, with that in mind, we can start creating our generator. All the lines below can be placed in a <span style="font-family: Courier New, Courier, monospace;">build.sbt</span> file, though you'll use plain Scala files with a plugin. This is just to quickly demonstrate how it's used. First, I'm going to create a task of sequence of files, which will be my generator:</div>
<div class="p1">
<br /></div>
<div class="p1">
</div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;"><span class="s1">lazy</span> <span class="s1">val</span> generator = taskKey[<span class="s2"><b>Seq[File]</b></span>](<span class="s3">"My Generator"</span>) </span></div>
<div class="p1">
<span class="s4"><br /></span></div>
<div class="p1">
<span class="s4">Don't ask me about why it's "<span style="font-family: Courier New, Courier, monospace;">lazy val</span>" -- I'm simply repeating what I saw elsewhere. :) Also note that this uses the equals sign, not the colon-equals sign.</span></div>
<div class="p1">
<span class="s4"><br /></span></div>
<div class="p1">
<span class="s4">Now that we have a task key, we can assign a task to it. Since it's going to be of some complexity, let's start with:</span></div>
<div class="p1">
<span class="s4"><br /></span></div>
<div class="p1">
<span class="s4">
</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">generator in <span class="s1"><b>Compile</b></span> := {</span></div>
<div class="p1">
<span class="s2"><br /></span></div>
<div class="p1">
<span class="s2">Now we can proceed with the rest. I'm going to define a method with the basic generating capabilities, and then call this method with some parameters as the body of this task. My generator will be pretty simple: given source and destination directories, copy all files ending with <span style="font-family: Courier New, Courier, monospace;">.txt</span> from the source to the destination, changing the extension to <span style="font-family: Courier New, Courier, monospace;">.scala</span>. Not very useful, perhaps, but enough to show how to get at some source, and produce something with it at a proper destination. So here is is:</span></div>
<div class="p1">
<span class="s2"><br /></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> <span class="s1">import</span> _root_.java.nio.file.Files</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> <span class="s3">def</span> <span class="s4"><b>generate</b></span>(src:<span class="s5"> File</span>, dst:<span class="s5"> File</span>):<span class="s5"> Seq[File]</span> = {</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> <span class="s3">val</span> sourceFiles = <span class="s4"><b>Option</b></span>(src.list) getOrElse <span class="s4"><b>Array</b></span>() filter (<span class="s3">_</span> endsWith <span class="s6">".txt"</span>)</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> <span class="s3">if</span> (sourceFiles.nonEmpty) dst.mkdirs()</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> <span class="s3">for</span> (file <span class="s3"><-</span> sourceFiles) <span class="s3">yield</span> {</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> <span class="s3">val</span> srcFile = src / file</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> <span class="s3">val</span> dstFile = dst / ((file take (file lastIndexOf <span class="s6">'.'</span>)) + <span class="s6">".scala"</span>)</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> Files.copy(srcFile.toPath, dstFile.toPath)</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> dstFile</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> }</span></div>
<div class="p1">
<span class="s2"><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">
</span></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> }</span></div>
<div class="p1">
<span class="s2"><br /></span></div>
<div class="p1">
<span class="s2">There's a couple of things here. First, note that I'm handling the case where there's no source files -- I tested it on a project with multiple subprojects, which resulted in annoying exceptions when trying out. Also, note that I create the target directory: even though SBT provided me with a target directory, it didn't actually create it. And I pass an option to replace existing files as well -- remember that it has to work without running <span style="font-family: Courier New, Courier, monospace;">clean</span> every time. Finally, notice how I return the destination files, as required by <span style="font-family: Courier New, Courier, monospace;">sourceGenerators</span>.</span></div>
<div class="p1">
<span class="s2"><br /></span></div>
<div class="p1">
<span class="s2">Now, for source and destination directories. There's a setting for the destination directory, which I also saw on the SBT docs linked page. As for the base directory, I'll get the base directory of the current project, and add a subdirectory to it. So my task ends with:</span></div>
<div class="p1">
<span class="s2"><br /></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;"> generate(baseDirectory.value / <span class="s1">"managed"</span>, sourceManaged.value)</span></div>
<div class="p1">
<span class="s2"><span style="font-family: Courier New, Courier, monospace;">
</span></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">}</span></div>
<div class="p1">
<br /></div>
<div class="p1">
All that remains is assigning it to <span style="font-family: Courier New, Courier, monospace;">sourceGenerators</span>, which actually took some time because the documentation was wrong. In the end, I saw an email mentioning that the ".task" macro suggested in the SBT docs doesn't actually exist because it was already taken by something else. So trying to use that give strange errors. The actual syntax I had to use is this:</div>
<div class="p1">
<br /></div>
<div class="p1">
</div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">sourceGenerators in <span class="s1"><b>Compile</b></span> <+= (generator in <span class="s1"><b>Compile</b></span>)</span></div>
<div class="p1">
<span class="s2"><br /></span></div>
<div class="p1">
To test, I wrong some stuff to a text file, intentionally meant to cause a compilation error, and ran the compile task with this result:</div>
<div class="p1">
<br /></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">sinks:master>compile</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">[info] Compiling 1 Scala source to /Users/dsobral/src/sinks/target/scala-2.11/classes...</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">[<span class="s1">error</span>] /Users/dsobral/src/sinks/target/scala-2.11/src_managed/test.scala:1: expected class or object definition</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">[<span class="s1">error</span>] This file should cause a compilation error.</span></div>
<div class="p2">
<span style="font-family: Courier New, Courier, monospace;"><span class="s2">[</span>error<span class="s2">] ^</span></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">[<span class="s1">error</span>] one error found</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">[<span class="s1">error</span>] (root/compile:<span class="s1">compile</span>) Compilation failed</span></div>
<div class="p1">
</div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">[<span class="s1">error</span>] Total time: 1 s, completed Jul 4, 2014 8:55:12 PM</span></div>
<div class="p1">
<br /></div>
Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com2tag:blogger.com,1999:blog-6373963829340632529.post-62088594058050738562014-02-17T16:55:00.000-03:002014-02-23T16:09:50.858-03:00Two ThirdsThis is not my usual programming-related blog post. I decided to blog about books I have been reading.<br />
<br />
I'm a long time fan of the <a href="http://www.amazon.com/s/?_encoding=UTF8&camp=1789&creative=390957&field-keywords=honor%20harrington&linkCode=ur2&sprefix=honor%20%2Caps%2C167&tag=algorihmchall-20&url=search-alias%3Ddigital-text" target="_blank">Honor Harrington Series</a><img alt="" border="0" height="1" src="https://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=ur2&o=1" style="border: none !important; margin: 0px !important;" width="1" />, a military science fiction series that draws on the spirit of 17~19 century naval series such as <a href="http://www.amazon.com/s/?_encoding=UTF8&camp=1789&creative=390957&field-keywords=horation%20hornblower&linkCode=ur2&tag=algorihmchall-20&url=search-alias%3Ddigital-text" target="_blank">Horatio Hornblower</a><img alt="" border="0" height="1" src="https://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=ur2&o=1" style="border: none !important; margin: 0px !important;" width="1" /> or <a href="http://www.amazon.com/s/?_encoding=UTF8&camp=1789&creative=390957&field-keywords=aubrey-maturin%20series&linkCode=ur2&sprefix=aubrey-maturin%2Caps%2C181&tag=algorihmchall-20&url=search-alias%3Dstripbooks" target="_blank">Aubrey-Maturin</a><img alt="" border="0" height="1" src="https://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=ur2&o=1" style="border: none !important; margin: 0px !important;" width="1" /> (from which sprang the movie <a href="http://www.amazon.com/gp/product/B0001HLVS2/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B0001HLVS2&linkCode=as2&tag=algorihmchall-20">Master and Commander: The Far Side of the World</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B0001HLVS2" height="1" style="border: none !important; margin: 0px !important;" width="1" />
) . These days, however, there are enough secondary stories in that universe that stories advancing the main plot are rather hard to come by. Though, on the other hand, one could say that the original story has finally concluded, and what's going on now is a new story.<br />
<br />
For a bit, I tried to turn to a follow up on what is possibly my favorite fantasy trilogy of books, <a href="http://www.amazon.com/gp/product/B00APA1E96/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00APA1E96&linkCode=as2&tag=algorihmchall-20">The Deed of Paksenarrion</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B00APA1E96" height="1" style="border: none !important; margin: 0px !important;" width="1" />. Elizabeth Moon returned to the series with <a href="http://www.amazon.com/gp/product/B0030DHPAC/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B0030DHPAC&linkCode=as2&tag=algorihmchall-20">Oath of Fealty</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B0030DHPAC" height="1" style="border: none !important; margin: 0px !important;" width="1" />, followed by other books, but they pale in comparison with the original, which was a quite believable, and somewhat moving, story of the daughter of a sheep farmer on the back beyond who becomes a paladin.<br />
<br />
So, in despair, I tried searching for other stuff. First I came upon The Kingkiller Chronicles, feeling somewhat like the last one to know of it (and if you didn't know of it you should immediately get <a href="http://www.amazon.com/gp/product/B0010SKUYM/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B0010SKUYM&linkCode=as2&tag=algorihmchall-20">The Name of the Wind</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B0010SKUYM" height="1" style="border: none !important; margin: 0px !important;" width="1" />
and <a href="http://www.amazon.com/gp/product/B00475AYJQ/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00475AYJQ&linkCode=as2&tag=algorihmchall-20">The Wise Man's Fear</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B00475AYJQ" height="1" style="border: none !important; margin: 0px !important;" width="1" />
). So, that's three books of which only two are written. This is fantasy, but, honestly, that's beside the point -- it is the prose and the attention to detail that make these books great reading.<br />
<br />
Back to waiting, I looked around and found The Golden Threads Trilogy, a mix of fantasy and science fiction (though the latter mainly from the second book) story that's quite clever. I particularly <i>love</i> how everyone in the first book, <a href="http://www.amazon.com/gp/product/B00B7ZCEQK/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00B7ZCEQK&linkCode=as2&tag=algorihmchall-20">Thread Slivers</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B00B7ZCEQK" height="1" style="border: none !important; margin: 0px !important;" width="1" />, has a different conception of what's going on and what other people want. It's highly amusing. The second book, <a href="http://www.amazon.com/gp/product/B00EV62DBS/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00EV62DBS&linkCode=as2&tag=algorihmchall-20">Thread Strands</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B00EV62DBS" height="1" style="border: none !important; margin: 0px !important;" width="1" />, sadly decreases the fog of war factor, and leads to... well, I'll have to wait for the third book to get published to find out. Again.<br />
<br />
As I waited, I noticed that the <img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B00BEQP3K6" height="1" style="border: none !important; margin: 0px !important;" width="1" />March Upcountry
series by John Ringo was getting combo-treatment, with <a href="http://www.amazon.com/gp/product/B00BEQP3K6/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00BEQP3K6&linkCode=as2&tag=algorihmchall-20">March Upcountry</a> and <a href="http://www.amazon.com/gp/product/B00C9GFSW8/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00C9GFSW8&linkCode=as2&tag=algorihmchall-20">March to the Sea</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B00C9GFSW8" height="1" style="border: none !important; margin: 0px !important;" width="1" />
being bundled in <a href="http://www.amazon.com/gp/product/B00HW1TY9I/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00HW1TY9I&linkCode=as2&tag=algorihmchall-20">Empire of Man</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B00HW1TY9I" height="1" style="border: none !important; margin: 0px !important;" width="1" />. It seems <a href="http://www.amazon.com/gp/product/B00APAD2F0/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00APAD2F0&linkCode=as2&tag=algorihmchall-20">March to the Stars</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B00APAD2F0" height="1" style="border: none !important; margin: 0px !important;" width="1" />
and <a href="http://www.amazon.com/gp/product/B00AP91W76/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00AP91W76&linkCode=as2&tag=algorihmchall-20">We Few</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B00AP91W76" height="1" style="border: none !important; margin: 0px !important;" width="1" />, the fourth and final book, will be out in a combo soon as well. Anyway, this is military science fiction pitting commandos against dinosaurs and spear-wielding aliens. What's not to like? :)<br />
<br />
Now, after I re-read these books, I decided to search for other stuff by John Ringo, and came upon Black Tide Rising, a zombie series. This is one of the "realistic zombies" kind of series, where people aren't really zombies, just infected with a rabies-like virus. It tries to be realistic in the portrayal of how people survive and fight back as well, though its world is rather lighter than I feel is realistic. I don't mind though: I prefer more cheerful worlds, even in a zombie apocalypse, than what I think is realistic. :)<br />
<br />
Anyway, I read <a href="http://www.amazon.com/gp/product/B00ELR01M0/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00ELR01M0&linkCode=as2&tag=algorihmchall-20">Under a Graveyard Sky</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B00ELR01M0" height="1" style="border: none !important; margin: 0px !important;" width="1" />
in a single day, then followed up with <a href="http://www.amazon.com/gp/product/B00HW1TV8W/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00HW1TV8W&linkCode=as2&tag=algorihmchall-20">To Sail a Darkling Sea</a><img alt="" border="0" src="http://ir-na.amazon-adsystem.com/e/ir?t=algorihmchall-20&l=as2&o=1&a=B00HW1TV8W" height="1" style="border: none !important; margin: 0px !important;" width="1" />
a little slower... but only because, damn!, that's it for now. And it's not even going to be a trilogy! As a bonus, the first book comes with a "zombie clearance playlist" -- nice! :)Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com4tag:blogger.com,1999:blog-6373963829340632529.post-13920461540087721022013-02-13T02:07:00.001-02:002013-02-13T02:07:36.504-02:00Adding numbers the hard wayI finally got to read the <a href="http://robey.lag.net/2012/11/14/how-to-add-numbers-2.html">second part</a> of Robey Pointer "How to add numbers" blog posts. Thinking about hardware "algorithms" can be interesting because distributed systems face similar problems sometimes.
Below is my implementation of Kogge-Stone addition for 32 bits integers. Brent-Kung hybrid is very clever, but I couldn't figure out an obvious "step" for Brent-Kung that could be recursed into.
<pre class="brush:scala">
def add(x: Int, y: Int, cin: Boolean) = {
def intToBooleanArray(n: Int): Array[Boolean] = {
(0 until 32 map ((1).<<) map (n.&) map (0.!=)).toArray
}
val xs: Array[Boolean] = intToBooleanArray(x)
val ys: Array[Boolean] = intToBooleanArray(y)
// P means cout depends on cin
// G means cout is 1 regardless of cin
case class PG(p: Boolean, g: Boolean)
def p(a: Boolean, b: Boolean) = a ^ b
def g(a: Boolean, b: Boolean) = a & b
// initial PG from input alone
val pg0 = (xs, ys).zipped map ((a, b) => PG(g = g(a, b), p = p(a, b)))
// Execute combine step until all PGs are final
def combStep(lastPGs: Array[PG], finalPGs: Array[PG], step: Int): Array[PG] = {
// Combines PGs formed by adjacent block of bits
def comb(pga: PG, pgb: PG): PG = PG(p = pgb.p & pga.p,
g = pgb.g | (pgb.p & pga.g))
if (lastPGs.isEmpty) finalPGs
else {
val (newFinalPGs, tempPGs) = lastPGs splitAt (step - step / 2)
val nextPGs = (finalPGs ++ lastPGs, tempPGs).zipped map comb
combStep(nextPGs, finalPGs ++ newFinalPGs, step << 1)
}
}
val pgs = combStep(pg0, finalPGs = Array.empty, step = 1)
// Carry for each bit
def cout(cin: Boolean)(pg: PG): Boolean = pg.g | (pg.p & cin)
val cn = pgs map cout(cin)
// Final result for each bit
val sn = (pg0 map (_.p), cin +: (cn take 31)).zipped map ((p, c) => p ^ c)
// Convert boolean array back into an int
val result = (sn.zipWithIndex map {
case (s, i) => if (s) 1 << i else 0
}).sum
(result, cn.last)
}
</pre>
Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com2tag:blogger.com,1999:blog-6373963829340632529.post-63799949540718465192013-01-18T03:57:00.000-02:002013-01-18T03:57:25.357-02:00Pattern matching on abstract types with Scala 2.10.0Scala 2.10.0 is out, and one of its greatest improvements is a completely new pattern matching algorithm on the compiler. That algorithm fixes lots of bugs that have existed all the way up to 2.9.x and adds more and better static checks.<br />
<br />
One interesting thing that has probably gone unnoticed by most, however, is that it can do <i>more</i> than what the old pattern matcher did, in at least one respect: it can match against abstract types, provides a ClassTag.<br />
<br />
To understand that better, consider this REPL session on Scala 2.9.2:<br />
<br />
<pre class="brush: scala">
scala> def f[T: ClassManifest](l: List[Any]) = l collect {
| case x: T => x
| }
<console>:8: warning: abstract type T in type pattern T is unchecked sin
ce it is eliminated by erasure
case x: T => x
^
f: [T](l: List[Any])(implicit evidence$1: ClassManifest[T])List[T]
scala> f[String](List(1, 2.0, "three"))
res0: List[String] = List(1, 2.0, three)
</pre><br />
Now let's look at what can be done with Scala 2.10.0:<br />
<br />
<pre class="brush: scala">
scala> import scala.reflect.ClassTag
import scala.reflect.ClassTag
scala> def f[T: ClassTag](l: List[Any]) = l collect {
| case x: T => x
| }
f: [T](l: List[Any])(implicit evidence$1: scala.reflect.ClassTag[T])List
[T]
scala> f[Int](List(1, 2.0, "three")) // It can't find Int because they are boxed
res0: List[Int] = List()
scala> f[String](List(1, 2.0, "three")) // But AnyRefs are ok
res1: List[String] = List(three)
</pre><br />
Note that it doesn't reify types -- that is, it can't tell whether your List[Any] is a List[String], but it does go a bit further than what was possible before.
Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com3tag:blogger.com,1999:blog-6373963829340632529.post-26563402986738290012012-12-11T19:12:00.002-02:002012-12-11T19:12:35.137-02:00Bugs and dynamically vs statically typed languagesSomething occurred to me that, unfortunately, was just a little bit too big for a tweet, so I decided to blog instead, and add more context.<br />
<br />
While thinking about the perennial question of whether static typing help reduce the number of bugs in an otherwise well tested code base, I was reminded of how many bugs I saw tagged by newer versions of Scala with improved static analysis and correctness, or how damn hard it is to do variance right when you start using it.<br />
<br />
I was then struck by a thought: the "bugs" I was thinking of were not caught by tests because no feature in the code actually used the buggy class in a way that revealed the bug. This also relates to variance, because variance in Java is defined at the <i>usage</i> site, instead of definition site. The definitions are all invariant in Java (aside from arrays).<br />
<br />
So, thinking about it, I came up with the following hypothesis:<br />
<br />
<i>Dynamically typed language proponents consider "bugs" to be things that the user of the software can cause it to do, while statically typed language proponents consider "bugs" to be things that users of the <b>code</b> can cause it to do.</i><br />
<i><br /></i>
That would certainly account for some differences in attitude and beliefs I perceive. On the other hand, it might just be my imagination.Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com10tag:blogger.com,1999:blog-6373963829340632529.post-64334847290579830892012-10-08T17:46:00.000-03:002012-10-08T17:46:39.019-03:00In hindsight, maybe I should have used Actors...<h2>
Postmortem</h2>
<div>
This post is meant for myself.</div>
<div>
<br /></div>
<div>
For the past six weeks, I have been working hard on a system target exclusively at the election day this past Sunday, October 7th 2012, in my country. Granted that I took off a few days to go to Agile Brazil 2012, but I returned every one of them with plenty interest.</div>
<div>
<br /></div>
<div>
So I write this for myself, as a postmortem, to help me clear up some ideas about it all. You see, this was the first time I had the opportunity to use Scala for anything critical -- or, for that matter, larger than a script -- as part of my job.</div>
<div>
<br /></div>
<div>
So, seriously, you don't have to read this. I'm sure there's some other post or presentation about Scala that's bound to be way more interesting than this, and which you haven't seen yet. Look up Scala Days, Skills Matter or NE Scala, if you are out of ideas. :-)</div>
<h3>
The Setting</h3>
<div>
Since you are still with me, I suppose I should explain a bit what the program was intended to do. It should help me figure out how within the scope everything turned out as well.</div>
<div>
<br /></div>
<div>
For a critical project, I'm a bit ashamed to admit it was nothing more than an <a href="http://en.wikipedia.org/wiki/Extract,_transform,_load" target="_blank">ETL - Extract, Transform and Load</a>. It was supposed to download election data, extract relevant information, and load it into a database. A few qualities deemed to be paramount:</div>
<div>
<ul>
<li><b>Resilience</b>. The program should be able to withstand processing errors such as invalid data, connectivity problems and such. Depending on the kind of error, it would either retry or ignore -- but always log!</li>
<li><b>Reliability</b>. If the program deemed a particularly process to be successfully concluded, it damn well better be both error free and finished. That was harder than it sounds, which is something I should look into.</li>
<li><b>Real Time</b>. This program would feed systems used to provide news feeds on the election, so the sooner it got news out, the better. There was actually a software provided to do the extraction part in batch, which set up the bar for us.</li>
<li><b>Resource Aware</b>. This was a vague requirement, and not everyone thought it was a requirement at all. The point was being able to use lots of resources to extract data, but without going overboard and somehow get throttled down.</li>
</ul>
<h3>
Highlights</h3>
</div>
<div>
Before I go any further on the real meat -- the problems -- I want to highlight the things that really shined, in my opinion.</div>
<h4>
<a href="http://dispatch.databinder.net/Dispatch.html" target="_blank">Dispatch</a></h4>
<div>
This was the first time I used this controversial Scala library -- scala.io.Source had sufficed for me before. I understand the current generation, 0.9.x, has some important differences from previous ones. It has a much reduced set of operators, and it has full support for operator-free use. Also, it's asynchronous, which is were the fun came.</div>
<div>
<br /></div>
<div>
Let me be plain: Dispatch is awesome., and that has nothing to do with operators. Sure, the clean syntax is nice, though sometimes I'd prefer it had better support for tuple varargs. But that's just a side show. No, the main game is its asynchronous nature, and how it got implemented.</div>
<div>
<br /></div>
<div>
Dispatch is based on <a href="https://github.com/sonatype/async-http-client" target="_blank">Async HTTP Client</a>, which provides a nice layer over Java's asynchronous I/O libraries. And, of course, Dispatch provides a Scala layer over Async HTTP, making it even nicer. It does not, however <i>hide</i> Async HTTP. On the contrary, you work with Async HTTP objects at all times, and have the Dispatch abstractions available as enhancements.</div>
<div>
<br /></div>
<div>
Now, doing I/O asynchronously means you make a requisition and tell the library what it should do with it once the requisition is done, instead of waiting for the result and then doing stuff with it. That's what you see with NodeJS callbacks, for instance, and anyone who worked long enough with it should be at least wary of its traps.</div>
<div>
<br /></div>
<div>
The abstraction provided by Dispatch to work around that is called Promise, though it better approaches what Scala 2.10 will introduce as <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.concurrent.Future$" target="_blank">Future</a>. Nope, that's not the old actor-based Future -- it's actually something that everyone has been doing (such as Dispatch), and which prompted it's inclusion in the library. See <a href="http://docs.scala-lang.org/sips/pending/futures-promises.html" target="_blank">Scala Improvement Process 14</a> for more details. No, really, see it.</div>
<div>
<br /></div>
<div>
The gist of it (Dispatch-style) is that you are waiting for something to be completed (like the HTTP request you made), and you are holding onto an object, the promise, that will get it to you once it's finished. If an exception is thrown by the process, your program then the exception will be encapsulated into that promise as well. Extract it, and the exception is thrown at the point you extracted it. And, of course, Scala-style, you can map (or flatMap that shit) the results as you wish.</div>
<div>
<br /></div>
<div>
There's one thing I like better on Dispatch's Promise than Scala's Future is that it provides methods to convert the Promise[T] into a Promise[Option[T]] or Promise[Either[Throwable, T]], which makes error handling nice. I don't understand why Scala's Future doesn't have these simple interfaces. Option seems to require at least a one liner through andThen and toOption on a Try, but getting an Either out seems even harder. On Dispath, promise.either returns a promise for Either, and promise.Option returns a promise for Option, and that's that.</div>
<div>
<br /></div>
<div>
Dispatch also has something to convert an iterable of promises into a promise of iterables. Happily, Scala 2.10 has that as well, and even more.</div>
<div>
<br /></div>
<div>
So, anyway, all this meant it was very easy to chain transformations triggered by HTTP requests, logging errors were it made sense, flattening unsuccessful computations into None, and getting Either to code that could make sense out of it.</div>
<div>
<br /></div>
<div>
Performance and concurrency, easily understandable, easy to manipulate, FTW.</div>
<h4>
<a href="http://specs2.org/" target="_blank">Specs2</a></h4>
<div>
I had some pretty tough challenges testing everything. Most of the code was written TDD, but there were plenty XMLs, JSONs, timing issues, concurrency issues, a dozen mostly similar classes that had to be tested for its small differences, etc. It's not that it was <i>hard</i>, though the wrong framework certainly could have made it so, but it made for some pretty gruesome tests, and one wants tests to be as easy to understand as possible.</div>
<div>
<br /></div>
<div>
Time and time again, Specs2 came through with the goods. I had used Specs2 before a few times already, but, interestingly enough, most of the features I was most familiar with (parser combinator testers, given-when-then combinators) were completely absent from my tests. To make up for it, I feel like I have seen <i>everything</i> that I had overlooked up to now. I know that's not true, because I have a list, you see, but...</div>
<div>
<br /></div>
<div>
I'm not claiming Specs2 does anything other frameworks don't -- the same way I didn't know Specs2 features I came to use, I don't know how well other frameworks handle them. I'm just saying that there wasn't a single thing I needed that couldn't be done with it, and in a very satisfactory manner.</div>
<div>
<br /></div>
<div>
A special mention goes to <a href="http://code.google.com/p/mockito/" target="_blank">Mockito</a>, a great mocking framework made greater by how well Specs2 is integrated with it.</div>
<h4>
Eric Torreborre</h4>
<div>
Eric's Specs2 author. The level of support he provided me was amazing, and anyone who spends sometime reading specs2 mailing list will know that's how he treats everyone. Did you find a bug, or needed something just a tiny little bit different than what Specs2 provides? It's rare the day I don't see Eric rolling out a new snapshot with it within 24 hours, tops.</div>
<div>
<br /></div>
<div>
Do you have trouble understanding how to make your tests work a certain way, or how to do something in Specs2? Eric will discuss it until you come away enlightened.</div>
<div>
<br /></div>
<div>
I've always been a bit amazed at Eric's awesomeness, but having it while writing a critical piece of software made it really personal. So, thanks Eric.</div>
<h4>
<a href="http://unfiltered.databinder.net/Unfiltered.html" target="_blank">Unfiltered</a></h4>
<div>
Unfiltered is a cousin (maybe a brother) to Dispatch, handling the other end of HTTP requests. What's nice about Unfiltered is that it works as a library, making it really easy to add small HTTP servers to your code. This wasn't part of the production code, but it made testing the parts of the code that did HTTP requests really easy.</div>
<div>
<br /></div>
<div>
I have no basis to make recommendations on it as a general library (though it certainly bears investigation), but I heartily endorse it to handle those interface (as in, with other systems) tests. It was doubly useful to me, since both my Input and my Output (the database I was feeding) were HTTP servers.</div>
<h4>
<a href="http://software.clapper.org/avsl/" target="_blank">AVSL</a> and <a href="http://software.clapper.org/grizzled-slf4j/" target="_blank">Grizzled SLF4J</a></h4>
<div>
These nice libraries by Brian Clapper made the logging goes smoothly. It was particularly striking given that some other modules of the project got a lot of trouble on this regard.</div>
<div>
<br /></div>
<div>
Anyway, I picked AVSL as a suggestion from Dispatch, which uses SLF4J-based logging. Knowing I could replace AVSL had a soothing effect, and it was pretty easy to get working quickly. I started out with an AVSL-only logging to handle some immediate problems I was having, and then replaced them with Grizzled to be able to replace AVSL if needed.</div>
<div>
<br /></div>
<div>
AVSL was easy to configure and use, and it was particularly nice being able to get extra verbose logging on some parts of my code while keeping the rest relatively quiet when I tried to figure out why something wasn't working as I expected it to.</div>
<h4>
<a href="http://www.jetbrains.com/idea/" target="_blank">IntelliJ IDEA</a></h4>
<div>
On the first week, I worked solely with Sublime Text 2, since that was the editor of preference of the person I was pairing with. Given that he didn't have Scala experience, I thought it was fair to have anything else I could as familiar as possible to him.</div>
<div>
<br /></div>
<div>
Then he moved on to some other part of the project that was demanding some urgent attention, and I was left finishing up the project (until the scope changed, that is).</div>
<div>
<br /></div>
<div>
Now, I'm a vim user, and happily so, but I came up upon some issues that demanded heavy refactoring, and so I turned to IntelliJ. I wish IntelliJ did more -- Scala support pales in comparison to Java's -- but the final result would have looked a lot uglier than it is if not for IntelliJ.</div>
<div>
<br /></div>
<div>
I bought a personal license for it at the start of the third week, as I geared up to start working on the nodejs module. That never happened, but I don't regret spending that money.</div>
<h4>
TDD</h4>
<div>
So, I arrived 9:20 am on Sunday to do the first load of static data for the election ("static" data could, and did, change at any time to account for judicial decisions). I ran the code and got some database insertion errors. Now, I had had some trouble with the database during stress tests, but this didn't look like the problems we had, and there certainly wasn't any stress going on.</div>
<div>
<br /></div>
<div>
So I investigated the problem, found the cause in some badly formatted input, changed the code to account for it, produced a new snapshot, tested it, and put it in production. I didn't ask for permission, I didn't talk to my colleages to hear what they thought about it, I just did it, not only without fear, but with absolute confidence I didn't break anything with that change. On the morning of the election day.</div>
<div>
<br /></div>
<div>
I've seen plenty of last minute changes throughout my career, mind you, but I might even take a bet on that more than half of them broke something else. Changing with confidence that nothing broke -- without breaking things! -- is something I have only known through TDD.</div>
<div>
<br /></div>
<div>
That was hardly the only time I got benefits out of TDD, but it serves to epitomize my whole experience. The scope changed on us many times, and I had to do fundamental changes to my code as result, but nothing that was working before ever stopped working.</div>
<div>
<br /></div>
<div>
Of course, I did introduce a few bugs despite TDD, though they were usually very short lived.</div>
<h4>
<a href="http://mobz.github.com/elasticsearch-head/" target="_blank">Elastic Search Head</a> (boring link, but follow to the github link)</h4>
<div>
This is an HTML5 application that serves as a general interface/console to the Elastic Search database. You can just clone the repository and open your browser on it's index.html to use it -- there are no additional requirements.</div>
<div>
<br /></div>
<div>
I cannot imagine ever working with Elastic Search without it, but, then again, there's absolutely no reason why should that ever happen -- it's free, and the only requisite is an HTML5 enabled browser. If you have an Elastic Search you interact with, get this.</div>
<div>
<br /></div>
<div>
Furthermore, I'm now very unhappy with everything else for not having something like it. Well, maybe they have and I have lived in the dark ages so far. Now that I have seen the light, though, I have some pretty high standards concerning what kind of tooling should be freely available to a database of any kind.</div>
<div>
<br /></div>
<div>
Elastic Search itself is responsible for it to a great extent -- like this, there are many other tools available for it.</div>
<h3>
Bumps</h3>
<div>
Naturally, there were parts of the project that, while providing value overall, were besmirched by some problems. Now, the following stuff provided overall net gains for the project, but I'm going to concentrate on what was difficult, as that's the places were improvements can be made.</div>
<h4>
SBT</h4>
<div>
You knew I was going to say that, don't you?</div>
<div>
<br /></div>
<div>
SBT is a major source of love-hate relationships. I mostly like it, and get confused why people seem to have so much trouble with it. It seems to come down to multi-module projects (that is, <i>not</i> using an internal repository to handle separate projects as separate projects), and plugins (and why do people need to do so much weird stuff anyway?). Well... it could stand to be more discoverable. Things like "show scala-version" vs "set scalaVersion" also get in the way, and, to further make this point, I still don't get exactly why it has to be this way.</div>
<div>
<br /></div>
<div>
But none of this was a problem to me. This was a single module project, with a build.sbt and a project/plugins.sbt, and that was that. Setting things up was a simple matter of copy&paste from library and sbt plugins. So, no, knowing what lines to put in these files never presented any problem.</div>
<div>
<br /></div>
<div>
My problem was getting SBT to run at all. Being a Scala geek, I've been writing my sbt shell scripts for a long time now, downloading launcher jars and stuff like that, but what worked for me on my own free time was simply not good enough to get such a critical project going.</div>
<div>
<br /></div>
<div>
Simply put, people who were not me and didn't really understand much of Scala, let alone SBT, had to be able to build and test the project, with minimal instructions. Fortunately, the <a href="http://www.scala-sbt.org/" target="_blank">SBT site</a> now has nice Debian packages for download. Only it doesn't, really -- it has a package that installs a source for the SBT package, and it turns out that I manage my sources automatically through Puppet, which made that very annoying. I had to install the package to find out what the source was so I could put it on the build server configuration.</div>
<div>
<br /></div>
<div>
Only that didn't really help. You see, I use SBT 0.12 since about the time it came out, and that's not the version available on the site above. Worse, some of the global plugins I have are incompatible with the version provided by the Debian package.</div>
<div>
<br /></div>
<div>
The problems didn't stop there, however. The developers machine need to go out through an NTLM authenticated proxy with Basic Authentication as fallback. Unix tools in general and Java in particular seems to have a lot of trouble with this setup. Browsers have no problem whatsoever, but getting SBT to download <i>itself</i> (the parts that are not the launcher) takes some doing. Using the proxy, yes, that's easy. Getting it to authenticate, now, there's a challenge.</div>
<div>
<br /></div>
<div>
In the end, I ended bundling SBT 0.12's jar, a hand crafted sbt shell script, and some obscure configuration options to make it download everything from our Artifactory. That meant adding every single dependency to it, and some of those dependencies -- like those of typesafe -- were tough to get right. Lots of trouble with ivy2 repos vs maven repos, and lots of maven repos were I had to tell Artifactory not to validate the poms.</div>
<div>
<br /></div>
<div>
Once all of that was ready, the rest was a breeze. After all the trouble I had, I delayed doing the Jenkins build for a long time, only to get it working the first time in less than ten minutes when i finally got to it. Code coverage reports, code duplication, style checks, source code x-ray, javadocs, single jar files, generation of ensime project for Sublime2, generation of IntelliJ and Eclipse projects... Almost all of it accomplished effortlessly.</div>
<div>
<br /></div>
<div>
Also, sbt ~test worked really well during the time I used Sublime Text 2, but I suppose that isn't a surprise to anyone. Or, at least, I sure hope so.</div>
<h4>
<a href="http://mtkopone.github.com/scct/" target="_blank">SCCT</a></h4>
<div>
This is the tool I used to generate code coverage. It mostly works, but I doubted it for a while. You see, you have to compile the whole project (or, perhaps, all of the tests) with its action for it to work. If it uses partial compilation on something, it only shows coverage for those files that were changed!</div>
<div>
<br /></div>
<div>
Once I figured that, it was a simple thing to add a clean to the Jenkins project, or when running on my own machine. Since I only look at code coverage sporadically, this wasn't a problem. There was no need for integration on Jenkins, so Jenkins build times were not a problem either. If they were, it would be pretty easy to separate normal testing from full artifact generation.</div>
<div>
<br /></div>
<div>
So, what do I mean by "mostly"? It displays abstract classes and traits as untested, and it really bothers me to see all those 0% where there isn't any code to be tested to begin with. Also, it marks literal constants (that is, untyped final vals on objects) as untested. That's fair enough, and I suppose it can't be helped unless you get the compiler to do some instrumentation for you.</div>
<div>
<br /></div>
<div>
As for the rest, everything works, and it integrates nicely with Jenkins.</div>
<h4>
<a href="https://github.com/harrah/browse" target="_blank">Scala X-Ray (Browse)</a></h4>
<div>
I don't know why I had so much trouble setting this up. I expect I assumed it could only be an SBT plugin, so all the instructions I found didn't make any sense at all. Once I realized it is a <i>compiler</i> plugin and just added the two lines indicated by the site above to my build.sbt, it worked. But I did spend a lot of time trying to figure it out, and didn't even get it working on the first time.</div>
<h4>
<a href="http://www.elasticsearch.org/" target="_blank">Elastic Search</a></h4>
<div>
After a brief flertation with some other solutions, we settled on Elastic Search as the data store. It was not exactly what we wanted, but there wasn't much time to explore solutions. ES was something I was mildly familiar with, and given that most of the project was written in Javascript, interacting with ES was really easy. It also benefited from Dispatch on the Scala side.</div>
<div>
<br /></div>
<div>
Unfortunately, the actual load we put on it was very different from what we expected. We expected lots of reads, few updates, but actually got something the other way around, and that caused performance issues (on stress tests only, though -- normal loads were fine).</div>
<div>
<br /></div>
<div>
Also, it's log didn't help much when we started having problems. In the end, it turned out that one of the tools we were using to monitor it, BigDesk, was actually causing it to crash. We stopped using it, and, aside from it not being capable of handling the stress test loads, it worked fine otherwise.</div>
<div>
<br /></div>
<div>
It's HTTP/JSON interface really did help get everything working easily, and it's data model, while targetted at uses rather different than ours, worked fine for us.</div>
<h3>
The Ugly Parts</h3>
<div>
I can't honestly point to anything and say that was a a major problem. The closest thing to that would be the changes in scope with progressively tighter deadlines (it's not like election day could be postponed, after all), but that comes with the job, and it's our responsibility to deal with it the best we can.</div>
<div>
<br /></div>
<div>
Some problems resulting from ill-conceived solutions to the new requisites marred the end result, but the alternative to have something that didn't work well would have been not having anything. The original scope was fully delivered, and worked with a single hitch.</div>
<h3>
In Hindsight</h3>
<div>
So, what I learned from all that?</div>
<div>
<br /></div>
<div>
I'd like to start with some lessons about Scala. I wonder what's the maturity level displayed by my code... There are no type classes, just two implicit declarations, and hardly any of the more complex you may have seen me write here and elsewhere. There's not a fold to be seen anywhere.</div>
<div>
<br /></div>
<div>
And, yet, it's mostly functional. There are some thing I wished to make immutable, but I couldn't figure out how to handle the concurrency requirements in a functional manner. There are a few places were I knew a way to make something immutable, but did not choose to do so, and ended up regretting it. And, yet, it looks -- to me -- pretty normal.</div>
<div>
<br /></div>
<div>
Maybe I'm wearing blinders. There are certainly plenty maps and a few flatMaps, which couldn't really be otherwise given the role Dispatch plays in it. But, to me, the code looks downright tame. That surprised me a lot, given the code size.</div>
<div>
<br /></div>
<div>
Very few case classes, but just perfect when I needed them. Some sealed traits and objects which could have been replaced by a decent enumeration (but do have a small advantage). One implicit conversion to a tuple so that I could use the unzip method, and some other place I needed to make an Ordering type implicit to get around the fact it isn't contra-variant.</div>
<div>
<br /></div>
<div>
Named and default parameters were put to very good use -- I wish Scala would let me get away with using named when interfacing with Java. It might not be able to check the parameter name, reorder parameters, and there's certainly no defaults, but method calls with booleans and primitives feel just naked without the parameter name.</div>
<div>
<br /></div>
<div>
Test code was another matter -- Specs2 brings plenty of magic to the table, but it's a standard magic so to speak, which makes it possible for people who don't understand it to use it by rote.</div>
<div>
<br /></div>
<div>
So, what about actors? Given my experience with Scala and the concurrent nature of my problem, you might have wondered why I didn't use actors to solve it. And, if that didn't occur to you naturally, the title of this post should certainly have done so!</div>
<div>
<br /></div>
<div>
It's... complicated. The original scope didn't look like it would benefit from using actors, and I still think so. As I originally envisioned it, a main thread would start Dispatch downloads, and the mappings done to the Promise returned by them would handle everything else. Nothing would ever be returned to the main thread. A single semaphore would put a hard limit on how many simultaneous downloads would be allowed, and the main thread would sleep until the time for more downloads came.</div>
<div>
<br /></div>
<div>
In that scenario, using actors would just add overhead. However, things changed. The first big change was that the interaction with the other modules would have to be batched. That's something I was never comfortable with, since it lost precious seconds gained by the ETL's architecture. But, most importantly, it meant information would have to go from the async download trigger threads to the main loop. At that point, we have a bi-directional conversation, which is were actors start getting interesting.</div>
<div>
<br /></div>
<div>
Still, the interaction was very simple, and none of the more sophisticated interactions enabled by actors would be of any use. As things progressed, however, that changed, as the contents of some downloads started dictating the need for other downloads, which could trigger other downloads, and failing any one of those downloads should dictate the outcome for the ones that triggered them.</div>
<div>
<br /></div>
<div>
And now you have a complex net of interactions, some of which would benefit from let-it-fail and corresponding recovery mechanisms. It now looks like a perfect fit for actors.</div>
<div>
<br /></div>
<div>
So, the code I wrote does handle almost every error condition, and with some refactoring it could have handled all of them. However, what started out natural and intuitive has become confusing. If I had used a simple actor setup right from the start, bare bones as it would have been, I could have pursued better solutions as the requirements changed.</div>
<div>
<br /></div>
<div>
The other, more troubling, consideration is that, given the batching limitation imposed upstream from my system, maybe we should have gone with the standard downloader and a bare bones extraction&load script.</div>
<h3>
In Conclusion</h3>
<div>
Overall, I don't regret any of the technical choices I made. Surely a second version would benefit from hindsight (and be vulnerable to the second version syndrome), but the choices I made proved equal to the task, and quite adequate to withstand the changing requisites and short deadline.</div>
<div>
<br /></div>
<div>
As a side note, this project could have gone with Ruby just as likely as with Scala, and it was just a coincidence that pushed it to Scala. I wonder how that would have turned out, and it is an experience I'd have loved to have.</div>
<div>
<br /></div>
<div>
And if you have read it all and think it was a waste of time, don't say I didn't warn you! :-)</div>
<div>
<br /></div>
Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com7tag:blogger.com,1999:blog-6373963829340632529.post-70159624644969154142012-09-02T01:12:00.000-03:002012-09-02T01:12:27.493-03:00Bugs, TDD and Functional Programming<div class="tr_bq">
I'm watching a BBC series produced long ago called "The Machine That Changed The World". At some point, <a href="http://en.wikipedia.org/wiki/Mitch_Kapor" target="_blank">Mitch Kapor</a> (founder of Lotus) <a href="http://webcache.googleusercontent.com/search?q=cache:3aKROQQsK3AJ:https://groups.google.com/group/comp.software-eng/tree/browse_frm/month/1993-04/3bd00b42012714ff%3Frnum%3D71%26_done%3D%252Fgroup%252Fcomp.software-eng%252Fbrowse_frm%252Fmonth%252F1993-04%253Ffwc%253D1%2526+&cd=1&hl=en&ct=clnk" target="_blank">says</a>, as the narrator compares building software to building airplanes, buildings and bridges:</div>
<blockquote>
So if you've got something that's not holding its weight well, you look to<br /> see if the the joint is tight, or if the screws are right. You don't have to<br /> go and analyze the whole building. Well, software doesn't work like that. If<br /> you see a problem when you attempt to execute a certain command, there is no<br /> simple and direct way of knowing which part of the code could have the problem.<br /> In some sense it could be almost anywhere. And so the detective problem of<br /> hunting down the source of the problem is enormously harder than in physical<br /> media because the digital media don't obey the same simplifying law of<br /> proximity of cause and effect.</blockquote>
I see both TDD and functional programming as ways to change that.<br />
<br />
For TDD, it's quite obvious, and, in fact, one of the benefits lauded by proponents of the practice. If some bug shows up (in the test), then it must be related to the change you just made. Since changes are supposed to be small if you do it right, then the code path you must follow to find the bug should be quite restricted.<br />
<br />
The same holds for functional programming, with the advantage of applying to bugs that did not show up on the tests. A program in functional programming is supposed to be a composition of functions, and each function should be<a href="http://en.wikipedia.org/wiki/Referential_transparency_(computer_science)" target="_blank"> referentially transparent</a> -- side-effect free, so that it's output relies solely on the input, and pure, so that it always return the same output for a given input. So, once you can reproduce the problem -- one you have an input that causes the bug -- you can go down the functions that handle it and see if they are returning the correct output for the input they are receiving.<br />
<br />
Add both together, and software starts looking a lot more like building an airplane or a bridge. In this respect, anyway.<br />
<br />
When I left work for the weekend on Friday, I had just started feeding input from another module of the systems we are building, as my own module finally got to the point one could start doing integration tests. There is a bug: some data I expected to be produced isn't showing up.<br />
<br />
As it happens, however, the code that looks at that data is functional and was developed using TDD. If it got the input I expected, it would have returned the output I wanted. That means when I arrive at work tomorrow, I'll look at the input being provided, and I know it will not be the same as the input the system was tested with. I'll see what's different, and, knowing that, I'll be able to tell exactly which piece of code should have handled it, and find out why it isn't doing what's supposed to do.<br />
<br />
It really beats the alternative.<br />
<br />Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com17tag:blogger.com,1999:blog-6373963829340632529.post-82701621992659510022012-08-25T23:16:00.001-03:002012-12-26T09:58:07.976-02:00JSON serialization with reflection in Scala! Part 2<h4>
An Apology</h4>
<div>
First off, I'd like to apologize to all for taking so long to follow up on my <a href="http://dcsobral.blogspot.com.br/2012/07/json-serialization-with-reflection-in.html" target="_blank">last post</a>. Believe it or not, I only posted that one once this one was written, but... I waited for a couple of bugs to be fixed, and then the API changed, and there were new bugs, and so on. Even as I write this, with 2.10.0-M7 out and a good likelihood of it being followed by RC1, I still uncovered a Blocker bug on reflection!</div>
<div>
<br /></div>
<div>
Anyway, I wanted to avoid teaching people things that would turn out not to be true, and, so, I delayed. I think it's time, now to go ahead with this post. As soon as I finish this preamble, I'll revise the post and the code, and I hope to catch all problems. If you find anything amiss, please notify me that I can fix it the sooner!<br />
<h4>
A Warning</h4>
</div>
<div>
<i>I'm writing this with 2.10.0 ready but not yet announced, and I've just became aware that, at the moment, <b>reflection is not thread-safe.</b> See <a href="https://issues.scala-lang.org/browse/SI-6240" target="_blank">issue 6240</a> for up-to-date information.</i></div>
<h4>
Recap</h4>
<div>
<span style="background-color: white;">We saw on part 1 that Scala 2.10 will have a reflection library, and baited you into reading all post with the promise of a serializer written with it. That library is divided into layers through the cake pattern, which is used to provide increasingly more detailed levels from very basic reflection all the way to what the compiler needs. Those layers are called Universes.</span></div>
<div>
<br /></div>
<div>
We saw the base and runtime Universe, and we looked at some code which made use of them. We did not, however, go into any detail about what's inside these layers, and how those things relate to each other, and how to manipulate them.</div>
<div>
<br /></div>
<div>
So get ready, because we are getting into that right now!</div>
<h4>
What's in a Universe?</h4>
<div>
Reflection is available at different layers, and layers are Universes, so what's in a Universe? Quite a lot, actually. I invite anyone to look at the inheritance list of a typical universe, such as <span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.runtime.JavaUniverse</span>. Now look at <span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.api.JavaUniverse</span>, which is the type you actually see when you use <span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.runtime.universe</span>, and see how much smaller it is. That will make you feel better about how much stuff is still in there. :-)<br />
<br />
A Universe has pretty much every kind of member one can have in Scala: type, trait, class, object, val, lazy val, def... I haven't seen a var, fortunately. But I can't guarantee there isn't one either! There is so much stuff in them that browsing will look like a daunting activity. But there's some logic to what's in there, which I'd like to go into before we get into actual details.<br />
<br />
The core of Universe are its type members. The <span style="font-family: 'Courier New', Courier, monospace;">type</span> type members are usually defined as boundaries, which enable the different layers to make them different things while still exposing a rich API. There are also classes and traits declaring those APIs or implementing them.<br />
<br />
One pattern you'll see often enough is a type <span style="font-family: 'Courier New', Courier, monospace;">X</span>, with a <span style="font-family: 'Courier New', Courier, monospace;">trait XApi</span>, and possibly a <span style="font-family: 'Courier New', Courier, monospace;">class XBase</span>. The logic is the following: <span style="font-family: 'Courier New', Courier, monospace;">X</span> is used in the declarations of methods, and it is sometimes declared as a <span style="font-family: 'Courier New', Courier, monospace;">type</span> with upper and lower bounds. The traits by the name <span style="font-family: 'Courier New', Courier, monospace;">XApi</span>, often used as upper bounds on <span style="font-family: 'Courier New', Courier, monospace;">X</span>, refer to declarations on <span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.api</span>, the full fledged reflection provided by <span style="font-family: 'Courier New', Courier, monospace;">scala-reflect.jar</span>. <span style="font-family: 'Courier New', Courier, monospace;">XBase</span> refers to things declared at <span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.base</span>, the minimal implementation provided on the standard library.<br />
<br />
You'll often see class <span style="font-family: 'Courier New', Courier, monospace;">XExtractor</span> as well. That contains <span style="font-family: 'Courier New', Courier, monospace;">apply</span> and <span style="font-family: 'Courier New', Courier, monospace;">unapply</span> methods which enables its instances to be used to create instances of <span style="font-family: 'Courier New', Courier, monospace;">X</span>, or pattern match them. For each <span style="font-family: 'Courier New', Courier, monospace;">XExtractor</span>, you'll find a <span style="font-family: 'Courier New', Courier, monospace;">val X</span> with type <span style="font-family: 'Courier New', Courier, monospace;">XExtractor</span>. On <span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.api</span>, classes will usually be <span style="font-family: 'Courier New', Courier, monospace;">abstract class</span>. Elsewhere, you may find concrete implementations.<br />
<br />
And speaking of value members, beside those extractors I just mentioned, you'll often see <span style="font-family: 'Courier New', Courier, monospace;">YTag</span><span style="background-color: white;">, which are constants for </span><span style="background-color: white; font-family: 'Courier New', Courier, monospace;">ClassTag</span>, and are used to preserve identities of abstract types from erasure, so that pattern matching works correctly. Sometimes you'll also see <span style="background-color: white; font-family: 'Courier New', Courier, monospace;">YTpe</span>, <span style="font-family: 'Courier New', Courier, monospace;">Type</span> constants, which expose core types.<br />
<br />
There are few methods. Their main use seems to be creating new stuff, though some of them find stuff through implicit values (most prominently, <span style="font-family: 'Courier New', Courier, monospace;">typeOf</span>). There's also some helper methods, to let you interactively discover what you have.<br />
<h4>
Basic Concepts: What's X?</h4>
<div>
Ok, there's a lot of X in there, and, to be honest, I don't know much about them. However, most of them are subtypes of one of a few main concepts, or repetitions following the type/api/base pattern I mentioned above. I'll quote again from Scala Reflect Pre-SIP, linked to their definitions on <span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.api.JavaUniverse</span>:</div>
<br />
<table><tbody>
<tr><th align="left">Name</th><th align="left">Description</th></tr>
<tr><td><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.Symbols$SymbolApi" target="_blank">Symbol</a></td><td>Types representing definitions</td></tr>
<tr><td><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.Types$TypeApi" target="_blank">Type</a></td><td>Types representing types</td></tr>
<tr><td><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.Trees$TreeApi" target="_blank">Tree</a></td><td>Types representing abstract syntax trees</td></tr>
<tr><td><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.Names$NameApi" target="_blank">Name</a></td><td>Types representing term- and type-names</td></tr>
<tr><td><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.AnnotationInfos$AnnotationInfoApi" target="_blank">AnnotationInfo</a></td><td>Types representing annotations</td></tr>
<tr><td><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.PositionApi" target="_blank">Position</a></td><td>Types representing positions of tree nodes in source files</td></tr>
<tr><td><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.FlagSets$FlagValuesApi" target="_blank">FlagSet</a></td><td>Type representing sets of flags that apply to symbols and definition trees</td></tr>
<tr><td><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.Constants$ConstantApi" target="_blank">Constant</a></td><td>Type representing compile-time constants</td></tr>
</tbody></table>
<br />
Let's start with the first one. Anything you define in Scala has a <span style="font-family: 'Courier New', Courier, monospace;">Symbol</span>. If you give something a name, then it has a <span style="font-family: 'Courier New', Courier, monospace;">Symbol</span> associated with it. If you didn't give it a name, but you could have, then it has a <span style="font-family: 'Courier New', Courier, monospace;">Symbol</span>.<br />
<br />
A <span style="font-family: 'Courier New', Courier, monospace;">Symbol</span> has a <span style="font-family: 'Courier New', Courier, monospace;">Name</span>, a <span style="font-family: 'Courier New', Courier, monospace;">Type</span>, an owner (from which one can extract some other relationships, such as enclosing method, class and package), possibly a companion, and some flags with a bunch of information about it. Of particular relevance inside the compiler, a <span style="font-family: 'Courier New', Courier, monospace;">Symbol</span> is mutable (see method <span style="font-family: 'Courier New', Courier, monospace;">setFlags</span>). However, at run time it is immutable.<br />
<br />
I'll skip briefly ahead and talk about <span style="font-family: 'Courier New', Courier, monospace;">Name</span>. Scala has two namespaces: values and types. So a single <span style="font-family: 'Courier New', Courier, monospace;">String</span> can refer to different things according to the namespace. All names, however, are either <span style="font-family: 'Courier New', Courier, monospace;">TermName</span> (values) or <span style="font-family: 'Courier New', Courier, monospace;">TypeName</span> (types), which makes it possible to use a <span style="font-family: 'Courier New', Courier, monospace;">Name</span> anywhere required without having to specify to which namespace that name belongs.<br />
<br />
That's why, for example, I did not have to say I was looking for a <i>value member</i> called <span style="font-family: 'Courier New', Courier, monospace;">head</span>, instead of a<i> type member</i> of that name, in the example on part 1.<br />
<br />
Names also abstract their representation on a classfile, where they have to be encoded to be considered valid identifiers.<br />
<br />
<span style="font-family: 'Courier New', Courier, monospace;">Type</span> is where most of the reflective action is, in my opinion. A type is not, say, <span style="font-family: 'Courier New', Courier, monospace;">Int</span> -- that's just its <span style="font-family: 'Courier New', Courier, monospace;">Symbol</span> (assuming we are talking about <span style="font-family: 'Courier New', Courier, monospace;">scala.Int</span>, and not just the name). A <span style="font-family: 'Courier New', Courier, monospace;">Type</span> is the information about all members that compose that thing: methods, fields, type parameters, nested classes and traits, etc. If a <span style="font-family: 'Courier New', Courier, monospace;">Symbol</span> represents a definition, a <span style="font-family: 'Courier New', Courier, monospace;">Type</span> represents the whole structure of that definition. It is the union of all definitions that compose a class, the description of what goes into a method and what comes out, etc.<br />
<br />
Some types are simpler than others, of course. On the example at part 1, we looked at the three types that can represent a method. Other types, such as those for traits and classes, will inevitably be much more complex. We saw that too, as we searched inside the type <span style="font-family: 'Courier New', Courier, monospace;">intMethods</span> received, looking for the methods we wanted.<br />
<br />
And, because <span style="font-family: 'Courier New', Courier, monospace;">Type</span> can be so many different things, you'll often have to pattern match against a specific subclass of <span style="font-family: 'Courier New', Courier, monospace;">Type</span>, or its extractor, also as seen on that example. Types are immutable, though note that, inside the compiler, the symbols it contains are mutable.<br />
<br />
A Tree is the compiler's internal representation for source code. It is what is usually called an Abstract Syntax Tree, or AST. A <span style="font-family: 'Courier New', Courier, monospace;">Tree</span>, like <span style="font-family: 'Courier New', Courier, monospace;">Type</span>, is composed of many different subclasses, which together represent everything that has semantic meaning on Scala source code.<br />
<br />
Like names, a <span style="font-family: 'Courier New', Courier, monospace;">Tree</span> can be of a <span style="font-family: 'Courier New', Courier, monospace;">Type</span> (<span style="font-family: 'Courier New', Courier, monospace;">TypTree</span> -- <i>sic</i>), or of non-types (<span style="font-family: 'Courier New', Courier, monospace;">TermTree</span>). Either way, it is further broken down into subclasses.<br />
<br />
Trees have children (other trees), a <span style="font-family: 'Courier New', Courier, monospace;">Position</span> (<span style="font-family: 'Courier New', Courier, monospace;">pos</span>), a <span style="font-family: 'Courier New', Courier, monospace;">Type</span> (<span style="font-family: 'Courier New', Courier, monospace;">tpe</span>) and a <span style="font-family: 'Courier New', Courier, monospace;">Symbol</span> (<span style="font-family: 'Courier New', Courier, monospace;">symbol</span>), though the latter might be <span style="font-family: 'Courier New', Courier, monospace;">null</span>.<br />
<br />
<span style="font-family: 'Courier New', Courier, monospace;">Tree</span> is immutable at run time as well, but one can set symbols, types and positions are compile time, with macro's universe. We'll look more at trees when we discuss macros.<br />
<br />
The remaining concepts are not important to the task at hand, so I'll not go into them. I'd like to draw attention, however, to <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.StandardDefinitions" style="font-family: 'Courier New', Courier, monospace;" target="_blank">StandardDefinitions</a><span style="font-family: inherit;">, which is a superclass of </span><span style="font-family: 'Courier New', Courier, monospace;">JavaUniverse</span><span style="font-family: inherit;">, and, therefore, directly available through it, </span><span style="font-family: 'Courier New', Courier, monospace;"><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.StandardDefinitions$DefinitionsApi" target="_blank">DefinitionsApi</a></span><span style="font-family: inherit;">, which is available through the method definitions on the previous trait. Together, they contain many symbols and types for things that are fundamental to the Scala language.</span><br />
<h4>
The Serializer</h4>
</div>
<div>
Let's start writing that serializer, then. First, I'd like to make clear the following:</div>
<div>
<ul>
<li>This serialization is not comprehensive.</li>
<li>This serialization is not correct (no edge cases treated, like quotes inside strings).</li>
<li>I'm making no claims on its speed.</li>
<li>THIS IS NOT PRODUCTION CODE.</li>
</ul>
Now that I trust people will not shoot themselves in the foot <i>and blame me </i>(they can still shoot themselves without blaming me), let me state that this code is actually pretty general, and not particularly difficult to improve to the point of being useful. But there's nothing to be gained by doing so (in this blog), and plenty good JSON serializers out there.</div>
<div>
<br /></div>
<div>
So, what <i>will</i> this serializer handle? I'm planning on the following:</div>
<div>
<ul>
<li>Strings (without handling required escaping).</li>
<li>AnyVal (without bothering with correcting Char or Unit).</li>
<li>Any class whose constructor parameters are preserved as fields.</li>
</ul>
The last one cover single parameter list case classes which, together with the first two, makes a reasonably comprehensive, if flawed, set of serializable things.<br />
<br />
For the remainder of this post, I assume you have this import:</div>
<div>
<br /></div>
<pre class="brush: scala">import scala.reflect.runtime.universe._</pre>
<div>
<br /></div>
<div>
The signature for my serializer is this:</div>
<div>
<br /></div>
<pre class="brush: scala">def serialize[T : TypeTag](obj: T): String
</pre>
<div>
<br /></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace;"><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/TypeTags$TypeTag.html" target="_blank">TypeTag</a></span> deserves an explanation. A <span style="font-family: 'Courier New', Courier, monospace;">TypeTag</span> is a type class which allows us to retrieve a <span style="font-family: 'Courier New', Courier, monospace;">Type</span>. That is, it provides the <span style="font-family: 'Courier New', Courier, monospace;">Type</span> equivalent of the <span style="font-family: 'Courier New', Courier, monospace;">getClass</span> method, but through a type class pattern. Its implicit can be used by <span style="font-family: 'Courier New', Courier, monospace;">typeOf</span> to retrieve the <span style="font-family: 'Courier New', Courier, monospace;">Type</span>.<br />
<br />
However, a <span style="font-family: 'Courier New', Courier, monospace;">TypeTag</span> is not just a container for <span style="font-family: 'Courier New', Courier, monospace;">Type</span>. It brings an <span style="font-family: 'Courier New', Courier, monospace;">Universe</span> and a <span style="font-family: 'Courier New', Courier, monospace;"><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/api/JavaUniverse$JavaMirror.html" target="_blank">Mirror</a></span> with it, and the latter is associated with class loaders (a <span style="font-family: 'Courier New', Courier, monospace;">TypeTag</span> allows the creation of a <span style="font-family: 'Courier New', Courier, monospace;">Type</span> on any mirror -- we'll look at mirrors later). That means, while getting a <span style="font-family: 'Courier New', Courier, monospace;">Type</span> from a <span style="font-family: 'Courier New', Courier, monospace;">TypeTag</span> is easy, getting a <span style="font-family: 'Courier New', Courier, monospace;">TypeTag</span> needs more than just a <span style="font-family: 'Courier New', Courier, monospace;">Type</span>.<br />
<br />
With all that in mind, let's proceed.</div>
<h4>
Values and Strings</h4>
<div>
We start with values and strings, which are the easiest ones to handle if we don't treat special cases. If <span style="font-family: 'Courier New', Courier, monospace;">obj</span> is either a value (a descendant of <span style="font-family: 'Courier New', Courier, monospace;">AnyVal</span>), just output its string representation. If it's a <span style="font-family: 'Courier New', Courier, monospace;">String</span>, output it between double quotes.</div>
<div>
<br /></div>
<div>
We just have to compare types. Given a <span style="font-family: 'Courier New', Courier, monospace;">TypeTag</span>, I can use <span style="font-family: 'Courier New', Courier, monospace;">typeOf</span> on abstract types to get their <span style="font-family: 'Courier New', Courier, monospace;">Type</span>. We an also call <span style="font-family: 'Courier New', Courier, monospace;">typeOf</span> on <span style="font-family: 'Courier New', Courier, monospace;">AnyVal</span>, but there's a pre-defined constants for it which we'll make use of. So, for now, this is what we have:</div>
<div>
<br /></div>
<pre class="brush: scala">val objType = typeOf[T]
if (objType <:< typeOf[String]) s""""$obj""""
else if (objType <:< definitions.AnyValTpe) obj.toString
else "Unknown"
</pre>
<div>
<br /></div>
<div>
I'm going to refactor this code further on, so I'm not writing the full body of the method for now. Next, let's do collections. To handle them, I need to recurse, and that causes a problem... we can get the <span style="font-family: 'Courier New', Courier, monospace;">Type</span> for the type parameter of a collection, but, as we saw, getting a <span style="font-family: 'Courier New', Courier, monospace;">TypeTag</span> from a <span style="font-family: 'Courier New', Courier, monospace;">Type</span> is hard.<br />
<br />
Since we don't really need a <span style="font-family: 'Courier New', Courier, monospace;">TypeTag</span> anywhere, we'll create a nested method to recurse on, and pass <span style="font-family: 'Courier New', Courier, monospace;">Type</span> explicitly everywhere.<br />
<br />
To gather all collections in one step, we'll compare them against <span style="font-family: 'Courier New', Courier, monospace;">GenTraversable</span>, but that causes another trouble for us. We cannot write <span style="font-family: 'Courier New', Courier, monospace;">typeOf[GenTraversable]</span>, without a type parameter, nor do we have a known type parameter to put there. Even if we were to retrieve the type parameter of <span style="font-family: 'Courier New', Courier, monospace;">objType</span> -- which might not exist in first place -- we cannot insert it on our source code. We'll have to do something else instead, to get at that type parameter.<br />
<br />
There's a method on <span style="font-family: 'Courier New', Courier, monospace;">Type</span> which help us here. The method <span style="font-family: 'Courier New', Courier, monospace;">baseType</span> on <span style="font-family: 'Courier New', Courier, monospace;">Type</span> return a new <span style="font-family: 'Courier New', Courier, monospace;">Type</span> that corresponds to the symbol passed to it if that <span style="font-family: 'Courier New', Courier, monospace;">Type</span> is an ancestor of the first type. Ok, that's confusing as hell, so let's explain it with code:<br />
<br />
<pre class="brush: scala">scala> typeOf[List[Int]].baseType(typeOf[Seq[Int]].typeSymbol)
res26: reflect.runtime.universe.Type = Seq[Int]
scala> typeOf[List[Int]].baseType(typeOf[Seq[_]].typeSymbol)
res27: reflect.runtime.universe.Type = Seq[Int]
scala> typeOf[List[Int]].baseType(typeOf[Seq[String]].typeSymbol)
res28: reflect.runtime.universe.Type = Seq[Int]
scala> typeOf[Set[Int]].baseType(typeOf[Seq[Int]].typeSymbol) // Set is not <:< Seq
res29: reflect.runtime.universe.Type = <notype>
scala> typeOf[Seq[Int]].baseType(typeOf[List[Int]].typeSymbol) // Seq is not <:< List
res30: reflect.runtime.universe.Type = <notype>
scala> typeOf[Seq[Int]].baseType(typeOf[List[Int]].typeSymbol) == NoType
res31: Boolean = true
scala> typeOf[Seq[String]].typeSymbol // Symbols do not have type parameters
res32: reflect.runtime.universe.Symbol = trait Seq
</pre>
<br />
I hope that's clearer now. You might have expected using <span style="font-family: 'Courier New', Courier, monospace;">baseType</span> with <span style="font-family: 'Courier New', Courier, monospace;">Set</span> and <span style="font-family: 'Courier New', Courier, monospace;">Seq</span> to return <span style="font-family: 'Courier New', Courier, monospace;">Iterable</span>, but that's the "least upper bound", available through the <span style="font-family: 'Courier New', Courier, monospace;">lub</span> helper method on universes.<br />
<br />
All we need now is to try to find a <span style="font-family: 'Courier New', Courier, monospace;">baseType</span>, and, if we didn't get a <span style="font-family: 'Courier New', Courier, monospace;">NoType</span> back, we have a collection and we can recurse on it, producing a JSON array. Here's all of it, with the new inner recursion method:<br />
<br />
<pre class="brush: scala">import scala.collection.GenTraversable
def serialize[T : TypeTag](obj: T): String = {
def recurse(obj: Any, objType: Type): String = {
if (objType <:< typeOf[String]) s""""$obj""""
else if (objType <:< definitions.AnyValTpe) obj.toString
else if (objType.baseType(typeOf[GenTraversable[_]].typeSymbol) != NoType) {
val refinedType = objType.baseType(typeOf[GenTraversable[_]].typeSymbol)
val TypeRef(_, _, argumentType :: Nil) = refinedType
val objAsColl = obj.asInstanceOf[GenTraversable[_]]
val recursedSeq = objAsColl map (recurse(_, argumentType))
recursedSeq.mkString("[ ", ", ", " ]")
} else "Unknown"
}
recurse(obj, typeOf[T])
}
</pre>
<br />
That's pretty neat as far as it goes, -- and please notice the pattern matching used to extract the type parameter from GenTraversable -- but there are some issues with it. For instance, if the static type of the collection is something like <span style="font-family: 'Courier New', Courier, monospace;">List[Any]</span>, the elements we recurse into will be serialized as Unknown. If this seems like a small penalty, consider that any ADT will be represented by its top-most class. So, for instance, if I have a <span style="font-family: 'Courier New', Courier, monospace;">List[Tree] </span>where <span style="font-family: 'Courier New', Courier, monospace;">Tree</span> can be either the case class <span style="font-family: 'Courier New', Courier, monospace;">Node</span> or the case class <span style="font-family: 'Courier New', Courier, monospace;">Leaf</span>, we'll not be able to break it into nodes and leafs: we'll only see <span style="font-family: 'Courier New', Courier, monospace;">Tree</span>.<br />
<br />
One possibility to correct that is to go from the object's <span style="font-family: 'Courier New', Courier, monospace;">Class</span> to its <span style="font-family: 'Courier New', Courier, monospace;">Type</span><span style="font-family: inherit;">, taking advantage of the fact that we can always know the former at run time. This might lead to the opposite problem, where we serialize classes that are not meant to be visible. That would happen with </span><span style="font-family: 'Courier New', Courier, monospace;">List</span><span style="font-family: inherit;">'s subclass </span><span style="font-family: 'Courier New', Courier, monospace;">::</span><span style="font-family: inherit;">, for example, if the collection check did not catch it. But, since we are learning here, let's do it.</span><br />
<span style="background-color: white;"><br /></span>
<span style="background-color: white;">Before we do so, however, we need to talk about mirrors...</span><br />
<h4>
<a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.JavaUniverse$JavaMirror" target="_blank">Mirror</a> Images</h4>
What better way to get a reflection than use a mirror?<br />
<br />
Sorry, I had to get that pun out. You probably know how Java reflection works: we have handles representing classes, methods, fields, etc. You then call methods on these handles to get other handles and, finally, you pass an instance to a method on one of these handles to act upon that instance. For example, from <span style="font-family: 'Courier New', Courier, monospace;">Class</span> you get <span style="font-family: 'Courier New', Courier, monospace;">Method</span>, and on <span style="font-family: 'Courier New', Courier, monospace;">Method</span> you call invoke passing an instance to execute it.<br />
<br />
Now, there are some problems with that, but I'm not the person to argue them. I refer you to Reflecting Scala by Yohann Coppel (<a href="http://lampwww.epfl.ch/teaching/projects/archive/coppel_report.pdf" target="_blank">PDF</a>), and to Coppel's own main reference, Gilad Bracha's Mirrors: Design Principles for Meta-level Facilities of. Object-Oriented Programming Languages (<a href="http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CEwQFjAA&url=http%3A%2F%2Fbracha.org%2Fmirrors.pdf&ei=4PTxT63-MaOj6wHCwNnOBg&usg=AFQjCNFDl0VFKZLOAGMjUXLgUUEqrMWTMg&sig2=GRxPJGIXlJaQt25eG1vyGg" target="_blank">PDF</a>).<br />
<br />
I'll concentrate on what they are and how you use them. From Reflection Pre-SIP, a mirror defines a hierarchy of symbols starting with the root package _root_ and provides methods to locate and define classes and singleton objects in that hierarchy.<br />
<br />
Each universe may have one or more mirrors, and each mirror is tied to a Class Loader. This latter point is very important, because class loaders may load different classes with the same name. Of note, as well, class loaders may delegate to each other, so finding a class may involve doing the same search through a chain of class loaders.<br />
<br />
Mirrors mostly abstract all this away, but, as a consequence, they are tied to class loaders, which means you must reflect through a mirror that knows the class of that instance.<br />
<br />
There's a default mirror that will <i>usually</i> work, <span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.runtime.currentMirror</span>. That will use the class loader of the class that invokes it to create the mirror, which will usually be ok, but I'll avoid using it. Instead, I'll use a method provided by the universe to create a new mirror based on the class loader for the class of the instance. It goes like this:<br />
<br />
<pre class="brush: scala">runtimeMirror(obj.getClass.getClassLoader)
</pre>
<br />
That's pretty straight-forward, and, I, at least, got used to it pretty quickly. I had a bit of resistance at first, but knowing it will always get the right classloader makes me feel more confident.<br />
<br />
Now that we saw two distinct ways of getting a <span style="font-family: 'Courier New', Courier, monospace;">Mirror</span>, let's talk about how we use it. Mirrors are supposed to reflect <i>the language</i>, and not its compile time representation. Each instance, class, field, method and module (packages or singleton objects) is reflected through a corresponding mirror.<br />
<br />
Through these mirrors we can act on the things they mirror. Note, however, that you do not pass the instance to these mirrors: you generate a mirror for the instance (an <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.Mirrors$InstanceMirror" target="_blank"><span style="font-family: 'Courier New', Courier, monospace;">InstanceMirror</span></a>), and, from there, you can get mirror to its methods, fields and class. You can also get a <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.Mirrors$ClassMirror" target="_blank"><span style="font-family: 'Courier New', Courier, monospace;">ClassMirror</span></a> without an instance, but you won't be able to do much more than get a mirror for the constructor or its companion object.<br />
<br />
Let's put that in another way. A <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.Mirrors$MethodMirror" target="_blank"><span style="font-family: 'Courier New', Courier, monospace;">MethodMirror</span></a> is associated not only with a method, but with a method on a specific instance. You don't pass an instance to it to invoke the method, you just invoke it.<br />
<h4>
Of Mirrors and Types</h4>
</div>
<div>
So, to put it all together, to invoke a method on an instance through reflection you need to:</div>
<div>
<br /></div>
<pre class="brush: scala">val obj = "String"
val objClass = obj.getClass
val classClassLoader = objClass.getClassLoader
val classLoaderMirror = runtimeMirror(classClassLoader)
val classSymbol = classLoaderMirror.classSymbol(objClass)
val classType = classSymbol.typeSignature
val methodName = newTermName("length")
val methodSymbol = classType.member(methodName).asMethod
val instanceMirror = classLoaderMirror.reflect(obj)
val methodMirror = instanceMirror.reflectMethod(methodSymbol)
methodMirror.apply() // == (obj.length: Any)
</pre>
<div>
<br /></div>
<div>
Got all that? That, of course, assumes we know that there is a method named <span style="font-family: 'Courier New', Courier, monospace;">length</span> on the thing we are reflecting about. If that we not the case, we would get a nasty surprise when we tried to retrieve the method's symbol.<br />
<br />
More importantly right now, we also saw how we go from an instance to that instance's type, though we loose any type parameters along the way (or, more precisely, they are abstract).<br />
<h4>
Untagged Serialization</h4>
</div>
<div>
Let's then rewrite what we had before, but without <span style="font-family: 'Courier New', Courier, monospace;">TypeTag</span>.</div>
<div>
<br /></div>
<pre class="brush: scala">import scala.collection.GenTraversable
def serialize(obj: Any): String = {
val mirror = runtimeMirror(obj.getClass.getClassLoader)
val objType = mirror.classSymbol(obj.getClass).typeSignature
val StringTpe = mirror.classSymbol(classOf[String]).typeSignature
if (objType <:< StringTpe) s""""$obj""""
else if (objType <:< definitions.AnyValTpe) obj.toString // DOES NOT WORK!!!
else if (objType.baseType(typeOf[GenTraversable[_]].typeSymbol) != NoType) {
val refinedType = objType.baseType(typeOf[GenTraversable[_]].typeSymbol)
val TypeRef(_, _, argumentType :: Nil) = refinedType
val objAsColl = obj.asInstanceOf[GenTraversable[_]]
val recursedSeq = objAsColl map serialize
recursedSeq.mkString("[ ", ", ", " ]")
} else "Unknown"
}
</pre>
<div>
<br /></div>
<div>
Anyway, there's one big problem there, right? Because we are coming from Java reflection, comparing to <span style="font-family: 'Courier New', Courier, monospace;">AnyVal</span> doesn't work at all. We can compare against the value types themselves, such as <span style="font-family: 'Courier New', Courier, monospace;">Int</span>, but it won't work here because any Int we find with this method will be boxed, and <span style="font-family: 'Courier New', Courier, monospace;">Integer</span> is not a subtype of <span style="font-family: 'Courier New', Courier, monospace;">Int</span>.</div>
<div>
<br /></div>
<div>
So while we gained something, we lost the ability of handling <span style="font-family: 'Courier New', Courier, monospace;">AnyVal</span> easily. What to do?</div>
<h4>
End of Part 2</h4>
<div>
What a perfect cliffhanger!<br />
<br />
Hopefully, someone will tell me what to do before part 3 comes out, and it will look like I know what I am doing! :-)<br />
<br />
More seriously, we have covered a lot of ground already. Whatever path we take from here will tread on familiar ground, just enriching what we have seen so far a bit more, making the code more elaborate. Until we get to macros, anyway.<br />
<br />
This is a good place to stop and consider what we have seen so far.<br />
<br />
Scala 2.10 will come with a reflection library. That library is built with the cake pattern on layers, so that code common to standard library, reflection library and compiler gets shared. It also keeps the reflection library in sync with the compiler, and makes the reflection more faithful to the language.<br />
<br />
The reflection layers are called Universes, and there are a few of them to choose from. A Universe contains APIs and implementations for a number of interrelated concepts. We saw the most important ones, and used <span style="font-family: 'Courier New', Courier, monospace;">Symbol</span>, <span style="font-family: 'Courier New', Courier, monospace;">Type</span> and <span style="font-family: 'Courier New', Courier, monospace;">Name</span> quite a bit. We also saw a bit of <span style="font-family: 'Courier New', Courier, monospace;">TypeTag</span>, which replaces <span style="font-family: 'Courier New', Courier, monospace;">Manifest</span> on Scala 2.10 (yeah, I know -- I didn't tell you that before).<br />
<br />
We were then surprised with the news that there was something else to learn about: mirrors! It turns out we can't actually <i>do</i> anything without a mirror to an instance, a class, a method or whatever. Mirrors are tied to class loaders, which introduces some complications and a little bit of ceremony, but they aren't really that hard to use.<br />
<br />
Empowered with our newly gained knowledge, we proceeded to tack the serialization problem in two different ways, with mixed results.<br />
<br />
At that point we were left in a major cliffhanger: What path to take? Is there a way around the problems? And what about macros?<br />
<br />
Rest assured, because this is to be continued...<br />
<h4>
Epilogue</h4>
</div>
<div>
While I waited and waited and waited, Alois Cochard wrote <a href="https://github.com/aloiscochard/sherpa" target="_blank">Sherpa</a>, a macro-based serialization toolkit. I'm guessing one can learn a lot by looking at its source code!</div>
Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com1Brazil-14.235004 -51.92528-44.215689 -92.3549675 15.745681000000001 -11.4955925tag:blogger.com,1999:blog-6373963829340632529.post-63417903808517693622012-07-03T19:14:00.001-03:002013-08-26T22:11:15.577-03:00JSON serialization with reflection in Scala! Part 1<h4>
So you want to do reflection?</h4>
Scala 2.10 will come with a <i>Scala</i> reflection library. I don't know how it fares when it comes to reflecting Java -- I'll look into it when I have time. For now, my hands are overflowing with all there is to learn about <span style="font-family: 'Courier New', Courier, monospace;">scala.reflect</span>.<br />
<br />
So, first warning: this is not an easy library to learn. You may gather some recipes to do common stuff, but to understand what's going on is going to take some effort. Hopefully, by the time you read this Scaladoc will have been improved enough to make the task easier. Right now, however, you'll often come upon types which are neither hyperlinked, nor have path information on mouse-over, so it is important to learn where you are likely to find them.<br />
<br />
It will also help to explore this thing from the REPL. Many times as I worked my way towards enlightenment (not quite there yet!), I found myself calling methods on the REPL just to see what type it returned. At other times, however, I stumbled upon exceptions where I expected none! As it happens, class loaders play a role when reflecting Scala -- and REPL has two different mirrors.<br />
<br />
There's also a difference between classes loaded from class files versus classes that have been compiled in memory. Scala stores information about the type from Scala's point of view in the class file. In its absence (REPL-created classes), not all of it can be recomposed.<br />
<br />
Bottom line: having the the code type check correctly is not a guarantee of success, when it comes to reflection. Even if the code compiles, if you did not make sure you have the proper mirror, it will throw an exception at run time.<br />
<br />
In this blog post, I'll try to teach reflection basics, while building a small JSON serializer using Scala reflection. I'll also go, in future posts, through a deserializer and a serializer that takes advantage of macros to avoid slow reflection calls. I'm not sure I'll succeed, but, if I fail, maybe someone will learn from my errors and succeed where I failed.<br />
<br />
As a final note: don't trust me! I'm learning this stuff as I go, and, besides, it isn't set into stone yet. Some of what you'll see below is already scheduled to be changed, other things might change as result of feedback from the community. But every piece of code will be runnable on the version I'm using -- which is something between Scala 2.10.0-M4 and whatever comes next. I'm told, even as I write this, that Scala 2.10.0-M5 is around the corner, and it will provide functionality I'm missing, and fix bugs I've stumbled upon.<br />
<br />
<i><b>Warning:</b> I'm writing this with 2.10.0 ready but not yet announced, and I've just became aware that, at the moment, <b>reflection is not thread-safe.</b> See <a href="https://issues.scala-lang.org/browse/SI-6240" target="_blank">issue 6240</a> for up-to-date information.</i><br />
<h4>
The Reflection Libraries</h4>
When Martin Odersky decided to write Scala's reflection library, he stumbled upon a problem: to do reflection right -- that is, to accomplish everything reflection is supposed to accomplish -- he'd need a significant part of the compiler code duplicated in the standard library.<br />
<br />
The natural solution, of course, would be moving that code <i>to</i> the standard library. That was not an enticing prospect. It happened before, with some code related to the XML support, and the result speaks for itself: the standard library gets polluted with code that makes little sense outside the compiler, and the compiler loses flexibility to change without breaking the standard library API. As time passes, code would start to be duplicated so it could be changed and diverge, and the code in the standard library would become useless and rot.<br />
<br />
So, what to do? Odersky found the solution in a famous Scala pattern: the cake pattern. Scala reflection is a giant multi-layered cake. We call these layers universes, and in them we find types, classes, traits, objects, definitions and extractors for various things. A simple look at what the base universe extends gives us a notion of what's in it: <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/Symbols.html" target="_blank">Symbols</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/Types.html" target="_blank">Types</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/FlagSets.html" target="_blank">FlagSets</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/Scopes.html" target="_blank">Scopes</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/Names.html" target="_blank">Names</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/Trees.html" target="_blank">Trees</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/Constants.html" target="_blank">Constants</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/AnnotationInfos.html" target="_blank">AnnotationInfos</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/Positions.html" target="_blank">Positions</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/TypeTags.html" target="_blank">TypeTags</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/TagInterop.html" target="_blank">TagInterop</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/StandardDefinitions.html" target="_blank">StandardDefinitions</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/StandardNames.html" target="_blank">StandardNames</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/BuildUtils.html" target="_blank">BuildUtils</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/Mirrors.html" target="_blank">Mirrors</a>.<br />
<br />
The base layers provides a simple API, with type members defined by upper bounds where minimal functionality is provided. That base layer is available in the standard library, shared by all users of reflection API: <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.base.Universe" target="_blank">scala.reflect.base.Universe</a>. <span style="background-color: white;">This provides replacements for the now deprecated</span><span style="background-color: white;"> </span><span style="font-family: 'Courier New', Courier, monospace;">Manifest</span><span style="background-color: white;"> </span><span style="background-color: white;">and</span><span style="background-color: white;"> </span><span style="font-family: 'Courier New', Courier, monospace;">ClassManifest</span><span style="background-color: white;"> </span><span style="background-color: white;">used elsewhere in the standard library, and also serves as </span>a minimally functional implementation of reflection, available even when scala-reflect.jar is not referenced.<br />
<br />
The standard library also has a bare bones <i>implementation</i> of that layer, through the class <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.base.Base" target="_blank">scala.reflect.base.Base</a>, an instance of which is provided as <span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.basis</span>, in the <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.package" target="_blank">scala.reflect</a> package object.<br />
<br />
Take a moment and go over that again, because it's a common pattern on the reflection library. The instance <span style="font-family: 'Courier New', Courier, monospace;">basis</span> is a <span style="font-family: 'Courier New', Courier, monospace;">lazy val</span> of type <span style="font-family: 'Courier New', Courier, monospace;">Universe</span>, whose value is an instance of class <span style="font-family: 'Courier New', Courier, monospace;">Base</span>.<br />
<br />
Upon this layer we can find other layers that enrich that API, providing additional functionality. Two such layers can be found on a separate jar file, scala-reflect.jar. There's a base API on package <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.package" target="_blank"><span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.api</span></a>, which is also called <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.Universe" target="_blank"><span style="font-family: 'Courier New', Courier, monospace;">Universe</span></a>. This API is inherited by the two implementations present in this jar: the java-based runtime <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.api.JavaUniverse" target="_blank"><span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.api.JavaUniverse</span></a>, and <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html#scala.reflect.makro.Universe" target="_blank"><span style="font-family: 'Courier New', Courier, monospace;">scala.makro.Universe</span></a>, used by macros. By the way, the name of this last package will change, at which point this link will break. If I do not fix it, please remind me in the comments.<br />
<br />
Now, notice that these are APIs, interfaces. The actual implementation for <span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.api.JavaUniverse</span> is <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/runtime/JavaUniverse.html" target="_blank"><span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.runtime.JavaUniverse</span></a>, available through the <span style="font-family: 'Courier New', Courier, monospace;">lazy val scala.reflect.runtime.universe</span><span style="font-family: inherit;"> (whose type is </span><span style="font-family: 'Courier New', Courier, monospace;">scala.reflect.api.JavaUniverse</span><span style="font-family: inherit;">).</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">At any rate, the </span><span style="font-family: 'Courier New', Courier, monospace;">JavaUniverse</span><span style="font-family: inherit;"> exposes functionality that goes through Java reflection to accomplish some things. Not that it is limited by Java reflection, since the compiler can inject some information at compile time, and keeps more precise type information stored on class files. You'll see how far it can go in the example.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">There are more layers available in the compiler jar, but that's beyond our scope, and more likely to change with time. Look at the image below, from the Reflection pre-SIP for an idea of how these layers are structured.</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://3.bp.blogspot.com/-NwYpvcFpDos/T_CA8T_G4WI/AAAAAAAAAt0/1EVJ4wfHGXc/s1600/reflection-hierarchy.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="http://3.bp.blogspot.com/-NwYpvcFpDos/T_CA8T_G4WI/AAAAAAAAAt0/1EVJ4wfHGXc/s640/reflection-hierarchy.jpg" width="640" /></a></div>
<span style="font-family: inherit;">And that's what I meant by reflection libraries in the heading: there are multiple versions of this library, exposing different amounts of API. For example, at one point the </span><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/reflect/base/Positions.html" target="_blank">Positions</a><span style="font-family: inherit;"> API was not available on the standard library, something which was latter changed through feedback from the community. On the compiler, <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/compiler/index.html#scala.tools.nsc.Global" target="_blank">scala.tools.nsc.Global</a> extends </span><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/compiler/scala/tools/nsc/symtab/SymbolTable.html">SymbolTable</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/compiler/scala/tools/nsc/CompilationUnits.html">CompilationUnits</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/compiler/scala/tools/nsc/plugins/Plugins.html">Plugins</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/compiler/scala/tools/nsc/PhaseAssembly.html">PhaseAssembly</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/compiler/scala/tools/nsc/ast/Trees.html">Trees</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/compiler/scala/tools/nsc/ast/Printers.html">Printers</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/compiler/scala/tools/nsc/ast/DocComments.html">DocComments</a> with <a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/compiler/scala/tools/nsc/ast/Positions.html">Positions</a>, which exposes much more functionality (and complexity) than the layers above.<br />
<h4>
What's it for, anyway?</h4>
<div>
So, what do we do with a reflection library? If you think of Java reflection, you'll end up with a rather stunted view of what reflection should do, given how limited it is. In truth, Java reflection doesn't even reflect the Java language, which is a much more limited language than Scala.</div>
<div>
<br /></div>
<div>
The Scala Reflection pre-SIP suggests some possibilities, though not all are contemplated by what's available on Scala 2.10.0:</div>
<div>
<ul>
<li>Runtime reflection where one explores runtime types and their members, and gains reflective access to data and method invocations. </li>
<li>Reification where types and abstract syntax trees are made available at runtime. </li>
<li>Macros which use reflection at compile time to access their context and produce trees and types.</li>
</ul>
</div>
<div>
<span style="font-family: inherit; white-space: pre-wrap;">It goes on to describe some questions that must be answered by the reflection library:</span></div>
<ul>
<li>What are the members of a certain kind in a type?</li>
<li>What are the types of these members as seen from an instance type?</li>
<li>What are the base classes of a type and at which instance types are they inherited?</li>
<li>Is some type is a subtype of another?</li>
<li>Among overloaded alternatives of a method, which one best matches a list of argument types?</li>
</ul>
Generalizing, when we do reflection, we <i>explore the structure of the program</i>, and <i>interact with that program through this knowledge</i>. In short, <i>we treat programs as data</i>.<br />
<br />
<span style="background-color: white;">That doesn't mean we can open a class and rewrite the code in its methods -- though macros might allow that at some point after 2.10.0. For one thing, bytecode doesn't preserve enough information to reconstruct the source code or even the abstract syntax tree (the compiler representation of the code).</span><br />
<div>
<br /></div>
<div>
A class file contains a certain amount of information about the type of a class, and an instance of a class just knows what its own class is. So, if all you have is an instance, you can't gain knowledge through reflection about the type parameters it might have. You can, however, use that information at compile time through macros, or preserve it as values for use at run time.</div>
<div>
<br /></div>
<div>
Likewise, though you cannot recover the AST (abstract syntax tree) of a method, but you can get the AST for code at compile time, modify it with macros, or even parse, compile and execute code at run time if you use what's available on scala-compiler.jar.<br />
<br />
<br />
Speaking of macros, I'd like to point out the suggestion that "reflection" might be used at compile time. Reflection is usually defined as a run time activity. However, the API used by macros (or the compiler) at compile time will be mostly the same as the one used by programs at run time. I predict Scala programmers will unassumingly talk about "reflection at compile time".<br />
<div>
<h4>
<span style="background-color: white;">"This is all getting a little confusing..."</span></h4>
</div>
</div>
<div>
Ok, so let's see some code to show you what I'm talking about. I'm not going to call methods or create instances, since that requires more information than we have seen so far, and, besides, you could do that stuff with Java reflection already.</div>
<div>
<br /></div>
<div>
Let's start with a simple example. We'll get the type of a <span style="font-family: 'Courier New', Courier, monospace;">List[Int]</span>, search for the method <span style="font-family: 'Courier New', Courier, monospace;">head</span>, and then look at its type:</div>
<div>
<br /></div>
<div>
<pre class="brush: scala">scala> import scala.reflect.runtime.universe._
import scala.reflect.runtime.universe._
scala> typeOf[List[Int]]
res0: reflect.runtime.universe.Type = List[Int]
scala> res0.member(newTermName("head"))
res1: reflect.runtime.universe.Symbol = method head
scala> res1.typeSignature
res2: reflect.runtime.universe.Type = => A
</pre>
<br />
The initial import made the runtime universe, the one that makes use of java reflection, available to us. We then got the the type we wanted with <span style="font-family: 'Courier New', Courier, monospace;">typeOf</span>, which is similar to <span style="font-family: 'Courier New', Courier, monospace;">classOf</span>. After that, we got a handle to the method named <span style="font-family: 'Courier New', Courier, monospace;">head</span>, and, from that handle (a <span style="font-family: 'Courier New', Courier, monospace;">Symbol</span>), we got a <span style="font-family: 'Courier New', Courier, monospace;">Type</span> again.<br />
<br />
Note that the type being returned by the method is "A". It has not been erased, but, on the other hand, is not what we expected from List[Int]. What we got was the true type of head, not the type it has on a List[Int]. Alas, we can do the latter quite easily:<br />
<br />
<pre class="brush: scala">scala> res1.typeSignatureIn(res0)
res3: reflect.runtime.universe.Type = => Int
</pre>
<br />
Which is what we expected. But, now, let's make things a bit more general. Say we are given a instance of something whose static type we know (that is, it isn't typed Any or something like that), and we want to return all non-private methods it has whose return type is Int. Let's see how it goes:<br />
<br />
<pre class="brush: scala">scala> def intMethods[T : TypeTag](v: T) = {
| val IntType = typeOf[Int]
| val vType = typeOf[T]
| val methods = vType.members.collect {
| case m: MethodSymbol if !m.isPrivate => m -> m.typeSignatureIn(vType)
| }
| methods collect {
| case (m, mt @ NullaryMethodType(IntType)) => m -> mt
| case (m, mt @ MethodType(_, IntType)) => m -> mt
| case (m, mt @ PolyType(_, MethodType(_, IntType))) => m -> mt
| }
| }
intMethods: [T](v: T)(implicit evidence$1: reflect.runtime.universe.TypeTag[T])Iterable[(reflect.runtime.universe.MethodSymbol, reflect.runtime.universe.Type)]
scala> intMethods(List(1)) foreach println
(method lastIndexWhere,(p: Int => Boolean, end: Int)Int)
(method indexWhere,(p: Int => Boolean, from: Int)Int)
(method segmentLength,(p: Int => Boolean, from: Int)Int)
(method lengthCompare,(len: Int)Int)
(method last,=> Int)
(method count,(p: Int => Boolean)Int)
(method apply,(n: Int)Int)
(method length,=> Int)
(method hashCode,()Int)
(method lastIndexOfSlice,[B >: Int](that: scala.collection.GenSeq[B], end: Int)Int)
(method lastIndexOfSlice,[B >: Int](that: scala.collection.GenSeq[B])Int)
(method indexOfSlice,[B >: Int](that: scala.collection.GenSeq[B], from: Int)Int)
(method indexOfSlice,[B >: Int](that: scala.collection.GenSeq[B])Int)
(method size,=> Int)
(method lastIndexWhere,(p: Int => Boolean)Int)
(method lastIndexOf,[B >: Int](elem: B, end: Int)Int)
(method lastIndexOf,[B >: Int](elem: B)Int)
(method indexOf,[B >: Int](elem: B, from: Int)Int)
(method indexOf,[B >: Int](elem: B)Int)
(method indexWhere,(p: Int => Boolean)Int)
(method prefixLength,(p: Int => Boolean)Int)
(method head,=> Int)
(method max,[B >: Int](implicit cmp: Ordering[B])Int)
(method min,[B >: Int](implicit cmp: Ordering[B])Int)
(method ##,()Int)
(method productArity,=> Int)
scala> intMethods(("a", 1)) foreach println
(method hashCode,()Int)
(value _2,=> Int)
(method productArity,=> Int)
(method ##,()Int)
</pre>
<br />
Please note the pattern matching above. <span style="font-family: 'Courier New', Courier, monospace;">Tree</span>, <span style="font-family: 'Courier New', Courier, monospace;">Type</span> and <span style="font-family: 'Courier New', Courier, monospace;">AnnotationInfo</span> are ADTs, and often you'll need pattern matching to deconstruct or to obtain a more precise type. For the record, the pattern matches above refer to methods without parameter lists (like getters), normal methods, and methods with type parameters, respectively.<br />
<br />
<b>EDIT</b>: A recent discussion indicates that pattern matching with types may not be reliable. Instead, the code above should have read like this:<br />
<br />
<pre class="brush: scala">def intMethods[T : TypeTag](v: T) = {
val IntType = typeOf[Int]
val vType = typeOf[T]
val methods = vType.members.collect {
case m: MethodSymbol if !m.isPrivate => m -> m.typeSignatureIn(vType)
}
methods collect {
case (m, mt @ NullaryMethodType(tpe)) if tpe =:= IntType => m -> mt
case (m, mt @ MethodType(_, tpe)) if tpe =:= IntType => m -> mt
case (m, mt @ PolyType(_, MethodType(_, tpe))) if tpe =:= IntType => m -> mt
}
}
</pre>
<br />
I decided to put up this edit as quickly as possible, so I'm just posting the revised version of the method, instead of revising the post with usage, etc.<br />
<br />
To finish up the examples, let's see some manipulation of code:<br />
<br />
<pre class="brush: scala">scala> import scala.tools.reflect.ToolBox
import scala.tools.reflect.ToolBox
scala> import scala.reflect.runtime.{currentMirror => m}
import scala.reflect.runtime.{currentMirror=>m}
scala> val tb = m.mkToolBox()
tb: scala.tools.reflect.ToolBox[reflect.runtime.universe.type] = scala.tools.reflect.ToolBoxFactory$ToolBoxImpl@5bbdee69
scala> val tree = tb.parseExpr("1 to 3 map (_+1)")
tree: tb.u.Tree = 1.to(3).map(((x$1) => x$1.$plus(1)))
scala> val eval = tb.runExpr(tree)
eval: Any = Vector(2, 3, 4)
</pre>
<br />
That's code by <a href="http://stackoverflow.com/users/465304/antoras" target="_blank">Antoras</a>, by the way, from <a href="http://stackoverflow.com/a/11056581/53013" target="_blank">this answer</a> on Stack Overflow. While we do not go into any manipulation of the parsed code, we can actually decompose it into <span style="font-family: 'Courier New', Courier, monospace;">Tree</span> sub-components, and manipulate it.<br />
<h4>
End of Part 1</h4>
</div>
<div>
There's still a lot to see! We have touched on some concepts like <span style="font-family: 'Courier New', Courier, monospace;">Type</span> and <span style="font-family: 'Courier New', Courier, monospace;">Tree</span>, but without going into any detail, and there was a lot going on under the covers. That will have to wait a bit, though, to keep these posts at manageable levels.</div>
<div>
<br /></div>
<div>
To recapitulate, Scala 2.10 will come with a Scala reflection library. That library is used by the compiler itself, but divided into layers through the cake pattern, so different users see different levels of detail, keeping jar sizes adequate to each one's use, and hopefully hiding unwanted detail.</div>
<div>
<br /></div>
<div>
The reflection library also integrates with the upcoming macro facilities, enabling enterprising coders to manipulate code at compile time.</div>
<div>
<br /></div>
<div>
We saw the base and runtime layers, mentioned a bit of other layers, then went into some reasons for wanting a reflection library, and some tasks such a library ought to be able to do.</div>
<div>
<br /></div>
<div>
Finally, we looked at bits of a REPL session, where we used the reflection library to find out information we could never have found through plain Java reflection. We also saw a bit of code that made parsing and running Scala code at run time pretty easy.</div>
<div>
<br /></div>
<div>
On part 2, we'll look at the main basic concepts of Scala Reflection's Universe, and how they relate to each other. We'll then proceed to the promised JSON serializer. We'll follow on this series with the deserializer, and end up by taking advantage of macros to avoid reflection at run time.</div>
<div>
<br /></div>
<div>
See you then!</div>
<div>
<br /></div>
Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com6Brazil-14.235004 -51.92528-40.023103500000005 -92.3549675 11.553095500000001 -11.4955925tag:blogger.com,1999:blog-6373963829340632529.post-3704274417428267552012-01-23T18:18:00.000-02:002012-01-23T18:38:41.971-02:00String Interpolation on Scala 2.10One of the <a href="http://docs.scala-lang.org/sips/" target="_blank">Scala Improvement Proposals</a> for Scala 2.10 is <a href="http://docs.scala-lang.org/sips/pending/string-interpolation.html" target="_blank">String Interpolation</a>. It has recently been added to trunk, behind the <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">-Xexperimental</span> flag, and I have started playing with it. At first, I stumbled upon bugs and limitations of its current implementation relative to the proposal, but I finally got something working.<br />
<br />
To be clear: the interpolation works fine, in both of the provided forms (with and without formatting). As usual with Scala, it's the possibility of replicating the functionality for my own purposes that gets me excited. I usually explain the whys and the hows of my code, but in this case a simple paste says it all, imho.<br />
<br />
<pre class="brush: scala">dcs@ayanami:~/github/scala (master)$ scala -Xexperimental
Welcome to Scala version 2.10.0.rdev-4232-2012-01-23-g9a20086 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_26).
Type in expressions to have them evaluated.
Type :help for more information.
scala> case class StringContext(parts: String*) {
| object matching {
| def unapplySeq(s: String): Option[Seq[String]] = {
| val re = (parts map ("\\Q" + _ + "\\E") mkString ("^", "(.*?)", "$")).r
| re findFirstMatchIn s map (_.subgroups)
| }
| }
| def s(args: Any*) = scala.StringContext(parts: _*).s(args: _*)
| }
defined class StringContext
scala> "23/01/2012" match { case matching"$day/$month/$year" => s"$month-$day-$year" }
res0: String = 01-23-2012
</pre><br />
<br />Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com2tag:blogger.com,1999:blog-6373963829340632529.post-24954077942119423422012-01-09T21:50:00.001-02:002012-01-10T02:57:29.986-02:00Adding methods to Scala Collections<span style="font-family: inherit;">I don't like getting much involved in these Scala flame wars, but <a href="http://yz.mit.edu/wp/true-scala-complexity/" target="_blank">this recent article</a> left me a bit irked. At some point, the following statement is made:</span><br />
<div>
<span style="font-family: inherit;"><br /></span></div>
<div>
<i><span style="font-family: inherit;">"<span style="background-color: white; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: #373737; line-height: 24px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">it is impossible to insert a new method that behaves like a normal collection method</span><span style="background-color: white; color: #373737; line-height: 24px;">. "</span></span></i></div>
<div>
<i><span style="background-color: white; color: #373737; font-family: inherit; line-height: 24px;"><br /></span></i></div>
<div>
<span style="background-color: white; color: #373737; font-family: inherit; line-height: 24px;">So, for the record, that is just not true. Here's an implementation for the filterMap method:</span></div>
<div>
<span style="background-color: white; color: #373737; font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 15px; line-height: 24px;"><br /></span></div>
<pre class="brush: scala">import scala.collection._
import generic.CanBuildFrom
import mutable.Builder
class FilterMap[A, C <: Seq[A]](xs: C with SeqLike[A, C]) {
def filterMap[B, That](f: A => Option[B])
(implicit cbf: CanBuildFrom[C, B, That]): That = {
val builder: Builder[B, That] = cbf()
xs foreach { x => val y = f(x); if (y.nonEmpty) builder += y.get }
builder.result()
}
}
implicit def toFilterMap[A, C <: Seq[A]](xs: C with SeqLike[A, C]): FilterMap[A, C] = new FilterMap[A, C](xs)
</pre>
<span style="color: #373737; font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif;"><span style="font-size: 15px; line-height: 24px;"><br /></span></span><br />
<span style="background-color: white; color: #373737; font-family: inherit; line-height: 24px;">Is this what the author wanted? No. He wanted too add a method that behaved like a normal collection method <b style="font-style: italic;">to something that isn't a collection.</b> This is so crazy that, not surprisingly, people are misunderstanding him.</span><br />
<span style="color: #373737; font-family: inherit;"><span style="line-height: 24px;"><br /></span></span><br />
<span style="color: #373737; font-family: inherit;"><span style="line-height: 24px;">Now, one may come up and say that Scala does do that (add methods to stuff that are not collection), with String and Array. Yes, it does, and it does so <i>by creating a completely new class with all these methods for each of them</i>. Rest assured that if I create a completely new class for each one, I can add filterMap to String and Array too.</span></span><br />
<span style="color: #373737; font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif;"><span style="font-size: 15px; line-height: 24px;"><br /></span></span>Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com6tag:blogger.com,1999:blog-6373963829340632529.post-23119782348633277482011-12-27T16:27:00.005-02:002012-01-23T18:21:14.498-02:00Using Scala API DocumentationOne link that was missing from my post about <a href="http://dcsobral.blogspot.com/2011/12/scala-on-web.html" target="_blank">Scala on the Web</a> was the link to the Scala API documentation, often called "scaladoc" for short, after the tool that generates it. I tried to put it in, but it kind of broke the narrative and, to be honest, I have a lot to talk about the subject. So I decided to do a whole blog just about it.<br />
<br />
First of all, there are <i>two main links</i> that you can use to access it:<br />
<br />
<ul>
<li><a href="http://www.scala-lang.org/api/current/index.html" target="_blank">The link to the API of the latest Scala release</a>, and</li>
<li><a href="http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/index.html" target="_blank">The link to the nightly Scala build of the next Scala version</a></li>
</ul>
Most of the time, I prefer the second link: it contains improvements to the Scaladoc tool, and the documentation often contains improvements. On the other hand, being a nightly release, it is subject to regresssions in the Scaladoc tool, and the documentation contains information that is not correct for the latest release. For that reason, I keep both links in my bookmarks.<br />
<br />
So, what else is there to say? Well, if you are new to Scala, a lot. Those familiar with the Java equivalent probably find it both familiar and strange: the screen is divided into left and right sections, the right section having description of classes and packages, while the left contains a list of them. Whereas Java splits package and class lists, however, Scala does not. And, of course, the look is completely different. These, however, are skin-deep differences, and I want to get to the bones of the matter.<br />
<br />
<br />
<span class="Apple-style-span" style="font-size: large;"><b>One Doc, Two Frames</b></span><br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-vHpdsnaUc2A/Tvnyfw_LuXI/AAAAAAAAAfo/5iknrJnJfQ4/s1600/Selection_001.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="267" src="http://2.bp.blogspot.com/-vHpdsnaUc2A/Tvnyfw_LuXI/AAAAAAAAAfo/5iknrJnJfQ4/s640/Selection_001.png" width="640" /></a></div>
<br />
<br />
The first thing I want you to notice are the two search boxes: one on the left, at the very top of the page, and one on the right, a bit lower. <i>Scaladoc is search-oriented!</i> That means you usually don't browse it, looking for stuff, but just type what you want. You can still browse, of course, which is useful when you don't know what you are looking for.<br />
<br />
On the left side you have the package hierarchy and classes, traits and objects belonging to the packages. Note that classes, traits and objects can be members of <i>other</i> classes traits and objects, in which case they won't appear on the left.<br />
<br />
The right side contains information about a selected package, trait, class or object. The URL will change to match what is being displayed on the right side of the screen, so that you can easily bookmark or share links to specific class or object. At the present, there's no way to further refer to a member of a class, such as a specific method. This improvement will be added at some point in the future.<br />
<br />
Typically, you'll search or browse for a class on the left side of the screen, and then browse its contents on the left, or further search for a particular member of that class.<br />
<br />
<span class="Apple-style-span" style="font-size: large;"><b>The Left Frame</b></span><br />
<br />
Let's drill down the left side, then, to get a better understanding of what's there and how to use it. At the top of the left frame, you'll see this:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://4.bp.blogspot.com/-NXmC3xLejA8/Tvn40QqcAnI/AAAAAAAAAf0/oeQJuuDCFmg/s1600/Selection_009.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="72" src="http://4.bp.blogspot.com/-NXmC3xLejA8/Tvn40QqcAnI/AAAAAAAAAf0/oeQJuuDCFmg/s320/Selection_009.png" width="320" /></a></div>
<br />
Topmost is the search box, with a little "x" icon on the right hand to clear the search. Searching for something will <i>hide</i> any traits, objects and classes that don't match, as well as any packages that don't have any matching members.<br />
<br />
Right below that is <i>an index of all existing methods</i>. Click on a letter and you'll get all methods starting with that letter, and the places they are defined at. Click on the hash symbol (#), and you'll get a page with all methods starting with a symbol instead of a letter. For example:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://4.bp.blogspot.com/-dV4bg0irZ4U/Tvn6dV0A2xI/AAAAAAAAAgA/GAfLGRRoyi8/s1600/Selection_011.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="http://4.bp.blogspot.com/-dV4bg0irZ4U/Tvn6dV0A2xI/AAAAAAAAAgA/GAfLGRRoyi8/s320/Selection_011.png" width="315" /></a></div>
<br />
Finally, you have a small caption saying "<b>display packages only</b>". Surprisingly, that's a clickable option. In fact, ScalaDoc is full of clickable parts, so if you are getting familiar with it, my advice is to try to click stuff, just to see what happens. Back to that caption, though, it switches the package hierarchy from displaying all entities to displaying packages only. Clicking it will give you something like this:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://4.bp.blogspot.com/-O6bxuIVdy_o/Tvn7W_BS1II/AAAAAAAAAgM/Vh_OW70mQbQ/s1600/Selection_007.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="294" src="http://4.bp.blogspot.com/-O6bxuIVdy_o/Tvn7W_BS1II/AAAAAAAAAgM/Vh_OW70mQbQ/s320/Selection_007.png" width="320" /></a></div>
<br />
If you click on "<b>show</b>" on one of these packages, it will open up that particular package. If you click on "<b>display all entities</b>", it will revert to the initial mode of display.<br />
<br />
Speaking of "<b>show</b>", let's now look at the various parts of the entities list:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://3.bp.blogspot.com/-yCIC7NwlI0E/Tvn7-S-IjLI/AAAAAAAAAgY/feESeZ-Jdn0/s1600/Selection_010.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="76" src="http://3.bp.blogspot.com/-yCIC7NwlI0E/Tvn7-S-IjLI/AAAAAAAAAgY/feESeZ-Jdn0/s320/Selection_010.png" width="320" /></a></div>
<br />
The darker background indicates package or package objects, and the entities on the lighter background right below it are the classes, objects and traits belonging to that package, as indicated by the icons on their left. Note that, though package objects have other members such as methods, they do not appear on the left frame of Scaladoc.<br />
<br />
The icons for <b>"o</b>", "<b>c</b>" and "<b>t</b>", in dark blue, green and light blue respectively, indicate objects, classes and traits. A name can be shared between an object and a trait or class, as seen for <b>Regex</b> above. Traits and classes cannot share the same name, however, so each name will have at most two icons beside it.<br />
<br />
<i>If you are not familiar with Scala, an object is a singleton, containing what, in Java, would be represented as static.</i><br />
<br />
One thing to realize here is that <i>everything in that image is clickable</i>, except whitespace. You can click on "<b>scala.util.matching</b>" and "<b>scala.util.parsing.combinator</b>", on "<b>hide</b>" and "<b>focus</b>", on <b>"Regex</b>" and "<b>RegexParsers</b>", and on the icons.<br />
<br />
Clicking on a package name will show its traits, classes and objects on the right frame, unless the package is a <i>package object</i>, in which case the right frame will show the other members it might have. This is important because many package objects contain implicit definitions used as helpers with that package. Check, for instance, <a href="http://www.scala-lang.org/api/current/index.html#scala.sys.process.package" target="_blank">scala.sys.process</a>.<br />
<br />
Clicking on "<b>hide</b>" will hide the entities belonging to that particular package, but not any subpackage that might exist. The text will then change to "<b>show</b>", which will revert this action upon being clicked.<br />
<br />
Clicking on "<b>focus</b>" will hide all other packages from view, like this:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-2RKUaQriDAA/TvoAtmx8s-I/AAAAAAAAAgk/fiYFaVS-9LU/s1600/Selection_012.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://2.bp.blogspot.com/-2RKUaQriDAA/TvoAtmx8s-I/AAAAAAAAAgk/fiYFaVS-9LU/s1600/Selection_012.png" /></a></div>
<br />
Clicking on the "<b>x</b>" icon that appears will revert this action.<br />
<br />
Clicking on an icon for object, class or trait will show information for it on the right frame, as will clicking on the text itself. If an object shares a name with a trait or class, however, clicking on the text will show <i>not the object</i>, but the class or trait, following the assumption that this is what people want most of the time.<br />
<br />
On recent ScalaDoc versions, clicking on an entity will also move the focus to the search box on the right frame, so that you can instantly type<br />
<br />
<span class="Apple-style-span" style="font-size: large;"><b>The Right Frame</b></span><br />
<br />
To begin the discussion of the right, I picked <a href="http://www.scala-lang.org/api/current/index.html#scala.collection.GenSeq" style="font-weight: bold;" target="_blank">GenSeq</a>, which is rich enough in ScalaDoc UI components, but (relatively) poor enough in actual content to fit in here.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-l5flMnAJfrc/TvoE8n0czQI/AAAAAAAAAgw/7nbIVK7s99g/s1600/Selection_013.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="406" src="http://2.bp.blogspot.com/-l5flMnAJfrc/TvoE8n0czQI/AAAAAAAAAgw/7nbIVK7s99g/s640/Selection_013.png" width="640" /></a></div>
<br />
You may have wondered why the search box is not at the top on the right frame, like it is on the left. The reason for it is our starting point in explaining the right frame.<br />
<br />
<i>The right frame is divided in three parts</i>, the topmost two being shown above. The first part contains general information about the selected entity, and it comprises everything from the top until right before the search box. The second part starts with the search box, and is comprised of everything in that gray background. This part contains <b style="font-style: italic;">display options</b> that affect the third part of the right frame. The third part contains all members of the selected entity, with individual information for each one.<br />
<br />
From the top, then, we have a green or blue background (the former for traits and classes, the latter for objects, packages and package objects) on which the name of the entity is prominently displayed. A big icon beside it indicates what kind of entity it is: <b>t</b> for traits, <b>c</b> for classes, <b>o</b> for objects and <b>p</b> for packages and package objects alike.<br />
<br />
If the icon is slightly folded on the bottom, like the one in the example, clicking on it (or the entity name) will switch the right frame to the <i>companion</i> of that entity. If you are not familiar with Scala, objects, traits and classes that share a name are said to be companions to each other. Clicking on the <b>GenSeq</b> trait above, then, will display the <b>GenSeq</b> object.<br />
<br />
In a smaller font above the entity name is the full path to that entity. Clicking on a path component will display that component.<br />
<br />
After that, in a gray background, comes a description of all classes, traits and type parameters used in the definition of that entity. It does not list classes or traits that are inherited -- that is available further below. Not to sound too repetitive, but clicking on any of the classes and traits will display it... Also, moving the mouse over one of these names will show the full path for that name.<br />
<br />
What follows is the full description and attributes associated with that entity. I deliberately choose one extremely poor in those, so that I could concentrate on what ScalaDoc provides. One of those things is the link to the source code in which that entity is defined. Not all Scala projects provide that attribute, but Scala API itself does. The link currently points to the web interface to the old Subversion repository -- I presume this will be switched to the new repository on Github in the near future. At any rate, all niceties of source version control systems are available on that link. For the Scala API, anyway.<br />
<br />
"<b>Linear Supertypes</b>" and "<b>Known Subclasses</b>" at the bottom of this part can be expanded upon click, to display the exact linearization of an entity's supertypes -- the inheritance precedence -- and all known subclasses. For example, for <b><a href="http://www.scala-lang.org/api/current/index.html#scala.collection.immutable.List" target="_blank">List</a></b> it will show this:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://4.bp.blogspot.com/-Q4bMc_hWXCU/TvoPqvb8ozI/AAAAAAAAAhg/uFUPf-x-FTo/s1600/Selection_017.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="108" src="http://4.bp.blogspot.com/-Q4bMc_hWXCU/TvoPqvb8ozI/AAAAAAAAAhg/uFUPf-x-FTo/s640/Selection_017.png" width="640" /></a></div>
<br />
Also shown above is the tool tip indicating the full path of one of the supertypes being pointed at, just like mentioned above for entity declaration.<br />
<br />
Let's now look at the second part of the right frame:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-_kW-ylAc5NQ/TvoQ6VL7DjI/AAAAAAAAAhs/kgX8S3INJiw/s1600/Selection_013b.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="168" src="http://2.bp.blogspot.com/-_kW-ylAc5NQ/TvoQ6VL7DjI/AAAAAAAAAhs/kgX8S3INJiw/s640/Selection_013b.png" width="640" /></a></div>
<br />
The search box works pretty much like the one on the left frame, hiding anything that isn't a match, but it includes descriptions on the search as well. Search for a verb, therefore, will often yield good results, unless the verb is too general.<br />
<br />
One particularly nice feature, available on recent versions, is that typing multiple words will search for matches of any one of the words, which makes it easy to display two methods close together on the screen.<br />
<br />
All other options below the search box also change the way things are shown in the third part. The default <b>Ordering</b> mode, <b>Alphabetic</b>, will display <i>all non-private members of an entity</i>, separated in categories which we'll shown below, in alphabetical order. This is different than Java, which only shows full information for members defined or overridden on that class/interface.<br />
<br />
Clicking <b>"By inheritance</b>" will change the display mode to separate the members according to where it was last defined/overridden. Full information will still be displayed, and the members will still be shown in alphabetical order in their own section.<br />
<br />
The <b>Inherited</b> options let you easily filter out inherited methods. Clicking on "<b>Hide All</b>" will toggle off all supertypes, leaving only the entity itself selected. This will hide methods that are not abstract, defined or overridden on the entity. For <b>GenSeq</b>, for example only two methods will be displayed: <b>seq</b> and <b>companion</b>.<br />
<br />
Note that clicking on "<b>Show all</b>" <i>will not</i> select <b>AnyVal</b>, <b>AnyRef</b> or <b>Any</b>. Because the methods defined on these are available on pretty much everything, one rarely needs to see these methods.<br />
<br />
Because many times you might be interested in a particular aspect of an entity, you can also toggle each supertype individually. You can do so to display the methods on <b>Any</b>, for example, or you could look into what <b>Function1</b> has to offer to <b>List</b>.<br />
<br />
The last option, <b>Visibility</b>, will toggle between displaying only public members and everything except private members.<br />
<br />
The last part of the right frame, as mentioned, contains all members of that entity. These are divided into <i>type members</i> and <i>value members</i>. A type member is a <i>trait</i>, a <i>class</i> or a <i>type</i>, and value members are everything else. For example, the <a href="http://www.scala-lang.org/api/current/index.html#scala.util.matching.Regex$" target="_blank">object <b>Regex</b></a> contains this:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-jukXxXs8nu8/TvoXLt4l07I/AAAAAAAAAh4/vNn1zsjLo2Q/s1600/Selection_018.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="270" src="http://2.bp.blogspot.com/-jukXxXs8nu8/TvoXLt4l07I/AAAAAAAAAh4/vNn1zsjLo2Q/s640/Selection_018.png" width="640" /></a></div>
<br />
Note that anything that shows up as a <i>type member</i> of anything but a package will not be displayed on the left frame, even though you can display it on the right frame by clicking on a link to it.<br />
<br />
Value members are actually divided in three separate sections: abstract value members, concrete value members and deprecated value members. These are shown in that precise order, so that one can easily see all members that must be implemented to make a concrete class, and don't get the screen polluted by methods they shouldn't be using -- deprecated methods -- when browsing.<br />
<br />
Let's look at some important points of value member definitions. The snapshot below was taken from a <b>List</b> with members filtered by "map":<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://3.bp.blogspot.com/-ecJPeOdnb4o/TvoY7MXwBOI/AAAAAAAAAiE/FWMexIdpZ48/s1600/Selection_019.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="http://3.bp.blogspot.com/-ecJPeOdnb4o/TvoY7MXwBOI/AAAAAAAAAiE/FWMexIdpZ48/s640/Selection_019.png" width="640" /></a></div>
<br />
By default, only a member's definition and the first sentence of its description are shown, like method <b>groupBy</b> above. Clicking on either the small arrow on the left or on the definition itself will show the full information for that member, as seen in the two definitions for <b>map</b>.<br />
<br />
Since Scala supports method overload, methods can have multiple definitions. In this particular case, however, the first definition is not a real one -- <i>this is indicated by the <b>[use case]</b> tag</i>. It is important to understand what use cases are, for two reasons: first, they represent the most common way to use the method, and, second, they are lies. Well-meaning lies, but lies.<br />
<br />
Compare the definition of the two <b>map</b> methods shown. Clearly, the second definition has a lot going on, whereas the first definition is pretty clear: on a <b>List[A]</b> (see definition for <b>List</b>), the <b>map</b> method takes a function that converts an <b>A</b> into a <b>B</b>, and returns a <b>List[B]</b>.<br />
<br />
Though that definition works very fine for <b>List</b>, the <b>map</b> method is not defined on <b>List</b>, but much higher on the hierarchy. And what works for <b>List</b> won't work for a <b>BitSet</b>, for example: since a <b>BitSet</b> is a <b>Set[Int]</b>, if you map those<b> Int</b> into <b>String</b>, you won't be able to return a <b>BitSet</b>! After all, a <b>BitSet</b> cannot be a <b>Set[String]</b>. The same thing happens in other cases: a <b>WrappedString</b> is a <b>Seq[Char]</b>, a <b>Map[A, B]</b> is a <b>Iterable[Tuple2[A, B]]</b> (aka <b>Iterable[(A, B)]</b>), etc. In any of these cases, a <b>map</b> definition like in the use case won't work.<br />
<br />
So the actual definition of <b>map</b> is the second one, which can handle all of these cases. If you need precise information about <b>map</b> -- for example, if you are extended Scala collections -- you can look it up. On the other hand, if you just want to know how a method is used, the use case should be much easier to understand.<br />
<br />
Most of the remaining information is pretty obvious: a description follows, and then various attributes describing how parameters are supposed to be used, what the return type is, etc.<br />
<br />
The final interesting thing here is the <b>Definition Classes</b> shown below each method. This only appears when a method has been inherited from elsewhere, and it indicates both where it was originally declared (which might have been as <i>abstract</i>), and all places where it was <i>overridden</i>, <i>ie</i>, the implementation has been changed.<br />
<br />
And this concludes the tutorial on using the Scala API documentation, as well as the docs for any other library written in Scala. Looking back, it is much longer (and took much more time to write) than I first thought. Yet, rest assured that using ScalaDoc becomes second nature very fast!<br />
<br />Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com10tag:blogger.com,1999:blog-6373963829340632529.post-90747042365586200492011-12-23T16:26:00.001-02:002011-12-23T16:26:12.720-02:00Scala on the WebI was wondering how to approach Scala on a dojo about it, and I kept returning to the idea that I should first introduce Scala on the Web. So, how would I go about it? What sites would I mention? Scala has grown such a rich ecosystem around it since I first started using it that I'm actually feeling a bit lost nowadays. And, let me tell you, that's a nice change! :-)<br />
<br />
Nevertheless, the problem remains. As I started compiling the sites in my head, I quickly realized I'd have to write it down somewhere, and where better than in my own, dusty blog? So, here it is: the DCS guide to Scala on the Web!<br />
<br />
For a long time, the starting point and central hub to Scala resources has been the <a href="http://www.scala-lang.org/" target="_blank">Scala-Lang site</a>. From there you can download Scala, find documentation, guides, support information, news... and it has become dated. Mind you, it still is a great resource, but some alternatives -- some <i>official</i> alternatives! -- have emerged since then.<br />
<br />
One of them is <a href="http://www.typesafe.com/" target="_blank">Typesafe</a>. Typesafe is a startup created around supporting Scala and related technologies. While this might not sound like a starter resource, they provide the <a href="http://www.typesafe.com/stack" target="_blank">Typesafe Stack</a>, a set of tools including not only the language, but also Eclipse support, a build tool, a distributed system framework and a web framework. I think they plan to add more elements to it, but, anyway, go to their <a href="http://www.typesafe.com/stack/download" target="_blank">DOWNLOAD</a> link and you'll have Scala up-and-running in no time.<br />
<br />
Another important resource is the <a href="http://docs.scala-lang.org/" target="_blank">Documentation Site</a>. A newly designed site to replace the old one on scala-lang. Scala-lang even has a link to it, though it still keeps the old one as well. Here you'll find guides, tutorials, cheatsheets, FAQ, and even a glossary of Scala terms and a Scala Style Guide. Oh, and a link to the <a href="http://wiki.scala-lang.org/" target="_blank">Scala Wiki</a> as well.<br />
<br />
While on the wiki, please pay close attention to the <a href="https://wiki.scala-lang.org/display/SW/Tools+and+Libraries" target="_blank">Tools and Libraries</a> page. You might not find everything there, and you might find things that have not been kept up-to-date, but even so it is an invaluable resource.<br />
<br />
And speaking of FAQs, some brave souls took the time to compile <a href="http://stackoverflow.com/tags/scala/info" target="_blank">Stack Overflow Scala Tutorial</a>! The thing is nothing short of amazing: dozens, maybe more than a hundred of the most important questions categorized in 31 different topics, from "Introduction to Scala" to "Functional Scala", with a side dish of five additional topics for further learning.<br />
<br />
By the way, if you don't know <a href="http://stackoverflow.com/" target="_blank">Stack Overflow</a>, it is a question&answer site about programming questions -- only it actually works and is enjoyable to use, as opposed to their competition. More importantly, the Scala community is well represented on Stack Overflow, and you can use it to get questions about Scala answered in no time. In fact, it has probably been asked and answered already. :-)<br />
<br />
The only problem with Stack Overflow is that it doesn't search for symbols, and Scala has a fair share of them. If you have a question about a symbol, use the <a href="http://symbolhound.com/" target="_blank">Symbol Hound</a> site to search for it.<br />
<br />
What else? Well, there's <a href="http://www.planetscala.com/" target="_blank">Planet Scala</a>, a blog feed aggregator, and <a href="http://implicit.ly/" target="_blank">Implicit.ly</a>, a feed of Scala projects that is automatically fed by a plugin on the build system. It doesn't contain all Scala projects, of course, but it goes a long way.<br />
<br />
And if you want to try Scala without ever installing it, there's <a href="http://www.simplyscala.com/" target="_blank">Simply Scala</a>, a Scala tutorial that let you execute the sample code, as well as try out code of your own.<br />
<br />
If you want to get more dee involved with Scala itself, look at its <a href="https://github.com/scala" target="_blank">github account</a>. <a href="https://github.com/scala/scala" target="_blank">Scala,</a> compiler and library, is there, as well as <a href="https://github.com/scala/scala.github.com" target="_blank">the documentation site</a> I mentioned earlier.<br />
<br />
Ok, I'm sure I left a bunch of sites out (like some news aggregators -- I don't follow them, but if you do, send me the link and I'll put them up), but let's address now some of the basic tools most people will be searching for.<br />
<br />
I spoke earlier about a build tool, so let's discuss it a bit. The build tool of preference for Scala is <a href="https://github.com/harrah/xsbt/wiki" target="_blank">SBT</a>. If you search for it on the Internet, you might come up on the link to the old version (up to 0.7.7), which is hosted on Google Code. The newer versions (0.10.0+) are hosted on Github, and are incompatible with the older ones. Make sure you get the new version.<br />
<br />
Alternatively, you may opt to install the <a href="https://github.com/paulp/sbt-extras" target="_blank">SBT Extras</a> script instead. It's the same SBT, but with a starter script that provides a richer set of options. There's also a couple of <a href="https://github.com/harrah/xsbt/wiki/Scripts" target="_blank">alternate starter scripts</a> on the SBT site, called "screpl" and "scalas", which uses SBT to load dependencies while starting Scala's REPL (that's an "interpreter" console, so to speak) or starting Scala shell scripts.<br />
<br />
You don't actually need to download Scala at all if you install SBT: SBT will download whatever version of Scala you tell it your project uses, requiring only that you have a JVM installed.<br />
<br />
What about testing, what should you use? Scala is in the unfortunate position of having two excellent testing frameworks: <a href="http://www.scalatest.org/" target="_blank">ScalaTest</a> and <a href="http://specs2.org/" target="_blank">Specs2</a>. They are both mature, fully-featured, actively developed and supported, and with big communities. There's no way to recommend one over the other: it really comes down to personal preference.<br />
<br />
Both of them also support <a href="http://code.google.com/p/scalacheck/" target="_blank">ScalaCheck</a>, a testing framework that test stuff by automatically generating input and verifying if the specified conditions hold true. ScalaCheck is great to find boundary conditions, and, as mentioned, you can use its checker under both ScalaTest and Specs2.<br />
<br />
You can use existing Java mocking frameworks (Specs2 has special support for Mockito, though Specs, the older version, supported JMock and EasyMock as well), but there's a Scala mocking framework available as well: <a href="http://scalamock.org/" target="_blank">ScalaMock</a>. ScalaMock has advanced features, such as mocking Scala objects (equivalent to Java static members) and even constructors! That means you can actually mock the behavior of a "new" statement without replacing it with factories or dependency injection.<br />
<br />
To wrap it up, web frameworks and database access.<br />
<br />
Scala has many <a href="https://wiki.scala-lang.org/display/SW/Tools+and+Libraries#ToolsandLibraries-Frameworks" target="_blank">web frameworks</a> available, and there's even a healthy reuse of components between these frameworks. For instance, <a href="https://github.com/scalate/scalate" target="_blank">Scalate</a>, a templating engine that can even serve as web server on a stretch, seems to be used by pretty much everyone nowadays. But let's look into the main alternatives.<br />
<br />
<a href="http://liftweb.net/" target="_blank">Lift</a> is the oldest one (at least among the more serious frameworks), and has a strong community. David Pollack, It's author looked into a number of successful frameworks and decided to pick the best of the best, add a few twists of his own, and use Scala's power to provide an incredible piece of software. Among Lift's strengths, there's a strong separation of concerns (web page design and code are strictly separated), a seamless AJAX/Comet integration, powerful wizards for common concerns, and a design that took security considerations into account right from the beginning. If you do go with Lift, however, please make use of their mailing list as support -- that is their main support channel, and they prefer to concentrate their help there.<br />
<br />
To those who are familiar with Sinatra, there's <a href="https://github.com/scalatra/scalatra#readme" target="_blank">Scalatra</a> and <a href="http://bowlerframework.org/" target="_blank">Bowler</a>, that latter being built on top of the former. And I might well be mistaken, but <a href="http://circumflex.ru/index.html" target="_blank">Circumflex</a> seems to go the same way as well.<br />
<br />
If providing a web service is just a small part of your application, you might want to opt for <a href="http://unfiltered.databinder.net/" target="_blank">Unfiltered</a> instead. With Unfiltered, the web server is a library in your application, instead of your application being a servlet inside a framework.<br />
<br />
And if what you really need are web services to interconnect systems, try <a href="https://github.com/jdegoes/blueeyes" target="_blank">Blue Eyes</a>.<br />
<br />
Now, there are a lot of other web frameworks, I'll mention just one other. Typesafe has decided to integrate the <a href="http://scala.playframework.org/" target="_blank">Play! Framework</a> into its stack. Play! offers the ease of web development that PHP has, but with all the advantages Scala has to offer.<br />
<br />
Which leaves us, at last, with databases. Again, there are <a href="https://wiki.scala-lang.org/display/SW/Tools+and+Libraries#ToolsandLibraries-DataStorage" target="_blank">many option</a>s to choose from. If you are doing web development, the framework of your choice probably already has some recommended libraries to deal with it -- their own or others (did I mention that there's a healthy reuse of components? :). I suggest you go with that.<br />
<br />
If not, I can make some suggestions. I don't have personal experience with this, so I'm mostly recommending based on what I perceive to be preferred choices with active communities. There's <a href="http://scalaquery.org/" target="_blank">ScalaQuery</a>, which has always been a favorite of mine (just waiting for a project where I can put it to use :). <a href="https://github.com/nkallen/querulous" target="_blank">Querulous</a> and <a href="http://squeryl.org/" target="_blank">Squeryl</a> also get a lot of traffic, but once you get to NoSQL, the main choice seems to be MongoDB's <a href="https://github.com/mongodb/casbah" target="_blank">Casbah</a> driver. I have played a little bit with it, and it is certainly quite easy to get up and running for small projects or experiments.<br />
<br />
Do look into the existing choices, however, as there might be something better suited to your needs.<br />
<br />
And if you haven't payed attention the first time I linked to it, the <a href="https://wiki.scala-lang.org/display/SW/Tools+and+Libraries" target="_blank">Wiki of Tools and Libraries</a> is a very good resource to get started on specialized, well, tools and libraries. :-)<br />
<br />
If you are new to Scala, I hope this can get you going. If you are experienced, I hope you can use this when helping others. Of course, this post will get old, the links will get outdated, and new cool stuff will come up.<br />
<br />
For now, enjoy!<br />
<br />Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com7tag:blogger.com,1999:blog-6373963829340632529.post-19082731816080524542011-10-12T15:13:00.000-03:002011-10-14T00:37:28.158-03:00String Interpolation on 2.10?Happiness is made of small things...
<br />
<br/>
<pre class="brush:scala">Welcome to Scala version 2.10.0.r25815-b20111010230908 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_23).
Type in expressions to have them evaluated.
Type :help for more information.
scala> val x = 1
x: Int = 1
scala> "\{x}"
res0: String = 1
</pre>
<br />
Requires the -Xexperimental flag. This test case reveals even more interesting things:
<br />
<br />
<pre class="brush:scala">scala> val x = 1.1
x: Double = 1.1
scala> println("We have a \{ x ;2.2f}% chance of success")
We have a 1,10% chance of success
</pre>
<br />
Mind you, there's a lot of things that have been hidden behind -Xexperimental, some of them for a long time now, and some of them ended up canned.
<br />
<b>Edit:</b> The proposal for this extension can be found <a href="http://scala.github.com/sips//pending/string-interpolation.html">here</a>.Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com8tag:blogger.com,1999:blog-6373963829340632529.post-42013874730437570242011-08-30T10:26:00.001-03:002011-08-30T10:42:37.196-03:00A quick detour through Grub and choosing default operating systemI'm going to do a quick detour to talk about <a href="http://www.gnu.org/software/grub/">grub</a>.<br />
<div><br />
</div><div>Grub is the boot manager of choice for Linux systems, and the one installed by default by <a href="http://www.ubuntu.com/">Ubuntu</a>, among others. My desktop at home dual boots between Ubuntu and Windows (I have gaming needs, after all), and Windows is the default operating system, so my wife doesn't have to do anything.</div><div><br />
</div><div>Setting that default was not exactly trivial. The web abounds with instructions on how to choose the default, most of which refer to the previous version of grub. About the newer version, not so much information.</div><div><br />
</div><div>So, in the end I used Ubuntu's <a href="https://help.ubuntu.com/community/StartUpManager">Startup Manager</a> to set this up. By the way, can anyone explain to me why you use Startup Manager to choose which OS to boot, and <a href="http://www.marzocca.net/linux/bum.html">Bootup Manager</a> to choose which processes will start? Sorry, I digress...</div><div><br />
</div><div>That worked nicely until the day I upgraded the system, at which point a new entry was created, changing the relative position of the Windows boot. That was disagreeable, so I decided to take a closer look.</div><div><br />
</div><div>The configuration used by grub during boot is located at <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">/boot/grub</span>, in particular <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">/boot/grub/grub.cfg</span>. But you shouldn't edit this file directly -- it is generated from the scripts located at <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">/etc/grub</span>, plus configuration on <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">/etc/default/grub</span>. Usually, it is this latter file you should edit.</div><div><br />
</div><div>To change the default operating system, you edit <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">/etc/default/grub</span>, change the setting <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">GRUB_DEFAULT</span>, and then run <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">update-grub</span>. Alas, the Startup Manager will do all this for you, but there is one thing the Startup Manager doesn't do...</div><div><br />
</div><div>So, here's the trick. The <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">GRUB_DEFAULT</span> is usually set to a number, indicating the relative position of the entry you want, but it can also be set to a name! To see what the entry names are, you can do a "<span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">grep menuentry /boot/grub/grub.cfg</span>" -- the names are the strings between single or double-quotes right after "<span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">menuentry</span>". For example:</div><div><br />
</div><div><div><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"></span><br />
<pre><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">$ grep menuentry /boot/grub/grub.cfg
menuentry 'Ubuntu, with Linux 2.6.38-11-generic' --class ubuntu --class gnu-linux --class gnu --class os {
menuentry 'Ubuntu, with Linux 2.6.38-11-generic (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os {
menuentry 'Ubuntu, with Linux 2.6.38-10-generic' --class ubuntu --class gnu-linux --class gnu --class os {
menuentry 'Ubuntu, with Linux 2.6.38-10-generic (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os {
menuentry 'Ubuntu, with Linux 2.6.38-8-generic' --class ubuntu --class gnu-linux --class gnu --class os {
menuentry 'Ubuntu, with Linux 2.6.38-8-generic (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os {
menuentry "Memory test (memtest86+)" {
menuentry "Memory test (memtest86+, serial console 115200)" {
</span></pre></div></div><div><br />
</div><div>This also shows why Startup Manager uses numbers instead of names: so that the latest Linux is always chosen. This seems a poor choice for me, for anything <i>except</i> the first entry, but there you go.<br />
<br />
Anyway, once you got the name, all you have to do is edit <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">/etc/default/grub</span>, change the setting <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">GRUB_DEFAULT</span>, so that it is set to the name of the entry you want (don't forget the quotes), and then run <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">update-grub</span>.</div><div><br />
</div>Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com3tag:blogger.com,1999:blog-6373963829340632529.post-43893169839257725282011-07-05T13:55:00.000-03:002011-07-05T13:55:20.750-03:00Build a binary tree from pre-order traversal and in-order traversalThis post comes from <a href="http://www.ihas1337code.com/2011/04/construct-binary-tree-from-inorder-and-preorder-postorder-traversal.html">i has 1337 code</a> by way of <a href="http://analgorithmaday.blogspot.com/2011/05/construct-binary-tree-from-pre-order.html">An Algorithm a Day</a>. I'm really just posting this to show the power of Scala collections.<br />
<br />
The idea is that you have two sequential representations of a tree: one for in-order traversal, and the order for pre-order traversal. Just one of these representations are not enough to reconstruct the tree, but the two of them, in the absence of duplicate elements, is. See the links above for an example.<br />
<br />
The problem lends itself to recursive solutions, but there's some list manipulation required to build the input for the recursive steps. The key point of the algorithm is the realization that the first element of pre-order is the root, and every element to the left of said root in the in-order belongs to the left of the tree, and every element to the right belongs to the right of the tree. The rest is pretty trivial.<br />
<br />
Even so, Scala makes the trivial, well, trivial. Here's the full code:<br />
<br />
<pre class="brush: scala">case class Tree[T](el: T, left: Option[Tree[T]], right: Option[Tree[T]])
def mkTree[T](preorder: List[T], inorder: List[T]): Option[Tree[T]] = preorder.headOption map { head =>
val (left, _ :: right) = inorder span (head !=)
Tree(head,
mkTree(preorder filter (left contains), left),
mkTree(preorder filter (right contains), right))
}
</pre><br />
Note: I'm using <code>List</code> instead of <code>Seq</code> so that I can use the <code>::</code> extractor.Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com5Distrito Federal, Brasil-15.826691 -47.9218204-16.102796 -48.4111789 -15.550586000000001 -47.4324619tag:blogger.com,1999:blog-6373963829340632529.post-3586649462185344532011-06-27T16:14:00.000-03:002011-06-27T16:14:17.566-03:00A very quick guide to project creation on SBT 0.10.0<a href="https://github.com/harrah/xsbt/wiki">SBT 0.10.0</a> is out, and it is a very different beast at first. Aside from the need for retooling and relearning, it is a very big improvement.<br />
<br />
However, these minor differences can slow down things a bit. For one thing, previous versions of SBT asked if you wanted to create a project if run on an empty directory, and that does not happen anymore. Now it creates some stuff automatically, but other stuff -- project name, version, scala version, directory layout -- it doesn't. So, let's take a quick look on how to accomplish (roughly) the same tasks.<br />
<br />
In the example below, I named my SBT 0.10.0 script <i>xsbt</i>, so that I could use the older 0.7.7 with projects that are not yet migrated.<br />
<br />
<pre>
dcs@ayanami:~/github$ mkdir TestProject
dcs@ayanami:~/github$ cd TestProject
dcs@ayanami:~/github/TestProject$ xsbt
Getting net.java.dev.jna jna 3.2.3 ...
:: retrieving :: org.scala-tools.sbt#boot-app
confs: [default]
1 artifacts copied, 0 already retrieved (838kB/35ms)
Getting Scala 2.8.1 (for sbt)...
:: retrieving :: org.scala-tools.sbt#boot-scala
confs: [default]
4 artifacts copied, 0 already retrieved (15296kB/232ms)
Getting org.scala-tools.sbt sbt_2.8.1 0.10.0 ...
:: retrieving :: org.scala-tools.sbt#boot-app
confs: [default]
34 artifacts copied, 0 already retrieved (6012kB/215ms)
[info] Set current project to root (in build file:/home/dcs/.sbt/plugins/)
[info] Set current project to default (in build file:/home/dcs/github/TestProject/)
> set name := "TestProject"
[info] Reapplying settings...
[info] Set current project to default (in build file:/home/dcs/github/TestProject/)
> set version := "1.0"
[info] Reapplying settings...
[info] Set current project to default (in build file:/home/dcs/github/TestProject/)
> set scalaVersion := "2.9.0-1"
[info] Reapplying settings...
Getting Scala 2.9.0-1 ...
:: retrieving :: org.scala-tools.sbt#boot-scala
confs: [default]
4 artifacts copied, 0 already retrieved (20447kB/186ms)
[info] Set current project to default (in build file:/home/dcs/github/TestProject/)
> session save
[info] Reapplying settings...
[info] Set current project to default (in build file:/home/dcs/github/TestProject/)
> exit
dcs@ayanami:~/github/TestProject$ find . -type d
.
./project
./project/target
./project/target/config-classes
./project/target/scala_2.8.1
./project/boot
./project/boot/other
./project/boot/other/net.java.dev.jna
./project/boot/other/net.java.dev.jna/jna
./project/boot/other/net.java.dev.jna/jna/3.2.3
./project/boot/scala-2.9.0-1
./project/boot/scala-2.9.0-1/lib
./project/boot/scala-2.8.1
./project/boot/scala-2.8.1/lib
./project/boot/scala-2.8.1/org.scala-tools.sbt
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt/0.10.0
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt/0.10.0/compiler-interface-bin_2.7.7.final
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt/0.10.0/compiler-interface-src
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt/0.10.0/compiler-interface-bin_2.9.0.final
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt/0.10.0/compiler-interface-bin_2.8.1.final
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt/0.10.0/xsbti
./target
./target/streams
./target/streams/$global
./target/streams/$global/$global
./target/streams/$global/$global/streams
./target/streams/$global/$global/streams/$global
dcs@ayanami:~/github/TestProject$ ls
build.sbt project target
dcs@ayanami:~/github/TestProject$ cat build.sbt
name := "TestProject"
version := "1.0"
scalaVersion := "2.9.0-1"
</pre><br />
So far so good, but note that the directories for source are not created. The new version of SBT expects your IDE to do that (which is rather annoying for us vim users), or so it seems. However, the <a href="https://github.com/typesafehub/sbteclipse">eclipse plugin</a> can do that at the same time it creates the eclipse project. Here's an example:<br />
<br />
<pre>dcs@ayanami:~/github/TestProject$ cat ~/.sbt/plugins/build.sbt
resolvers += {
val typesafeRepoUrl = new java.net.URL("http://repo.typesafe.com/typesafe/releases")
val pattern = Patterns(false, "[organisation]/[module]/[sbtversion]/[revision]/[type]s/[module](-[classifier])-[revision].[ext]")
Resolver.url("Typesafe Repository", typesafeRepoUrl)(pattern)
}
libraryDependencies <<= (libraryDependencies, sbtVersion) { (deps, version) =>
deps :+ ("com.typesafe.sbteclipse" %% "sbteclipse" % "1.1" extra("sbtversion" -> version))
}
dcs@ayanami:~/github/TestProject$ xsbt
[info] Compiling 1 Scala source to /home/dcs/.sbt/plugins/project/target/scala_2.8.1/classes...
[info] Set current project to root (in build file:/home/dcs/.sbt/plugins/)
[info] Compiling 8 Scala sources to /home/dcs/.sbt/staging/a69240767cc8e721757e/target/scala-2.8.1.final/classes...
[info] Set current project to default (in build file:/home/dcs/github/TestProject/)
> eclipse create-src
[info] Updating...
[info] Done updating.
[info] Successfully created Eclipse project files. Please select the appropriate Eclipse plugin for Scala 2.9.0-1!
> exit
dcs@ayanami:~/github/TestProject$ find . -type d
.
./project
./project/target
./project/target/config-classes
./project/target/scala_2.8.1
./project/boot
./project/boot/other
./project/boot/other/net.java.dev.jna
./project/boot/other/net.java.dev.jna/jna
./project/boot/other/net.java.dev.jna/jna/3.2.3
./project/boot/scala-2.9.0-1
./project/boot/scala-2.9.0-1/lib
./project/boot/scala-2.8.1
./project/boot/scala-2.8.1/lib
./project/boot/scala-2.8.1/org.scala-tools.sbt
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt/0.10.0
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt/0.10.0/compiler-interface-bin_2.7.7.final
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt/0.10.0/compiler-interface-src
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt/0.10.0/compiler-interface-bin_2.9.0.final
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt/0.10.0/compiler-interface-bin_2.8.1.final
./project/boot/scala-2.8.1/org.scala-tools.sbt/sbt/0.10.0/xsbti
./target
./target/streams
./target/streams/$global
./target/streams/$global/ivy-sbt
./target/streams/$global/ivy-sbt/$global
./target/streams/$global/ivy-configuration
./target/streams/$global/ivy-configuration/$global
./target/streams/$global/update
./target/streams/$global/update/$global
./target/streams/$global/project-descriptors
./target/streams/$global/project-descriptors/$global
./target/streams/$global/$global
./target/streams/$global/$global/streams
./target/streams/$global/$global/streams/$global
./target/scala-2.9.0.1
./target/scala-2.9.0.1/cache
./target/scala-2.9.0.1/cache/update
./src
./src/main
./src/main/resources
./src/main/java
./src/main/scala
./src/test
./src/test/resources
./src/test/java
./src/test/scala
</pre><br />
That's it! I strongly recommend reading the wiki linked at the beginning of this post, but this will get you going for small stuff.Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com4tag:blogger.com,1999:blog-6373963829340632529.post-53472880831828439102011-05-21T12:49:00.000-03:002011-05-21T12:49:45.307-03:00Scala 2.9 optimizes for comprehensions way better!Ok, I completely missed this. For comprehensions in Scala 2.9 was way better optimized with the parameter <code>-optimize</code> than they were before! Take this code:<br />
<br />
<pre class="brush:scala">class OptEx {
def sum(l: Array[Int]) = {
var acc = 0
for (i <- 0 until l.length) acc += l(i)
acc
}
}
</pre>
This is the java bytecode generated with Scala 2.8.1 for the method <code>sum</code>:
<pre class="brush: text">public int sum(int[]);
Code:
0: new #7; //class scala/runtime/IntRef
3: dup
4: iconst_0
5: invokespecial #12; //Method scala/runtime/IntRef."<init>":(I)V
8: astore_2
9: new #14; //class scala/runtime/RichInt
12: dup
13: iconst_0
14: invokespecial #15; //Method scala/runtime/RichInt."<init>":(I)V
17: aload_1
18: arraylength
19: invokevirtual #19; //Method scala/runtime/RichInt.until:(I)Lscala/collection/immutable/Range$ByOne;
22: new #21; //class OptEx$$anonfun$sum$1
25: dup
26: aload_0
27: aload_1
28: aload_2
29: invokespecial #24; //Method OptEx$$anonfun$sum$1."<init>":(LOptEx;[ILscala/runtime/IntRef;)V
32: invokeinterface #30, 2; //InterfaceMethod scala/collection/immutable/Range$ByOne.foreach$mVc$sp:(Lscala/Functio
n1;)V
37: aload_2
38: getfield #34; //Field scala/runtime/IntRef.elem:I
41: ireturn
</pre>
And this is what Scala 2.9.0 does:
<pre class="brush: text">public int sum(int[]);
Code:
0: new #7; //class scala/runtime/IntRef
3: dup
4: iconst_0
5: invokespecial #12; //Method scala/runtime/IntRef."<init>":(I)V
8: astore 6
10: new #14; //class scala/runtime/RichInt
13: dup
14: iconst_0
15: invokespecial #15; //Method scala/runtime/RichInt."<init>":(I)V
18: aload_1
19: arraylength
20: istore_3
21: astore_2
22: getstatic #21; //Field scala/collection/immutable/Range$.MODULE$:Lscala/collection/immutable/Range$;
25: aload_2
26: invokevirtual #25; //Method scala/runtime/RichInt.self:()I
29: iload_3
30: invokevirtual #29; //Method scala/collection/immutable/Range$.apply:(II)Lscala/collection/immutable/Range;
33: dup
34: astore 8
36: invokevirtual #34; //Method scala/collection/immutable/Range.length:()I
39: iconst_0
40: if_icmple 83
43: aload 8
45: invokevirtual #37; //Method scala/collection/immutable/Range.last:()I
48: istore 4
50: aload 8
52: invokevirtual #40; //Method scala/collection/immutable/Range.start:()I
55: istore 9
57: iload 9
59: iload 4
61: if_icmpne 89
64: iload 9
66: istore 5
68: aload 6
70: aload 6
72: getfield #44; //Field scala/runtime/IntRef.elem:I
75: aload_1
76: iload 5
78: iaload
79: iadd
80: putfield #44; //Field scala/runtime/IntRef.elem:I
83: aload 6
85: getfield #44; //Field scala/runtime/IntRef.elem:I
88: ireturn
89: iload 9
91: istore 7
93: aload 6
95: aload 6
97: getfield #44; //Field scala/runtime/IntRef.elem:I
100: aload_1
101: iload 7
103: iaload
104: iadd
105: putfield #44; //Field scala/runtime/IntRef.elem:I
108: iload 9
110: aload 8
112: invokevirtual #47; //Method scala/collection/immutable/Range.step:()I
115: iadd
116: istore 9
118: goto 57
</pre><br />
Time to take your old benchmarks out of the closet, people!Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com3tag:blogger.com,1999:blog-6373963829340632529.post-3818339161521164432011-05-19T12:46:00.000-03:002011-05-19T12:46:49.468-03:00Regex AgainI have been thinking about regex lately. I have never felt comfortable with how Scala regex works, but I could never settle on what should be done about. Recently, I have started more and more of thinking of regex like this:<br />
<br />
<pre class="brush: scala">class RegexF(pattern: String) extends String => Option[Seq[String]]
</pre><br />
or, perhaps,<br />
<br />
<pre class="brush: scala">class RegexPF(pattern: String) extends PartialFunction[String, Seq[String]]
</pre><br />
In fact, <code>RegexPF.lift</code> would (could) yield a <code>RegexF</code>. It then caught my attention that <code>RegexF.apply</code> has the same signature as <code>Regex.unapplySeq</code>, which is the standard way of handling regex in Scala!<br />
<br />
Might this be what has been bugging me about Scala's regex all along? Should we translate<br />
<br />
<pre class="brush: scala">val YYYYMMDD = """(\d{4})-(\d{2})-(\d{2})""".r
val MMDDYYYY = """(\d{2})/(\d{2})/(\d{4})""".r
def getYear(s: String) = s match {
case YYYYMMDD(year, _, _) => year
case MMDDYYYY(_, _, year) => year
}
</pre><br />
into<br />
<br />
<pre class="brush: scala">val YYYYMMDD = """(\d{4})-(\d{2})-(\d{2})""".r
val MMDDYYYY = """(\d{2})/(\d{2})/(\d{4})""".r andThen (fields => fields.last +: fields.init)
def getYear(s: String) = ((YYYYMMDD orElse MMDDYYYY) andThen (_.head))(s)
</pre><br />
I can certainly see the advantages of pattern matching, but... it doesn't compose very well. And it has some performance issues, which is a big deal for most regex usages. And being a PartialFunction would not prevent a Regex from having extractors as well.Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com2tag:blogger.com,1999:blog-6373963829340632529.post-54744127667824779802011-05-14T23:30:00.000-03:002011-05-14T23:30:40.814-03:00A Cute AlgorithmThese days I <a href="http://analgorithmaday.blogspot.com/2011/05/find-k-th-minimum-in-two-sorted-array.html">read abou</a>t an algorithm challenge: given two sorted arrays, find the k-th minimum element from their merge.<br />
<br />
Well, if you do merge them, you can just get the element at index k, and the merge can be done in O(n + m), where n and m are the respective size of each array.<br />
<br />
The solution given is O(k) and pretty simple: keep an index into each array, and increase one or other until you reach k. It can be done in O(log k), though, and, fortunately for me, my first idea on how to solve it in O(k) is more easily adaptable.<br />
<br />
My own O(k) version is like this: point an index at the k-th element on the first array, and another at the first element of the second array. If the element on the first array is smaller than the element on the first array, return that. Otherwise, as long as the element on the first array is bigger than the element on the second array, decrease the first index and increase the second. After doing that, you'll have the elements on each array that make up the k smallest elements, the k-th being the bigger between the top one in each array.<br />
<br />
In code, something like this:<br />
<br />
<pre class="brush: scala">def kMin(a1: Array[Int], a2: Array[Int], k: Int) = {
def recurse(k2: Int): Int =
if (a1(k - k2 - 1) < a2(k2)) recurse(k2 + 1)
else k2
if (a1(k - 1) < a2(0)) a1(k - 1)
else {
val k2 = recurse(1)
a1(k - k2 - 1) max a2(k2 - 1)
}
}
</pre><br />
Now, that code isn't particularly good, as there are some conditions that can break it. For instance, if the first array's size is smaller than k, you'll get an array index out of bounds exception. However, it gives the basis for explaining the O(k) algorithm.<br />
<br />
Here we search linearly for the k smallest elements of both arrays together, but we know these arrays are sorted. So, instead of going one by one, we can use <i>binary search</i> instead, and turn it into O(log k)!<br />
<br />
The concept is simple. We are looking into the k smallest elements of the two array together, so we know beforehand that the maximum number of elements we need to look into either array is k.<br />
<br />
We'll search one array for the biggest element that is smaller than or equal to the k-th minimum, with the upper bound being the k-th element of that array, and the lower bound being 0 (meaning the k smallest elements are all in the other array).<br />
<br />
To check if the number x-th is among the k-smallest ones, we see if that number is smaller than the (k - x)-th element on the other array. If it is, then x is among the k smallest. The intuitive explanation for that is that, if you take (k - x) elements from the one array and x elements from the other, you get exactly k elements. No element y > x in x's array will be smaller than x, since the array is sorted. And since the (k - x)-th element in the other array is also bigger than it, then no other element in the other array can be smaller either.<br />
<br />
So, as long as we find an element that belongs in the k-th smallest, we move the lower bound. If we find an element that does not belong in the k-th smallest, we move the upper bound below it.<br />
<br />
Once we find how many elements in one array belong in the k-smallest, we also know how many elements we must take from the other array. Pick the biggest among the biggest in each array, and you have the k-th smallest.<br />
<br />
Here's the code below, which is much more concise than the above explanation. It finds the k-th smallest element, with k=1 being the smallest element of all. It assumes there are at least k elements overall on the arrays, though k may be bigger than the number of elements on one array. In fact, either array may be empty (but not both). One can find this code at my <a href="https://github.com/dcsobral/kMin">github repository</a>, along with an sbt project and a Scalacheck test case.<br />
<br />
<pre class="brush: scala"> def kMin(a1: Array[Int], a2: Array[Int], k: Int): Int = {
def select(k2: Int) = k2 match {
case `k` => a2(k - 1)
case 0 => a1(k - 1)
case _ => a1(k - k2 - 1) max a2(k2 - 1)
}
def recurse(top: Int, bottom: Int): Int = {
if (top == bottom) select(top)
else {
val x = (bottom + top) / 2 max bottom + 1
if (a1(k - x) <= a2(x - 1)) recurse (x - 1, bottom)
else recurse(top, x)
}
}
recurse(k min a2.size, 0 max k - a1.size)
}
</pre>Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com6tag:blogger.com,1999:blog-6373963829340632529.post-16234755859834566082011-05-12T14:52:00.000-03:002011-05-13T17:35:07.488-03:00Scala 2.9 and Parallel collectionsSo, <a href="http://www.scala-lang.org/node/9483">Scala 2.9.0</a> is out. Also, the <a href="http://typesafe.com/stack">Typesafe Stack</a> is also out, which brings together Scala, Akka, and a few other things to get one up-and-running quickly. Much fun.<br />
<br />
On the collection side of things, one of the first questions I saw was: do parallel collections share a common interface with standard collections. The answer is yes, they do, but not one that existed in 2.8.1.<br />
<br />
You see, a trouble with parallel collections is that, now that they are available, people will probably be passing them around. If they could be passed to old code -- as it was briefly contemplated -- that old code could crash in mysterious ways. In fact, it happens with REPL itself.<br />
<br />
For that reason, ALL of your code comes with a <i>guarantee</i> that it will only accept sequential collections. In other words, Iterable, Seq, Set, etc, they all now share a guarantee to be sequential, which means you cannot pass a parallel sequence to a method expecting Seq.<br />
<br />
The parallel collections start with Par: ParIterable, ParSeq, ParSet and ParMap. No ParTraversable for now. These are guaranteed to be parallel. They can be found inside scala.collection.parallel, scala.collection.parallel.immutable, etc.<br />
<br />
You can also get a parallel collection just by calling the ".par" method on it, and, similarly, the ".seq" method will return a sequential collection.<br />
<br />
Now, if you want your code to <i>not care</i> whether it receives a parallel or sequential collection, you should prefix it with Gen: GenTraversable, GenIterable, GenSeq, etc. These can be either parallel or sequential.<br />
<br />
And, now, something fun to try out:<br />
<br />
<pre class="brush: scala">def p[T](coll: collection.GenIterable[T]) = coll foreach println; p(1 to 20); p((1 to 20).par)
</pre>Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com0tag:blogger.com,1999:blog-6373963829340632529.post-36592031436179591842011-04-30T23:02:00.000-03:002011-04-30T23:02:46.208-03:00Expressive Code and the Alternative VoteOne of the joys of writing code in Scala is how expressive it looks. Instead of dealing with the minutia of handling data, I can concentrate on what the code is actually doing. With experienced Scala programmers, that goes without saying. Newcomers have a harder time, because they are still a bit confused by grammar and vocabulary to pay proper attention to what is being said, so to speak.<br />
<br />
The most striking example of that comes from the use of Scala's collection. There are many powerful collection libraries out there, but you are usually made very aware that you are handling a collection -- code to be hidden inside a class, to avoid contaminating business logic with it. Scala's collections, on the other hand, can easily fit in the highest abstraction levels of the code.<br />
<br />
Let me take an example from the upcoming British referendum about the adoption (or not) of <a href="http://en.wikipedia.org/wiki/Instant-runoff_voting">Alternative Vote</a>. In this voting system, each voter ranks the candidates in order of preference. Depending on the specifics of its implementation, it may be possible or not to leave candidates unranked. The winner is the candidate that manages to get 50% of the votes, with the the candidate with the least votes being removed and his or her votes being reassigned according to the voter's preferences until some candidate gets elected.<br />
<br />
So, let's consider how one could implement the algorithm that decides who the winner is. Let's say we have a set of candidates, identified by their names, and a list of votes. Each vote is a list of candidates ranked by preference. From that, we have to produce the name of the winner. In other words:<br />
<br />
<pre class="brush: scala">def elect(candidates: Set[String], votes: Seq[Seq[String]]): String = {</pre><br />
So, where do we begin? There are three essential tasks: we need to tally the votes for every candidate, we need to see if the candidate with most votes has at least 50% of all votes, and we need to discover who's the candidate the the least amount of votes.<br />
<br />
Let's say we did these tasks, and now we know who the candidate with least and most votes are, and we have a tally of votes for all candidates. In this case, we can return the winner very easily, if there's one:<br />
<br />
<pre class="brush: scala"> if (votesByCandidate(mostVotesCandidate) >= votes.size / 2) mostVotesCandidate
</pre><br />
Which still leave us the problem of how to handle the (expected) usual case of no candidate reaching 50% on first tally. There's an easy solution for that, though: just remove the candidate with least votes from the pool of candidates, and try to get a winner out of that. It's easy, because we already have a function that does that:<br />
<br />
<pre class="brush: scala"> else elect(candidates - leastVotesCandidate, votes)
</pre><br />
Of the three tasks posed earlier, two are pretty simple as well: deciding who's the candidate with least and most votes. We could sort all candidates by vote and take the first and last candidate, but we don't actually need to <i>sort</i> anything: just knowing top and bottom is enough. We can do that like this:<br />
<br />
<br />
<pre class="brush: scala"> val mostVotesCandidate = candidates maxBy votesByCandidate
val leastVotesCandidate = candidates minBy votesByCandidate
</pre><br />
Now all that's left is finding how many votes each candidate has. We could pick the first preference in each vote and do a tally on that, but some candidates may have been removed from the pool. Instead let's say the <i>valid candidates</i> of a vote are the ranked list of candidates for a vote that are still in the running. We can compute that for a vote by doing:<br />
<br />
<pre class="brush: scala"> def validCandidates(vote: Seq[String]): Seq[String] = vote filter candidates
</pre><br />
This doesn't read right, actually. A minor complain some people have about collection methods is that filter seems to do exactly the opposite of what's wanted: if you say filter X (those for which X is true), then it will <i>keep</i> X instead of discarding it. So, when we say "filter candidates", it will <i>keep</i> these candidates in the vote, and discard the rest.<br />
<br />
The other non-obvious thing about this line is "candidates" itself. What does it mean to say "filter candidates"? Well, "filter" takes a function which, given a value of the collection, will return true or false depending on whether that value must be kept or discarded. That means "candidates" must be a function which, given the name of a candidate, returns true or false.<br />
<br />
However, "candidates" is a set! We declared it so in the very first line of code presented, didn't we? Well, in Scala a set is <i>also</i> a function that tests whether a value is present in the set or not, returning true or false accordingly. In fact, sequences and maps are also functions in Scala, the former from indices to values, and the latter from keys to values.<br />
<br />
Well, enough about that. We can now just take the first candidate in the list of valid candidates and tally the votes... as long as all candidates are ranked in each vote. If that is not the case, then a vote may well not contain any candidates still in the running, in which case the list of valid candidates will be empty.<br />
<br />
Since this example comes from the British referendum, and the AV proposed in that referendum does not require votes to rank all candidates, we'll deal with that. Let's say the first valid candidate of a vote may be some candidate or no one. That is,<br />
<br />
<pre class="brush: scala"> def firstValidCandidate(vote: Seq[String]): Option[String] = validCandidates(vote) headOption
</pre><br />
We can then use this to get a list of first choices for all votes with a valid first candidate:<br />
<br />
<pre class="brush: scala"> val firstChoices = votes flatMap firstValidCandidate
</pre><br />
The votes for a candidate are the number of first choices for that candidate. We'll make a map out of it to avoid recounting that every time.<br />
<br />
<br />
<pre class="brush: scala"> def votesFor(candidate: String) = firstChoices count (candidate ==)
val votesByCandidate = candidates map (candidate => candidate -> votesFor(candidate)) toMap;
</pre><br />
<div>Finally, we have to do something about the possibility of no candidate reaching 50%, which is possible in a system where not all candidates are ranked. I don't know how the proposed system will do in that case, but I'll just choose the most voted candidate if there aren't more than two.</div><div><br />
</div><div>With that fix in, this is what the whole code looks like:</div><div><br />
</div><pre class="brush: scala">def elect(candidates: Set[String], votes: Seq[Seq[String]]): String = {
def validCandidates(vote: Seq[String]): Seq[String] = vote filter candidates
def firstValidCandidate(vote: Seq[String]): Option[String] = validCandidates(vote) headOption
val firstChoices = votes flatMap firstValidCandidate
def votesFor(candidate: String) = firstChoices count (candidate ==)
val votesByCandidate = candidates map (candidate => candidate -> votesFor(candidate)) toMap;
val mostVotesCandidate = candidates maxBy votesByCandidate
val leastVotesCandidate = candidates minBy votesByCandidate
if (votesByCandidate(mostVotesCandidate) >= votes.size / 2 || candidates.size <= 2) mostVotesCandidate
else elect(candidates - leastVotesCandidate, votes)
}
</pre><br />
<div>While there's a Scala oddity here and there, the code is pretty clear for all that it is doing.</div>Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com7tag:blogger.com,1999:blog-6373963829340632529.post-64542203517207485012011-03-22T13:09:00.000-03:002011-03-22T13:09:37.726-03:00Scala popularityI had 5 posts in this blog throughout 2010 -- two in January, two in June. One post January of this year. Given that, I'm pretty sure no one follows my blog except, perhaps, as a forgotten automatic tracker of some sort.<div><br />
</div><div>Well... I decided to blog about something back on Sunday, which is a terrible day to blog something if you want hits. <i>Late</i> Sunday. Looking at statistics for Monday, though, I see that I have hit three times more hits than my previous record in a single day.</div><div><br />
</div><div>It is a heartening indication of how much interest Scala is attracting nowadays.</div>Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com9tag:blogger.com,1999:blog-6373963829340632529.post-29284366446104625402011-03-20T19:54:00.002-03:002011-03-24T09:27:19.786-03:00On Scala 2.9's road...I suspect a lot of people are eagerly waiting for the parallel collections on Scala 2.9. The thing is... it's just not <i>my</i> thing. I like that it is being made available, but it's just not a pervasive feature for my small daily needs.<br />
<br />
So, while I was somewhat bored by Scala 2.9, after the huge jump 2.8 was, there <i>has</i> been some nice improvements. For one thing, the jLine library used by REPL was replaced with one based on <a href="https://github.com/jdillon/jline2">this</a> (the canonical repository to the jLine actually used in Scala is <a href="https://github.com/paulp/jline2">here</a>) giving a much superior experience. Now one can edit input that spans multiple lines (longer than the number of columns in the screen) without trouble, search the history, etc. There's even something to show its key-bindings: just type <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">:keybindings</span>.<br />
<br />
And speaking of REPL, it doesn't stop there, by a long margin! There's <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">:javap</span>, which will happily decompile a class or file, <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">:type</span> which will show an expression's type without evaluating it, and <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">:implicits</span> to show what implicits are in scope. Add <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">-v</span> to that last one, and it will show those that come with <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">Predef</span> by default.<br />
<br />
Those of you pasting code into REPL, or wanting to define companion objects, or pretty much any other feature that depends on the content of the next line, you'll be happy to know there's now <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">:paste</span>. Instead of instantly evaluating each line, it will wait until you hit <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">^D</span>.<br />
<br />
More recently, a few features came up that will help those that like to do Scala scripting. The <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">-absolute-cp</span> parameter will ensure relative paths on classpaths will be made absolute based on where the script is being run from, not where the compilation daemon was started at. If you don't even know what I'm talking about, then trust me: that will save you a lot of pain.<br />
<br />
Another option, <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">-max-idle</span>, will let you specify how long the compilation daemon will stay up when idle, and even disable its auto-shutdown.<br />
<br />
And just to make scripting even nicer, <a href="http://code.google.com/p/simple-build-tool/">SBT's </a><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">Process </span>library is now available in Scala, as <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">sys.process</span>! Now we can do stuff like this:<br />
<br />
<pre class="brush: scala">import sys.process._
Process cat new URL("http://databinder.net/dispatch/About") !
</pre><pre class="brush: scala">"find src -name *.scala -exec grep null {} ;" #| "xargs test -z" #&& "echo null-free" #|| "echo null detected" !</pre><br />
Another interesting scripting feature is that not only will files containing a single object with a main method be runnable as if it were a script, but jar files produced by scala when it compiles scripts will be runnable like programs. For example:<br />
<br />
<pre class="brush: scala">% cat script.scala
object FiddleDeeMain {
def main(args: Array[String]): Unit = {
println(args mkString " ")
}
}
% scala -nocompdaemon script.scala a b c
a b c
</pre><br />
And, conversely,<br />
<br />
<pre class="brush: scala">% cat script2.scala
println(args mkString " ")
% scala -save script2.scala arg1 arg2
arg1 arg2
% scala script2.jar arg1 arg2
arg1 arg2
</pre><br />
On the library side, a few things have happened too. Some changes where made to <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">view</span>, which made it much faster than it used to be. Arguably, its previous lack of performance resulted from a bug, so if you had performances issues with it, you might want to check it out again. Also on the performance front, another change with lots of potential is the introduction of a new hash code algorithm -- murmur3.<br />
<br />
To those of you who like writing methods and classes with <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">Numeric</span> and <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">Ordering</span>, you'll probably like to know that you can now add "<span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">import Numeric.Implicits._</span>" and "<span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">import Ordering.Implicits._</span>" and avoid all that messy implicit parameter handling. For example:<br />
<br />
<pre class="brush: scala">import Numeric.Implicits._
def sum[N: Numeric](lst: List[N]) = lst reduceLeft (_ + _)
</pre><br />
It might not seem much for a single method like that, but if you use this stuff often, I'm sure you see the advantages.<br />
<br />
There are plenty of small changes that will make life easier, or more intuitive. For example, <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">-5.abs</span> will now work. Scaladoc is much faster. Many bugs have been fixed, even on places you'd never guess there were bugs (scary thought).<br />
<br />
An interesting improvement is the introduction of <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">DelayedInit</span>. While most people won't ever hear of it -- unless it starts getting used in DSLs --, it enabled the rehabilitation of a much criticized feature: the <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">Application </span>trait. It is not that <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">Application </span>was a bad idea, but it was badly implemented. Given all that has been written about staying away from it, its new implementation was also given a new name, which we can now start using in our blogs everywhere: <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">App</span>.<br />
<br />
On the realm of experimental stuff that may never make 2.9 at all, I'm pretty fond of <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">-Yrich-exceptions</span>. It adds some capabilities to exceptions on REPL. One example is <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">lastException.show</span>, which will display the source code location of the exception, if available (to use it, point <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">SOURCEPATH </span>to the source code).<br />
<br />
There are plenty of other improvements, way too many to talk about: better docs, better error messages, better performance, bug fixes, more methods, new traits and classes, not to mention improvements made for people creating compiler plugins or working on Scala itself. If you want to get more dibs into Scala 2.9, there's a <a href="https://sites.google.com/site/scalatohoku/changes-and-improvements-on-scala-2-9">japanese site tracking the changes</a>, though only looking through the commit log one can truly feel the scope of what Scala 2.9 is. Kudos to Scala's development team!Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com11tag:blogger.com,1999:blog-6373963829340632529.post-54238064466105218952011-01-24T17:21:00.000-02:002011-01-24T17:21:56.405-02:00Testing with Scala for Fun and ProfitSometimes I like Scala, and sometimes I <i>really</i> like Scala.<br />
<br />
So, I was writing some algorithms to compute the median of a sequence of values, just for the fun of it, adapting from pseudo-code description of an algorithm. The problem with pseudo-code is that it is pseudo-precise, meaning my algorithm was pseudo-correct, so I wanted to test it to ensure it really worked.<br />
<br />
As it happens, the median of values has a trivial implementation that could be used to test against. So, one import and one one-liner afterwards, I had a quick way to test it from the REPL, where I could then experiment with the failing test case to understand what went wrong:<br />
<br />
<pre class="brush: scala">import org.scalacheck.Prop._
forAll((lst: List[Double]) => lst.nonEmpty ==> (myAlgorithm(lst) == lst.sorted.apply((lst.size - 1) / 2))).check</pre><br />
That uses the <a href="http://code.google.com/p/scalacheck/">Scalacheck</a> library, the best way to test algorithms in my opinion. What I'm doing with <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">forAll</span> is saying that for all inputs (<span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">List[Double]</span>) of that function, the conditions must hold true. Specifically, if the input is not empty, then the result of my algorithm must be equal to the result of the trivial implementation of median. That will result in a <i>property</i> (class <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">Prop</span><span class="Apple-style-span" style="font-family: inherit;">).</span><br />
<br />
I then tell it to <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">check</span> that property, which will be done by automatically generating input, with some heuristics to improve the chance of catching boundary cases, and testing the condition for each such input. If 100 tests pass, it finishes by saying:<br />
<br />
<pre class="brush: scala">+ OK, passed 100 tests.</pre><br />
Or, if it fail at some point, it will say something like this:<br />
<br />
<pre class="brush: scala">! Falsified after 0 passed tests.
> ARG_0: List("-1.0") (orig arg: List("-1.0", "8.988465674311579E307"))</pre><br />
To be honest, my one-liner was slightly longer, because I was using arrays and arrays in Java do not have a useful <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">toString</span> method, so I had to tell Scalacheck how to print the array. Both Scalatest and Specs support integration with Scalacheck too, so this can be easily turned into part of a test suite.<br />
<br />
Now this helped me achieve correctness, but I was also interested in how fast the code could run. I had three different algorithms (including the trivial implementation), and two of them could be further refined by abstracting over the selection of a pivot (quicksort-style). At that point, I decided to build a small framework which would help me test everything automatically.<br />
<br />
I wanted to test each algorithm three times for different input sizes. The basic measurement algorithm was done by one of the methods in my toolkit:<br />
<br />
<pre class="brush: scala">import java.lang.System.currentTimeMillis
def bench[A](n: Int)(body: => A): Long = {
val start = currentTimeMillis()
1 to n foreach { _ => body }
currentTimeMillis() - start
}</pre><br />
This should be familiar to anyone who ever did microbenchmarking with Scala. With that in hand, another one liner got the results I wanted:<br />
<br />
<pre class="brush: scala">println(bench(50000)(myAlgorithm(Array.fill(500)(nextDouble))))</pre><br />
Which worked well enough for a while, but really didn't scale as I got more algorithm variations and tested them with different settings. So, to get the results for each algorithm, I wrote this:<br />
<br />
<pre class="brush: scala">import scala.util.Random.nextDouble
def benchmark(algorithm: Array[Double] => Double,
arraySizes: List[Int]): List[Iterable[Long]] =
for (size <- arraySizes)
yield for (iteration <- 1 to 3)
yield bench(50000)(algorithm(Array.fill(size)(nextDouble)))</pre><br />
Which let me pass a list of sizes I wanted to test the stuff at, and run each benchmark three times, to give me a feel for the variation in the results. Next, I made a list of the algorithms I wanted tested:<br />
<br />
<pre class="brush: scala">val algorithms = sortingAlgorithm :: immutableAlgorithms</pre><br />
That's the list I started with, but it grew as I added other algorithms. As for the immutable algorithms, they were all the same method call, but passing different pivot selection as parameters. As it got a bit verbose, I decided to apply a bit of DRY. First, I made a list of my pivot selection algorithms:<br />
<br />
<pre class="brush: scala">val immutablePivotSelection: List[(String, Array[Double] => Double)] = List(
"Random Pivot" -> chooseRandomPivot,
"Median of Medians" -> medianOfMedians,
"Midpoint" -> ((arr: Array[Double]) => arr((arr.size - 1) / 2))
)</pre><br />
Next, I used that list to produce a list of median algorithms using each of these pivot selections:<br />
<br />
<pre class="brush: scala">val immutableAlgorithms = for ((name, pivotSelection) <- immutablePivotSelection)
yield name -> (findMedian(_: Array[Double])(pivotSelection))</pre><br />
With the list of algorithms in hand, it was a simple for comprehension to produce a list of results:<br />
<br />
<pre class="brush: scala">val benchmarkResults: List[String] = for {
(name, algorithm) <- algorithms
results <- benchmark(algorithm, arraySizes).transpose
} yield formattingString format (name, formatResults(results))</pre><br />
The transposition let me see each size on a different column, making it easier to compare the algorithms when displayed together. Anyway, once I had that ready, I could also easily use the list of algorithms to test them:<br />
<br />
<pre class="brush: scala">def test(algorithm: Array[Double] => Double,
reference: Array[Double] => Double): String = {
def prettyPrintArray(arr: Array[Double]) = arr mkString ("Array(", ", ", ")")
val resultEqualsReference = forAll { (arr: Array[Double]) =>
arr.nonEmpty ==> (algorithm(arr) == reference(arr)) :| prettyPrintArray(arr)
}
Test.check(Test.Params(), resultEqualsReference)(Pretty.Params(verbosity = 0))
}
val testResults = for ((name, algorithm) <- algorithms)
yield formattingString format (name, test(algorithm, sortingAlgorithm._2))</pre><br />
I could even make the test method more general, but parameterizing the type of the algorithm, and adding a parameter for a no-parameter function generating the input. This could be easily done, but it was not needed by the time I was finished.<br />
<br />
I find the final result of a certain elegance (then again, I'm obviously biased), that's not the reason I really like Scala. What I really, really like about it, is how I could start very small, with a few easy commands on the REPL, and then use that as the basis for an increasingly more flexible framework to do the tests I wanted.<br />
<br />
If anyone is interested in seeing the full code, it can be found <a href="http://stackoverflow.com/questions/4662292/scala-median-implementation/4663557#4663557">here</a>.Danielhttp://www.blogger.com/profile/07505997833685327219noreply@blogger.com1