Friday, July 4, 2014

File generation with SBT

Someone asked me a question on IRC about file generation with SBT. I pointed them to a page in the SBT documentation and tried to briefly explain how it works, but the subject got a little too long for IRC, so I thought I'd make a blog post out of it. Good thing too, because there are some errors on that page.

Anyway, let's start. The goal here is that, when you compile a project, some source files get generated by code and are then compiled together with the ones you wrote. The person wanted the generator to have tests -- for that, I recommend writing an SBT plugin. I won't go further into that; I'll just explain the basic mechanism for generating source files.

If you inspect sourceGenerators, the setting mentioned by the SBT page, you'll see the following description:

[info] Setting: scala.collection.Seq[sbt.Task[scala.collection.Seq[java.io.File]]]

That means it is a setting (that is, its value is fixed by the configuration file). The setting contains a sequence, which means you can have more than one source generator. This sequence contains tasks, so each generator is a task, which means it is evaluated anew every time it gets executed instead of being computed once like a setting. The task must return a sequence of files, which I assumed, correctly, to be the list of files that were generated.

Now, you'll also see further down this information:

[info] Reverse dependencies:
[info] root/compile:managedSources

That means it is managedSources that uses sourceGenerators. Running inspect uses managedSources, in turn, shows this:

[info] root/compile:sources

In other words, whenever you compile, any source generators you have defined will be run. You can see as well that this is defined not only for compile, but also for test or any other compilation task you may have (I also have it:compile, for example).

So, with that in mind, we can start creating our generator. All the lines below can be placed in a build.sbt file, though you'd use plain Scala files for a plugin. This is just to quickly demonstrate how it's used. First, I'm going to create a task key for a sequence of files, which will be my generator:

lazy val generator = taskKey[Seq[File]]("My Generator")          

Don't ask me why it's "lazy val" -- I'm simply repeating what I saw elsewhere. :) Also note that this uses the equals sign, not the colon-equals sign.

Now that we have a task key, we can assign a task to it. Since it's going to be of some complexity, let's start with:

generator in Compile := {

Now we can proceed with the rest. I'm going to define a method with the basic generating capabilities, and then call this method with some parameters as the body of this task. My generator will be pretty simple: given source and destination directories, copy all files ending with .txt from the source to the destination, changing the extension to .scala. Not very useful, perhaps, but enough to show how to get at some source, and produce something with it at a proper destination. So here it is:

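  def generate(src: File, dst: File): Seq[File] = {
    // src.list returns null if the directory doesn't exist, hence the Option
    val sourceFiles = Option(src.list) getOrElse Array() filter (_ endsWith ".txt")
    if (sourceFiles.nonEmpty) dst.mkdirs()
    for (file <- sourceFiles) yield {
      val srcFile = src / file
      val dstFile = dst / ((file take (file lastIndexOf '.')) + ".scala")
      // needs java.nio.file.{Files, StandardCopyOption} imported at the top of build.sbt
      Files.copy(srcFile.toPath, dstFile.toPath, StandardCopyOption.REPLACE_EXISTING)
      dstFile
    }
  }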

There are a couple of things to note here. First, I'm handling the case where there are no source files -- I tested it on a project with multiple subprojects, which resulted in annoying exceptions when I tried it out. Also, note that I create the target directory: even though SBT provided me with a target directory, it didn't actually create it. And I pass an option to replace existing files as well -- remember that it has to work without running clean every time. Finally, notice how I return the destination files, as required by sourceGenerators.
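Since the copying logic doesn't depend on SBT at all, it can also be tried out as a plain Scala program, which is handy if you want tests for it. What follows is only a sketch under some assumptions: the object name GeneratorSketch, the temporary directory names and Hello.txt are made up for illustration, and it uses new File(parent, name) in place of SBT's / operator.

```scala
import java.io.File
import java.nio.file.{Files, StandardCopyOption}

// Hypothetical standalone version of the generator logic (no SBT required).
object GeneratorSketch {
  // Copies every *.txt file in src to dst, renaming it to *.scala,
  // and returns the files that were written.
  def generate(src: File, dst: File): Seq[File] = {
    // File.list returns null when src is missing or is not a directory.
    val names = Option(src.list).getOrElse(Array.empty[String]).filter(_.endsWith(".txt"))
    if (names.nonEmpty) dst.mkdirs()
    names.toSeq.map { name =>
      val target = new File(dst, name.stripSuffix(".txt") + ".scala")
      // REPLACE_EXISTING lets the generator run again without a clean.
      Files.copy(new File(src, name).toPath, target.toPath, StandardCopyOption.REPLACE_EXISTING)
      target
    }
  }

  def main(args: Array[String]): Unit = {
    val src = Files.createTempDirectory("gen-src").toFile
    // The destination does not exist yet, just like SBT's managed source directory.
    val dst = new File(Files.createTempDirectory("gen-dst").toFile, "src_managed")
    Files.write(new File(src, "Hello.txt").toPath, "object Hello".getBytes("UTF-8"))

    generate(src, dst)                 // first run creates dst and Hello.scala
    val generated = generate(src, dst) // second run overwrites instead of failing
    println(generated.map(_.getName).mkString(", ")) // prints "Hello.scala"

    // A missing source directory gives an empty result rather than an exception.
    println(generate(new File(src, "missing"), dst).isEmpty) // prints "true"
  }
}
```

Running it twice in a row is the interesting part: without the replace option, the second copy would fail because the destination file already exists.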

Now, for source and destination directories. There's a setting for the destination directory, which I also saw on the linked SBT docs page. As for the source directory, I'll get the base directory of the current project, and add a subdirectory to it. So my task ends with:

  generate(baseDirectory.value / "managed", sourceManaged.value)
}

All that remains is assigning it to sourceGenerators, which actually took some time because the documentation was wrong. In the end, I saw an email mentioning that the ".task" macro suggested in the SBT docs doesn't actually exist because the name was already taken by something else, so trying to use it gives strange errors. The actual syntax I had to use is this:

sourceGenerators in Compile <+= (generator in Compile)
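For reference, here is how the pieces might look when put together in a single build.sbt. Treat it as a sketch rather than the exact file I used: it assumes the SBT 0.13 syntax of this post (including the blank lines between definitions), and I've swapped in Array.empty[String] and stripSuffix as slightly more explicit ways of writing the same thing.

```scala
// build.sbt -- sketch for SBT 0.13.x, the version this post was written against
import java.nio.file.{Files, StandardCopyOption}

// Task key for the generator; the description string is arbitrary.
lazy val generator = taskKey[Seq[File]]("My Generator")

generator in Compile := {
  def generate(src: File, dst: File): Seq[File] = {
    // src.list is null when the directory does not exist (e.g. in subprojects).
    val names = Option(src.list).getOrElse(Array.empty[String]).filter(_.endsWith(".txt"))
    if (names.nonEmpty) dst.mkdirs()
    names.toSeq.map { name =>
      val dstFile = dst / (name.stripSuffix(".txt") + ".scala")
      // Overwrite files left over from a previous run.
      Files.copy((src / name).toPath, dstFile.toPath, StandardCopyOption.REPLACE_EXISTING)
      dstFile
    }
  }
  // "managed" is just the input directory name chosen for this post.
  generate(baseDirectory.value / "managed", sourceManaged.value)
}

// Register the task as a source generator for the Compile configuration.
sourceGenerators in Compile <+= (generator in Compile)
```

With this in place, dropping a .txt file into the "managed" directory under the project's base directory should be enough for the next compile to pick it up.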

To test, I wrote some stuff to a text file, intentionally meant to cause a compilation error, and ran the compile task with this result:

[info] Compiling 1 Scala source to /Users/dsobral/src/sinks/target/scala-2.11/classes...
[error] /Users/dsobral/src/sinks/target/scala-2.11/src_managed/test.scala:1: expected class or object definition
[error] This file should cause a compilation error.
[error] ^
[error] one error found
[error] (root/compile:compile) Compilation failed
[error] Total time: 1 s, completed Jul 4, 2014 8:55:12 PM

1 comment:

  1. If you change `generator` to be a regular val, not lazy, then you can write:

    sourceGenerators in Compile += (generator in Compile).taskValue

    thereby avoiding the undocumented <+= operator.

    Why it blows up if the val is lazy, I have no idea.

    Did you open a ticket on the mistake in the sbt docs...?