all that jazz

james' blog about scala and all that jazz

sbt - A declarative DSL

This is my second post in a series of posts about sbt. The first post, sbt - A task engine, looked at sbt’s task engine, how it is self documenting, making tasks discoverable, and how scopes allow hierarchical fallbacks. In this post we’ll take a look at how the sbt task engine is declared, again taking a top down approach, rooted in practical examples.

§Settings are not settings

In the previous post, we were introduced to settings, being a specialisation of tasks that are only executed once at the start of an sbt session. File that bit of knowledge away, and now reset your definition of setting to nothing. I’m guessing this is a relic of past versions of sbt, but the word setting in sbt can mean two distinct things: one is a task that’s executed at the start of the session, the other is a declaration of a task (or of a setting, as in executed at the start of the session). No clearer?

Let’s introduce another bit of terminology, a task key. A task key is a string and a type. The string is the name of the task, and it’s how you reference the task on the command line. The type is the type that the task produces. So the sources task has a name of "sources", and a type of Seq[File]. It is defined in sbt.Keys:

val sources = TaskKey[Seq[File]]("sources", "All sources, both managed and unmanaged.", BTask)

You can see it also has a description and a rank, those are not really important to us now. The thing that uniquely defines this task key is the "sources" string. You could define another sources key elsewhere, and as long as the two have the same name, they will be considered the key for the same task. Of course, if you define two tasks using the same key name, but different key types, that’s not allowed, and sbt will give you an error.

In addition to TaskKeys there are also SettingKeys, this is setting as in only executed once per session. Now these keys by themselves do nothing. They only do something when you declare some behaviour for them. So, a setting as in a task declaration is a task or setting key, potentially scoped, with some associated behaviour. For the remainder of this post, when I say setting, that’s what I’m referring to.
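
As a minimal sketch of the distinction, here is how you might declare a key of each kind yourself and then give each one some behaviour. The key names greeting and sayHello are made up for illustration, and the sketch assumes sbt 0.13’s settingKey and taskKey helpers, which take the key’s name from the val it is assigned to. The := syntax is covered further down.

```scala
// Hypothetical keys, for illustration only. On their own they do nothing.
val greeting = settingKey[String]("A greeting, evaluated once per session.")
val sayHello = taskKey[Unit]("Prints the greeting each time it is run.")

// These two expressions are settings in the "declaration" sense:
// a key plus some associated behaviour.
greeting := "hello"
sayHello := println(greeting.value)
```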

§Settings are executed like a program

Defining sbt’s task engine is done by giving sbt a series of settings, each setting declaring a task implementation. sbt then executes those settings in order. Tasks can be declared multiple times by multiple settings, the last one to execute wins.

So where do these settings come from? They come from many places. Most obviously, they come from the build file. But most settings don’t come from there - as you’re aware, sbt defines many, many fine grained tasks, but you don’t have to declare settings for these yourself. Most of the settings come from sbt plugins. One in particular that is enabled by default is the JvmPlugin. This contains all the settings necessary for building a Java or Scala program, including settings that declare the sources task that we saw yesterday. Plugin settings are executed before the settings in your build file, this means that any settings you declare in your build file will override the settings declared by the plugins.

This ordering of settings is important to note, it means settings have to be read from top to bottom. I have handled a number of support cases and mailing list questions where people haven’t realised this, they have declared a setting, and then after that included a sequence of settings from elsewhere in their build that redeclares the setting. They expected their setting to take precedence, but since their setting came before the setting from the sequence, the setting from the sequence overwrites it.
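
A small sketch of that support-case scenario, assuming a hypothetical commonSettings sequence defined elsewhere in the same build file:

```scala
// Hypothetical shared settings, perhaps reused across several projects.
lazy val commonSettings = Seq(
  scalacOptions := Seq("-deprecation")
)

// Declared first...
scalacOptions := Seq("-deprecation", "-feature")

// ...but this sequence is applied afterwards, so its scalacOptions wins
// and "-feature" is silently lost.
commonSettings
```

Moving the scalacOptions line below commonSettings reverses the outcome, because it then becomes the last declaration to execute.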

§sbt build file syntax

We’re about to get into some concrete examples of declaring settings, so before we do that we had better cover the basics of the sbt build file. sbt builds can either be specified in plain *.scala files, or in sbt’s own *.sbt file format. As of sbt 0.13.7, the sbt format has become powerful enough that there is really not much that you can’t do with it, so we’re only going to look at that.

An sbt file may have any name. Convention is that a project’s main build file be called build.sbt, but that is only a convention. The file may contain a series of Scala statements and expressions, and it’s important here to distinguish between statements and expressions. What’s the difference? A statement doesn’t return a value. For example:

val foo = "bar"

This is a statement, it has assigned "bar" to the val foo, but this assignment doesn’t return a value. In contrast:

5 + 4

This is an expression, it returns a value of type Int.

Expressions in an sbt file must have a type of either Setting[_] or Seq[Setting[_]]. sbt will evaluate all these expressions, and add them to the settings for your build. Any expression in your sbt file that isn’t of one of those types will generate an error.

Statements can be anything. They can be utility methods, vals, lazy vals, whatever. In most cases, sbt ignores them, but that doesn’t make them useless, you can use them in other expressions or statements, to help you define your build. There is one type of statement though that sbt doesn’t ignore, and that is statements that assign a val to a project, this is how projects are defined:

lazy val sbtFunProject = project in file(".")

The final thing to know about sbt build files is that sbt automatically brings in a number of imports. For example, sbt._ is imported, as is sbt.Keys._, so you have access to the full range of built in task keys that sbt defines without having to import them yourself. sbt also brings in imports declared by plugins, making it straightforward to use those plugins.
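
Putting those rules together, a small build.sbt might look like the following sketch. The organisation name and the helper val are made up for illustration:

```scala
// Statement: a plain val. sbt ignores it, but later lines can use it.
val orgName = "com.example"

// Statement that sbt does not ignore: it defines a project.
lazy val sbtFunProject = project in file(".")

// Expression of type Setting[_].
organization := orgName

// Expression of type Seq[Setting[_]].
Seq(
  name := "sbt-fun",
  version := "0.1-SNAPSHOT"
)

// No imports are needed: organization, name and version come from sbt.Keys._
```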

§Declaring a setting

The process of declaring a setting is done by taking a task key, optionally applying a scope to it, and then declaring an implementation for that task. Here’s a very basic example:

name := "sbt-fun"

In this case we’re declaring the implementation of the name task to simply be a static value of "sbt-fun". Note that the above is an expression, not a statement. := is not a Scala language feature, it is actually a method that sbt has provided. sbt’s syntax for declaring settings is a pure Scala DSL. If this syntax confuses you, then I strongly recommend that you read a post I wrote a few years ago called Three things I had to learn about Scala before it made sense. That post explains how DSLs are implemented in Scala, and is essential reading before you read on in this post if you don’t understand that already.

What if we want to declare our own implementation of the sources task? Remembering that we want it scoped to compile, we can do this:

sources in Compile := Seq(file("src/main/scala/MySource.scala"))

Again we’re only setting a static value for this task to return, but this time you can see how we’ve scoped the sources task in the compile scope. Note that configurations such as compile and test are available through capitalised vals, in scope in your build.

§Back to first principles

What if we want to declare a dependency on another task? Let’s say we want to declare sources to be, as it’s described, all managed and unmanaged sources. If you’ve used sbt before, you probably know that you can use this syntax:

sources := managedSources.value ++ unmanagedSources.value

This was introduced in sbt 0.13, and it’s actually implemented by a macro that does some magic for you. It’s great, I use that syntax all the time, and so should you. However, as with anything that does magic for you, if you don’t understand what it’s doing for you and how it does it, you can run into troubles.

As I described in the last post, sbt is a task engine, and tasks declare dependencies that are executed before, and provided as input, to them. In the above example, it doesn’t look like this is happening at all, what it looks like is that when the sources task is executed, it executes the managedSources task by calling value, and the unmanagedSources task by calling value, and then concatenates their results together. There is a macro that is transforming this code to something that does declare dependencies, and takes the inputs of those dependencies and passes them to the implementation.

So in order to understand what the macro is doing for us, let’s implement this ourselves manually - let’s declare this setting from first principles.

Firstly, we’re going to use the <<= operator instead, this is how to say that I am declaring this task to be dependent on other tasks. Now, we could do a very straightforward declaration to another task:

sources <<= unmanagedSources

This will say that the sources task has a dependency on unmanagedSources, and will take the output of unmanagedSources as is, and return it as the output of sources. What if we wanted to change that value before returning it? We can do that using the map method:

sources <<= unmanagedSources.map(files => files.filterNot(_.getName.startsWith("_")))

So now we’ve filtered out all the files that start with _ (note that sbt already provides an excludeFilter task that can be used to configure this, this is just an example).

At this point let’s take a step back and think about what the code above has done. For one, nothing has yet been executed, at least not the task implementation. That <<= method returns an object of type Setting. This setting has the following attributes:

  • The key (potentially scoped) that it is the task declaration for, in this case, sources.
  • The keys (potentially scoped) of tasks that it depends on, in this case, unmanagedSources.
  • A function that takes the output of the tasks that it depends on as input parameters, executes the task, and returns the output of the task being declared (that is, the function we passed to the map method, that filters out all files that start with _).

You can see here that we haven’t actually executed anything in the task, we have only declared how the task is implemented. So when sbt goes to execute the sources task, it will find this declaration, execute the dependencies, and then execute the callback. This is why I’ve called this blog post “sbt - A declarative DSL”. All our settings just declare how tasks are implemented, they don’t actually execute anything.
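
One way to convince yourself of this is to put a side effect in the callback. In this sketch (the println is only there for illustration), nothing should be printed when sbt loads the build, because loading only collects the declaration. The message should appear each time something actually executes the sources task in the compile scope:

```scala
sources in Compile <<= (unmanagedSources in Compile).map { files =>
  // Runs when the task executes, not when the build is loaded.
  println("filtering " + files.size + " unmanaged sources")
  files.filterNot(_.getName.startsWith("_"))
}
```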

So, what if we want to depend on two different tasks? Through the magic of the sbt DSL, we can put them in a tuple, and then map the tuple:

sources <<= (unmanagedSources, managedSources).map { (unmanaged, managed) => unmanaged ++ managed }

And now we actually have our first principles implementation of the sources task. Sort of, we haven’t scoped it to compile, but that’s not hard to do:

sources in Compile <<= (unmanagedSources in Compile, managedSources in Compile).map(_ ++ _)

For brevity I’ve used a shorter syntax for concatenating the two sources sequences.

§sbt uses macros heavily

So now that we’ve seen how to declare tasks from first principles, let’s see how the macros work. We have our declaration from before:

sources := { managedSources.value ++ unmanagedSources.value }

I’ve inserted the curly braces to make it clear what is being passed to the := macro. The := macro will go through the block of code passed to it, and find all the instances of where value is called, and gather all the keys that it is invoked on. It will then generate Scala code (or rather AST) that builds those keys as a tuple, and then invokes map. To the map call, it will pass the original code block, but replacing all the keys that had value on them with parameters that are taken as the input arguments to the function passed to map. Essentially, it builds exactly the same code that we implemented in our first principles implementation.

Now, it’s important to understand how these macros work, because when you try to use the value call outside of the context of a macro, you will run into problems. An important thing to realise is that the code generated by the macro never actually invokes value. value is just a placeholder method used to tell the macro to extract these keys out to be dependencies that get mapped. The value method itself is in fact a macro, one that, if you invoke it outside of the context of another macro, will result in a compile time error, the exact error message being “value can only be used within a task or setting macro, such as :=, +=, ++=, Def.task, or Def.setting.” And you can see why: since sbt settings are entirely declarative, you can’t access the value of a task from the key, it doesn’t make sense to do that.
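
As a sketch of what that error message is pointing at: if you want to factor part of a task implementation out into a helper, the helper body needs its own macro context, which Def.task provides. The helper name scalaOnly is made up for illustration:

```scala
// Hypothetical helper. Def.task gives .value a macro context to live in;
// calling .value directly in a plain method body would not compile.
def scalaOnly = Def.task {
  (unmanagedSources in Compile).value.filter(_.getName.endsWith(".scala"))
}

// The helper is itself only a declaration, so it is depended on with .value too.
sources in Compile := scalaOnly.value ++ (managedSources in Compile).value
```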

From now on in this post we’ll switch to using the macros, but remember what these macros compile to.

§Redeclaring tasks

So far we’ve seen how to completely overwrite a task. What if you don’t want to overwrite the task, you just want to modify its output? sbt allows you to make a task depend on itself. If you do that, the task will depend on the old implementation of itself, giving the output of that implementation to you as your input. In the previous blog post, I brought up the possibility of only compiling source files with a certain annotation inside them, let’s say we’re only going to compile source files that contain the text "COMPILE ME". Here’s how you might implement that, depending on the existing sources implementation:

sources := {
  sources.value.filter { sourceFile =>
    IO.read(sourceFile).contains("COMPILE ME")
  }
}

sbt also provides a shorthand for doing this, the ~= operator, which takes a function that takes the old value and returns the new value:

sources ~= _.filter { sourceFile =>
  IO.read(sourceFile).contains("COMPILE ME")
}

Another convenient pair of shorthands for modifying the old value of a task that sbt provides, and that you have likely come across before, are the += and ++= operators. These take the old value, and append the item or sequence of items produced by your new implementation to it. So, to add a file to the sources:

sources += file("src/other/scala/Other.scala")

Or to add multiple files:

sources ++= Seq(
  file("src/other/scala/Other.scala"),
  file("src/other/scala/OtherOther.scala")
)

These of course can depend on other tasks through the value macro, just like when you use :=:

sources ++= (sourceDirectory.value / "other" / "scala").***.get

The *** method selects the directory and everything beneath it, recursively, and get turns that selection into a Seq[File].

§Scope me up

We’ve talked a little bit about scopes, but most of our examples so far have excluded them for brevity. So let’s take a look at how to scope settings and their dependencies.

To apply a scope to a setting, you can use the in method:

sources in Compile += file("src/other/scala/Other.scala")

Applying multiple scopes can be done by using multiple in calls, for example:

excludeFilter in sbtFunProject in unmanagedSources in Compile := "_*"

Or, they can also be done by passing multiple scopes to the in method, in the order project, configuration then task:

excludeFilter in (sbtFunProject, Compile, unmanagedSources) := "_*"

The same syntax can be used when depending on settings, though make sure you put parentheses around the whole scoped setting in order to invoke the value method on it:

(sources in Compile) := 
  (managedSources in Compile).value ++ 
  (unmanagedSources in Compile).value

§Conclusion

In the first post in this series, we were introduced to the concepts behind sbt and its task engine, and how to explore and discover the task graphs that sbt provides. In this post we saw the practical side of how task dependencies and implementations are declared, using both the map method to map dependency keys, as well as macros. We saw how to modify existing task declarations, as well as how to use scopes.

One thing I’ve avoided here is showing cookbooks of how to do specific tasks, for example, how to add a source generator. The sbt documentation is really not bad for this, especially for cookbook type documentation, but I also hope that after reading these posts, you aren’t as dependent on copying and pasting sbt configuration, but rather can use the tools built in to sbt to discover the structure of your build, and modify it accordingly.

sbt - A task engine

sbt is the best build tool that I’ve used. But it’s also the build tool with the steepest learning curve that I’ve ever used, and I think most people would agree that it’s very difficult to learn. When you first start using it, configuring it is like casting spells, spells that have to be learned from a spell book, that have to be said in the exact right way, otherwise they don’t work. There are lots of guides out there that are essentially spell books, they teach you all the things you need to know to achieve various tasks. But I haven’t seen a lot out there that actually explains what sbt is, what it does, why it is the way it is. This blog post is my attempt to do that.

§A task engine

Simply put, sbt is a task engine. You have tasks. A task may be dependent on other tasks. Any task from any point in the build may be redefined, and new tasks can be easily added. In some ways it is a bit like make or ant, but it differs in a fundamental way, sbt tasks produce an output value, and are able to consume the output values of the tasks they depend on - whereas make and ant just modify the file system. This property of sbt allows you to break build steps up into very fine grained tasks.

So let’s take an example. A common step that build tools support is compilation. In many traditional build tools, a compilation task is responsible for finding a set of files to compile based on some input parameters, such as a list of source directories, and compiling them. In sbt, the compile task is not responsible for finding a set of files to compile, this is the responsibility of the sources task. The output value of the sources task is a list of files to compile. The compile task depends on the sources task, taking its list of files to compile as an input.

So what’s so good about this? What it means is that I can completely customise the way sources are located, by redefining the sources task. So if I have a crazy build requirement such as wanting to put a special annotation in my source files to say whether they get compiled or not, I can very easily implement my own sources task to do that. This is something that would be very difficult to do in another build tool, but in sbt it’s straightforward.

In other build tools, if I want to generate some sources, I have to make sure that the task to generate the sources runs before the compilation task, and puts them in a place the compilation task will find. In sbt, I can just redefine the sources task to make it generate the sources. In practice though, I don’t need to do that, because generating sources is a very common requirement. Remember that I said that sbt tasks can be very fine grained. The sources task itself depends on many other tasks, one of them is the managedSources task, which collects all the files that are managed (or generated) by the build (in contrast to unmanaged sources, which are your regular source files that you manage yourself). That task in turn depends on the sourceGenerators task, which I can redefine to add new source generators.

§A self documenting task engine

At this point you might be starting to see that there are many, many tasks involved in even the simplest build in sbt. I’ve talked about just one small part, how generated sources end up being compiled, but there are many more than that. How is someone that is new to sbt supposed to know what tasks exist, so that they can customise their build? Well, it turns out sbt comes with a few built in tools for inspecting the available tasks. These are often seen as advanced features of sbt, but I think really this is what new users to sbt should be introduced to first. So if you’re new, it’s time to fire up sbt.

First we need a simple project. In an empty directory, create a file called build.sbt, and set your project’s name:

name := "sbt-fun"

Now, if you already have sbt 0.13 or later installed, you can use that. If you already have activator installed, which is basically just a script that launches sbt, then you can use that. If you have neither, then download and install either activator or sbt, it doesn’t matter which, and then start it in your project’s directory:

$ sbt
[info] Loading project definition from /Users/jroper/sbt-fun/project
[info] Updating {file:/Users/jroper/sbt-fun/project/}sbt-fun-build...
[info] Resolving org.fusesource.jansi#jansi;1.4 ...
[info] Done updating.
[info] Set current project to sbt-fun (in build file:/Users/jroper/sbt-fun/)
> 

So, now we’re on the sbt console. Earlier we were talking about the sources task. Let’s have a look at it. sbt has a command called inspect, which lets you inspect a task:

> inspect sources
[info] Task: scala.collection.Seq[java.io.File]
[info] Description:
[info]  All sources, both managed and unmanaged.
[info] Provided by:
[info]  {file:/Users/jroper/sbt-fun/}sbt-fun/compile:sources
[info] Defined at:
[info]  (sbt.Defaults) Defaults.scala:188
[info] Dependencies:
[info]  compile:unmanagedSources
[info]  compile:managedSources
[info] Delegates:
[info]  compile:sources
[info]  *:sources
[info]  {.}/compile:sources
[info]  {.}/*:sources
[info]  */compile:sources
[info]  */*:sources
[info] Related:
[info]  test:sources

What are we looking at? First, we can see that sources is a task that produces a sequence of files - as I said before. We can also see a description of the task, All sources, both managed and unmanaged. The Defined at section is interesting, it shows us where the sources task is defined, in this case, it’s on line 188 of the sbt Defaults class. We can see that it has two tasks that it depends on, unmanagedSources and managedSources. The rest of the information we won’t worry about for now.

Now before we start playing with our build, we can actually get even more information here, not only is it possible to inspect a single task in sbt, you can also inspect a whole tree of tasks, using the inspect tree command:

> inspect tree sources
[info] compile:sources = Task[scala.collection.Seq[java.io.File]]
[info]   +-compile:unmanagedSources = Task[scala.collection.Seq[java.io.File]]
[info]   | +-*/*:sourcesInBase = true
[info]   | +-*/*:excludeFilter = sbt.HiddenFileFilter$@5a63fa71
[info]   | +-*:baseDirectory = /Users/jroper/sbt-fun
[info]   | +-*/*:unmanagedSources::includeFilter = sbt.SimpleFilter@44a44a04
[info]   | +-compile:unmanagedSourceDirectories = List(/Users/jroper/sbt-fun/src/main/scala, /Users/jroper/sbt-fun/sr..
[info]   |   +-compile:javaSource = src/main/java
[info]   |   | +-compile:sourceDirectory = src/main
[info]   |   |   +-*:sourceDirectory = src
[info]   |   |   | +-*:baseDirectory = /Users/jroper/sbt-fun
[info]   |   |   |   +-*:thisProject = Project(id sbt-fun, base: /Users/jroper/sbt-fun, configurations: List(compile,..
[info]   |   |   |   
[info]   |   |   +-compile:configuration = compile
[info]   |   |   
[info]   |   +-compile:scalaSource = src/main/scala
[info]   |     +-compile:sourceDirectory = src/main
[info]   |       +-*:sourceDirectory = src
[info]   |       | +-*:baseDirectory = /Users/jroper/sbt-fun
[info]   |       |   +-*:thisProject = Project(id sbt-fun, base: /Users/jroper/sbt-fun, configurations: List(compile,..
[info]   |       |   
[info]   |       +-compile:configuration = compile
[info]   |       
[info]   +-compile:managedSources = Task[scala.collection.Seq[java.io.File]]
[info]     +-compile:sourceGenerators = List()
[info]     

So in here you can see that sources to managedSources to sourceGenerators chain that I mentioned before, and you can also see the unmanagedSources chain, which is a lot more complex, we can see directory hierarchies, filters for deciding which files to include and exclude, etc.

§Settings vs Tasks

At this point you may notice that there are two types of tasks in the tree, there are things like managedSources, which just describe the type of the task:

compile:managedSources = Task[scala.collection.Seq[java.io.File]]

And then there are things like scalaSource, which actually display a value:

compile:scalaSource = src/main/scala

This is actually an sbt optimisation, sbt has a special type of task called a Setting. Settings get executed once per session, so when you start sbt up, you start a new session, and all the settings get executed then. This is why when I inspect the tree, sbt can show me the value, because it already knows it. In contrast, an ordinary Task gets executed once per execution. So if I now run the sources task, that managedSources task will be executed then. If I run sources again, it will be executed again. But my settings only get executed once for the whole session.

It should be noted that an execution is a request by the user to execute a task. If several tasks in my tree depend on the sources task, sbt will ensure that the sources task only gets executed once. So if I run the publish task, which transitively depends on the compile task, as well as the doc task (that generates java/scala docs), and the packageSrc task (that generates source jars), these all depend on the same sources task, which will only be executed once during my publish execution, and the value will be reused as the input for all three tasks.

Now naturally, since settings are executed at the start of the session, and not as part of an execution, they can’t depend on tasks, they can only depend on other settings. Meanwhile, tasks can depend on both other tasks and settings.
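
As a rough sketch of that rule in build file terms (the syntax itself is the subject of the companion post, sbt - A declarative DSL, and the two keys here are made up for illustration):

```scala
// Hypothetical keys, for illustration only.
val buildLabel = settingKey[String]("Evaluated once, at the start of the session.")
val sourceCount = taskKey[Int]("Evaluated on every execution.")

// A setting depending on other settings: fine.
buildLabel := name.value + "-" + version.value

// A task depending on a task and on a setting: also fine.
sourceCount := {
  val count = (sources in Compile).value.size
  println(buildLabel.value + " has " + count + " source files")
  count
}

// A setting depending on a task is rejected when the build is compiled:
// buildLabel := sourceCount.value.toString
```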

The difference between settings and tasks becomes important when you’re writing your own sbt plugins that define their own settings and tasks. But in general, you can consider them to be the same thing, settings are just a small optimisation so that they don’t have to be executed every time. When defining your own tasks or settings, a good rule of thumb is if in doubt, just define a task.

§Scopes

Scopes are another important feature of the sbt task engine. A task can be scoped. When a task depends on another task, it can depend on that task in a particular scope. Now one obvious type of scope that sbt supports is the configuration scope. sbt has a few built in configurations, the two main ones that you’ll interact with are compile and test. So above, when the sources task depends on managedSources, you can see that it actually depends on compile:managedSources, which means it depends on managedSources in the compile scope.

In actual fact, you can see at the top that we are looking at the tree for compile:sources. When you don’t specify a scope, sbt will choose a default scope, in this case it has chosen the compile scope. The logic in how it makes that decision we won’t cover here. We could also inspect the test:sources tree:

> inspect tree test:sources
[info] test:sources = Task[scala.collection.Seq[java.io.File]]
[info]   +-test:unmanagedSources = Task[scala.collection.Seq[java.io.File]]
[info]   | +-test:unmanagedSourceDirectories = List(/Users/jroper/sbt-fun/src/test/scala, /Users/jroper/sbt-fun/src/t..
[info]   | | +-test:javaSource = src/test/java
[info]   | | | +-test:sourceDirectory = src/test
[info]   | | |   +-*:sourceDirectory = src
[info]   | | |   | +-*:baseDirectory = /Users/jroper/sbt-fun
[info]   | | |   |   +-*:thisProject = Project(id sbt-fun, base: /Users/jroper/sbt-fun, configurations: List(compile,..
[info]   | | |   |   
[info]   | | |   +-test:configuration = test
[info]   | | |   
[info]   | | +-test:scalaSource = src/test/scala
[info]   | |   +-test:sourceDirectory = src/test
[info]   | |     +-*:sourceDirectory = src
[info]   | |     | +-*:baseDirectory = /Users/jroper/sbt-fun
[info]   | |     |   +-*:thisProject = Project(id sbt-fun, base: /Users/jroper/sbt-fun, configurations: List(compile,..
[info]   | |     |   
[info]   | |     +-test:configuration = test
[info]   | |     
[info]   | +-*/*:unmanagedSources::includeFilter = sbt.SimpleFilter@44a44a04
[info]   | +-*/*:excludeFilter = sbt.HiddenFileFilter$@5a63fa71
[info]   | 
[info]   +-test:managedSources = Task[scala.collection.Seq[java.io.File]]
[info]     +-test:sourceGenerators = List()
[info]     

It looks pretty similar to the compile:sources tree, except that it depends on test scoped settings. In some cases, you can see that the scope is *, this means that it’s depending on an unscoped task/setting.

Configuration is not the only axis that you can scope tasks on in sbt, sbt supports two other axes, project and task.

The project axis is scoped by an sbt project. An sbt build can have multiple projects, and each project can have its own set of settings. When you define tasks on a project, sbt will automatically scope those tasks, and the dependencies of those tasks, to be for that project, that is if you haven’t already explicitly scoped them to a project yourself. Tasks scoped to one project can also depend on tasks in another project, so you could for example make the packageSrc task in one project depend on the sources for all the other projects, thus bringing all your sources together into one source jar.

The syntax for scoping something by project on the sbt command line is to prefix the task with the project name followed by a slash, then the task. For example sbt-fun/compile:sources is the sources task in the compile scope from the sbt-fun project. You can actually see from the output of the plain inspect command, in the Provided by section, that the full task is {file:/Users/jroper/sbt-fun/}sbt-fun/compile:sources, this is the path of the build, followed by the project name, configuration and task. Sometimes tasks and settings are scoped to be global or for the entire build, you can see some such settings above, they are prefixed with */, so */*:excludeFilter is the excludeFilter task, with no configuration scope, and no project scope.

The final axis is to be scoped by another task. Scoping by another task is incredibly useful, which we’ll see when we get to scope fallbacks, but what it means is that the same task key can be used and explicitly configured for many tasks. In the above tree we can see that unmanagedSources depends on includeFilter scoped to the unmanagedSources task, the syntax for this is unmanagedSources::includeFilter. includeFilter may also be used elsewhere, for example, in discovering resources, in that case it will be scoped to the unmanagedResources task.

§Scope fallbacks

Scopes work in a hierarchical fashion, allowing fallbacks through the hierarchy when tasks at a specific scope can’t be found. I mentioned above that unmanagedSources depends on unmanagedSources::includeFilter. Let’s have a closer look, by inspecting it:

> inspect unmanagedSources
[info] Task: scala.collection.Seq[java.io.File]
[info] Description:
[info]  Unmanaged sources, which are manually created.
[info] Provided by:
[info]  {file:/Users/jroper/sbt-fun/}sbt-fun/compile:unmanagedSources
[info] Defined at:
[info]  (sbt.Defaults) Defaults.scala:182
[info]  (sbt.Defaults) Defaults.scala:209
[info] Dependencies:
[info]  compile:baseDirectory
[info]  compile:unmanagedSourceDirectories
[info]  compile:unmanagedSources::includeFilter
[info]  compile:unmanagedSources::excludeFilter
...

So we can see that compile:unmanagedSources depends on compile:unmanagedSources::includeFilter and compile:unmanagedSources::excludeFilter. But if we have a look at the inspect tree command, we’ll notice a discrepancy:

> inspect tree unmanagedSources
[info] compile:unmanagedSources = Task[scala.collection.Seq[java.io.File]]
[info]   +-*/*:sourcesInBase = true
[info]   +-*/*:excludeFilter = sbt.HiddenFileFilter$@5a63fa71
[info]   +-*:baseDirectory = /Users/jroper/sbt-fun
[info]   +-*/*:unmanagedSources::includeFilter = sbt.SimpleFilter@44a44a04
...

So, while it depended on sbt-fun/compile:unmanagedSources::includeFilter, it actually got */*:unmanagedSources::includeFilter, that is, it requested a task at a specific project and configuration, but got a task that was defined for no project or configuration. Furthermore, the excludeFilter which was similarly requested, was satisfied by */*:excludeFilter, that is, it isn’t even scoped to the unmanagedSources task. This is a demonstration of how sbt uses fallbacks. When a task declares a dependency, sbt will try and satisfy that dependency with the most specific task it has for it, but if no task is defined at that specific scope, it will fall back to a less specific scope.

What this means, for example for excludeFilter, is that if you have a text editor that generates temporary files of a particular format, you can exclude those by adding it to the global excludeFilter, you don’t need to define an excludeFilter for every single scope. But, I might also decide that I want to exclude certain files in the test scope, so I can configure a different excludeFilter for tests by scoping it to test. Or, I might decide that I want a different filter again just for unmanagedSources, as opposed to unmanagedResources, so I can define the excludeFilter specifically for those tasks. The general approach that sbt takes in its predefined task dependency trees is to depend on tasks at a very specific scope, but define them at the most general scope that makes sense, allowing tasks to be overridden in a blanket fashion, but at a very fine grained level when required.
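
In build file terms, those three levels might be declared as in the sketch below. The syntax is explained in the companion post, sbt - A declarative DSL, and the file patterns are assumptions chosen purely for illustration:

```scala
// Most general: used everywhere unless something more specific is defined.
// "*~" is an assumed pattern for an editor's temporary files.
excludeFilter in Global := HiddenFileFilter || "*~"

// More specific: picked up by anything resolved in the test configuration.
excludeFilter in Test := HiddenFileFilter || "*~" || "*.bak"

// Most specific: only the unmanagedSources task in the compile configuration.
excludeFilter in (Compile, unmanagedSources) := HiddenFileFilter || "_*"
```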

§Parallel by default

There is one last feature of the sbt task engine that I think is worth mentioning in this post. It’s not one that really needs to be understood well in order to use sbt, but it is a very powerful one that sbt’s architecture makes very simple. In sbt, all tasks are executed in parallel by default. Now of course, if a task declares a dependency on another task, those two tasks can’t run in parallel. But two tasks that have no dependency on each other, such as unmanagedSources and managedSources, can, and will be executed in parallel. Given sbt’s fine grained tasks, this makes for some considerable (and much needed, given the speed of scala compilation) performance improvements out of the box compared to other build tools.

sbt’s concurrent execution is also configurable, tasks can be tagged, and then you can define, for example, the maximum number of tasks with that tag that can be run in parallel. You can read more about these capabilities in the sbt documentation on parallel execution.
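
As a small sketch of what such a restriction looks like in a build file, assuming sbt 0.13’s built in Tags.Test tag, the following limits test tasks to one at a time across the whole build:

```scala
// Allow at most one task tagged as a test to run at any one time.
concurrentRestrictions in Global += Tags.limit(Tags.Test, 1)
```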

§Conclusion

In this blog post we have seen that sbt is actually a task engine, and that the fact that it breaks tasks up into many smaller interdependent tasks gives you a lot of power and flexibility. We have seen that the sbt console can be used to inspect tasks, their dependencies, and entire dependency graphs of tasks, and this allows us to learn about sbt, the tasks that are available, and see how our build fits together. We have learned how tasks can be scoped to different configurations, projects, and other tasks, and how sbt uses a fallback system to resolve dependencies at specific scopes. Hopefully sbt is now more transparent to you, you no longer need a spellbook to know how to configure it, rather, you can use the inspect commands to discover what you can configure yourself.

We have not seen anything about how to define or redefine tasks, or the syntax of the sbt build file. This is the topic of my next blog post, sbt - A declarative DSL.

About

Hi! My name is James Roper, and I am a software developer with a particular interest in open source development and trying new things. I program in Scala, Java, Go, PHP, Python and Javascript, and I work for Lightbend as the architect of Kalix. I also have a full life outside the world of IT, enjoy playing a variety of musical instruments and sports, and currently I live in Canberra.