all that jazz

james' blog about scala and all that jazz

SBT per version global plugins

I'm an IntelliJ user, so I have the SBT idea plugin permanently in my global SBT plugins. This is fine, until I start working across versions of SBT, and particularly with milestone builds of unreleased SBT versions, because in that case, the version of the idea plugin that I usually use with other versions of SBT may not be available. Hence I need to have per SBT version global plugins. I think this might be easier to do in future, but for now, here's a way to do it:

libraryDependencies <++= (sbtBinaryVersion in update, scalaBinaryVersion in update) { (sbtV, scalaV) =>
  sbtV match {
    case sbt013 if sbt013.startsWith("0.13.") => Seq(
      Defaults.sbtPluginExtra("com.github.mpeltonen" % "sbt-idea" % "1.5.0-SNAPSHOT", sbtV, scalaV)
    )
    case _ => Seq(
      Defaults.sbtPluginExtra("com.github.mpeltonen" % "sbt-idea" % "1.4.0", sbtV, scalaV)
    )
  }
}

Advanced routing in Play Framework

We frequently get questions about how to meet all sorts of different routing needs in Play Framework. While the built in router is enough for most users, sometimes you may encounter use cases where it's not enough. Or, maybe you want a more convenient way to implement some routing pattern. Whatever it is, Play will allow you to do pretty much anything. This blog post is going to describe some common use cases.

Hooking into Plays routing mechanism

If for some reason you don't like Plays router, or if you want to use a modified router, then Play allows you to do this easily. Global.onRouteRequest is the method that is invoked to do routing. By default, this delegates to the Play router, but you can override it to do whatever you want. For example:

override def onRouteRequest(req: RequestHeader): Option[Handler] = {
  (req.method, req.path) match {
    case ("GET", "/") => Some(controllers.Application.index)
    case ("POST", "/submit") => Some(controllers.Application.submit)
    case _ => None
  }
}

As you can see, I've practically implemented my own little routing DSL here. I could also delegate back to the default router by invoking super.onRouteRequest(req).

An interesting thing that could also be done is to delegate to different routers based on something in the request. A play router compiles to an instance of Router.Routes, and it will be an object called Routes itself. By default, any file with the .routes extension in the conf directory will by compiled, and will go in the package with the same name as the filename, minus the .routes. So if I had two routers, foo.routes and bar.routes, I could implemented a crude form of virtual hosting like so:

override def onRouteRequest(req: RequestHeader): Option[Handler] = {
  if (req.host == "foo.example.com") {
    foo.Routes.routes.lift(req)
  } else if (req.host == "bar.example.com") {
    bar.Routes.routes.lift(req)
  } else {
    super.onRouteRequest(req)
  }
}

So here are some use cases that overriding onRouteRequest may be useful for:

  • Modifying the request in some way before routing is done
  • Plugging in a completely different router (eg, jaxrs)
  • Delegating to different routes files based on some aspect of the request

Implementing a custom router

We saw in the previous example how to use Plays Router.Routes interface, another option is to implement it. Now, there's no real reason to implement it if you're just going to delegate to it directly from onRouteRequest. However, by implementing this interface, you can include it in another routes file, using the sub routes include syntax, which in case you haven't come across this before, typically looks like this:

->    /foo         foo.Routes

Now something that people often criticise Play for is that it doesn't support rails style resource routing, where a convention is used to route commonly needed REST endpoints to particular methods on a controller. Although Play comes with nothing out of the box that does this, it is not hard to implement this today for your project, Play 2.1 has everything you need to support it, by using the routes includes syntax, and implementing your own router. And I have some good news too, we will be introducing a feature like this into Play soon. But until then, and also if you have your own custom conventions that you want to implement, you will probably find these instructions very helpful.

So let's start off with an interface that our controllers can implement:

trait ResourceController[T] extends Controller {
  def index: EssentialAction
  def newScreen: EssentialAction
  def create: EssentialAction
  def show(id: T): EssentialAction
  def edit(id: T): EssentialAction
  def update(id: T): EssentialAction
  def destroy(id: T): EssentialAction
}

I could provide default implementations that return not implemented, but then implementing it would require using override keywords. I think it's a matter of preference here.

Now I'm going to write a router. The router interface looks like this:

trait Routes {
  def routes: PartialFunction[RequestHeader, Handler]
  def documentation: Seq[(String, String, String)]
  def setPrefix(prefix: String)
  def prefix: String
}

The routes method is pretty self explanatory, it is the function that looks up the handler for a request. documentation is used to document the router, it is not mandatory, but it used by at least one REST API documenting tool to discover what routes are available and what they look like. For brevity in this post, we won't worry about implementing it. The prefix and setPrefix methods are used by Play to inject the path of the router. In the routes includes syntax that I showed above, you could see that we declared the router to be on the path /foo. This path is injected using this mechanism.

So we'll write an abstract class that implements the routes interface and the ResourceController interface:

abstract class ResourceRouter[T](implicit idBindable: PathBindable[T]) 
    extends Router.Routes with ResourceController[T] {
  private var path: String = ""
  def setPrefix(prefix: String) {
    path = prefix
  }
  def prefix = path
  def documentation = Nil
  def routes = ...
}

I've given it a PathBindable, this is so that we have a way to convert the id from a String extracted from the path to the type accepted by the methods. PathBindable is the same interface that's used under the covers when in a normal routes file to convert types.

Now for the implementation of routes. First I'm going to create some regular expressions for matching the different paths:

  private val MaybeSlash = "/?".r
  private val NewScreen = "/new/?".r
  private val Id = "/([^/]+)/?".r
  private val Edit = "/([^/]+)/edit/?".r

I'm also going to create a helper function for the routes that require the id to be bound:

def withId(id: String, action: T => EssentialAction) = 
  idBindable.bind("id", id).fold(badRequest, action)

badRequest is actually a method on Router.Routes that takes the error message and turns it into an action that returns that as a result. Now I'm ready to implement the partial function:

def routes = new AbstractPartialFunction[RequestHeader, Handler] {
  override def applyOrElse[A <: RequestHeader, B >: Handler](rh: A, default: A => B) = {
    if (rh.path.startsWith(path)) {
      (rh.method, rh.path.drop(path.length)) match {
        case ("GET", MaybeSlash()) => index
        case ("GET", NewScreen()) => newScreen
        case ("POST", MaybeSlash()) => create
        case ("GET", Id(id)) => withId(id, show)
        case ("GET", Edit(id)) => withId(id, edit)
        case ("PUT", Id(id)) => withId(id, update)
        case ("DELETE", Id(id)) => withId(id, destroy)
        case _ => default(rh)
      }
    } else {
      default(rh)
    }
  }

  def isDefinedAt(rh: RequestHeader) = ...
}

I've implemented AbstractPartialFunction, and the main method to implement then is applyOrElse. The match statement doesn't look much unlike the mini DSL I showed in the first code sample. I'm using regular expressions as extractor objects to extract the ids out of the path. Note that I haven't shown the implementation of isDefinedAt. Play actually won't call this, but it's good to implement it anyway, it's basically the same implementation as applyOrElse, except instead of invoking the corresponding methods, it returns true, or for when nothing matches, it returns false.

And now we're done. So what does using this look like? My controller looks like this:

package controllers

object MyResource extends ResourceRouter[Long] {
  def index = Action {...}
  def create(id: Long) = Action {...}
  ...
  def custom(id: Long) = Action {...}
}

And in my routes file I have this:

->     /myresource              controllers.MyResource
POST   /myresource/:id/custom   controllers.MyResource.custom(id: Long)

You can see I've also shown an example of adding a custom action to the controller, obviously the standard crud actions are not going to be enough, and the nice thing about this is that you can add as many arbitrary routes as you want.

But what if we want to have a managed controller, that is, one whose instantiation is managed by a DI framework? Well let's created another router that does this:

class ManagedResourceRouter[T, R >: ResourceController[T]]
    (implicit idBindable: PathBindable[T], ct: ClassTag[R]) 
    extends ResourceRouter[T] {

  private def invoke(action: R => EssentialAction) = {
    Play.maybeApplication.map { app =>
      action(app.global.getControllerInstance(ct.runtimeClass.asInstanceOf[Class[R]]))
    } getOrElse {
      Action(Results.InternalServerError("No application"))
    }
  }

  def index = invoke(_.index)
  def newScreen = invoke(_.newScreen)
  def create = invoke(_.create)
  def show(id: T) = invoke(_.show(id))
  def edit(id: T) = invoke(_.edit(id))
  def update(id: T) = invoke(_.update(id))
  def destroy(id: T) = invoke(_.destroy(id))
}

This uses the same Global.getControllerInstance method that managed controllers in the regular router use. Now to use this is very simple:

package controllers

class MyResource(dbService: DbService) extends ResourceController[Long] {
  def index = Action {...}
  def create(id: Long) = Action {...}
  ...
  def custom(id: Long) = Action {...}
}
object MyResource extends ManagedResourceRouter[Long, MyResource]

And in the routes file:

->     /myresource              controllers.MyResource
POST   /myresource/:id/custom   @controllers.MyResource.custom(id: Long)

The final thing we need to consider is reverse routing and the Javascript router. Again this is very simple, but I'm not going to go into any details here. Instead, you can check out the final product, which has a few more features, here.

Java 8 Lambdas - The missing link to moving away from Java

I learnt functional programming, but then I decided I liked imperative programming better so I switched back.
— Nobody, ever

Moving from imperative programming to functional programming is a very common thing to do today. Blog posts on the internet abound with testimonies about it. Everything I've read and every person I've talked to, including myself, has the same story. Once they started functional programming, there was no looking back. They loved it, and in the early days, even the small amount they learnt gave them a thirst to learn more.

To me it seems clear, going from imperative programming to functional programming is a one way street under heavy traffic. It's a diode with a million volts sitting across it. It's a check valve on a mains water pipe. Not only can you not go back, but it comes with an irresistible desire to explore and learn more that pushes you further into functional programming.

Java 8 Lambdas

With Java 8 lambdas on the way, this poses an interesting turning point for one of the largest groups of developers on the planet. Lambdas in themselves don't necessarily equate to functional programming. But they do enable it. And as a developer here starts dabbling in functional programming, a library maintainer there, we'll start to see some new things in Java source code. Methods that previously might have returned null will start returning Optional. Libraries that do IO, for example HTTP client libraries, will start returning CompletableFuture. More and more functional concepts will start creeping into Java interfaces, there will be methods called fold, map, reduce, collect. And so will start the one way street of the Java masses moving from imperative programming to functional programming.

But will Java satisfy their thirst? Looking at the Lambda spec, I suspect not. I see an essence of genius in the Lambda spec, it makes many, many existing libraries instantly usable, with no changes, with Lambdas. The reason for this is that a Lambda is just syntactic sugar for implementing a single-abstract-method (SAM) interface. In Java you'll find SAM's everywhere, from Runnable and Callable in the concurrency package, to ActionListener in Swing, to Function and Supplier in Guava, and the list goes on. All of these libraries are Lambda ready today.

However, this also presents a problem. Functional programming gets interesting when you start composing things. The ability to pass functions around and compose them together gives a lot of power - but the Java 8 Lambdas are not composable. Java 8 does provide a Future SAM, but so does Guava, and many other libraries. To compose these together, you'd need all permutations of composition methods. Two SAM's of the same type aren't even very simple to compose, at least, not in a traditional Java way, since you can't add any methods to the SAM, such as a map or transform method, to do the composing.

So, without the ability to do one of the most basic functional concepts, composing functions, can Java ever become a functional language? Maybe there are some creative ways to solve this that I haven't thought of. And maybe it doesn't need to, I don't think the designers of Java 8 Lambdas had any intention of making Java a functional language, so you can't call this a fault of the Lambda spec. But the problem is the developers, having got a taste for functional programming, as I pointed out earlier, are going to want more, and going to want more fast. Even if Java could become a functional language, I don't think it will keep up with the movement of Java developers to functional programming.

So I'm going to make a prediction. Java 8 Lambdas will be eagerly adopted. So eagerly that Java itself will be left behind, and the majority of Java developers will move on to a language that meets their needs as eager budding new functional programmers.

Which language?

Before I speculate on which language Java developers will move to, let me first just qualify that I am both biased and ignorant. I work for Typesafe, and so am obviously biased towards Scala. And apart from playing with Haskell and ML at university, I have never used any other functional language in anger. So take my words with a grain of salt, and if you disagree, write your own blog post about it.

Scala as a transitionary language

So firstly, I think Scala makes a great transitionary language for imperative programmers to switch to functional programming. Having got a taste for functional programming with Java 8 Lambdas, a Java developer will find themselves very comfortable in Scala. They can still do everything in the same way they used to, they have vars and mutable collections, they have all the standard Java libraries at their finger tips. And of course, they can start deepening their knowledge in functional programming. So Scala provides a smooth transition from imperative programming to functional programming, you can adopt functional programming as quickly or as slowly as you like.

Scala as the destination language

Having transitioned to functional programming, will developers stay at Scala, or will they move on, like they moved on from Java, in search of a more pure language? My opinion here is no. Broadly speaking, I see two camps in the functional programming community. The first camp sees functional programming as a set of laws that must be followed. For this camp, Scala has a lot of things that are not necessary and/or dangerous, and they would probably not see Scala as the end destination.

The second camp sees functional programming as a powerful tool that should be widely exploited, but not a set of laws that must be followed. This is where I stand, and Scala fills the needs of this camp very well. Functional programming has first class support in Scala, but you can always fall back to imperative when it makes sense. I suspect that the majority of the Java community would be inclined to join this camp, otherwise, they would already be shunning Java and writing Haskell.

So I think Java 8 Lambdas are going to be very good for Scala, in that they give Java developers a taste for what Scala will do for them, and thus funnel the masses into Scala development.

Understanding the Play Filter API

With Play 2.1 hot off the press, there have been a lot of people asking about the new Play filter API. In actual fact, the API is incredibly simple:

trait EssentialFilter {
  def apply(next: EssentialAction): EssentialAction
}

Essentially, a filter is just a function that takes an action and returns another action. The usual thing that would be done by the filter is wrap the action, invoking it as a delegate. To then add a filter to your application, you just add it to your Global doFilter method. We provide a helper class to do that for you:

object Global extends WithFilters(MyFilter) {
  ...
}

Easy right? Wrap the action, register it in global. Well, it is easy, but only if you understand Plays architecture. This is very important, because once you understand Play's architecture, you will be able to do far more with Play. We have some documentation here that explains Plays architecture at a high level. In this blog post, I'm going to explain Play's architecture in the context of filters, with code snippets and use cases along the way.

A short introduction to Plays architecture

I don't need to go in depth here because I've already provided a link to our architecture documentation, but in short Play's architecture matches the flow of an HTTP request very well.

The first thing that arrives when an HTTP request is made is the request header. So an action in Play therefore must be a function that accepts a request header.

What happens next in an HTTP request? The body is received. So, the function that receives the request must return something that consumes the body. This is an iteratee, which is a reactive stream handler, that eventually produces a single result after consuming the stream. You don't necessarily need to understand the details about how iteratees work in order to understand filters, the important thing to understand is that iteratees eventually produce a result that you can map, just like a future, using their map function. For details on writing iteratees, read my blog post.

The next thing that happens in an HTTP request is that the http response must be sent. So what is the result that of the iteratee? An HTTP response. And an HTTP response is a set of response headers, followed by a response body. The response body is an enumerator, which is a reactive stream producer.

All of this is captured in Plays EssentialAction trait:

trait EssentialAction extends (RequestHeader => Iteratee[Array[Byte], Result])

This reads that an essential action is a function that takes a request header and returns an iteratee that consumes the byte array body chunks and eventually produces a result.

The simpler way

Before I go on, I'd like to point out that Play provides a helper trait called Filter that makes writing filters easier than when using EssentialFilter. This is similar to the Action trait, in that Action simplifies writing EssentialAction's by not needing to worry about iteratees and how the body is parsed, rather you just provide a function that takes a request with a parsed body, and return a result. The Filter trait simplifies things in a similar way, however I'm going to leave talking about that until the end, because I think it is better to understand how filters work from the bottom up before you start using the helper class.

The noop filter

To demonstrate what a filter looks like, the first thing I will show is a noop filter:

class NoopFilter extends EssentialFilter {
  def apply(next: EssentialAction) = new EssentialAction {
    def apply(request: RequestHeader) = {
      next(request)
    }
  }
}

Each time the filter is executed, we create a new EssentialAction that wraps it. Since EssentialAction is just a function, we can just invoke it, passing the passed in request. So the above is our basic pattern for implementing an EssentialFilter.

Handling the request header

Let's say we want to look at the request header, and conditionally invoke the wrapped action based on what we inspect. An example of a filter that would do that might be a blanket security policy for the /admin area of your website. This might look like this:

class AdminFilter extends EssentialFilter {
  def apply(next: EssentialAction) = new EssentialAction {
    def apply(request: RequestHeader) = {
      if (request.path.startsWith("/admin") && request.session.get("user").isEmpty) {
        Iteratee.ignore[Array[Byte]].map(_ => Results.Forbidden())
      } else {
        next(request)
      }
    }
  }
}

You can see here that since we are intercepting the action before the body has been parsed, we still need to provide a body parser when we block the action. In this case we are returning a body parser that will simply ignore the whole body, and mapping it to have a result of forbidden.

Handling the body

In some cases, you might want to do something with the body in your filter. In some cases, you might want to parse the body. If this is the case, consider using action composition instead, because that makes it possible to hook in to the action processing after the action has parsed the body. If you want to parse the body at the filter level, then you'll have to buffer it, parse it, and then stream it again for the action to parse again.

However there are some things that can be easily be done at the filter level. One example is gzip decompression. Play framework already provides gzip decompression out of the box, but if it didn't this is what it might look like (using the gunzip enumeratee from my play extra iteratees project):

class GunzipFilter extends EssentialFilter {
  def apply(next: EssentialAction) = new EssentialAction {
    def apply(request: RequestHeader) = {
      if (request.headers.get("Content-Encoding").exists(_ == "gzip")) {
        Gzip.gunzip() &>> next(request)
      } else {
        next(request)
      }
    }
  }
}

Here using iteratee composition we are wrapping the body parser iteratee in a gunzip enumeratee.

Handling the response headers

When you're filtering you will often want to do something to the response that is being sent. If you just want to add a header, or add something to the session, or do any write operation on the response, without actually reading it, then this is quite simple. For example, let's say you wanted to add a custom header to every response:

class SosFilter extends EssentialFilter {
  def apply(next: EssentialAction) = new EssentialAction {
    def apply(request: RequestHeader) = {
      next(request).map(result => 
        result.withHeaders("X-Sos-Message" -> "I'm trapped inside Play Framework please send help"))
    }
  }
}

Using the map function on the iteratee that handles the body, we are given access to the result produced by the action, which we can then modify as demonstrated.

If however you want to read the result, then you'll need to unwrap it. Play results are either AsyncResult or PlainResult. An AsyncResult is a Result that contains a Future[Result]. It has a transform method that allows the eventual PlainResult to be transformed. A PlainResult has a header and a body.

So let's say you want to add a timestamp to every newly created session to record when it was created. This could be done like this:

class SessionTimestampFilter extends EssentialFilter {
  def apply(next: EssentialAction) = new EssentialAction {
    def apply(request: RequestHeader) = {

      def addTimestamp(result: PlainResult): Result = {
        val session = Session.decodeFromCookie(Cookies(result.header.headers.get(HeaderNames.COOKIE)).get(Session.COOKIE_NAME))
        if (!session.isEmpty) {
          result.withSession(session + ("timestamp" -> System.currentTimeMillis.toString))
        } else {
          result
        }
      }

      next(request).map {
        case plain: PlainResult => addTimestamp(plain)
        case async: AsyncResult => async.transform(addTimestamp)
      }
    }
  }
}

Handling the response body

The final thing you might want to do is transform the response body. PlainResult has two implementations, SimpleResult, which is for bodies with no transfer encoding, and ChunkedResult, for bodies with chunked transfer encoding. SimpleResult contains an enumerator, and ChunkedResult contains a function that accepts an iteratee to write the result out to.

An example of something you might want to do is implement a gzip filter. A very naive implementation (as in, do not use this, instead use my complete implementation from my play extra iteratees project) might look like this:

class GzipFilter extends EssentialFilter {
  def apply(next: EssentialAction) = new EssentialAction {
    def apply(request: RequestHeader) = {

      def gzipResult(result: PlainResult): Result = result match {
        case simple @ SimpleResult(header, content) => SimpleResult(header.copy(
          headers = (header.headers - "Content-Length") + ("Content-Encoding" -> "gzip")
        ), content &> Enumeratee.map(a => simple.writeable.transform(a)) &> Gzip.gzip())
      }

      next(request).map {
        case plain: PlainResult => gzipResult(plain)
        case async: AsyncResult => async.transform(gzipResult)
      }
    }
  }
}

Using the simpler API

Now you've seen how you can achieve everything using the base EssentialFilter API, and hopefully therefore you understand how filters fit into Play's architecture and how you can utilise them to achieve your requirements. Let's now have a look at the simpler API:

trait Filter extends EssentialFilter {
  def apply(f: RequestHeader => Result)(rh: RequestHeader): Result
  def apply(next: EssentialAction): EssentialAction = {
    ...
  }
}

object Filter {
  def apply(filter: (RequestHeader => Result, RequestHeader) => Result): Filter = new Filter {
    def apply(f: RequestHeader => Result)(rh: RequestHeader): Result = filter(f,rh)
  }
}

Simply put, this API allows you to write filters without having to worry about body parsers. It makes it look like actions are just functions of request headers to results. This limits the full power of what you can do with filters, but for many use cases, you simply don't need this power, so using this API provides a simple alternative.

To demonstrate, a noop filter class looks like this:

class NoopFilter extends Filter {
  def apply(f: (RequestHeader) => Result)(rh: RequestHeader) = {
    f(rh)
  }
}

Or, using the Filter companion object:

val noopFilter = Filter { (next, req) =>
  next(req)
}

And a request timing filter might look like this:

val timingFilter = Filter { (next, req) =>
  val start = System.currentTimeMillis

  def logTime(result: PlainResult): Result = {
    Logger.info("Request took " + (System.currentTimeMillis - start))
    result
  }

  next(req) match {
    case plain: PlainResult => logTime(plain)
    case async: AsyncResult => async.transform(logTime)
  }
}

Comments complement code

Today I read this quote:

Good code is its own best documentation. As you’re about to add a comment, ask yourself, ‘How can I improve the code so that this comment isn’t needed?’

I just want to say, it's a load of rubbish. Take a look at the following code:

def toCharArray(
     decoder: CharsetDecoder = Charset.forName("UTF-8").newDecoder()
  ): Enumeratee[Array[Byte], Array[Char]] = new Enumeratee[Array[Byte],Array[Char]] {

  def step[A](inner: Iteratee[Array[Char], A], partialChar: Option[Array[Byte]] = None)(in: Input[Array[Byte]]): 
      Iteratee[Array[Byte], Iteratee[Array[Char], A]] = {
    in match {
      case EOF => partialChar.map(_ => Error[Array[Byte]]("EOF encountered mid character", EOF))
        .getOrElse(Done[Array[Byte],Iteratee[Array[Char],A]](inner, EOF))

      case Empty => Cont(step(inner, partialChar))

      case El(data) => {
        val charBuffer = CharBuffer.allocate(data.length + 1)
        val byteBuffer = partialChar.map({ leftOver =>
          val buffer = ByteBuffer.allocate(leftOver.length + data.length)
          buffer.mark()
          buffer.put(leftOver).put(data)
          buffer.reset()
          buffer
        }).getOrElse(ByteBuffer.wrap(data))

        decoder.decode(byteBuffer, charBuffer, false)

        val leftOver = if (byteBuffer.limit() > byteBuffer.position()) {
          Some(byteBuffer.array().drop(byteBuffer.position()))
        } else None

        val decoded = charBuffer.array().take(charBuffer.position())
        val input = if (decoded.length == 0) Empty else El(decoded)

        inner.pureFlatFold {
          case Step.Cont(k) => Cont(step(k(input), leftOver))
          case _ => Done(inner, Input.Empty)
        }
      }
    }
  }

  def applyOn[A](inner: Iteratee[Array[Char], A]) = Cont(step(inner))
}

If you know iteratees and you know Scala, it's pretty obvious what this does. It converts a stream of byte arrays into a stream of char arrays, taking into the consideration the possibility that one character may be split across multiple byte arrays. Structurally it is purely functional, however the actual decoding is not, it uses the high performance Java CharBuffer and ByteBuffer classes to do the decoding, which are mutable, and arguably this is necessary since this enumeratee is a place where performance matters. I wrote it, and in my opinion it's not badly written, though if you can see anything that could be improved, please let me know.

So, tell me, on line 14, why do I allocate a char buffer of the incoming byte array length plus one? What is the reason for the plus one? When I first wrote it, I didn't have the plus one there, I didn't think it was needed. You see, when converting an array of bytes to an array of UTF-16 Java characters, at most, 8 bytes will become 8 characters, right? 8 bytes could become 4 characters, if those characters were multi byte characters, the number of chars needed might be less than the number of bytes being decoded, but it can never be more, right? One byte can't become multiple UTF-16 chars, so why would I ever need 9 characters for 8 bytes?

Now maybe you might criticise my code because the +1 is actually a magic number, and if I gave it a name, then that would explain everything. Well, let's give it a name, and reasonable a name (I could give it a two hundred character long name and that might explain everything but you can hardly call two hundred character long variable names good code. Well, maybe you can in Java, but not in Scala). So I'll create a val PotentialMultiCharOffset = 1. Does that help you at all? Do you know what it's for? Why is it 1? Why is it added, why don't I multiply by 2? If you do know the reason behind it, then hats off to you, you are a genius. But for the rest of us, we don't know. It as only after I wrote comprehensive unit tests for the code that I found the bug (I've heard other people say that unit tests are not necessary for functional code, another fallacy).

Let me show you the comment that is above that line of code:

// The +1 here is very important, it is there for the case when there are
// 3 bytes of a 4 byte character in the partialChar array, and so this data
// should contain the final byte, but that one byte will become 2 Chars.
val charBuffer = CharBuffer.allocate(data.length + 1)

Understand it now? Was there any way that I could have written the code that would have explained that? Was there any variable name that I could have given it that would have explained it better than that comment? No, it just needed a simple comment explaining its purpose. Without the comment, you'd be sitting there wondering why on earth I had added 1, you might have even thought "this is allocating more memory than needed, I'll just optimise this" and you would have injected a bug. In this case, a comment is aptly suited to making the code understandable. The comment complements the code. It is necessary and the best way of describing it.

And the fact is that we come across things every day where some really obscure edge case means we have to do some otherwise obscure behaviour. Maybe in a world of higher order logic this isn't the case, but we work in a world of far less than perfect protocols with edge cases that are impossible to memorise, where optimising an equals comparison to return early when you encounter the first character that isn't equal is a security vulnerability, where bugs in other software that our software has to interface to means we have to do counter intuitive things to work around their issues, and where some things are just plain hard to get your heard around, and sometimes a little plain English (or whatever language you speak) just does that little bit to helping you or the next developer make sense of it all.

Comments complement code. Good code does not negate the need of comments. Good code includes comments where comments are needed.

About

Hi! My name is James Roper, and I am a software developer with a particular interest in open source development and trying new things. I program in Scala, Java, Go, PHP, Python and Javascript, and I work for Lightbend as the architect of Kalix. I also have a full life outside the world of IT, enjoy playing a variety of musical instruments and sports, and currently I live in Canberra.