all that jazz

james' blog about scala and all that jazz

Benchmarking Scala against Java

A question recently came up at work about benchmarks between Java and Scala. Maybe you came across my blog post because you too want to know which is faster, Java or Scala. Well, I'm sorry to say this, but if that is you, you are asking the wrong question. In this post, I will show you that Scala is faster than Java. After that, I will show you why the question was the wrong question and why my results should be ignored. Then I will explain what question you should have asked.

The benchmark

Today we are going to choose a very simple algorithm to benchmark, the quick sort algorithm. I will provide implementations in both Scala and Java. Then with each I will sort a list of 100000 elements 100 times, and see how long each implementation takes to sort it. So let's start off with Java:

    public static void quickSort(int[] array, int left, int right) {
        if (right <= left) {
            return;
        }
        int pivot = array[right];
        int p = left;
        int i = left;
        while (i < right) {
            if (array[i] < pivot) {
                if (p != i) {
                    int tmp = array[p];
                    array[p] = array[i];
                    array[i] = tmp;
                }
                p += 1;
            }
            i += 1;
        }
        array[right] = array[p];
        array[p] = pivot;
        quickSort(array, left, p - 1);
        quickSort(array, p + 1, right);
    }
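Before we get to numbers, a note on how the timing is done. A harness along these lines is enough to reproduce the measurement (a sketch, not my exact benchmark code; in particular, sorting a fresh copy of the data each run is an assumption I'm making explicit, since sorting an already sorted array would hit quicksort's worst case):

```java
import java.util.Random;

public class Benchmark {

    // Same algorithm as the Java implementation above.
    static void quickSort(int[] array, int left, int right) {
        if (right <= left) {
            return;
        }
        int pivot = array[right];
        int p = left;
        for (int i = left; i < right; i++) {
            if (array[i] < pivot) {
                if (p != i) {
                    int tmp = array[p];
                    array[p] = array[i];
                    array[i] = tmp;
                }
                p++;
            }
        }
        array[right] = array[p];
        array[p] = pivot;
        quickSort(array, left, p - 1);
        quickSort(array, p + 1, right);
    }

    // Time `runs` sorts of `size` random ints, returning elapsed milliseconds.
    // A fresh copy is sorted each run, so every run does the same work.
    static long timeSorts(int size, int runs) {
        int[] data = new Random(42).ints(size).toArray();
        long start = System.nanoTime();
        for (int run = 0; run < runs; run++) {
            int[] copy = data.clone();
            quickSort(copy, 0, copy.length - 1);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println(timeSorts(100000, 100) + "ms");
    }
}
```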

Timing this, sorting a list of 100000 elements 100 times on my 2012 MacBook Pro with Retina Display, it takes 852ms. Now the Scala implementation:

  def sortArray(array: Array[Int], left: Int, right: Int) {
    if (right <= left) {
      return
    }
    val pivot = array(right)
    var p = left
    var i = left
    while (i < right) {
      if (array(i) < pivot) {
        if (p != i) {
          val tmp = array(p)
          array(p) = array(i)
          array(i) = tmp
        }
        p += 1
      }
      i += 1
    }
    array(right) = array(p)
    array(p) = pivot
    sortArray(array, left, p - 1)
    sortArray(array, p + 1, right)
  }

It looks very similar to the Java implementation, slightly different syntax, but in general, the same. And the time for the same benchmark? 695ms. No benchmark is complete without a graph, so let's see what that looks like visually:

So there you have it. Scala is about 20% faster than Java. QED and all that.

The wrong question

However this is not the full story. No micro benchmark ever is. So let's start off by answering the question of why Scala is faster than Java in this case. Scala and Java both run on the JVM. Their source code both compiles to bytecode, and from the JVM's perspective, it doesn't know if one is Scala or one is Java, it's all just bytecode to the JVM. If we look at the bytecode of the compiled Scala and Java code above, we'll notice one key thing: in the Java code, there are two recursive invocations of the quickSort routine, while in Scala, there is only one. Why is this? The Scala compiler supports an optimisation called tail call elimination (often referred to as tail recursion optimisation), where if the last statement in a method is a recursive call to that same method, it can get rid of that call and replace it with an iterative solution. So that's why the Scala code is so much quicker than the Java code, it's this tail call optimisation. You can turn this optimisation off when compiling Scala code; when I do that it takes 827ms, still a little bit faster but not much. I don't know why Scala is still faster without the optimisation.
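To see what that optimisation does, here is the Java quickSort from above with the tail call turned into a loop by hand, which is roughly the transformation the Scala compiler performs automatically:

```java
public class TailCallQuickSort {

    // Same algorithm as before, but the second (tail) recursive call has been
    // replaced with a loop: instead of calling quickSort(array, p + 1, right),
    // we update left and jump back to the top of the method.
    public static void quickSort(int[] array, int left, int right) {
        while (left < right) {
            int pivot = array[right];
            int p = left;
            for (int i = left; i < right; i++) {
                if (array[i] < pivot) {
                    if (p != i) {
                        int tmp = array[p];
                        array[p] = array[i];
                        array[i] = tmp;
                    }
                    p++;
                }
            }
            array[right] = array[p];
            array[p] = pivot;
            quickSort(array, left, p - 1); // not in tail position, stays recursive
            left = p + 1;                  // the tail call, now just a jump
        }
    }
}
```

The first recursive call isn't in tail position, so it stays a real call; only the last one can be eliminated.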

This brings me to my next point, apart from a couple of extra niche optimisations like this, Scala and Java both compile to bytecode, and hence have near identical performance characteristics for comparable code. In fact, when writing Scala code, you tend to use a lot of exactly the same libraries between Java and Scala, because to the JVM it's all just bytecode. This is why benchmarking Scala against Java is the wrong question.

But this still isn't the full picture. My implementation of quick sort in Scala was not what we'd call idiomatic Scala code. It's implemented in an imperative fashion, very performance focussed - which it should be, being code that is used for a performance benchmark. But it's not written in a style that a Scala developer would write day to day. Here is an implementation of quick sort that is in that idiomatic Scala style:

  def sortList(list: List[Int]): List[Int] = list match {
    case Nil => Nil
    case head :: tail =>
      sortList(tail.filter(_ < head)) ::: head :: sortList(tail.filter(_ >= head))
  }

If you're not familiar with Scala, this code may seem overwhelming at first, but trust me, after a few weeks of learning the language, you would be completely comfortable reading this, and would find it far clearer and easier to maintain than the previous solution. So how does this code perform? Terribly, is the answer: it takes 13951ms, 20 times longer than the other Scala code. Obligatory chart:

So am I saying that when you write Scala in the "normal" way, your code's performance will always be terrible? No. Scala developers don't write code like that all the time; they aren't dumb, they know the performance consequences of their code.

The key thing to remember is that most problems that developers solve are not quick sort, they are not computation heavy problems. A typical web application, for example, is concerned with moving data around, not doing complex algorithms. The Java code a web developer might write to process a web request might take 1 microsecond of the entire request to run - that is, one millionth of a second. If the equivalent Scala code takes 20 microseconds, that's still only one fifty thousandth of a second. The whole request might take 20 milliseconds to process, including going to the database a few times. Using idiomatic Scala code would therefore increase the response time by about 0.1%, which is practically nothing.

So Scala developers, when they write code, will write it in the idiomatic way. As you can see above, the idiomatic way is clear and concise. It's easy to maintain, much easier than Java. However, when they come across a problem that they know is computationally expensive, they will revert to writing in a style that is more like Java. This way, they have the best of both worlds: easy to maintain idiomatic Scala code for most of their code base, and well performing Java-like code where the performance matters.

The right question

So what question should you be asking, when comparing Scala to Java in the area of performance? The answer is in Scala's name. Scala was built to be a "Scalable language". As we've already seen, this scalability does not come in micro benchmarks. So where does it come from? This will be the topic of a future blog post, where I will show some closer to real world benchmarks of a Scala web application versus a Java web application, but to give you an idea, the answer comes in how the Scala syntax and the libraries provided by the Scala ecosystem are aptly suited to the paradigms of programming required to write scalable, fault tolerant systems. The exact equivalent bytecode could be implemented in Java, but it would be a monstrous nightmare of impossible to follow anonymous inner classes, with a constant fear of accidentally mutating the wrong shared state, and a good dose of race conditions and memory visibility issues.

To put it more concisely, the question you should be asking is "How will Scala help me when my servers are falling over from unanticipated load?" This is a real world question that I'm sure any IT professional with any sort of real world experience would love an answer to. Stay tuned for my next blog post.

Why does Chrome consider some ports unsafe?

Today in my Twitter feed I noticed a frustrated tweet from Dan North complaining about why Chrome seems to arbitrarily block connections to some ports, giving the confusing error code "net::ERR_UNSAFE_PORT". I've encountered this before, and have been similarly frustrated, so I decided to do a little research and work out what Google means by unsafe. There are a few discussions on Stack Overflow and in forums about exactly which ports are blocked, but I didn't find much about why they are considered unsafe.

My first assumption was that it was somehow unsafe for Chrome itself to connect to services on these ports, because maybe the service's protocol might confuse Chrome into doing something wrong. However, this didn't make sense, and I then worked out that it's not about protecting Chrome, but about protecting the services themselves running on these ports. Many internet protocols are designed to be very permissive as to what they accept, and this presents an interesting opportunity for attackers.

As web developers with a mind for security are well aware, the browser is incredibly obliging to attackers when it comes to making requests to servers on your behalf. XSRF is a perfect example of this, but the good news is, we know how to protect against XSRF. However, what if an attacker tricked a web browser into making a connection to something that doesn't speak HTTP? Let's take, for example, an SMTP server. If you've never seen an SMTP session before, here is an example session that I've ripped from Wikipedia:

S: 220 smtp.example.com ESMTP Postfix
C: HELO relay.example.org
S: 250 Hello relay.example.org, I am glad to meet you
C: MAIL FROM:<bob@example.org>
S: 250 Ok
C: RCPT TO:<alice@example.com>
S: 250 Ok
C: RCPT TO:<theboss@example.com>
S: 250 Ok
C: DATA
S: 354 End data with <CR><LF>.<CR><LF>
C: From: "Bob Example" <bob@example.org>
C: To: "Alice Example" <alice@example.com>
C: Cc: theboss@example.com
C: Date: Tue, 15 January 2008 16:02:43 -0500
C: Subject: Test message
C:
C: Hello Alice.
C: This is a test message with 5 header fields and 4 lines in the message body.
C: Your friend,
C: Bob
C: .
S: 250 Ok: queued as 12345
C: QUIT
S: 221 Bye

On first glance, this may look nothing like HTTP. Except for one important feature, like HTTP, SMTP separates elements of its protocol with newlines. Furthermore, what happens when we send data to an SMTP server that it doesn't understand? Well, here's a simple HTTP GET request with an example SMTP server:

C: GET / HTTP/1.1
S: 500 5.5.1 Command unrecognized: "GET / HTTP/1.1"
C: Host: www.example.com
S: 500 5.5.1 Command unrecognized: "Host: www.example.com"
C: 
S: 500 5.5.1 Command unrecognized: ""

So, obviously our mail server is confused, but an important thing to notice here is that it's not outright rejecting our connection; it's telling the client that there is an error, but continuing to accept commands. So can we exploit this? The answer is a resounding yes. What if I, an attacker, were to craft a page with a form that was automatically submitted (using a JavaScript onload event that invoked the submit method - similar to a form based XSRF attack) to an SMTP server? In order to speak SMTP with this server, I would need to be able to include new lines in my messages. This is not possible using application/x-www-form-urlencoded, since newlines are URL encoded, however using multipart/form-data, I can create a field that contains the client side messages in the above SMTP conversation, and so using my victim's web browser, I can submit an email to the target SMTP server. The SMTP server would ignore all the HTTP protocol as in the above example, including the method and headers, as well as the multipart/form-data header content, but once it got to my field, which would have my HELO, MAIL FROM, RCPT TO etc commands in it, it would start processing them.

But of course, an attacker could just connect directly to the SMTP server, right? For some SMTP servers, yes. However, many SMTP servers are protected by nothing more than a firewall, and especially if the SMTP server uses this as a security mechanism for verifying the authenticity of the source of the email, it may be desirable for an attacker to be able to send email through it. What this means is that an attacker could use a browser running behind the firewall that protects an SMTP server as a proxy to connect to that SMTP server, and so send messages using it.

As it turns out, some SMTP servers actually detect HTTP requests made on them and immediately close the connection (my server's SMTP server did this when I tried it). However, not all SMTP servers do this, because it wasn't envisioned that a piece of client software, such as a web browser, could be used as an open proxy to a protected network. And it's certainly not a good idea to assume that other types of services, such as FTP servers and even IRC servers, will do this too.

So, why does Chrome refuse to connect to some ports? Because the Google engineers have gone through the list of well known ports, worked out how tolerant the protocols that use those ports are to being sent HTTP requests, and if they are tolerant, marked them as unsafe and blocked them, to prevent Google Chrome from being an open proxy into a secured network. Should all web browsers do this? Probably.

My complaint against Haskell

I have nothing against Haskell. I've been getting my teeth stuck into some Scala over the recent months, and I am loving functional programming, immutable types, and am even starting to see what's so good about monads. So Haskell, with its reputation as a good pure functional language, is interesting to me. I did Haskell when I was at university. I didn't learn it; if I had learnt it, I would have understood it. I just did it. We learnt really useful things like doubling every integer in a list with a simple map function. Like curried and uncurried functions, like zipping, folding, reducing... all within the context of trivial contrived examples that did absolutely nothing to help us in the real world. In my 15 years of programming, I have only once ever needed to double every integer in a list, and that was in my Haskell lab at uni. Very useful.

My problem is not with Haskell. My problem is with the people that teach and promote Haskell. They have never presented me with an example of how Haskell can help me in the real world. And today I'd like to highlight this, by looking at the first piece of code you'll encounter if you read the documentation on Haskell on the Haskell website. If you visit that website, and you want to know what Haskell is, you'll very quickly notice the link "What is Haskell". Clicking on it, you'll be taken to a page, and as you scroll down, you'll come across your first piece of Haskell code. It's quicksort:

quicksort :: Ord a => [a] -> [a]
quicksort []     = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
        lesser  = filter (< p) xs
        greater = filter (>= p) xs

If you've ever implemented quicksort in a procedural language before, you'll notice that that code is very simple in comparison. It has some syntax you might not recognise, but it's not a soup of crazy characters like Perl, it looks ordered, and easy to read. It's impressive. But is it practical?

What is quicksort? The name implies that it is quick. So what's its running time? O(n²). Wait, hang on, that's not quick, there are faster algorithms out there, like heap sort, which runs in O(n log n) time. Well, it turns out the average running time of quicksort is O(n log n), but why would you use it over another algorithm that guarantees O(n log n) running time? The reason is that the actual implementation of quicksort is really simple from a computation perspective, it does very little per element of the array, making it (on average) about 3 times faster than heap sort or merge sort. This makes quicksort very useful: if you want an algorithm that sorts an array as fast as possible, at the cost of in some circumstances not being very fast, then quicksort is perfect for you. This is why quicksort is called quicksort. It may not mathematically have the best running time, but practically, it is quick.

This brings me to quicksort in Haskell. One of the reasons why quicksort, implemented in a procedural language, is so quick is that it does very close to the minimum necessary swap operations needed to sort the array. It sorts in place; it doesn't require building supplementary data structures, like the two separate arrays in merge sort or the maximum heap in heap sort. It plays very nicely with hardware optimisations like CPU caches, maximising cache hits regardless of the size of the array and cache. However, the implementation above in Haskell does use additional structures. It partitions the list by creating two new lists on each recursion of the algorithm, requiring many more memory writes than the in place implementation. It uses linked lists, which play very poorly with CPU caches. It still runs in O(n log n) on average, but it's now doing a lot more operations with very little help from hardware. This added overhead means that quicksort is no longer 3 times faster than merge or heap sort. Which means it now, on average, has similar run times to the O(n log n) algorithms, and still with the possibility of running in O(n²) for some inputs. It is no longer quick, and hence is not practical to use in the real world; you may as well use merge sort or heap sort.

To make matters worse, this implementation will always perform poorly on lists that are already k sorted, which is not an uncommon real world input to sort algorithms. A procedural implementation may also perform poorly on k sorted lists, but it's a simple change to make it perform well: just choose a random pivot, or a pivot from somewhere in the middle of the list. In Haskell, you can't do that without adding yet another overhead, because your lists are linked lists, and so accessing a random pivot takes O(n) time.
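For reference, here's what that simple change looks like in a procedural implementation, based on the Java quickSort from my benchmarking post (a sketch):

```java
import java.util.Random;

public class RandomPivotQuickSort {
    private static final Random random = new Random();

    public static void quickSort(int[] array, int left, int right) {
        if (right <= left) {
            return;
        }
        // The "simple change": swap a randomly chosen element into the pivot
        // position first, so already sorted input no longer degrades to n².
        int pivotIndex = left + random.nextInt(right - left + 1);
        int tmp = array[pivotIndex];
        array[pivotIndex] = array[right];
        array[right] = tmp;

        // The rest is the standard in place partition from before.
        int pivot = array[right];
        int p = left;
        for (int i = left; i < right; i++) {
            if (array[i] < pivot) {
                tmp = array[p];
                array[p] = array[i];
                array[i] = tmp;
                p++;
            }
        }
        array[right] = array[p];
        array[p] = pivot;
        quickSort(array, left, p - 1);
        quickSort(array, p + 1, right);
    }
}
```

In an array, that random access is O(1); in a linked list, finding the pivot alone costs O(n) per partition, which is the overhead the paragraph above describes.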

Now, none of this means that Haskell is a bad language. Merge sort in Haskell is also very simple, and is a very practical sorting algorithm in many cases, and so could have been used as an example instead. But why did the authors of this documentation choose a piece of code that is completely useless in the real world as the first taste of Haskell that many people will ever see? This is my complaint about Haskell, it's never been promoted to me in a way that bears any practical usefulness. I'm sure it's a great language. Pity about the people that promote it.

Hashbangs on the new Google Groups

Hash-bang URLs have been a contentious issue at my work over the past week, so much so that we completely got rid of them in the project that I'm working on. Coming out of the experience, I'm not convinced that hash-bangs are always a bad idea, however, there are some types of sites where they are very bad. Google groups I believe falls firmly into this category. And it's too late to do anything about it.

The major issue with hash-bangs is not that they're a hack - many people would have called dynamically generating documents on each request a hack when websites first started doing that almost 20 years ago. It's not that they're ugly. It's not that some users don't have Javascript enabled (HTML5 mandates various Javascript APIs; if it doesn't do Javascript, it doesn't do HTML - Javascript is a core building block of the web, get over it). The big issue is maintainability of URLs.

For some sites, this is irrelevant. Take a site that is your own personal dashboard for managing your own private things - a webmail site for example. Nothing is going to link to it, so there is no need to maintain URLs. For other sites, it's an issue, but it's not so important. I know the folks at Twitter are considering it a big issue at the moment, but really, if a link breaks to someone's 140 character word of wisdom... does it really matter?

But then there is a category of sites for which it is really important. These are the sites that maintain permanent records for future use. Mailing lists are one such example. I have an open source project called the mongo jackson mapper. It has a small user base, and I provide support to people when they ask questions. About a month ago, I noticed that I was getting a number of emails from people asking for help. I was more than happy to answer these, however, I was concerned that if one person has a question, then other people are likely to have the same question, and I would be doing a lot of repeat work answering people's questions. So, I set up a Google group for questions.

Now, when someone asks the same question twice, I can simply link to the old answer, no need to rewrite the answer. Furthermore, people on other sites, such as stack overflow, can link to my answers. The URL to that answer will stand as a link in the chain to a record in time that will help many people help themselves in future. However, the usefulness is dependent on that link still working in future. Let's have a look at an example link:

https://groups.google.com/forum/?fromgroups#!topic/mongo-jackson-mapper/AeFF5Cqqolw

As you can see, the ID of the message only appears after the hash-bang. This means it doesn't get sent to the server when you request the page; the only way Google can know what message is being requested is to serve up some Javascript that parses it out of the URL. And that's exactly what they do. But what happens, in future, when they decide to rewrite Google groups again? And again? And again... If the resource, https://groups.google.com/forum/?fromgroups, doesn't continue to serve up that Javascript, then all links to all posts that people have copied from their browser address bar and pasted elsewhere will break. The permanent record of networked help pages that people use to solve their every day problems will be broken.
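You can see the problem mechanically by just parsing that URL - the path and query are all the server ever sees, while the fragment, the part carrying the topic ID, never leaves the browser:

```java
import java.net.URI;

public class FragmentDemo {
    public static void main(String[] args) {
        URI uri = URI.create(
            "https://groups.google.com/forum/?fromgroups#!topic/mongo-jackson-mapper/AeFF5Cqqolw");
        // The request the browser sends is built from the path and query only:
        System.out.println(uri.getPath());     // /forum/
        System.out.println(uri.getQuery());    // fromgroups
        // The fragment is resolved client side and is never sent to the server:
        System.out.println(uri.getFragment()); // !topic/mongo-jackson-mapper/AeFF5Cqqolw
    }
}
```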

In contrast, if they didn't use hash-bang URLs, then they could make whatever changes to their URL structure that they liked, and all they have to do to maintain the URLs is implement a simple redirect on the server side. No need to serve Javascript up at a particular resource.
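Such a redirect is trivial to write. As a sketch, with nothing but the JDK's built-in HTTP server - and note that both the old and new URL structures here are made up for illustration:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import com.sun.net.httpserver.HttpServer;

public class LegacyRedirect {

    // Hypothetical mapping from an old URL scheme to a new one. The point is
    // that with real (non-fragment) paths the server sees the topic id, so a
    // rewrite like this is all that's needed to keep old links alive.
    static String newLocation(String oldPath) {
        // e.g. /forum/topic/mongo-jackson-mapper/AeFF5Cqqolw
        //   -> /g/mongo-jackson-mapper/t/AeFF5Cqqolw   (made-up new scheme)
        String prefix = "/forum/topic/";
        if (!oldPath.startsWith(prefix)) {
            return null;
        }
        String[] parts = oldPath.substring(prefix.length()).split("/");
        if (parts.length != 2) {
            return null;
        }
        return "/g/" + parts[0] + "/t/" + parts[1];
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/forum/topic/", exchange -> {
            String target = newLocation(exchange.getRequestURI().getPath());
            if (target != null) {
                exchange.getResponseHeaders().set("Location", target);
                exchange.sendResponseHeaders(301, -1); // permanent redirect, no body
            } else {
                exchange.sendResponseHeaders(404, -1);
            }
            exchange.close();
        });
        server.start();
    }
}
```

With hash-bang URLs, no such server side rewrite is possible, because the server never sees the part of the URL that identifies the resource.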

Google groups does have a permalink feature, but really, who uses that? It feels unnatural to me, even though I know and understand the difference between permalinks and what is in my address bar. When I want to copy a URL, without thinking, I copy it straight from the address bar, because this is what I always do for every other site.

So, Google groups, really? Hash-bangs? Are you not thinking of the future? Are you seriously committing to maintaining Javascript at arbitrary URLs for the lifetime of the internet? Unfortunately, the horse has bolted; these hash-bang links to Google groups are already starting to find themselves around the web, so you have no choice. And I'm guessing that someone in future will forget this, and so all those links will just break.

ScalaFlect: Statically typed query DSLs

I've been hard at work over the past few weeks redesigning my Mongo object mapper, the Mongo Jackson Mapper, to support Jackson 2.0, be simpler to use, and be far more powerful with fewer surprises for users. Something that has been at the front of my mind is how to make it nicer to use in Scala, and the biggest thing that I'd like to do here is implement the query and update APIs using a DSL that makes the queries look like ordinary code. But there's a problem. If you want your classes to be ordinary case classes, then when querying, you're going to have to refer to the properties using strings. Strings that are very easy to make typos in, strings that don't get updated automatically when you refactor, strings that don't throw compile errors if they are wrong, and, in the case of MongoDB queries, strings that don't even throw runtime errors if they are wrong. Scala has no support for referring to a field itself, rather than its value, so unfortunately, strings are the only option. That is, until now.

Today I started a new open source project called ScalaFlect. It provides exactly what every query DSL implementor has been wanting: statically typed reflection. It allows you to reference a field or a method, with the result that you get the java.lang.reflect.Field/Method associated with it. Before I go into the details of how I did this, let's first look at an example DSL that I've written to demonstrate the capabilities so far.

Example DSL usage

So to start off, I have some case classes, a BlogPost, and a Comment, with a one to many relationship between blog posts and comments:

case class BlogPost(author: String, title: String, published: Boolean,
  tags: Set[String], comments: List[Comment])
case class Comment(author: String, text: String)

Now I have implemented my DSL by implementing an abstract class that provides the basic elements of my DSL. This class is called ObjectListDao, and as the name suggests, it stores objects in a list. But it could just as easily be storing objects in a database like MongoDB. Let's look at a DAO that uses this DSL:

class BlogPostDao extends ObjectListDao(classOf[BlogPost]) {
  import au.id.jazzy.scalaflect.ScalaFlect.toTraverser

  def findByAuthor(author: String) = {
    query(
      $(_.author) %= author
    )
  }

  def findPublishedByTag(tag: String) = {
    query(
      $(_.tags.$) %= tag,
      $(_.published) %= true
    )
  }

  def findPostsCommentedOnByAuthor(author: String) = {
    query(
      $(_.comments.$.author) %= author
    )
  }
}

So the first thing you notice is the static access to properties. The $ method accepts a function that takes a parameter of type BlogPost, and the function should return the property that you want to reference. You can also see that by using the $ operator on a collection, you can refer to properties in that collection's element type.

Example DSL definition

So how have I implemented my DSL? Quite simply. Here is the code:

class ObjectListDao[T](val clazz: Class[T]) {
  private val scalaFlect = new ScalaFlect(clazz)
  private val list = new ListBuffer[T]

  def $[R](f : T => R): QueryField[R] = new QueryField(scalaFlect.reflect(f))
  
  def query(exps: QueryExpression*) = {
    var results = list.toList
    for (exp <- exps) {
      results = results.filter(exp.matches(_))
    }
    results.toList
  }

  def save(obj: T) {
    list += obj
  }
}

class QueryField[R](val field: Member[R]) {
  def %=(value: R): QueryExpression = new QueryExpression {
    def matches(obj: Any) = !(field.get(obj) filter {v: Any => v == value} isEmpty)
  }
}

trait QueryExpression {
  def matches(obj: Any): Boolean
}

Here we have an instance of ScalaFlect for our class (in the example usage, this was BlogPost). The $ method simply invokes reflect() on this, which is what does the reflection. reflect() returns a Member. A Member is a reference to a series of methods and fields that were invoked by the reflection function. Using it, you can get access to the Java reflection API classes that represent these methods and fields.

The other thing to note is the implementation of matches. This is where the Member class is actually used. If this was a DSL for something like MongoDB, then at this point the Member and its parents would be inspected to build a query object, which would then be sent to the database. In our case though we're just operating on a list of objects that we already have in memory, and Member provides a convenient get function that gets all the values of that property for a given object. Note that I said values, not value: ScalaFlect allows stepping into collections using the $ operator, so if the reference goes through a collection, then multiple values might be returned.

Implementation

At this point, especially if you've thought about how to do this before, you're probably wondering how on earth I've implemented this. The passed in reflection function is never executed. An important thing to remember in Scala is that functions are actually classes themselves. ScalaFlect simply takes the function's class, decompiles it, finds the apply method, and then does some fairly simple analysis on the code to find out what properties were referenced. By fairly simple, I mean I wrote it in 2 hours. The result is cached, so each function is only ever analysed once.

Ok, so this sounds like a hack, and it is. And it certainly has its limitations. The behaviour if you put complex code into your reflection function is completely undefined (it will most likely throw an exception, in future I'll add some decent error handling). However, it's a hack that will work well - there is nothing magic about how a Scala function that invokes some methods/accesses some fields on the passed in argument and returns the result is compiled down to bytecode. I don't expect that anything will be added to Scala in future that might break this.

One limitation that I do anticipate though is working with hot swapped code. I imagine that if you update a query and hotswap it into a running JVM, there will be no way for ScalaFlect to either detect that, or even analyse it if it could. This is because the way ScalaFlect decompiles the bytecode is by loading the class from the reflection function's classloader. I actually have no idea how hot swapping works, but I suspect that it doesn't update the bytecode that the classloader sees. The same goes for any runtime generated and enhanced code, but maybe that issue is not as likely to ever cause a problem.
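As an aside for comparison: in later versions of Java (8 and up), a similar effect can be achieved without any decompilation, by abusing the serialization hook the compiler generates for serializable lambdas. This is not how ScalaFlect works, just a sketch of a related technique, with a hypothetical BlogPost class for illustration:

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Function;

public class PropertyRefs {

    // A Function that is also Serializable, so the compiler records how the
    // lambda was constructed and we can recover the referenced method name.
    public interface PropertyRef<T, R> extends Function<T, R>, Serializable {}

    public static <T, R> String propertyName(PropertyRef<T, R> ref) {
        try {
            // Serializable lambdas get a synthetic writeReplace method that
            // returns a SerializedLambda describing the original method reference.
            Method writeReplace = ref.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            SerializedLambda lambda = (SerializedLambda) writeReplace.invoke(ref);
            return lambda.getImplMethodName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Not a method reference?", e);
        }
    }

    // Hypothetical domain class, mirroring the BlogPost case class above.
    public static class BlogPost {
        private final String author;
        public BlogPost(String author) { this.author = author; }
        public String getAuthor() { return author; }
    }
}
```

So `propertyName(BlogPost::getAuthor)` yields the method name statically, with refactoring and compile-time checking intact.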

The future

I read somewhere that Scala 2.10 is getting improved reflection, but I don't know what that means. The need for this library would be completely removed if Scala added language level features for referring to a field, and not its value, in code. My ideal outcome from implementing ScalaFlect is that the value of such a feature will be realised, and hence added to Scala. In the meantime, ScalaFlect will work nicely at giving DSL writers statically typed reflection.

So check out the library on GitHub, and feel free to comment on my approach. I haven't cut a release yet, but I'll get to that and deploy a beta to Maven central in the coming days (depending on how fast Sonatype processes my request to deploy to my domain's groupId).

Update

It's just been pointed out to me that Scala 2.10 will (might?) have macro support, which would achieve this (and potentially much more) in a much cleaner fashion. This is great news for writing query DSLs. I don't know though what the state of the macro support is; it's marked as experimental and doesn't seem to be mentioned in any Scala 2.10 feature lists. So maybe ScalaFlect will have some time to serve some useful purpose yet.

About

Hi! My name is James Roper, and I am a software developer with a particular interest in open source development and trying new things. I program in Scala, Java, Go, PHP, Python and Javascript, and I work for Lightbend as the architect of Kalix. I also have a full life outside the world of IT, enjoy playing a variety of musical instruments and sports, and currently I live in Canberra.