all that jazz

james' blog about scala and all that jazz

ScalaFlect: Statically typed query DSLs

I've been hard at work over the past few weeks redesigning my Mongo object mapper, the Mongo Jackson Mapper, to support Jackson 2.0, be simpler to use, and be far more powerful with less surprises to users. Something that has been at the front of my mind is how to make it nicer to use in Scala, and the biggest thing that I'd like to do here is implement the query and update APIs using a DSL that makes the queries look like ordinary code. But there's a problem. If you want your classes to be ordinary case classes, then when querying, you're going to have to refer to the properties using strings. Strings that are very easy to make typos in, strings that don't get updated automatically when you refactor, strings that don't throw compile errors if they are wrong, and in the case of MongoDB queries, strings that don't even throw runtime errors if they are wrong. Scala has no support for referring to a field, rather than its value, so unfortunately, strings are the only option. That is, until now.

Today I started a new open source project called ScalaFlect. It provides exactly what every query DSL implementor has been wanting, statically typed reflection. It allows you to reference a field or a method, and the result being that you get the java.lang.reflect.Field/Method associated with it. Before I go into the details of how I did this, let's first look at an example DSL that I've written to demonstrate the capabilities so far.

Example DSL usage

So to start off, I have some case classes, a BlogPost, and a Comment, with a one to many relationship between blog posts and comments:

case class BlogPost(author: String, title: String, published: Boolean,
  tags: Set[String], comments: List[Comment])
case class Comment(author: String, text: String)

Now I have implemented my DSL by implementing an abstract class that provides the basic elements of my DSL. This class is called ObjectListDao, and as the name suggests, it stores objects in a list. But it could just as easily be storing objects in a database like MongoDB. Let's look at a DAO that uses this DSL:

class BlogPostDao extends ObjectListDao(classOf[BlogPost]) {
  import au.id.jazzy.scalaflect.ScalaFlect.toTraverser

  def findByAuthor(author: String) = {
    query(
      $(_.author) %= author
    )
  }

  def findPublishedByTag(tag: String) = {
    query(
      $(_.tags.$) %= tag,
      $(_.published) %= true
    )
  }

  def findPostsCommentedOnByAuthor(author: String) = {
    query(
      $(_.comments.$.author) %= author
    )
  }
}

So the first thing you notice is the static access to properties. The $ method accepts a function that accepts a parameter of type BlogPost, and the function should return the property that you want to reference. You can also see that using the $ operator on a collection, you can refer to properties in that collections type.

Example DSL definition

So how have I implemented my DSL? Quite simply. Here is the code:

class ObjectListDao[T](val clazz: Class[T]) {
  private val scalaFlect = new ScalaFlect(clazz)
  private val list = new ListBuffer[T]

  def $[R](f : T => R): QueryField[R] = new QueryField(scalaFlect.reflect(f))
  
  def query(exps: QueryExpression*) = {
    var results = list.toList
    for (exp <- exps) {
      results = results.filter(exp.matches(_))
    }
    results.toList
  }

  def save(obj: T) {
    list += obj
  }
}

class QueryField[R](val field: Member[R]) {
  def %=(value: R): QueryExpression = new QueryExpression {
    def matches(obj: Any) = !(field.get(obj) filter {v: Any => v == value} isEmpty)
  }
}

trait QueryExpression {
  def matches(obj: Any): Boolean
}

Here we have an instance of ScalaFlect for our class (in the example usage, this was BlogPost). The $ method simply invokes reflect() on this, which is what does the reflection. reflect() returns a Member. A Member is a reference to a series of methods and fields that were invoked by the reflection function. Using it, you can get access to the Java reflection API classes that represent these methods and fields.

The other thing to note is the implementation of matches. This is where the Member class is actually used. If this was a DSL for something like MongoDB, then at this point the Memmber and its parents would be inspected to build a query object, which would then be sent to the database. In our case though we're just operating on a list of objects that we already have in memory, and Member provides a convenient get function that gets all the values of that property for a given object. Note that I said values, not value, ScalaFlect allows stepping into collections using the $ operator, so if the reference goes through a collection, then multiple values might be returned.

Implementation

At this point, especially if you've thought about how to do this before, you're probably wondering how on earth I've implemented this. The passed in reflection function is never executed. An important thing to remember in Scala is that functions are actually classes themselves. ScalaFlect simply takes the function class, decompiles it, finds the apply method, and then does some fairly simple analysis on the code to find out what properties were referenced. By fairly simple, I mean I wrote it in 2 hours. The result is cached so each function is only ever analysed once.

Ok, so this sounds like a hack, and it is. And it certainly has its limitations. The behaviour if you put complex code into your reflection function is completely undefined (it will most likely throw an exception, in future I'll add some decent error handling). However, it's a hack that will work well - there is nothing magic about how a Scala function that invokes some methods/accesses some fields on the passed in argument and returns the result is compiled down to bytecode. I don't expect that anything will be added to Scala in future that might break this.

One limitation that I do anticipate though is working with hot swapped code. I imagine, if you update a query, and hotswap it into a running JVM, there will be no way for ScalaFlect to either detect that, or even analyse it if it could. This is because the way ScalaFlect decompiles the bytecode is by loading the class from the reflection functions classloader. I actually have no idea how hot swapping works, but I suspect that it doesn't update the bytecode that the classloader sees. The same goes for any runtime generated and enhanced code, but maybe that issue is not as likely to ever cause a problem.

The future

I read somewhere that Scala 2.10 is getting improved reflection, but I don't know what that means. The need for this library can be completely removed if Scala added language level features for referring to a field, and not it's value, in code. My ideal outcome from implementing ScalaFlect is that the value of such a feature will be realised, and hence added to Scala. In the meantime, ScalaFlect will work nicely at giving DSL writers statically typed reflection.

So check out the library on GitHub, and feel free to comment on my approach. I haven't cut a release yet, but I'll get to that and deploy a beta to Maven central in the coming days (depending on how fast Sonatype will process my request to deploy to my domains groupId).

Update

It's just been pointed out to me that Scala 2.10 will (might?) have macro support, which would achieve this (and potentially much more) in a much cleaner fashion. This is great news for writing query DSLs. I don't know though what the state of the macro support is, it's marked as experimental and doesn't seem to be really mentioned in any Scala 2.10 feature lists. So maybe ScalaFlect might have some time to serve some useful purpose yet.

Three things I had to learn about Scala before it made sense

Last year I sat down in a meeting room with a few other colleagues to learn from one of Atlassian's biggest Scala advocates about Scala. I kept on hearing good stuff about Scala, and I like trying new things, so I was eager to have my eyes opened so that I could learn how awesome this thing was. Unfortunately, I was disapponted. Most of the lesson went straight over my head. But to my dismay, the other people in the room seemed to get it. It was on this day that I realised an interesting thing about my learning style. I remembered back at uni, I was exactly the same in lectures... I never got anything I was taught in lectures. Which is probably why by the end of uni, I had pretty much stopped attending all CS lectures. It wasn't until I was given a real problem in a lab that I had to solve that I got it. This is why I never really got functional languages at uni, the lecturers that taught functional languages at my uni weren't the most practical of people. Actually, I don't think they ever even considered whether their work had any practical implications at all. So they taught things like map, reduce, filter, fold, curried functions etc like this "Look, here's something you've never heard of! Behold its awesomeness! Now here's a trivial mathematical use case with no apparent real world application! Look at how practical it is!"

Some people have no problems learning this way, and I envy them. For me, I need to have a problem that I understand and want to solve today, that the thing that I'm learning can solve, in order to learn it. So if you're reading this and thinking "That is so me!", and you want to learn Scala, then hopefully this is the blog post for you. I'm going to start with a practical problem that hopefully you will understand and want to solve, and then show you how three particular Scala features allow solving this really elegantly and simply. Hopefully after that, you'll have a good understanding of those features in Scala, and be able to understand a lot more of Scala code when you encounter it.

The problem

The problem that I want to solve is dependency management. If you've ever used maven, you'll know how verbose this can be. For those that haven't encountered maven, here's what specifying a single dependency looks like:

<dependency>
    <groupId>net.vz.mongodb.jackson</groupId>
    <artifactId>mongo-jackson-mapper</artifactId>
    <version>1.4.1</version>
</dependency>

In English, this means a dependency on the artifact mongo-jackson-mapper version 1.4.1 from the organisation net.vz.mongodb.jackson. The English explanation is shorter than the code! Maybe the Maven authors could have made things nicer, by placing it on one line, and that would be clearer too, for example, something like this:

<dependency>net.vz.mongodb.jackson:mongo-jackson-mapper:1.4.1</dependency>

There are however some advantages to the more verbose model, the language (XML) that the DSL (maven pom.xml format) uses is able to fully express every element of the dependency, rather than the elements being defined in some other format XML is unaware of. There are also other attributes (such as scope and classifier) that the more verbose solution can easily unambiguously specify, these become very hard if it's just an arbitrary string of characters.

So here is the problem. I want a DSL to specify dependencies with as little noise as possible from the DSL, so that my dependency specification is short, unambiguous, and its format should be enforced by the language it's specified in. The language I'm going to use is Scala.

And just to ground this problem even more solidly in reality, this example is not contrived. SBT, the Scala Build Tool, uses Scala to specify its build configuration, including dependencies. The solution I'm going to show you is the solution that SBT uses. So by the end of this you will also understand a little about how to use SBT, which you will no doubt need to know as you delve into the world of Scala.

How would we do it with no weird Scala tricks?

First of all I will give a solution that uses language constructs that you should already be familiar with, using classes, fields, methods etc. If the following syntax makes no sense to you, you should probably do some reading on basic Scala syntax.

def groupId(groupId: String) = new GroupId(groupId)

class GroupId(val groupId: String) {
  def artifact(artifactId: String) = new Artifact(groupId, artifactId)
}

class Artifact(val groupId: String, val artifactId: String) {
  def version(version: String) = new VersionedArtifact(groupId, artifactId, version)
}

class VersionedArtifact(val groupId: String, val artifactId: String, val version: String) {
}

Here we have three classes, a GroupId, which has a method for creating an Artifact from that group id, which in turn has a method for creating a VersionedArtifact, which captures our dependency. There's also a factory method for creating the GroupId. To use it, we can do this:

groupId("net.vz.mongodb.jackson")
    .artifact("mongo-jackson-mapper")
    .version("1.4.1")

Already it's looking better than the XML version, but this could be done in Java too. Scala has a lot more to offer yet. As we go through the following lessons, it's important to remember that each change we make, we are still basically achieving the same as the above.

Lesson One: Scala methods can be made of operator characters

To Java developers such as myself, this is a bit weird, though in other languages this is quite common. There's some strict rules in Scala over exactly what a method name can be, but one of them is that it can be made of just operator characters (ie, +, =, *, /, %, -). So, the first change that we're going to make to our above code is to get rid of the method names. It makes things more concise, but other than that it may seem like a weird first step. Bear with me.

def groupId(groupId: String) = new GroupId(groupId)

class GroupId(val groupId: String) {
  def %(artifactId: String) = new Artifact(groupId, artifactId)
}

class Artifact(val groupId: String, val artifactId: String) {
  def %(version: String) = new VersionedArtifact(groupId, artifactId, version)
}

class VersionedArtifact(val groupId: String, val artifactId: String, val version: String) {
}

And so now our code looks like this:

groupId("net.vz.mongodb.jackson").%("mongo-jackson-mapper").%("1.4.1")

Lesson Two: Scala lets you call methods without dots or braces

There are some strict rules governing this, including what happens when obvious ambiguities arise, and you can read about that in your own research. Simply put, if a method has only one parameter, you can call it without using the dot or the braces, rather replace them with whitespace:

groupId "net.vz.mongodb.jackson" % "mongo-jackson-mapper" % "1.4.1"

This is called "infix" notation, which you may have also come across in other languages. We're looking much more concise now, yet the language is still enforcing the rules of what a dependency needs, and we still end up with a strongly typed VersionedArtifact. But we're not finished yet.

Lesson Three: Scala lets you implicitly convert any type to any other type

For me, this was the weirdest feature of Scala when I first learnt it. It sounded so cool and dangerous at the same time. Most languages, when you call a method on an object, if that method doesn't exist on that object, will throw an error. For the statically typed languages, this will be a compile error, for dynamic languages it will be a runtime error. However, Scala is a bit different. When it can't find a method on an object (at compile time, because Scala is statically typed), it has an extra step before it gives up with an error.

Scala has a concept of implicit methods (and fields, but right now let's only worry about methods). When a method that you call doesn't exist on the object that you're trying to invoke it on, it will check the current scope for any implicit methods that take the type of your object as an argument. If the implicit methods result has a method that matches the method you are trying to invoke, then Scala wraps your object in that implicit method call. Have I lost you? Let's look at some code:

implicit def groupId(groupId: String) = new GroupId(groupId)

So the only change I've made to the groupId method is that it is now implicit. When I define my dependency, I do this:

"net.vz.mongodb.jackson" % "mongo-jackson-mapper" % "1.4.1"

Notice that the call to groupId has completely disappeared. If at this point our newly learned Scala syntax is a bit overwhelming for you, maybe it would help to see what would happen if we applied the implicit rule first, before we got rid of our dots and braces, and before we changed our method names to percent signs:

"net.vz.mongodb.jackson".artifact("mongo-jackson-mapper").version("1.4.1")

What it looks like is that the String class has a method called artifact. But we know it doesn't, and the Scala compiler knows it doesn't. When it sees that, it says "are there any implicit methods available that accept a String as an argument, and return a type that has an artifact method?" And so it finds the groupId method, which matches that criteria, and converts the code to this:

groupId("net.vz.mongodb.jackson").artifact("mongo-jackson-mapper").version("1.4.1")

When I first learnt about this feature, I thought that's going to cause massive amounts of confusion. And, if used irresponsibly, it certainly can. But when used carefully, it provides a massive amount of power, for example, in creating DSLs like the one above, or for adding methods to third party/legacy libraries.

On a side note, Scala actually uses this to add functional methods to the Java collections API, by providing implicit conversion methods to convert Java collections to Scala collections. The confusion that could arise can be avoided by being careful to only import the implicit methods you need, where you need them. Scala allows you to put an import statement anywhere in code, so if you have one method in a class that needs to work with Java collections, you can import the implicit collection conversions into just that block of code. In the case of DSLs, it's usually the case that the use of these are separated from the rest of your code by normal encapsulation, so a DSL for accessing a database would only appear in a DAO, where you're expecting it to appear, so there would be no confusion.

Putting it into context

There is one last step, and that is to show what we actually do with the dependency we've declared. In maven, our XML ends up being part of the rest of the overly verbose POM file. In SBT, there is a file called build.sbt that contains snippets of Scala, which build up the configuration. Available to the snippets are certain variables, for example, libraryDependencies, which contain the configuration. So adding my dependency to my SBT project means adding a line to the build.sbt file, like this:

libraryDependencies += "net.vz.mongodb.jackson" % "mongo-jackson-mapper" % "1.4.1"

You may notice here that there is another use of infix operator named methods, += is a method defined in mutable collections for adding elements. Multiple dependencies can be added more tersely like this:

libraryDependencies ++= Seq(
    "net.vz.mongodb.jackson" % "mongo-jackson-mapper" % "1.4.1",
    "net.vz.mongodb.jackson" % "play-mongo-jackson-mapper" % "1.0.0"
)

So there we have it. We have a very terse way of specifying dependencies, that is strongly typed, and so validated at compile time. I hope that the problem I have presented and solved is a problem that you have understood and can relate to, and that you've understood how Scala can be used to help solve it, and that in learning how Scala helps solve it, you've learnt and understood some important features of the Scala language that you're now keen to apply to other problems that you have.

Open Source Developers

Since everyone's doing it, I thought I'd do it too. My contribution to the What people think I do meme:

Cracking Random Number Generators - Part 4

In Part 3 of this series, we investigated the Mersenne Twister, and saw how with 624 consecutive integers obtained from it, we can predict every subsequent integer it will produce. In this part, we will look at how to calculate previous integers that it has produced. We will also discuss how we might go about determining the internal state of the Mersenne Twister if we are unable to get 624 consecutive integers.

Breaking it down

Before we go on, let's look again at the algorithm for generating the next value for state[i]:

int y = (state[i] & 0x80000000) + (state[(i + 1) % 624] & 0x7fffffff);
int next = y >>> 1;
if ((y & 1L) == 1L) {
  next ^= 0x9908b0df;
}
next ^= state[(i + 397) % 624];

We can see that there are 3 numbers from the previous state that are involved here, the old state[i], state[(i + 1) % mod 624], and state[(i + 397) % mod 624]. So, taking these 3 numbers, let's have a look at what this looks like in binary:

1. 11100110110101000100101111000001 // state[i]
2. 10101110111101011001001001011111 // state[i + 1]
3. 11101010010001001010000001001001 // state[i + 397]

// y = state[i] & 0x80000000 | state[i + 1] & 0x7fffffff
4. 10101110111101011001001001011111 // y
5. 01010111011110101100100100101111 // next = y >>> 1
6. 11001110011100100111100111110000 // next ^= 0x9908b0df
7. 00100100001101101101100110111001 // next ^= state[i + 397]

If we work backwards from i = 623, then the pieces of information from the above equation is the end result (7), state[i + 1] and state[i + 397]. Starting from the result, the easiest step to unapply is the last one, undoing an xor is as simple as applying that same xor again. So we can get from 7 to 6 by xoring with state[i + 397].

To get from 6 to 5 depends on whether y was odd, if it wasn't, then no operation was applied. But we can also see from the bitshift to the right applied from 4 to 5, that the first bit at 5 will always be 0. Additionally, the first bit of the magic number xored at 6 is 1. So, if the first bit of the number at 6 is 1, then the magic number must have been applied, otherwise it hasn't. Hence, we can conditionally unapply step 5.

At this point, we have the first bit of the old state[i] calculated, in addition to the middle 30 bits of state[i + 1] calculated. We can also infer the last bit of state[i + 1], it is the same as the last bit of y, and if y was odd, then the magic number was applied at step 6, otherwise it wasn't. We've already worked out whether the magic number was applied at step 6, so if it was, the last bit of state[i + 1] was 1, or 0 otherwise.

However, as we work backwards through the state, we will already have calculated state[i + 1], so determining its last 31 bits is not useful to us. What we really want is to determine the last 31 bits of state[i]. To do this we can apply the same transformations listed above to state[i - 1]. This will give us the last 31 bits of state[i].

Putting it all together, our algorithm looks like this:

for (int i = 623; i >= 0; i--) {
  int result = 0;
  // first we calculate the first bit
  int tmp = state[i];
  tmp ^= state[(i + 397) % 624];
  // if the first bit is odd, unapply magic
  if ((tmp & 0x80000000) == 0x80000000) {
    tmp ^= 0x9908b0df;
  }
  // the second bit of tmp is the first bit of the result
  result = (tmp << 1) & 0x80000000;

  // work out the remaining 31 bits
  tmp = state[(i - 1 + 624) % 624];
  tmp ^= state[(i + 396) % 624];
  if ((tmp & 0x80000000) == 0x80000000) {
    tmp ^= 0x9908b0df;
    // since it was odd, the last bit must have been 1
    result |= 1;
  }
  // extract the final 30 bits
  result |= (tmp << 1) & 0x7fffffff;
  state[i] = result;
}

Dealing with non consecutive numbers

What happens, if when collecting 624 numbers from the application, that some other web request comes in at the same time and obtains a number. Our 624 numbers won't be consecutive. Can we detect that, and what can we do about it?. Detecting it is simple, having collected 624, we can predict the 625th, if it doesn't match the next number, then we know we've missed some.

Finding out how many numbers we've missed is the next task. Let's say we only missed the 624th number. This is fairly straight forward to detect, we would find that our state[623] is equal to what we were expecting to be the new state[0]. We would then know that we've missed one number, and by continuing to extract numbers from the application, and comparing that with the results we were expecting, we can narrow down which one.

A generalised algorithm for doing this is beyond the scope of these blog posts. But it should be clear from the reverse engineering of the steps that if most of the values are correct, but only a few are missing, determining what they were will be a fairly simple process.

We now know how to determine the internal state and how to go backwards as well as forwards in two of the most popular PRNG algorithms. In Part 5, we will look at what developers can do to ensure that their applications are safe against PRNG attacks.

Cracking Random Number Generators - Part 3

In Part 1 and Part 2 of this series we focussed on one of the simplest PRNG's, the linear congruential PRNG. We looked at detail into Java's implementation, and then wrote algorithms to crack the seed, and to calculate previous seeds from the current seed.

Not many other languages use a linear congruential PRNG. Originally, the C and C++ implementations of rand() were very simple linear congruential PRNGs, but these days they use more complex algorithms to generate higher quality random numbers. PHPs rand() function delegates to the platforms rand() function, so depending on the platform, it may be a linear congruential PRNG.

A much more popular PRNG algorithm is the Mersenne Twister. This is used by Ruby's rand() function, Pythons random module, and PHP's mt_rand() function.

The Mersenne Twister was invented in 1997. The most common implementation of it uses an internal state of 624 32 bit words (integers). These integers are handed out sequentially, applying a transform algorithm to each one before handing them out. Once all 624 integers have been handed out, an algorithm is applied to the state, to get the next 624 integers.

Generating the next state

The algorithm for generating the next state is simple, it heavily uses bitwise operations such as shifts, masks and xors. The algorithm is as follows:

int[] state;
// Iterate through the state
for (i = 0; i < 624; i++) {
  // y is the first bit of the current number,
  // and the last 31 bits of the next number
  int y = (state[i] & 0x80000000) + (state[(i + 1) % 624] & 0x7fffffff);
  // first bitshift y by 1 to the right
  int next = y >>> 1;
  // xor it with the 397th next number
  next ^= state[(i + 397) % 624];
  // if y is odd, xor with magic number
  if ((y & 1L) == 1L) {
    next ^= 0x9908b0df;
  }
  // now we have the result
  state[i] = next;
}

Obtaining the next number

Before handing the next integer out, a transform is applied to each integer. This transform has 4 steps, each step involves xoring the number with a bit shifted and bit masked version of itself. The algorithm is as follows:

currentIndex++;
int tmp = state[currentIndex];
tmp ^= (tmp >>> 11);
tmp ^= (tmp << 7) & 0x9d2c5680;
tmp ^= (tmp << 15) & 0xefc60000;
tmp ^= (tmp >>> 18);
return tmp;

Cracking the Mersenne Twister

Since the Mersenne Twister contains 624 integers of internal state that it hands out sequentially, obtaining 624 consecutive integers from the generator is the first step in cracking it. In a common webapp, this might be done by issuing 624 requests, and recording the token. If we can then undo the transform applied to these integers, we will then have the complete internal state. Having calculated that, we can predict every subsequent number.

In order to undo the transform, we will take it one step at a time, in reverse order. The step of the transform we need to understand first is therefore:

tmp ^= (tmp >>> 18);

Let's look at what this looks like in binary:

10110111010111100111111001110010                    tmp
00000000000000000010110111010111100111111001110010  tmp >>> 18
10110111010111100101001110100101                    tmp ^ (tmp >>> 18)

As is clear from the bold bits above, we can easily work out what the first 18 bits of the previous number was, they are the same as the result. Taking those 18 bits, we can bitshift them across 18 bits, and xor them with the value, this will give us the entire number. A generalised algorithm can be written to do this for any arbitrary number of bits, and we can reuse this to reverse the first step as well:

int unBitshiftRightXor(int value, int shift) {
  // we part of the value we are up to (with a width of shift bits)
  int i = 0;
  // we accumulate the result here
  int result = 0;
  // iterate until we've done the full 32 bits
  while (i * shift < 32) {
    // create a mask for this part
    int partMask = (-1 << (32 - shift)) >>> (shift * i);
    // obtain the part
    int part = value & partMask;
    // unapply the xor from the next part of the integer
    value ^= part >>> shift;
    // add the part to the result
    result |= part;
    i++;
  }
  return result;
}

This method can be used to reverse the two right bitshift operations. The left bitshift operations also include masking the bitshifted number with a magic number, so when we unapply part as above, we have to apply the mask to it, as follows:

int unBitshiftLeftXor(int value, int shift, int mask) {
  // we part of the value we are up to (with a width of shift bits)
  int i = 0;
  // we accumulate the result here
  int result = 0;
  // iterate until we've done the full 32 bits
  while (i * shift < 32) {
    // create a mask for this part
    int partMask = (-1 >>> (32 - shift)) << (shift * i);
    // obtain the part
    int part = value & partMask;
    // unapply the xor from the next part of the integer
    value ^= (part << shift) & mask;
    // add the part to the result
    result |= part;
    i++;
  }
  return result;
}

Using these two functions, we can unapply the transformation applied to the numbers coming out of the PRNG:

int value = output;
value = unBitshiftRightXor(value, 18);
value = unBitshiftLeftXor(value, 15, 0xefc60000);
value = unBitshiftLeftXor(value, 7, 0x9d2c5680);
value = unBitshiftRightXor(value, 11);

Having applied the above algorithm to 624 consecutive numbers from the PRNG, we now have the complete state of the Mersenne Twister, and can easily determine every subsequent value. In Part 4 of the series, we'll investigate how to find previous values generated by the Mersenne Twister, and also discuss how to deal with being unable to get consecutive numbers out of the PRNG.

About

Hi! My name is James Roper, and I am a software developer with a particular interest in open source development and trying new things. I program in Scala, Java, Go, PHP, Python and Javascript, and I work for Lightbend as the architect of Kalix. I also have a full life outside the world of IT, enjoy playing a variety of musical instruments and sports, and currently I live in Canberra.