all that jazz

james' blog about scala and all that jazz

Women in tech - It's a man's problem

A few days ago a panel of four men showed the world that they had no clue about the issues that women in tech faced, and how they should be solved. The overwhelming theme that I picked up from the criticism that I read about it was “you’re not listening to us”. Is that true? Are these male industry leaders really not listening to women? I mean surely they have read the accounts of the issues that many women in tech have faced, isn’t that enough?

Now I am about to commit one of the cardinal sins of talking about women in tech - relating it to my wife - but please bear with me, it's not what you think. My wife Beth and I are currently getting marriage counselling¹. In it we have rediscovered two things that we always knew but need to keep being reminded of. The first is that during conflict, Beth speaks using emotional language. The second is that during conflict, I speak using logical language. In order for us to resolve conflict, we need to speak each other's language: I need to talk more about how things make me feel, and Beth needs to step back from her feelings and reason logically about her actions and mine.

It is from this difference in languages that the problems of women in tech flow. No! That's not it at all. But, if you're a man, you may have been nodding your head as you read that statement. If you're a woman, if it weren't for the emphatic no immediately after that statement, you probably would have closed your browser in anger. And this highlights a deep problem.

We men have a tendency to approach the things that women are saying - the accounts of harassment and abuse, the accounts of everyday prejudice, and the calls to action - as if those women speak a different language, the way our wives/mothers/girlfriends do. But read some of the accounts again. Are these spoken with a language of emotion? They certainly talk about emotion, but those accounts are very well reasoned and logical texts that speak plainly about actual events. Even if, and that is a very big if, the women who told these accounts do have a tendency to use emotional language over logic and sound reasoning, they have clearly mastered the skill of communicating with men - after all, in this industry, they have to.

So what is the impact of this approach that we men take, and what do I mean by it? When I first read Julie Ann Horvath’s account of her experience at GitHub, my subconscious immediately told me that I had to be careful. Women have a tendency to overreact, to speak with a language of emotion that does not follow sound reasoning or logic, and I should take the things that I am reading with a grain of salt. Any reaction that I have to this I should carefully measure, I should refrain from saying anything too strongly about it, in case it turns out not to be true. While I believed that it probably was true, I let my subconscious prejudice stop me from taking it too seriously.

This reaction doesn’t make sense. But the deep impact that it has is that it causes me to distance myself somewhat from the problem. And while in certain circumstances distancing yourself from problems does little harm, in this instance, it is the worst thing I could possibly do. Why? Because what if the problem is me? If I distance myself from the problem, I will never see that it’s me.

No wonder women are complaining that men are not listening. As long as we approach women as a group that speaks a different language, we will never listen to them. We will never understand what they have to say. We will distance ourselves from their arguments, and from the implications, and this means, if there is any problem in us, any implication that should change us, we will not hear it.

It has taken me a long time to learn this. I used to think that the issue of women in tech was just some gripe over numbers, that the problem was that the number of women in tech didn’t equal the number of men in tech, and that some vocal women believed that that needed to be fixed. As I’ve read more and more and more accounts of women facing sexual harassment and discrimination I’ve slowly come to understand that it is something very different. I should have listened earlier, and come to this conclusion a long time ago. But better late than never. I’ve come to the conclusion that the issue of women in tech is a man’s problem.

§It’s a man’s problem

The only person that can change your attitudes is you. Other people can’t change them - they can point you in the right direction, they can present you with well reasoned arguments on why you should change and how to change, but at the end of the day, the only person that can change them is you. The women in tech issue comes down to the attitudes of us men, and therefore it is a problem that only we men can fix. This is what I mean by it’s a man’s problem - the changing will be done by men.

However, the initiative to fix it must be led by women. Why? Because only women can explain how they are discriminated against - how the actions of men, particularly the small, seemingly inconsequential ones that happen every day, impact women. They are the ones that see and experience the problem, and so they are the only ones that can describe it and instruct on how to remedy it.

But the main force of change must come from men that are listening to these women. Men who are not just reading the accounts and remedies, but are actually listening to them without prejudice. These men have two tasks:

  1. Change themselves. When women identify something that men are doing that is harmful to women’s acceptance in the IT industry, men need to examine themselves to see if they are exhibiting that action, and if so, change it.
  2. Convince other men to listen to women without prejudice. This is a job that must be done by men, because if the men that need convincing aren’t listening to women, then nothing a woman says will resound with them.

If you’re a man reading this, and you think “after reading this I now understand the issues that women in tech face”, then you’ve missed the point. I don’t understand the issues that women in tech face, so there’s no way after reading something that I’ve written that you could understand them. I’ve merely pointed out the first step - to start listening to women without prejudice. The next step is to actually listen to them! Read the blog posts and news articles of the accounts of women in tech with unprejudiced eyes. Talk to your female coworkers and friends about the issues they face, and listen to them. Attend conferences and meetups aimed at promoting women in tech, and listen! This is a man’s problem that requires action by men, and the first step is listening to women.

§Footnotes

  1. No, our marriage is not on the rocks. Beth and I believe that a marriage is like a car, and marriage counselling is like a mechanic. If you wait until a car breaks down before you take it to the mechanic, it will have a much bigger impact and cost a lot more to fix - it may even get written off. Rather, you take the car to the mechanic for regular checkups while it's healthy. Likewise, waiting till a marriage breaks down to see a marriage counsellor is likely to cause a lot of pain and take a very long time to fix. Rather, seeing a marriage counsellor while your marriage is healthy ensures the long term health of the marriage, and also ensures that you both get the most out of the marriage too. We see the marriage counselling we're getting now as our 5 year checkup.

Introducing ERQX

Today I migrated my blog to a new blogging engine that I've written called ERQX. Now to start off with, why did I write my own blog engine? A case of not invented here syndrome? Or do I just really like writing blog engines (I was, and technically still am, the lead developer of Pebble, the blog engine that I used to use)?

I was very close to migrating to a Jekyll blog hosted on GitHub, but there are a few reasons why I didn’t do this:

  • As a full-time maintainer of Play, I don't get a lot of opportunities to use Play as an end user. This is bad - how can I be expected to guide Play forward if I don't feel the pain points as an end user? Hence, I jump at every opportunity I can to write new apps in it, and what better use case is there than my own blog?
  • I really like the setup we have with the documentation on the Play website - we have implemented some custom markdown extensions that allow extracting code snippets from compiled and tested source files, and all documentation is served directly out of git, which turns out to be a great way to deploy and distribute content.
  • I wanted to see how easy it would be to make a full reusable and skinnable application within Play.
  • Because I love Play!

§Features

So what are the features of ERQX? Here are a few:

§Embeddable

The blog engine is completely embeddable. All you need to do is add a single line to your routes file to include the blog router, and some configuration in application.conf pointing to a git repository, and you’re good to go.

Not convinced? Here is everything you need to do to include a blog in your existing Play application.

  1. Add a dependency to your build.sbt file:

    resolvers += "ERQX Releases" at "https://jroper.github.io/releases"
    
    libraryDependencies += "au.id.jazzy.erqx" %% "erqx-engine" % "1.0.0"
    
  2. Add the blog router to your routes file:

    ->  /blog       au.id.jazzy.erqx.engine.controllers.BlogsRouter
    
  3. Add some configuration pointing to the git repo for your blog:

    blogs {
      default {
        gitConfig {
          gitRepo = "/path/to/some/repo"
          remote = "origin"
          fetchKey = "somesecret"
        }
      }
    }
    

And there you have it!

§Git backend

In future I hope to add other backends - I think a prismic.io backend would be really cool - but for now it just supports a git backend. The layout of the git repo is somewhat inspired by Jekyll: blog posts go in a folder named _posts, with the date and title in the file name, and each blog post has front matter in YAML format. Blog posts can either be in markdown or HTML format. There is also a _config.yml file which contains configuration for the blog, such as the title, description and a few other things.
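For illustration, a minimal blog repository might be laid out like this - the file names are hypothetical, but follow the Jekyll-inspired conventions just described:

_config.yml
_posts/2014-07-01-hello-world.md

And a post might begin with YAML front matter like the following. I'm showing only a title field here as an example; the full set of supported fields is covered in the ERQX documentation:

---
title: Hello world
---

This is my first post, in markdown format.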

Changes are deployed to the blog either by polling, or by registering a commit hook on GitHub. In the example above, the URL for the webhook would be http://example.com/blog/fetch/somesecret. Using commit hooks, blog posts are published within seconds of pushing to GitHub. ERQX also takes advantage of the git hash, serving that as the ETag for all content, allowing caching of the blog and its associated resources.

§Markdown

Blog posts can be written in markdown format, which is rendered using the Play documentation renderer, and so supports pulling code samples out of compiled and tested source files. This is invaluable if you write technical blog posts full of code and you want to ensure that the code in them actually works.
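As a sketch of how this looks (following the conventions of the Play documentation renderer - the label and file path here are made up for the example), the markdown references a labelled snippet in a source file:

@[hello-snippet](code/HelloWorld.scala)

and the referenced source file, which is compiled and tested as part of the build, delimits the snippet with matching label markers:

//#hello-snippet
println("Hello world")
//#hello-snippet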

§Themeable

The blog is completely themeable, allowing you to simply override the header and footer to plug in different stylesheets, or completely use your own templates to render blog posts.

The default theme uses a pure CSS responsive layout, switching to rendering the description of the blog in a slideout tab on mobile devices, and provides support for comments via Disqus.

§Multi blog support

ERQX allows serving multiple blogs from the one server. Each may have its own theme.
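For example, a second blog can be added as another entry alongside default in the blogs configuration block - a sketch extending just the git configuration shown earlier:

blogs {
  default {
    gitConfig {
      gitRepo = "/path/to/blog1"
    }
  }
  tech {
    gitConfig {
      gitRepo = "/path/to/blog2"
    }
  }
}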

§Source code and examples

ERQX and its associated documentation can be found on GitHub.

The website for this blog, showing how the blog can be embedded in a real application, plus the content of the blog itself, can also be found on GitHub. The website is in the master branch, while the blog content is in the allthatjazz branch.

Fun doesn't mean compromising scalability

Today I read an interesting piece on InfoWorld about Meteor, "Meteor aims to make JavaScript programming fun again". It is an interview with Matt DeBergalis, a co-author of Meteor, about Meteor and why a developer would choose it. The title in particular resonated with me - "making programming fun again" is a catchphrase I have often used in presentations I've given about Play Framework.

As the demands on the applications we write shift, the technologies we use start to make it harder to meet them, and pretty soon we feel like we are always working against the technologies that are supposed to be helping us. By taking a step back, rethinking the technologies, and creating new ones that are better suited to today's demands, we can continue being productive writing modern applications, and it's then that development becomes fun again. Though not always the case, how much fun you have working with a particular technology often correlates well with how well suited it is to the problems you are trying to solve, so there is some merit in switching to technologies that are more fun.

In this light, Meteor is not a bad framework - its approach to making web applications responsive to data updates is particularly interesting. Writing apps in it will definitely, at least initially, be very fun. But my reason for writing this post is that I had one main gripe with the article: DeBergalis continually likened what Meteor achieves to Facebook, implying that Facebook could be implemented using Meteor. This couldn't be further from the truth.

While the end results are very similar - an application written in Meteor and Facebook are both applications that update instantly as people interact with them - the approach that Facebook takes to writing their apps is the complete opposite of Meteor's. Meteor places a massive emphasis on "don't worry about how data is communicated, let the framework deal with that for you". Although I have not worked at Facebook myself, I am sure that their approach is all about how the data is communicated - they don't just let the framework deal with that for them.

The problem with Meteor's approach to web development is that it makes the same mistakes as some very old technologies that many people now loathe. I am going to highlight two such technologies.

The first is relational databases. The promise of relational databases was that you didn't have to worry about how your data was accessed - just make sure you store it in a normalised form, and let the database handle whatever load you throw at it. Performance could be achieved by tuning indexes. But the problem that we found on the web is that that approach did not scale. Denormalisation and caching became necessary in any app with even a modest load. And that's when NoSQL databases started popping up. NoSQL databases intentionally limited what you could do in them, forcing you to take a different perspective on your data - namely, how is it going to be read and written? They forced you to make decisions that would allow you to scale early in the design process, and we found that making these decisions early was key to successfully scaling a web application.

The second technology is n-tier application servers. The promise of application servers was that you didn't have to worry about deployment, you just wrote your applications, and let the application server worry about scalability and resilience. This led to people writing massive monolithic apps, where almost every function in the app depended on every single other function, killing any chance of ever having either resilience or scalability. When performance became an issue, clustering was "turned on", and often performance went down. And that's when containerless micro service solutions started becoming popular - small services that could be individually scaled. These new architectures forced you to think about scalability up front, making those decisions early.

Are you seeing a pattern here? Letting the technology handle resilience and scaling for you is bad; forcing you to address it up front is good. But Meteor seems to be making the exact same mistakes that relational databases and n-tier application servers made. It's trying to hide those concerns from you, in the name of "making programming fun again". While fun at first, this is certainly not going to be fun when your site gets popular and starts falling over because of the load it gets.

But maybe the Meteor developers have come up with a smart way to scale it. There are apparently two ways you can run multiple Meteor nodes, and apparently the better one is described here. The approach? Have each Meteor node tail the MongoDB oplog. Or in simple English, make every write operation in the system go to every node in the cluster. I'll let you decide whether you think making that approach scale is fun.

As I said at the start, the title of the article resonated with me - but it seems that I have a very different idea of what's fun from what the authors of Meteor have. In my opinion, hiding the details of hard scaling problems is not fun. Rather, putting them in your face, and giving you the tools to solve them at the right time - now that's fun. This is exactly what Play Framework and Akka do - particularly Akka, where the assumption when you program is that every other part of the app is likely down or not responding, and you are forced to deal with what happens when that's the case. Using these technologies to solve these hard problems is not only fun, it's very satisfying - and seeing an app with 50,000 concurrent users broadcasting updates every second scale with only 10 nodes, that's exciting too!

The fun approach to hard problems is not to run away from them to something that pretends they don't exist. It's to embrace them head on, using technologies that are designed to help you do so.

A practical solution to the BREACH vulnerability

Two weeks ago CERT released an advisory for a new vulnerability called BREACH. In the advisory they say there is no practical solution to this vulnerability. I believe that I've come up with a practical solution that we'll probably implement in Play Framework's CSRF protection.

§Some background

First of all, what is the BREACH vulnerability? I recommend you read the advisory - there's no point in me repeating it here - but for those that are lazy, here is a summary. The prerequisites for exploiting this vulnerability are:

  1. The target page must be using HTTPS, preferably with a stream cipher (eg RC4), though it is possible to exploit when block ciphers with padding are used (eg AES)
  2. The target page must be using HTTP level compression, eg gzip or deflate
  3. The target page must produce responses with a static secret in them. A typical example would be a CSRF token in a form.
  4. The target page must also reflect a request parameter in the response. It may also be possible to exploit if it reflected POSTed form body values in the response.
  5. Responses must be otherwise reasonably static. Dynamic responses, particularly ones that vary the length of the response, are much more expensive to exploit.
  6. The attacker must be able to eavesdrop on the connection, and specifically, measure the length of the encrypted responses.
  7. The attacker must be able to coerce the victim's browser to request the target web page many times.

To exploit, the attacker gets the victim's browser to submit specially crafted requests. These requests will contain repeat patterns that the compression algorithm will compress. If the pattern matches the first part of the secret, then the response will be shorter than if it doesn't, since that part of the secret will also be compressed along with the repeat patterns. Then, character by character, the attacker can determine the secret.
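To make the mechanics concrete, here is a toy model of the attack in Scala. Everything in it - the secret, the page content, the alphabet - is made up for illustration, and as the comments note, a real attack has to work considerably harder to observe the length differences:

import java.util.zip.Deflater

// A toy model of the attack, not an actual exploit. The "response" reflects
// the attacker's guess alongside a static secret, and is compressed with
// DEFLATE, as gzip responses are.
val secret = "token=d8a3f2b1"

def compressedLength(guess: String): Int = {
  val body = s"<p>You searched for $guess</p><form><input name='$secret'></form>"
  val deflater = new Deflater()
  deflater.setInput(body.getBytes("UTF-8"))
  deflater.finish()
  val buffer = new Array[Byte](4096)
  var length = 0
  while (!deflater.finished()) length += deflater.deflate(buffer)
  length
}

// Guess the secret one character at a time - the candidate whose response
// compresses smallest shares the longest prefix with the real secret. In
// practice the one byte difference is often masked by Huffman coding and
// block boundaries, so real attacks need padding tricks and many
// measurements, but the principle is the same.
val alphabet = "0123456789abcdef"
var known = "token="
for (_ <- 1 to 8) {
  known += alphabet.minBy(c => compressedLength(known + c))
}
println(known)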

§Some workarounds

The advisory mentions some workarounds. Whether these workarounds are effective depends greatly on the specific application; none of them can be effectively applied by a framework without potentially breaking the application.

Probably the most effective of the workarounds is randomising the secret on each request. In the case of CSRF protection tokens, which are often provided by frameworks, this would prevent a user from using the application from multiple tabs at the same time. It would also cause issues when a user uses the back button.

I would like to propose a variant of using randomised tokens, that should work for most framework provided CSRF protection mechanisms, and that, pending feedback from the internet on whether my approach will be effective, we will probably implement in Play Framework.

§Signed nonces

The idea is to use a static secret, but combine it with a nonce, sign the secret and the nonce, and do this for every response that the secret is sent in. The signature will effectively create a token that is random in each response, thus violating the third prerequisite above, that the secret be static.

The nonce does not need to be generated in a cryptographically secure way, it may be a predictable value such as a timestamp. The important thing is that the nonce should change sufficiently frequently, and should repeat old values sufficiently infrequently, that it should not be possible to get many responses back that use the same nonce. The signature is the unpredictable part of the token.

Application servers will need to have a mechanism for signing the nonce and the secret using a shared secret. For applications served from many nodes, the secret will need to be shared between all nodes.

The application will represent secrets using two types of tokens, one being "raw tokens", which is just the raw secret, the other being "signed tokens". Signed tokens are tokens for which a nonce has been generated on each use. This nonce is concatenated with the raw token, and then signed. An algorithm to do this in Scala might look like this:

def createSignedToken(rawToken: String) = {
  val nonce = System.currentTimeMillis
  val joined = rawToken + "-" + nonce
  joined + "-" + hmacSign(joined)
}

where hmacSign is a function that signs the input String with the application's shared secret using the HMAC algorithm. HMAC is not the only signing algorithm that could be used, but it is a very common choice for these types of use cases.
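For completeness, here is a sketch of what hmacSign might look like on the JVM. The secret shown is obviously illustrative - in Play it would be the application's configured secret:

import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Illustrative only - in a real application this comes from configuration
val applicationSecret = "changeme"

// Signs the input with HMAC-SHA1 and hex encodes the result, keeping the
// signature safe to embed in tokens
def hmacSign(input: String): String = {
  val mac = Mac.getInstance("HmacSHA1")
  mac.init(new SecretKeySpec(applicationSecret.getBytes("UTF-8"), "HmacSHA1"))
  mac.doFinal(input.getBytes("UTF-8")).map("%02x".format(_)).mkString
}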

Each time a token is sent in a response, it must be a newly generated signed token. While it is ok to publish the raw token in HTTP response headers - HTTP level compression applies to the response body, not the headers - to avoid confusion about which incoming tokens must be signed and which can be raw, I recommend always publishing and only accepting signed tokens. When comparing tokens, the signature should be verified on each token, and if that passes, then only the raw parts of the tokens need to be compared. An algorithm to extract the raw token from the signed token created using the above algorithm might look like this:

def extractRawToken(signedToken: String): Option[String] = {
  // A token that doesn't have all three parts is malformed, so is rejected
  signedToken.split("-", 3) match {
    case Array(rawToken, nonce, signature)
        if thetaNTimeEquals(signature, hmacSign(rawToken + "-" + nonce)) =>
      Some(rawToken)
    case _ => None
  }
}

where thetaNTimeEquals does a String comparison with Θ(n) time when the lengths of the Strings are equal, to prevent timing attacks. Verifying that two tokens match might look like this:

def compareSignedTokens(tokenA: String, tokenB: String) = {
  val maybeEqual = for {
    rawTokenA <- extractRawToken(tokenA)
    rawTokenB <- extractRawToken(tokenB)
  } yield thetaNTimeEquals(rawTokenA, rawTokenB)
  maybeEqual.getOrElse(false)
}
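And a sketch of thetaNTimeEquals itself - the key property is that the comparison inspects every character rather than returning at the first mismatch:

// XORs every character pair and ORs the results together, so the time taken
// depends only on the length of the strings, not on where they first differ
def thetaNTimeEquals(a: String, b: String): Boolean = {
  if (a.length != b.length) false
  else a.zip(b).foldLeft(0) { case (acc, (x, y)) => acc | (x ^ y) } == 0
}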

§Why this works

When using a signed token, the attacker can still work out what the raw token is using the BREACH vulnerability, however since the application doesn't accept raw tokens, this is not useful to the attacker. Because the attacker doesn't have the secret used to sign the signed token, they cannot generate a signed token themselves from the raw token. Hence, they need to determine not just the raw token, but an entire signed token. But since signed tokens are random for each response, this breaks the third prerequisite above - that secrets in the response be static - so they cannot do a character by character evaluation using the BREACH vulnerability.

§Encrypted tokens

Another option is to encrypt the concatenated nonce and raw token. This may result in shorter tokens, and I am not aware of any major performance differences between HMAC and AES for this purpose. APIs for HMAC signing do tend to be a little easier to use safely than APIs for AES encryption, which is why I've used HMAC signing as my primary example.
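As a sketch of the encrypted variant, AES in CTR mode with a random IV provides the per-response randomness, with the IV playing the role of the nonce. The key here is assumed to be derived from the application's shared secret, and a production implementation should also authenticate the ciphertext (encrypt-then-MAC) - exactly the kind of subtlety that makes the HMAC approach easier to get right:

import java.security.SecureRandom
import java.util.Base64
import javax.crypto.Cipher
import javax.crypto.spec.{IvParameterSpec, SecretKeySpec}

// Encrypts the raw token under a random IV, so the same raw token produces
// a different encrypted token on every call. The key is assumed to be a
// 16 byte AES key derived from the application's shared secret.
def createEncryptedToken(rawToken: String, key: Array[Byte]): String = {
  val iv = new Array[Byte](16)
  new SecureRandom().nextBytes(iv)
  val cipher = Cipher.getInstance("AES/CTR/NoPadding")
  cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv))
  Base64.getUrlEncoder.withoutPadding.encodeToString(iv ++ cipher.doFinal(rawToken.getBytes("UTF-8")))
}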

§Framework considerations

The main issue that might prevent a framework from implementing this is that they might not readily have a secret available to them to use to do the signing or encrypting. When an application runs on a single node, it may be acceptable to generate a new secret at startup, though this would mean the secret changes on every restart.

Some frameworks, like Play Framework, do have an application wide secret available to them, and so this solution is practical to implement in application provided token based protection mechanisms such as CSRF protection.

100 Continue support in Play

The 100 Continue status code in the HTTP spec is one that most people know very little about. You kind of read it, don't really understand what it's talking about, and then just skip over it. I didn't know what it was about until I became a developer of a web framework. It turns out to be very useful in certain situations.

Let's say a client needs to make a very large upload, for example 1GB. What happens if the server can't satisfy the client's request? For example, what if the client submitted invalid authentication credentials? Or the request content was too long? Or the wrong media type? HTTP is a half duplex protocol - the client and server take it in turns to speak. This means that even though the server may know immediately after receiving the request header that it can't process the request, it still has to read the entire request body before it can tell the client that, even if that request body is 1GB long and takes an hour to upload. And if you've ever done any large HTTP uploads before, you'll know there's nothing more frustrating than getting to the end of a large upload, only to get an error back from the server.

HTTP has a solution to this, in the form of the Expect request header. The Expect header is used to tell the server that the client expects a certain behaviour of it. There is one defined value for it in the HTTP spec, and that is 100-continue. This tells the server that after sending the request headers, the client will not send the body of the request until it has received a 100 continue response. Otherwise, the server can immediately return with any other response code. After receiving a 100 continue response, the client will continue to send the body, and once the server has consumed that, the server will send a second response.

This can be used whenever the server wants to do validation of just the request headers. Here are some examples:

  • Authentication - if the client is not authenticated, the server can respond with 401 Unauthorized.
  • Authorisation - if the client is not authorised to make the request, the server can respond with 403 Forbidden.
  • Resource existence - if the client has attempted to put a resource at a location that doesn't exist, the server can respond with 404 Not Found.
  • Content length limits - if the client hasn't sent a content length, the server can respond with 411 Length Required, or if the content length is larger than the server is willing to accept, the server can respond with 413 Request Entity Too Large.
  • Content type validation - if the client is sending a content type that the server doesn't support, the server can respond with 415 Unsupported Media Type.

§100 Continue support in Play Framework

So with all this in mind, how can this be implemented in Play Framework? As you may be aware, at the lowest level, a Play action looks like this:

trait EssentialAction extends (RequestHeader => Iteratee[Array[Byte], Result])

The iteratee that the essential action function returns is what consumes the body. An iteratee can be in one of three states: done, cont (ready to receive more input), or error. When Play invokes an action to get the iteratee for the body, and the client has specified the Expect: 100-continue header, Play is able to check whether that iteratee is ready to receive input, or whether it's in a done or error state. If it's in a done or error state, Play will send the result immediately without consuming the body. If it's in the cont state, then Play will send a 100 continue response, and then feed the body into the iteratee.

So for an action to take advantage of this, it just needs to ensure that it returns a done iteratee if the validation fails. Play's built-in authentication action does just this:

def Authenticated[A](
  userinfo: RequestHeader => Option[A],
  onUnauthorized: RequestHeader => Result)(action: A => EssentialAction): EssentialAction = {

  EssentialAction { request =>
    userinfo(request).map { user =>
      action(user)(request)
    }.getOrElse {
      Done(onUnauthorized(request), Input.Empty)
    }
  }
}
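Writing your own early validation follows the same pattern. Here is a sketch of a hypothetical wrapper that rejects oversized uploads before the body is read - illustrative only, since Play's built-in body parsers already enforce maximum lengths:

import play.api.libs.iteratee.{Done, Input}
import play.api.mvc._
import scala.util.Try

// Returns a done iteratee when Content-Length exceeds the limit, so Play
// responds immediately instead of sending 100 Continue
def maxLength(limit: Long)(action: EssentialAction): EssentialAction =
  EssentialAction { request =>
    request.headers.get("Content-Length").flatMap(l => Try(l.toLong).toOption) match {
      case Some(length) if length > limit =>
        Done(Results.EntityTooLarge("Body must be at most " + limit + " bytes"), Input.Empty)
      case _ => action(request)
    }
  }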

In addition, all of Play's body parsers, when they check the content type, will return a done iteratee if the content type is wrong. So if I have an action that looks like this:

def upload = Authenticated(
    rh => rh.headers.get("Authentication-Token").filter(_ == "secret-token"), 
    rh => Forbidden("Authentication required")
) { token => Action(parse.text) { request =>
  Ok("Got body that was " + request.body.length + " characters long")
}}

And then I submit the following request header:

POST /upload HTTP/1.1
Host: localhost
Authentication-Token: secret-token
Content-Type: text/plain
Content-Length: 12
Expect: 100-continue

Play will immediately respond with:

HTTP/1.1 100 Continue

At which point, I can then send my body, and Play will send the response. The whole transaction will look like this:

C: POST /upload HTTP/1.1
C: Host: localhost
C: Authentication-Token: secret-token
C: Content-Type: text/plain
C: Content-Length: 12
C: Expect: 100-continue
C: 
S: HTTP/1.1 100 Continue
S:
C: Hello world!
S: HTTP/1.1 200 OK
S: Content-Type: text/plain;charset=utf-8
S: Content-Length: 36
S:
S: Got body that was 12 characters long

However, if I don't send an authentication token, or if my content type is wrong, this is what will happen:

C: POST /upload HTTP/1.1
C: Host: localhost
C: Content-Type: text/plain
C: Content-Length: 12
C: Expect: 100-continue
C: 
S: HTTP/1.1 403 Forbidden
S: Content-Type: text/plain;charset=utf-8
S: Content-Length: 23
S:
S: Authentication required

And so even though in the request header I said that the content length was 12, I didn't have to upload the body, because I sent the Expect header and Play didn't send a 100 continue response back - instead, it was able to immediately tell me that the request would fail. Obviously with such a small body this doesn't make a lot of sense, but with a body gigabytes in length, it means I don't have to spend however many hours uploading it before I finally find out that I wasn't allowed to upload it.

About

Hi! My name is James Roper, and I am a software developer with a particular interest in open source development and trying new things. I program in Scala, Java, Go, PHP, Python and JavaScript, and I work for Lightbend as the architect of Kalix. I also have a full life outside the world of IT - I enjoy playing a variety of musical instruments and sports, and I currently live in Canberra.