Tuesday, April 15, 2014

NIO 2 in Apache Tomcat 8

Apache Tomcat 8 has a new NIO 2 based connector that is nearing reasonably useful status and is now labeled as beta. NIO 2 aligns well with the asynchronous IO from Servlet 3.1, but that is not its only benefit.

Speed

First, a quick speed test. Raw speed is measured with a Servlet writing 1KB of data, driven by ab -k -c 100 (keepalive enabled over 100 concurrent connections), so that it only measures the ability to do a blocking write and a keepalive between two requests. Obviously a horrible real world benchmark, but the idea is only to see if NIO 2 is fast enough, since it does look kinda slow when you look at its high level API. Poor results could have eliminated NIO 2 as a useful solution, since a stable NIO connector already exists in Tomcat, while at the other end of the spectrum APR is available for raw speed. I am happy to report that NIO 2 is significantly faster than NIO for this pure blocking/polling stress test, up to about +50%, and is comparable to APR for that task.

With this most critical issue out of the equation, we have a connector that is more elegant than the current connectors, as poller management for NIO and APR, blocking IO for NIO, and native code for APR have proven to be a seemingly endless source of complications / deadlocks / crashes / platform specific issues.

However, it is not yet known how good the real world scalability and resource consumption are, although some initial weak points can already be identified with JSSE and static file serving (see below). With thread and poller management being nearly completely abstracted away, the JVM has everything in its hands to provide an optimized behavior, eventually.

A simple API

Or is it? Actually, only blocking IO is very simple with NIO 2. A read or write returns immediately like with NIO, but unlike NIO the operation does not have to be complete: it could still be in progress asynchronously. To represent that, the most basic read/write API uses a Future object that can be polled (bad idea) or blocked upon. So, simple blocking, with per operation timeout, looks great.
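To make the Future style concrete, here is a minimal sketch, not Tomcat's actual code. The class name FutureBlockingIo, the roundTrip helper and the 5 second timeout are all made up for the illustration; only the java.nio.channels calls are the real NIO 2 API. It opens a loopback connection, then blocks on each Future with a per operation timeout.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class FutureBlockingIo {

    // Sends a message over a loopback connection and reads it back on the
    // accepted side, blocking on each Future with a per operation timeout.
    static String roundTrip(String message) throws Exception {
        try (AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                 .bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
             AsynchronousSocketChannel client = AsynchronousSocketChannel.open()) {

            Future<AsynchronousSocketChannel> accepting = server.accept();
            client.connect(server.getLocalAddress()).get(5, TimeUnit.SECONDS);

            try (AsynchronousSocketChannel peer = accepting.get(5, TimeUnit.SECONDS)) {
                ByteBuffer out = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
                while (out.hasRemaining()) {
                    // write() returns at once; get() blocks until done or timed out.
                    client.write(out).get(5, TimeUnit.SECONDS);
                }

                ByteBuffer in = ByteBuffer.allocate(out.capacity());
                while (in.hasRemaining()) {
                    if (peer.read(in).get(5, TimeUnit.SECONDS) < 0) {
                        break; // end of stream before the full message arrived
                    }
                }
                in.flip();
                return StandardCharsets.UTF_8.decode(in).toString();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello NIO 2"));
    }
}
```

Note that even here a single write may not drain the buffer, hence the loops around each operation.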

"Non blocking", as it is called in Servlet 3.1, requires the more complex API that uses completion handlers to notify that the operation is now done. That still sounds simple, but the special cases need to be handled, and the NIO 2 API does not provide everything needed to do that easily. A call can complete inline (or not, obviously), synchronization is not intuitive (there's no code block to sync on while an operation is pending, but evidently the state of some important objects like the buffers is undefined; the risk of deadlock is also present), incomplete operations are possible, etc.
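One of those special cases, the incomplete operation, can be sketched as follows. This is an illustration under my own assumptions rather than the connector's code: CompletionHandlerWrite, writeFully and transfer are hypothetical names, and CompletableFuture (Java 8) is only used as a convenient way to signal the end. The handler re-issues the write until the buffer is drained.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class CompletionHandlerWrite {

    // Writes the whole buffer. The buffer must not be touched by the caller
    // until the returned future completes: its state is undefined meanwhile.
    static CompletableFuture<Void> writeFully(final AsynchronousSocketChannel channel,
                                              final ByteBuffer buffer) {
        final CompletableFuture<Void> done = new CompletableFuture<>();
        channel.write(buffer, null, new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer written, Void attachment) {
                if (buffer.hasRemaining()) {
                    // Incomplete operation: issue another write with this handler.
                    // This call may itself complete inline, on the current thread.
                    channel.write(buffer, null, this);
                } else {
                    done.complete(null);
                }
            }

            @Override
            public void failed(Throwable exc, Void attachment) {
                done.completeExceptionally(exc);
            }
        });
        return done;
    }

    // Demo: pushes `size` bytes through a loopback connection, returns bytes received.
    static int transfer(int size) throws Exception {
        try (AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                 .bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
             AsynchronousSocketChannel client = AsynchronousSocketChannel.open()) {
            Future<AsynchronousSocketChannel> accepting = server.accept();
            client.connect(server.getLocalAddress()).get(5, TimeUnit.SECONDS);
            try (AsynchronousSocketChannel peer = accepting.get(5, TimeUnit.SECONDS)) {
                CompletableFuture<Void> done = writeFully(client, ByteBuffer.allocate(size));
                ByteBuffer in = ByteBuffer.allocate(8192);
                int total = 0;
                while (total < size) {
                    in.clear();
                    int n = peer.read(in).get(5, TimeUnit.SECONDS);
                    if (n < 0) {
                        break;
                    }
                    total += n;
                }
                done.get(5, TimeUnit.SECONDS);
                return total;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transfer(1 << 20) + " bytes received");
    }
}
```

The sketch leaves out what makes the real thing hard: bounding the recursion when calls keep completing inline, and coordinating with other threads that want to use the same buffer.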

The API does allow some more significant IO optimization, with scatter and gather. I tried taking advantage of the latter in Tomcat, with more work on that possible in the future.
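For reference, a gathering write with NIO 2 looks roughly like this. It is a sketch under assumptions: GatherWrite, writeAll and send are hypothetical names, the header/body split is just an example payload, and this is not how Tomcat structures its output buffers. Several buffers go out in one operation, with a running byte count carried as the attachment.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class GatherWrite {

    static boolean anyRemaining(ByteBuffer[] parts) {
        for (ByteBuffer part : parts) {
            if (part.hasRemaining()) {
                return true;
            }
        }
        return false;
    }

    // Writes all buffers with gathering writes; completes with the byte count.
    static CompletableFuture<Long> writeAll(final AsynchronousSocketChannel channel,
                                            final ByteBuffer... parts) {
        final CompletableFuture<Long> done = new CompletableFuture<>();
        channel.write(parts, 0, parts.length, 5, TimeUnit.SECONDS, 0L,
                new CompletionHandler<Long, Long>() {
                    @Override
                    public void completed(Long written, Long soFar) {
                        long total = soFar + written;
                        if (anyRemaining(parts)) {
                            // Partial gather: drained buffers are simply skipped.
                            channel.write(parts, 0, parts.length, 5, TimeUnit.SECONDS, total, this);
                        } else {
                            done.complete(total);
                        }
                    }

                    @Override
                    public void failed(Throwable exc, Long soFar) {
                        done.completeExceptionally(exc);
                    }
                });
        return done;
    }

    // Demo: sends header and body as two buffers, returns what the peer received.
    static String send(String header, String body) throws Exception {
        try (AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                 .bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
             AsynchronousSocketChannel client = AsynchronousSocketChannel.open()) {
            Future<AsynchronousSocketChannel> accepting = server.accept();
            client.connect(server.getLocalAddress()).get(5, TimeUnit.SECONDS);
            try (AsynchronousSocketChannel peer = accepting.get(5, TimeUnit.SECONDS)) {
                ByteBuffer h = ByteBuffer.wrap(header.getBytes(StandardCharsets.UTF_8));
                ByteBuffer b = ByteBuffer.wrap(body.getBytes(StandardCharsets.UTF_8));
                ByteBuffer in = ByteBuffer.allocate(h.remaining() + b.remaining());
                long sent = writeAll(client, h, b).get(5, TimeUnit.SECONDS);
                while (in.position() < sent) {
                    if (peer.read(in).get(5, TimeUnit.SECONDS) < 0) {
                        break;
                    }
                }
                in.flip();
                return StandardCharsets.UTF_8.decode(in).toString();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.print(send("HTTP/1.1 200 OK\r\n\r\n", "hello\n"));
    }
}
```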

How NIO 2 could be better

NIO 2 looks simple, fast and intuitive, but many things in it could still use some improvement.

Sendfile support

The NIO transferTo API is not supported with NIO 2 asynchronous channels, and I don't see a good reason for that. As a result, although the NIO 2 connector raw speed is good and it can be fast enough in most cases, it is not the most efficient file server. Not critical, but since it is such a low cost thing to implement, it is unfortunate.
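Without transferTo, serving a file means copying it through a user space buffer. The sketch below shows the general shape of that fallback with plain Futures. It is an assumption-laden illustration, not the connector's sendfile emulation: AsyncFileSend, sendFile, demo and the 8KB chunk size are all made up.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class AsyncFileSend {

    // Copies the file to the socket chunk by chunk; returns the bytes sent.
    static long sendFile(Path file, AsynchronousSocketChannel socket) throws Exception {
        try (AsynchronousFileChannel in =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer chunk = ByteBuffer.allocateDirect(8192);
            long position = 0;
            while (true) {
                chunk.clear();
                int n = in.read(chunk, position).get(5, TimeUnit.SECONDS);
                if (n < 0) {
                    break; // end of file
                }
                position += n;
                chunk.flip();
                while (chunk.hasRemaining()) {
                    socket.write(chunk).get(5, TimeUnit.SECONDS);
                }
            }
            return position;
        }
    }

    // Demo: sends a temporary file of `size` bytes over loopback, returns bytes received.
    static long demo(int size) throws Exception {
        Path file = Files.createTempFile("nio2-send", ".bin");
        ExecutorService sender = Executors.newSingleThreadExecutor();
        try (AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                 .bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
             AsynchronousSocketChannel client = AsynchronousSocketChannel.open()) {
            Files.write(file, new byte[size]);
            Future<AsynchronousSocketChannel> accepting = server.accept();
            client.connect(server.getLocalAddress()).get(5, TimeUnit.SECONDS);
            try (AsynchronousSocketChannel peer = accepting.get(5, TimeUnit.SECONDS)) {
                // Send on another thread so this one can drain the receiving side.
                Callable<Long> task = () -> sendFile(file, client);
                Future<Long> sent = sender.submit(task);
                ByteBuffer in = ByteBuffer.allocate(8192);
                long received = 0;
                while (received < size) {
                    in.clear();
                    int n = peer.read(in).get(5, TimeUnit.SECONDS);
                    if (n < 0) {
                        break;
                    }
                    received += n;
                }
                if (sent.get(5, TimeUnit.SECONDS) != received) {
                    throw new IllegalStateException("sent and received counts differ");
                }
                return received;
            }
        } finally {
            sender.shutdownNow();
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo(256 * 1024) + " bytes received");
    }
}
```

Every byte passes through the chunk buffer here, which is exactly the copy that a transferTo style call could let the OS skip.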

JSSE integration

It is the same as with NIO and uses the SSL engine API, which allows good control and non blocking operation. But everyone will end up writing mostly the same asynchronous channel wrapper code. This JSSE channel code could have been included with NIO 2.

JSSE (non) speed

JSSE is still as slow as it used to be compared with OpenSSL. Immunity to that bleed thing only gets you so far though; JSSE as it is now looks like a waste of server resources. However, this component of the JVM is pluggable, so we'll see if this can be improved in the future.

Better state handling

There is no way to do basic things like query the operation state when using the completion handlers, while this is included when using a Future. The pending flag should be available somewhere and should actually be an integrated semaphore shared with the future (to be able to wait for any pending operation to complete). In the end, although it does look intuitive and nothing is insurmountable, it ends up being more complex than it needs to be.
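Lacking that, the pending state has to be tracked by hand. One way to do it is a one permit semaphore around the completion handler, as in this sketch. TrackedWriter and its method names are hypothetical, the 32MB payload in the demo is only there to keep the write pending on a loopback socket, and the whole thing is an illustration of the idea rather than a description of any existing API.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class TrackedWriter {
    private final AsynchronousSocketChannel channel;
    // The single permit is held for as long as a write is pending.
    private final Semaphore writePending = new Semaphore(1);

    TrackedWriter(AsynchronousSocketChannel channel) {
        this.channel = channel;
    }

    boolean isWritePending() {
        return writePending.availablePermits() == 0;
    }

    // Starts writing the whole buffer; returns false if a write is already pending.
    boolean write(final ByteBuffer buffer) {
        if (!writePending.tryAcquire()) {
            return false;
        }
        try {
            channel.write(buffer, null, new CompletionHandler<Integer, Void>() {
                @Override
                public void completed(Integer written, Void attachment) {
                    if (buffer.hasRemaining()) {
                        channel.write(buffer, null, this);
                    } else {
                        writePending.release();
                    }
                }

                @Override
                public void failed(Throwable exc, Void attachment) {
                    writePending.release();
                }
            });
        } catch (RuntimeException e) {
            writePending.release(); // the write never started
            throw e;
        }
        return true;
    }

    // Waits for any pending write to finish; returns false on timeout.
    boolean awaitIdle(long timeout, TimeUnit unit) throws InterruptedException {
        if (!writePending.tryAcquire(timeout, unit)) {
            return false;
        }
        writePending.release();
        return true;
    }

    // Demo: a large write stays pending until the peer drains it.
    static boolean demo() throws Exception {
        final int size = 32 * 1024 * 1024;
        try (AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                 .bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
             AsynchronousSocketChannel client = AsynchronousSocketChannel.open()) {
            Future<AsynchronousSocketChannel> accepting = server.accept();
            client.connect(server.getLocalAddress()).get(5, TimeUnit.SECONDS);
            try (AsynchronousSocketChannel peer = accepting.get(5, TimeUnit.SECONDS)) {
                TrackedWriter writer = new TrackedWriter(client);
                boolean started = writer.write(ByteBuffer.allocateDirect(size));
                boolean pending = writer.isWritePending();
                boolean rejected = !writer.write(ByteBuffer.allocate(1));
                ByteBuffer in = ByteBuffer.allocate(64 * 1024);
                for (int total = 0; total < size; ) {
                    in.clear();
                    int n = peer.read(in).get(5, TimeUnit.SECONDS);
                    if (n < 0) {
                        break;
                    }
                    total += n;
                }
                boolean idle = writer.awaitIdle(5, TimeUnit.SECONDS);
                return started && pending && rejected && idle && !writer.isWritePending();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("state tracked correctly: " + demo());
    }
}
```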

So there is room for some nice additions in NIO 2.next, if it happens!