At my main job, we have a large data structure that takes considerable CPU time to build but remains unchanged thereafter. Its job is to geocode positions to and from a local reference system, which lets us pin records to, for instance, a place on a Road, and tells us which coordinate pair a local reference corresponds to.
As it usually goes, a decision was made to support multiple Roads per User. A download of 800 KB of data (stored in IndexedDB for later sessions) was tolerable; potentially multiple megabytes would be deadly, even though the software could be used before that constant feedback of conversions was given. It simply became one of those features Users hold on to.
The first step was finding out what could be used. This is what I evaluated:
At first, considering I was already using Sinatra, I tried sinatra-websocket. For some reason I just couldn’t get the connection to be upgraded to a WebSocket, and decided to move on quickly. faye-websocket I just skipped, to be frank.
The next two suffered from the same problem: after booting Rails and loading the structure, I was left with only enough memory for a couple dozen or so clients on a small Heroku dyno. Also, Rails’ boot time coupled with building the thing occasionally made Heroku think something had gone wrong, and often the process crashed before the service went up.
The only one left, if you’re counting, was webmachine-ruby.
Setting up was relatively easy. To ramp up, I first migrated the original HTTP-based service to its resource structure. It has more of an OO flair than both Rails and Sinatra, with the caveat that it provides a lot less (by design). The dispatcher is easy to understand, and I quite enjoyed toying with the visual debugger.
Moving to a WebSocket, however, changes everything. As far as I can tell (and the documentation specifies) you completely skip over the regular infrastructure by providing a callable to a configuration option, as such:
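A minimal sketch of that configuration, assuming the Reel adapter and the `:websocket_handler` adapter option as the webmachine-ruby README describes them. `GREETER` and `build_app` are hypothetical names, and the gems are only loaded when `build_app` is called, so the handler itself can be tried with anything that responds to `<<`:

```ruby
# Any object that responds to #call can be the handler; it receives the socket.
GREETER = proc do |websocket|
  websocket << "hello, world"
end

# Hypothetical helper: builds an application that hands WebSocket
# connections to `handler`. Assumes the webmachine and reel gems are installed.
def build_app(handler)
  require "webmachine"

  Webmachine::Application.new do |app|
    app.configure do |config|
      config.adapter = :Reel
      config.adapter_options[:websocket_handler] = handler
    end
  end
end
```

With the gems available, `build_app(GREETER).run` would start the server; without them, `GREETER.call([])` is enough to see what the handler does.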
That is pretty much what the docs say. Since it only expects the handler to respond to #call, you can write your own ad-hoc dispatcher:
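One way such a dispatcher could look, as a sketch rather than anything the library prescribes. It assumes the socket exposes the request URL through a `url` method; `FakeSocket`, `SocketDispatcher` and `DISPATCHER` are hypothetical names, and the stand-in socket exists only so the example runs without any gems:

```ruby
# Stand-in for the real socket so the sketch runs on its own.
FakeSocket = Struct.new(:url, :sent, :closed) do
  def <<(message)
    sent << message
    self
  end

  def close
    self.closed = true
  end
end

# Ad-hoc dispatcher: picks a handler by the URL the client connected to.
class SocketDispatcher
  def initialize(routes)
    @routes = routes
  end

  # Called once per connection with the socket.
  def call(websocket)
    handler = @routes[websocket.url]
    # Unknown URL: close the connection instead of leaving it dangling.
    return websocket.close unless handler

    handler.call(websocket)
  end
end

DISPATCHER = SocketDispatcher.new({
  "/echo" => ->(ws) { ws << "echo ready" },
  "/time" => ->(ws) { ws << Time.now.to_s }
})
```

Since `DISPATCHER` responds to `#call`, it could be passed wherever a plain proc handler is expected.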
What the docs don’t address are some basics of socket programming. If you see your handler hang and never respond again, requiring you to restart, don’t fret: you just have to provide a loop to read from the socket and let Celluloid::IO do its non-blocking magic:
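A sketch of the shape of that loop, assuming the socket offers a blocking `read` that returns the next message. `ScriptedSocket` and `READ_LOOP` are hypothetical names; the stand-in socket returns `nil` once its script runs out, which is a convenience of this example and not a claim about how a real socket signals a closed connection:

```ruby
# Stand-in socket: #read hands out queued messages, then nil.
class ScriptedSocket
  attr_reader :sent

  def initialize(incoming)
    @incoming = incoming.dup
    @sent = []
  end

  def read
    @incoming.shift
  end

  def <<(message)
    @sent << message
    self
  end
end

# Keep reading for as long as the socket yields messages, answering
# each one, instead of returning after the first exchange.
READ_LOOP = proc do |websocket|
  while (message = websocket.read)
    websocket << message.upcase
  end
end
```

Under the real adapter the `read` call is where the handler would wait for the client, which is why the loop does not spin.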
Don’t worry: your CPU won’t be pegged at 100%, because non-blocking. You’ll be subjected, however, to the same limitations Node.js has regarding CPU usage and its event handlers (i.e., if a handler is CPU-intensive, it will affect throughput for everyone else).
Luckily, we have threads in Ruby. I decided to take advantage of that by assigning each client to a Celluloid Actor, which allows me to provide some of the CPU-intensive operations without compromising (at least not heavily) other Users. It has been working fine so far.
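To illustrate the idea of one worker per client, here is a sketch that swaps Celluloid for the standard library's `Thread` and `Queue`, so it is not the actor-based code described above. `ClientWorker` and its methods are hypothetical names:

```ruby
# One worker per client: a mailbox (Queue) drained by a dedicated thread.
class ClientWorker
  def initialize(&work)
    @mailbox = Queue.new
    @results = Queue.new
    @thread = Thread.new do
      # Queue#pop blocks until a job arrives; :stop ends the worker.
      while (job = @mailbox.pop) != :stop
        @results << work.call(job)
      end
    end
  end

  # Enqueue a job without blocking the caller.
  def post(job)
    @mailbox << job
  end

  # Block until the next result is ready.
  def take
    @results.pop
  end

  # Ask the worker to finish and wait for its thread to exit.
  def stop
    @mailbox << :stop
    @thread.join
  end
end

# Usage: the expensive block runs on the client's own thread.
worker = ClientWorker.new { |n| (1..n).reduce(:+) }
worker.post(10)
sum = worker.take
worker.stop
```

A slow job then only delays the client that asked for it, because the thread reading from the socket is free to return to its loop as soon as the job is posted.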
My solution doesn’t take into account non-WebSocket clients, but it should. webmachine-ruby makes it easy by allowing you to implement streaming APIs without much trouble, and I suppose it’ll only take a bit of JS to fall back from one to the other and provide an abstract connection to consumers.
The documentation also doesn’t go over all the events that can happen on the socket (onerror, onclose, onopen, onmessage). You can see them as methods on the socket, each taking a block, but for my use case I just let the actor crash and be done with it. If I’m missing some cleanup, please let me know.