Troika Tech

Decoding Larger JSON Objects in Elm 0.15

Elm is pretty cool. It’s a functional programming language with a focus on usability: strongly typed but unceremonious, with nice type inference, good documentation and great stewardship from its creator, Evan Czaplicki.

It’s so cool I’ve given an excited talk about it at work after only a couple of weeks of fiddling with it. And whenever I speak about tech, I try to add a demo or two to tie things together and make points clearer. That led to elm-forecast, a tiny app showing how to call APIs, decode JSON and display things on the screen.

What elm-forecast looks like

The problem

Dark Sky’s JSON API offers detailed weather information for most of the world. It has up-to-the-minute data for some locations, and powers a lot of nice weather apps, like forecast.io and Weather Timeline. My app was also going to be nice, so I picked it as my data source.

I started by wrapping the current forecast definition as a record:

type alias Forecast = { time : Int
                      , summary : String
                      , icon : String
                      , precipIntensity : Float
                      , precipProbability : Float
                      , temperature : Float
                      , windSpeed : Float
                      , windBearing : Float
                      , humidity : Float
                      , visibility : Float
                      , cloudCover : Float
                      , pressure : Float
                      , ozone : Float
                      }

Record types marry a lot of the feel of JavaScript objects with static types (the things after the colons).

If you’re familiar with dynamic languages, the next step will seem alien: instead of just calling something like JSON.parse(str) and referencing the result’s fields, we have to tell Elm how to make a typed Forecast out of the serialized data.

Let’s see what it looks like with a smaller object:

$ elm repl
Elm REPL 0.4.2 (Elm Platform 0.15.1)
  See usage examples at <https://github.com/elm-lang/elm-repl>
  Type :help for help, :exit to exit
> import Json.Decode as Json exposing ((:=))
> type alias Point = { x: Float, y: Float }
> serialized = "{\"x\": -43.123, \"y\": -22.321}"
"{\"x\": -43.123, \"y\": -22.321}" : String
> pointDecoder = Json.object2 \
|   Point \
|   ("x" := Json.float) \
|   ("y" := Json.float)
<function> : Json.Decode.Decoder Repl.Point
> Json.decodeString pointDecoder serialized
Ok { x = -43.123, y = -22.321 } : Result.Result String Repl.Point

The code above defines a type Point and a Json.Decode.Decoder pointDecoder, which takes care of deserializing an object with two fields (object2) and returning a Point. As you can see, no type annotations were written, yet Elm has inferred every single one of them.

Json.Decode has functions from object1 to object8, capable of building objects with one to eight fields. What to do with Forecast, which has 13? “Throw away five things, it’s just an example app you’re building to learn Elm”, thought the lazy author. Luckily, thirst for knowledge (and a little guilt) averted that course, and I relied on what little functional programming I know to almost get there using Json.Decode.andThen. Since almost is actually not quite, to Google I went. A thread with a recipe from Mr. Czaplicki himself offered the following solution:

import Json.Decode as Json

apply : Json.Decoder (a -> b) -> Json.Decoder a -> Json.Decoder b
apply func value =
    Json.object2 (<|) func value

Let’s see it in action:

> newPointDecoder = Json.map Point ("x" := Json.float) `apply` ("y" := Json.float)
<function> : Json.Decode.Decoder Repl.Point
> Json.decodeString newPointDecoder serialized
Ok { x = -43.123, y = -22.321 } : Result.Result String Repl.Point

With apply, you can chain as many decoders as you like and build objectN. But how does it work?

A detour to the world of Partial Application

In Elm, like in Haskell, every function is curried. What it means, in practice, is that every function takes a single argument and returns a value, which can in turn be another function, taking a single argument, returning a value, and so forth. I’ll define a function add that (oh, how impressive) adds two numbers:

> add x y = x + y
<function> : number -> number -> number
> add 2 3
5 : number

It looks like a function call with two arguments, like you see in most other languages. But look at the type signature the compiler inferred: add : number -> number -> number. What do the arrows represent? Well, they tell you exactly what the paragraph above tries to explain. Let’s see:

> add2 = add 2
<function> : number -> number
> add2 3
5 : number

When defining add2, I’ve partially applied add with 2, getting another function (now from number -> number). Calling that function will then result in a final number, the 5 literal that amazes people all over the world. This very characteristic helps you build apply.
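Ruby can approximate the same idea with Proc#curry — a rough analogue for illustration, not how Elm works internally:

```ruby
# Elm-style currying, roughly approximated in Ruby with Proc#curry.
add = ->(x, y) { x + y }.curry

add2 = add.call(2)      # partial application: still waiting for y
p add.call(2).call(3)   # => 5, same shape as `add 2 3` in Elm
p add2.call(3)          # => 5
```

Every call peels off one argument, exactly like the arrows in number -> number -> number suggest.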

In the example a few paragraphs above, Point is a function with the signature Float -> Float -> Point. That means that if I try to use it with a single decoder, it will move closer to getting an actual Point, but not get there yet:

> Json.map Point ("x" := Json.float)
<function> : Json.Decode.Decoder (Float -> Repl.Point)

Looking at the type signature, we have a decoder whose result is itself a function: after reading a Float for x, it still needs another Float to produce a Point. If I tried to do the same thing with a type constructor that took more arguments, say Float -> String -> Bool -> String -> Value, the first step would yield a Decoder (String -> Bool -> String -> Value) — solved for the first parameter, still waiting for a resolution for the next three.

What apply does then is leverage the fact that you can progressively get to your final value by applying function by function, taking care of spitting out every step as a Json.Decoder. There’s a name for this pattern of having a function in a box and applying it to values in other boxes: it’s an Applicative functor. Now, if you’ve read a bit about the language, you know that Elm shies away from the burden of a Math-y, Haskell-y lexicon. The great thing is that by hiding the words but showing things in practice, it ends up fostering an intuition in programmers for how the concepts can be useful.
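To ground the intuition outside Elm, here is a hypothetical Ruby sketch (all names — field, apply, make_point — are mine) that models a decoder as a lambda over a parsed hash and chains a curried constructor through an apply combinator, the way Elm’s object2 (<|) trick does:

```ruby
# A "decoder" here is just a lambda: parsed-JSON Hash -> value.
# field(name) reads one key; apply combines a decoder of a function
# with a decoder of its next argument -- the applicative pattern.
field = ->(name) { ->(hash) { hash.fetch(name) } }

apply = lambda do |func_dec, value_dec|
  ->(hash) { func_dec.call(hash).call(value_dec.call(hash)) }
end

Point = Struct.new(:x, :y)
make_point = ->(x, y) { Point.new(x, y) }.curry

# succeed make_point `apply` field "x" `apply` field "y", in Elm terms:
point_decoder = apply.call(
  apply.call(->(_) { make_point }, field.call("x")),
  field.call("y")
)

p point_decoder.call({ "x" => -43.123, "y" => -22.321 })
# => #<struct Point x=-43.123, y=-22.321>
```

Each apply step feeds one decoded value into the still-hungry curried constructor, so the chain extends to as many fields as you like.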

Let’s go back to Json.Decode.object2. It expects (a -> b -> c) -> Decoder a -> Decoder b -> Decoder c — a function from type a to b to c and Decoders for a and b, yielding a Decoder c. In our definition of pointDecoder at the beginning of this post, we matched that to a tee, as Point can be seen as a function taking two Floats (a and b) and returning a Point record (c). But a, b or c can also be a function! In fact, that’s exactly what we’ve seen above with Json.Decode.Decoder (Float -> Repl.Point). Thus, when we say:

> Json.object2 (<|) func value

and replace func with a Json.Decode.Decoder (Float -> Point) and value with ("y" := Json.float), we’ll end up with a Decoder built by applying what’s coming out of value to Float -> Point, arriving at Decoder Point. If we manually try to build the same chain, it looks like this:

> import Json.Decode as Json exposing ((:=), andThen)
> type alias Point = { x: Float, y: Float }
> partialDecoder = Json.succeed(Point) `andThen` \
    (\f -> ("x" := Json.float) `andThen` \
    (\x -> Json.succeed <| f x))
<function> : Json.Decode.Decoder (Float -> Repl.Point)
> decoderPoint = partialDecoder `andThen` \
    (\f -> ("y" := Json.float) `andThen` \
    (\y -> Json.succeed <| f y))
<function> : Json.Decode.Decoder Repl.Point

Cool, right? Now that you and I understand the technique, we can go back to the gif above and marvel at how poor my CSS skills are.

No magic

What I find the most refreshing as I dive into functional programming is that there’s (usually) no magic. If you start peeling away the layers, there are just functions brought together to perform amazing things. apply here is exactly that: the power of a few functions allowing you to convert arbitrarily large structures into a nice type Elm can understand. In a world of “factory this”, “IoC container that”, you can’t help but smile. And it REALLY REALLY REALLY improves your programming everywhere: I’m a fan of saying my Ruby is much better and more maintainable after I decided to learn the functional ways, because it’s true. Hopefully you can find the same joy.

Ruby Is a Friend

As time goes by, Ruby moves closer and closer to the “boring tech” bin: it’s tried, true and trite (at least by Hacker News standards). And to be completely honest, I’ve often been taken by that same sentiment. The awareness of the privilege to be working with something that remains foreign to most of the Brazilian market has been replaced by a feeling of “not cool enough”, an anxiety for something as wonderful to happen again.

Analyzing things more carefully, however, I’ve realized how much it has allowed me to change as a coder. In Ruby, it’s very easy to go from “it’s like this in language x” to “it’s like this in Ruby, too.” The “Belle marquise” quality of expression it allows is a source of many joys, and can be quite invigorating as the relationship between it and the programmer develops.

We often feel pressured, in our field, to jump from thing to thing, from shiny to shiny, from silver bullet to silver bullet. “The best tool for the job cannot possibly be the tool you had five years ago” — chants the crowd as they try a new JavaScript build system — “it just isn’t right.” Reflecting on the role Ruby has had in my life, though, shines light on an aspect we rarely explore: a programming language is a friend. It’s not perfect, it can annoy you, but it’s there, and by merely being there and helping you think it makes you better.

Ruby is a great friend. We’ve been together for almost ten years. I can only hope I’ve been making it better too.

Maybe Haskell

The programming world is one of trends and fashions. One week you’re on the top of the world for using that NoSQL database, and then you’re very wrong the next; one day it’s all about Rails, the next it’s node.js, now it’s Go. Using Hacker News as a compass seemingly means discarding everything you’re doing now to follow the next big thing.

Like fashion, though, sometimes one of those new things is actually well-rounded, makes a mark and becomes permanent. Also like fashion, the new thing might be an old thing that people are rediscovering or just now ready to adopt. Judging by what’s on the specialized news, Functional Programming is the old-new rage that’s changing the world and is here to stay.

It’s no wonder: more enlightened programmers and language designers have sprinkled some of the joys of that paradigm upon our OO tools, making us giggle with happiness when chaining maps and injects and taking blocks to change a method’s behavior, or using anonymous and higher-order functions in tired and uncool languages of yesteryear, feeling more productive all the way. It’s so transformative to think in pipelines and in functional composition that we end up wanting to know how to learn more and feel even better. Functional Programming called, and you answered.

But lo! What is a catamorphism? What the hell is a Category, and why does it need a theory? Why did someone put a Monad in my burrito? Is Functor just a funny word Erik Meijer says?

Let’s face it: it can be daunting. None of the usual landmarks of what we call programming are there to guide you through learning, and it’s easy to feel inadequate and, dare I say it, intellectually inferior. Fear not: Pat Brisbin knows this, and is here to help.

The Book

“Maybe Haskell”, written by the aforementioned Mr. Brisbin, is a book of the short and sweet kind. It quickly acknowledges that it probably will not be the definitive guide on any of the subjects it talks about, and moves right on to the material.

Right out of the gate, the author explains referential transparency and uses equational reasoning to show you how a name for an expression (let’s say add(x, y)) can safely be replaced by the expression itself (x + y). This will become a tool later on to clarify that what seems so elaborate is actually pretty straightforward. It’s very effective, because it unfolds everything that looks so terse and codified into its components, and those components into their components, working as both a calming device (“see how simple it is? It’s just about context”) and an illustration of the power of function composition.
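In Ruby terms, that substitution looks like this (a toy illustration of mine, not an example from the book):

```ruby
# Referential transparency: a call to a pure function can be replaced
# by the function's body without changing the program's meaning.
add = ->(x, y) { x + y }

left  = add.call(2, 3) * add.call(2, 3)
right = (2 + 3) * (2 + 3)   # the definition, substituted in place

p left == right  # => true
```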

It then gets to its main thread: what is the Maybe type and how is it built? What does it mean to adopt Maybe in a code base, and how do you deal with it without having every function in your system taking and returning other Maybes? Even if you’ve heard of or applied Maybe, it might give you ideas and reveal unknown subtleties — especially if all you’ve learned about it has been self-directed.

From there on, you’ll hit three head-scratchers in sequence: Functors, Applicatives and Monads. It begins by showing you ways of not infecting your code with Maybe everywhere and ends with calling functions that take multiple Maybes, dealing with the possibility of one or more of them not being there. The path is full of little joys and insights to savor, and you’ll get what a Monad does by the end of it (even if the answer to what it is goes through endofunctors and other details).

What I greatly enjoyed was the “Other Types” section. It’s brief, but tackles how you can use the same building blocks to improve designs and make errors and side-effects more explicit. While I knew most of the benefits and the material, I thought about the complete novice and how that section could spark new ideas. I didn’t “ooh” and “aah” because it was new to me: I did because it will hook a lot of casually interested people who perhaps got the book because it came from someone in the Ruby world and aren’t very invested in the ideas of FP yet. It will definitely make Ruby, if nothing else, better.

In the end, even if “Maybe Haskell” just explains enough of what the language can do to support the examples, the quiet grandeur and lack of pretense of the language become very evident. As the text progresses, you’ll see the ivory tower where FP wizards live for what it really is: a building that begins on the same ground you and I step on, made out of very solid and simple materials. Luckily we have Pat gently guiding us to that conclusion.

ENSIME and Emacs as a Scala IDE

“Maybe Emacs is not enough.”

That popped up in my mind, and it scared me. I knew Scala was a different beast; I knew there was probably a lot I was missing out on by using my tried-and-true workflows; I knew that IntelliJ was supposed to be amazing. Still, thinking Emacs-the-almighty was not enough frightened me.

When I started on this slow path towards learning FP, I had been using dynamic languages almost exclusively for almost 14 years, with a short stop in C++-land for a couple of them. I was used to a world of mostly ok tools centered on a REPL, and it was fine — programming is more thinking than typing and clicking, that whole spiel. But I had never really done anything in a good type system, and, frankly, it was time I knew how the rest of the world leveraged their tools in order to work more comfortably and effectively.

With that in mind, I evaluated the Typesafe IDE and IntelliJ IDEA 12 and 13, finding a lot of good in both tools (and a few problems, some discussed in my post about the Reactive Programming course). Still, after a few good days with each option, I was tempted to just go back to Emacs and rely on my memory (and Dash) for the API signatures, do all my code refactorings by hand and use the sbt console for quick explorations.

Then I found out I could have the cake and eat it too.

Enter ENSIME

ENSIME (ENhanced Scala Interaction Mode for Emacs) is a project that gives Emacs IDE-like capabilities. It performs type and error-checking as you write code, provides symbol inspection, facilities for browsing your codebase and performing automated refactorings. It accomplishes all that using the Scala Presentation Compiler, a lightweight version of the infrastructure that goes only as far as needed to resolve types, find errors and do semantic highlighting.

Setting it up is super simple. Using MELPA, install the ensime package. Then add the following to your Emacs config:

(require 'ensime)
(add-hook 'scala-mode-hook 'ensime-scala-mode-hook)

Then add the plugin to your global sbt config (e.g. ~/.sbt/0.13/plugins/plugins.sbt):

resolvers += Resolver.sonatypeRepo("snapshots")

addSbtPlugin("org.ensime" % "ensime-sbt" % "0.1.5-SNAPSHOT")

And then, in your project directory, run sbt gen-ensime (requires sbt >= 0.13.5). It will resolve the dependencies, install the ENSIME plugin and leave everything ready to go.

Now, when you open a buffer, you’re gonna see the following in your mode line:

Use M-x ensime to start a connection. It might take a few seconds for it to do what it must to analyze your project, but you’ll eventually see the mode line change to show it’s ready to work.

Code completion

One of the cool things ENSIME provides is real code completion, based on the type you’re dealing with. Instead of the usual M-/ cycling, you can explore an API by looking at the method signatures and documentation. Here’s the thing in action:

Type inspection

Sometimes Scala’s type inference engine gets confused, giving you something too broad or too narrow for your needs; other times, you just want to know the type of a val. Worry not: ENSIME can tell you what has been inferred just by putting the cursor over the token you want and pressing C-c C-v i (works a bit like :t in ghci).

You can also show uses of a symbol by pressing C-c C-v r.

Automated Refactorings

ENSIME offers six simple, but extremely useful automated refactorings:

  • Inline Local
  • Extract Local
  • Extract Method
  • Rename
  • Organize Imports
  • Import Type at Point

Of all of these, Import Type at Point is the only one I’d consider flaky. It resolves the type perfectly, but inserts the import statement inline. I don’t know if that’s configurable. Otherwise, it works as many other automated tools: finds each change, shows you the substitution, asks you to ok it.

Navigation

You can use M-. and M-*, normally associated with finding tags, to move inside your project.

You can also jump from implementation to test, and vice versa.

scala and sbt integration

If you press C-c C-v s, an sbt console will be launched. A lot of my usual Ruby workflow of running specs from keybinds and jumping quickly to the REPL can be reproduced with this feature.

For instance, when you want to run all tests, you press C-c C-b T. When you wish only to invoke testQuick, you use C-c C-b t.

There are keybinds for changing a region or a buffer, too — useful both for playing with code and exercising your Emacs gymnastics.

Finally

ENSIME has been fun to work with. It allows me to focus on code and work comfortably with my (admittedly small) projects. It’s a great showcase of Emacs capabilities, and has led a couple of hardcore vim-using friends to show admiration.

If you’re doing Scala and don’t want to commit to an IDE, but wish to have more of a modern environment, please try ENSIME. I even hear there are vim and jEdit clients.

Functional Programming in Ruby

On June 21st, 2014, I gave a talk about Functional Programming in Ruby in one of RubyOnRio’s monthly meetings. I decided to do a short overview of some concepts and techniques in the FP world and then go over Gary Bernhardt’s “Functional Core, Imperative Shell” — after all, it would be a hard sell if I couldn’t show a way said techniques make your everyday coding better.

It didn’t get recorded this time, but I thought the slides could be interesting. Here they are.

Property-based Testing in Ruby

For the past year or so I have slowly been dipping my feet into the vast functional programming seas. From taking the awesome Coursera offerings from Typesafe to slowly working through Rúnar Bjarnason’s and Paul Chiusano’s Functional Programming in Scala, my mind has been expanding proportionally to the time I dedicate to learning its ways. It has been incredibly rewarding and humbling.

One such reward has been coming into direct touch with property-based testing. This technique, first developed by QuickCheck in Haskell-land, turns automated testing on its head: instead of codifying what is proper behavior by asserting that the outputs for given inputs match what is expected, the tester establishes logical properties about what should happen and lets the tool generate loads of inputs to check if they hold. If something goes wrong, the tool will then try to find the smallest test input that breaks the property (falsifies it), a process called shrinking; if it can’t find anything, you can sigh with relief and think about what to scrutinize next.
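To get a feel for the mechanics before reaching for a real tool, here’s a hand-rolled sketch (names and generator choices are mine, and real QuickCheck-alikes are far more sophisticated): generate random inputs, test the property, and shrink a failure by greedily dropping elements while it still fails.

```ruby
# Minimal property-based check, for illustration only.
# Returns nil if the property held, or a shrunk counterexample.
def check_property(runs: 100)
  runs.times do
    input = Array.new(rand(0..20)) { rand(-1000..1000) }
    next if yield(input)
    # Shrink: drop one element at a time while the property still fails.
    shrunk = input.dup
    loop do
      smaller = shrunk.each_index.map { |i| shrunk[0...i] + shrunk[i + 1..] }
                      .find { |cand| !yield(cand) }
      break unless smaller
      shrunk = smaller
    end
    return shrunk # smallest counterexample this greedy search found
  end
  nil
end

# Holds: reversing twice restores the original list.
p check_property { |xs| xs.reverse.reverse == xs }  # => nil

# Falsified: sorting is not a no-op; shrinks toward a tiny failing case.
p check_property { |xs| xs.sort == xs }
```

The second check prints something like a two-element out-of-order array — the smallest input that still falsifies the property, which is exactly what makes the failure easy to reason about.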

Having a QuickCheck-like tool at your disposal can be incredibly powerful. The more complex the software or the algorithm, the greater the likelihood of your carefully curated unit and integration tests having blind spots. Basho, for instance, have written about the stark realization that their worker pool library was full of subtle bugs by using QuickCheck for Erlang, and you can find other instances of how the technique helped make better software.

I don’t know about you, but when I come in contact with stuff like that I immediately think of how improved parts of my day job would be if I just could apply it. Considering that my daily duties are conducted in Ruby, I felt it was time I explored the subject in that realm.

A contrived setup that hopefully shows how it can work out

Let’s say we’ve decided to implement our own linked list class in Ruby. We would probably start our implementation with something like this:

“List”
require "singleton"

class Nil
  include Singleton

  def empty?; true; end

  def to_s
    "Nil"
  end
end

class Cons
  def initialize(head, tail = Nil.instance)
    @head = head
    @tail = tail
  end

  attr_reader :head, :tail

  def empty?; false; end

  def to_s
    "(#{head} . #{tail.to_s})"
  end
end

Using that very convenient API, we can build lists:

>> l = Cons.new(1, Cons.new(2, Cons.new(3, Nil.instance)))
>> l.to_s # => "(1 . (2 . (3 . Nil)))"

We know that, in a linked list, adding to the head is O(1), while appending to the end is O(n). So we build algorithms that respect its efficiency guarantees. However, when we, say, map this list into another list, it results in the following situation:

“List”
def do_something_amazing(list, acc = Nil.instance)
  super_value = list.head * 1337
  if list.tail.empty?
    Cons.new(super_value, acc)
  else
    do_something_amazing(list.tail, Cons.new(super_value, acc))
  end
end

>> do_something_amazing(l).to_s # => "(4011 . (2674 . (1337 . Nil)))"

Processing things from head to tail means the list ends up reversed. It’s common, then, to reverse it back when we’re done processing, to preserve the order an external user would expect. Let’s add a reverse method to a List helper module:

“List”
module List
  def self.reverse(list, acc = Nil.instance)
    if list.empty?
      acc
    else
      reverse(list.tail, Cons.new(list.head, acc))
    end
  end
end

So when we try to reverse what was created in do_something_amazing, we get what we need:

List.reverse(do_something_amazing(l)).to_s # => "(1337 . (2674 . (4011 . Nil)))"

Awesome. I think this is enough for us to start exploring properties. If you’re getting bored, take a sip of coffee and come back when you’re ready. There are a few cool tricks below the fold.

Testing the old way

Being the good developers we are, we are covering that code with tests:

“List Tests”
class ListTest < MiniTest::Test
  def test_reversing_lists
    assert_equal "(3 . (2 . (1 . Nil)))",
      List.reverse(Cons.new(1, Cons.new(2, Cons.new(3)))).to_s
    assert_equal "(9 . (400 . (321 . (1 . (10 . Nil)))))",
      List.reverse(Cons.new(10, Cons.new(1, Cons.new(321, Cons.new(400, Cons.new(9)))))).to_s
    assert_equal "Nil", List.reverse(Nil.instance).to_s
    assert_equal "(1 . Nil)", List.reverse(Cons.new(1)).to_s
  end
end

We’re pretty confident that’s enough, even if it was kind of boring to do manually. That amount of testing would let us go home and sleep soundly.

Testing the QuickCheck way

First, we’ll need something like QuickCheck in Ruby. The best, most idiomatic, most maintained, least-Monad-y thing I have found is Rantly. It has both primitive value generation built-in and property testing with shrinking. We’ll skip over the basic API and go straight to defining a property to check if our algorithm is really bullet-proof. To aid in the creation of lists from Arrays, we’ll add a new helper:

“List”
module List
  # ...
  def self.from_values(*values)
    values.reverse.inject(Nil.instance) { |ls, v| Cons.new(v, ls) }
  end
end

To check that it works, let’s change the existing tests and see if they still pass:

“List Tests”
class ListTest < MiniTest::Test
  def test_reversing_lists
    assert_equal "(3 . (2 . (1 . Nil)))",
      List.reverse(List.from_values(1, 2, 3)).to_s
    assert_equal "(9 . (400 . (321 . (1 . (10 . Nil)))))",
      List.reverse(List.from_values(10, 1, 321, 400, 9)).to_s
    assert_equal "Nil", List.reverse(Nil.instance).to_s
    assert_equal "(1 . Nil)", List.reverse(List.from_values(1)).to_s
  end
end
“List Tests”
Run options: --seed 48889

# Running:

.

Finished in 0.001256s, 796.1783 runs/s, 3184.7134 assertions/s.

1 runs, 4 assertions, 0 failures, 0 errors, 0 skips

Great. Now to the newfangled thing. As I mentioned before, writing a property to check requires us to think differently than we would with regular unit tests. Your formulation should state something logical, something that does not rely on specific inputs. Following that guideline, we can reason about reversing lists in the following manner:

“List Tests”
  # ...
  def test_reversing_by_property
    property {
      length = range(0, 1_000_000)
      List.from_values(*array(length) { integer })
    }.check { |list|
      assert_equal list.to_s, List.reverse(List.reverse(list)).to_s
    }
  end
  # ...

The meat is in the check block. Determining that a list has been reversed correctly requires us to check that reversing it again gets us back to the original list. To seed our check, we build a property block that creates an array with a random length between 0 and 1_000_000, itself filled with random integers. Let’s run the tests again:

“List Tests”
$ bundle exec ruby list.rb
Run options: --seed 17130

# Running:

.
..........
success: 100 tests
.

Finished in 121.969127s, 0.0164 runs/s, 0.8527 assertions/s.

2 runs, 104 assertions, 0 failures, 0 errors, 0 skips

It took a while (we wanted to be thorough, with those million-item arrays), but we’re pretty sure it works. I’m a believer and I’m stoked; when I look at you, however, I see a face that says “look, it’s cool and all, but isn’t it kind of worthless? The tests we had were telling us the same thing, and we only needed the power of our minds to generate the correct inputs. Why go through so much trouble?”

Well, what about those times when our minds fail us?

Catching a bug with Rantly

Let’s say you’re excited about building your own data structures and want to wrap that linked list inside a very inefficient Set. You mutter to yourself that you should make sure items are not inserted twice, which for now seems to be the main difference between Sets and Lists as storage containers.

You build a little more structure into what you already have, adding a prepend method and inlining reverse into a List base class:

“List 2”
class List
  def to_s
    raise "Don't use this directly, fool"
  end

  def empty?; true; end

  def prepend(v)
    Cons.new(v, self)
  end

  def reverse(acc = Nil.instance)
    if empty?
      acc
    else
      tail.reverse(Cons.new(head, acc))
    end
  end

  def self.from_values(*values)
    values.reverse.inject(Nil.instance) { |ls, v| Cons.new(v, ls) }
  end
end

class Nil < List
  include Singleton

  def to_s
    "Nil"
  end
end

class Cons < List
  def initialize(head, tail = Nil.instance)
    @head = head
    @tail = tail
  end

  attr_reader :head, :tail

  def empty?; false; end

  def to_s
    "(#{head} . #{tail.to_s})"
  end
end

To check if an item exists, you add a contains? method:

“List with contains”
class List
  # ..
  def contains?(v); false; end
  # ..
end

class Cons < List
  # ..
  def contains?(v)
    head == v || tail.contains?(v)
  end
  # ..
end

Then you write your immutable Set and matching tests:

“A dumb set implementation”
class DumbSet
  def initialize(storage = Nil.instance)
    @storage = storage
  end

  attr_reader :storage
  private     :storage

  def push(v)
    if !storage.contains?(v)
      DumbSet.new(storage.prepend(v))
    else
      self
    end
  end
  alias_method :<<, :push

  def contains?(v)
    storage.contains?(v)
  end

  def to_a
    values = []
    list   = storage
    until list.empty?
      values << list.head
      list = list.tail
    end
    values
  end
end

class DumbSetTest < Minitest::Test
  def setup
    @s = (((DumbSet.new << 1) << 2) << 3)
  end

  attr_reader :s

  def test_contains
    assert s.contains?(3)
    assert s.contains?(2)
    assert s.contains?(1)
    assert !s.contains?(4)
  end

  def test_uniqueness
    assert_equal [-32, 1, 2, 3], (s << -32 << -32 << -32).to_a.sort
  end
end

And because I spotted you writing new code and yelled “HEY USE RANTLY IT’S SO COOL YIPEE”, you add some property tests:

class DumbSetTest < Minitest::Test
  # ...
  def test_contains_property
    property {
      array(range(0, 100)) { integer }
    }.check { |vs|
      s = vs.inject(DumbSet.new) { |ds, v| ds << v }
      assert vs.all? { |v| s.contains?(v) }
    }
  end

  def test_uniqueness_property
    property {
      array(range(0, 100)) { integer }
    }.check { |vs|
      ns = vs.inject(DumbSet.new) { |ds, v| ds << v }
      rs = vs.inject(ns) { |ds, v| ds << v }
      assert_equal vs.uniq.sort, rs.to_a.sort
    }
  end
  end
  # ...
end

It looks good:

$ bundle exec ruby set_test.rb
Run options: --seed 15625

# Running:


..........
success: 100 tests
..
..........
success: 100 tests
..

Finished in 0.119717s, 33.4121 runs/s, 1720.7247 assertions/s.

4 runs, 206 assertions, 0 failures, 0 errors, 0 skips

You then implement the removal of items:

“Set with delete”
class DumbSet
  # ...
  def delete(v)
    ls = storage
    ns = DumbSet.new

    while !ls.empty?
      if ls.head != v
        ns = ns << ls.head
      end

      ls = ls.tail
    end

    ns
  end
  # ...
end

class DumbSetTest < Minitest::Test
  # ...
  def test_delete
    os = (((DumbSet.new << 1) << 2) << 3)
    ns = os.delete(1337)
    assert_equal [1, 2, 3], ns.to_a.sort
    ns = os.delete(3)
    assert_equal [1, 2], ns.to_a.sort
    ns = ns.delete(2)
    assert_equal [1], ns.to_a.sort
    ns = (ns << 432).delete(1)
    assert_equal [432], ns.to_a.sort
    ns = ns.delete(432)
    assert_equal [], ns.to_a.sort
  end
  # ...
end

Your tests pass, but this time you don’t listen to me about adding another property. You’re just not that convinced they’re worth the trouble, and the code looks good enough to ship with all the tests you’ve added. The Pokémon Collecting app you work on can benefit from it right now, instead of 20 minutes from now. To production it goes.

Time goes by, and you’ve forgotten about me and our little adventure. Your system is humming along and has moved on to maintenance mode. Days have been kind of slow, so you decide to add an optimization you’ve read about on Hacker News, in a post detailing how a node.js program got a 10x speedup. You modify your delete method accordingly:

“Set Tests”
  # ...
  def delete(v)
    ls  = storage
    tmp = DumbSet.new

    while !ls.empty?
      if (ls.head != v) && (ls.head < 1500) # secret performance trick
        tmp = tmp << ls.head
      end

      ls = ls.tail
    end

    tmp
  end
  # ...

CI still reports all green.

A few days later, you receive a report from a User: she deleted her Pokémon with power level 3, and her Pokémon with levels 4013, 1551 and 20000 disappeared along with it. Your first line of defense, your tests, caught nothing. Sweating bullets and drowning in emails from stakeholders and other Pokémon fiends, you’re about to collapse.

And then you remember: what about trying to express a property to see if it holds?

  # We'll add at most 10 unique items and then delete the first
  # 2. If there's anything wrong, this will blow up.
  def test_delete_property
    property {
      array(10) { range(0, 5000) }.uniq
    }.check { |values|
      os = values.inject(DumbSet.new) { |s, v| s << v }
      ds = values[0..1].inject(os) { |s, v| s.delete(v) }
      assert_equal (values.size - 2), ds.to_a.size
    }
  end

You run it and it explodes:

$ bundle exec ruby set_test.rb
Run options: --seed 46455

# Running:


..........
success: 100 tests
..
failure: 0 tests, on:
[384, 437, 120, 718, 1850, 4579, 3178, 4191, 533, 2669]
F
..........
success: 100 tests
...

Finished in 0.093858s, 63.9264 runs/s, 2248.0769 assertions/s.

  1) Failure:
DumbSetTest#test_delete_property [set_test.rb:69]:
Expected: 8
  Actual: 2

6 runs, 211 assertions, 1 failures, 0 errors, 0 skips

What? How come you got 2 when you expected 8? Well, there must be something wrong with delete after all. Let’s take that array into a pry session and see what happens:

[1] pry(main)> values = [384, 437, 120, 718, 1850, 4579, 3178, 4191, 533, 2669]
=> [384, 437, 120, 718, 1850, 4579, 3178, 4191, 533, 2669]
[2] pry(main)> os = values.inject(DumbSet.new) { |s, v| s << v }
=> #<DumbSet...>
[3] pry(main)> values[0..1].inject(os) { |s, v| s.delete(v) }.to_a
=> [718, 533]

Wait a minute! Since when should delete also remove everything over 1500? Is there anything in the code that stipulates such a thing? Maybe that node.js optimization was not so great after all. Let’s remove it and run the tests:

$ bundle exec ruby set_test.rb
Run options: --seed 2727

# Running:

.
..........
success: 100 tests
..
..........
success: 100 tests
.
..........
success: 100 tests
..

Finished in 0.099329s, 60.4053 runs/s, 3120.9415 assertions/s.

6 runs, 310 assertions, 0 failures, 0 errors, 0 skips

Voilà: properties have saved the day, and you’ve learned not to trust Hacker News bravado ever again.
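To spell out the repaired behavior: delete simply drops the guard and keeps every stored element except the one being removed. Here it is on a minimal, array-backed stand-in (TinySet is a hypothetical name of mine; the post’s DumbSet uses a head/tail list, but the semantics match):

```ruby
# TinySet is a hypothetical, array-backed stand-in for the post's
# DumbSet. delete keeps every stored item except v, and nothing else.
class TinySet
  def initialize(items = [])
    @items = items
  end

  def <<(v)
    @items.include?(v) ? self : TinySet.new(@items + [v])
  end

  def delete(v)
    TinySet.new(@items.reject { |x| x == v })
  end

  def to_a
    @items.dup
  end
end

s = [384, 437, 1850, 4579].inject(TinySet.new) { |acc, v| acc << v }
p s.delete(384).to_a.sort  # => [437, 1850, 4579], big values survive
```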

Is using Rantly the same as using QuickCheck or ScalaCheck?

Sort of. For one, you have to write your own generators every time you want something other than basic types, while QuickCheck and ScalaCheck can derive a lot by themselves. That automatic derivation makes expressing what you mean a lot easier, and you spend less time debugging your property blocks in search of mistakes. That said, writing a generator for your own types only requires instantiating them inside the property block with other auto-generated values.
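Rantly’s DSL aside, the underlying idea is small enough to sketch by hand in plain Ruby. Point, gen_point and check_property below are illustrative names of my own, not Rantly’s API: a generator is just code that assembles your type from random parts, and a property is a predicate checked over many generated values.

```ruby
# All names here are illustrative, not Rantly's API: a generator for
# your own type is just code that builds it from random primitives.
Point = Struct.new(:x, :y)

def gen_point(rng)
  Point.new(rng.rand(-100..100), rng.rand(-100..100))
end

# A tiny property runner: generate, check, report the counterexample.
def check_property(trials: 100, seed: 1)
  rng = Random.new(seed)
  trials.times do
    value = gen_point(rng)
    raise "property failed for #{value.inspect}" unless yield value
  end
  true
end

ok = check_property { |p| p.x.abs + p.y.abs >= 0 }
puts ok  # prints "true": Manhattan distance is never negative
```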

Shrinking is also not as good in Rantly. It works OK a lot of the time, but it could be improved. On the surface, from skimming the algorithms used in ScalaCheck and Rantly, they don’t seem that different, but on ScalaCheck’s side of the fence the patterns in minimized counterexamples seem easier to spot.

There’s also no mechanism to test stateful code. ScalaCheck has Commands to help in modeling state changes, and I’m aware PropEr and QuickCheck for Erlang also provide something in that direction.
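To give a flavor of what such a mechanism buys you, here is a hand-rolled, dependency-free sketch: feed random command sequences to both the real object and a trivial model, and assert that every observation agrees. ShakyStack and check_stateful are hypothetical names of mine; ScalaCheck’s Commands and PropEr’s statem add shrinking and parallel execution on top of this idea.

```ruby
# Model-based check by hand: run random push/pop sequences against a
# home-grown stack and an Array model, comparing observations each step.
class ShakyStack
  def initialize; @a = []; end
  def push(v); @a.push(v); self; end
  def pop; @a.pop; end
  def size; @a.size; end
end

def check_stateful(trials: 50, ops: 30, seed: 7)
  rng = Random.new(seed)
  trials.times do
    real, model = ShakyStack.new, []
    ops.times do
      if rng.rand(2).zero?
        v = rng.rand(100)
        real.push(v); model.push(v)
      else
        raise "pop mismatch" unless real.pop == model.pop
      end
      raise "size mismatch" unless real.size == model.size
    end
  end
  true
end

puts check_stateful  # prints "true": the stack agrees with its model
```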

One minor gripe is that integration with RSpec and MiniTest could be improved: Rantly’s output pollutes the test run, and on large suites it becomes hard to tell which failed property belongs to which failed test. It should be easy to fix for anyone motivated. On that note, there’s no ready-made extension for MiniTest (although adding one is trivial enough that I’m sending a PR to fix it).

Final considerations

I hope I have proven, even if with a craptastic example, that property-testing can aid you in writing better Ruby code. Our community is great when it comes to using tests both as a design and as a verification tool, and QuickCheck (via Rantly) is a new way of thinking about them. You should keep your TDD/BDD to carve out your objects and responsibilities, but also add property checks where suitable to strengthen your confidence in the system.

(PT-BR) A Talk on Celluloid at RubyOnRio

On March 15, 2014, I took advantage of the RubyOnRio meetup to talk about Celluloid, an implementation of the Actor Model for Ruby. Since reactive programming, concurrency, parallelism and the like have occupied my mind over the past few months, I thought it would be good to talk with the group about how relevant these topics are to the future of software development and to Ruby itself, constantly under threat (if Hacker News is any measure) of being buried by some technology more in tune with the times.

The idea of my presentation was to brush over the classic problems of threading, explain the ideas behind the Actor Model en passant, and finally close with how Celluloid applies them. Of course some vital information was left out, the superficiality can be criticized, and the speaker is a bit clumsy, but I hope that on the whole the result was pleasing.

Finally, I presented a little project I built especially for the meetup, Balladina. It was fun to write, and I learned quite a lot about Celluloid in the process. I was also able to contrast a few things with the limited experience I had with Akka in the Coursera course, and Celluloid’s maturity relative to Akka’s left me hopeful for a nice future in Ruby. Let’s hope for the best.

WebSockets in Ruby

At my main job, we have a large data structure that takes considerable CPU time to build but remains unchanged thereafter. Its job is to geocode positions to and from a local reference system, which in turn gives us the ability to pin records to, for instance, a place on a Road, and to know which coordinate pair a local reference corresponds to.

For the first pass, I built the Ruby library for the geocoding and a simple (Sinatra-based) webservice. This worked fine for a while, until the Client required that every mouse move perform a conversion. That change prompted me to build the same geocoding infrastructure again in JavaScript, and all were happy for a while.

As it usually goes, a new decision was made: support multiple Roads per User. A download of 800KB of data (stored in an IndexedDB for later sessions) was tolerable; potentially multiple megabytes would be deadly, even though the software could be used before the download finished. That constant feedback of conversions had simply become one of those features Users hold on to.

I knew we had to go for a solution that kept that feature intact and made the whole thing manageable. I had dabbled in WebSockets before (with node.js and Socket.IO) and kind of knew the lay of the land. Still, from previous searches I was also aware there was a dearth of Ruby solutions, and for a moment I considered running my JavaScript port on node. The thought gave me shivers.

The contenders

The first step was finding out what could be used. This is what I evaluated:

The first three are EventMachine-based; tubesock uses rack hijacking; webmachine-ruby provides WebSockets via Reel, a Celluloid::IO-based HTTP server.

At first, considering I was already using Sinatra, I tried sinatra-websocket. For some reason I just couldn’t get the connection to be upgraded to a WebSocket, and decided to move on quickly. faye-websocket I just skipped, to be frank.

The next two suffered from the same problem: after booting Rails and loading the structure, I was left with only enough memory for a couple dozen clients on a small Heroku dyno. Also, Rails’ boot time coupled with building the structure occasionally made Heroku think something had gone wrong, and the process often crashed before the service came up.

The only one left, if you’re counting, was webmachine-ruby.

webmachine-ruby

Setting up was relatively easy. To ramp up, I first migrated the original HTTP-based service to webmachine’s resource structure. It has more of an OO flavor than either Rails or Sinatra, with the caveat that it provides a lot less (by design). The dispatcher is easy to understand, and I quite enjoyed toying with the visual debugger.

Moving to WebSockets, however, changes everything. As far as I can tell (and the documentation confirms), you skip the regular resource infrastructure entirely by handing a callable to a configuration option, like so:

App = Webmachine::Application do |app|
  app.configure do |config|
    config.adapter = :Reel
    config.adapter_options[:websocket_handler] = proc do |websocket|
      websocket << "hello, world"
    end
  end
end

That is pretty much all the docs say. Since the adapter only expects the handler to respond to #call, you can write your own ad hoc dispatcher:

class WebsocketHandler
  def call(websocket)
    message = websocket.read
    # do something with the message, call methods on other objects, log stuff, have your fun
  end
end

What the docs don’t address are some basics of socket programming. If you see your handler hang and never respond again, forcing you to restart, don’t fret: you just have to read from the socket in a loop and let Celluloid::IO do its non-blocking magic:

class WebsocketHandler
  def call(websocket)
    loop do
      message = websocket.read
      # do something with the message, call methods on other objects, log stuff, have your fun
    end
  end
end

Don’t worry: your CPU won’t be pegged at 100%, because non-blocking. You’ll be subject, however, to the same limitations node has regarding CPU usage in event handlers (i.e., CPU-intensive work in a handler hurts throughput for everyone).

Luckily, we have threads in Ruby. I took advantage of that by assigning each client to a Celluloid actor, which lets me run the CPU-intensive operations without compromising (at least not heavily) service to other Users. It has been working fine so far.
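The shape of that arrangement can be sketched without Celluloid, using a plain thread and a mailbox per client. ClientWorker below is a hypothetical, dependency-free stand-in of mine; Celluloid layers supervision, proxies and crash handling over roughly this pattern.

```ruby
# Hypothetical sketch of the actor-per-client idea: each client gets a
# thread with its own mailbox, so CPU-heavy work for one client does
# not block replies to the others.
class ClientWorker
  def initialize(&handler)
    @inbox   = Queue.new
    @results = Queue.new
    @thread  = Thread.new do
      while (msg = @inbox.pop) != :stop
        @results << handler.call(msg)
      end
    end
  end

  def tell(msg)
    @inbox << msg
  end

  def result
    @results.pop
  end

  def stop
    @inbox << :stop
    @thread.join
  end
end

slow = ClientWorker.new { |m| sleep 0.05; "slow:#{m}" }
fast = ClientWorker.new { |m| "fast:#{m}" }
slow.tell(1)
fast.tell(1)
puts fast.result  # the fast client is served immediately
puts slow.result
slow.stop
fast.stop
```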

What’s missing

My solution doesn’t account for non-WebSocket clients, but it should. webmachine-ruby makes that easy by letting you implement streaming APIs without much trouble, and I suppose it would only take a bit of JS to fall back from one to the other and present an abstract connection to consumers.

The documentation also doesn’t go over the events that can fire on the socket (onerror, onclose, onopen, onmessage). They are available as methods on the socket, each taking a block, but for my use case I just let the actor crash and be done with it. If I’m missing some cleanup, please let me know.

This architecture also doesn’t provide a ready-baked pub/sub system, with channels and message brokers. If that’s more in the spirit of what you need, check out faye and websocket-rails.

“Principles of Reactive Programming”, a Review

After the positive experience of learning some Scala and some functional patterns in Functional Programming Principles in Scala, I was excited to take the new course from Martin Odersky and co. The fact that it would pick up where the first course left off and build on the Reactive brouhaha was icing on the cake; after all, the techniques and technologies covered in the lectures highlight some of what Scala does best.

Structure

It starts in Mr. Odersky’s already familiar style, reviewing some of what was taught in the previous course and expanding on a few topics. The dreaded Monads are dealt with, and while I can’t discuss the subject from a theoretical standpoint or offer a perfect explanation, I feel I now grasp how they can be used to compose behavior and express functionality better.

The second part was the highlight for me. Erik Meijer (of LINQ and Reactive Extensions fame in .NET land) has an incredibly upbeat energy and a great sense of humor, which carry you like a breeze through the concepts he presents. I loved thinking about event streams and how to compose them using monadic combinators in the form of Observables (alongside Subscriptions, Schedulers, Promises and Futures). This part affected me the most, for I could clearly see how to mix and match things to add behavior.

The third part was also very good. Roland Kuhn (Akka’s tech lead) has a soft-spoken style that relaxes you while he lays the groundwork over which you’ll learn about Actors, building distributed systems and dealing with failure in that model. He also shows piece by piece what Akka can do (which led me to explore more of Celluloid, a topic for a future post).

Exercises

Everything comes together with each week’s exercises. The first was a quick-and-simple one that explores property testing. I went through it quickly, but I still find it hard to apply that mode of thinking, and didn’t get much out of the technique later on. I know this is a personal limitation, as some people in the course forums mentioned using it to great effect.

After that, you get a wake-up call that this is going to require some hard work, as you’re asked to build two simulations: one of circuits, with emphasis on building a demultiplexer, and one of an epidemic. The fact that these involve timing and sequencing makes you think about how state fits into purely functional programs. Building something substantial (and fun to watch) gives you an appreciation of their usefulness.

Epidemic simulation

Erik Meijer’s weeks had cool, practical examples: you build a node-style web server (with a reactor loop) in the first week and a Wikipedia Swing client using reactive streams for GUI events and networking in the second. It makes you want to write everything in this style, because it becomes much easier to think about what happens when, and with whom.

The third section, on Akka, takes the prize in this area: first you redo the binary search tree exercise from the first course using Actors, which I found a great idea, as you’re confronted just with the new concepts. Then you are challenged into building a (simplified) distributed key-value store, including replication and joining of new nodes. Describing this exercise to colleagues made them want to take the course. I hope they do.

Scala

Being forced to use more Scala made my interest in it grow and grow. Alex Payne has said that it feels a bit like Ruby in its expressiveness, and I felt that while building the exercises. I confess it is refreshing and enlightening to use types as a tool for reasoning, and not as something that merely soothes a compiler with a frail ego.

On the other hand, tooling was problematic. Being an Emacs user, I try to stay in it at all times, and surprisingly (coupled with ENSIME) the experience was not bad. Still, I took the opportunity to learn a bit more about what else is offered and how well it works, and that experience was not so great. Importing the provided Eclipse projects sometimes resulted in missing dependencies; importing the sbt project into IntelliJ IDEA produced error messages complaining about the TEST context, which some students claimed could be solved with nightly builds of its Scala and SBT plugins.

Within ENSIME it mostly worked fine. Sometimes I needed to run the tests before generating the project in sbt, or to regenerate the project as I fleshed out the code. This was not all that awful, in the end: coming from a world of no static types, no code completion, poor navigation (seriously, guys, I’ve used dynamic languages for over 15 years, and grep/ack/ag is not a good replacement) and pretty crude automatic refactorings, it felt like I was thrown into a better world. I find myself missing it in my daily Ruby dealings.

ENSIME as I worked through the exercises

I realize the situation with Eclipse and IntelliJ may be specific to the course, as the canonical thing to use was sbt. But sbt being the canonical build tool, period, made the experience messy, and complaints in the forum suggest there’s still a lot to improve here. We developers are used to these sorts of hurdles (and, let’s be frank, to learning workarounds), but this is the type of thing that turns off someone not yet invested in the tech.

What I learned about myself

Comparing my approach to my fellow students’, I found I could get through most things relatively easily on experience alone, but that eventually I had to exercise muscles in areas I consider mostly weak: self-reliance and grit. It was necessary to overcome my impulse to procrastinate when facing a roadblock, and also to not feel guilty about going to the forums, reading about other people’s struggles, or using a test suite some kind soul had provided for everyone else.

Facing things that are known but empirically unexplored made me question my ability, and demanded a hard look at my self-image. It took great care and some pain to stay motivated and keep the outcome in mind. The sense of pride from having completed the course without delaying a single exercise makes me believe I can accomplish more. Getting a near-perfect score for the effort is very sweet, too.

I hope this helps me become a little more patient. My instinct is to go for quick intellectual gratification, for tight feedback loops, and that is not always possible (or even desirable). Standing on the shoulders of giants requires a climb; I must remember not to be discouraged after the first stretch.

Conclusion

I recommend this course to everyone. It is challenging, rewarding and (for me) a good source of personal growth. Scala is very interesting and worthy of your time, and Akka is incredible. I found the different teaching styles refreshing, a great way to keep the learner engaged. All in all, a delightful experience.

“REMOTE”, by 37signals — ★★☆☆☆

As a remote worker myself, I nodded along frequently at the advantages and challenges presented, so the rating is not about a fundamental disagreement with the message or the intentions. Like the authors, I know from personal experience that commuting, a strict set of working hours, interruptions and the expectation of constant availability are some of the greatest dangers to productive work (creative work especially).

I did, however, expect more than short chapters and sparse data points. Maybe it’s the programmer in me misunderstanding the whole “business book” thing, but the arguments would hold more weight if they were more than anecdotal. As they stand, they render the whole thing an account of what works for 37signals, a very small and relatively unknown company in the greater scheme of things. I understand that relying on managers and employers being swayed by arguments from authority is antithetical to the book (the thesis being that remote work is the rationally better decision overall), but we cannot underestimate how many small-to-medium companies manage by emulation. Stronger data and clearer research could counter gut reactions better.

I think this book might be good if you need just a little push, if you’re a bit on the fence and already suspect this is a trend worth studying and following. If you want something to challenge your views with great insights, don’t bother: it’s fluffy and humane and beautifully illustrated, but it will probably leave your views mostly unchallenged.