An exploration of Lambda in Ruby

2012-04-08

Let's start off by taking a look at traditional block passing in Ruby.

5.times {|i| puts i}

Looks great, right? Go ahead and pass that block off to another method.

Using the block/yield syntax:

def first
  yield "Hello!"
end

def second
  # first yield...?
  first {|str| yield str}
end

second {|str| puts str}

Instead of just passing the original block along, we are forced to create another block which yields to the original one. Of course, we could make the situation a little better by capturing the block by reference.

Using the block/ref syntax:

def second(&block)
  first &block
end

second {|str| puts str}

This is certainly more DRY and visually pleasing than the first example. Do notice, however, that Ruby is now converting our block into a Proc object for us (strictly a plain Proc rather than a lambda, since calling lambda? on it returns false). Even worse, Ruby handles argument and block syntax somewhat antagonistically.
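To make that conversion visible, here is a minimal sketch. The capture helper is a hypothetical name used only for illustration; it simply returns whatever Ruby handed it as &block so we can inspect the object.

```ruby
# Hypothetical helper: hand back whatever Ruby captured as &block.
def capture(&block)
  block
end

captured = capture { |str| str.upcase }

puts captured.class           # the block has been wrapped in a Proc
puts captured.lambda?         # false: it keeps block (proc) semantics
puts captured.call("hello")   # the captured block is callable like any Proc

# A real lambda keeps its lambda-ness when it travels through &.
shout = ->(str) { str.upcase }
puts capture(&shout).lambda?  # true
```

The practical difference is that a plain Proc is lenient about argument counts, while a lambda checks them the way a method does.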

Antagonistic blocks:

# Are we passing a block to our upto method, making a hash,
# or passing something to our object 5?
1.upto 5 {|i| puts i}

# SyntaxError: (irb):23: syntax error, unexpected '{', expecting $end
# 1.upto 5 {|i| puts i}
#           ^
# 	from /Users/lori/.rvm/rubies/ruby-1.9.3-p0/bin/irb:16:in `<main>'

# We could solve this with do / end
1.upto 5 do |i|
  puts i
end

# Resort to using parens
1.upto(5) {|i| puts i}

# Or just pass a lambda by reference
1.upto 5, &->(i) {puts i}

A first class lambda would gain us the ability to pass it around, manipulate it, use it as a function / method, and treat it like any other object. Unfortunately, this is not entirely the case with Ruby. We have an odd mixture of blocks, lambdas, by-reference conversions, and so forth. Having said that, let's see how close we can get to a first class lambda.
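As a small illustration of the "treat it like any other object" part, the sketch below stores lambdas in a hash, derives a new one with Proc#curry, and passes them as ordinary arguments. The names add, operations, add_five, and apply are made up for this example.

```ruby
add = ->(a, b) { a + b }

# Store lambdas in a data structure like any other value.
operations = { add: add, double: ->(n) { n * 2 } }

# Derive a new lambda from an existing one by fixing its first argument.
add_five = add.curry[5]

# Accept a lambda as a plain argument; no & conversion is involved.
def apply(callable, *args)
  callable.call(*args)
end

puts apply(operations[:add], 1, 2)   # 3
puts apply(add_five, 10)             # 15
puts apply(operations[:double], 21)  # 42
```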

Our previous example using a lambda:

def first(&bl)
  bl.call "Hello!"
end

def second(&bl)
  first &bl
end

second &->(str) {puts str}

And let's have a little more fun with the lambda syntax.

FizzBuzz:

fizzbuzz = ->(i) do
  (i%15).zero? and next "FizzBuzz"
  (i%3).zero?  and next "Fizz"
  (i%5).zero?  and next "Buzz"
  i
end

puts (1..100).map(&fizzbuzz).join("\n")

Fantastic, right? Instead of having an awkward mixture of blocks, yields, and arguments, we just favor the syntax which gives us the strongest lambda. While it's not quite 'first class' (you can't call it with plain method syntax), at least we are a lot closer than we were when relying on blocks.
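To show where the method analogy breaks down, here is a short sketch of the ways a lambda can be invoked, plus one way to turn it into a real method. The double lambda and the Calculator class are hypothetical names for illustration.

```ruby
double = ->(n) { n * 2 }

puts double.call(21)  # explicit call
puts double.(21)      # shorthand for .call
puts double[21]       # square-bracket form

begin
  double(21)          # parsed as a call to a method named double
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end

# Going the other way: define_method accepts a lambda as the method body.
class Calculator
  define_method(:double, ->(n) { n * 2 })
end

puts Calculator.new.double(21)
```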

Surely there is a performance impact in moving away from blocks, right?

Utility methods for the following benchmarks:

def by_yield(i)
  yield i
end

def by_ref(i, &block)
  block.call i
end

By Reference vs Yield:

require 'benchmark'

# Labels are single-quoted so the backslash survives;
# inside double quotes "\w" collapses to a plain "w".
Benchmark.bm do |b|
  b.report('built-in \w block') do
    1.upto(1_000_000) {|i|}
  end

  b.report('built-in \w lambda') do
    1.upto(1_000_000, &->(i){})
  end

  b.report('yield \w block') do
    1_000_000.times &->(i) do
      by_yield(i) {|x|}
    end
  end

  b.report('ref \w block') do
    1_000_000.times &->(i) do
      by_ref(i) {|x|}
    end
  end

  b.report('1 off yield \w block') do
    by_yield(1) {|i| }
  end

  b.report('1 off ref \w block') do
    by_ref(1) {|i| }
  end
end

# Iterating over a MILLION uses of yield vs by reference.
#                        user       system     total       real
#  built-in \w block     0.050000   0.000000   0.050000 (  0.051122)
#  built-in \w lambda    0.040000   0.000000   0.040000 (  0.043985)
#  yield \w block        0.120000   0.000000   0.120000 (  0.117864)
#  ref \w block          0.840000   0.050000   0.890000 (  0.885240)

# A single use of yield vs reference
#                        user       system     total       real
#  1 off yield \w block  0.000000   0.000000   0.000000 (  0.000003)
#  1 off ref \w block    0.000000   0.000000   0.000000 (  0.000003)

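Surprisingly, the impact is pretty marginal: handing a lambda to a built-in method costs about the same as a block (0.044s vs 0.051s over a million iterations). Our largest impact actually comes from the by-reference syntax for methods, where capturing the block with & and calling it is roughly 7.5 times slower than yield (0.885s vs 0.118s), though that is still under a microsecond per call. Given that we have already incurred the cost of 'by reference' in our previous method definition, let's look at lambda vs block.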
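One likely contributor to the by-reference cost (an assumption, since the benchmark above does not isolate it) is that capturing a literal block with & has to build a Proc object on every call, while yield does not need one. The sketch below uses a hypothetical grab helper to show that each call yields a distinct Proc, whereas an existing lambda passed with & comes through as the same object.

```ruby
# Hypothetical helper: return the object captured by &block.
def grab(&block)
  block
end

# The same block literal, captured on two separate calls.
first_proc, second_proc = Array.new(2) { grab { |x| x } }
puts first_proc.equal?(second_proc)  # false: a new Proc per call

# An existing lambda passed with & is not wrapped again.
noop = ->(x) { x }
puts grab(&noop).equal?(noop)        # true
```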

Lambda vs Block:

# Labels are single-quoted so the backslash survives;
# inside double quotes "\w" collapses to a plain "w".
Benchmark.bm do |b|
  b.report('ref \w block') do
    1_000_000.times &->(i) do
      by_ref(i) {|x|}
    end
  end

  b.report('ref \w lambda') do
    1_000_000.times &->(i) do
      by_ref i, &->(x) {}
    end
  end

  b.report('1 off ref \w lambda') do
    by_ref 1, &->(x) {}
  end

  b.report('1 off ref \w block') do
    by_ref(1) {|i| }
  end
end

# Iterating over a MILLION uses of block vs lambda.
#                       user       system     total       real
#  ref \w block         0.830000   0.050000   0.880000 (  0.872216)
#  ref \w lambda        0.920000   0.040000   0.960000 (  0.957568)

# A single use of block vs lambda.
#                       user       system     total       real
#  1 off ref \w lambda  0.000000   0.000000   0.000000 (  0.000004)
#  1 off ref \w block   0.000000   0.000000   0.000000 (  0.000003)

Still a pretty small difference: about 10% over a million calls (0.958s vs 0.872s), and nothing measurable for a single call.

I do wish the whole lambda/block/closure situation in Ruby were better unified. For now, it certainly looks nicer to use blocks when working with a DSL.

Menu DSL:

menu "File" do
  item "New"
  item "Open"
  menu "Open Recent", -> do
    generate_recently_opened_list
  end
  separator
  item "Exit"
end
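For the curious, here is one way such a DSL might be wired up. This is only a sketch under my own assumptions: MenuBuilder and its internal array representation are hypothetical, and the lambda stands in for generate_recently_opened_list. A block means "nested static menu", while a lambda argument means "build the children later".

```ruby
# Hypothetical builder; nothing here comes from a real library.
class MenuBuilder
  attr_reader :entries

  def initialize
    @entries = []
  end

  # With a block: evaluate it against a child builder right away.
  # With a lambda argument: keep it so the children can be built lazily.
  def menu(title, lazy_children = nil, &block)
    if block
      child = MenuBuilder.new
      child.instance_eval(&block)
      @entries << [:menu, title, child.entries]
    else
      @entries << [:lazy_menu, title, lazy_children]
    end
  end

  def item(title)
    @entries << [:item, title]
  end

  def separator
    @entries << [:separator]
  end
end

root = MenuBuilder.new
root.menu "File" do
  item "New"
  menu "Open Recent", -> { ["a.rb", "b.rb"] }
  separator
  item "Exit"
end

kind, title, children = root.entries.first
puts "#{kind} #{title} with #{children.size} entries"
puts children[1][2].call.inspect  # the lazy list is only built on demand
```

The two spellings are needed precisely because a block and a lambda are different things to the method receiving them.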

With a more unified syntax, we wouldn't need a special form for lambdas. Consider, for example, Smalltalk with its single 'block' syntax.

example
  "Because Smalltalk is fun!"
  |numbers smaller a|
  numbers := OrderedCollection new.
  numbers add: 1; add: 2; add: 3; add: 4.

  "Gather items that are less than or equal to 2"
  smaller := numbers select: [:i | i <= 2 ].

  "save off a block in variable a"
  a := [:i :x | | sum |
         sum := i + x.
         sum
       ].

  "And return it"
  ^a