Posted by al on Nov. 3, 2011 code erlang javascript python ruby

Lost in scope

Today I bumped into a behavior of the Ruby language that I didn't expect. Here is a dummy example to illustrate the problem:

def any_day_but_monday?(day)
  days = %w(monday tuesday wednesday thursday friday saturday sunday)
  good_days = days.reject { |day| day == 'monday' }
  good_days.include?(day)
end

puts any_day_but_monday?('monday')

Intuitively, what do you think the output of this code would be? We're building a list of good_days which doesn't contain 'monday' and then we're testing if 'monday' is in that list. It should return false, right?

That's what I was thinking, but it turns out that it doesn't work, at least in Ruby 1.8. There, the variable day used in the block passed to Array#reject happily overwrites the variable day passed to our function. So by the time we're testing if 'monday' is in good_days, day is not 'monday' anymore, it's 'sunday'. (Ruby 1.9 and later make block parameters local to the block, so there the function returns false.)

In my (obviously misguided) mind, blocks are closures and therefore have their own context. I expected the block to create a variable day separate from the variable day used at the function level. I know that closures can access and capture variables from the outer context, but my instinct was that the block parameter would be a new, distinct variable.

This reminds me of a Python issue a friend of mine raised on IRC while he was implementing callbacks for Tkinter. Here is another contrived example to illustrate the problem:

callbacks = []
days = "monday tuesday wednesday thursday friday saturday sunday".split()
for day in days:
    callbacks.append(lambda: "Today is %s." % day)

print callbacks[0]()

We're looping over a list of days and creating a callback that returns a string based on each day. When we call the first callback, this should print Today is monday, right?

Well, it doesn't work either. But maybe you're thinking that a lambda is not a proper function and doesn't make proper closures. Try this then:

callbacks = []
days = "monday tuesday wednesday thursday friday saturday sunday".split()
for day in days:
    def callback():
        return "Today is %s." % day
    callbacks.append(callback)

print callbacks[0]()

No better, unfortunately: all callbacks still return Today is sunday. For a good explanation of this issue see this post which also presents a clever workaround using keyword arguments (more on this later).
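To see why, here is a minimal sketch, written for Python 3 (unlike the Python 2 snippets in this post) and wrapped in a function I've named make_callbacks purely for illustration. It suggests that every callback closes over the same variable rather than a snapshot of its value:

```python
def make_callbacks():
    """Build the callbacks inside a for loop, as in the example above."""
    days = "monday tuesday wednesday thursday friday saturday sunday".split()
    callbacks = []
    for day in days:
        # The lambda does not copy day; it looks the variable up when called.
        callbacks.append(lambda: "Today is %s." % day)
    return callbacks


callbacks = make_callbacks()
results = [callback() for callback in callbacks]

# Every closure points at the very same cell holding day.
shared_cell = all(
    callback.__closure__[0] is callbacks[0].__closure__[0]
    for callback in callbacks
)

print(results[0])   # Today is sunday.
print(shared_cell)  # True
```

By the time any callback runs, the loop has finished and that single shared variable holds its last value.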

Just in case you still had hope in Python, here is an equivalent of the previous Ruby example using list comprehensions:

def any_day_but_monday(day):
    days = "monday tuesday wednesday thursday friday saturday sunday".split()
    good_days = [day for day in days if day != "monday"]
    return day in good_days

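As you might have guessed, this won't work as expected (at least as I would expect): in Python 2, the comprehension's loop variable day leaks into the function scope and overwrites the argument, so the function returns True for 'monday'.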
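For comparison, Python 3 changed this: a list comprehension there gets its own scope, so its loop variable no longer leaks. Here is the same function as a sketch meant to be run under Python 3:

```python
def any_day_but_monday(day):
    days = "monday tuesday wednesday thursday friday saturday sunday".split()
    # In Python 3 the comprehension's day is local to the comprehension,
    # so the argument day is left untouched.
    good_days = [day for day in days if day != "monday"]
    return day in good_days


print(any_day_but_monday("monday"))   # False
print(any_day_but_monday("tuesday"))  # True
```

A plain for loop still shares its variable with the enclosing scope in Python 3, so the callback examples above behave the same in both versions.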

So all this is about scoping and unexpected side effects. This is the kind of thing that functional languages are supposed to be good at, isn't it? That makes me wonder if a functional programming language would do any better. The only "real" functional programming language I'm familiar with is Erlang. Let's try to translate our initial Ruby example into Erlang using a list comprehension:

-module(good_day).
-export([any_day_but_monday/1]).

any_day_but_monday(Day) ->
    Days = [monday, tuesday, wednesday, thursday, friday, saturday, sunday],
    GoodDays = [Day || Day <- Days, Day =/= monday],
    lists:member(Day, GoodDays).

Let's try it in the Erlang shell. First we need to compile it:

1> c(good_day).
./good_day.erl:6: Warning: variable 'Day' shadowed in generate
{ok,good_day}

The compiler prints a warning. Let's see what the documentation says about it:

./FileName.erl:Line: Warning: variable 'X' shadowed in generate

This diagnostic warns us that the variable X in the pattern is not the same variable as the variable X which occurs in the function head.

Well, that sounds pretty good to me! And if we call our function we do get the expected behavior:

2> good_day:any_day_but_monday(monday).
false

So Erlang does the right thing, but warns us that using the same variable name could be confusing. Very civilized. Using a function instead of a list comprehension gives a similar result:

any_day_but_monday(Day) ->
    Days = [monday, tuesday, wednesday, thursday, friday, saturday, sunday],
    GoodDays = lists:filter(fun(Day) -> Day =/= monday end, Days),
    lists:member(Day, GoodDays).

which gives the following compiler warning:

./good_day.erl:6: Warning: variable 'Day' shadowed in 'fun'

We might as well follow Erlang's advice and use different variable names so that experienced Ruby or Python programmers don't get confused ;-)

Actually I'm thinking now, there's another language that I'm familiar with and which is supposed to be somewhat functional: JavaScript. Let's see how it copes with our example:

function any_day_but_monday(day) {
  var days = "monday tuesday wednesday thursday friday saturday sunday".split(" ");
  var good_days = days.filter(function(day) { return day !== "monday"; });
  return good_days.indexOf(day) !== -1;
}

alert(any_day_but_monday("monday"));

(I'm using Array.prototype.indexOf and Array.prototype.filter from ECMAScript 5, so you might need a shim if your browser is very old.)

It works without a warning. It looks like we have a winner, who knew it would be JavaScript ;-)
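To be fair to Python, it gets the same result when the filtering is done by a function instead of a comprehension, because a function parameter is always local to that function. A minimal sketch, written for Python 3 (the function name is reused from above only for illustration):

```python
def any_day_but_monday(day):
    days = "monday tuesday wednesday thursday friday saturday sunday".split()
    # The lambda's parameter shadows the outer day without overwriting it.
    good_days = list(filter(lambda day: day != "monday", days))
    return day in good_days


print(any_day_but_monday("monday"))   # False
print(any_day_but_monday("tuesday"))  # True
```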

Now let's go back to the callback generation example that was causing Python so much trouble:

function make_callbacks() {
  var days = "monday tuesday wednesday thursday friday saturday sunday".split(" ");
  return days.map(function(day) {
    return function() {
      return "Today is " + day;
    }
  });
}
alert(make_callbacks()[0]());

It works like a charm: the first callback does return Today is monday.

However, if you were using JavaScript in the 20th century, you might know there's another weird-looking way to iterate: the for loop. Let's see how that works:

function make_callbacks() {
  var days = "monday tuesday wednesday thursday friday saturday sunday".split(" ");
  var callbacks = [];
  for (var i = 0; i < days.length; i++) {
    var day = days[i];
    callbacks.push(function() {
      return "Today is " + day;
    });
  }
  return callbacks;
}
alert(make_callbacks()[0]());

This has exactly the same problem as the Python version: only the value of day assigned at the last iteration is available to our callbacks. Just like in Python, each callback doesn't get its own distinct variable; instead it refers to the single variable day, which belongs to the function scope. This is also true in Ruby if we use the for loop:

def callbacks
  days = %w(monday tuesday wednesday thursday friday saturday sunday)
  result = []
  for day in days
    result << lambda { "Today is #{day}" }
  end
  return result
end

puts callbacks[0].call

Whereas if we use Ruby block-based looping, it works fine and outputs Today is monday as expected:

def callbacks
  days = %w(monday tuesday wednesday thursday friday saturday sunday)
  days.map { |day|
    lambda { "Today is #{day}" }
  }
end

puts callbacks[0].call

In Python, Ruby and JavaScript, if we create callbacks within a for loop, every callback sees the one and only variable that belongs to the enclosing function and is overwritten at each iteration, not the value that variable had when the callback was created. But in Ruby and JavaScript, if we iterate in a functional way, the function (or block) in which we create our callbacks is called once per iteration with a different value and never overwrites it, so each callback refers to a different value. Using this idea, in Python we can do:

def make_callback(day):
    def callback():
        return "Today is %s." % day
    return callback

callbacks = []
for day in days:
    callbacks.append(make_callback(day))

Or more succinctly, because in Python default values get evaluated when the function is created (which is another source of confusion), we can do this:

callbacks = [lambda day=day: "Today is %s." % day for day in days]
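Here are two other ways to pin the value at creation time, sketched for Python 3 and not taken from the post linked earlier. The first iterates with map so that each callback is created inside its own function call, which is the Python counterpart of the Ruby and JavaScript versions. The second uses functools.partial from the standard library. The helper name today is made up for the example.

```python
from functools import partial

days = "monday tuesday wednesday thursday friday saturday sunday".split()


def today(day):
    # Hypothetical helper: formats the message for one day.
    return "Today is %s." % day


# Functional iteration: the outer lambda is called once per day, so each
# inner lambda closes over its own day.
mapped = list(map(lambda day: (lambda: today(day)), days))

# partial stores the current value of day at the moment it is created.
partials = [partial(today, day) for day in days]

print(mapped[0]())     # Today is monday.
print(partials[-1]())  # Today is sunday.
```

One practical difference from the day=day trick: if a caller passes an argument to the callback, the default is silently replaced, whereas the partial built here raises a TypeError.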

Before finishing this post, I'd just like to quickly check what's going on with list comprehension in CoffeeScript, a language I've recently started to learn.

any_day_but_monday = (day) ->
  days = "monday tuesday wednesday thursday friday saturday sunday".split " "
  good_days = (day for day in days when day isnt "monday")
  day in good_days
alert any_day_but_monday("monday")

Unfortunately this doesn't work either. Maybe I'm the only one to think variables created by list comprehensions should be independent from the surrounding context? That's how Erlang does it, but it seems to be apologizing for it. What do you think?
