2008-01-16 / 14:01 /

Reading Peter Thatcher’s Monads in Python (with nice syntax!) had two effects: 1) it, along with Oliver Steele’s article and Wikipedia, helped me understand what monads do and 2) it made me think monads were the bee’s knees for error handling.

But then I read A Little Lesson on Laziness and Unsafety and I remembered Reg’s post about how tail-recursion could result in debugging difficulties (well, if I remembered correctly, since I couldn’t find the link). I also reviewed Peter’s code and it’s, to my un-haskelly brain, excessively magical (though that’s far from criticism: I thought the article was pretty amazing).

So then I started thinking about error handling and wrote some code… but Michael Feathers thinks we don’t talk about error handling enough, so first a few pages of drivel.

Out of bounds

Most modern error handling is done via exceptions. Exceptions are nice for one reason: they’re optional. You can blissfully code down the happy path and rely on the global handler to barf. Coupled with test-driven development, it’s an efficient way to write code: break it, fix it and (eventually) upgrade it with error handling, though I use the term upgrade loosely. Joel Spolsky thinks–or at least thought in 2003–that exceptions are glorified gotos; basically, exceptions are out-of-band signaling. Out-of-band signaling worked for the phone companies because they were trying to keep control out of the hands of their users. But since we’re our own users, we’d love to be able to keep signaling in band (so that we can, you know, build things like blue boxes).

But Joel’s preferred C/C++/Java example is… well…

T tmp;
if (ERROR == g(x, tmp))
     errorhandling;
if (ERROR == f(tmp, result))
     errorhandling;

Yikes! Verbose and still out of band: the return value has been commandeered for the status, while output parameters carry the actual result.

Bundling the status with the value helps. Although Erlang’s got exceptions [PDF], error handling was traditionally via tuples (note: IANAErlangExpert):

…the predominant way of signalling the success or failure of a function is to make it return tagged tuples like {ok, Value} in case of success and {error, Reason} otherwise, forcing the caller to check the result and either extract the value or handle the error.

Carlsson, Gustavsson & Nyblom Erlang’s Exception Handling Revisited [PDF]

Now the problem–which is mentioned in the pdf along with a million other smart things–is that the complexity is baked in: the callee has to start off with tagged tuples (or break clients when it changes) and the caller has to always deal with a tuple. Erlang does have pattern matching, which almost makes dealing with tuples a feature, but it’s still an extra step. Maybe that’s why Erlang programming conventions recommend coding the happy path and letting errors fall through to the built-in logging.

Although I crapped all over Joel’s C code sample, I agree with his solution:

I think the reason programmers in C/C++/Java style languages have been attracted to exceptions is simply because the syntax does not have a concise way to call a function that returns multiple values, so it’s hard to write a function that either produces a return value or returns an error. (The only languages I have used extensively that do let you return multiple values nicely are ML and Haskell.)

Joel Spolsky, Exceptions

Instead of two channels, use a single channel with some fancy controls to overload the data. And we shall call these controls Monads.

At least that’s my impression of Peter’s division example (let’s be honest, IANAHaskellExpert either). The Maybe monad allows us to specify that mdiv will return a result or an error. It also takes care of the plumbing: Nothing propagates through further calculations.
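To make that plumbing concrete, here is a minimal sketch of Maybe-style propagation in plain Python (Python 3 syntax). It is not Peter’s implementation and uses none of his syntax tricks: Just, Nothing, bind and this version of mdiv are names assumed purely for illustration.

```python
class Maybe(object):
    """Base class for the two cases: Just(value) or Nothing."""
    def bind(self, f):
        raise NotImplementedError


class Just(Maybe):
    def __init__(self, value):
        self.value = value

    def bind(self, f):
        # A real value: hand it to the next step.
        return f(self.value)

    def __repr__(self):
        return "Just(%r)" % (self.value,)


class _Nothing(Maybe):
    def bind(self, f):
        # An earlier step failed: skip f and stay Nothing.
        return self

    def __repr__(self):
        return "Nothing"


Nothing = _Nothing()


def mdiv(n, d):
    """Divide, signalling failure in band instead of raising."""
    if d == 0:
        return Nothing
    return Just(float(n) / d)


# (8 / 2) / 2, then the same chain with a zero divisor in the first step.
ok = mdiv(8, 2).bind(lambda x: mdiv(x, 2))
bad = mdiv(8, 0).bind(lambda x: mdiv(x, 2))
print(ok, bad)  # prints: Just(2.0) Nothing
```

The second chain never calls its lambda: once a step returns Nothing, every later bind just passes it along.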

And now for something that doesn’t use monads at all

But as I said, monads make me nervous. Primarily it’s because the code is different (and I am terrified of change), but it also has to do with debugging. Imperative code is easy to debug: add a few prints where it’s crashing. Functional constructs remove that, even in a language as non-functional (har har) as Python.

>>> [1.0/d for d in (1, 2, 0, 3)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: float division

Ignoring the contrived example, how do you know what data is causing the failure? We could simply replace the 1.0/d with a function that checks its arguments and reports, but that’s pretty specialized. Exceptions, on the other hand, are general: if we’re looking for failures just look for exceptions.

def log_excp(f):
    def f2(*a, **ka):
        try: return f(*a, **ka)
        except Exception, e:
            print "Caught exception %s running %s(%s, %s)" \
                  % (e, f.__name__, a, ka)
            raise
    return f2
print [log_excp(lambda d: 1.0/d)(d) for d in (1, 2, 0, 3)]

Caught exception float division running <lambda>((0,), {})
Traceback (most recent call last):
  File "<stdin>", line 155, in <module>
  File "<stdin>", line 148, in f2
  File "<stdin>", line 155, in <lambda>
ZeroDivisionError: float division

If we want to make this more monadic, instead of just debugging, we could return a value to indicate an error, like the built-in None:

from traceback import format_exc
def log_excp_cont(f):
    def f2(*a, **ka):
        try: return f(*a, **ka)
        except Exception, e:
            print "Caught exception %s running %s(%s, %s)\n%s" \
                  % (e, f.__name__, a, ka, format_exc())
            return None
    return f2
print [log_excp_cont(lambda d: 1.0/d)(d) for d in (1, 2, 0, 3)]

Caught exception float division running <lambda>((0,), {})
Traceback (most recent call last):
  File "<stdin>", line 160, in f2
  File "<stdin>", line 166, in <lambda>
ZeroDivisionError: float division

[1.0, 0.5, None, 0.33333333333333331]

If we replicate Peter’s mdiv (renamed to sdiv–for safe div–here), we can see the error propagate:

@log_excp_cont
def sdiv(n, d):
    return float(n) / d

def with_sdiv():
    val1 = sdiv(2.0, 2.0)
    val2 = sdiv(3.0, 0.0)
    val3 = sdiv(val1, val2)
    return (val1, val2, val3)
print with_sdiv()

Caught exception float division running sdiv((3.0, 0.0), {})
Traceback (most recent call last):
  File "<stdin>", line 160, in f2
  File "<stdin>", line 171, in sdiv
ZeroDivisionError: float division

Caught exception unsupported operand type(s) for /: 'float' and 'NoneType' running sdiv((1.0, None), {})
Traceback (most recent call last):
  File "<stdin>", line 160, in f2
  File "<stdin>", line 171, in sdiv
TypeError: unsupported operand type(s) for /: 'float' and 'NoneType'

(1.0, None, None)

So we get None. But the exceptions differ between val2 and val3: the first is a ZeroDivisionError, the second a TypeError. It works because we’re catching any Exception, but it is not perfect. A bigger problem with None is that it’s not always an error value. Most operations on None throw an exception, but bool(None) == False, for instance. We’ll look more at None’s replacement later; for now, let’s make this

A big pretentious framework

from traceback import format_exc

class excp_handler(object):
    def __init__(self, *mappings):
        if len(mappings) > 0: self.mappings = mappings
        else: self.mappings = ((Exception, none),)
    def __call__(self, f):
        def f2(*a, **ka):
            try: return f(*a, **ka)
            except Exception, e:
                for etype, ehandler in self.mappings:
                    if isinstance(e, etype):
                        return ehandler(func=f, args=a, kargs=ka, excp=e,
                                        trace=format_exc())
                raise   # If no handler, re-raise exception
        return f2

none = lambda *a, **ka: None

Ok, not that big. excp_handler is a decorator for handling exceptions (it could have been written as a regular function, but I prefer classes for longer decorators). If an exception occurs while running the wrapped function, it searches for a match and runs a corresponding handler. If no mappings are found, the error is re-raised. If no mappings are specified, it defaults to returning None on any exception. none is just a helper that always returns None (since that’s our current error type). In action:

print [excp_handler()(lambda d: 1.0/d)(d) for d in (1, 2, 0, 3)]
[1.0, 0.5, None, 0.33333333333333331]
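The lookup rules are easier to see with more than one mapping. What follows is a sketch only: a Python 3 port of the class (the listings in this post are Python 2), plus a hypothetical second handler, inf, and the made-up names safe_div, shadowed and lookup.

```python
from traceback import format_exc


class excp_handler(object):
    """Python 3 port of the decorator; behaviour is meant to match."""
    def __init__(self, *mappings):
        self.mappings = mappings or ((Exception, none),)

    def __call__(self, f):
        def f2(*a, **ka):
            try:
                return f(*a, **ka)
            except Exception as e:
                for etype, ehandler in self.mappings:
                    if isinstance(e, etype):
                        return ehandler(func=f, args=a, kargs=ka, excp=e,
                                        trace=format_exc())
                raise  # no mapping matched: re-raise
        return f2


none = lambda *a, **ka: None
inf = lambda *a, **ka: float("inf")  # hypothetical second handler

# Mappings are tried in order and the first match wins.
safe_div = excp_handler((ZeroDivisionError, inf), (TypeError, none))(
    lambda n, d: n / d)
print(safe_div(6, 3), safe_div(1, 0), safe_div(1, None))  # 2.0 inf None

# A general mapping listed first shadows the specific one after it.
shadowed = excp_handler((Exception, none), (ZeroDivisionError, inf))(
    lambda n, d: n / d)
print(shadowed(1, 0))  # None

# An exception with no matching mapping escapes untouched.
lookup = excp_handler((ZeroDivisionError, none))(lambda d, k: d[k])
try:
    lookup({}, "missing")
except KeyError:
    print("KeyError was re-raised")
```

Because matching uses isinstance and stops at the first hit, specific exception types need to come before general ones.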

If we want debugging information, we can add it to our handler, in this case via a wrapper:

def elog(ehandler):
    def eh2(func, args, kargs, excp, trace):
        print "Caught exception %s running %s(%s, %s)\n%s" \
              % (excp, func.__name__, args, kargs, trace)
        return ehandler(func, args, kargs, excp, trace)
    return eh2

print [excp_handler((Exception, elog(none)))(lambda d: 1.0/d)(d) for d in (1, 2, 0, 3)]

Caught exception float division running <lambda>((0,), {})
Traceback (most recent call last):
  File "<stdin>", line 52, in f2
  File "<stdin>", line 129, in <lambda>
ZeroDivisionError: float division

[1.0, 0.5, None, 0.33333333333333331]

The ability to specify an arbitrary handler has some additional advantages. First it can work around the problems of None: if you control the use of the data, you can write a handler to return whatever null object you want. Your FooBar class’ methods can return NullFooBar.
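A null-object handler might look something like the sketch below (Python 3 syntax). The FooBar and NullFooBar bodies, the null_foobar handler and find_foobar are all invented for illustration, and excp_handler is condensed to a function without the default mapping so the example stands alone.

```python
from traceback import format_exc


def excp_handler(*mappings):
    """Condensed Python 3 stand-in for the excp_handler class."""
    def decorate(f):
        def f2(*a, **ka):
            try:
                return f(*a, **ka)
            except Exception as e:
                for etype, ehandler in mappings:
                    if isinstance(e, etype):
                        return ehandler(func=f, args=a, kargs=ka, excp=e,
                                        trace=format_exc())
                raise
        return f2
    return decorate


class FooBar(object):
    def __init__(self, name):
        self.name = name

    def greet(self):
        return "hello from %s" % self.name


class NullFooBar(FooBar):
    """Null object: same interface, harmless behaviour."""
    def __init__(self):
        FooBar.__init__(self, "nobody")

    def greet(self):
        return ""


def null_foobar(**info):
    # Handler: ignore the failure details and return the null object.
    return NullFooBar()


@excp_handler((KeyError, null_foobar))
def find_foobar(registry, key):
    return registry[key]


registry = {"a": FooBar("a")}
print(find_foobar(registry, "a").greet())          # hello from a
print(repr(find_foobar(registry, "zzz").greet()))  # ''
```

Callers can keep calling greet() without checking for None first, which is the whole appeal of the null object.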

Second, handlers can raise wrapped exceptions. Michael Feathers’ article is about different zones of error handling: the trusted zone and the external interface. Sloppiness is tolerable in the trusted zone, but the error handling of your external functions is another part of your interface. One common solution is to define your own exception hierarchy and wrap outgoing exceptions:

class CustomException(Exception): pass

def mk_raise_excp(e):
    def raise_excp(*a, **ka): raise e
    return raise_excp

print [excp_handler((Exception, mk_raise_excp(CustomException)))(lambda d: 1.0/d)(d)
       for d in (1, 2, 0, 3)]

Traceback (most recent call last):
  File "<stdin>", line 204, in <module>
  File "<stdin>", line 57, in f2
  File "<stdin>", line 199, in raise_excp
__main__.CustomException

We should really wrap the original exception for reference, but that’s a solved problem (and this post is getting too long).
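For reference, one way to keep the original exception is sketched below. It relies on Python 3’s raise ... from, which the Python 2 used elsewhere in this post does not have. The extra constructor arguments on CustomException, and the names mk_wrap_excp and inverse, are assumptions for illustration.

```python
from traceback import format_exc


class CustomException(Exception):
    """Boundary exception that keeps the original around for reference."""
    def __init__(self, message, original=None, trace=None):
        Exception.__init__(self, message)
        self.original = original  # the exception that was caught
        self.trace = trace        # its formatted stack trace


def mk_wrap_excp(exc_class):
    def wrap_excp(func, args, kargs, excp, trace):
        # 'from excp' also sets __cause__, so both tracebacks get printed.
        raise exc_class("%s failed: %s" % (func.__name__, excp),
                        original=excp, trace=trace) from excp
    return wrap_excp


def inverse(d):
    return 1.0 / d


# Call the handler by hand, with the same keywords excp_handler passes.
handler = mk_wrap_excp(CustomException)
caught = None
try:
    try:
        inverse(0)
    except Exception as e:
        handler(func=inverse, args=(0,), kargs={}, excp=e,
                trace=format_exc())
except CustomException as wrapped:
    caught = wrapped

print(type(caught.original).__name__, "->", type(caught).__name__)
# prints: ZeroDivisionError -> CustomException
```

In use, mk_wrap_excp(CustomException) would go in a mapping wherever mk_raise_excp(CustomException) does.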

This implementation has limitations; for example, there’s no handling of cases that require finally. Right now I’m comfortable only using it for certain cases. In fact, that’s what I do: this grew from code to handle errors in the middle of a map-reduce. I’m sure use will flush out more problems.
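One possible way to cover the finally case is a companion decorator that always runs a cleanup callable. This is a hypothetical sketch in Python 3 syntax, not part of excp_handler; with_cleanup, use and closed are made-up names.

```python
def with_cleanup(cleanup):
    """Hypothetical decorator: always call cleanup(*args, **kwargs)."""
    def decorate(f):
        def f2(*a, **ka):
            try:
                return f(*a, **ka)
            finally:
                # Runs whether f returned normally or raised.
                cleanup(*a, **ka)
        return f2
    return decorate


closed = []  # records which "resources" were cleaned up


@with_cleanup(lambda res, d: closed.append(res))
def use(res, d):
    return 1.0 / d


use("r1", 2)
try:
    use("r2", 0)
except ZeroDivisionError:
    pass

print(closed)  # ['r1', 'r2']
```

Stacked underneath excp_handler, the cleanup would run first and the exception would then reach the handler as usual.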

Monads & debugging

Peter’s original article wasn’t as much about error handling as it was “oh wow, it works!”. But it does make me wonder: how do you handle errors inside Haskell (e.g.)? Specifically:

  • How do you add error handling to happy path code?
  • What do you get in the way of debugging information?

If there are any functional programmers out there, I’d love to hear your thoughts.

For our next trick: fixing that whole None thing

Instead of returning None (or a different object each time), we really want a universal Failure object. It should obviously indicate an error, track its history (the original exception and stack trace) and respond to operations with another Failure. I haven’t implemented anything, but I think a custom Exception–maybe FailureException in honor of monads–would work. Subclassing Exception indicates that it’s an error and also allows for processing inside of excp_handler. Tracking history can be done by storing the original exception’s innards (again see Ian Bicking’s post). return self in FailureException’s methods will propagate the error. Injecting the return self could be a task for __getattribute__ along with a whitelist of allowed functions.

But I haven’t tried writing any of that, and it’s time for lunch.
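For anyone who wants a starting point, the idea could be roughed out as below. This is a speculative sketch in Python 3 syntax, not a worked-out design. It swaps the __getattribute__ plus whitelist approach for the simpler __getattr__, which only fires for attributes that do not exist. The failure handler, the _absorb helper and the hand-wrapped sdiv are all assumptions.

```python
from traceback import format_exc


class FailureException(Exception):
    """Hypothetical universal failure value that absorbs operations."""
    def __init__(self, excp=None, trace=None):
        Exception.__init__(self, "failure caused by %r" % (excp,))
        self.excp = excp    # the original exception
        self.trace = trace  # its formatted stack trace

    def _absorb(self, *a, **ka):
        return self

    # Operators are looked up on the class, not through __getattr__,
    # so each one we want to absorb has to be listed explicitly.
    __add__ = __radd__ = __sub__ = __rsub__ = _absorb
    __mul__ = __rmul__ = __truediv__ = __rtruediv__ = _absorb
    __call__ = _absorb

    def __getattr__(self, name):
        # Only reached for missing attributes: unknown methods become
        # no-ops that return this same failure.
        if name.startswith("__"):
            raise AttributeError(name)
        return self._absorb


def failure(func, args, kargs, excp, trace):
    """Handler in the excp_handler style: return a failure, never raise."""
    for arg in list(args) + list(kargs.values()):
        if isinstance(arg, FailureException):
            return arg  # keep the earliest failure and its history
    return FailureException(excp, trace)


def sdiv(n, d):
    # Wrapped by hand here; excp_handler would do the same job.
    try:
        return float(n) / d
    except Exception as e:
        return failure(func=sdiv, args=(n, d), kargs={}, excp=e,
                       trace=format_exc())


val2 = sdiv(3.0, 0.0)   # ZeroDivisionError becomes a FailureException
val3 = sdiv(1.0, val2)  # 1.0 / failure is absorbed by __rtruediv__
val4 = sdiv(val2, 2.0)  # float(failure) raises; handler reuses val2
print(val3 is val2, val4 is val2, type(val2.excp).__name__)
# prints: True True ZeroDivisionError
```

Unlike the None version, val3 and val4 still point at the original ZeroDivisionError and its trace. One loose end: bool() of this object is True, so truth tests remain as ambiguous as they were with None.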