Friday, April 15, 2005

Duck Typing dilemmas - A strawman for "pure" Python interfaces

The Duck Typing debate is less about type-checking and more about expectations

Cedric Beust blogs about The Perils of Duck Typing, and he's not alone. People have been arguing about dynamic versus static typing for quite a while now, and the debate is very close to becoming an Emacs vs. vi argument.

I don't think the problem here is really the absence of static typing so much as the absence of "good" design or reasonable expectation management. Let me give an example (so you can make fun of me if I'm wrong).

In Python, a number of methods and functions take "file-like objects" as parameters. Some of them actually want every method that files implement; others are content with just read and readline. That makes the "interface" a little hard to code to, since in order to implement it you need to read the specific requirements of each method or function you're calling.

Now the only thing wrong with this whole setup is that you really have no idea which methods to implement and, as Cedric points out, you may omit a method that is not called directly by the function you're calling. So here's my strawman proposal to those naysayers who demand more static typing: just write a base class whose methods do nothing but raise NotImplementedError. Note that in Python this class wouldn't even need to be bundled with the base distribution since Duck Typing works whether you like it or not.
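As a rough sketch of that strawman applied to file-like objects: the names FileLike, StringSource and first_line are made up for illustration, and the four methods listed are only a sample of what real files offer, not a complete inventory.

```python
class FileLike(object):
    """Hypothetical 'interface': lists the methods a file-like object may need."""

    def read(self, size=-1):
        raise NotImplementedError("read")

    def readline(self, size=-1):
        raise NotImplementedError("readline")

    def write(self, data):
        raise NotImplementedError("write")

    def close(self):
        raise NotImplementedError("close")


class StringSource(FileLike):
    """Implements only the reading half; size arguments are ignored here."""

    def __init__(self, text):
        self._lines = text.splitlines(True)  # True keeps the line endings

    def readline(self, size=-1):
        # Return the next line, or '' at end of input, like a real file.
        return self._lines.pop(0) if self._lines else ""

    def read(self, size=-1):
        # Return everything that has not been consumed yet.
        data = "".join(self._lines)
        self._lines = []
        return data


def first_line(fileobj):
    """Only needs readline(); any object with that method will do."""
    return fileobj.readline()
```

Calling write on a StringSource fails loudly with NotImplementedError instead of an AttributeError somewhere deep in a library, and the base class doubles as a checklist of what you might be asked to implement.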

In the function below it doesn't matter what type the fileobject parameter is. It will work fine with a file from the open or file functions as well as with any other iterable object.


def line_count(fileobject):
    count = 0

    for i in fileobject:
        count += 1

    return count


Now let's assume that we want to make it explicit that you need to pass an iterable object. Most pythonistas will tell you to re-write the function like this.


def line_count(iterable_object):
    count = 0

    for i in iterable_object:
        count += 1

    return count


Now that's nice and clear. We can tell (as long as we have two brain cells to rub together) from the name of the argument what the type should be. Of course then the whole discussion takes a nasty swerve towards the "Hungarian notation debate" side of things.

Putting aside the notion of encoding the type in the argument name, we can still provide a nice way to show what type is expected. All we have to do is write our function like this.



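class Iterable(object):
    """Documents the iterator protocol; subclasses override next()."""

    def next(self):
        raise NotImplementedError

    def __next__(self):
        # Python 3 calls __next__ instead of next; delegate so that
        # subclasses only have to override next().
        return self.next()

    def __iter__(self):
        return self

def line_count(itobj):
    """Pass me an instance of Iterable, please."""

    count = 0

    for i in itobj:
        count += 1

    return count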


Now, because Python supports multiple inheritance, we can just use the Iterable "interface" as a mixin. It doesn't really matter whether we use it or not because the methods that it defines are clearly described and listed in the class definition. In addition, the function will still accept non-Iterable objects that adhere to the "Iterator types" description in the Python Library Reference. It's just that designing like this makes it simpler for the client to use the library you've developed. Note also that any class that already implements the "Iterator types" set of methods could also just mix in the Iterable class for the hell of it.
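To make the mixin idea concrete, here's a small sketch. Named and Countdown are hypothetical classes invented for the example. Iterable and line_count are restated so the snippet runs on its own, and the __next__ method is an assumption added so that it also works on Python 3.

```python
class Iterable(object):
    """The 'interface': spells out what an iterator must provide."""

    def next(self):
        raise NotImplementedError

    def __next__(self):
        # Python 3 spelling; delegates so subclasses only override next().
        return self.next()

    def __iter__(self):
        return self


class Named(object):
    """Some unrelated base class (hypothetical)."""

    def __init__(self, name):
        self.name = name


class Countdown(Named, Iterable):
    """Mixes Iterable in next to its 'real' base class."""

    def __init__(self, name, start):
        Named.__init__(self, name)
        self.current = start

    def next(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


def line_count(itobj):
    """Pass me an instance of Iterable, please."""
    count = 0
    for i in itobj:
        count += 1
    return count


print(line_count(Countdown("launch", 3)))  # counts 3, 2, 1
print(line_count(["a", "b"]))              # a plain list still works
```

A subclass that forgets to override next gets a NotImplementedError on the first pass through the loop, which is about as close to "you didn't implement the interface" as you get without a compiler.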

I know that this seems like a lot of flailing for nothing (since there is still no type-checking until runtime, at which point it still happens using Duck Typing), but the point here is that I think the Duck Typing argument is a design argument and not a dynamic vs. static typing argument. You can implement Java-like interfaces in a dynamic language; they just won't be checked at compile-time, but maybe that's OK. The real goals are to a) make sure that the implementor knows all the methods they need to support and b) make sure that you are not passing the wrong instance by mistake to your function. I think that by adhering to a few simple design guidelines both of those goals are pretty easy to accomplish.
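For goal b), one possible guideline (my own assumption, not something the language or the library prescribes) is to check up front for the methods you are about to rely on. The require_methods helper below is hypothetical; it stays duck-typed because it looks at what the object can do rather than what class it is.

```python
def require_methods(obj, *names):
    """Raise TypeError early if obj lacks any of the named methods."""
    missing = [n for n in names if not callable(getattr(obj, n, None))]
    if missing:
        raise TypeError(
            "%s is missing: %s" % (type(obj).__name__, ", ".join(missing))
        )
    return obj


def line_count(itobj):
    """Pass me something iterable, please."""
    require_methods(itobj, "__iter__")  # fail here, with a clear message
    count = 0
    for _ in itobj:
        count += 1
    return count
```

Passing an integer by mistake now fails at the top of line_count with a message naming the missing method. The check still happens at runtime, so this is a design aid and not a substitute for static typing.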
