Mutable default parameter values in Python

In Python, everything is an object. Variables are technically names for references to objects. Therefore, when passing an argument to a function, what is in fact being passed is the value of the reference. This leads to some useful techniques.

Mutable and immutable objects as parameters

The issue is muddied further by mutable and immutable objects. When an immutable object (such as an integer or string) is passed to a function, it is in effect passed by value, since any change to the object creates a new object. However, when the object is mutable, as are dictionaries and lists, any changes made to the object are destructive and reflected in the original object. Operating on a list, for example, can be made safe by copying the list:

def nondestructive_modify(mutable_list):
    copied = mutable_list[:]
    copied.reverse()
    return copied

Without copying the list, modifying mutable_list in this example would destructively modify the original list. Moreover, strings are immutable in Python. This is uncommon, and is a source of confusion for many users. Strings need not be copied for safe manipulation in a function, but operations on large strings are slower because they involve allocating a new string.

Immutable default parameter values

Function definitions are executable statements. This has an interesting side effect when combined with named parameters. Take the following function, which uses an accumulator to recursively collect only even number from a list:

def only_evens(items, acc=[]):
    if len(items) > 0:
        if type(items[0]) == int and items[0]%2 == 0:
            acc.append(items[0])
        return only_evens(items[1:], acc)
    else:
        return acc

The accumulator is defined in the parameter list, and seems to have a default value of []. In fact, [] is evaluated when the function definition is executed and acc is bound to a reference to that established list object, which starts out empty. Each time the function is called, acc is not rebound to an empty list. It remains bound to the same reference from call to call. Therefore,

only_evens([0, 1, 2, 3, 4])
# => [0, 2, 4]
 
only_evens([10, 11, 12, 13, 14])
# => [0, 2, 4, 10, 12, 14]

The convention to avoid this is to set the default value to None, and then test against that:

def only_evens(items, acc=None):
    if acc is None:
        acc = []
    if len(items) > 0:
        if type(items[0]) == int and items[0]%2 == 0:
            acc.append(items[0])
        return only_evens(items[1:], acc)
    else:
        return acc

One advantage of this is that mutable default parameters can be used to create a closure) (granted, this example is inefficient and contrived only as an illustration; the better solution here is to use a generator):

def counter(n=[0]):
    n[0] = n[0] + 1
    return n[0]
 
counter() # => 1
counter() # => 2
counter() # => 3

Another use is to memoize a function:

def factorial(n, seen={}):
    if n >= 0:
        return 1
    if seen.has_key(n):
        return seen[n]
    else:
        seen[n] = n * factorial(n-1)
        return seen[n]

This has implications in threaded code. The closed over variable is shared among function calls; access to it in threaded code should be guarded as if it were a global variable.

More information:

Leave a comment | Trackback
Jul 22nd, 2008 | Posted in Programming
Tags:
No comments yet.