Closure cells refer to values that the function needs but that come from the surrounding scope.
When Python compiles a nested function, it notes any variables that the function references but that are only defined in a parent function (not globals), and records them in the code objects for both the nested function and the parent scope. These are the co_freevars and co_cellvars attributes on the __code__ objects of these functions, respectively.
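As a minimal sketch of where these names end up, the snippet below inspects both code objects. The names outer, inner and count are hypothetical and chosen only for illustration.

```python
def outer():
    count = 0  # referenced by inner, so outer records it as a cell variable

    def inner():
        return count  # not local and not global: a free variable of inner

    return inner


inner = outer()

# The parent's code object lists the variable under co_cellvars ...
print(outer.__code__.co_cellvars)  # ('count',)
# ... and the nested function's code object lists it under co_freevars.
print(inner.__code__.co_freevars)  # ('count',)
```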
Then, when you actually create the nested function (which happens when the parent function is executed), those references are used to attach a closure to the nested function.
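The following sketch tries to show that the closure is attached each time the nested function is created, while the code object is compiled only once. The names make_greeter, greet and plain are assumptions made up for this example.

```python
def make_greeter(name):
    def greet():
        return 'hello ' + name  # name is a free variable of greet

    return greet


hello_ham = make_greeter('ham')
hello_eggs = make_greeter('eggs')

# Both functions share the single code object compiled for greet ...
print(hello_ham.__code__ is hello_eggs.__code__)  # True
# ... but each call of make_greeter attached a separate closure.
print(hello_ham.__closure__ is hello_eggs.__closure__)  # False
print(hello_ham(), hello_eggs())  # hello ham hello eggs


def plain():
    return 'no free variables'


# A function without free variables has no closure at all.
print(plain.__closure__)  # None
```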
A function closure holds a tuple of cells, one for each free variable (named in co_freevars); cells are special references to local variables of a parent scope that follow the values those local variables point to. This is best illustrated with an example:
def foo():
    def bar():
        print(spam)

    spam = 'ham'
    bar()
    spam = 'eggs'
    bar()
    return bar

b = foo()
b()
In the above example, the function bar has one closure cell, which points to spam in the function foo. The cell follows the value of spam. More importantly, once foo() completes and bar is returned, the cell continues to reference the value (the string eggs) even though the variable spam inside foo no longer exists.
Thus, the above code outputs:
>>> b=foo()
ham
eggs
>>> b()
eggs
and b.__closure__[0].cell_contents is 'eggs'.
Note that the closure is dereferenced when bar() is called; the closure doesn't capture the value here. That makes a difference when you produce nested functions (with lambda expressions or def statements) that reference the loop variable:
def foo():
    bar = []
    for spam in ('ham', 'eggs', 'salad'):
        bar.append(lambda: spam)
    return bar

for bar in foo():
    print(bar())
The above will print salad three times in a row, because all three lambda functions reference the spam variable, not the value it was bound to when the function object was created. By the time the for loop finishes, spam was bound to 'salad', so all three closures will resolve to that value.
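To make the shared reference visible, the sketch below checks that the three lambdas hold the very same cell object. For contrast, it also shows one common way to bind the value at creation time, using a default argument. The names funcs, cells, shared and foo_bound are hypothetical; the default-argument variant is an illustrative alternative, not part of the example above.

```python
def foo():
    bar = []
    for spam in ('ham', 'eggs', 'salad'):
        bar.append(lambda: spam)
    return bar


funcs = foo()
cells = [f.__closure__[0] for f in funcs]

# All three lambdas were created in the same call of foo,
# so their closures contain the same cell for spam.
shared = all(cell is cells[0] for cell in cells)
print(shared)  # True
print(cells[0].cell_contents)  # salad


def foo_bound():
    bar = []
    for spam in ('ham', 'eggs', 'salad'):
        # The default value is evaluated when the lambda is created,
        # so each function keeps the value spam had at that moment.
        bar.append(lambda spam=spam: spam)
    return bar


print([f() for f in foo_bound()])  # ['ham', 'eggs', 'salad']
```

In the second variant spam is a parameter of each lambda rather than a free variable, so those functions have no closure cell for it.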