The problem that amortization solves is that common operations may occasionally trigger slow ones. If we add up the worst cases, we are effectively asking how the program would perform if garbage collection were always running and every data structure had to be moved in memory on every operation. But if we ignore the worst cases, we are effectively ignoring that garbage collection sometimes does run, and that large lists sometimes do run out of allocated space and have to be moved to a bigger bucket.
We solve this by writing off the occasional big operation gradually, over time: we start paying for it as soon as we realize that it may be needed some day. This means the amortized cost of an operation is usually bigger than its real cost, because it includes a share of that future work, but occasionally the real cost is much bigger than the amortized one. On average, they come out to about the same.
The standard example people start with is a list implementation where we allocate 2x the space we currently need, and then reallocate and move everything if we use up that space. When I run foo.append(...) in this implementation, usually I just insert. But occasionally I have to copy the whole large list. However, if I have just copied and the list has n items, then after I append n more times I will need to copy 2n items to a bigger space. Therefore my amortized analysis of what it costs to append includes the cost of one insert plus moving 2 items. Over the next n times I call append, my estimate exceeds the real cost n-1 times and falls short the nth time, but it averages out exactly right.
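Here is a minimal sketch of that bookkeeping in Python. The class name and the real_cost / amortized_cost counters are made up for illustration, not taken from any real list implementation: every append is charged 3 units of work, one for the insert itself and two banked toward the next copy.

    # Toy doubling array that tracks real vs. amortized cost.
    # "Cost" counts element writes: 1 for the insert itself,
    # plus 1 per item moved during a reallocation.
    class DoublingArray:
        def __init__(self):
            self.capacity = 1
            self.items = [None]
            self.size = 0
            self.real_cost = 0       # work actually done so far
            self.amortized_cost = 0  # 3 charged per append: 1 insert + 2 future moves

        def append(self, value):
            self.amortized_cost += 3
            if self.size == self.capacity:
                # Out of room: move everything into a buffer twice as big.
                self.capacity *= 2
                new_items = [None] * self.capacity
                for i in range(self.size):
                    new_items[i] = self.items[i]
                    self.real_cost += 1
                self.items = new_items
            self.items[self.size] = value
            self.size += 1
            self.real_cost += 1

    arr = DoublingArray()
    for k in range(100):
        arr.append(k)
        # The amortized total never falls behind the real one; the banked
        # surplus is what pays for each big copy when it finally happens.
        assert arr.amortized_cost >= arr.real_cost
    print(arr.real_cost, arr.amortized_cost)

Running this, both totals grow at roughly 3 per append; the amortized total stays a little ahead, and the gap is the credit saved up toward the next big copy.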
(Python's real list implementation works like this, except that the new allocation is only around 9/8 the size of the old one.)
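If you want to watch that over-allocation happen (the exact growth pattern and factor vary by CPython version), sys.getsizeof shows the allocated size of the list object jumping only occasionally as you append:

    import sys

    # The reported size jumps only at reallocation points; most appends
    # leave it unchanged because they fit in the over-allocated space.
    lst = []
    last = sys.getsizeof(lst)
    for i in range(100):
        lst.append(i)
        size = sys.getsizeof(lst)
        if size != last:
            print(f"len={len(lst):3d}  allocated bytes: {last} -> {size}")
            last = size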