Hash tables have O(1) average and amortized case complexity, but they suffer from O(n) worst-case time complexity. [And I think this is where your confusion is.]
Hash tables suffer from O(n) worst-case time complexity for two reasons:
- If too many elements hash to the same key, looking inside that key may take O(n) time (see the sketch after this list).
- Once a hash table exceeds its load factor, it has to rehash [create a new, bigger table and re-insert each element into it].
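
As a rough illustration of the first point, here is a minimal Python sketch (the `BadKey` class is made up for the demo) in which every key hashes to the same value, so dict lookups degrade to a linear scan of the colliding entries:

```python
import time

class BadKey:
    """Hypothetical key type whose hash sends every instance to the same bucket."""
    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 42                       # every key collides

    def __eq__(self, other):
        return isinstance(other, BadKey) and self.value == other.value

table = {}
n = 2_000                               # even a modest n makes the slowdown visible
for i in range(n):
    table[BadKey(i)] = i                # each insert probes past the earlier colliding keys

start = time.perf_counter()
_ = table[BadKey(n - 1)]                # one lookup now walks a chain of ~n colliding entries
print(f"lookup took {time.perf_counter() - start:.6f} s")
```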
However, they are said to have O(1) average and amortized case complexity because:
- It is very rare that many items will be hashed to the same key [if you chose a good hash function and your load factor is not too high].
- The rehash operation, which is O(n), can at most happen after n/2 ops, which are all assumed O(1). Thus when you sum the average time per op, you get: (n*O(1) + O(n)) / n = O(1). See the sketch after this list.
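
To see the amortized argument in practice, here is a small Python sketch that measures the average cost per insert into a dict as it grows; individual inserts occasionally pay for a full O(n) rehash, but the average per-op cost stays roughly flat:

```python
import time

def average_insert_time(n):
    """Mean time per insert for n insertions into an initially empty dict."""
    d = {}
    start = time.perf_counter()
    for i in range(n):
        d[i] = i                        # a few of these inserts trigger an O(n) rehash
    return (time.perf_counter() - start) / n

# The per-insert average stays roughly constant as n grows,
# even though the table is rehashed several times along the way.
for n in (10_000, 100_000, 1_000_000):
    print(f"n={n:>9,}: ~{average_insert_time(n) * 1e9:.0f} ns per insert")
```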
Note that because of the rehashing issue, real-time applications and applications that need low latency should not use a hash table as their data structure.
EDIT: Another issue with hash tables: cache
Another issue where you might see a performance loss in large hash tables is cache performance. Hash tables suffer from bad cache performance, and thus for large collections the access time might be longer, since you need to reload the relevant part of the table from memory back into the cache.
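
As a rough, hedged illustration of the cache effect (absolute numbers are dominated by interpreter overhead and depend on your machine), this Python sketch compares the average lookup time in a small dict that fits in the CPU caches with a much larger one that does not:

```python
import random
import time

def avg_lookup_time(n, lookups=200_000):
    """Mean time per lookup in a dict with n integer entries, using random keys."""
    d = {i: i for i in range(n)}
    keys = [random.randrange(n) for _ in range(lookups)]
    start = time.perf_counter()
    for k in keys:
        d[k]                            # O(1) lookup, but possibly a cache miss
    return (time.perf_counter() - start) / lookups

# Both lookups are O(1), but the large table no longer fits in the caches,
# so each access is more likely to go out to main memory.
for n in (1_000, 1_000_000):
    print(f"n={n:>9,}: ~{avg_lookup_time(n) * 1e9:.0f} ns per lookup")
```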