I was wondering how a hashtable finds the correct index after it increases its capacity. For example, let's assume I have a hashtable with a default capacity of 10, and I add the (key, value) pair [14, "hello 1"].
Using the index mechanism below, the index for key '14' is '4', so the hashtable saves this (key, value) pair at index 4.
int index = key.GetHashCode() % 10
Now we keep adding items to the hashtable until it reaches the load factor, so it's time to resize. Let's assume the hashtable resizes to 20.
Now I search for my old key '14' in this hashtable. With the same index mechanism applied to the new capacity of 20, the index for this key is now 14. So I would start searching at index 14, but the entry was stored at index 4.
So my question is: how does a hashtable keep track of the index of existing keys when it resizes? Or does it rehash all existing keys when it resizes?
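The mismatch described in the question can be reproduced in a few lines. This is only a sketch of the simple modulo scheme from the question; `IndexShiftSketch` and `IndexFor` are hypothetical names, and masking off the sign bit is an added assumption so that negative hash codes cannot produce a negative index.

```csharp
using System;

// Hypothetical helper illustrating the modulo scheme from the question.
static class IndexShiftSketch
{
    // Maps a key to a bucket index for a table with the given capacity.
    public static int IndexFor(int key, int capacity)
    {
        // Clear the sign bit so the remainder is never negative.
        int hash = key.GetHashCode() & 0x7FFFFFFF;
        return hash % capacity;
    }
}
```

For an `int` key, `GetHashCode()` returns the value itself, so `IndexShiftSketch.IndexFor(14, 10)` gives 4 while `IndexShiftSketch.IndexFor(14, 20)` gives 14. The same key lands in a different bucket once the capacity changes, which is why entries cannot simply stay where they were.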
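I've looked through the Shared Source CLI implementation for .NET and it looks like the entries are rehashed upon expansion. However, it is not necessary to recompute the hash code with .GetHashCode(), because each bucket already stores the hash code of its key (in hash_coll) and that stored value is reused.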
If you look through the implementation, you'll see the expand() method, in which the following steps occur:
- A temporary bucket array is created and sized to the smallest prime greater than double the current size.
- The new array is populated by rehashing from the old bucket array.
    for (nb = 0; nb < oldhashsize; nb++)
    {
        bucket oldb = buckets[nb];
        if ((oldb.key != null) && (oldb.key != buckets))
        {
            putEntry(newBuckets, oldb.key, oldb.val, oldb.hash_coll & 0x7FFFFFFF);
        }
    }
    private void putEntry (bucket[] newBuckets, Object key, Object nvalue, int hashcode)
    {
        BCLDebug.Assert(hashcode >= 0, "hashcode >= 0"); // make sure collision bit (sign bit) wasn't set.
        uint seed = (uint) hashcode;
        uint incr = (uint)(1 + (((seed >> 5) + 1) % ((uint)newBuckets.Length - 1)));
        do
        {
            int bucketNumber = (int) (seed % (uint)newBuckets.Length);
            if ((newBuckets[bucketNumber].key == null) || (newBuckets[bucketNumber].key == buckets))
            {
                newBuckets[bucketNumber].val = nvalue;
                newBuckets[bucketNumber].key = key;
                newBuckets[bucketNumber].hash_coll |= hashcode;
                return;
            }
            newBuckets[bucketNumber].hash_coll |= unchecked((int)0x80000000);
            seed += incr;
        } while (true);
    }
The new array has been built and will be used in subsequent operations.
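To tie this back to the question, here is a stripped-down sketch of the same idea. It is not the real Hashtable: `TinyTable`, `Place`, `NextPrime`, the initial size of 11 and the 0.72 fill threshold are all assumptions chosen for illustration, and duplicate keys and removal are not handled. It only shows that each entry keeps its hash code, and that expansion re-places every entry in the larger array using that cached value.

```csharp
using System;

// Hypothetical, simplified open-addressing table (not the real Hashtable).
class TinyTable
{
    private struct Bucket
    {
        public bool Used;
        public int Key;
        public string Val;
        public int Hash; // hash code cached when the entry was added
    }

    private Bucket[] buckets = new Bucket[11];
    private int count;

    public int Capacity { get { return buckets.Length; } }

    // Smallest odd prime >= min, by trial division (assumes min > 2).
    private static int NextPrime(int min)
    {
        for (int n = min | 1; ; n += 2)
        {
            bool prime = true;
            for (int d = 3; d * d <= n; d += 2)
                if (n % d == 0) { prime = false; break; }
            if (prime) return n;
        }
    }

    // Step size for double hashing, always in the range 1..length-1.
    private static uint Step(uint seed, int length)
    {
        return (uint)(1 + (((seed >> 5) + 1) % ((uint)length - 1)));
    }

    // Probes the target array until a free bucket is found.
    private static void Place(Bucket[] target, int key, string val, int hash)
    {
        uint seed = (uint)hash;
        uint incr = Step(seed, target.Length);
        while (true)
        {
            int i = (int)(seed % (uint)target.Length);
            if (!target[i].Used)
            {
                target[i] = new Bucket { Used = true, Key = key, Val = val, Hash = hash };
                return;
            }
            seed += incr;
        }
    }

    public void Add(int key, string val)
    {
        // Assumed threshold: expand before the table is more than 72% full.
        if (count + 1 > buckets.Length * 0.72) Expand();
        Place(buckets, key, val, key.GetHashCode() & 0x7FFFFFFF);
        count++;
    }

    private void Expand()
    {
        var bigger = new Bucket[NextPrime(buckets.Length * 2)];
        foreach (Bucket b in buckets)
        {
            // Reuse the cached hash; GetHashCode() is not called again.
            if (b.Used) Place(bigger, b.Key, b.Val, b.Hash);
        }
        buckets = bigger;
    }

    public string Find(int key)
    {
        int hash = key.GetHashCode() & 0x7FFFFFFF;
        uint seed = (uint)hash;
        uint incr = Step(seed, buckets.Length);
        for (int probes = 0; probes < buckets.Length; probes++)
        {
            int i = (int)(seed % (uint)buckets.Length);
            if (!buckets[i].Used) return null; // empty bucket ends the probe chain
            if (buckets[i].Hash == hash && buckets[i].Key == key) return buckets[i].Val;
            seed += incr;
        }
        return null;
    }
}
```

If key 14 is added first and more keys follow until the table expands, `Find(14)` still returns the original value, because lookups always compute the index against the current array length and the entry was moved to its new position during expansion.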
Also, from MSDN regarding Hashtable.Add():