Static Hashing

  • Number of primary pages are fixed.
  • Overflow pages are needed.
  • Buckets contain data entries.
  • May result in long overflow chain and harm performance.

Extendible Hashing

Let bucket size be $K$.

Insertion Algorithm

  • Initially, there is an empty directory with global depth $G=0$, and a single empty bucket with local depth $L=0$.
  • Look up the directory to find the bucket where it should go.
    • When a bucket not full, just place it here.
    • When a bucket is full, split it by increasing the local depth and re-hash.
      • If the directory can fit that new bucket (or $G \geq max (L)$), then update directory to point to new bucket.
      • If the directory is full (or $G < max (L)$), then double the directory by incresing global depth and update pointers accordingly.

Linear Hashing

Let initial number of buckets be $N$, and bucket size $K$.

Search Algorithm

  • Hash the key $k$ using level $l$ hashing function $v=h_l(k)$.
    • If $v \geq next$ then $k$ is in bucket $v$.
    • If $v \lt next$, it means the bucket has been splitted in this round, so $k$ should be in $v=h_{l+1}(k)$.

Insertion Algorithm

  • Initially, there are $N$ empty buckets. $next$ is 0, level $l$ is 0.
  • Using search algorithm to find the correct bucket to go
    • If the bucket is not full, just put into it.
    • If the bucket is full, and this is not $next$, then add a overflow page and put the data. Split the $next$ bucket by re-hash using $h_{l+1}$ hashing function. And increase $next$.
    • If the bucket is full, and this is the $next$ bucket, split the $next$ bucket by re-hash all elements, including the one being inserted, using $h_{l+1}$ hashing function. And increase $next$
  • After above computation, if $next$ has excceed the last bucket (the $2^lN-1$-th bucket) in level $l$, namely $next >= 2^lN$, we need to reset $next=0$ and increase level $l$ by 1.

Exercises

You may use this tool: Decimal to Binary Converter

  1. Let bucket size be 4, insert the following sequence using extendible hashing.

    10 32 321 216 153 215 251 145 6 12

  2. Then delete the following sequence.

    321 153 215 251 145 10 6

  3. Repeat the first question using linear hashing, and let inital number of buckets be 2, bucket size be 2.

Solutions

Insertion using EH

dict [2] local depth bucket
00 [2] 32 216 12
01 [2] 321 153 145
10 [2] 10 6
11 [2] 215 251

After deletion.

dict [0] local depth bucket
  [0] 32 216 12

Insertion using LH

Level: 1

next bucket overflow
  32 216  
-> 321 153 145
  10 6  
  215 251  
  12  

Reference