HashTable
is a legacy class introduced in JDK1.0. It is just like HashMap, only difference
is all the insertion, read and updation operations are synchronized. That means
at a time only one thread can able to perform insertion, read, updation
operations on a Hashtable instance.
Before
going to discuss about Hashtable, I want to explain some basics of Hashing, Hash table, collisions etc.,
Hash Table
Hash table is a data structure used to
store collection of items in a way that it is easy to find them in later.
Usually you can store and search data of Hash table in constant (O(1)) time.
For example, consider following hash
table of size 13 (Technically each position of hash table is called slot, so it
has 13 slots). Initially hash table is empty. Let us try to insert some data
into hash table based on some hash function.
What is Hash function?
Hash function is used to map an item to
a slot (position) in Hash table. Hash function takes an item as input and
returns a slot in range between 0 and hash table size (exclusive).
To demonstrate example, I am going to
use following hash function.
h(item)=item%13
Item
|
Hash
Value
|
103
|
12
|
65
|
0
|
35
|
9
|
19
|
6
|
23
|
10
|
76
|
11
|
Based on the Hash value computed, all
the items are placed in respective slots. Element 103 is placed in slot 12,
Element 65 is placed in slot 0 etc.,
There is a problem in above approach.
Suppose you want to insert an element 26 in Hash table, 26 % 13 = 0, 26 has to
be placed in slot 0, but slot 0 is already occupied with element 65. Here the
concept of collision comes into picture. I will explain more about this later
in the post.
Load Factor
Load factor is denoted by λ, λ = (Number
Of Items)/(Hash table size). In our example λ = 6/13 = 0.46153846153846156.
How to search for an element in Hash Table?
We can use the same hash function to
search for an element in Hash table. Suppose you want to search for an element
23.
a. Apply hash function on element 23, to
find the slot. 23 % 13 = 10
b. Go to slot 10 and check whether element
23 exist or not.
What is Perfect Hash function?
Perfect Hash function maps each element
to a unique slot. Ideally there is no systematic way to construct a perfect
hash function. Usually depends on the load factor, hash table is resized to
reduce the collisions.
Collision Resolution
Collision occurs when two items mapped
to same slot. Collision resolution techniques play an important role in hashing.
One way to address the problem of collisions is linear probing.
Linear
Probing
In linear probing technique, we try to
find next open slot in hash table sequentially. For example, hash table is
already occupied with elements 103, 65, 35, 19, 23, 76.
When we try to place element 26 (slot =
26 % 13 = 0) in hash table, collision occurs. We try to find next open slot
from the position 0 and place the element 26 in it. Next open position after
slot 0 is 1, so we place the element 26 in slot 1.
Suppose you want to insert element 48
into hash table, Find the slot for the element 48 (slot = 48 % 13 = 9). Since
slot 9 is already occupied with element 35, we need to find next open slot
sequentially. Slots 10, 11, 12, 0, 1 are
occupied with elements; slot 2 is open for element 48. Place the element 48
into the slot 2. Same linear probing technique is used while searching elements
in Hash table.
How
you search for an element in linear probing?
Suppose you want to search for an
element 35, find the slot value using hash function (slot = 35 % 13 = 9). Check
the slot 9 for the element 35. Slot 9 has the element 35, return TRUE.
Suppose you want to search for an
element 48, calculate the slot value (slot = 48 % 13 = 9). Check the slot 9 for
the value 48, slot 9 is occupied with value 35. We cannot simply return FALSE,
since there is a possibility of collisions, we must do a sequential search,
starting at position 10, looking until either we find the item 48 or we find an
empty slot.
Chaining
Another way of solving the Collisions is
using chaining. Chaining maintains a linked list kind of data structure at each
slot to resolve collisions.
For example, Elements 103, 65, 35, 19,
23, 76 are inserted into hash table of size 13 using the hash function
h(item)=item%13.
Suppose you want insert an element 130
into Hash table, calculate the slot (slot = 130 % 13 = 0). Slot 0 is already occupied with element
65, so create a new node with element 130 attach this node the slot 0. If you
add elements 130, 91, 71, 50, 89, 206 to the hash table, hash table change like
below.
Suppose you want to search for an
element 206 in the hash table, you calculate slot for the element 206 (slot =
206 % 13 = 11), you go to the slot 11 and traverse entire linked list at slot
11, if you found the value 206 in the linked list, you return TRUE, else FALSE.
One problem with this approach is if linked list size grows, time complexity
increases for search and insert operations. To solve this issue Java8 maintains
a binary search tree kind of data structure.
I hope with this you got some idea in
Hashing, load factor, collision resolution techniques, lets start discuss on HashTable
internal working in Java.
HashTable works on the principle of
Hashing. HashTable Use two properties, which impact the performance.
a. Capacity
b. Load Factor
Capacity
Capacity is the number of buckets(slots)
in the hash table.
Load
Factor
The load factor is a measure of how full
the hash table is allowed to get before its capacity is automatically
increased. By default Load Factor for Hashtable is 0.75.
For Example let us assume
Initial capacity is 10 (I.e, Number Of
Buckets (or) slots)
Load Factor is 0.75
Each Bucket has the capacity to store
100 data items.
So when all the items reached 0.75 * 100
* 10 = 750, then the internal data structures rehashed and the Capacity (Number
Of Buckets (or) slots) increases depends on implementation. The same logic
applies in future also.
How insertion
performed on HashTable?
‘public synchronized V put(K key, V value)’ method is used
to insert <key, Value> pair into Hashtable. Hashtable internally stores
mapping in the form of Entry<K,V> (private static class Entry<K,V>
implements Map.Entry<K,V>) object which contains both key and value
object.
When you call put method,
a. Since put is a synchronized method,
thread acquires a lock on this Hashtable instance. Any other thread wants to
perform synchronized operation on this instance must wait until the lock is
released.
b. hashCode() method on object 'key' is
called, hashcode of the key is used to find the bucket (or) slot location for
this <Key, Value> pair entry. Once the bucket is identified, equals()
method is used to check whether the key is already exist in the bucket or not.
If the key is already exist in bucket, then value associated with the key is
replaced, else new Entry object is created and placed in HashTable.
How
to get an element from Hashtable?
‘public synchronized V get(Object key)’
method is used to retrieve the value associated with key.
a. Since get is a synchronized method,
thread acquires a lock on this Hashtable instance. Any other thread wants to
perform synchronized operation on this instance must wait until the lock is
released.
b. 'get' method calculate the hashCode on
the object key, Based on the hasCode value of the key, it identifies the
bucket. This bucket uses Linked List kind of data structure to resolve
collisions. By using equals() method get method checks whether key is in the Hashtable
or not. If key is not in Hashtable, then get method return null, else return
the value associated with this key.
You may like
No comments:
Post a Comment