Monday, September 17, 2018

Consistent Hashing

"Hashing" is a well known mechanism to enable fast searches. A "hash" is obtained by applying "some" (mathematical or a characteristic) function on the input(s). For an example -

Say we want to store n words in a hash table such that retrieval of a word is faster than a plain O(n) search. The simplest hash function in this case would be to take the first letter of the word. The set of possible hashes is then [a..z]: "america" maps to bucket 'a', "hash" maps to bucket 'h'. To search for a word, take its first letter, jump directly to that bucket, and then perhaps do a linear search within it.

Another hash function could be len(string).
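As a rough sketch of the first-letter scheme in Python (the class name and structure below are purely illustrative, not any standard library API):

# A toy hash table that buckets words by their first letter.
# Purely illustrative; real hash tables use much better hash functions.
from collections import defaultdict

class FirstLetterTable:
    def __init__(self):
        self.buckets = defaultdict(list)   # 'a'..'z' -> list of words

    def add(self, word):
        self.buckets[word[0]].append(word)

    def contains(self, word):
        # Jump straight to the bucket, then scan it linearly.
        return word in self.buckets[word[0]]

t = FirstLetterTable()
t.add("america")
t.add("hash")
print(t.contains("hash"))   # True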

Obviously, these approaches are sub-optimal, since multiple words will share the same hash and a linear search is still needed within a bucket. This can be improved by using a hash function that generates a unique hash per word, but the trade-off is that it leads to a much bigger, dynamically sized hash table.

A balance is generally struck with the following approach. A mathematical function is used to compute the hash first, and a fixed-size (say N) hash table is used. The hash is then mod'ed (hash % N) to obtain a number between 0 and N-1. Thus, whatever the hash, taking the mod ensures that the bucket index is always between 0 and N-1, so a fixed-size hash table works. To keep a fixed cap on the search operation, we also need to ensure that the size of each bucket stays bounded at, say, O(k), where k is referred to as the load factor. If a bucket grows beyond size k, it suggests that the hash table needs to be resized to something larger than N (say 2N).
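A minimal sketch of this fixed-size, mod-N scheme (the names, the bucket count of 8, and the simple overflow check are illustrative assumptions; Python's built-in hash() stands in for a real hash function):

# Fixed-size table: bucket index = hash % N, with a simple load-factor check.
N = 8            # number of buckets
LOAD_FACTOR = 4  # max entries we tolerate per bucket before resizing
table = [[] for _ in range(N)]

def put(key):
    idx = hash(key) % N          # always lands in 0..N-1
    table[idx].append(key)
    if len(table[idx]) > LOAD_FACTOR:
        print("bucket", idx, "overflowed; time to resize to", 2 * N)

def get(key):
    idx = hash(key) % N
    return key in table[idx]     # linear scan within one bucket only

put("america")
put("hash")
print(get("hash"))   # True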

Now that is a heavy operation, in the sense that all the keys need to be remapped by taking the new mod with 2N.

As an example, consider the following words with, say, 3 buckets; this is how they get assigned:

WORD   | HASH  | BUCKET
jue    | 32434 | 1
same   | 34244 | 3
jelly  | 7686  | 1
aman   | 35734 | 2
anuj   | 45642 | 1
deepak | 226   | 3
julie  | 24324 | 2

So "jue", "jelly", "anuj" got assigned to bucket 1; "aman", "julie" to bucket 2; and "same", "deepak" to bucket 3.
Or, this is how the map looks:

 1 -> jue, jelly, anuj
 2 -> aman, julie
 3 -> same, deepak

Say the load factor was 4. Adding a new word, say "foo", that falls into bucket 1 brings that bucket up to the load factor threshold, so we increase the number of buckets to, say, 6 (2*3). We now re-mod each entry by 6, calculate its new bucket position, and we are done. This costs a good number of CPU cycles and adds performance overhead, since all the keys need to be remapped to new buckets. (A small sketch of this resize cost follows below.)
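The sketch below reuses the example hashes above to count how many keys change buckets when going from 3 to 6 buckets (bucket numbers here are 0-based, unlike the 1-based labels in the table, and the hash for "foo" is an arbitrary illustrative value):

# How many keys change buckets when we go from 3 to 6 buckets?
hashes = {"jue": 32434, "same": 34244, "jelly": 7686, "aman": 35734,
          "anuj": 45642, "deepak": 226, "julie": 24324,
          "foo": 99991}   # "foo" hash made up for illustration

moved = 0
for word, h in hashes.items():
    old_bucket = h % 3
    new_bucket = h % 6
    if old_bucket != new_bucket:
        moved += 1
print(moved, "of", len(hashes), "keys had to be remapped")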

The approach of CONSISTENT HASHING comes to the rescue here, and is detailed below. Before delving into it, let's also note that the same hashing/consistent-hashing approach is used in distributed caches and distributed data stores (Redis, Kafka, load balancers, sharding systems) to balance load or data across multiple nodes (machines). Since the volumes dealt with in these systems are huge, on the order of billions or trillions of records, full re-hashing is never an option, hence CONSISTENT HASHING is used.

The approach ensures that, upon resizing the buckets in the cache, or when the number of nodes in the cluster increases or decreases, only about k/N keys need to be remapped, as opposed to remapping all k keys, where k is the number of keys and N is the number of nodes (taking the larger of the counts before and after the change).

It starts with mapping the hash onto a circle instead of a fixed-size array, and also mapping the available nodes (servers) onto the same circle. E.g., with a hash range of, say, 0 to MAX_INT, 0 being mapped to 0 degrees and MAX_INT mapped to 360 degrees, any hash in between maps to some point on the circle based on its angle, given by this formula:
angle = (hash / MAX_INT) * 360
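In code, that mapping might look like this (assuming 32-bit unsigned hashes for illustration; the names are not from any particular library):

MAX_INT = 2**32 - 1   # illustrative: 32-bit unsigned hash space

def hash_to_angle(h):
    # Map a hash in [0, MAX_INT] to its angle (in degrees) on the circle.
    return (h / MAX_INT) * 360

print(hash_to_angle(MAX_INT // 2))   # roughly 180 degrees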

Similarly, using the same pattern, the nodes are also put on the circle.

A convention is then chosen: any input is assigned to the nearest node found by moving clockwise (or anti-clockwise, depending on the agreed convention) around the circle.

Example: four nodes fall on the circle as below:

A (10 degrees)
B (60 degrees)
C (135 degrees)
D (200 degrees)

Say, input "julie" resulted to 100 degrees, moving in clockwise direction it will map to Node C. Say, "aman" resulted to 50 degrees, so it will map to Node B.

In the event of a node failure, say node B, only B's entries move to node C (the next node clockwise). The rest of the entries on A, C and D remain as they are.
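A minimal consistent-hashing ring sketch in Python, assuming the clockwise convention described above. The node positions come from hashing the node names (so they won't match the degree values in the example), and the MD5-based hash is just one convenient choice:

import bisect
import hashlib

def ring_hash(key):
    # Map any string to a position on the circle (0 .. 2^32 - 1 here,
    # instead of degrees; the idea is the same).
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

class ConsistentHashRing:
    def __init__(self, nodes):
        # Sorted list of (position, node) points on the circle.
        self.ring = sorted((ring_hash(n), n) for n in nodes)

    def lookup(self, key):
        # Walk "clockwise": pick the first node at or after the key's
        # position, wrapping around to the start of the ring if needed.
        pos = ring_hash(key)
        idx = bisect.bisect_left(self.ring, (pos, ""))
        return self.ring[idx % len(self.ring)][1]

ring = ConsistentHashRing(["A", "B", "C", "D"])
print(ring.lookup("julie"), ring.lookup("aman"))
# Dropping a node and rebuilding the ring only reassigns the keys that
# were sitting on that node; everything else stays where it was.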

Now, you might argue that this approach could be skewed, leaving a particular node loaded with most of the data (or requests). This can be fixed by assigning equal weights to the nodes (or different weights if the nodes differ in capacity), i.e. by placing several virtual points per node on the circle. Say we assign a weight of 10 to each node. We then distribute A1 to A10 randomly on the circle, B1 to B10 randomly on the circle, and so on, up to D1 to D10.
Now the angle of an input can fall close to any of the An, Bn, Cn or Dn points, so inputs end up spread across the nodes; the higher a node's weight, the better the chance of an input landing on that node.
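A standalone sketch of this weighted (virtual-node) idea; the "-1" .. "-10" suffix scheme and the weight of 10 are illustrative assumptions:

import bisect
import hashlib

def ring_hash(key):
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

def build_ring(nodes, weight=10):
    # Each physical node contributes `weight` virtual points on the circle,
    # e.g. "A-1" .. "A-10", so keys spread more evenly across nodes.
    return sorted((ring_hash(f"{node}-{i}"), node)
                  for node in nodes for i in range(1, weight + 1))

def lookup(ring, key):
    idx = bisect.bisect_left(ring, (ring_hash(key), ""))
    return ring[idx % len(ring)][1]

ring = build_ring(["A", "B", "C", "D"], weight=10)
print(lookup(ring, "julie"))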

That, in summary, is what is called consistent hashing, and it does not suffer from the cluster-resize performance problem. It is used very heavily in distributed systems: NoSQL data stores like Redis and Cassandra, Kafka, CDNs, and load balancers.

For further reading, refer to:

https://en.wikipedia.org/wiki/Consistent_hashing
