Posts

Showing posts from January, 2024

HDFS Infographic

As we already know, HDFS stores the data on 3 data nodes (replication) to guard against failure or corruption. Replication is useless if all 3 copies are stored on the same disk or rack: if that single disk or rack is corrupted, every copy is lost at once and the replication process achieves nothing. Storing the 3 replicas on different data nodes in the same rack does not help either, because losing the whole rack loses all of them. So the replicas have to be stored on different data nodes in different racks to protect the data properly. Here we discuss how to store the data on different data nodes in different racks and how to handle fault tolerance.

Replica Placement Strategy

The name node follows the Rack Awareness Policy algorithm when replicating data to 3 different data nodes. The Rack Awareness Policy in turn follows the replica placement strategy. Here the
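The rack-failure argument made in this excerpt can be checked with a small sketch. It is not HDFS code and does not use any Hadoop API: the cluster layout, the node and rack names, and the helper functions are all made up for illustration. It only shows why 3 replicas in one rack are lost together while replicas spread over two racks are not.

```python
from typing import Dict, List

# Hypothetical cluster layout (assumption): rack name -> data nodes in that rack.
CLUSTER: Dict[str, List[str]] = {
    "rack1": ["dn1", "dn2", "dn3"],
    "rack2": ["dn4", "dn5", "dn6"],
}


def rack_of(node: str) -> str:
    """Return the rack that holds the given data node."""
    for rack, nodes in CLUSTER.items():
        if node in nodes:
            return rack
    raise KeyError(f"unknown data node: {node}")


def survives_rack_failure(replicas: List[str], failed_rack: str) -> bool:
    """True if at least one replica sits outside the failed rack."""
    return any(rack_of(node) != failed_rack for node in replicas)


def survives_any_single_rack_failure(replicas: List[str]) -> bool:
    """True if the data stays readable no matter which one rack is lost."""
    return all(survives_rack_failure(replicas, rack) for rack in CLUSTER)


# All 3 replicas on different data nodes, but in the same rack.
same_rack = ["dn1", "dn2", "dn3"]
# 3 replicas on different data nodes spread over two racks.
spread = ["dn1", "dn4", "dn5"]

print("same rack survives:", survives_any_single_rack_failure(same_rack))
print("spread survives:", survives_any_single_rack_failure(spread))
```

With the same-rack placement, losing rack1 removes every copy, so the first check fails. With the spread placement, either rack can be lost and at least one replica remains.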