Apache HBase Architecture
- HBase is a column-oriented NoSQL database built on top of HDFS for storage; processing frameworks that run on YARN, such as MapReduce, can be used to process its data. This chapter discusses the HBase architecture.
Architecture
HBase has no concept of a database in the RDBMS sense; what an RDBMS would call a database is, roughly, a table in HBase. Tables are split into regions, which are served by Region Servers. The major components of HBase are:
1. Regions (MemStore, .META., -ROOT-)
2. HBase Region Server (Regions, HLog)
3. HMaster Server
4. ZooKeeper
The HMaster, the Region Servers, and ZooKeeper work together to coordinate and manage regions and to perform operations inside them.
The sections below go through the HBase components one by one and explain how each helps to store and process large data sets.
Region
- An HBase table (comparable to a schema/DB in an RDBMS) can be divided into a number of regions.
- All the columns of a column family are stored in a single MemStore of the region.
- A single region can contain more than one MemStore, one for each column family.
- The nodes of the HBase cluster to which regions are assigned are called Region Servers. A Region Server serves a group of regions to clients and can serve approximately 1,000 regions.
- A Region Server is responsible for handling, managing, and executing read and write operations on the set of regions assigned to it.
- The recommended maximum region size is 10 to 20 GB. For HBase clusters running version 0.90.x, the recommended maximum region size is 4 GB and the default is 256 MB.
- Although a Region Server can serve many more, you typically want to keep the region count low. Around 100 regions per Region Server has usually yielded the best results.
- Each region has one MemStore for each column family, which grows to a configurable size, usually between 128 and 256 MB. You can specify this size by using the hbase.hregion.memstore.flush.size property in the hbase-site.xml configuration file.
- The RegionServer dedicates some fraction of total memory to region MemStores based on the value of the hbase.regionserver.global.memstore.size configuration property. If usage exceeds this configurable size, HBase might become unresponsive or compaction storms might occur.
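The interaction between the two properties above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name and the sample figures (a 16 GB heap, a fraction of 0.4, a 128 MB flush size, two column families) are assumptions, not values taken from a real cluster.

```python
def max_active_regions(heap_bytes, global_memstore_fraction,
                       flush_size_bytes, column_families):
    """Estimate how many actively written regions fit in the MemStore budget.

    Each region holds one MemStore per column family, and each MemStore may
    grow up to the flush size before it is flushed.
    """
    memstore_budget = heap_bytes * global_memstore_fraction
    per_region = flush_size_bytes * column_families
    return int(memstore_budget // per_region)


GB = 1024 ** 3
MB = 1024 ** 2

# Hypothetical Region Server: 16 GB heap, 40% reserved for MemStores,
# 128 MB flush size, tables with 2 column families each.
print(max_active_regions(16 * GB, 0.4, 128 * MB, 2))  # prints 25
```

Under these assumed numbers, only about 25 regions can fill their MemStores at the same time, which is one reason a low region count per Region Server tends to behave better.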
MemStore
- There is one MemStore for each column family.
- It is the write cache. It stores all incoming data before the data is committed to disk, that is, to an HFile in HDFS.
- The data is sorted in lexicographical order before it is written to disk.
- When the MemStore reaches its threshold, it flushes all of its data to a new HFile, which is stored in HDFS.
- HBase contains multiple HFiles for each Column Family.
- The number of HFiles grows with every MemStore flush.
- The MemStore, like the Master Server, keeps track of what data was written last and to which HFile, so that the next write to an HFile continues from where the previous flush stopped.
- The HFile indexes are loaded in memory whenever an HFile is opened. This helps in finding a record in a single seek.
- In short, the MemStore is a buffer that holds in-memory modifications until they are flushed to store files.
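The flush behaviour described above can be sketched in a few lines. This is a toy model, not HBase code: the class name, the threshold counted in cells rather than bytes, and the tuple standing in for an HFile are all assumptions made for illustration.

```python
class MemStore:
    """Toy write cache for one column family (size counted in cells)."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold
        self.cells = {}    # row key -> value, held in memory
        self.hfiles = []   # each flush appends one immutable, sorted "HFile"

    def put(self, row_key, value):
        self.cells[row_key] = value
        if len(self.cells) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if not self.cells:
            return
        # Sort by row key (lexicographic order for strings) before writing.
        hfile = tuple(sorted(self.cells.items()))
        self.hfiles.append(hfile)   # a new file per flush, never an update
        self.cells = {}


store = MemStore(flush_threshold=3)
for key in ["row-c", "row-a", "row-b", "row-d"]:
    store.put(key, "value-of-" + key)

print(len(store.hfiles))   # one HFile was written after the third put
print(store.hfiles[0][0])  # first cell of that HFile
print(store.cells)         # the fourth put is still in memory
```

The third put triggers a flush, so the first three rows end up sorted in one file while the fourth stays in the buffer until the next flush.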
.META. & -ROOT-
- The hbase:meta table (formerly .META.) contains the metadata of all regions of all tables managed by the cluster.
- Location of -ROOT- and .META.: in versions before 0.96.0, the location of the -ROOT- table is stored in a ZooKeeper znode. When a client queries -ROOT-, it finds the Region Server hosting .META.; from .META. it finds where the data is hosted.
- .META. and -ROOT- are themselves tables made up of regions. The -ROOT- table holds the list of .META. table regions.
- The .META. table holds the list of all user-space regions. Entries in these tables are keyed by region name, where a region name is made of the table name the region belongs to, the region’s start row, its time of creation, and finally, an MD5 hash of all of the former.
- Key => Region key of the format ([table],[region start key],[region id])
- Values =>
  - info:regioninfo (serialized HRegionInfo instance for this region)
  - info:server (server:port of the RegionServer containing this region)
  - info:serverstartcode (start-time of the RegionServer process containing this region)
- As described in https://hbase.apache.org/book.html#arch.catalog.meta, the -ROOT- table was removed in HBase 0.96.0. The location of the .META. table is now stored in ZooKeeper, and the table has been renamed hbase:meta.
- When a failure occurs, HBase gives higher priority to bringing the META table back online.
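To illustrate how the key format ([table],[region start key],[region id]) lets a client find the region for a row, here is a minimal sketch. The catalog contents, host names, region ids, and function names are hypothetical, and a plain dictionary stands in for the real catalog table.

```python
import bisect

# Hypothetical catalog rows: (table, region start key, region id) -> server.
# An empty start key marks the first region of a table.
META = {
    ("users", "", 1001): "rs1.example.com:16020",
    ("users", "g", 1002): "rs2.example.com:16020",
    ("users", "p", 1003): "rs3.example.com:16020",
}


def meta_row_key(table, start_key, region_id):
    """Format a key as [table],[region start key],[region id]."""
    return f"{table},{start_key},{region_id}"


def locate_region(table, row_key):
    """Return (meta row key, server) of the region whose range holds row_key."""
    regions = sorted(k for k in META if k[0] == table)
    start_keys = [k[1] for k in regions]
    # The owning region is the last one whose start key is <= row_key.
    index = bisect.bisect_right(start_keys, row_key) - 1
    key = regions[index]
    return meta_row_key(*key), META[key]


print(locate_region("users", "harry"))
```

Because the entries sort by start key, a binary search is enough to find the owning region; the sketch assumes the table exists and has a first region with an empty start key.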
HMaster Server
- The HMaster Server acts similarly to the NameNode in HDFS.
- The HMaster handles a collection of Region Servers, which reside on DataNodes.
- It coordinates and manages the Region Servers, much as the NameNode manages DataNodes in HDFS.
- It assigns regions to the Region Servers and re-assigns regions to Region Servers during recovery and load balancing.
- It monitors all Region Server instances in the cluster (with the help of ZooKeeper) and performs recovery activities whenever a Region Server goes down.
- The HMaster performs DDL operations: it provides an interface for creating, deleting, and updating tables.
- Because an HBase cluster can be very large, the HMaster manages regions and Region Servers with the help of ZooKeeper.
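The assignment and recovery duties listed above can be pictured with a small sketch. It assumes a simple round-robin placement and a "least loaded survivor" rule; both are illustrative choices, and the function and server names are hypothetical rather than taken from HBase.

```python
def assign_regions(regions, servers):
    """Spread regions over servers in round-robin order."""
    assignment = {server: [] for server in servers}
    for i, region in enumerate(regions):
        assignment[servers[i % len(servers)]].append(region)
    return assignment


def recover(assignment, failed_server):
    """Move the regions of a failed server to the least loaded survivors."""
    orphaned = assignment.pop(failed_server)
    for region in orphaned:
        target = min(assignment, key=lambda s: len(assignment[s]))
        assignment[target].append(region)
    return assignment


servers = ["rs1", "rs2", "rs3"]
regions = [f"region-{n}" for n in range(6)]
plan = assign_regions(regions, servers)
print(plan)
print(recover(plan, "rs2"))
```

After the simulated failure of rs2, its two regions are handed to rs1 and rs3, so every region is still served by exactly one Region Server.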
Metadata Hierarchy
- The META table is a special HBase catalog table.
- It maintains a list of all regions in the HBase storage system, together with the Region Servers that host them.
- The META table contains keys and values. The key holds information about a region and its id; the value holds the location of the Region Server.
- In versions that have -ROOT-, the hierarchy is as follows: META has information about all regions of all tables, -ROOT- has information about all META regions, and ZooKeeper has the location of -ROOT-.
ZooKeeper
- ZooKeeper is, in effect, the master of the master (HMaster).
- ZooKeeper acts as a coordinator in HBase. It helps to monitor server state.
- Every Region Server, along with the HMaster Server, sends a heartbeat to ZooKeeper at regular intervals, and ZooKeeper checks which servers are alive.
- ZooKeeper also provides server failure notifications so that recovery measures can be executed.
- The inactive HMaster watches whether the active HMaster is still sending heartbeats to ZooKeeper. If the active HMaster fails to send a heartbeat, its session is deleted and the inactive HMaster becomes active.
- If a Region Server fails to send a heartbeat, its session expires and the HMaster performs suitable recovery actions.
- ZooKeeper also maintains the location of the server hosting the META table, which helps clients find any region.
- A client that wants the location of a particular region first asks ZooKeeper where the META table is hosted and then queries the META table.
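The heartbeat and failover behaviour described above can be modelled with a toy session tracker. This is not the ZooKeeper API: the class, the integer clock, the timeout value, and the server names are assumptions chosen to keep the example self-contained.

```python
class Coordinator:
    """Toy stand-in for ZooKeeper session tracking (not the real API)."""

    def __init__(self, session_timeout):
        self.session_timeout = session_timeout
        self.last_heartbeat = {}   # server name -> time of last heartbeat

    def heartbeat(self, server, now):
        self.last_heartbeat[server] = now

    def expire_sessions(self, now):
        """Drop servers whose heartbeat is too old and return their names."""
        expired = [s for s, t in self.last_heartbeat.items()
                   if now - t > self.session_timeout]
        for server in expired:
            del self.last_heartbeat[server]
        return expired


def elect_active_master(coordinator, candidates):
    """Pick the first candidate master that still has a live session."""
    for master in candidates:
        if master in coordinator.last_heartbeat:
            return master
    return None


zk = Coordinator(session_timeout=3)
for server in ["master-a", "master-b", "rs1"]:
    zk.heartbeat(server, now=0)

# master-b and rs1 keep sending heartbeats; master-a goes silent.
zk.heartbeat("master-b", now=4)
zk.heartbeat("rs1", now=4)

print(zk.expire_sessions(now=5))                          # ['master-a']
print(elect_active_master(zk, ["master-a", "master-b"]))  # master-b
```

Once the silent master's session expires, the standby is the first candidate with a live session and takes over as the active master.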
Region Server
A Region Server maintains various regions running on top of HDFS. The components of a Region Server are:
1. WAL
2. Block cache
3. MemStore
4. HFile
WAL
The Write Ahead Log (WAL), also known as the HLog, is a file attached to every Region Server. The WAL stores new data that has not yet been committed to permanent storage. It is used to recover that data after a failure.
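A minimal sketch of why logging first makes recovery possible is shown below. The class and method names are hypothetical, and a Python list stands in for the log file; the point is only the order of operations and the replay step.

```python
class RegionServerSketch:
    """Toy model: a durable log plus a volatile in-memory store."""

    def __init__(self):
        self.wal = []        # survives a crash (stands in for a file in HDFS)
        self.memstore = {}   # lost on a crash

    def put(self, row_key, value):
        self.wal.append((row_key, value))   # 1. log the edit first
        self.memstore[row_key] = value      # 2. then apply it in memory

    def crash(self):
        self.memstore = {}

    def recover(self):
        # Replay the log in order so later edits overwrite earlier ones.
        for row_key, value in self.wal:
            self.memstore[row_key] = value


server = RegionServerSketch()
server.put("row-1", "a")
server.put("row-1", "b")
server.put("row-2", "c")
server.crash()
server.recover()
print(server.memstore)   # {'row-1': 'b', 'row-2': 'c'}
```

Replaying the log in its original order restores the in-memory state exactly as it was before the simulated crash.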
Block Cache
The Block Cache resides in the Region Server. It keeps frequently read data in memory.
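One common way to keep frequently read data in memory is a least-recently-used (LRU) cache. The sketch below assumes LRU eviction, which the text above does not state; the class name, capacity, and block ids are hypothetical.

```python
from collections import OrderedDict


class BlockCacheSketch:
    """Toy read cache that evicts the least recently used block."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()   # block id -> data, oldest first

    def get(self, block_id):
        if block_id not in self.blocks:
            return None                       # cache miss
        self.blocks.move_to_end(block_id)     # mark as most recently used
        return self.blocks[block_id]

    def put(self, block_id, data):
        self.blocks[block_id] = data
        self.blocks.move_to_end(block_id)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # drop the least recently used


cache = BlockCacheSketch(capacity=2)
cache.put("block-1", "data-1")
cache.put("block-2", "data-2")
cache.get("block-1")             # block-1 is now the most recently used
cache.put("block-3", "data-3")   # evicts block-2
print(list(cache.blocks))        # ['block-1', 'block-3']
```

Blocks that keep being read stay at the recent end of the ordering, so they survive while rarely read blocks are evicted.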
HFile
- It stores the actual data on the disk.
- HFile is the main persistent storage in an HBase architecture.
- The MemStore flushes its data to a new HFile when its size exceeds the configured threshold.
- A store has zero or more store files, which are created when the MemStore fills up. These are called HFiles.
- These store files are immutable: HBase creates a new file on every MemStore flush and never writes to an existing HFile.
- Compaction combines the store files of a region into fewer store files to optimize performance.
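The idea behind compaction can be sketched as merging several sorted, immutable files into one. This is a simplification under stated assumptions: store files are modelled as tuples of (row key, value) pairs, the function name is hypothetical, and versions, timestamps, and deletes are ignored.

```python
def compact(store_files):
    """Merge store files (oldest first) into one sorted, immutable file.

    When the same row key appears in several files, the value from the
    newest file wins.
    """
    merged = {}
    for store_file in store_files:      # oldest to newest
        for row_key, value in store_file:
            merged[row_key] = value     # newer files overwrite older ones
    return tuple(sorted(merged.items()))


older = (("row-a", "1"), ("row-c", "3"))
newer = (("row-a", "9"), ("row-b", "2"))
print(compact([older, newer]))
```

A read that previously had to consult two files now needs only one, which is where the performance gain comes from.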
How does a request work in HBase?
- The client first approaches ZooKeeper with a read or write request and retrieves the location of the META table from it.
- Using the META table, the client finds out which region and Region Server to read from or write to. The client then caches this location information.
- For later requests, the client uses the cached information to go directly to the Region Server instead of retrieving its location from the META table again, which makes lookups faster.
- Only if a region has been moved does the client go back to the META table for fresh region and Region Server information.
- When the client has a write request, it sends the data to the Region Server at the cached address, where the data is first written to the WAL.
- Once the data is written to the WAL, it is copied to the MemStore.
- Once the data is in the MemStore, the client receives the acknowledgment.
- When the MemStore reaches its threshold, it flushes the data to an HFile.
- When the client has a read request, it again uses the cached location. The Block Cache is checked first, in case the requested data was read recently or frequently. If it is there, the data is returned from the cache.
- If the data is not in the Block Cache, the MemStore is checked next. It holds the most recently written data that has not yet been flushed to an HFile.
- If the data is not in the MemStore either, it is loaded from the HFile by way of the Block Cache.
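The lookup and read steps above can be tied together in a small simulation. Everything here is a hypothetical model: plain dictionaries stand in for the META table, the Block Cache, the MemStore, and the HFile, locations are cached per row key rather than per region, and cell versions are ignored.

```python
class ReadPathSketch:
    """Toy Region Server with the three places a read can be served from."""

    def __init__(self):
        self.block_cache = {}
        self.memstore = {}
        self.hfile = {}

    def read(self, row_key):
        """Return (value, source), checking the cache, MemStore, then HFile."""
        if row_key in self.block_cache:
            return self.block_cache[row_key], "block cache"
        if row_key in self.memstore:
            return self.memstore[row_key], "memstore"
        if row_key in self.hfile:
            value = self.hfile[row_key]
            self.block_cache[row_key] = value   # keep it for later reads
            return value, "hfile"
        return None, "not found"


class ClientSketch:
    """Toy client that caches locations after the first META lookup."""

    def __init__(self, meta, servers):
        self.meta = meta            # row key -> server name (stands in for META)
        self.servers = servers      # server name -> ReadPathSketch
        self.location_cache = {}
        self.meta_lookups = 0

    def locate(self, row_key):
        if row_key not in self.location_cache:
            self.meta_lookups += 1
            self.location_cache[row_key] = self.meta[row_key]
        return self.servers[self.location_cache[row_key]]

    def get(self, row_key):
        return self.locate(row_key).read(row_key)


rs1 = ReadPathSketch()
rs1.hfile["row-1"] = "flushed value"
rs1.memstore["row-2"] = "recent value"
client = ClientSketch(meta={"row-1": "rs1", "row-2": "rs1"},
                      servers={"rs1": rs1})

print(client.get("row-1"))   # first read is served from the HFile
print(client.get("row-1"))   # second read hits the block cache
print(client.get("row-2"))   # unflushed data comes from the MemStore
print(client.meta_lookups)   # one META lookup per distinct row key
```

The repeated read of row-1 triggers no second META lookup and is served from the cache, which is the saving the location cache and the Block Cache are meant to provide.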