NameNode
The NameNode is the heart of HDFS. It maintains the HDFS metadata: files, directories, the list of blocks for each file, permissions, and so on. The metadata is persisted in a file named FSIMAGE. When the NameNode starts up, the FSIMAGE file is read and loaded into memory.
Any ongoing changes to files and directories are written to memory and to a temporary log file (the edit log). The NameNode does not save ongoing changes to FSIMAGE directly, because the FSIMAGE file can be large for a large HDFS, and updating a large file at runtime would be expensive and slow.
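The snapshot-plus-log idea above can be sketched in a few lines of Python. This is only a toy model, not how HDFS is implemented: the class name `ToyNameNode`, the JSON file formats, and the `create`/`delete` operations are assumptions made for illustration. It also assumes that logged changes are replayed on top of the snapshot at startup.

```python
import json
import os


class ToyNameNode:
    """Toy model (hypothetical): a full snapshot file plus an append-only edit log."""

    def __init__(self, fsimage_path, edits_path):
        self.fsimage_path = fsimage_path
        self.edits_path = edits_path
        self.namespace = {}  # in-memory metadata: path -> attributes

    def start(self):
        # Startup: load the snapshot into memory, then replay logged changes.
        if os.path.exists(self.fsimage_path):
            with open(self.fsimage_path) as f:
                self.namespace = json.load(f)
        if os.path.exists(self.edits_path):
            with open(self.edits_path) as f:
                for line in f:
                    self._apply(json.loads(line))

    def _apply(self, edit):
        # Apply one logged change to the in-memory metadata.
        if edit["op"] == "create":
            self.namespace[edit["path"]] = edit["attrs"]
        elif edit["op"] == "delete":
            self.namespace.pop(edit["path"], None)

    def record(self, edit):
        # Runtime change: append one small line to the log and update memory.
        # The (potentially large) snapshot file is never rewritten here.
        with open(self.edits_path, "a") as f:
            f.write(json.dumps(edit) + "\n")
        self._apply(edit)
```

Each call to `record` costs one small append regardless of how large the snapshot is, which is the point of keeping the log separate.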
Secondary NameNode
The Secondary NameNode keeps a copy of FSIMAGE. Periodically (with default settings, every hour, or sooner if one million edit-log transactions accumulate) it fetches the FSIMAGE file and the temporary log file from the NameNode and applies the log to the FSIMAGE file, thereby bringing FSIMAGE up to date. The merged FSIMAGE is then copied back to the NameNode.
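The merge step described above amounts to replaying the log on top of a copy of the snapshot. A minimal sketch, assuming a dictionary-based image and hypothetical `create`/`delete` edit records (none of these names come from Hadoop itself):

```python
import copy


def apply_edit(namespace, edit):
    """Apply one logged change to a metadata dictionary (hypothetical format)."""
    if edit["op"] == "create":
        namespace[edit["path"]] = edit["attrs"]
    elif edit["op"] == "delete":
        namespace.pop(edit["path"], None)


def checkpoint(fsimage, edit_log):
    """Return a new image with every logged edit applied in order.

    The input image is left untouched, mirroring the idea that the
    merge happens on a copy, away from the running NameNode.
    """
    merged = copy.deepcopy(fsimage)
    for edit in edit_log:
        apply_edit(merged, edit)
    return merged


fsimage = {"/data": {"type": "dir"}}
edit_log = [
    {"op": "create", "path": "/data/a.txt", "attrs": {"type": "file"}},
    {"op": "create", "path": "/tmp", "attrs": {"type": "dir"}},
    {"op": "delete", "path": "/tmp"},
]
new_fsimage = checkpoint(fsimage, edit_log)
```

Once the merged image is in place, the edits it already contains no longer need to be replayed, so the log stays short.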
Checkpointing on the Secondary NameNode relieves the NameNode of merging FSIMAGE with the temporary log file itself. However, the Secondary NameNode does not take over the functions of the NameNode if the NameNode encounters an issue. An administrator can manually bring up a NameNode from the Secondary NameNode's latest checkpoint, but this does not happen automatically, and changes made after that checkpoint may be lost.
The Secondary NameNode is also an older concept. Hadoop 2 and later support High Availability, in which a Standby NameNode shares the edit log with the active NameNode through the Quorum Journal Manager (QJM) or NFS shared storage and can take over on failure.
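With QJM, an edit counts as durable once a majority of journal nodes have accepted it. The sketch below illustrates only that majority rule. `ToyJournal` and `quorum_write` are hypothetical names, and real journal nodes also handle recovery and fencing, which are ignored here.

```python
class ToyJournal:
    """Hypothetical stand-in for one journal node."""

    def __init__(self, up=True):
        self.up = up
        self.edits = []

    def append(self, edit):
        if not self.up:
            raise ConnectionError("journal unreachable")
        self.edits.append(edit)


def quorum_write(edit, journals):
    """Send the edit to every journal; succeed only if a majority accept it."""
    acks = 0
    for journal in journals:
        try:
            journal.append(edit)
            acks += 1
        except ConnectionError:
            pass  # tolerate individual failures
    return acks > len(journals) // 2


journals = [ToyJournal(), ToyJournal(), ToyJournal(up=False)]
ok = quorum_write({"op": "create", "path": "/a.txt"}, journals)
```

With three journals, the write survives the loss of one of them but not two, which is why an odd number of journal nodes is the usual choice.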