JobTracker and TaskTracker
A Hadoop cluster is made up of several key processes (daemons), each designed to do a specific task. The key daemons that make up a Hadoop cluster are:
1. NameNode
2. DataNode
3. JobTracker (MRv1)
4. TaskTracker (MRv1)
5. ResourceManager (MRv2)
6. ApplicationMaster (MRv2)
7. NodeManager (MRv2)
8. SecondaryNameNode
etc.
You can run Hadoop in one of three supported modes:
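As a rough orientation, the daemons listed above can be grouped by the layer they belong to and the part they play in it. The sketch below is a plain Python lookup table; the `layer` and `role` labels and the helper `daemons_for` are illustrative names chosen for this example, not Hadoop configuration keys or APIs.

```python
# Illustrative lookup table. The "layer" and "role" labels are this sketch's
# own naming, not anything Hadoop itself defines.
DAEMONS = {
    "NameNode":          {"layer": "HDFS", "role": "master"},
    "SecondaryNameNode": {"layer": "HDFS", "role": "checkpoint helper"},
    "DataNode":          {"layer": "HDFS", "role": "worker"},
    "JobTracker":        {"layer": "MRv1", "role": "master"},
    "TaskTracker":       {"layer": "MRv1", "role": "worker"},
    "ResourceManager":   {"layer": "MRv2", "role": "master"},
    "NodeManager":       {"layer": "MRv2", "role": "worker"},
    "ApplicationMaster": {"layer": "MRv2", "role": "per-application coordinator"},
}


def daemons_for(layer):
    """Return the daemon names that belong to the given layer, sorted by name."""
    return sorted(name for name, info in DAEMONS.items() if info["layer"] == layer)


if __name__ == "__main__":
    for layer in ("HDFS", "MRv1", "MRv2"):
        print(layer, daemons_for(layer))
```

A cluster normally runs either the MRv1 pair (JobTracker, TaskTracker) or the MRv2 daemons for job execution, alongside the HDFS daemons.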
- Local (Standalone) Mode
- Pseudo-Distributed Mode
- Fully-Distributed Mode
Local (Standalone) Mode
By default, Hadoop is configured to run in a single-node, non-distributed mode, as a single Java process. No daemons are started and the local file system is used instead of HDFS. This makes the mode useful for debugging, but it does not give you a true distributed environment, so its use is limited to experimentation.
Pseudo-Distributed Mode
Like Standalone mode, Pseudo-Distributed Mode runs Hadoop on a single node, but it simulates a cluster (hence "pseudo"). The difference is that each Hadoop daemon runs in its own Java process, whereas in Local mode everything runs inside a single Java process. This mode is also limited in use and is suitable only for experimentation.
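A pseudo-distributed setup is usually enabled through two small configuration files. The sketch below generates minimal `core-site.xml` and `hdfs-site.xml` contents with Python's standard library. The property names `fs.defaultFS` and `dfs.replication` are real Hadoop properties; the port 9000 and the helper name `build_site_xml` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Render a dict of property names and values as a Hadoop *-site.xml string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Point the default file system at an HDFS NameNode on this machine.
# Port 9000 is an assumption; any free port can be used.
CORE_SITE = build_site_xml({"fs.defaultFS": "hdfs://localhost:9000"})

# There is only one DataNode on a single machine, so keep a single replica.
HDFS_SITE = build_site_xml({"dfs.replication": 1})

if __name__ == "__main__":
    print(CORE_SITE)
    print(HDFS_SITE)
```

The printed strings would be saved as `core-site.xml` and `hdfs-site.xml` in the Hadoop configuration directory before the daemons are started.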
Fully-Distributed Mode
In fully-distributed mode, the daemons run on separate nodes, forming a multi-node cluster. This setup offers true distributed computing along with built-in reliability, scalability and fault tolerance. It is the standard mode of operation in DEV, QA, UAT, PREPROD and Production environments.
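One informal way to tell the three modes apart is to look at the `fs.defaultFS` setting, which defaults to `file:///` when nothing is configured. The function below is only a heuristic sketch under that assumption: `guess_mode` and `LOCAL_HOSTS` are hypothetical names, the host `namenode.example.com` is a placeholder, and a real cluster may need more than this one property to be classified reliably.

```python
from urllib.parse import urlparse

# Host names treated as "this machine" by the heuristic (an assumption).
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def guess_mode(fs_default_fs="file:///"):
    """Guess the Hadoop mode from the fs.defaultFS value alone (heuristic)."""
    uri = urlparse(fs_default_fs)
    if uri.scheme == "file":
        # Local file system, no HDFS daemons involved.
        return "local"
    if uri.hostname in LOCAL_HOSTS:
        # A distributed file system whose NameNode is on this machine.
        return "pseudo-distributed"
    # NameNode lives on another host.
    return "fully-distributed"


if __name__ == "__main__":
    for value in ("file:///", "hdfs://localhost:9000", "hdfs://namenode.example.com:8020"):
        print(value, "->", guess_mode(value))
```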