I just set up my first Hadoop cluster. My cluster looks like:
Node1 – NameNode + ResourceManager
Node2 – SecondaryNameNode
Node3 – DataNode (+NodeManager)
Node4 – DataNode (+NodeManager)
Node5 – DataNode
The output of Java’s jps command looks good on all machines.
My hadoop-hduser-namenode-node1.log (and the corresponding logs on the other machines) looks fine too. There is a single warning (fs.name.dir has only one directory configured).
I ran the wordcount job from the MapReduce examples twice on a 550 MB Apache log. The job ran and the output seems fine.
- My web UI at http://namenode:8088/ shows 3 Active Nodes with Node State Running, but I couldn’t see any changes at all while my jobs were running.
Java output after starting the job:
> 16/07/05 13:27:31 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
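One way to cross-check what the web UI shows is to ask the ResourceManager directly which applications it knows about, using its REST endpoint /ws/v1/cluster/apps. The following is only a rough sketch: the helper names and the sample response are made up for illustration, and the reading that an empty list means the job never reached YARN (for example because it ran in local mode) is an assumption, not something confirmed for this cluster.

```python
import json
from urllib.request import Request, urlopen


def apps_url(rm_base):
    """Build the ResourceManager REST URL that lists applications."""
    return rm_base.rstrip("/") + "/ws/v1/cluster/apps"


def summarize_apps(payload):
    """Return (id, name, state, finalStatus) tuples from a cluster/apps response."""
    apps = (payload.get("apps") or {}).get("app") or []
    return [
        (a.get("id"), a.get("name"), a.get("state"), a.get("finalStatus"))
        for a in apps
    ]


def fetch_apps(rm_base, timeout=10):
    """Query a live ResourceManager. Not called below because it needs the cluster."""
    req = Request(apps_url(rm_base), headers={"Accept": "application/json"})
    with urlopen(req, timeout=timeout) as resp:
        return summarize_apps(json.load(resp))


# Hypothetical sample response, only to show the shape that is being parsed.
sample = {
    "apps": {
        "app": [
            {
                "id": "application_0000000000000_0001",
                "name": "word count",
                "state": "FINISHED",
                "finalStatus": "SUCCEEDED",
            }
        ]
    }
}

print(apps_url("http://namenode:8088/"))
print(summarize_apps(sample))
# The ResourceManager reports "apps": null when it has no applications at all.
print(summarize_apps({"apps": None}))
```

Against the real cluster this would be called as fetch_apps("http://namenode:8088"), from a machine that can resolve that hostname.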
- The Java output also said ‘split into 5 blocks’. I’d like to see where the blocks are stored, to be sure that my cluster is working properly. Can I somehow check where the blocks are stored and whether replication works the way it should in theory?
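As a sanity check on the number 5, here is a small sketch of the arithmetic, plus the hdfs fsck invocation that lists each block of a file with the DataNodes holding its replicas. The 128 MB block size and the replication factor of 3 are assumptions (the Hadoop 2.x defaults for dfs.blocksize and dfs.replication), the 550 MB figure is approximate, and the HDFS path is a placeholder.

```python
import math

MB = 1024 * 1024


def expected_blocks(file_bytes, block_bytes):
    """Number of HDFS blocks needed for a file; the last block may be partial."""
    return math.ceil(file_bytes / block_bytes)


def fsck_command(hdfs_path):
    """Command that prints every block of a file and where its replicas live."""
    return ["hdfs", "fsck", hdfs_path, "-files", "-blocks", "-locations"]


file_size = 550 * MB    # approximate size of the Apache log from the post
block_size = 128 * MB   # assumption: Hadoop 2.x default dfs.blocksize
replication = 3         # assumption: default dfs.replication

blocks = expected_blocks(file_size, block_size)
print(blocks)                # 5, matching the 'split into 5 blocks' message
print(blocks * replication)  # 15 block replicas expected across the DataNodes

# Placeholder path; substitute the real location of the input file in HDFS.
print(" ".join(fsck_command("/user/hduser/input/access.log")))
```

If those assumptions hold, the fsck report should list 5 blocks for the file, each with 3 replica locations, and with only three DataNodes every node would hold a copy of every block.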
- My web UI at http://datanode:50075 only shows ‘DataNode on Datanode:50075’. I cannot click on an overview or get any helpful information here. When I first got my hands on Hadoop I used a single-node setup. There, this page had tons of information about node health, status and jobs, and I was able to browse HDFS. I’m missing this here, and I don’t understand why.