Java – Namenode stuck in SAFEMODE after cluster reboot

I set up a 3-node Hadoop cluster (Apache Hadoop 2.8.0) with two NameNodes in HA mode using QJM. The two DataNodes run on the same machines as the NameNodes. The third node is for quorum purposes only.

Setup  
Node1 -> {nn1, dn1, jn1, zkfc1, zkServer1}  
Node2 -> {nn2, dn2, jn2, zkfc2, zkServer2}  
Node3 -> {jn3, zkServer3}

I had to stop the cluster (the servers were restarted), and since then I have not been able to start it successfully. After checking the logs, I see that both NameNodes are in safe mode and neither of them is loading blocks into memory. The NameNode UI shows the following status:

Safe mode is ON. The reported blocks 0 needs additional 6132675 blocks
to reach the threshold 0.9990 of total blocks 6138814. The number of
live datanodes 0 has reached the minimum number 0. Safe mode will be
turned off automatically once the thresholds have been reached.
61,56,984 files and directories, 61,38,814 blocks = 1,22,95,798 total
filesystem object(s). Heap Memory used 5.6 GB of 7.12 GB Heap Memory.
Max Heap Memory is 13.33 GB. Non Heap Memory used 45.19 MB of 49.75 MB
Committed Non Heap Memory. Max Non Heap Memory is 130 MB.
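
The figures in this status message can be cross-checked with a little arithmetic. The sketch below assumes the NameNode derives its safe-mode target by truncating (total blocks × threshold) to an integer, and that the comma-grouped counts use Indian-style digit grouping; the variable and function names are illustrative only.

```python
# Figures copied from the NameNode UI status message.
total_blocks = 6138814
threshold = 0.9990
reported_blocks = 0

# Assumption: the target is total_blocks * threshold, truncated to an integer.
target_blocks = int(total_blocks * threshold)
missing_blocks = target_blocks - reported_blocks


def parse_grouped(text):
    """Turn a comma-grouped count such as '61,38,814' into an int."""
    return int(text.replace(",", ""))


files_and_dirs = parse_grouped("61,56,984")
blocks = parse_grouped("61,38,814")
total_objects = parse_grouped("1,22,95,798")

print("blocks still needed:", missing_blocks)
print("object counts add up:", files_and_dirs + blocks == total_objects)
```

With zero reported blocks and zero live DataNodes, the entire target is still outstanding, which suggests the DataNodes are not reporting at all rather than the NameNode being short of heap.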

There are many JVM pause messages in the NameNode logs, so I tried increasing HADOOP_HEAPSIZE and the heap size in HADOOP_NAMENODE_OPTS, but neither helped.

Need help…

Solution

I fixed this issue after getting a response from the Hadoop user mailing list.
The problem was that the DataNodes' block reports were not reaching the NameNodes, so the reported block count stayed at 0. I checked the DataNode logs and found a message saying that ipc.maximum.data.length was smaller than required.

Adding the following property to core-site.xml fixed the issue for me:

<property>
    <name>ipc.maximum.data.length</name>
    <value>101372499</value>
</property>
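
To put the new value in perspective, here is a small sketch that reads the property out of a core-site.xml fragment and compares it with the default. It assumes the Hadoop 2.x default for ipc.maximum.data.length is 64 MiB (67108864 bytes); check that against the core-default.xml shipped with your version. The `<configuration>` wrapper and the helper name `read_property` are illustrative, not part of the original post.

```python
import xml.etree.ElementTree as ET

# The property from the answer, wrapped in a <configuration> root element
# the way it would appear in core-site.xml.
CORE_SITE_SNIPPET = """
<configuration>
  <property>
    <name>ipc.maximum.data.length</name>
    <value>101372499</value>
  </property>
</configuration>
"""

# Assumption: the Hadoop 2.x default for this property is 64 MiB.
ASSUMED_DEFAULT_BYTES = 64 * 1024 * 1024


def read_property(xml_text, name):
    """Return the value of the named property as a string, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


configured = int(read_property(CORE_SITE_SNIPPET, "ipc.maximum.data.length"))
print(f"configured: {configured} bytes ({configured / 2**20:.1f} MiB)")
print(f"assumed default: {ASSUMED_DEFAULT_BYTES} bytes")
print("raised above assumed default:", configured > ASSUMED_DEFAULT_BYTES)
```

The value that works depends on how large the block reports are, so a cluster with more blocks per DataNode may need a higher limit than the one shown here.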
