This short how-to is a companion to How to Install a Distributed Apache Storm Cluster and shows how to debug an Apache Zookeeper installation. The debugging methods presented below include starting Zookeeper manually, checking that the Zookeeper server is reachable, finding the Zookeeper log files and looking inside for hints, and manually purging old Zookeeper data.
Manually Run Zookeeper
The simplest way to debug Zookeeper is to start it manually and verify that it is running.
sudo nohup /usr/share/zookeeper/bin/zkServer.sh start-foreground > /dev/null 2>&1 &
ps aux | grep zookeeper
You should see something like:
zookeep+ 4950 0.0 0.5 4396148 47324 ? Ssl 13:08 0:00 /usr/bin/java -Dzookeeper.log.dir=/var/log/zookeeper -Dzookeeper.root.logger=INFO,ROLLINGFILE -cp /etc/zookeeper/conf:/usr/share/java/jline.jar:/usr/share/java/log4j-1.2.jar:/usr/share/java/xercesImpl.jar:/usr/share/java/xmlParserAPIs.jar:/usr/share/java/netty.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-log4j12.jar:/usr/share/java/zookeeper.jar org.apache.zookeeper.server.quorum.QuorumPeerMain /etc/zookeeper/conf/zoo.cfg
If not, check out the log files for more information…
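One caveat with `ps aux | grep zookeeper` is that the `grep` process can show up in its own results. Here is a minimal sketch of an alternative check, assuming `pgrep` (from procps) is installed. The function names `zk_pids` and `zk_is_running` are made up for this example; the class name is the one visible in the `ps` output above.

```shell
# Hypothetical helpers, not part of Zookeeper itself.

# Print the PIDs of JVMs whose command line contains the Zookeeper main class.
# -f makes pgrep match against the full command line instead of the process name.
zk_pids() {
    pgrep -f 'org.apache.zookeeper.server.quorum.QuorumPeerMain'
}

# Report whether a Zookeeper server process was found.
zk_is_running() {
    if pids=$(zk_pids); then
        # Unquoted on purpose: joins several PIDs onto one line.
        echo "Zookeeper running, PID(s): $(echo $pids)"
    else
        echo "Zookeeper not running"
        return 1
    fi
}
```

Calling `zk_is_running` from an interactive shell prints one line and returns a non-zero status when nothing was found, so it can also be used in an `if` statement.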
Where are the Zookeeper log files?
After a few days of running, the Storm cluster stopped working, and I needed to debug what was going on. The first error I saw was on the Storm UI: “org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection refused”. After rechecking all the cfg files, I checked to see if nimbus1 could reach zkserver1 on port 2181.
nc zkserver1 2181 < /dev/null; echo $?
A non-zero exit status came back (a 0 would have meant that the connection succeeded), indicating that nimbus1 could not reach the Zookeeper instance. I checked whether Zookeeper was running; it was, so I needed to look into the Zookeeper logs.
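A port check only tells you whether something is listening. Zookeeper also answers the four-letter command `ruok` with `imok` when the server considers itself healthy. The sketch below assumes `nc` is installed and that the command is enabled on your server (newer Zookeeper releases require it to be whitelisted in the configuration). The function name `zk_ruok` is hypothetical.

```shell
# Hypothetical helper: ask a Zookeeper server whether it is OK.
# Usage: zk_ruok HOST [PORT]   (PORT defaults to 2181)
zk_ruok() {
    host="$1"
    port="${2:-2181}"
    # Send the four-letter command and capture the reply; -w 5 is a five second timeout.
    reply=$(printf 'ruok' | nc -w 5 "$host" "$port" 2>/dev/null)
    if [ "$reply" = "imok" ]; then
        echo "$host:$port answered imok"
    else
        echo "$host:$port did not answer imok"
        return 1
    fi
}
```

For the setup in this post, the call would be `zk_ruok zkserver1 2181`, run from nimbus1. A missing reply can mean that the port is unreachable, that the server is unhealthy, or that the command is disabled, so treat it as a hint to go and read the logs rather than as a diagnosis.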
sudo find / -name "zookeeper.log"
Once you have located the log file(s), you can open one with:
sudo nano /path/to/zookeeper.log
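Instead of scrolling through the whole file in an editor, you can pull out only the suspicious lines. This is a small sketch, assuming the log marks problems with the words `ERROR` or `Exception`; the function name `zk_log_errors` is made up for this example. The `ps` output earlier in this post shows `-Dzookeeper.log.dir=/var/log/zookeeper`, so on a similar installation that directory is a good first place to look.

```shell
# Hypothetical helper: show the most recent error-like lines of a log file.
# Usage: zk_log_errors LOG_FILE [MAX_LINES]   (MAX_LINES defaults to 20)
zk_log_errors() {
    log_file="$1"
    max_lines="${2:-20}"
    # -n prefixes each match with its line number, -E enables the ERROR|Exception alternation.
    grep -n -E 'ERROR|Exception' "$log_file" | tail -n "$max_lines"
}
```

For example, `zk_log_errors /path/to/zookeeper.log 50` prints at most the last 50 matching lines; prefix the underlying commands with `sudo` if the log is not readable by your user.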
Reading the logs revealed: “java.io.IOException: No space left on device zookeeper”, indicating that the auto-purging setup from the companion post needed to be tweaked.
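To confirm a "No space left on device" error, it helps to look at free disk space and at how many snapshot and transaction log files have piled up. The sketch below assumes the data lives under `/var/lib/zookeeper`, the path used later in this post; the function name `zk_data_summary` is hypothetical.

```shell
# Hypothetical helper: count snapshot and transaction log files under a data directory.
# Usage: zk_data_summary DATA_DIR
zk_data_summary() {
    data_dir="$1"
    # tr strips the padding that some wc implementations add around the count.
    snaps=$(find "$data_dir" -type f -name 'snapshot.*' | wc -l | tr -d ' ')
    logs=$(find "$data_dir" -type f -name 'log.*' | wc -l | tr -d ' ')
    echo "snapshots=$snaps logs=$logs"
}

# Example calls (the path is taken from this post; yours may differ):
#   df -h /var/lib/zookeeper          # free space on the device
#   sudo du -sh /var/lib/zookeeper    # space used by the Zookeeper data
#   zk_data_summary /var/lib/zookeeper
```

A large and growing number of snapshots and logs suggests that old data is not being purged.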
Purging Zookeeper Data Manually
To purge old Zookeeper data manually we use a tool called ‘PurgeTxnLog’, which is a class in ‘zookeeper.jar’. To build the classpath, you first need to find where the required jars are and what they are named.
sudo find / -name "zookeeper.jar"
cd /usr/share/java
ls -la
Finally, you can build the correct command to run ‘PurgeTxnLog’.
java -cp /usr/share/java/zookeeper.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-log4j12.jar:conf org.apache.zookeeper.server.PurgeTxnLog /var/lib/zookeeper/ /var/lib/zookeeper/ -n 3
After running it, you will see that it removed files:
Removing file: Jun 18, 2015 9:57:33 AM /var/lib/zookeeper/version-2/log.1b
Removing file: Jun 18, 2015 8:32:46 PM /var/lib/zookeeper/version-2/log.42e0
Removing file: Jun 18, 2015 9:57:42 AM /var/lib/zookeeper/version-2/snapshot.42df
Removing file: Jun 18, 2015 8:38:33 PM /var/lib/zookeeper/version-2/snapshot.e67b
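Because the purge command is long and easy to mistype, it can help to assemble it from its parts and review it before running. In the command above, the two directory arguments are the transaction log directory and the snapshot directory (the same path here), and `-n` is the number of recent snapshots to keep. The sketch below only prints the command. The function name `purge_cmd` is hypothetical, the jar names are the ones used in this post, and the check for a minimum of 3 reflects my understanding of the limit PurgeTxnLog enforces.

```shell
# Hypothetical helper: print (not run) a PurgeTxnLog command line.
# Usage: purge_cmd JAR_DIR DATA_DIR KEEP_COUNT
purge_cmd() {
    jar_dir="$1"
    data_dir="$2"
    keep="$3"
    # Assumption: PurgeTxnLog rejects retention counts below 3.
    if [ "$keep" -lt 3 ]; then
        echo "keep count must be at least 3" >&2
        return 1
    fi
    # Same jars and trailing "conf" entry as in the command shown in this post.
    cp="$jar_dir/zookeeper.jar:$jar_dir/slf4j-api.jar:$jar_dir/slf4j-log4j12.jar:conf"
    echo "java -cp $cp org.apache.zookeeper.server.PurgeTxnLog $data_dir $data_dir -n $keep"
}
```

Running `purge_cmd /usr/share/java /var/lib/zookeeper 3` prints a command equivalent to the one above, which you can check and then paste into the terminal.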
To automatically purge the Zookeeper data, follow the instructions posted at How to Install a Distributed Apache Storm Cluster.
Additional Links
- More info on configuration – http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html#sc_configuration
- Standalone mode – https://zookeeper.apache.org/doc/r3.3.2/zookeeperStarted.html#sc_InstallingSingleMode
- ZooKeeper Cluster (Multi-Server) Setup – http://myjeeva.com/zookeeper-cluster-setup.html
Final Words
There are several reasons why an Apache Zookeeper installation might not work correctly and need debugging. Did this article help you, or was there a different issue at hand? Either way, please leave a message below!