Java – Some problems with Flume

I have 2 CDH4 clusters.
One is CentOS 6.4 (real hardware) and the other is Ubuntu 12.04 (Amazon EC2).

All configuration files were generated by Cloudera Manager, not written by hand.
I am trying to run Cloudera-twitter-example. When I launch Flume on the CentOS cluster, it works fine. On the Ubuntu cluster, however, Flume writes this error to the log file:

2013-09-11 15:04:54,491 INFO org.apache.flume.instrumentation.MonitoredCounterGroup: Component type: SINK, name: HDFS started
2013-09-11 15:04:54,527 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: Unable to start EventDrivenSourceRunner: { source:com.cloudera.flume.source.TwitterSource{name:Twitter,state:IDLE} } - Exception follows.
java.lang.NoSuchMethodError: twitter4j.FilterQuery.setIncludeEntities(Z)Ltwitter4j/FilterQuery;

After some Googling, I found this solution in a comment by Suresh E Gopalan (August 20, 2013 at 2:43 AM):

So we have another JAR file,
search-contrib-0.9.1-cdh4.3.0-SNAPSHOT-jar-with-dependencies.jar, with
the same class, and it conflicts with the correct one in FLUME_CLASSPATH.
Temporarily rename it to a .org extension so that it is excluded
from the classpath at startup.

After I renamed this JAR, Flume started running on the Ubuntu cluster.
The CentOS cluster has the same JAR with the same class, but there I do not need to rename it.
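
One way to compare the two clusters is to list every JAR in a directory that bundles the conflicting class. The following is only a minimal sketch: the class name JarClassFinder and the method findJarsContaining are hypothetical names made up for this illustration, and the directory to scan (whichever directories end up on FLUME_CLASSPATH) has to be supplied by the reader.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of Flume or CDH): lists the JARs in one
// directory that contain a given class.
final class JarClassFinder {

    // Returns the *.jar files directly inside dir that contain className,
    // sorted by path. Files with any other extension (e.g. .org) are ignored.
    static List<Path> findJarsContaining(Path dir, String className) throws IOException {
        // A class "a.b.C" is stored in a JAR as the entry "a/b/C.class".
        String entry = className.replace('.', '/') + ".class";
        List<Path> hits = new ArrayList<>();
        try (Stream<Path> files = Files.list(dir)) {
            List<Path> jars = files
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .sorted()
                    .collect(Collectors.toList());
            for (Path jar : jars) {
                try (JarFile jf = new JarFile(jar.toFile())) {
                    if (jf.getEntry(entry) != null) {
                        hits.add(jar);
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java JarClassFinder <directory> [class name]");
            return;
        }
        // The default class name is taken from the error message in the log.
        String className = args.length > 1 ? args[1] : "twitter4j.FilterQuery";
        for (Path jar : findJarsContaining(Paths.get(args[0]), className)) {
            System.out.println(jar);
        }
    }
}
```

If more than one JAR is printed on either cluster, the class exists in several places, and which copy is loaded depends on the order of the classpath entries.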

Why is this happening and how should I change the Ubuntu cluster to have the same behavior without renaming?

Solution

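Rebuild flume-source yourself instead of downloading the pre-built snapshot .jar. The NoSuchMethodError means that the twitter4j.FilterQuery class loaded at runtime has no setIncludeEntities(boolean) method returning FilterQuery, which is the method TwitterSource was compiled against.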
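
To check which copy of a class the JVM actually picks up, and whether it has the method named in the error, a small reflection check can be run with the same classpath that Flume uses. This is a sketch under assumptions: ClassOrigin, locate, and hasMethod are hypothetical names introduced here, and how to obtain Flume's exact classpath is left to the reader.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.Arrays;

// Hypothetical diagnostic (not part of Flume or twitter4j).
final class ClassOrigin {

    // Returns the location (usually a JAR URL) a class was loaded from.
    static String locate(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    // True if cls has a public method with exactly this name, return type,
    // and parameter types. A NoSuchMethodError is raised when no method
    // with the expected name and descriptor can be found.
    static boolean hasMethod(Class<?> cls, String name, Class<?> returnType,
                             Class<?>... paramTypes) {
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(name)
                    && m.getReturnType().equals(returnType)
                    && Arrays.equals(m.getParameterTypes(), paramTypes)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Class and method names are taken from the error message in the log.
        String className = args.length > 0 ? args[0] : "twitter4j.FilterQuery";
        try {
            Class<?> cls = Class.forName(className);
            System.out.println("loaded from: " + locate(cls));
            // Descriptor (Z)Ltwitter4j/FilterQuery; means: takes a boolean,
            // returns the class itself.
            System.out.println("has setIncludeEntities(boolean): "
                    + hasMethod(cls, "setIncludeEntities", cls, boolean.class));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

If the reported location is the search-contrib JAR rather than the JAR that ships the expected twitter4j version, the classpath order is the likely cause of the error.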
