Python – Adding Spark Executors to Zeppelin


I’m setting up my cluster using Hortonworks (HDP 2.4). It is a 4-node cluster, each node with 16 GB of RAM and 8 CPUs. To use Python (PySpark), I also set up Spark with the Zeppelin Notebook.

My question is this: I started with a 3-node configuration and later added a fourth node (4 total, as mentioned above), but the number of Spark executors is still 3.

I’ve seen on the web that the number of executors can be set with SPARK_EXECUTOR_INSTANCES, but this parameter only appears in the spark-env template on Spark’s configuration page in the Ambari UI. It looks like YARN decides how many executors are actually granted, but I haven’t found any setting for this in the YARN configuration.
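For reference, my understanding is that SPARK_EXECUTOR_INSTANCES is just the spark-env equivalent of the spark.executor.instances property, so the same request can also be made from a plain PySpark script. The sketch below only shows which knob is involved; the memory and core values are illustrative, not my actual setup:

    from pyspark import SparkConf, SparkContext

    # Request 4 executors explicitly when submitting to YARN.
    # spark.executor.instances is the property behind SPARK_EXECUTOR_INSTANCES;
    # the memory and core values here are illustrative only.
    conf = (SparkConf()
            .setMaster("yarn-client")
            .setAppName("executor-count-test")
            .set("spark.executor.instances", "4")
            .set("spark.executor.memory", "4g")
            .set("spark.executor.cores", "2"))

    sc = SparkContext(conf=conf)

    # defaultParallelism scales with the executor cores YARN actually granted,
    # so it gives a rough sanity check that the new node is being used.
    print("default parallelism: %d" % sc.defaultParallelism)
    sc.stop()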


To be clear, how can I use Ambari to increase the number of executors in my Hortonworks Hadoop cluster?

Solution

Pietro, you can make this change in Zeppelin itself.

In the upper-right corner of the Zeppelin UI, open the menu and go to the “Interpreter” settings page.

That page lists a section for each interpreter; the last one is “Spark”, and you should be able to find this setting there.

If the property is not listed yet, just edit that section and add it, for example as sketched below.
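In the Spark interpreter’s properties, settings like these can be added (the property names are standard Spark-on-YARN settings; the values are only illustrative for 16 GB / 8-CPU nodes):

    spark.executor.instances   4
    spark.executor.memory      4g
    spark.executor.cores       2

After saving, Zeppelin asks to restart the interpreter; the new executor count only applies to paragraphs run after the restart.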

Hope this helps.
