How to connect master and slaves in Apache Spark? (Standalone Mode)

apache-spark

I'm using the Spark Standalone Mode tutorial page to install Spark in standalone mode.

1- I have started a master by:

./sbin/start-master.sh

2- I have started a worker by:

./bin/spark-class org.apache.spark.deploy.worker.Worker spark://ubuntu:7077

Note: spark://ubuntu:7077 is my master URL, which I can see in the Master WebUI.
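
If the worker needs to bind to a specific address or advertise limited resources, the Worker class also accepts a few options; the host, core count, and memory value below are placeholders for your own machine, so treat this as a sketch rather than something to copy verbatim:

./bin/spark-class org.apache.spark.deploy.worker.Worker --host 192.168.1.11 --cores 2 --memory 2g spark://ubuntu:7077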

Problem: With the second command, the worker starts successfully, but it cannot associate with the master. It retries repeatedly and then gives this message:

15/02/08 11:30:04 WARN Remoting: Tried to associate with unreachable    remote address [akka.tcp://sparkMaster@ubuntu:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: ubuntu/127.0.1.1:7077
15/02/08 11:30:04 INFO RemoteActorRefProvider$RemoteDeadLetterActorRef: Message [org.apache.spark.deploy.DeployMessages$RegisterWorker] from Actor[akka://sparkWorker/user/Worker#-1296628173] to Actor[akka://sparkWorker/deadLetters] was not delivered. [20] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
15/02/08 11:31:15 ERROR Worker: All masters are unresponsive! Giving up.

What is the problem?

Thanks

Best Answer

I usually start from the spark-env.sh template and set the properties I need. For a simple cluster you need:

  • SPARK_MASTER_IP
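
A minimal spark-env.sh for this setup might look like the following; the address is a placeholder for your master machine, and the port is only spelled out to make the spark://...:7077 URL visible:

# conf/spark-env.sh -- copied from conf/spark-env.sh.template
export SPARK_MASTER_IP=192.168.1.10   # placeholder: address the master binds to and advertises
export SPARK_MASTER_PORT=7077         # default master port, shown here for clarity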

Then, create a file called "slaves" in the same directory as spark-env.sh and list the slaves' IP addresses in it (one per line), as sketched below. Make sure you can reach all slaves through SSH.
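
The "slaves" file is just a list of worker addresses, one per line; the IPs below are hypothetical. A quick way to confirm SSH reachability is to try a passwordless login to each one:

# conf/slaves -- hypothetical worker addresses
192.168.1.11
192.168.1.12

ssh 192.168.1.11   # should log in without prompting for a password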

Finally, copy this configuration to every machine in your cluster. Then start the entire cluster by executing the start-all.sh script, and try spark-shell to check your configuration.

> sbin/start-all.sh
> bin/spark-shell
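
If spark-shell does not pick up the master on its own, you can point it at the master URL explicitly (the URL here matches the one from the question):

> bin/spark-shell --master spark://ubuntu:7077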