
On Spark's Kubernetes native support

Posted at 2016-12-25

It is very late and already Christmas, but this is the day 14 article of the Opt Technologies Advent Calendar 2016.

As of 2016/12/25, Apache Spark supports the following three cluster managers.

  • Standalone
  • YARN
  • Mesos


(The official k8s Spark Example builds a Standalone cluster on top of k8s, so it does not count as native support.)
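As a rough illustration of how the cluster manager is selected, the sketch below maps a `--master` value to the manager it points at. The `spark://`, `mesos://` and `yarn` forms are the standard Spark ones. The `k8s://` form is the one used by the `spark-submit` command later in this article and is specific to the fork tried here. The function name `cluster_manager` and the host names in the usage lines are made up for illustration.

```python
def cluster_manager(master: str) -> str:
    """Return the cluster manager implied by a --master value (illustrative only)."""
    if master.startswith("spark://"):
        return "Standalone"
    if master == "yarn":
        return "YARN"
    if master.startswith("mesos://"):
        return "Mesos"
    if master.startswith("k8s://"):
        # Not a stock Spark scheme as of this article; provided by the fork.
        return "Kubernetes (fork)"
    raise ValueError(f"unrecognized master value: {master!r}")


# Hypothetical host names, for illustration only.
for value in ["spark://master-host:7077", "yarn", "mesos://mesos-host:5050", "k8s://default"]:
    print(value, "->", cluster_manager(value))
```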

Here we run the fork below on k8s to get a feel for Kubernetes native support. (It appears to be a prototype of the PR that resolves the relevant issue.)


Prerequisites:

  • JDK 1.7 or later
  • mvn
  • A Kubernetes cluster (minikube is fine)
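Before starting the build, it can save time to confirm that the required tools are on `PATH`. The following is a minimal sketch: the helper name `missing_tools` is made up, and treating `java`, `mvn` and `kubectl` as the executables to look for is an assumption based on the list above and the commands used in this article. It does not check the JDK version or whether the cluster is reachable.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names for the JDK, Maven and the Kubernetes CLI.
    missing = missing_tools(["java", "mvn", "kubectl"])
    print("missing:", ", ".join(missing) if missing else "none")
```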



$ git clone git@github.com:foxish/spark.git -b k8s-support --depth 1
$ cd spark
$ ./build/mvn -Pkubernetes -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests package

Once the build finishes successfully, let's submit a Spark job.

$ ./bin/spark-submit \
    --deploy-mode cluster \
    --class org.apache.spark.examples.SparkPi \
    --master k8s://default \
    --conf spark.executor.instances=5 \
    --conf spark.kubernetes.sparkImage=manyangled/kube-spark:dynamic \
    http://storage.googleapis.com/foxish-spark-distro/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar \
    10000
2016-12-25 11:47:32 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-12-25 11:47:32 INFO  KubernetesClusterScheduler:54 - Created KubernetesClusterScheduler instance
2016-12-25 11:47:33 WARN  KubernetesClusterScheduler:66 - Instances: 5
2016-12-25 11:47:33 INFO  KubernetesClusterScheduler:54 - Starting spark driver on kubernetes cluster
2016-12-25 11:47:33 INFO  KubernetesClusterScheduler:54 - Using as kubernetes-master: k8s://
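To make the individual options of the `spark-submit` command above easier to read, here is a small sketch that assembles the same argument list piece by piece. All values are copied from the command in this article. The function name `build_submit_command` and its parameter names are made up, and the comments on `spark.kubernetes.sparkImage` and `k8s://default` are my reading of this fork's behavior rather than documented facts.

```python
import shlex


def build_submit_command(executors, image, jar_url, app_arg):
    """Assemble the spark-submit argument list used in this article."""
    return [
        "./bin/spark-submit",
        # Cluster deploy mode: the driver also runs inside the cluster.
        "--deploy-mode", "cluster",
        # Main class of the example application.
        "--class", "org.apache.spark.examples.SparkPi",
        # k8s:// scheme provided by the fork; value taken from the article as-is.
        "--master", "k8s://default",
        # Number of executors requested (the log prints "Instances: 5").
        "--conf", f"spark.executor.instances={executors}",
        # Presumably the container image for the Spark pods (assumption).
        "--conf", f"spark.kubernetes.sparkImage={image}",
        # Application jar, followed by the argument passed to SparkPi.
        jar_url,
        str(app_arg),
    ]


cmd = build_submit_command(
    executors=5,
    image="manyangled/kube-spark:dynamic",
    jar_url="http://storage.googleapis.com/foxish-spark-distro/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar",
    app_arg=10000,
)
print(shlex.join(cmd))  # shlex.join requires Python 3.8 or later
```

Keeping the arguments as a list (rather than one long string) also makes it straightforward to hand them to `subprocess.run` without worrying about shell quoting.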


$ kubectl get pods
NAME                                  READY     STATUS    RESTARTS   AGE
spark-driver-fp7sd                    1/1       Running   0          23s
spark-executor-impz7-1                1/1       Running   0          13s
spark-executor-impz7-2                1/1       Running   0          13s
spark-executor-impz7-3                1/1       Running   0          13s
spark-executor-impz7-4                1/1       Running   0          13s
spark-executor-impz7-5                1/1       Running   0          13s 
$ kubectl logs -f spark-driver-fp7sd
2016-12-25 02:52:29 INFO  TaskSetManager:54 - Finished task 6924.0 in stage 0.0 (TID 6927) in 357 ms on localhost (executor 2) (6925/10000)
2016-12-25 02:52:29 INFO  TaskSetManager:54 - Starting task 6929.0 in stage 0.0 (TID 6932, localhost, executor 5, partition 6929, PROCESS_LOCAL, 5863 bytes)
2016-12-25 02:52:29 INFO  TaskSetManager:54 - Starting task 6930.0 in stage 0.0 (TID 6933, localhost, executor 4, partition 6930, PROCESS_LOCAL, 5863 bytes)
2016-12-25 02:52:29 INFO  TaskSetManager:54 - Finished task 6926.0 in stage 0.0 (TID 6929) in 288 ms on localhost (executor 4) (6926/10000)
2016-12-25 02:52:29 INFO  TaskSetManager:54 - Finished task 6925.0 in stage 0.0 (TID 6928) in 333 ms on localhost (executor 5) (6927/10000)
2016-12-25 02:52:29 INFO  TaskSetManager:54 - Starting task 6931.0 in stage 0.0 (TID 6934, localhost, executor 3, partition 6931, PROCESS_LOCAL, 5863 bytes)
2016-12-25 02:52:29 INFO  TaskSetManager:54 - Finished task 6927.0 in stage 0.0 (TID 6930) in 287 ms on localhost (executor 3) (6928/10000)
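The pod listing above shows one driver pod and five executor pods, which matches `spark.executor.instances=5`. As a rough sketch, that check can be automated by parsing the text output of `kubectl get pods`. The helper name `count_running` is made up, and the pod name prefixes `spark-driver-` and `spark-executor-` are simply what this run happened to produce, so treat them as assumptions.

```python
def count_running(pods_output: str, prefix: str) -> int:
    """Count pods whose name starts with `prefix` and whose STATUS is Running."""
    count = 0
    # Skip the header row (NAME READY STATUS RESTARTS AGE).
    for line in pods_output.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 3 and fields[0].startswith(prefix) and fields[2] == "Running":
            count += 1
    return count


# Output copied from the `kubectl get pods` run in this article.
sample = """\
NAME                                  READY     STATUS    RESTARTS   AGE
spark-driver-fp7sd                    1/1       Running   0          23s
spark-executor-impz7-1                1/1       Running   0          13s
spark-executor-impz7-2                1/1       Running   0          13s
spark-executor-impz7-3                1/1       Running   0          13s
spark-executor-impz7-4                1/1       Running   0          13s
spark-executor-impz7-5                1/1       Running   0          13s
"""

print("drivers:", count_running(sample, "spark-driver-"))
print("executors:", count_running(sample, "spark-executor-"))
```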






