
Setting up Apache Spark & PySpark on a Mac

Posted at 2018-11-12

A memo purely for my own reference.

Installing Java

This tripped me up more than expected; referring to this article, install Java 8 like so:

$ brew tap caskroom/versions
$ brew cask install java8

Then append the following to .bashrc or .zshrc (these are file contents, not shell prompts):

export JAVA_HOME=`/usr/libexec/java_home -v "8"`
export PATH=${JAVA_HOME}/bin:${PATH}

Installing Apache Spark

Referring to this article, install it with:

$ brew install apache-spark

==> Downloading https://www.apache.org/dyn/closer.lua?path=spark/spark-2.3.2/spa
==> Downloading from http://ftp.jaist.ac.jp/pub/apache/spark/spark-2.3.2/spark-2
######################################################################## 100.0%
🍺  /usr/local/Cellar/apache-spark/2.3.2: 1,019 files, 243.9MB, built in 5 minutes 15 seconds

And that's it.

Trying out PySpark

$ pyspark

Try typing the command above.

Python 3.6.0 |Continuum Analytics, Inc.| (default, Dec 23 2016, 13:19:00) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
2018-11-12 23:38:44 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.3.2
      /_/

Using Python version 3.6.0 (default, Dec 23 2016 13:19:00)
SparkSession available as 'spark'.
>>>

If you see something like this, you're good.

Using PySpark in standalone mode

Make sure Python is set to version 3.

pyenv global 3.6.0

Start the master in the first terminal:

cd /usr/local/Cellar/apache-spark/2.3.2/libexec/bin
spark-class org.apache.spark.deploy.master.Master

Start a worker in a second terminal:

cd /usr/local/Cellar/apache-spark/2.3.2/libexec/bin
spark-class org.apache.spark.deploy.worker.Worker spark://IP:PORT

The IP and PORT are printed when the master starts up. Look for a log line like

Master:54 - Starting Spark master at spark://IP:PORT

and plug in the IP and PORT it shows.
Then run pyspark in a third terminal:

pyspark --master spark://IP:PORT