
Steps to set up an environment for trying out Apache Spark on a Mac

Posted at 2016-07-29

Installation

Install with brew

brew install apache-spark
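
Homebrew also links the launcher scripts into /usr/local/bin, so (assuming the default Homebrew prefix) you can sanity-check the install right away; this should print the Spark 1.6.1 version banner:

spark-submit --version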

Set up the PATH

  • Check the install location
brew info apache-spark

(Result)
apache-spark: stable 1.6.1, HEAD
Engine for large-scale data processing
https://spark.apache.org/
/usr/local/Cellar/apache-spark/1.6.1 (842 files, 312.4M) *
  Built from source on 2016-07-29 at 23:32:40
From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/apache-spark.rb
  • Append to .bash_profile
echo 'export SPARK_HOME=/usr/local/Cellar/apache-spark/1.6.1' >> ~/.bash_profile
echo 'export PATH=${PATH}:${SPARK_HOME}/bin'                  >> ~/.bash_profile
source ~/.bash_profile
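
Note that hard-coding the versioned Cellar path means this breaks at the next brew upgrade. Homebrew also maintains a version-independent symlink under opt/, so the following variant (same idea, just upgrade-proof, assuming a standard Homebrew layout) may serve better:

echo 'export SPARK_HOME=/usr/local/opt/apache-spark' >> ~/.bash_profile
echo 'export PATH=${PATH}:${SPARK_HOME}/bin'         >> ~/.bash_profile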

Verify

which spark-shell

(Result)
/usr/local/bin/spark-shell

Try launching it

spark-shell

  • Run spark-shell
spark-shell

(Result)
spark-shell
/usr/local/Cellar/apache-spark/1.6.1/bin/load-spark-env.sh: line 2: /usr/local/Cellar/apache-spark/1.6.1/libexec/bin/load-spark-env.sh: Permission denied
/usr/local/Cellar/apache-spark/1.6.1/bin/load-spark-env.sh: line 2: exec: /usr/local/Cellar/apache-spark/1.6.1/libexec/bin/load-spark-env.sh: cannot execute: Undefined error: 0
  • It complained...
  • Try the command suggested in this article
unset SPARK_HOME && spark-submit
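
Reading the error: with SPARK_HOME pointing at the Cellar root, the wrapper at $SPARK_HOME/bin/load-spark-env.sh apparently tries to exec the copy under libexec/, which seems to lack the execute bit (it is normally sourced, not executed). If that is what is happening, granting the bit is a possible alternative fix, though I have not verified it:

ls -l /usr/local/Cellar/apache-spark/1.6.1/libexec/bin/load-spark-env.sh
chmod +x /usr/local/Cellar/apache-spark/1.6.1/libexec/bin/load-spark-env.sh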
  • Run spark-shell again
spark-shell

(Result)
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_40)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
16/07/30 00:00:20 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/07/30 00:00:20 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/07/30 00:00:24 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/07/30 00:00:24 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/07/30 00:00:26 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/07/30 00:00:26 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
SQL context available as sqlContext.

scala>
  • Confirmed it works, so exit
scala> :quit

Stopping spark context.
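
Incidentally, the Scala REPL also evaluates piped stdin and then exits, so a non-interactive smoke test is a one-liner (with the workaround above in effect, this should print 5050.0):

echo 'println(sc.parallelize(1 to 100).sum())' | spark-shell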
  • Add the "unset SPARK_HOME && spark-submit" we ran earlier to ~/.bash_profile, with its output redirected to /dev/null
echo 'unset SPARK_HOME && spark-submit > /dev/null 2>&1'   >> ~/.bash_profile
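
One caveat: that line starts a JVM in every new shell, which slows login noticeably. The unset is presumably the part that actually matters (spark-submit was only a probe), so appending just the unset should suffice:

echo 'unset SPARK_HOME' >> ~/.bash_profile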

pyspark

  • Run pyspark
pyspark

(Result)
Python 2.7.10 (default, Oct 23 2015, 19:19:21)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/07/30 00:02:43 INFO SparkContext: Running Spark version 1.6.1
16/07/30 00:02:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/07/30 00:02:44 INFO SecurityManager: Changing view acls to: LowSE01
16/07/30 00:02:44 INFO SecurityManager: Changing modify acls to: LowSE01
16/07/30 00:02:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(LowSE01); users with modify permissions: Set(LowSE01)
16/07/30 00:02:44 INFO Utils: Successfully started service 'sparkDriver' on port 65126.
16/07/30 00:02:45 INFO Slf4jLogger: Slf4jLogger started
16/07/30 00:02:45 INFO Remoting: Starting remoting
16/07/30 00:02:45 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.179.3:65127]
16/07/30 00:02:45 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 65127.
16/07/30 00:02:45 INFO SparkEnv: Registering MapOutputTracker
16/07/30 00:02:45 INFO SparkEnv: Registering BlockManagerMaster
16/07/30 00:02:45 INFO DiskBlockManager: Created local directory at /private/var/folders/pq/m1yrrt652vg03wf5q66xk0m00000gn/T/blockmgr-47e0a926-d889-4514-9be9-c5da7aaaeb63
16/07/30 00:02:45 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
16/07/30 00:02:45 INFO SparkEnv: Registering OutputCommitCoordinator
16/07/30 00:02:45 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/07/30 00:02:45 INFO SparkUI: Started SparkUI at http://192.168.179.3:4040
16/07/30 00:02:46 INFO Executor: Starting executor ID driver on host localhost
16/07/30 00:02:46 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 65128.
16/07/30 00:02:46 INFO NettyBlockTransferService: Server created on 65128
16/07/30 00:02:46 INFO BlockManagerMaster: Trying to register BlockManager
16/07/30 00:02:46 INFO BlockManagerMasterEndpoint: Registering block manager localhost:65128 with 511.1 MB RAM, BlockManagerId(driver, localhost, 65128)
16/07/30 00:02:46 INFO BlockManagerMaster: Registered BlockManager
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/

Using Python version 2.7.10 (default, Oct 23 2015 19:19:21)
SparkContext available as sc, HiveContext available as sqlContext.
>>>
  • Confirmed it works, so exit
>>> quit()

16/07/30 00:03:52 INFO SparkUI: Stopped Spark web UI at http://192.168.179.3:4040
16/07/30 00:03:52 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/07/30 00:03:52 INFO MemoryStore: MemoryStore cleared
16/07/30 00:03:52 INFO BlockManager: BlockManager stopped
16/07/30 00:03:52 INFO BlockManagerMaster: BlockManagerMaster stopped
16/07/30 00:03:52 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/07/30 00:03:52 INFO SparkContext: Successfully stopped SparkContext
16/07/30 00:03:52 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/07/30 00:03:52 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/07/30 00:03:52 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/07/30 00:03:52 INFO ShutdownHookManager: Shutdown hook called
16/07/30 00:03:52 INFO ShutdownHookManager: Deleting directory /private/var/folders/pq/m1yrrt652vg03wf5q66xk0m00000gn/T/spark-4f1301d3-b185-4882-b24f-51454a0f575d
16/07/30 00:03:52 INFO ShutdownHookManager: Deleting directory /private/var/folders/pq/m1yrrt652vg03wf5q66xk0m00000gn/T/spark-4f1301d3-b185-4882-b24f-51454a0f575d/pyspark-42c0de51-6381-4400-80f2-392e6aa2f1d0
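
If you want a non-interactive check of the Python side as well, the same kind of smoke test can go through spark-submit (a minimal sketch; /tmp/spark_smoke.py is just a throwaway file name):

cat > /tmp/spark_smoke.py <<'EOF'
from pyspark import SparkContext

# spark-submit puts pyspark on the module path; sum 1..100 on a local master
sc = SparkContext("local[2]", "smoke-test")
print(sc.parallelize(range(1, 101)).sum())   # expect 5050
sc.stop()
EOF
spark-submit /tmp/spark_smoke.py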