Installing Spark on GCE, on Ubuntu of course.

Install Spark on Google Compute Engine

Instead of using GCE's quick deploy, we'll create a new instance ourselves and install Spark on it.

Prerequisites

  • The local machine runs Ubuntu
  • gcloud is already set up

Steps (creating the instance)

1. On the VM instances screen, click the [New instance] button.

2. Pick a machine type and so on as you like; the boot disk is of course Ubuntu. Any version you prefer.

3. For the created instance, select the gcloud connection option.

4. Paste the command line it gives you into a terminal on your local Ubuntu machine.

Now you can connect from your local machine to the freshly created GCE Ubuntu instance.
All of the following steps are run on the VM you connected to via gcloud.
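
By the way, if you'd rather create the instance and connect to it from the command line instead of the console, something along these lines should be roughly equivalent (the instance name, zone, machine type and image below are only placeholders, so adjust them to taste):

# Create an Ubuntu instance (all values here are just examples)
$ gcloud compute instances create spark-test \
    --zone asia-east1-a \
    --machine-type n1-standard-1 \
    --image-family ubuntu-1404-lts \
    --image-project ubuntu-os-cloud

# Connect to it; this is essentially the command the console hands you in step 4
$ gcloud compute ssh spark-test --zone asia-east1-a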

Steps (installation)

1. Install Java 8
A plain Ubuntu VM comes without Java, so install it first.

$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java8-installer
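
The oracle-java8-installer package asks you to accept the Oracle license interactively. If you'd prefer the install to run unattended, the commonly cited debconf one-liner is the following; it has to be run before the apt-get install above, and treat it as an optional extra rather than a required step:

# Pre-accept the Oracle license (optional, run before installing oracle-java8-installer)
$ echo "oracle-java8-installer shared/accepted-oracle-license-v1-1 select true" | sudo debconf-set-selections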

Otherwise, answer [ENTER], [Y], [OK], [yes] and so on as prompted during the installation.
Once it's done, check that Java is there:

junk@instance-2:~$ java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
junk@instance-2:~$ 

Looks good.
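
Side note: the webupd8team PPA has since been discontinued, so if the oracle-java8-installer route no longer works for you, OpenJDK 8 (available in the standard repositories on Ubuntu 16.04 and later) serves just as well as a JVM for Spark:

$ sudo apt-get install -y openjdk-8-jdk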

2. Download Scala

$ cd ~
$ mkdir dl
$ cd dl
$ wget http://www.scala-lang.org/files/archive/scala-2.11.7.tgz
--2015-07-06 16:04:20--  http://www.scala-lang.org/files/archive/scala-2.11.7.tgz
Resolving www.scala-lang.org (www.scala-lang.org)... 128.178.154.159
Connecting to www.scala-lang.org (www.scala-lang.org)|128.178.154.159|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 28460530 (27M) [application/x-gzip]
Saving to: ‘scala-2.11.7.tgz’

scala-2.11.7.tgz                100%[======================================================>]  27.14M  5.57MB/s   in 8.3s   

2015-07-06 16:04:29 (3.27 MB/s) - ‘scala-2.11.7.tgz’ saved [28460530/28460530]

3. Extract it

$ tar -xzvf scala-2.11.7.tgz

4. Copy the extracted Scala into place and give it a symlink.

$ cd /usr/local/
$ sudo cp -r ~/dl/scala-2.11.7 .
$ sudo ln -sv scala-2.11.7/ scala
‘scala’ -> ‘scala-2.11.7/’

5. Download Spark

$ cd ~/dl
$ wget http://archive.apache.org/dist/spark/spark-1.4.0/spark-1.4.0-bin-hadoop2.6.tgz
--2015-07-06 16:11:16--  http://archive.apache.org/dist/spark/spark-1.4.0/spark-1.4.0-bin-hadoop2.6.tgz
Resolving archive.apache.org (archive.apache.org)... 192.87.106.229, 140.211.11.131, 2001:610:1:80bc:192:87:106:229
Connecting to archive.apache.org (archive.apache.org)|192.87.106.229|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 250194134 (239M) [application/x-tar]
Saving to: ‘spark-1.4.0-bin-hadoop2.6.tgz’

spark-1.4.0-bin-hadoop2.6.tgz   100%[======================================================>] 238.60M  6.62MB/s   in 45s    

2015-07-06 16:12:02 (5.32 MB/s) - ‘spark-1.4.0-bin-hadoop2.6.tgz’ saved [250194134/250194134]

6. Extract it

$ tar -xzvf spark-1.4.0-bin-hadoop2.6.tgz

7. Copy the extracted Spark into place and symlink it, same as before.

$ cd /usr/local/
$ sudo cp -r ~/dl/spark-1.4.0-bin-hadoop2.6 .
$ sudo ln -sv spark-1.4.0-bin-hadoop2.6/ spark
‘spark’ -> ‘spark-1.4.0-bin-hadoop2.6/’

8. Set up the paths

$ vi ~/.bashrc  

Add the following at the end of .bashrc:

export SCALA_HOME=/usr/local/scala
export SPARK_HOME=/usr/local/spark
export PATH=$SCALA_HOME/bin:$PATH
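
If you also want to run spark-shell and spark-submit without cd-ing into the Spark directory first, you can optionally add its bin directory to the PATH as well (not strictly needed for the steps below):

export PATH=$SPARK_HOME/bin:$PATH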

Reload it:

$ source ~/.bashrc
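
A quick sanity check that the new variables are picked up (optional):

$ scala -version
$ echo $SPARK_HOME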

9. Launch

$ cd $SPARK_HOME
$ ./bin/spark-shell
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/07/06 16:24:33 INFO SecurityManager: Changing view acls to: junk
15/07/06 16:24:33 INFO SecurityManager: Changing modify acls to: junk
15/07/06 16:24:33 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(junk); users with modify permissions: Set(junk)
15/07/06 16:24:33 INFO HttpServer: Starting HTTP Server
15/07/06 16:24:33 INFO Utils: Successfully started service 'HTTP class server' on port 45846.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.4.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45)
Type in expressions to have them evaluated.
Type :help for more information.
15/07/06 16:24:38 INFO SparkContext: Running Spark version 1.4.0
15/07/06 16:24:38 INFO SecurityManager: Changing view acls to: junk
15/07/06 16:24:38 INFO SecurityManager: Changing modify acls to: junk
15/07/06 16:24:38 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(junk); users with modify permissions: Set(junk)
15/07/06 16:24:39 INFO Slf4jLogger: Slf4jLogger started
15/07/06 16:24:39 INFO Remoting: Starting remoting
Mon Jul 06 16:24:42 UTC 2015 Thread[main,5,main] java.io.FileNotFoundException: derby.log (Permission denied)
15/07/06 16:24:43 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
----------------------------------------------------------------
Loaded from file:/usr/local/spark-1.4.0-bin-hadoop2.6/lib/spark-assembly-1.4.0-hadoop2.6.0.jar
java.vendor=Oracle Corporation
java.runtime.version=1.8.0_45-b14
user.dir=/usr/local/spark-1.4.0-bin-hadoop2.6
os.name=Linux
os.arch=amd64
os.version=3.19.0-21-generic
derby.system.home=null
Database Class Loader started - derby.database.classpath=''
15/07/06 16:24:45 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
15/07/06 16:24:45 INFO MetaStoreDirectSql: MySQL check failed, assuming we are not on mysql: Lexical error at line 1, column 5.  Encountered: "@" (64), after : "".
15/07/06 16:24:46 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/07/06 16:24:46 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/07/06 16:24:47 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/07/06 16:24:47 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/07/06 16:24:47 INFO ObjectStore: Initialized ObjectStore
15/07/06 16:24:48 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.1aa
15/07/06 16:24:48 INFO HiveMetaStore: Added admin role in metastore
15/07/06 16:24:48 INFO HiveMetaStore: Added public role in metastore
15/07/06 16:24:48 INFO HiveMetaStore: No user is added in admin role, since config is empty
15/07/06 16:24:48 INFO SessionState: No Tez session required at this point. hive.execution.engine=mr.
15/07/06 16:24:48 INFO SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.

scala> 

The last chunk of log output is long, so I trimmed parts of it.
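
If you want a slightly more convincing check than just reaching the scala> prompt, the distribution ships with example jobs. After quitting the shell with :quit (or from a second terminal), running the bundled SparkPi example should print a rough approximation of Pi near the end of its output; the trailing 10 is just the number of partitions to split the work into:

$ cd $SPARK_HOME
$ ./bin/run-example SparkPi 10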

The whole thing takes about 10 minutes.
