
Building a multi-user Jupyter Notebook server with JupyterHub + virtualenv

Posted at 2018-05-04

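I recently set up a deep learning server to be shared by several people.
I considered sharing a single-user Jupyter Notebook among multiple users, but the problem is that users could not use the Python environments they had built with virtualenv under their own accounts.
So I introduced JupyterHub, which is advertised as multi-user.
As a result, users can log in with their Linux user accounts, and each user can work in a separate environment.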

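There are many articles, on Qiita and elsewhere, about installing and configuring JupyterHub.
While building the server with their help, I felt that none of them explained everything end to end: installing JupyterHub, configuring it to start as a service, and making each user's Python environment usable.
Much of this overlaps with those earlier articles, but I am writing it down anyway, partly as my own working notes.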

There is still a lot I do not understand, so I would appreciate it if you could point out any mistakes.
If you know a better way to do something, please let me know.

(Note) The server I actually built has tensorflow-gpu installed, but that is outside the scope of this article, so I do not cover it at all.

Server configuration

Item	Details
Hardware	GALLERIA ZZ (model without OS)
OS	Ubuntu 16.04.4 LTS
CPU	Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
Memory	64GB (PC4-21300/16GB × 4)
GPU	NVIDIA GeForce GTX 1080Ti (OC)

Note: this is a newly purchased machine being set up from scratch.

Workflow

1. Installing JupyterHub
    1. Install virtualenv and activate it
    2. Install JupyterHub
    3. Install sudospawner
    4. Configure JupyterHub
2. Configuration on the user side
    1. Install virtualenv
    2. Create the ~/notebooks directory
    3. Create ~/.jupyter/jupyter_notebook_config.py

Installing JupyterHub

This time I use virtualenv for the installation.¹ ²
The steps below are run as the root user.³

sudo su -

1. Install virtualenv

I follow the TensorFlow installation documentation to install virtualenv.
In this setup, {target_directory} is set to /usr/local/tensorflow.

root@tensorflow02:~# apt-get install python3-pip python3-dev python-virtualenv
root@tensorflow02:~# virtualenv --system-site-packages -p python3 {target_directory}

Once the virtualenv has been created, activate it with the following command.
If it was set up correctly, the prompt comes back as (tensorflow) root@tensorflow02:~#.

root@tensorflow02:~# source {target_directory}/bin/activate
(tensorflow) root@tensorflow02:~#
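Besides looking at the prompt, the activation can also be checked from Python itself. The following is only a minimal sketch and not part of the original procedure: the helper name `in_virtualenv` is hypothetical. It relies on the fact that older virtualenv releases set `sys.real_prefix`, while venv and newer virtualenv releases make `sys.prefix` differ from `sys.base_prefix`.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases expose sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases change sys.prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return base_prefix != sys.prefix


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

Run with the activated environment's python3, the printed prefix should point at {target_directory} rather than at /usr.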

2. Install JupyterHub

I follow "Installation of Jupyterhub on remote server".
Running the commands below completes the JupyterHub installation.

root@tensorflow02:~# sudo apt-get install npm nodejs-legacy
root@tensorflow02:~# npm install -g configurable-http-proxy
root@tensorflow02:~# pip3 install jupyterhub
root@tensorflow02:~# pip3 install --upgrade notebook
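As a quick sanity check after these commands, it may help to confirm that the two executables the service needs can be found on PATH. This is a minimal sketch under that assumption; `missing_commands` is a hypothetical helper name, and the check says nothing about whether the installed versions work together.

```python
import shutil


def missing_commands(names):
    """Return the commands from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # The two executables installed by the commands above.
    required = ["jupyterhub", "configurable-http-proxy"]
    missing = missing_commands(required)
    if missing:
        print("not found on PATH:", ", ".join(missing))
    else:
        print("all required commands are on PATH")
```

Run it with the virtualenv activated, since jupyterhub is installed into the virtualenv's bin directory.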

3. Install sudospawner

Run the following command.

root@tensorflow02:~# pip install sudospawner

4. Configure JupyterHub

Following "Run jupyterhub as a system service", configure JupyterHub to start as a service.

First, create /etc/init.d/jupyterhub.
Copy https://gist.github.com/lambdalisue/f01c5a65e81100356379 and change the following four places.

1. Change the shell that runs the script from /bin/sh to /bin/bash
2. Add the command source /{virtualenv_directory}/bin/activate to activate the virtualenv
3. Add /{virtualenv_directory}/bin to PATH
4. Change DAEMON to /{virtualenv_directory}/bin/jupyterhub

The modified script looks like this.

jupyterhub

#!/bin/bash
### BEGIN INIT INFO
# Provides:          jupyterhub
# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start jupyterhub
# Description:       This file should be used to construct scripts to be
#                    placed in /etc/init.d.
### END INIT INFO

# Author: Katsuya Ishiyama <ishiyama.katsuya@gmail.com>
#
# Please remove the "Author" lines above and replace them
# with your own name if you copy and modify this script.

# Do NOT "set -e"

# Only for using virtualenv.
source /{virtualenv_directory}/bin/activate

# PATH should only include /usr/* if it runs after the mountnfs.sh script
PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/bin:/{virtualenv_directory}/bin
DESC="Multi-user server for Jupyter notebooks"
NAME=jupyterhub
DAEMON=/{virtualenv_directory}/bin/jupyterhub
DAEMON_ARGS="--config=/etc/jupyterhub/jupyterhub_config.py"
PIDFILE=/var/run/$NAME.pid
SCRIPTNAME=/etc/init.d/$NAME

# Exit if the package is not installed
[ -x "$DAEMON" ] || exit 0

# Read configuration variable file if it is present
[ -r /etc/default/$NAME ] && . /etc/default/$NAME

# Load the VERBOSE setting and other rcS variables
. /lib/init/vars.sh

# Define LSB log_* functions.
# Depend on lsb-base (>= 3.2-14) to ensure that this file is present
# and status_of_proc is working.
. /lib/lsb/init-functions

#
# Function that starts the daemon/service
#
do_start()
{
    # Return
    #   0 if daemon has been started
    #   1 if daemon was already running
    #   2 if daemon could not be started
    start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $DAEMON --test > /dev/null \
        || return 1
    start-stop-daemon --start --background --make-pidfile --quiet --pidfile $PIDFILE --exec $DAEMON -- \
        $DAEMON_ARGS \
        || return 2
    # Add code here, if necessary, that waits for the process to be ready
    # to handle requests from services started subsequently which depend
    # on this one.  As a last resort, sleep for some time.
}

#
# Function that stops the daemon/service
#
do_stop()
{
    # Return
    #   0 if daemon has been stopped
    #   1 if daemon was already stopped
    #   2 if daemon could not be stopped
    #   other if a failure occurred
    start-stop-daemon --stop --quiet --retry=TERM/30/KILL/5 --pidfile $PIDFILE --name $NAME
    RETVAL="$?"
    [ "$RETVAL" = 2 ] && return 2
    # Wait for children to finish too if this is a daemon that forks
    # and if the daemon is only ever run from this initscript.
    # If the above conditions are not satisfied then add some other code
    # that waits for the process to drop all resources that could be
    # needed by services started subsequently.  A last resort is to
    # sleep for some time.
    start-stop-daemon --stop --quiet --oknodo --retry=0/30/KILL/5 --exec $DAEMON
    [ "$?" = 2 ] && return 2
    # Many daemons don't delete their pidfiles when they exit.
    rm -f $PIDFILE
    return "$RETVAL"
}

#
# Function that sends a SIGHUP to the daemon/service
#
do_reload() {
    #
    # If the daemon can reload its configuration without
    # restarting (for example, when it is sent a SIGHUP),
    # then implement that here.
    #
    start-stop-daemon --stop --signal 1 --quiet --pidfile $PIDFILE --name $NAME
    return 0
}

case "$1" in
  start)
    [ "$VERBOSE" != no ] && log_daemon_msg "Starting $DESC" "$NAME"
    do_start
    case "$?" in
        0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
        2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
    esac
    ;;
  stop)
    [ "$VERBOSE" != no ] && log_daemon_msg "Stopping $DESC" "$NAME"
    do_stop
    case "$?" in
        0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
        2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
    esac
    ;;
  status)
    status_of_proc "$DAEMON" "$NAME" && exit 0 || exit $?
    ;;
  #reload|force-reload)
    #
    # If do_reload() is not implemented then leave this commented out
    # and leave 'force-reload' as an alias for 'restart'.
    #
    #log_daemon_msg "Reloading $DESC" "$NAME"
    #do_reload
    #log_end_msg $?
    #;;
  restart|force-reload)
    #
    # If the "reload" option is implemented then remove the
    # 'force-reload' alias
    #
    log_daemon_msg "Restarting $DESC" "$NAME"
    do_stop
    case "$?" in
      0|1)
        do_start
        case "$?" in
            0) log_end_msg 0 ;;
            1) log_end_msg 1 ;; # Old process is still running
            *) log_end_msg 1 ;; # Failed to start
        esac
        ;;
      *)
        # Failed to stop
        log_end_msg 1
        ;;
    esac
    ;;
  *)
    #echo "Usage: $SCRIPTNAME {start|stop|restart|reload|force-reload}" >&2
    echo "Usage: $SCRIPTNAME {start|stop|status|restart|force-reload}" >&2
    exit 3
    ;;
esac

:

After creating /etc/init.d/jupyterhub, run the following commands.

root@tensorflow02:~# chmod +x /etc/init.d/jupyterhub
root@tensorflow02:~# jupyterhub --generate-config -f /etc/jupyterhub/jupyterhub_config.py
root@tensorflow02:~# service jupyterhub start
root@tensorflow02:~# service jupyterhub stop
root@tensorflow02:~# update-rc.d jupyterhub defaults

This creates the JupyterHub configuration file jupyterhub_config.py under /etc/jupyterhub, the path passed to -f above.
For this setup, edit it as follows.⁴

jupyterhub_config.py

# c.Spawner.notebook_dir = ''
c.Spawner.notebook_dir = '~/notebooks'

# c.Authenticator.admin_users = set()
c.Authenticator.admin_users = {'user01', 'user02', 'user03'}

# c.Authenticator.whitelist = set()
c.Authenticator.whitelist = {'user01', 'user02', 'user03'}

You can now start JupyterHub with sudo service jupyterhub start.
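Because users log in with their Linux accounts, every name placed in the whitelist should correspond to an existing account on the server. A minimal sketch for checking that, assuming a Unix host where the standard `pwd` module is available; `missing_accounts` is a hypothetical helper and user01 to user03 are the placeholder names from the configuration above.

```python
import pwd


def missing_accounts(usernames):
    """Return the names in `usernames` that have no local Linux account."""
    missing = []
    for name in sorted(usernames):
        try:
            # getpwnam raises KeyError when the account does not exist.
            pwd.getpwnam(name)
        except KeyError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Placeholder names copied from the jupyterhub_config.py example.
    allowed = {"user01", "user02", "user03"}
    print("accounts that do not exist:", missing_accounts(allowed))
```

An empty list means every whitelisted name can be resolved to a local account.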

Configuration on the user side

The steps from here on are done by each user of JupyterHub.
They let each user build an individual development environment and use it from Jupyter Notebook.

1. Install virtualenv

$ virtualenv --system-site-packages -p python3 ~/{target_directory}

2. Create ~/notebooks⁴

$ mkdir ~/notebooks
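The same step can be done from Python, which is convenient when preparing the directory for several accounts in one script. This is a minimal sketch, not the author's method; `ensure_notebook_dir` is a hypothetical helper, and the directory name notebooks matches the c.Spawner.notebook_dir value used in this article.

```python
from pathlib import Path


def ensure_notebook_dir(home):
    """Create <home>/notebooks if it is missing and return its path."""
    notebook_dir = Path(home) / "notebooks"
    # exist_ok=True makes the call safe to repeat.
    notebook_dir.mkdir(parents=True, exist_ok=True)
    return notebook_dir


if __name__ == "__main__":
    # Creates ~/notebooks for the user running the script.
    print("notebook directory:", ensure_notebook_dir(Path.home()))
```

Unlike a plain mkdir, calling it a second time does not raise an error when the directory already exists.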

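3. Create ~/.jupyter/jupyter_notebook_config.py

For this step I referred to @kozo2's article "jupyterhub でユーザーローカルの Python 環境を利用するには" (Using a user-local Python environment with jupyterhub).
Create ~/.jupyter/jupyter_notebook_config.py with the following content.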

jupyter_notebook_config.py

# -*- coding: utf-8 -*-

import os

os.environ['PYTHONPATH'] = '/home/{username}/{virtualenv_directory}/bin/:/home/{username}/{virtualenv_directory}/lib/{python_version}/site-packages'

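PYTHONPATH should list both the directory containing the python you want to use in Jupyter Notebook and the directory containing the packages you want to use in Jupyter Notebook.
Note that the environment you want will not be available unless both are set.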
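To avoid typing the long path by hand, the value can be assembled from the virtualenv directory and the Python version. This is a minimal sketch, assuming a Linux host; `build_pythonpath` is a hypothetical helper, and "~/tensorflow" and "python3.5" are placeholder values to be replaced with your own.

```python
import os


def build_pythonpath(venv_dir, python_version):
    """Join the virtualenv's bin and site-packages directories with ':'."""
    venv_dir = os.path.expanduser(venv_dir)
    bin_dir = os.path.join(venv_dir, "bin")
    site_packages = os.path.join(venv_dir, "lib", python_version, "site-packages")
    return ":".join([bin_dir, site_packages])


if __name__ == "__main__":
    # "~/tensorflow" and "python3.5" are placeholder values.
    os.environ["PYTHONPATH"] = build_pythonpath("~/tensorflow", "python3.5")
    print(os.environ["PYTHONPATH"])
```

The function and the os.environ assignment could be placed in ~/.jupyter/jupyter_notebook_config.py instead of the hard-coded string.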

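That completes the setup.

Summary

I built a multi-user Jupyter Notebook server using virtualenv and JupyterHub.
Installation and configuration are complicated, but I see it as an advantage that Python environments built under Linux user accounts can be used.
And since JupyterHub spawns a separate process for each logged-in user, multiple users can develop and run code at the same time.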

Reference URLs

Installing TensorFlow on Ubuntu (Installing with Virtualenv)
https://www.tensorflow.org/install/install_linux#installing_with_virtualenv
Installation of Jupyterhub on remote server
https://github.com/jupyterhub/jupyterhub/wiki/Installation-of-Jupyterhub-on-remote-server
Run jupyterhub as a system service
https://github.com/jupyterhub/jupyterhub/wiki/Run-jupyterhub-as-a-system-service
JupyterHubの構築 (Building JupyterHub)
https://qiita.com/cvusk/items/afa46c35d8d5f0d930ed
Jupyter(hub)のNotebookをユーザー自身で環境変数を変更して使ってもらうための設定 (Settings that let users change environment variables themselves for Jupyter(hub) notebooks)
https://qiita.com/kozo2/items/1a22ca4b9da6e0364c4c

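Footnotes

¹ If you can use Anaconda, I think it is easier to install by following @cvusk's article "JupyterHubの構築".
² Installing directly into the system Python makes the later configuration simpler.
However, because Python is sometimes used for system administration (CentOS's yum in particular), some server administrators are reluctant to touch the system Python environment.
I think virtualenv is an effective approach in such cases too.
³ At first I wanted to install as a regular user and proceeded that way, but I got stuck because other users could not log in to Jupyter after running sudo service jupyterhub start, so I reinstalled as root.
sudospawner.py creates the Jupyter Notebook instance for the logged-in user, but it apparently could not spawn one because of a permissions problem.
Using systemdspawner looks like it could solve this, but I have not verified it.
⁴ With this configuration, login fails with HTTP error 500 unless ~/notebooks is created for each user. Please be careful.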
