Setting up a WDL execution environment on macOS (OS X)
- Installing Docker
  - The analyses run with cromwell here use Docker, so install Docker beforehand
  - Requires macOS Sierra 10.12 or later
- Installing JDK 11.0.2
  - cromwell needs Java to run, so install a JDK
- Installing cromwell
  - It is a jar file, so downloading it is all that is needed
- Installing the Google Cloud SDK
Installing Docker
Install Docker from the link below.
https://www.docker.com/products/docker-desktop
An account must be created in order to download it.
See the site above for installation details.
Check the installed version:
$ docker --version
Docker version 18.09.1, build 4c52b90
In Docker's Preferences -> Advanced settings, set the resources available to Docker
(CPUs, Memory, Swap, etc.) to appropriate values.
Example:
CPUs: 3
Memory: 20.5 GiB
Swap: 1.0 GiB
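The values set in Preferences can be cross-checked from the terminal. The following is a minimal sketch, assuming the `NCPU` and `MemTotal` fields reported by `docker info`; the helper names `bytes_to_gib` and `docker_resources` are made up for this example. The memory Docker reports may be slightly lower than the value set in the GUI.

```shell
# Convert a byte count to GiB with one decimal place.
bytes_to_gib() {
  awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1024 / 1024 / 1024 }'
}

# Print the CPU count and memory visible to the Docker daemon.
# (Helper names here are hypothetical, not part of Docker.)
docker_resources() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found on PATH"
    return 0
  fi
  local ncpu mem
  # NCPU and MemTotal (in bytes) are fields reported by `docker info`.
  ncpu="$(docker info --format '{{.NCPU}}' 2>/dev/null)"
  mem="$(docker info --format '{{.MemTotal}}' 2>/dev/null)"
  if [ -z "$ncpu" ] || [ -z "$mem" ]; then
    echo "docker daemon not reachable"
    return 0
  fi
  echo "CPUs: ${ncpu}  Memory: $(bytes_to_gib "$mem") GiB"
}

docker_resources
```

If the reported numbers differ a lot from the Preferences values, Docker probably needs to be restarted for the new settings to take effect.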
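Installing JDK 11.0.2
Download OpenJDK from Oracle's site.
The downloaded archive is saved in the Downloads folder; double-click it to extract the tar.
Then use Terminal to move the extracted JDK to the appropriate system location on the Mac: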
$ sudo mv ~/Downloads/jdk-11.0.2.jdk /Library/Java/JavaVirtualMachines/
Check the installed version:
$ java --version
openjdk version "11.0.2" 2019-01-15
OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
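When the same check has to be scripted, it helps to reduce the version string to its major number. This is only a sketch: `java_major` is a helper invented for this example, and it assumes the version string is either in the current scheme (`11.0.2`) or the legacy one (`1.8.0_201`).

```shell
# Extract the major version from a Java version string.
# (java_major is a hypothetical helper for this example.)
java_major() {
  local v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;   # legacy scheme: 1.8.0_201 -> 8.0_201
  esac
  echo "${v%%[!0-9]*}"    # keep only the leading digits
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr; the version is the quoted token on line 1.
  installed="$(java -version 2>&1 | sed -n '1s/.*"\(.*\)".*/\1/p')"
  if [ -n "$installed" ]; then
    echo "installed Java major version: $(java_major "$installed")"
  else
    echo "java is on PATH but no runtime reported a version"
  fi
else
  echo "java not found on PATH"
fi
```

With the JDK installed above, the major version extracted from `11.0.2` would be 11.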
Installing cromwell
To get the latest version, open the link below in a browser.
https://github.com/broadinstitute/cromwell/releases/latest
Click cromwell-36.jar (168MB) to download it.
Create a directory for cromwell.
It is named cromwell here, but any location will do.
$ cd
$ mkdir cromwell
$ cd cromwell
$ cp ~/Downloads/cromwell-36.jar .
$ ln -s cromwell-36.jar cromwell.jar
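The symlink means later commands can always refer to `cromwell.jar`, whichever release is in use. The sketch below shows how the link could be re-pointed after a newer jar is downloaded. `use_cromwell` is a helper invented here, and `cromwell-37.jar` is only an empty placeholder file used for the demonstration.

```shell
# Re-point cromwell.jar at a given release so run commands never change.
# (use_cromwell is a hypothetical helper for this example.)
use_cromwell() {
  local version="$1"
  local jar="cromwell-${version}.jar"
  if [ ! -f "$jar" ]; then
    echo "missing $jar" >&2
    return 1
  fi
  ln -sf "$jar" cromwell.jar   # -f replaces an existing link
}

# Demonstration in a scratch directory with empty placeholder files.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
touch cromwell-36.jar cromwell-37.jar
use_cromwell 36
use_cromwell 37
readlink cromwell.jar
```

In the real cromwell directory only the `use_cromwell` call would be needed, run after copying the new jar next to the old one.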
Checking that it works
$ cat > myWorkflow.wdl
workflow myWorkflow {
  call myTask
}
task myTask {
  command {
    echo "hello world"
  }
  output {
    String out = read_string(stdout())
  }
}
Press Ctrl+D to write the input to the file.
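As an alternative to typing the workflow and pressing Ctrl+D, the same file can be written with a here-document, which is easier to paste and to repeat. This is just another way of creating the identical myWorkflow.wdl; the quoted `'EOF'` delimiter keeps the shell from expanding anything inside the WDL text.

```shell
# Write the sample workflow to myWorkflow.wdl in the current directory.
cat > myWorkflow.wdl <<'EOF'
workflow myWorkflow {
  call myTask
}

task myTask {
  command {
    echo "hello world"
  }
  output {
    String out = read_string(stdout())
  }
}
EOF

# Show what was written.
cat myWorkflow.wdl
```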
Running cromwell
First, run the sample created above.
The command and its output are as follows:
$ java -jar cromwell.jar run myWorkflow.wdl
[2019-01-23 10:01:17,99] [info] Running with database db.url = jdbc:hsqldb:mem:48166e1b-70d3-4eac-bc64-fccd7cd0bdee;shutdown=false;hsqldb.tx=mvcc
[2019-01-23 10:01:23,82] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2019-01-23 10:01:23,83] [info] [RenameWorkflowOptionsInMetadata] 100%
[2019-01-23 10:01:23,92] [info] Running with database db.url = jdbc:hsqldb:mem:80d12ef4-e7a9-4b80-8526-2f644d1ff595;shutdown=false;hsqldb.tx=mvcc
[2019-01-23 10:01:24,24] [info] Slf4jLogger started
[2019-01-23 10:01:24,49] [info] Workflow heartbeat configuration:
{
"cromwellId" : "cromid-face270",
"heartbeatInterval" : "2 minutes",
"ttl" : "10 minutes",
"writeBatchSize" : 10000,
"writeThreshold" : 10000
}
[2019-01-23 10:01:24,51] [info] Metadata summary refreshing every 2 seconds.
[2019-01-23 10:01:24,54] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
[2019-01-23 10:01:24,54] [info] WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
[2019-01-23 10:01:24,63] [info] CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
[2019-01-23 10:01:24,93] [info] JobExecutionTokenDispenser - Distribution rate: 50 per 1 seconds.
[2019-01-23 10:01:24,94] [info] SingleWorkflowRunnerActor: Version 36
[2019-01-23 10:01:24,94] [info] SingleWorkflowRunnerActor: Submitting workflow
[2019-01-23 10:01:24,97] [info] Unspecified type (Unspecified version) workflow 5b65edf5-3e6f-4f5c-995a-2d80429249cc submitted
[2019-01-23 10:01:24,98] [info] SingleWorkflowRunnerActor: Workflow submitted 5b65edf5-3e6f-4f5c-995a-2d80429249cc
[2019-01-23 10:01:24,99] [info] 1 new workflows fetched
[2019-01-23 10:01:24,99] [info] WorkflowManagerActor Starting workflow 5b65edf5-3e6f-4f5c-995a-2d80429249cc
[2019-01-23 10:01:24,99] [warn] SingleWorkflowRunnerActor: received unexpected message: Done in state RunningSwraData
[2019-01-23 10:01:25,01] [info] WorkflowManagerActor Successfully started WorkflowActor-5b65edf5-3e6f-4f5c-995a-2d80429249cc
[2019-01-23 10:01:25,01] [info] Retrieved 1 workflows from the WorkflowStoreActor
[2019-01-23 10:01:25,01] [info] WorkflowStoreHeartbeatWriteActor configured to flush with batch size 10000 and process rate 2 minutes.
[2019-01-23 10:01:25,05] [info] MaterializeWorkflowDescriptorActor [5b65edf5]: Parsing workflow as WDL draft-2
[2019-01-23 10:01:25,46] [info] MaterializeWorkflowDescriptorActor [5b65edf5]: Call-to-Backend assignments: myWorkflow.myTask -> Local
[2019-01-23 10:01:26,71] [info] WorkflowExecutionActor-5b65edf5-3e6f-4f5c-995a-2d80429249cc [5b65edf5]: Starting myWorkflow.myTask
[2019-01-23 10:01:27,20] [info] BackgroundConfigAsyncJobExecutionActor [5b65edf5myWorkflow.myTask:NA:1]: echo "hello world"
[2019-01-23 10:01:27,24] [info] BackgroundConfigAsyncJobExecutionActor [5b65edf5myWorkflow.myTask:NA:1]: executing: /bin/bash /Users/makoto/Documents/cromwell/cromwell-executions/myWorkflow/5b65edf5-3e6f-4f5c-995a-2d80429249cc/call-myTask/execution/script
[2019-01-23 10:01:29,57] [info] BackgroundConfigAsyncJobExecutionActor [5b65edf5myWorkflow.myTask:NA:1]: job id: 1254
[2019-01-23 10:01:29,59] [info] BackgroundConfigAsyncJobExecutionActor [5b65edf5myWorkflow.myTask:NA:1]: Status change from - to Done
[2019-01-23 10:01:30,80] [info] WorkflowExecutionActor-5b65edf5-3e6f-4f5c-995a-2d80429249cc [5b65edf5]: Workflow myWorkflow complete. Final Outputs:
{
"myWorkflow.myTask.out": "hello world"
}
[2019-01-23 10:01:30,82] [info] WorkflowManagerActor WorkflowActor-5b65edf5-3e6f-4f5c-995a-2d80429249cc is in a terminal state: WorkflowSucceededState
[2019-01-23 10:01:38,87] [info] SingleWorkflowRunnerActor workflow finished with status 'Succeeded'.
{
"outputs": {
"myWorkflow.myTask.out": "hello world"
},
"id": "5b65edf5-3e6f-4f5c-995a-2d80429249cc"
}
[2019-01-23 10:01:39,59] [info] Workflow polling stopped
[2019-01-23 10:01:39,60] [info] Shutting down WorkflowStoreActor - Timeout = 5 seconds
[2019-01-23 10:01:39,60] [info] Shutting down WorkflowLogCopyRouter - Timeout = 5 seconds
[2019-01-23 10:01:39,60] [info] Shutting down JobExecutionTokenDispenser - Timeout = 5 seconds
[2019-01-23 10:01:39,60] [info] Aborting all running workflows.
[2019-01-23 10:01:39,60] [info] JobExecutionTokenDispenser stopped
[2019-01-23 10:01:39,60] [info] WorkflowStoreActor stopped
[2019-01-23 10:01:39,61] [info] WorkflowLogCopyRouter stopped
[2019-01-23 10:01:39,61] [info] Shutting down WorkflowManagerActor - Timeout = 3600 seconds
[2019-01-23 10:01:39,61] [info] WorkflowManagerActor All workflows finished
[2019-01-23 10:01:39,61] [info] WorkflowManagerActor stopped
[2019-01-23 10:01:39,61] [info] Connection pools shut down
[2019-01-23 10:01:39,61] [info] Shutting down SubWorkflowStoreActor - Timeout = 1800 seconds
[2019-01-23 10:01:39,61] [info] Shutting down JobStoreActor - Timeout = 1800 seconds
[2019-01-23 10:01:39,61] [info] Shutting down CallCacheWriteActor - Timeout = 1800 seconds
[2019-01-23 10:01:39,61] [info] Shutting down ServiceRegistryActor - Timeout = 1800 seconds
[2019-01-23 10:01:39,61] [info] SubWorkflowStoreActor stopped
[2019-01-23 10:01:39,61] [info] CallCacheWriteActor Shutting down: 0 queued messages to process
[2019-01-23 10:01:39,61] [info] Shutting down DockerHashActor - Timeout = 1800 seconds
[2019-01-23 10:01:39,62] [info] Shutting down IoProxy - Timeout = 1800 seconds
[2019-01-23 10:01:39,62] [info] CallCacheWriteActor stopped
[2019-01-23 10:01:39,62] [info] WriteMetadataActor Shutting down: 0 queued messages to process
[2019-01-23 10:01:39,62] [info] KvWriteActor Shutting down: 0 queued messages to process
[2019-01-23 10:01:39,62] [info] JobStoreActor stopped
[2019-01-23 10:01:39,62] [info] DockerHashActor stopped
[2019-01-23 10:01:39,62] [info] ServiceRegistryActor stopped
[2019-01-23 10:01:39,62] [info] IoProxy stopped
[2019-01-23 10:01:39,63] [info] Database closed
[2019-01-23 10:01:39,63] [info] Stream materializer shut down
[2019-01-23 10:01:39,63] [info] WDL HTTP import resolver closed
For the final output, check the JSON printed right after this log line:
[2019-01-23 10:01:38,87] [info] SingleWorkflowRunnerActor workflow finished with status 'Succeeded'.
{
"outputs": {
"myWorkflow.myTask.out": "hello world"
},
"id": "5b65edf5-3e6f-4f5c-995a-2d80429249cc"
}
From the workflow name and id above, the results are stored in the following folder:
cromwell-executions/myWorkflow/5b65edf5-3e6f-4f5c-995a-2d80429249cc
Check the contents:
$ cat cromwell-executions/myWorkflow/5b65edf5-3e6f-4f5c-995a-2d80429249cc/call-myTask/execution/stdout
hello world
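The path to the task's stdout can also be built from the final JSON instead of being copied by hand. The sketch below uses the JSON from the run above as sample input and assumes the directory layout seen in this log (cromwell-executions/<workflow>/<id>/call-<task>/execution/stdout); the variable names are made up for the example.

```shell
# Sample of the JSON block cromwell printed at the end of the run above.
result_json='{
  "outputs": {
    "myWorkflow.myTask.out": "hello world"
  },
  "id": "5b65edf5-3e6f-4f5c-995a-2d80429249cc"
}'

# Pull out the value of the "id" field.
workflow_id="$(printf '%s\n' "$result_json" | sed -n 's/.*"id": *"\([^"]*\)".*/\1/p')"

# Build the stdout path from the workflow name, id, and task name.
stdout_path="cromwell-executions/myWorkflow/${workflow_id}/call-myTask/execution/stdout"
echo "$stdout_path"

# Print the task output only if the run directory exists here.
if [ -f "$stdout_path" ]; then
  cat "$stdout_path"
fi
```

The workflow id differs on every run, so the sample JSON has to be replaced with the output of the actual run.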
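Installing the Google Cloud SDK
The detailed procedure is described here:
https://cloud.google.com/sdk/docs/quickstart-macos
From that page, download google-cloud-sdk-230.0.0-darwin-x86_64.tar.gz.
Double-click the file to extract it.
Then install the Google Cloud SDK with install.sh.
Several questions are asked along the way; simply pressing Return at each one is fine.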
$ mv ~/Downloads/google-cloud-sdk .
$ pushd google-cloud-sdk
$ ./install.sh
Welcome to the Google Cloud SDK!
To help improve the quality of this product, we collect anonymized usage data
and anonymized stacktraces when crashes are encountered; additional information
is available at <https://cloud.google.com/sdk/usage-statistics>. You may choose
to opt out of this collection now (by choosing 'N' at the below prompt), or at
any time in the future by running the following command:
gcloud config set disable_usage_reporting true
Do you want to help improve the Google Cloud SDK (Y/n)?
Your current Cloud SDK version is: 230.0.0
The latest available version is: 230.0.0
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Components │
├───────────────┬──────────────────────────────────────────────────────┬──────────────────────────┬───────────┤
│ Status │ Name │ ID │ Size │
├───────────────┼──────────────────────────────────────────────────────┼──────────────────────────┼───────────┤
│ Not Installed │ App Engine Go Extensions │ app-engine-go │ 56.4 MiB │
│ Not Installed │ Cloud Bigtable Command Line Tool │ cbt │ 6.3 MiB │
│ Not Installed │ Cloud Bigtable Emulator │ bigtable │ 5.6 MiB │
│ Not Installed │ Cloud Datalab Command Line Tool │ datalab │ < 1 MiB │
│ Not Installed │ Cloud Datastore Emulator │ cloud-datastore-emulator │ 18.3 MiB │
│ Not Installed │ Cloud Datastore Emulator (Legacy) │ gcd-emulator │ 38.1 MiB │
│ Not Installed │ Cloud Firestore Emulator │ cloud-firestore-emulator │ 32.2 MiB │
│ Not Installed │ Cloud Pub/Sub Emulator │ pubsub-emulator │ 33.4 MiB │
│ Not Installed │ Cloud SQL Proxy │ cloud_sql_proxy │ 3.7 MiB │
│ Not Installed │ Emulator Reverse Proxy │ emulator-reverse-proxy │ 14.5 MiB │
│ Not Installed │ Google Cloud Build Local Builder │ cloud-build-local │ 5.9 MiB │
│ Not Installed │ Google Container Registry's Docker credential helper │ docker-credential-gcr │ 1.8 MiB │
│ Not Installed │ gcloud Alpha Commands │ alpha │ < 1 MiB │
│ Not Installed │ gcloud Beta Commands │ beta │ < 1 MiB │
│ Not Installed │ gcloud app Java Extensions │ app-engine-java │ 107.5 MiB │
│ Not Installed │ gcloud app PHP Extensions │ app-engine-php │ 21.9 MiB │
│ Not Installed │ gcloud app Python Extensions │ app-engine-python │ 6.2 MiB │
│ Not Installed │ gcloud app Python Extensions (Extra Libraries) │ app-engine-python-extras │ 28.5 MiB │
│ Not Installed │ kubectl │ kubectl │ < 1 MiB │
│ Installed │ BigQuery Command Line Tool │ bq │ < 1 MiB │
│ Installed │ Cloud SDK Core Libraries │ core │ 9.3 MiB │
│ Installed │ Cloud Storage Command Line Tool │ gsutil │ 3.6 MiB │
└───────────────┴──────────────────────────────────────────────────────┴──────────────────────────┴───────────┘
To install or remove components at your current SDK version [230.0.0], run:
$ gcloud components install COMPONENT_ID
$ gcloud components remove COMPONENT_ID
To update your SDK installation to the latest version [230.0.0], run:
$ gcloud components update
Modify profile to update your $PATH and enable shell command
completion?
Do you want to continue (Y/n)? Y
The Google Cloud SDK installer will now prompt you to update an rc
file to bring the Google Cloud CLIs into your environment.
Enter a path to an rc file to update, or leave blank to use
[/Users/makoto/.bash_profile]:
[/Users/makoto/.bash_profile] has been updated.
==> Start a new shell for the changes to take effect.
For more information on how to get started, please visit:
https://cloud.google.com/sdk/docs/quickstarts
The installer asks you to open a new terminal for the changes to take effect, so do that.
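A quick way to confirm that the new shell picked up the PATH change is to look for the gcloud command. This is a minimal sketch; `check_gcloud` is a helper name invented for this example, and the hint about `~/.bash_profile` assumes the rc file chosen in the installer output above.

```shell
# Report whether gcloud is reachable from the current shell.
# (check_gcloud is a hypothetical helper for this example.)
check_gcloud() {
  if command -v gcloud >/dev/null 2>&1; then
    echo "gcloud found at: $(command -v gcloud)"
  else
    echo "gcloud not found on PATH; open a new terminal or run: source ~/.bash_profile"
  fi
}

check_gcloud
```

Once it is found, `gcloud --version` should report the SDK version (230.0.0 in the installer output above).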
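Initial SDK setup
From here on, a Google account is required.
During the setup you need to log in to the Google account from a browser.
(It seems that gsutil alone might work without an account, though...)
After logging in, note the following about using Google Cloud:
- On first use, choosing Free Trial gives an option with $300 of credit usable over one year
- Either way, credit card information has to be entered when setting up the account
That's all for this time.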