
Installing BWA on an AWS Amazon Linux 2 instance

Last updated at 2022-01-01. Posted at 2022-01-01.

Introduction

I installed bwa on an Amazon Linux 2 instance and tried it out, so I am leaving these notes for reference. All commands are run as the ec2-user user.

Installation

Download BWA from the following page.

Place the downloaded archive in a suitable directory (for example, /home/ec2-user).

$ ls
bwa-0.7.17.tar.bz2

Extract the archive.

$ tar jxvf bwa-0.7.17.tar.bz2
$ cd bwa-0.7.17/

Install the tools required for the build.

$ sudo yum install gcc
$ sudo yum install zlib-devel

If the packages above are not installed, make fails. For example, this is the error when zlib-devel is missing:

$ make
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS  utils.c -o utils.o
utils.c:33:10: fatal error: zlib.h: No such file or directory
 #include <zlib.h>
          ^~~~~~~~
compilation terminated.
make: *** [utils.o] Error 1
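To catch this kind of failure before running make, the prerequisites can be checked up front. The following is a minimal sketch; check_build_prereqs is a hypothetical helper name, and the default header location /usr/include is an assumption about where zlib-devel places zlib.h.

```shell
# Report whether each build prerequisite appears to be present.
# check_build_prereqs is a hypothetical helper, not part of BWA.
check_build_prereqs() {
    # Optional first argument: directory to search for zlib.h
    # (assumed default: /usr/include).
    include_dir="${1:-/usr/include}"
    status=0
    for tool in gcc make; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: missing"
            status=1
        fi
    done
    if [ -f "$include_dir/zlib.h" ]; then
        echo "zlib.h: found"
    else
        echo "zlib.h: missing (install zlib-devel)"
        status=1
    fi
    # Non-zero when anything is missing, so the caller can skip the build.
    return "$status"
}
```

The function returns a non-zero status when something is missing, so it can guard the build, for example as check_build_prereqs && make.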

Build bwa.

$ make

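Make the bwa command available on the command line. Writing to /usr/bin requires root privileges, so use sudo.

$ sudo cp bwa *.pl /usr/bin

If the usage message is displayed as shown below, the installation succeeded.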

$ bwa

Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.7.17-r1188
Contact: Heng Li <lh3@sanger.ac.uk>

Usage:   bwa <command> [options]

Command: index         index sequences in the FASTA format
         mem           BWA-MEM algorithm
         fastmap       identify super-maximal exact matches
         pemerge       merge overlapping paired ends (EXPERIMENTAL)
         aln           gapped/ungapped alignment
         samse         generate alignment (single ended)
         sampe         generate alignment (paired ended)
         bwasw         BWA-SW for long queries

         shm           manage indices in shared memory
         fa2pac        convert FASTA to PAC format
         pac2bwt       generate BWT from PAC
         pac2bwtgen    alternative algorithm for generating BWT
         bwtupdate     update .bwt to the new format
         bwt2sa        generate SA from BWT and Occ

Note: To use BWA, you need to first index the genome with `bwa index'.
      There are three alignment algorithms in BWA: `mem', `bwasw', and
      `aln/samse/sampe'. If you are not sure which to use, try `bwa mem'
      first. Please `man ./bwa.1' for the manual.

Downloading the reference file

Download the reference file (the file containing the base sequence that reads are aligned against) from the following page.

Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.chromosome.1.fa.gz
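The downloaded file is gzip-compressed, while the later commands refer to the uncompressed .fa name, so it needs to be decompressed first. The following is a minimal sketch of that step; decompress_keep is a hypothetical helper name introduced here for illustration, and it assumes the .gz file sits in the current directory.

```shell
# Decompress a .gz file next to the original, keeping the compressed copy.
# decompress_keep is a hypothetical helper name, not part of any tool.
decompress_keep() {
    gz_file="$1"
    out_file="${gz_file%.gz}"   # strip the trailing .gz to get the output name
    if [ ! -f "$gz_file" ]; then
        echo "not found: $gz_file" >&2
        return 1
    fi
    # gzip -dc writes the decompressed data to stdout and leaves the input untouched.
    gzip -dc "$gz_file" > "$out_file" && echo "$out_file"
}

# Only run when the reference archive is actually present.
ref_gz=Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.chromosome.1.fa.gz
if [ -f "$ref_gz" ]; then
    decompress_keep "$ref_gz"
fi
```

Keeping the compressed copy costs disk space but makes it easy to start over if the uncompressed file is damaged.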

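Creating the index

Run the following command to create the index. bwa index takes the reference FASTA file as its positional argument, and the index files are named after it.

$ bwa index Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.chromosome.1.fa

After indexing finishes, the following files are created.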

Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.chromosome.1.fa.amb
Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.chromosome.1.fa.ann
Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.chromosome.1.fa.bwt
Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.chromosome.1.fa.pac
Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.chromosome.1.fa.sa
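If indexing was interrupted, some of these files may be absent or empty. The following sketch lists any of the five files that are missing for a given reference; list_missing_index_files is a hypothetical helper name introduced here, and the extension list is taken from the files shown above.

```shell
# Print the name of every expected BWA index file that is missing or empty.
# list_missing_index_files is a hypothetical helper name.
list_missing_index_files() {
    ref="$1"
    for ext in amb ann bwt pac sa; do
        # -s is true only when the file exists and is non-empty.
        if [ ! -s "$ref.$ext" ]; then
            echo "$ref.$ext"
        fi
    done
}

# Example call: prints nothing when all five files are in place.
list_missing_index_files Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.chromosome.1.fa
```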

Creating FASTQ files

To create FASTQ files, install ART.

Download the following binary file from the link above.

Place this file in a suitable directory (for example, /home/ec2-user) and extract it.

$ tar -xzvf artbinmountrainier2016.06.05linux64.tgz

Copy the binaries to /usr/bin. Run this from inside the extracted directory.

$ sudo cp art_illumina *.pl /usr/bin

Run the following command to create the FASTQ files.

$ art_illumina -ss HS25 -sam -i Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.chromosome.1.fa -p -l 150 -f 20 -m 200 -s 10 -o paired_dat

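Because the command above takes a long time to run, I stopped it partway and extracted only the first 1000 lines of each output file. A FASTQ record spans four lines, so this gives 250 reads per file.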

$ head -1000 paired_dat1.fq > paired_1.fq
$ head -1000 paired_dat2.fq > paired_2.fq
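Since the simulation was interrupted, it is worth confirming that the two truncated files hold the same number of reads before mapping them as pairs. The following is a minimal sketch, assuming the usual four-lines-per-record FASTQ layout; count_fastq_reads and check_pair_counts are hypothetical helper names.

```shell
# Count reads in a FASTQ file, assuming four lines per record.
count_fastq_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Print both read counts and succeed only when they are equal.
check_pair_counts() {
    n1=$(count_fastq_reads "$1")
    n2=$(count_fastq_reads "$2")
    echo "$1: $n1 reads, $2: $n2 reads"
    [ "$n1" -eq "$n2" ]
}

# Only run when the truncated files from the step above exist.
if [ -f paired_1.fq ] && [ -f paired_2.fq ]; then
    check_pair_counts paired_1.fq paired_2.fq
fi
```

This only compares counts; it does not check that the read names in the two files match line by line.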

Mapping

Run the following command to map the reads to the reference.

$ bwa mem Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.chromosome.1.fa paired_1.fq paired_2.fq > test.sam
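A quick way to sanity-check the result is to count the alignment records in the SAM file and see how many are flagged as unmapped. The following is a minimal sketch using awk; summarize_sam is a hypothetical helper name, and it relies on the SAM conventions that header lines start with @ and that bit 0x4 of the FLAG column (column 2) marks an unmapped read.

```shell
# Summarize a SAM file: total alignment records and how many have the
# "segment unmapped" FLAG bit (0x4) set.
# summarize_sam is a hypothetical helper name.
summarize_sam() {
    awk -F '\t' '
        /^@/ { next }                              # skip header lines
        {
            total++
            if (int($2 / 4) % 2 == 1) unmapped++   # test bit 0x4 of FLAG
        }
        END { printf "records=%d unmapped=%d\n", total, unmapped }
    ' "$1"
}

# Only run when the mapping output from the step above exists.
if [ -f test.sam ]; then
    summarize_sam test.sam
fi
```

The bit test uses integer division rather than a bitwise function so that it works in awk implementations that lack one.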
