[Biopython] Output simple statistics for bacterial genomes (ID, genome size, gene counts, etc.)

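QUAST is the popular choice for assembly statistics, but what I want here differs slightly from what QUAST reports.
For comparative genomics of bacterial genome assemblies, I want to take GenBank-format files downloaded from NCBI as input and output a table listing the following:

  • Species name
  • Strain name
  • Accession no.
  • Total length
  • Number of contigs
  • Number of protein-coding genes
  • Number of tRNA genes
  • Number of rRNA genes
  • Number of CRISPRs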
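As a rough illustration of how these fields can be pulled from a GenBank file with Biopython, here is a minimal sketch. It is not the genome_stats.py script itself. The function names `summarize_records` and `summarize_file` are hypothetical, and the accession is simply taken from the first record's ID. CRISPR counting is an assumption: it counts `repeat_region` features whose `rpt_family` qualifier mentions CRISPR, which may not match how a given annotation pipeline labels them.

```python
from collections import Counter


def summarize_records(records):
    """Summarize SeqRecord-like objects, one record per contig/replicon."""
    records = list(records)
    if not records:
        raise ValueError("no records found in input")

    counts = Counter()
    total_len = 0
    gc = 0
    for rec in records:
        seq = str(rec.seq).upper()
        total_len += len(seq)
        gc += seq.count("G") + seq.count("C")
        for feat in rec.features:
            if feat.type in ("CDS", "rRNA", "tRNA"):
                counts[feat.type] += 1
            elif feat.type == "repeat_region":
                # Assumption: CRISPR arrays carry /rpt_family="CRISPR".
                families = feat.qualifiers.get("rpt_family", [])
                if any("CRISPR" in fam for fam in families):
                    counts["CRISPR"] += 1

    # Species and strain are read from the first record only.
    first = records[0]
    source = next((f for f in first.features if f.type == "source"), None)
    strain = source.qualifiers.get("strain", ["NA"])[0] if source else "NA"

    return {
        "Species": first.annotations.get("organism", "NA"),
        "strain": strain,
        "Accession": first.id,  # assumption: ID of the first record
        "Length": total_len,
        "num_contigs": len(records),
        "GC%": round(100 * gc / total_len, 2) if total_len else 0.0,
        "CDS": counts["CDS"],
        "rRNA": counts["rRNA"],
        "tRNA": counts["tRNA"],
        "CRISPR": counts["CRISPR"],
    }


def summarize_file(path):
    """Parse a GenBank flatfile with Biopython and summarize it."""
    from Bio import SeqIO  # requires Biopython

    return summarize_records(SeqIO.parse(path, "genbank"))
```

Keeping the counting logic separate from the file parsing makes it possible to check the function on small hand-made records before running it on a full assembly.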

genome_stats.py

Requirements

Usage

./genome_stats.py -h
usage: genome_stats.py [-h] -i [INPUT ...] [--output FILE]

Produce a simple TSV summary file for microbial genome assemblies

optional arguments:
  -h, --help            show this help message and exit
  -i [INPUT ...], --input [INPUT ...]
                        Genbank/DDBJ flatfile (required)
  --output FILE, -o FILE, --out FILE, --output FILE
                        output TSV file
./genome_stats.py -i GCF_000006925.2_ASM692v2_genomic.gbff GCF_000005845.2_ASM584v2_genomic.gbff -o results.tbl
results.tbl
Species	Shigella flexneri 2a str. 301	Escherichia coli str. K-12 substr. MG1655
strain	301	K-12
Accession	GCF_000006925.2	NC_000913.3
Length	4828820	4641652
num_contigs	2	1
GC%	50.65	50.79
CDS	4313	4315
rRNA	22	22
tRNA	97(22)	86(21)
CRISPR	0	0
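
The table above has one row per statistic and one column per input genome. A minimal sketch of a command-line wrapper that writes this layout is shown below. The option names follow the help text above, but everything else is an assumption: `build_parser` and `write_table` are hypothetical names, and falling back to "NA" for a missing value is a choice made for this example only.

```python
import argparse
import csv


def build_parser():
    """Argument parser modeled on the help text shown above."""
    parser = argparse.ArgumentParser(
        description="Produce a simple TSV summary file for microbial genome assemblies"
    )
    parser.add_argument(
        "-i", "--input", nargs="*", required=True,
        help="Genbank/DDBJ flatfile (required)",
    )
    parser.add_argument(
        "-o", "--out", "--output", dest="output", metavar="FILE",
        help="output TSV file",
    )
    return parser


def write_table(summaries, handle):
    """Write one row per statistic and one column per genome.

    `summaries` is a list of dicts that share the same keys,
    one dict per input genome.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for key in summaries[0]:
        writer.writerow([key] + [s.get(key, "NA") for s in summaries])
```

For example, calling `write_table` with two dicts that each have `Species` and `Length` keys produces two rows, each with the row label followed by one value per genome.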