0
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 5 years have passed since last update.

ヒト変異のHGVS表記の変換(vep api)

Last updated at Posted at 2019-03-01

変異のHGVS表記の相互変換

これまでに、ヒトゲノムの変異のHGVSの変換を行うために、変異のHGVS表記の相互変換(Ensembl variant_recoder)変異のHGVS表記の変換(Python hgvs)等を試してみた

しかし、お手軽さの点ではvep apiを利用するのにかなわないようだ

vep restful api のvariant recoderを利用する

変異のHGVS表記の相互変換(Ensembl variant_recoder)のオプション指定と統一するため、下記のようなコードをpythonで作成

Pythonのrequests モジュールを予めインストールしておいてください

variant_recoder_rest.py
import argparse
import json
import requests

server_grch37 = "https://grch37.rest.ensembl.org"
server_grch38 = "https://rest.ensembl.org"
variant_recoder_human = "/variant_recoder/human"


def variant_recoder(input_data, grch37, pretty):
    server = server_grch38 if not grch37 else server_grch37
    api_url = server + variant_recoder_human + '/' + input_data
    result = requests.get(api_url, headers={"Content-Type": "application/json"})
    if result.status_code != requests.codes.ok:
        header = result.headers
        ratelimit_remaining = int(header['X-RateLimit-Remaining'])
        if ratelimit_remaining == 0:
            print("You are not allowed to submit any request.")
            raise ValueError(
                "You've reached the rate limit of api. Please wait {} seconds.".format(header['X-RateLimit-Reset']))
        raise ValueError(result.text)

    if pretty:
        print(json.dumps(result.json(), indent=4))
    else:
        print(json.dumps(result.json()))


def parse_args():
    parser = argparse.ArgumentParser(description='vep recoder with ensembl vep API.')
    parser.add_argument('--input_data', nargs=1, required=True,
                        help='hgvs string to convert')
    parser.add_argument('--grch37', action='store_true',
                        help='Set genome version to GRCh37. (Default version is GRCh38)')
    parser.add_argument('--pretty', action='store_true', help="Prettify JSON result")
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()
    input_data = args.input_data[0]
    grch37 = args.grch37
    pretty = args.pretty
    variant_recoder(input_data=input_data, grch37=grch37, pretty=pretty)

Bug Fix(2019/12/18)

前回同様指定可能なオプションは下記の通り

オプション 説明 コメント
--input_data 変換元のhgvsで表記された文字列 ダブルクオーテーション又はシングルクオーテーションで区切る
--grch37 ヒトゲノムのバージョン指定 無指定であればGRCh38(hg38)が利用される
--pretty JSON表記を見やすく整形表示する

GRCh37で取得

変異のhgvs表記'NM_000603.4:c.894T>G'から変換結果を取得してみる

$ python variant_recoder_rest.py \
    --input_data 'NM_000603.4:c.894T>G' \
    --grch37 \
    --pretty

結果は下記の通り

[
    {
        "hgvsp": [
            "ENSP00000297494.3:p.Asp298Glu",
            "ENSP00000417143.1:p.Asp92Glu",
            "ENSP00000420551.1:p.Asp298Glu",
            "ENSP00000420215.1:p.Asp298Glu",
            "NP_000594.2:p.Asp298Glu",
            "NP_001153581.1:p.Asp298Glu",
            "NP_001153582.1:p.Asp298Glu",
            "NP_001153583.1:p.Asp298Glu"
        ],
        "id": [
            "rs1799983",
            "CM981388"
        ],
        "hgvsg": [
            "NC_000007.13:g.150696111T>G"
        ],
        "hgvsc": [
            "ENST00000297494.3:c.894T>G",
            "ENST00000461406.1:c.276T>G",
            "ENST00000467517.1:c.894T>G",
            "ENST00000484524.1:c.894T>G",
            "NM_000603.4:c.894T>G",
            "NM_001160109.1:c.894T>G",
            "NM_001160110.1:c.894T>G",
            "NM_001160111.1:c.894T>G"
        ],
        "input": "NM_000603.4:c.894T>G"
    }
]

GRCh38で取得

--grch37 オプションを省略した場合にはGRCh38上での変換結果が出力される

$ python variant_recoder_rest.py \
    --input_data 'NM_000603.4:c.894T>G' \
    --pretty
[
    {
        "hgvsc": [
            "ENST00000297494.7:c.894T>G", 
            "ENST00000461406.5:c.276T>G", 
            "ENST00000467517.1:c.894T>G", 
            "ENST00000484524.5:c.894T>G", 
            "NM_000603.4:c.894T>G", 
            "NM_001160109.1:c.894T>G", 
            "NM_001160110.1:c.894T>G", 
            "NM_001160111.1:c.894T>G"
        ], 
        "input": "NM_000603.4:c.894T>G", 
        "hgvsp": [
            "ENSP00000297494.3:p.Asp298Glu", 
            "ENSP00000417143.1:p.Asp92Glu", 
            "ENSP00000420551.1:p.Asp298Glu", 
            "ENSP00000420215.1:p.Asp298Glu", 
            "cds46555.1:p.Asp298Glu", 
            "cds46558.1:p.Asp298Glu", 
            "cds46557.1:p.Asp298Glu", 
            "cds46556.1:p.Asp298Glu"
        ], 
        "hgvsg": [
            "NC_000007.14:g.150999023T>G"
        ], 
        "id": [
            "rs1799983", 
            "CM981388"
        ]
    }
]

hgvsg の結果をみると 

  • GRCh37
    • "NC_000007.13:g.150696111T>G"
  • GRCh38
    • "NC_000007.14:g.150999023T>G"

となっていて、指定したゲノム上で変換が行われている事が確認できる

単位時間あたりのリクエストの制限

EnsemblのAPIの利用には単位時間あたりの利用制限が設けられている
通常、上限が3600秒間に55000回となっているので、リミットに達する事は無いと思えるけれど、もしリミットに達してしまった場合には何秒間待てば再開できるかという情報を表示してエラーとなる

今回はここまで:smile:

0
0
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
0
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?