
Batch-renaming the FASTA entries of single-copy orthologs detected by OrthoFinder2

Motivation

I used OrthoFinder2 to select the single-copy orthologs needed to build a phylogenetic tree.
OrthoFinder2 saves one FASTA file per orthogroup, with names such as OG0000001.fa, in the Single_Copy_Orthologue_Sequences folder.

The title of each sequence in these FASTA files is the ID of the protein itself.

>BDT62344.1
MENNSHDNVINTPFIDDDNVKNDIFINTSDDDDNNNSDNDSKGNSNNNSSNSSSSSSSSN
SSTTSEDIDNIDLDDYKSVLLLCKEEKIYNNQKNERVENKQFYPNKKKRKLNNIESPQPQ
PSLLSRLPTLSSSSSSSSSSSPSSPPPPSSPTLSVPPPSDDQPPLLITNCKEVVDIIYRH
ERQVMSDSELYLGYASTKMPKEFVECLLMFRDEMTLILQEYARLRRSQQLLGNFNTNSIN
YADEMVKKMIDIILRIDKNSMSVDKYRDVVREALYIYHLVVSKMTGPKHLKRLRTPELHF
DFCMLIALLSHNVENVKNISTKYRVSSIIQFVSALDCQWYLSVVPSILSVFNRSNSICHA
LSFSYMRHAQVNITLCLVFALTETNTTNLVLGVILYLFPESLESIKTNELIKDESLIIPL
CRKIKEHLRSQWIDRMDITNAAYLLLGSCTDGLDTIKVFKKHNYDKKIVAVSIMIQQKLK
QLNQSICFH
>BDT62474.1
MFDHFDPEHFLIVYSPIAFLSIITRNFALIREMSLQDDLYLSTSSDEIEDDNESEEEEDD
DDDYDDNGDSVNDKDDDDDVEFDPDKTNSTVLLTHVPRKGTYVSRIAHDESDSRGCSKSV
ISDTDSGSQTSSVSQSNEHRWRMSLRHQRKRFRHNNFDHHVRTTSPQMVVLSQPMNVLGA
LANEERTAMEVAEITLTQGSLAMSEPDREALLTFYREIKTIISAYLSLVHIQRTLNNYNT
NSINYPEGMVKKILSIIRQIPRHEMSMEKYNVVCRDALFLYYTIITRMTGPKHSKRLRTP
YWQFYFCGVLAMLVNDVPVASDLTVSGKETSLVQFASAVGNPAYQTAVHDISSVYNSSYS
VYKALGLSRSQLTDANMVLAILSARNTHLSDRKPRTMAQSALLYRNPDLIDRMRASGLVQ
DESSLGSTSRAVAAQLRVAGVSNQTLDDASHFLHGSYNQEGVTLRCFGSGQRDLKTVAAS
VLVTEDLRRRIRTTW
>BDV49871.1
MKYIFSDDDDDDNSTISNSNSSSNKSSDNSDIEDDEDDDNDEDLDNYQSILLMSKNDTNH
IIEKKFEHDNSIHSNNIDNKNIPFCSSFDMKETLEIKHDNTSKKNITYEYDSSNNRKDTE
EGQEDKRGPLLITNPKEVVDEILREEKKTSSETIIGYALSKYSKESVESLSMFKEEIILI
LQMYAKMKCRQQLLGNFNTNSTNYSEEMVKKMIEIISRIDKNSMALDKYRDVVREALYIY
HLVVSKMTGPKHLKRLRTPDLHFDFCVIVALLAHNIEVNKSNFSKYRATSLIQFVSALDC
QWYLSVVPNILSVFNSSSSICQIIGFSALKHAKINIILCLLFNIFEKKPNNLVLATILYL
YPESLNVIKQNGLIRDEASVIALSKKIKIHLDDYWIYQSDIHCAANLLIGQWTEGLNIIK
LFKKQHYDKKVIAISLLVQQRLKQLNQTVEYY

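Before running the phylogenetic analysis, I needed to change each sequence name to the name of the species it came from (= the basename of the corresponding input FASTA file).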

>MelaMJNV
MENNSHDNVINTPFIDDDNVKNDIFINTSDDDDNNNSDNDSKGNSNNNSSNSSSSSSSSN
SSTTSEDIDNIDLDDYKSVLLLCKEEKIYNNQKNERVENKQFYPNKKKRKLNNIESPQPQ
PSLLSRLPTLSSSSSSSSSSSPSSPPPPSSPTLSVPPPSDDQPPLLITNCKEVVDIIYRH
ERQVMSDSELYLGYASTKMPKEFVECLLMFRDEMTLILQEYARLRRSQQLLGNFNTNSIN
YADEMVKKMIDIILRIDKNSMSVDKYRDVVREALYIYHLVVSKMTGPKHLKRLRTPELHF
DFCMLIALLSHNVENVKNISTKYRVSSIIQFVSALDCQWYLSVVPSILSVFNRSNSICHA
LSFSYMRHAQVNITLCLVFALTETNTTNLVLGVILYLFPESLESIKTNELIKDESLIIPL
CRKIKEHLRSQWIDRMDITNAAYLLLGSCTDGLDTIKVFKKHNYDKKIVAVSIMIQQKLK
QLNQSICFH
>MelaPMNV
MFDHFDPEHFLIVYSPIAFLSIITRNFALIREMSLQDDLYLSTSSDEIEDDNESEEEEDD
DDDYDDNGDSVNDKDDDDDVEFDPDKTNSTVLLTHVPRKGTYVSRIAHDESDSRGCSKSV
ISDTDSGSQTSSVSQSNEHRWRMSLRHQRKRFRHNNFDHHVRTTSPQMVVLSQPMNVLGA
LANEERTAMEVAEITLTQGSLAMSEPDREALLTFYREIKTIISAYLSLVHIQRTLNNYNT
NSINYPEGMVKKILSIIRQIPRHEMSMEKYNVVCRDALFLYYTIITRMTGPKHSKRLRTP
YWQFYFCGVLAMLVNDVPVASDLTVSGKETSLVQFASAVGNPAYQTAVHDISSVYNSSYS
VYKALGLSRSQLTDANMVLAILSARNTHLSDRKPRTMAQSALLYRNPDLIDRMRASGLVQ
DESSLGSTSRAVAAQLRVAGVSNQTLDDASHFLHGSYNQEGVTLRCFGSGQRDLKTVAAS
VLVTEDLRRRIRTTW
>MellatMJNV
MKYIFSDDDDDDNSTISNSNSSSNKSSDNSDIEDDEDDDNDEDLDNYQSILLMSKNDTNH
IIEKKFEHDNSIHSNNIDNKNIPFCSSFDMKETLEIKHDNTSKKNITYEYDSSNNRKDTE
EGQEDKRGPLLITNPKEVVDEILREEKKTSSETIIGYALSKYSKESVESLSMFKEEIILI
LQMYAKMKCRQQLLGNFNTNSTNYSEEMVKKMIEIISRIDKNSMALDKYRDVVREALYIY
HLVVSKMTGPKHLKRLRTPDLHFDFCVIVALLAHNIEVNKSNFSKYRATSLIQFVSALDC
QWYLSVVPNILSVFNSSSSICQIIGFSALKHAKINIILCLLFNIFEKKPNNLVLATILYL
YPESLNVIKQNGLIRDEASVIALSKKIKIHLDDYWIYQSDIHCAANLLIGQWTEGLNIIK
LFKKQHYDKKVIAISLLVQQRLKQLNQTVEYY
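
The renaming shown above can be sketched roughly as follows. This is not the script from this article, only a minimal illustration under some assumptions: Orthogroups.tsv is tab-separated, its first column holds the orthogroup ID, every other column is headed by a species name, and each cell lists protein IDs separated by commas; Orthogroups_SingleCopyOrthologues.txt holds one orthogroup ID per line; the protein ID is the first whitespace-delimited token of each FASTA header. The function names `load_id_to_species`, `rename_fasta`, and `rename_directory` are hypothetical.

```python
import csv
from pathlib import Path


def load_id_to_species(orthogroups_tsv, single_copy_txt):
    """Map protein IDs in single-copy orthogroups to their species column name."""
    with open(single_copy_txt) as handle:
        single_copy = {line.strip() for line in handle if line.strip()}

    id_to_species = {}
    with open(orthogroups_tsv, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        species_names = next(reader)[1:]  # header row minus the orthogroup column
        for row in reader:
            if not row or row[0] not in single_copy:
                continue
            for species, cell in zip(species_names, row[1:]):
                for protein_id in cell.split(","):
                    protein_id = protein_id.strip()
                    if protein_id:
                        id_to_species[protein_id] = species
    return id_to_species


def rename_fasta(in_path, out_path, id_to_species):
    """Copy one FASTA file, replacing known protein IDs with species names."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                tokens = line[1:].split()
                species = id_to_species.get(tokens[0]) if tokens else None
                # Headers with an unknown ID are copied through unchanged.
                dst.write(f">{species}\n" if species else line)
            else:
                dst.write(line)


def rename_directory(in_dir, out_dir, id_to_species):
    """Rename every *.fa file in in_dir into an existing out_dir."""
    for fasta in sorted(Path(in_dir).glob("*.fa")):
        rename_fasta(fasta, Path(out_dir) / fasta.name, id_to_species)
```

Leaving unknown IDs untouched, rather than failing, makes it easy to spot entries that were not matched by searching the output for the original accession pattern.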

rename_single_copy_orthologs.py

Requirements

Usage

The output directory must be created before running the script.
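
If you drive the script from Python, the directory can be prepared beforehand with the standard library. The helper name `ensure_output_dir` is hypothetical and not part of the script; this is only a small sketch.

```python
from pathlib import Path


def ensure_output_dir(path):
    """Create the output directory (and any missing parents) if it is absent."""
    out_dir = Path(path)
    # exist_ok=True makes repeated calls harmless.
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir
```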

 ./rename_single_copy_orthologs.py
usage: rename_single_copy_orthologs.py [-h] --input DIR --output DIR -t ORTHOGROUPS -s SINGLE_COPY_ORTHOLOGUES

Rename single-copy orthologue sequences identified by OrthoFinder2

optional arguments:
  -h, --help            show this help message and exit
  --input DIR, -i DIR, --in DIR
                        Single_Copy_Orthologue_Sequences directory
  --output DIR, -o DIR, --out DIR
                        output directory
  -t ORTHOGROUPS, --orthogroups ORTHOGROUPS
                        Orthogroups.tsv
  -s SINGLE_COPY_ORTHOLOGUES, --single_copy_orthologues SINGLE_COPY_ORTHOLOGUES
                        Orthogroups_SingleCopyOrthologues.txt
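
For reference, an `argparse` definition along the following lines would accept the options listed in this help text. It is reconstructed from the help output alone, so treat it as an assumption about how the script might be written rather than its actual source; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser accepting the options listed in the help text."""
    parser = argparse.ArgumentParser(
        prog="rename_single_copy_orthologs.py",
        description="Rename single-copy orthologue sequences identified by OrthoFinder2",
    )
    # The first long option determines the attribute name (args.input, args.output).
    parser.add_argument("--input", "-i", "--in", metavar="DIR", required=True,
                        help="Single_Copy_Orthologue_Sequences directory")
    parser.add_argument("--output", "-o", "--out", metavar="DIR", required=True,
                        help="output directory")
    parser.add_argument("-t", "--orthogroups", required=True,
                        help="Orthogroups.tsv")
    parser.add_argument("-s", "--single_copy_orthologues", required=True,
                        help="Orthogroups_SingleCopyOrthologues.txt")
    return parser
```

All four options are marked as required, which matches the usage line, where none of them appears in square brackets.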

The interface is slightly clumsy, since two tables have to be specified, but the script works without problems, so I am leaving it as is.
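
One possible way to avoid the second table would be to derive the single-copy orthogroups from Orthogroups.tsv itself, by keeping only rows with exactly one protein in every species column. The sketch below assumes the same tab-separated layout as above with comma-separated protein IDs in each cell; it has not been checked against the list OrthoFinder2 writes, and `single_copy_orthogroups` is a hypothetical name.

```python
import csv


def single_copy_orthogroups(orthogroups_tsv):
    """Return IDs of orthogroups with exactly one protein in every species column."""
    selected = []
    with open(orthogroups_tsv, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        n_species = len(next(reader)) - 1  # header minus the orthogroup column
        for row in reader:
            cells = row[1:]
            counts = [
                len([g for g in cell.split(",") if g.strip()]) for cell in cells
            ]
            if len(cells) == n_species and all(c == 1 for c in counts):
                selected.append(row[0])
    return selected
```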
