bioinformatics

Comparing chromosome names across UCSC, Ensembl, and NCBI (GRCh38/hg38)

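This trips me up every single time, so I finally decided to do something about it. There is no way to cope without keeping the mapping around as a table.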
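Once the mapping exists as a table, converting between naming schemes is just a dictionary lookup. The following is a minimal sketch, not something from the original post: it assumes the six whitespace-separated columns are GenBank accession, Ensembl name, UCSC name, NCBI sequence name, RefSeq accession, and length, and the names `SeqName`, `parse_rows`, and `build_index` are hypothetical helpers made up for illustration. The three sample rows are copied from the table in this post.

```python
from typing import NamedTuple


class SeqName(NamedTuple):
    # Column order is an assumption inferred from the values in the table.
    genbank: str
    ensembl: str
    ucsc: str
    ncbi_name: str
    refseq: str
    length: int


# A few rows copied from the table in this post (whitespace-separated).
ROWS = """\
CM000663.2  1   chr1    1   NC_000001.11    248956422
J01415.2    MT  chrM    MT  NC_012920.1 16569
GL000208.1  CHR_HSCHR5_RANDOM_CTG1  chr5_GL000208v1_random  HSCHR5_RANDOM_CTG1  NT_113948.1 92689
"""


def parse_rows(text):
    """Turn whitespace-separated lines into SeqName records."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        records.append(SeqName(*fields[:5], int(fields[5])))
    return records


def build_index(records):
    """Map every name of a sequence (any of the five schemes) to its record."""
    index = {}
    for rec in records:
        for name in rec[:5]:
            index[name] = rec
    return index


index = build_index(parse_rows(ROWS))
print(index["chrM"].ensembl)       # UCSC name -> Ensembl name
print(index["NC_000001.11"].ucsc)  # RefSeq accession -> UCSC name
```

Indexing every name into one dictionary works here because a name used by one sequence is not reused by another; if you only want to accept names from one scheme, build a separate dictionary per column instead.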

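Table

Columns, left to right: GenBank accession, Ensembl name, UCSC name, NCBI sequence name, RefSeq accession, length (bp).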

CM000663.2  1   chr1    1   NC_000001.11    248956422
CM000664.2  2   chr2    2   NC_000002.12    242193529
CM000665.2  3   chr3    3   NC_000003.12    198295559
CM000666.2  4   chr4    4   NC_000004.12    190214555
CM000667.2  5   chr5    5   NC_000005.10    181538259
CM000668.2  6   chr6    6   NC_000006.12    170805979
CM000669.2  7   chr7    7   NC_000007.14    159345973
CM000670.2  8   chr8    8   NC_000008.11    145138636
CM000671.2  9   chr9    9   NC_000009.12    138394717
CM000672.2  10  chr10   10  NC_000010.11    133797422
CM000673.2  11  chr11   11  NC_000011.10    135086622
CM000674.2  12  chr12   12  NC_000012.12    133275309
CM000675.2  13  chr13   13  NC_000013.11    114364328
CM000676.2  14  chr14   14  NC_000014.9 107043718
CM000677.2  15  chr15   15  NC_000015.10    101991189
CM000678.2  16  chr16   16  NC_000016.10    90338345
CM000679.2  17  chr17   17  NC_000017.11    83257441
CM000680.2  18  chr18   18  NC_000018.10    80373285
CM000681.2  19  chr19   19  NC_000019.10    58617616
CM000682.2  20  chr20   20  NC_000020.11    64444167
CM000683.2  21  chr21   21  NC_000021.9 46709983
CM000684.2  22  chr22   22  NC_000022.11    50818468
CM000685.2  X   chrX    X   NC_000023.11    156040895
CM000686.2  Y   chrY    Y   NC_000024.10    57227415
GL000008.2  GL000008.2  chr4_GL000008v2_random  HSCHR4_RANDOM_CTG4  NT_113793.3 209709
GL000009.2  GL000009.2  chr14_GL000009v2_random HSCHR14_CTG1_UNLOCALIZED    NT_113796.3 201709
GL000194.1  GL000194.1  chr14_GL000194v1_random HSCHR14_CTG4_UNLOCALIZED    NT_113888.1 191469
GL000195.1  GL000195.1  chrUn_GL000195v1    HSCHRUN_RANDOM_CTG1 NT_113901.1 182896
GL000205.2  GL000205.2  chr17_GL000205v2_random HSCHR17_RANDOM_CTG3 NT_113930.2 185591
GL000208.1  CHR_HSCHR5_RANDOM_CTG1  chr5_GL000208v1_random  HSCHR5_RANDOM_CTG1  NT_113948.1 92689
GL000209.2  CHR_HSCHR19KIR_RP5_B_HAP_CTG3_1 chr19_GL000209v2_alt    HSCHR19KIR_RP5_B_HAP_CTG3_1 NT_113949.2 177381
GL000213.1  GL000213.1  chrUn_GL000213v1    HSCHRUN_RANDOM_CTG2 NT_167208.1 164239
GL000214.1  CHR_HSCHRUN_RANDOM_CTG4 chrUn_GL000214v1    HSCHRUN_RANDOM_CTG4 NT_167209.1 137718
GL000216.2  GL000216.2  chrUn_GL000216v2    HSCHRUN_RANDOM_CTG6 NT_167211.2 176608
GL000218.1  GL000218.1  chrUn_GL000218v1    HSCHRUN_RANDOM_CTG9 NT_113889.1 161147
GL000219.1  GL000219.1  chrUn_GL000219v1    HSCHRUN_RANDOM_CTG10    NT_167213.1 179198
GL000220.1  GL000220.1  chrUn_GL000220v1    HSCHRUN_RANDOM_CTG11    NT_167214.1 161802
GL000221.1  CHR_HSCHR3UN_CTG2   chr3_GL000221v1_random  HSCHR3UN_CTG2   NT_167215.1 155397
GL000224.1  GL000224.1  chrUn_GL000224v1    HSCHRUN_RANDOM_CTG16    NT_167218.1 179693
GL000225.1  GL000225.1  chr14_GL000225v1_random HSCHR14_CTG2_UNLOCALIZED    NT_167219.1 211173
GL000226.1  CHR_HSCHRUN_RANDOM_CTG19    chrUn_GL000226v1    HSCHRUN_RANDOM_CTG19    NT_167220.1 15008
GL000250.2  CHR_HSCHR6_MHC_APD_CTG1 chr6_GL000250v2_alt HSCHR6_MHC_APD_CTG1 NT_167244.2 4672374
GL000251.2  CHR_HSCHR6_MHC_COX_CTG1 chr6_GL000251v2_alt HSCHR6_MHC_COX_CTG1 NT_113891.3 4795265
GL000252.2  CHR_HSCHR6_MHC_DBB_CTG1 chr6_GL000252v2_alt HSCHR6_MHC_DBB_CTG1 NT_167245.2 4604811
GL000253.2  CHR_HSCHR6_MHC_MANN_CTG1    chr6_GL000253v2_alt HSCHR6_MHC_MANN_CTG1    NT_167246.2 4677643
GL000254.2  CHR_HSCHR6_MHC_MCF_CTG1 chr6_GL000254v2_alt HSCHR6_MHC_MCF_CTG1 NT_167247.2 4827813
GL000255.2  CHR_HSCHR6_MHC_QBL_CTG1 chr6_GL000255v2_alt HSCHR6_MHC_QBL_CTG1 NT_167248.2 4606388
GL000256.2  CHR_HSCHR6_MHC_SSTO_CTG1    chr6_GL000256v2_alt HSCHR6_MHC_SSTO_CTG1    NT_167249.2 4929269
GL000257.2  CHR_HSCHR4_1_CTG9   chr4_GL000257v2_alt HSCHR4_1_CTG9   NT_167250.2 586476
GL000258.2  CHR_HSCHR17_1_CTG5  chr17_GL000258v2_alt    HSCHR17_1_CTG5  NT_167251.2 1821992
GL339449.2  CHR_HSCHR5_2_CTG1_1 chr5_GL339449v2_alt HSCHR5_2_CTG1_1 NW_003315917.2  1612928
GL383518.1  CHR_HSCHR1_1_CTG31  chr1_GL383518v1_alt HSCHR1_1_CTG31  NW_003315905.1  182439
GL383519.1  CHR_HSCHR1_2_CTG31  chr1_GL383519v1_alt HSCHR1_2_CTG31  NW_003315906.1  110268
GL383520.2  CHR_HSCHR1_3_CTG31  chr1_GL383520v2_alt HSCHR1_3_CTG31  NW_003315907.2  366580
GL383521.1  CHR_HSCHR2_1_CTG5   chr2_GL383521v1_alt HSCHR2_1_CTG5   NW_003315908.1  143390
GL383522.1  CHR_HSCHR2_1_CTG7_2 chr2_GL383522v1_alt HSCHR2_1_CTG7_2 NW_003315909.1  123821
GL383526.1  CHR_HSCHR3_1_CTG2_1 chr3_GL383526v1_alt HSCHR3_1_CTG2_1 NW_003315913.1  180671
GL383527.1  CHR_HSCHR4_1_CTG12  chr4_GL383527v1_alt HSCHR4_1_CTG12  NW_003315914.1  164536
GL383528.1  CHR_HSCHR4_1_CTG6   chr4_GL383528v1_alt HSCHR4_1_CTG6   NW_003315915.1  376187
GL383530.1  CHR_HSCHR5_3_CTG1_1 chr5_GL383530v1_alt HSCHR5_3_CTG1_1 NW_003315918.1  101241
GL383531.1  CHR_HSCHR5_1_CTG5   chr5_GL383531v1_alt HSCHR5_1_CTG5   NW_003315919.1  173459
GL383532.1  CHR_HSCHR5_1_CTG1   chr5_GL383532v1_alt HSCHR5_1_CTG1   NW_003315920.1  82728
GL383533.1  CHR_HSCHR6_1_CTG2   chr6_GL383533v1_alt HSCHR6_1_CTG2   NW_003315921.1  124736
GL383534.2  CHR_HSCHR7_1_CTG6   chr7_GL383534v2_alt HSCHR7_1_CTG6   NW_003315922.2  119183
GL383539.1  CHR_HSCHR9_1_CTG1   chr9_GL383539v1_alt HSCHR9_1_CTG1   NW_003315928.1  162988
GL383540.1  CHR_HSCHR9_1_CTG2   chr9_GL383540v1_alt HSCHR9_1_CTG2   NW_003315929.1  71551
GL383541.1  CHR_HSCHR9_1_CTG3   chr9_GL383541v1_alt HSCHR9_1_CTG3   NW_003315930.1  171286
GL383542.1  CHR_HSCHR9_1_CTG4   chr9_GL383542v1_alt HSCHR9_1_CTG4   NW_003315931.1  60032
GL383545.1  CHR_HSCHR10_1_CTG1  chr10_GL383545v1_alt    HSCHR10_1_CTG1  NW_003315934.1  179254
GL383546.1  CHR_HSCHR10_1_CTG2  chr10_GL383546v1_alt    HSCHR10_1_CTG2  NW_003315935.1  309802
GL383547.1  CHR_HSCHR11_1_CTG1_1    chr11_GL383547v1_alt    HSCHR11_1_CTG1_1    NW_003315936.1  154407
GL383549.1  CHR_HSCHR12_1_CTG2  chr12_GL383549v1_alt    HSCHR12_1_CTG2  NW_003315938.1  120804
GL383550.2  CHR_HSCHR12_1_CTG2_1    chr12_GL383550v2_alt    HSCHR12_1_CTG2_1    NW_003315939.2  169178
GL383551.1  CHR_HSCHR12_4_CTG2_1    chr12_GL383551v1_alt    HSCHR12_4_CTG2_1    NW_003315940.1  184319
GL383552.1  CHR_HSCHR12_2_CTG2_1    chr12_GL383552v1_alt    HSCHR12_2_CTG2_1    NW_003315941.1  138655
GL383553.2  CHR_HSCHR12_3_CTG2_1    chr12_GL383553v2_alt    HSCHR12_3_CTG2_1    NW_003315942.2  152874
GL383554.1  CHR_HSCHR15_1_CTG8  chr15_GL383554v1_alt    HSCHR15_1_CTG8  NW_003315943.1  296527
GL383555.2  CHR_HSCHR15_2_CTG8  chr15_GL383555v2_alt    HSCHR15_2_CTG8  NW_003315944.2  388773
GL383556.1  CHR_HSCHR16_1_CTG3_1    chr16_GL383556v1_alt    HSCHR16_1_CTG3_1    NW_003315945.1  192462
GL383557.1  CHR_HSCHR16_2_CTG3_1    chr16_GL383557v1_alt    HSCHR16_2_CTG3_1    NW_003315946.1  89672
GL383563.3  CHR_HSCHR17_1_CTG1  chr17_GL383563v3_alt    HSCHR17_1_CTG1  NW_003315952.3  375691
GL383564.2  CHR_HSCHR17_1_CTG4  chr17_GL383564v2_alt    HSCHR17_1_CTG4  NW_003315953.2  133151
GL383565.1  CHR_HSCHR17_2_CTG4  chr17_GL383565v1_alt    HSCHR17_2_CTG4  NW_003315954.1  223995
GL383566.1  CHR_HSCHR17_3_CTG4  chr17_GL383566v1_alt    HSCHR17_3_CTG4  NW_003315955.1  90219
GL383567.1  CHR_HSCHR18_1_CTG1_1    chr18_GL383567v1_alt    HSCHR18_1_CTG1_1    NW_003315956.1  289831
GL383568.1  CHR_HSCHR18_1_CTG2  chr18_GL383568v1_alt    HSCHR18_1_CTG2  NW_003315957.1  104552
GL383569.1  CHR_HSCHR18_1_CTG2_1    chr18_GL383569v1_alt    HSCHR18_1_CTG2_1    NW_003315958.1  167950
GL383570.1  CHR_HSCHR18_2_CTG1_1    chr18_GL383570v1_alt    HSCHR18_2_CTG1_1    NW_003315959.1  164789
GL383571.1  CHR_HSCHR18_2_CTG2  chr18_GL383571v1_alt    HSCHR18_2_CTG2  NW_003315960.1  198278
GL383572.1  CHR_HSCHR18_2_CTG2_1    chr18_GL383572v1_alt    HSCHR18_2_CTG2_1    NW_003315961.1  159547
GL383573.1  CHR_HSCHR19_1_CTG2  chr19_GL383573v1_alt    HSCHR19_1_CTG2  NW_003315962.1  385657
GL383574.1  CHR_HSCHR19_1_CTG3_1    chr19_GL383574v1_alt    HSCHR19_1_CTG3_1    NW_003315963.1  155864
GL383575.2  CHR_HSCHR19_2_CTG2  chr19_GL383575v2_alt    HSCHR19_2_CTG2  NW_003315964.2  170222
GL383576.1  CHR_HSCHR19_3_CTG2  chr19_GL383576v1_alt    HSCHR19_3_CTG2  NW_003315965.1  188024
GL383577.2  CHR_HSCHR20_1_CTG1  chr20_GL383577v2_alt    HSCHR20_1_CTG1  NW_003315966.2  128386
GL383578.2  CHR_HSCHR21_1_CTG1_1    chr21_GL383578v2_alt    HSCHR21_1_CTG1_1    NW_003315967.2  63917
GL383579.2  CHR_HSCHR21_2_CTG1_1    chr21_GL383579v2_alt    HSCHR21_2_CTG1_1    NW_003315968.2  201197
GL383580.2  CHR_HSCHR21_3_CTG1_1    chr21_GL383580v2_alt    HSCHR21_3_CTG1_1    NW_003315969.2  74653
GL383581.2  CHR_HSCHR21_4_CTG1_1    chr21_GL383581v2_alt    HSCHR21_4_CTG1_1    NW_003315970.2  116689
GL383582.2  CHR_HSCHR22_1_CTG1  chr22_GL383582v2_alt    HSCHR22_1_CTG1  NW_003315971.2  162811
GL383583.2  CHR_HSCHR22_1_CTG2  chr22_GL383583v2_alt    HSCHR22_1_CTG2  NW_003315972.2  96924
GL582966.2  CHR_HSCHR2_2_CTG7_2 chr2_GL582966v2_alt HSCHR2_2_CTG7_2 NW_003571033.2  96131
GL877875.1  CHR_HSCHR12_1_CTG1  chr12_GL877875v1_alt    HSCHR12_1_CTG1  NW_003571049.1  167313
GL877876.1  CHR_HSCHR12_2_CTG2  chr12_GL877876v1_alt    HSCHR12_2_CTG2  NW_003571050.1  408271
GL949742.1  CHR_HSCHR5_2_CTG1   chr5_GL949742v1_alt HSCHR5_2_CTG1   NW_003571036.1  226852
GL949746.1  CHR_HSCHR19LRC_COX1_CTG3_1  chr19_GL949746v1_alt    HSCHR19LRC_COX1_CTG3_1  NW_003571054.1  987716
GL949747.2  CHR_HSCHR19LRC_COX2_CTG3_1  chr19_GL949747v2_alt    HSCHR19LRC_COX2_CTG3_1  NW_003571055.2  729520
GL949748.2  CHR_HSCHR19LRC_LRC_I_CTG3_1 chr19_GL949748v2_alt    HSCHR19LRC_LRC_I_CTG3_1 NW_003571056.2  1064304
GL949749.2  CHR_HSCHR19LRC_LRC_J_CTG3_1 chr19_GL949749v2_alt    HSCHR19LRC_LRC_J_CTG3_1 NW_003571057.2  1091841
GL949750.2  CHR_HSCHR19LRC_LRC_S_CTG3_1 chr19_GL949750v2_alt    HSCHR19LRC_LRC_S_CTG3_1 NW_003571058.2  1066390
GL949751.2  CHR_HSCHR19LRC_LRC_T_CTG3_1 chr19_GL949751v2_alt    HSCHR19LRC_LRC_T_CTG3_1 NW_003571059.2  1002683
GL949752.1  CHR_HSCHR19LRC_PGF1_CTG3_1  chr19_GL949752v1_alt    HSCHR19LRC_PGF1_CTG3_1  NW_003571060.1  987100
GL949753.2  CHR_HSCHR19LRC_PGF2_CTG3_1  chr19_GL949753v2_alt    HSCHR19LRC_PGF2_CTG3_1  NW_003571061.2  796479
J01415.2    MT  chrM    MT  NC_012920.1 16569
JH159136.1  HG142_HG150_NOVEL_TEST  chr11_JH159136v1_alt    HG142_HG150_NOVEL_TEST  NW_003871073.1  200998
JH159137.1  HG151_NOVEL_TEST    chr11_JH159137v1_alt    HG151_NOVEL_TEST    NW_003871074.1  191409
JH159146.1  CHR_HSCHR17_4_CTG4  chr17_JH159146v1_alt    HSCHR17_4_CTG4  NW_003871091.1  278131
JH159147.1  CHR_HSCHR17_5_CTG4  chr17_JH159147v1_alt    HSCHR17_5_CTG4  NW_003871092.1  70345
JH159148.1  CHR_HSCHR17_6_CTG4  chr17_JH159148v1_alt    HSCHR17_6_CTG4  NW_003871093.1  88070
JH636055.2  CHR_HSCHR3_1_CTG1   chr3_JH636055v2_alt HSCHR3_1_CTG1   NW_003871060.2  173151
KB021644.2  CHR_HSCHR6_1_CTG3   chr6_KB021644v2_alt HSCHR6_1_CTG3   NW_004166862.2  185823
KB663609.1  CHR_HSCHR22_2_CTG1  chr22_KB663609v1_alt    HSCHR22_2_CTG1  NW_004504305.1  74013
KI270302.1  CHR_HSCHRUN_RANDOM_100  chrUn_KI270302v1    HSCHRUN_RANDOM_100  NT_187396.1 2274
KI270303.1  CHR_HSCHRUN_RANDOM_102  chrUn_KI270303v1    HSCHRUN_RANDOM_102  NT_187398.1 1942
KI270304.1  CHR_HSCHRUN_RANDOM_101  chrUn_KI270304v1    HSCHRUN_RANDOM_101  NT_187397.1 2165
KI270305.1  CHR_HSCHRUN_RANDOM_103  chrUn_KI270305v1    HSCHRUN_RANDOM_103  NT_187399.1 1472
KI270310.1  CHR_HSCHRUN_RANDOM_106  chrUn_KI270310v1    HSCHRUN_RANDOM_106  NT_187402.1 1201
KI270311.1  CHR_HSCHRUN_RANDOM_110  chrUn_KI270311v1    HSCHRUN_RANDOM_110  NT_187406.1 12399
KI270312.1  CHR_HSCHRUN_RANDOM_109  chrUn_KI270312v1    HSCHRUN_RANDOM_109  NT_187405.1 998
KI270315.1  CHR_HSCHRUN_RANDOM_108  chrUn_KI270315v1    HSCHRUN_RANDOM_108  NT_187404.1 2276
KI270316.1  CHR_HSCHRUN_RANDOM_107  chrUn_KI270316v1    HSCHRUN_RANDOM_107  NT_187403.1 1444
KI270317.1  CHR_HSCHRUN_RANDOM_111  chrUn_KI270317v1    HSCHRUN_RANDOM_111  NT_187407.1 37690
KI270320.1  CHR_HSCHRUN_RANDOM_105  chrUn_KI270320v1    HSCHRUN_RANDOM_105  NT_187401.1 4416
KI270322.1  CHR_HSCHRUN_RANDOM_104  chrUn_KI270322v1    HSCHRUN_RANDOM_104  NT_187400.1 21476
KI270329.1  CHR_HSCHRUN_RANDOM_167  chrUn_KI270329v1    HSCHRUN_RANDOM_167  NT_187459.1 1040
KI270330.1  CHR_HSCHRUN_RANDOM_166  chrUn_KI270330v1    HSCHRUN_RANDOM_166  NT_187458.1 1652
KI270333.1  CHR_HSCHRUN_RANDOM_169  chrUn_KI270333v1    HSCHRUN_RANDOM_169  NT_187461.1 2699
KI270334.1  CHR_HSCHRUN_RANDOM_168  chrUn_KI270334v1    HSCHRUN_RANDOM_168  NT_187460.1 1368
KI270335.1  CHR_HSCHRUN_RANDOM_170  chrUn_KI270335v1    HSCHRUN_RANDOM_170  NT_187462.1 1048
KI270336.1  CHR_HSCHRUN_RANDOM_173  chrUn_KI270336v1    HSCHRUN_RANDOM_173  NT_187465.1 1026
KI270337.1  CHR_HSCHRUN_RANDOM_174  chrUn_KI270337v1    HSCHRUN_RANDOM_174  NT_187466.1 1121
KI270338.1  CHR_HSCHRUN_RANDOM_171  chrUn_KI270338v1    HSCHRUN_RANDOM_171  NT_187463.1 1428
KI270340.1  CHR_HSCHRUN_RANDOM_172  chrUn_KI270340v1    HSCHRUN_RANDOM_172  NT_187464.1 1428
KI270362.1  CHR_HSCHRUN_RANDOM_177  chrUn_KI270362v1    HSCHRUN_RANDOM_177  NT_187469.1 3530
KI270363.1  CHR_HSCHRUN_RANDOM_175  chrUn_KI270363v1    HSCHRUN_RANDOM_175  NT_187467.1 1803
KI270364.1  CHR_HSCHRUN_RANDOM_176  chrUn_KI270364v1    HSCHRUN_RANDOM_176  NT_187468.1 2855
KI270366.1  CHR_HSCHRUN_RANDOM_178  chrUn_KI270366v1    HSCHRUN_RANDOM_178  NT_187470.1 8320
KI270371.1  CHR_HSCHRUN_RANDOM_202  chrUn_KI270371v1    HSCHRUN_RANDOM_202  NT_187494.1 2805
KI270372.1  CHR_HSCHRUN_RANDOM_199  chrUn_KI270372v1    HSCHRUN_RANDOM_199  NT_187491.1 1650
KI270373.1  CHR_HSCHRUN_RANDOM_200  chrUn_KI270373v1    HSCHRUN_RANDOM_200  NT_187492.1 1451
KI270374.1  CHR_HSCHRUN_RANDOM_198  chrUn_KI270374v1    HSCHRUN_RANDOM_198  NT_187490.1 2656
KI270375.1  CHR_HSCHRUN_RANDOM_201  chrUn_KI270375v1    HSCHRUN_RANDOM_201  NT_187493.1 2378
KI270376.1  CHR_HSCHRUN_RANDOM_197  chrUn_KI270376v1    HSCHRUN_RANDOM_197  NT_187489.1 1136
KI270378.1  CHR_HSCHRUN_RANDOM_179  chrUn_KI270378v1    HSCHRUN_RANDOM_179  NT_187471.1 1048
KI270379.1  CHR_HSCHRUN_RANDOM_180  chrUn_KI270379v1    HSCHRUN_RANDOM_180  NT_187472.1 1045
KI270381.1  CHR_HSCHRUN_RANDOM_194  chrUn_KI270381v1    HSCHRUN_RANDOM_194  NT_187486.1 1930
KI270382.1  CHR_HSCHRUN_RANDOM_196  chrUn_KI270382v1    HSCHRUN_RANDOM_196  NT_187488.1 4215
KI270383.1  CHR_HSCHRUN_RANDOM_190  chrUn_KI270383v1    HSCHRUN_RANDOM_190  NT_187482.1 1750
KI270384.1  CHR_HSCHRUN_RANDOM_192  chrUn_KI270384v1    HSCHRUN_RANDOM_192  NT_187484.1 1658
KI270385.1  CHR_HSCHRUN_RANDOM_195  chrUn_KI270385v1    HSCHRUN_RANDOM_195  NT_187487.1 990
KI270386.1  CHR_HSCHRUN_RANDOM_188  chrUn_KI270386v1    HSCHRUN_RANDOM_188  NT_187480.1 1788
KI270387.1  CHR_HSCHRUN_RANDOM_183  chrUn_KI270387v1    HSCHRUN_RANDOM_183  NT_187475.1 1537
KI270388.1  CHR_HSCHRUN_RANDOM_186  chrUn_KI270388v1    HSCHRUN_RANDOM_186  NT_187478.1 1216
KI270389.1  CHR_HSCHRUN_RANDOM_181  chrUn_KI270389v1    HSCHRUN_RANDOM_181  NT_187473.1 1298
KI270390.1  CHR_HSCHRUN_RANDOM_182  chrUn_KI270390v1    HSCHRUN_RANDOM_182  NT_187474.1 2387
KI270391.1  CHR_HSCHRUN_RANDOM_189  chrUn_KI270391v1    HSCHRUN_RANDOM_189  NT_187481.1 1484
KI270392.1  CHR_HSCHRUN_RANDOM_193  chrUn_KI270392v1    HSCHRUN_RANDOM_193  NT_187485.1 971
KI270393.1  CHR_HSCHRUN_RANDOM_191  chrUn_KI270393v1    HSCHRUN_RANDOM_191  NT_187483.1 1308
KI270394.1  CHR_HSCHRUN_RANDOM_187  chrUn_KI270394v1    HSCHRUN_RANDOM_187  NT_187479.1 970
KI270395.1  CHR_HSCHRUN_RANDOM_184  chrUn_KI270395v1    HSCHRUN_RANDOM_184  NT_187476.1 1143
KI270396.1  CHR_HSCHRUN_RANDOM_185  chrUn_KI270396v1    HSCHRUN_RANDOM_185  NT_187477.1 1880
KI270411.1  CHR_HSCHRUN_RANDOM_113  chrUn_KI270411v1    HSCHRUN_RANDOM_113  NT_187409.1 2646
KI270412.1  CHR_HSCHRUN_RANDOM_112  chrUn_KI270412v1    HSCHRUN_RANDOM_112  NT_187408.1 1179
KI270414.1  CHR_HSCHRUN_RANDOM_114  chrUn_KI270414v1    HSCHRUN_RANDOM_114  NT_187410.1 2489
KI270417.1  CHR_HSCHRUN_RANDOM_119  chrUn_KI270417v1    HSCHRUN_RANDOM_119  NT_187415.1 2043
KI270418.1  CHR_HSCHRUN_RANDOM_116  chrUn_KI270418v1    HSCHRUN_RANDOM_116  NT_187412.1 2145
KI270419.1  CHR_HSCHRUN_RANDOM_115  chrUn_KI270419v1    HSCHRUN_RANDOM_115  NT_187411.1 1029
KI270420.1  CHR_HSCHRUN_RANDOM_117  chrUn_KI270420v1    HSCHRUN_RANDOM_117  NT_187413.1 2321
KI270422.1  CHR_HSCHRUN_RANDOM_120  chrUn_KI270422v1    HSCHRUN_RANDOM_120  NT_187416.1 1445
KI270423.1  CHR_HSCHRUN_RANDOM_121  chrUn_KI270423v1    HSCHRUN_RANDOM_121  NT_187417.1 981
KI270424.1  CHR_HSCHRUN_RANDOM_118  chrUn_KI270424v1    HSCHRUN_RANDOM_118  NT_187414.1 2140
KI270425.1  CHR_HSCHRUN_RANDOM_122  chrUn_KI270425v1    HSCHRUN_RANDOM_122  NT_187418.1 1884
KI270429.1  CHR_HSCHRUN_RANDOM_123  chrUn_KI270429v1    HSCHRUN_RANDOM_123  NT_187419.1 1361
KI270435.1  CHR_HSCHRUN_RANDOM_128  chrUn_KI270435v1    HSCHRUN_RANDOM_128  NT_187424.1 92983
KI270438.1  CHR_HSCHRUN_RANDOM_129  chrUn_KI270438v1    HSCHRUN_RANDOM_129  NT_187425.1 112505
KI270442.1  KI270442.1  chrUn_KI270442v1    HSCHRUN_RANDOM_124  NT_187420.1 392061
KI270448.1  CHR_HSCHRUN_RANDOM_203  chrUn_KI270448v1    HSCHRUN_RANDOM_203  NT_187495.1 7992
KI270465.1  CHR_HSCHRUN_RANDOM_126  chrUn_KI270465v1    HSCHRUN_RANDOM_126  NT_187422.1 1774
KI270466.1  CHR_HSCHRUN_RANDOM_125  chrUn_KI270466v1    HSCHRUN_RANDOM_125  NT_187421.1 1233
KI270467.1  CHR_HSCHRUN_RANDOM_127  chrUn_KI270467v1    HSCHRUN_RANDOM_127  NT_187423.1 3920
KI270468.1  CHR_HSCHRUN_RANDOM_130  chrUn_KI270468v1    HSCHRUN_RANDOM_130  NT_187426.1 4055
KI270507.1  CHR_HSCHRUN_RANDOM_141  chrUn_KI270507v1    HSCHRUN_RANDOM_141  NT_187437.1 5353
KI270508.1  CHR_HSCHRUN_RANDOM_134  chrUn_KI270508v1    HSCHRUN_RANDOM_134  NT_187430.1 1951
KI270509.1  CHR_HSCHRUN_RANDOM_132  chrUn_KI270509v1    HSCHRUN_RANDOM_132  NT_187428.1 2318
KI270510.1  CHR_HSCHRUN_RANDOM_131  chrUn_KI270510v1    HSCHRUN_RANDOM_131  NT_187427.1 2415
KI270511.1  CHR_HSCHRUN_RANDOM_139  chrUn_KI270511v1    HSCHRUN_RANDOM_139  NT_187435.1 8127
KI270512.1  CHR_HSCHRUN_RANDOM_136  chrUn_KI270512v1    HSCHRUN_RANDOM_136  NT_187432.1 22689
KI270515.1  CHR_HSCHRUN_RANDOM_140  chrUn_KI270515v1    HSCHRUN_RANDOM_140  NT_187436.1 6361
KI270516.1  CHR_HSCHRUN_RANDOM_135  chrUn_KI270516v1    HSCHRUN_RANDOM_135  NT_187431.1 1300
KI270517.1  CHR_HSCHRUN_RANDOM_142  chrUn_KI270517v1    HSCHRUN_RANDOM_142  NT_187438.1 3253
KI270518.1  CHR_HSCHRUN_RANDOM_133  chrUn_KI270518v1    HSCHRUN_RANDOM_133  NT_187429.1 2186
KI270519.1  CHR_HSCHRUN_RANDOM_137  chrUn_KI270519v1    HSCHRUN_RANDOM_137  NT_187433.1 138126
KI270521.1  CHR_HSCHRUN_RANDOM_204  chrUn_KI270521v1    HSCHRUN_RANDOM_204  NT_187496.1 7642
KI270522.1  CHR_HSCHRUN_RANDOM_138  chrUn_KI270522v1    HSCHRUN_RANDOM_138  NT_187434.1 5674
KI270528.1  CHR_HSCHRUN_RANDOM_144  chrUn_KI270528v1    HSCHRUN_RANDOM_144  NT_187440.1 2983
KI270529.1  CHR_HSCHRUN_RANDOM_143  chrUn_KI270529v1    HSCHRUN_RANDOM_143  NT_187439.1 1899
KI270530.1  CHR_HSCHRUN_RANDOM_145  chrUn_KI270530v1    HSCHRUN_RANDOM_145  NT_187441.1 2168
KI270538.1  CHR_HSCHRUN_RANDOM_147  chrUn_KI270538v1    HSCHRUN_RANDOM_147  NT_187443.1 91309
KI270539.1  CHR_HSCHRUN_RANDOM_146  chrUn_KI270539v1    HSCHRUN_RANDOM_146  NT_187442.1 993
KI270544.1  CHR_HSCHRUN_RANDOM_148  chrUn_KI270544v1    HSCHRUN_RANDOM_148  NT_187444.1 1202
KI270548.1  CHR_HSCHRUN_RANDOM_149  chrUn_KI270548v1    HSCHRUN_RANDOM_149  NT_187445.1 1599
KI270579.1  CHR_HSCHRUN_RANDOM_158  chrUn_KI270579v1    HSCHRUN_RANDOM_158  NT_187450.1 31033
KI270580.1  CHR_HSCHRUN_RANDOM_156  chrUn_KI270580v1    HSCHRUN_RANDOM_156  NT_187448.1 1553
KI270581.1  CHR_HSCHRUN_RANDOM_157  chrUn_KI270581v1    HSCHRUN_RANDOM_157  NT_187449.1 7046
KI270582.1  CHR_HSCHRUN_RANDOM_162  chrUn_KI270582v1    HSCHRUN_RANDOM_162  NT_187454.1 6504
KI270583.1  CHR_HSCHRUN_RANDOM_154  chrUn_KI270583v1    HSCHRUN_RANDOM_154  NT_187446.1 1400
KI270584.1  CHR_HSCHRUN_RANDOM_161  chrUn_KI270584v1    HSCHRUN_RANDOM_161  NT_187453.1 4513
KI270587.1  CHR_HSCHRUN_RANDOM_155  chrUn_KI270587v1    HSCHRUN_RANDOM_155  NT_187447.1 2969
KI270588.1  CHR_HSCHRUN_RANDOM_163  chrUn_KI270588v1    HSCHRUN_RANDOM_163  NT_187455.1 6158
KI270589.1  CHR_HSCHRUN_RANDOM_159  chrUn_KI270589v1    HSCHRUN_RANDOM_159  NT_187451.1 44474
KI270590.1  CHR_HSCHRUN_RANDOM_160  chrUn_KI270590v1    HSCHRUN_RANDOM_160  NT_187452.1 4685
KI270591.1  CHR_HSCHRUN_RANDOM_165  chrUn_KI270591v1    HSCHRUN_RANDOM_165  NT_187457.1 5796
KI270593.1  CHR_HSCHRUN_RANDOM_164  chrUn_KI270593v1    HSCHRUN_RANDOM_164  NT_187456.1 3041
KI270706.1  KI270706.1  chr1_KI270706v1_random  HSCHR1_CTG1_UNLOCALIZED NT_187361.1 175055
KI270707.1  KI270707.1  chr1_KI270707v1_random  HSCHR1_CTG2_UNLOCALIZED NT_187362.1 32032
KI270708.1  KI270708.1  chr1_KI270708v1_random  HSCHR1_CTG3_UNLOCALIZED NT_187363.1 127682
KI270709.1  CHR_HSCHR1_CTG4_UNLOCALIZED chr1_KI270709v1_random  HSCHR1_CTG4_UNLOCALIZED NT_187364.1 66860
KI270710.1  CHR_HSCHR1_CTG5_UNLOCALIZED chr1_KI270710v1_random  HSCHR1_CTG5_UNLOCALIZED NT_187365.1 40176
KI270711.1  KI270711.1  chr1_KI270711v1_random  HSCHR1_CTG6_UNLOCALIZED NT_187366.1 42210
KI270712.1  CHR_HSCHR1_CTG7_UNLOCALIZED chr1_KI270712v1_random  HSCHR1_CTG7_UNLOCALIZED NT_187367.1 176043
KI270713.1  KI270713.1  chr1_KI270713v1_random  HSCHR1_CTG8_UNLOCALIZED NT_187368.1 40745
KI270714.1  KI270714.1  chr1_KI270714v1_random  HSCHR1_CTG9_UNLOCALIZED NT_187369.1 41717
KI270715.1  CHR_HSCHR2_RANDOM_CTG1  chr2_KI270715v1_random  HSCHR2_RANDOM_CTG1  NT_187370.1 161471
KI270716.1  CHR_HSCHR2_RANDOM_CTG2  chr2_KI270716v1_random  HSCHR2_RANDOM_CTG2  NT_187371.1 153799
KI270717.1  CHR_HSCHR9_UNLOCALIZED_CTG1 chr9_KI270717v1_random  HSCHR9_UNLOCALIZED_CTG1 NT_187372.1 40062
KI270718.1  CHR_HSCHR9_UNLOCALIZED_CTG2 chr9_KI270718v1_random  HSCHR9_UNLOCALIZED_CTG2 NT_187373.1 38054
KI270719.1  CHR_HSCHR9_UNLOCALIZED_CTG3 chr9_KI270719v1_random  HSCHR9_UNLOCALIZED_CTG3 NT_187374.1 176845
KI270720.1  CHR_HSCHR9_UNLOCALIZED_CTG4 chr9_KI270720v1_random  HSCHR9_UNLOCALIZED_CTG4 NT_187375.1 39050
KI270721.1  KI270721.1  chr11_KI270721v1_random HSCHR11_CTG1_UNLOCALIZED    NT_187376.1 100316
KI270722.1  KI270722.1  chr14_KI270722v1_random HSCHR14_CTG3_UNLOCALIZED    NT_187377.1 194050
KI270723.1  KI270723.1  chr14_KI270723v1_random HSCHR14_CTG5_UNLOCALIZED    NT_187378.1 38115
KI270724.1  KI270724.1  chr14_KI270724v1_random HSCHR14_CTG6_UNLOCALIZED    NT_187379.1 39555
KI270725.1  CHR_HSCHR14_CTG7_UNLOCALIZED    chr14_KI270725v1_random HSCHR14_CTG7_UNLOCALIZED    NT_187380.1 172810
KI270726.1  KI270726.1  chr14_KI270726v1_random HSCHR14_CTG8_UNLOCALIZED    NT_187381.1 43739
KI270727.1  KI270727.1  chr15_KI270727v1_random HSCHR15_RANDOM_CTG1 NT_187382.1 448248
KI270728.1  KI270728.1  chr16_KI270728v1_random HSCHR16_RANDOM_CTG1 NT_187383.1 1872759
KI270729.1  CHR_HSCHR17_RANDOM_CTG4 chr17_KI270729v1_random HSCHR17_RANDOM_CTG4 NT_187384.1 280839
KI270730.1  CHR_HSCHR17_RANDOM_CTG5 chr17_KI270730v1_random HSCHR17_RANDOM_CTG5 NT_187385.1 112551
KI270731.1  KI270731.1  chr22_KI270731v1_random HSCHR22_UNLOCALIZED_CTG1    NT_187386.1 150754
KI270732.1  CHR_HSCHR22_UNLOCALIZED_CTG2    chr22_KI270732v1_random HSCHR22_UNLOCALIZED_CTG2    NT_187387.1 41543
KI270733.1  KI270733.1  chr22_KI270733v1_random HSCHR22_UNLOCALIZED_CTG3    NT_187388.1 179772
KI270734.1  KI270734.1  chr22_KI270734v1_random HSCHR22_UNLOCALIZED_CTG4    NT_187389.1 165050
KI270735.1  CHR_HSCHR22_UNLOCALIZED_CTG5    chr22_KI270735v1_random HSCHR22_UNLOCALIZED_CTG5    NT_187390.1 42811
KI270736.1  CHR_HSCHR22_UNLOCALIZED_CTG6    chr22_KI270736v1_random HSCHR22_UNLOCALIZED_CTG6    NT_187391.1 181920
KI270737.1  CHR_HSCHR22_UNLOCALIZED_CTG7    chr22_KI270737v1_random HSCHR22_UNLOCALIZED_CTG7    NT_187392.1 103838
KI270738.1  CHR_HSCHR22_UNLOCALIZED_CTG8    chr22_KI270738v1_random HSCHR22_UNLOCALIZED_CTG8    NT_187393.1 99375
KI270739.1  CHR_HSCHR22_UNLOCALIZED_CTG9    chr22_KI270739v1_random HSCHR22_UNLOCALIZED_CTG9    NT_187394.1 73985
KI270740.1  CHR_HSCHRY_RANDOM_CTG1  chrY_KI270740v1_random  HSCHRY_RANDOM_CTG1  NT_187395.1 37240
KI270741.1  KI270741.1  chrUn_KI270741v1    HSCHRUN_RANDOM_CTG17    NT_187497.1 157432
KI270742.1  CHR_HSCHRUN_RANDOM_CTG42    chrUn_KI270742v1    HSCHRUN_RANDOM_CTG42    NT_187513.1 186739
KI270743.1  KI270743.1  chrUn_KI270743v1    HSCHRUN_RANDOM_CTG20    NT_187498.1 210658
KI270744.1  KI270744.1  chrUn_KI270744v1    HSCHRUN_RANDOM_CTG21    NT_187499.1 168472
KI270745.1  CHR_HSCHRUN_RANDOM_CTG22    chrUn_KI270745v1    HSCHRUN_RANDOM_CTG22    NT_187500.1 41891
KI270746.1  CHR_HSCHRUN_RANDOM_CTG23    chrUn_KI270746v1    HSCHRUN_RANDOM_CTG23    NT_187501.1 66486
KI270747.1  CHR_HSCHRUN_RANDOM_CTG24    chrUn_KI270747v1    HSCHRUN_RANDOM_CTG24    NT_187502.1 198735
KI270748.1  CHR_HSCHRUN_RANDOM_CTG25    chrUn_KI270748v1    HSCHRUN_RANDOM_CTG25    NT_187503.1 93321
KI270749.1  CHR_HSCHRUN_RANDOM_CTG26    chrUn_KI270749v1    HSCHRUN_RANDOM_CTG26    NT_187504.1 158759
KI270750.1  KI270750.1  chrUn_KI270750v1    HSCHRUN_RANDOM_CTG27    NT_187505.1 148850
KI270751.1  CHR_HSCHRUN_RANDOM_CTG28    chrUn_KI270751v1    HSCHRUN_RANDOM_CTG28    NT_187506.1 150742
KI270752.1  KI270752.1  chrUn_KI270752v1    HSCHRUN_RANDOM_CTG29    NT_187507.1 27745
KI270753.1  CHR_HSCHRUN_RANDOM_CTG30    chrUn_KI270753v1    HSCHRUN_RANDOM_CTG30    NT_187508.1 62944
KI270754.1  CHR_HSCHRUN_RANDOM_CTG33    chrUn_KI270754v1    HSCHRUN_RANDOM_CTG33    NT_187509.1 40191
KI270755.1  CHR_HSCHRUN_RANDOM_CTG34    chrUn_KI270755v1    HSCHRUN_RANDOM_CTG34    NT_187510.1 36723
KI270756.1  CHR_HSCHRUN_RANDOM_CTG35    chrUn_KI270756v1    HSCHRUN_RANDOM_CTG35    NT_187511.1 79590
KI270757.1  CHR_HSCHRUN_RANDOM_CTG36    chrUn_KI270757v1    HSCHRUN_RANDOM_CTG36    NT_187512.1 71251
KI270758.1  CHR_HSCHR6_8_CTG1   chr6_KI270758v1_alt HSCHR6_8_CTG1   NT_187692.1 76752
KI270759.1  CHR_HSCHR1_1_CTG32_1    chr1_KI270759v1_alt HSCHR1_1_CTG32_1    NT_187516.1 425601
KI270760.1  CHR_HSCHR1_1_CTG11  chr1_KI270760v1_alt HSCHR1_1_CTG11  NT_187514.1 109528
KI270761.1  CHR_HSCHR1_2_CTG32_1    chr1_KI270761v1_alt HSCHR1_2_CTG32_1    NT_187518.1 165834
KI270762.1  CHR_HSCHR1_1_CTG3   chr1_KI270762v1_alt HSCHR1_1_CTG3   NT_187515.1 354444
KI270763.1  CHR_HSCHR1_3_CTG32_1    chr1_KI270763v1_alt HSCHR1_3_CTG32_1    NT_187519.1 911658
KI270764.1  CHR_HSCHR1_4_CTG32_1    chr1_KI270764v1_alt HSCHR1_4_CTG32_1    NT_187521.1 50258
KI270765.1  CHR_HSCHR1_4_CTG31  chr1_KI270765v1_alt HSCHR1_4_CTG31  NT_187520.1 185285
KI270766.1  CHR_HSCHR1_2_CTG3   chr1_KI270766v1_alt HSCHR1_2_CTG3   NT_187517.1 256271
KI270767.1  CHR_HSCHR2_1_CTG15  chr2_KI270767v1_alt HSCHR2_1_CTG15  NT_187523.1 161578
KI270768.1  CHR_HSCHR2_3_CTG7_2 chr2_KI270768v1_alt HSCHR2_3_CTG7_2 NT_187528.1 110099
KI270769.1  CHR_HSCHR2_1_CTG1   chr2_KI270769v1_alt HSCHR2_1_CTG1   NT_187522.1 120616
KI270770.1  CHR_HSCHR2_2_CTG1   chr2_KI270770v1_alt HSCHR2_2_CTG1   NT_187525.1 136240
KI270771.1  CHR_HSCHR2_4_CTG7_2 chr2_KI270771v1_alt HSCHR2_4_CTG7_2 NT_187530.1 110395
KI270772.1  CHR_HSCHR2_1_CTG7   chr2_KI270772v1_alt HSCHR2_1_CTG7   NT_187524.1 133041
KI270773.1  CHR_HSCHR2_3_CTG1   chr2_KI270773v1_alt HSCHR2_3_CTG1   NT_187526.1 70887
KI270774.1  CHR_HSCHR2_4_CTG1   chr2_KI270774v1_alt HSCHR2_4_CTG1   NT_187529.1 223625
KI270775.1  CHR_HSCHR2_5_CTG7_2 chr2_KI270775v1_alt HSCHR2_5_CTG7_2 NT_187531.1 138019
KI270776.1  CHR_HSCHR2_3_CTG15  chr2_KI270776v1_alt HSCHR2_3_CTG15  NT_187527.1 174166
KI270777.1  CHR_HSCHR3_2_CTG2_1 chr3_KI270777v1_alt HSCHR3_2_CTG2_1 NT_187533.1 173649
KI270778.1  CHR_HSCHR3_3_CTG2_1 chr3_KI270778v1_alt HSCHR3_3_CTG2_1 NT_187536.1 248252
KI270779.1  CHR_HSCHR3_1_CTG3   chr3_KI270779v1_alt HSCHR3_1_CTG3   NT_187532.1 205312
KI270780.1  CHR_HSCHR3_4_CTG2_1 chr3_KI270780v1_alt HSCHR3_4_CTG2_1 NT_187537.1 224108
KI270781.1  CHR_HSCHR3_5_CTG2_1 chr3_KI270781v1_alt HSCHR3_5_CTG2_1 NT_187538.1 113034
KI270782.1  CHR_HSCHR3_2_CTG3   chr3_KI270782v1_alt HSCHR3_2_CTG3   NT_187534.1 162429
KI270783.1  CHR_HSCHR3_3_CTG1   chr3_KI270783v1_alt HSCHR3_3_CTG1   NT_187535.1 109187
KI270784.1  CHR_HSCHR3_9_CTG3   chr3_KI270784v1_alt HSCHR3_9_CTG3   NT_187539.1 184404
KI270785.1  CHR_HSCHR4_2_CTG12  chr4_KI270785v1_alt HSCHR4_2_CTG12  NT_187542.1 119912
KI270786.1  CHR_HSCHR4_3_CTG12  chr4_KI270786v1_alt HSCHR4_3_CTG12  NT_187543.1 244096
KI270787.1  CHR_HSCHR4_1_CTG8_1 chr4_KI270787v1_alt HSCHR4_1_CTG8_1 NT_187541.1 111943
KI270788.1  CHR_HSCHR4_4_CTG12  chr4_KI270788v1_alt HSCHR4_4_CTG12  NT_187544.1 158965
KI270789.1  CHR_HSCHR4_5_CTG12  chr4_KI270789v1_alt HSCHR4_5_CTG12  NT_187545.1 205944
KI270790.1  CHR_HSCHR4_1_CTG4   chr4_KI270790v1_alt HSCHR4_1_CTG4   NT_187540.1 220246
KI270791.1  CHR_HSCHR5_3_CTG1   chr5_KI270791v1_alt HSCHR5_3_CTG1   NT_187547.1 195710
KI270792.1  CHR_HSCHR5_4_CTG1   chr5_KI270792v1_alt HSCHR5_4_CTG1   NT_187548.1 179043
KI270793.1  CHR_HSCHR5_5_CTG1   chr5_KI270793v1_alt HSCHR5_5_CTG1   NT_187550.1 126136
KI270794.1  CHR_HSCHR5_6_CTG1   chr5_KI270794v1_alt HSCHR5_6_CTG1   NT_187551.1 164558
KI270795.1  CHR_HSCHR5_2_CTG5   chr5_KI270795v1_alt HSCHR5_2_CTG5   NT_187546.1 131892
KI270796.1  CHR_HSCHR5_4_CTG1_1 chr5_KI270796v1_alt HSCHR5_4_CTG1_1 NT_187549.1 172708
KI270797.1  CHR_HSCHR6_1_CTG4   chr6_KI270797v1_alt HSCHR6_1_CTG4   NT_187552.1 197536
KI270798.1  CHR_HSCHR6_1_CTG5   chr6_KI270798v1_alt HSCHR6_1_CTG5   NT_187553.1 271782
KI270799.1  CHR_HSCHR6_1_CTG6   chr6_KI270799v1_alt HSCHR6_1_CTG6   NT_187554.1 152148
KI270800.1  CHR_HSCHR6_1_CTG7   chr6_KI270800v1_alt HSCHR6_1_CTG7   NT_187555.1 175808
KI270801.1  CHR_HSCHR6_1_CTG8   chr6_KI270801v1_alt HSCHR6_1_CTG8   NT_187556.1 870480
KI270802.1  CHR_HSCHR6_1_CTG9   chr6_KI270802v1_alt HSCHR6_1_CTG9   NT_187557.1 75005
KI270803.1  CHR_HSCHR7_2_CTG6   chr7_KI270803v1_alt HSCHR7_2_CTG6   NT_187562.1 1111570
KI270804.1  CHR_HSCHR7_1_CTG1   chr7_KI270804v1_alt HSCHR7_1_CTG1   NT_187558.1 157952
KI270805.1  CHR_HSCHR7_1_CTG7   chr7_KI270805v1_alt HSCHR7_1_CTG7   NT_187560.1 209988
KI270806.1  CHR_HSCHR7_1_CTG4_4 chr7_KI270806v1_alt HSCHR7_1_CTG4_4 NT_187559.1 158166
KI270807.1  CHR_HSCHR7_2_CTG7   chr7_KI270807v1_alt HSCHR7_2_CTG7   NT_187563.1 126434
KI270808.1  CHR_HSCHR7_3_CTG6   chr7_KI270808v1_alt HSCHR7_3_CTG6   NT_187564.1 271455
KI270809.1  CHR_HSCHR7_2_CTG4_4 chr7_KI270809v1_alt HSCHR7_2_CTG4_4 NT_187561.1 209586
KI270810.1  CHR_HSCHR8_1_CTG7   chr8_KI270810v1_alt HSCHR8_1_CTG7   NT_187567.1 374415
KI270811.1  CHR_HSCHR8_1_CTG1   chr8_KI270811v1_alt HSCHR8_1_CTG1   NT_187565.1 292436
KI270812.1  CHR_HSCHR8_2_CTG1   chr8_KI270812v1_alt HSCHR8_2_CTG1   NT_187568.1 282736
KI270813.1  CHR_HSCHR8_3_CTG1   chr8_KI270813v1_alt HSCHR8_3_CTG1   NT_187570.1 300230
KI270814.1  CHR_HSCHR8_1_CTG6   chr8_KI270814v1_alt HSCHR8_1_CTG6   NT_187566.1 141812
KI270815.1  CHR_HSCHR8_2_CTG7   chr8_KI270815v1_alt HSCHR8_2_CTG7   NT_187569.1 132244
KI270816.1  CHR_HSCHR8_3_CTG7   chr8_KI270816v1_alt HSCHR8_3_CTG7   NT_187571.1 305841
KI270817.1  CHR_HSCHR8_4_CTG7   chr8_KI270817v1_alt HSCHR8_4_CTG7   NT_187573.1 158983
KI270818.1  CHR_HSCHR8_4_CTG1   chr8_KI270818v1_alt HSCHR8_4_CTG1   NT_187572.1 145606
KI270819.1  CHR_HSCHR8_5_CTG7   chr8_KI270819v1_alt HSCHR8_5_CTG7   NT_187574.1 133535
KI270820.1  CHR_HSCHR8_6_CTG7   chr8_KI270820v1_alt HSCHR8_6_CTG7   NT_187575.1 36640
KI270821.1  CHR_HSCHR8_8_CTG1   chr8_KI270821v1_alt HSCHR8_8_CTG1   NT_187576.1 985506
KI270822.1  CHR_HSCHR8_9_CTG1   chr8_KI270822v1_alt HSCHR8_9_CTG1   NT_187577.1 624492
KI270823.1  CHR_HSCHR9_1_CTG5   chr9_KI270823v1_alt HSCHR9_1_CTG5   NT_187578.1 439082
KI270824.1  CHR_HSCHR10_1_CTG3  chr10_KI270824v1_alt    HSCHR10_1_CTG3  NT_187579.1 181496
KI270825.1  CHR_HSCHR10_1_CTG4  chr10_KI270825v1_alt    HSCHR10_1_CTG4  NT_187580.1 188315
KI270826.1  CHR_HSCHR11_1_CTG2  chr11_KI270826v1_alt    HSCHR11_1_CTG2  NT_187581.1 186169
KI270827.1  CHR_HSCHR11_1_CTG3  chr11_KI270827v1_alt    HSCHR11_1_CTG3  NT_187582.1 67707
KI270829.1  CHR_HSCHR11_1_CTG5  chr11_KI270829v1_alt    HSCHR11_1_CTG5  NT_187583.1 204059
KI270830.1  CHR_HSCHR11_1_CTG6  chr11_KI270830v1_alt    HSCHR11_1_CTG6  NT_187584.1 177092
KI270831.1  CHR_HSCHR11_1_CTG7  chr11_KI270831v1_alt    HSCHR11_1_CTG7  NT_187585.1 296895
KI270832.1  CHR_HSCHR11_1_CTG8  chr11_KI270832v1_alt    HSCHR11_1_CTG8  NT_187586.1 210133
KI270833.1  CHR_HSCHR12_5_CTG2_1    chr12_KI270833v1_alt    HSCHR12_5_CTG2_1    NT_187589.1 76061
KI270834.1  CHR_HSCHR12_6_CTG2_1    chr12_KI270834v1_alt    HSCHR12_6_CTG2_1    NT_187590.1 119498
KI270835.1  CHR_HSCHR12_4_CTG2  chr12_KI270835v1_alt    HSCHR12_4_CTG2  NT_187587.1 238139
KI270836.1  CHR_HSCHR12_7_CTG2_1    chr12_KI270836v1_alt    HSCHR12_7_CTG2_1    NT_187591.1 56134
KI270837.1  CHR_HSCHR12_5_CTG2  chr12_KI270837v1_alt    HSCHR12_5_CTG2  NT_187588.1 40090
KI270838.1  CHR_HSCHR13_1_CTG1  chr13_KI270838v1_alt    HSCHR13_1_CTG1  NT_187592.1 306913
KI270839.1  CHR_HSCHR13_1_CTG2  chr13_KI270839v1_alt    HSCHR13_1_CTG2  NT_187593.1 180306
KI270840.1  CHR_HSCHR13_1_CTG3  chr13_KI270840v1_alt    HSCHR13_1_CTG3  NT_187594.1 191684
KI270841.1  CHR_HSCHR13_1_CTG4  chr13_KI270841v1_alt    HSCHR13_1_CTG4  NT_187595.1 169134
KI270842.1  CHR_HSCHR13_1_CTG5  chr13_KI270842v1_alt    HSCHR13_1_CTG5  NT_187596.1 37287
KI270843.1  CHR_HSCHR13_1_CTG6  chr13_KI270843v1_alt    HSCHR13_1_CTG6  NT_187597.1 103832
KI270844.1  CHR_HSCHR14_1_CTG1  chr14_KI270844v1_alt    HSCHR14_1_CTG1  NT_187598.1 322166
KI270845.1  CHR_HSCHR14_2_CTG1  chr14_KI270845v1_alt    HSCHR14_2_CTG1  NT_187599.1 180703
KI270846.1  CHR_HSCHR14_3_CTG1  chr14_KI270846v1_alt    HSCHR14_3_CTG1  NT_187600.1 1351393
KI270847.1  CHR_HSCHR14_7_CTG1  chr14_KI270847v1_alt    HSCHR14_7_CTG1  NT_187601.1 1511111
KI270848.1  CHR_HSCHR15_1_CTG3  chr15_KI270848v1_alt    HSCHR15_1_CTG3  NT_187603.1 327382
KI270849.1  CHR_HSCHR15_3_CTG8  chr15_KI270849v1_alt    HSCHR15_3_CTG8  NT_187605.1 244917
KI270850.1  CHR_HSCHR15_5_CTG8  chr15_KI270850v1_alt    HSCHR15_5_CTG8  NT_187606.1 430880
KI270851.1  CHR_HSCHR15_3_CTG3  chr15_KI270851v1_alt    HSCHR15_3_CTG3  NT_187604.1 263054
KI270852.1  CHR_HSCHR15_1_CTG1  chr15_KI270852v1_alt    HSCHR15_1_CTG1  NT_187602.1 478999
KI270853.1  CHR_HSCHR16_1_CTG1  chr16_KI270853v1_alt    HSCHR16_1_CTG1  NT_187607.1 2659700
KI270854.1  CHR_HSCHR16_CTG2    chr16_KI270854v1_alt    HSCHR16_CTG2    NT_187610.1 134193
KI270855.1  CHR_HSCHR16_3_CTG1  chr16_KI270855v1_alt    HSCHR16_3_CTG1  NT_187608.1 232857
KI270856.1  CHR_HSCHR16_4_CTG1  chr16_KI270856v1_alt    HSCHR16_4_CTG1  NT_187609.1 63982
KI270857.1  CHR_HSCHR17_7_CTG4  chr17_KI270857v1_alt    HSCHR17_7_CTG4  NT_187614.1 2877074
KI270858.1  CHR_HSCHR17_8_CTG4  chr17_KI270858v1_alt    HSCHR17_8_CTG4  NT_187615.1 235827
KI270859.1  CHR_HSCHR17_9_CTG4  chr17_KI270859v1_alt    HSCHR17_9_CTG4  NT_187616.1 108763
KI270860.1  CHR_HSCHR17_1_CTG9  chr17_KI270860v1_alt    HSCHR17_1_CTG9  NT_187612.1 178921
KI270861.1  CHR_HSCHR17_1_CTG2  chr17_KI270861v1_alt    HSCHR17_1_CTG2  NT_187611.1 196688
KI270862.1  CHR_HSCHR17_2_CTG2  chr17_KI270862v1_alt    HSCHR17_2_CTG2  NT_187613.1 391357
KI270863.1  CHR_HSCHR18_3_CTG2_1    chr18_KI270863v1_alt    HSCHR18_3_CTG2_1    NT_187617.1 167999
KI270864.1  CHR_HSCHR18_4_CTG1_1    chr18_KI270864v1_alt    HSCHR18_4_CTG1_1    NT_187618.1 111737
KI270865.1  CHR_HSCHR19_4_CTG2  chr19_KI270865v1_alt    HSCHR19_4_CTG2  NT_187621.1 52969
KI270866.1  CHR_HSCHR19_2_CTG3_1    chr19_KI270866v1_alt    HSCHR19_2_CTG3_1    NT_187619.1 43156
KI270867.1  CHR_HSCHR19_3_CTG3_1    chr19_KI270867v1_alt    HSCHR19_3_CTG3_1    NT_187620.1 233762
KI270868.1  CHR_HSCHR19_5_CTG2  chr19_KI270868v1_alt    HSCHR19_5_CTG2  NT_187622.1 61734
KI270869.1  CHR_HSCHR20_1_CTG2  chr20_KI270869v1_alt    HSCHR20_1_CTG2  NT_187623.1 118774
KI270870.1  CHR_HSCHR20_1_CTG3  chr20_KI270870v1_alt    HSCHR20_1_CTG3  NT_187624.1 183433
KI270871.1  CHR_HSCHR20_1_CTG4  chr20_KI270871v1_alt    HSCHR20_1_CTG4  NT_187625.1 58661
KI270872.1  CHR_HSCHR21_5_CTG2  chr21_KI270872v1_alt    HSCHR21_5_CTG2  NT_187626.1 82692
KI270873.1  CHR_HSCHR21_6_CTG1_1    chr21_KI270873v1_alt    HSCHR21_6_CTG1_1    NT_187627.1 143900
KI270874.1  CHR_HSCHR21_8_CTG1_1    chr21_KI270874v1_alt    HSCHR21_8_CTG1_1    NT_187628.1 166743
KI270875.1  CHR_HSCHR22_1_CTG3  chr22_KI270875v1_alt    HSCHR22_1_CTG3  NT_187629.1 259914
KI270876.1  CHR_HSCHR22_1_CTG4  chr22_KI270876v1_alt    HSCHR22_1_CTG4  NT_187630.1 263666
KI270877.1  CHR_HSCHR22_1_CTG5  chr22_KI270877v1_alt    HSCHR22_1_CTG5  NT_187631.1 101331
KI270878.1  CHR_HSCHR22_1_CTG6  chr22_KI270878v1_alt    HSCHR22_1_CTG6  NT_187632.1 186262
KI270879.1  CHR_HSCHR22_1_CTG7  chr22_KI270879v1_alt    HSCHR22_1_CTG7  NT_187633.1 304135
KI270880.1  CHR_HSCHRX_1_CTG3   chrX_KI270880v1_alt HSCHRX_1_CTG3   NT_187634.1 284869
KI270881.1  CHR_HSCHRX_2_CTG12  chrX_KI270881v1_alt HSCHRX_2_CTG12  NT_187635.1 144206
KI270882.1  CHR_HSCHR19KIR_FH15_B_HAP_CTG3_1    chr19_KI270882v1_alt    HSCHR19KIR_FH15_B_HAP_CTG3_1    NT_187636.1 248807
KI270883.1  CHR_HSCHR19KIR_G085_A_HAP_CTG3_1    chr19_KI270883v1_alt    HSCHR19KIR_G085_A_HAP_CTG3_1    NT_187637.1 170399
KI270884.1  CHR_HSCHR19KIR_G085_BA1_HAP_CTG3_1  chr19_KI270884v1_alt    HSCHR19KIR_G085_BA1_HAP_CTG3_1  NT_187638.1 157053
KI270885.1  CHR_HSCHR19KIR_G248_A_HAP_CTG3_1    chr19_KI270885v1_alt    HSCHR19KIR_G248_A_HAP_CTG3_1    NT_187639.1 171027
KI270886.1  CHR_HSCHR19KIR_G248_BA2_HAP_CTG3_1  chr19_KI270886v1_alt    HSCHR19KIR_G248_BA2_HAP_CTG3_1  NT_187640.1 204239
KI270887.1  CHR_HSCHR19KIR_GRC212_AB_HAP_CTG3_1 chr19_KI270887v1_alt    HSCHR19KIR_GRC212_AB_HAP_CTG3_1 NT_187641.1 209512
KI270888.1  CHR_HSCHR19KIR_GRC212_BA1_HAP_CTG3_1    chr19_KI270888v1_alt    HSCHR19KIR_GRC212_BA1_HAP_CTG3_1    NT_187642.1 155532
KI270889.1  CHR_HSCHR19KIR_LUCE_A_HAP_CTG3_1    chr19_KI270889v1_alt    HSCHR19KIR_LUCE_A_HAP_CTG3_1    NT_187643.1 170698
KI270890.1  CHR_HSCHR19KIR_LUCE_BDEL_HAP_CTG3_1 chr19_KI270890v1_alt    HSCHR19KIR_LUCE_BDEL_HAP_CTG3_1 NT_187644.1 184499
KI270891.1  CHR_HSCHR19KIR_RSH_A_HAP_CTG3_1 chr19_KI270891v1_alt    HSCHR19KIR_RSH_A_HAP_CTG3_1 NT_187645.1 170680
KI270892.1  CHR_HSCHR1_ALT2_1_CTG32_1   chr1_KI270892v1_alt HSCHR1_ALT2_1_CTG32_1   NT_187646.1 162212
KI270893.1  CHR_HSCHR2_2_CTG15  chr2_KI270893v1_alt HSCHR2_2_CTG15  NT_187647.1 161218
KI270894.1  CHR_HSCHR2_2_CTG7   chr2_KI270894v1_alt HSCHR2_2_CTG7   NT_187648.1 214158
KI270895.1  CHR_HSCHR3_3_CTG3   chr3_KI270895v1_alt HSCHR3_3_CTG3   NT_187649.1 162896
KI270896.1  CHR_HSCHR4_6_CTG12  chr4_KI270896v1_alt HSCHR4_6_CTG12  NT_187650.1 378547
KI270897.1  CHR_HSCHR5_1_CTG1_1 chr5_KI270897v1_alt HSCHR5_1_CTG1_1 NT_187651.1 1144418
KI270898.1  CHR_HSCHR5_3_CTG5   chr5_KI270898v1_alt HSCHR5_3_CTG5   NT_187652.1 130957
KI270899.1  CHR_HSCHR7_2_CTG1   chr7_KI270899v1_alt HSCHR7_2_CTG1   NT_187653.1 190869
KI270900.1  CHR_HSCHR8_5_CTG1   chr8_KI270900v1_alt HSCHR8_5_CTG1   NT_187654.1 318687
KI270901.1  CHR_HSCHR8_6_CTG1   chr8_KI270901v1_alt HSCHR8_6_CTG1   NT_187655.1 136959
KI270902.1  CHR_HSCHR11_2_CTG1  chr11_KI270902v1_alt    HSCHR11_2_CTG1  NT_187656.1 106711
KI270903.1  CHR_HSCHR11_2_CTG1_1    chr11_KI270903v1_alt    HSCHR11_2_CTG1_1    NT_187657.1 214625
KI270904.1  CHR_HSCHR12_3_CTG2  chr12_KI270904v1_alt    HSCHR12_3_CTG2  NT_187658.1 572349
KI270905.1  CHR_HSCHR15_4_CTG8  chr15_KI270905v1_alt    HSCHR15_4_CTG8  NT_187660.1 5161414
KI270906.1  CHR_HSCHR15_2_CTG3  chr15_KI270906v1_alt    HSCHR15_2_CTG3  NT_187659.1 196384
KI270907.1  CHR_HSCHR17_2_CTG1  chr17_KI270907v1_alt    HSCHR17_2_CTG1  NT_187662.1 137721
KI270908.1  CHR_HSCHR17_2_CTG5  chr17_KI270908v1_alt    HSCHR17_2_CTG5  NT_187663.1 1423190
KI270909.1  CHR_HSCHR17_10_CTG4 chr17_KI270909v1_alt    HSCHR17_10_CTG4 NT_187661.1 325800
KI270910.1  CHR_HSCHR17_3_CTG2  chr17_KI270910v1_alt    HSCHR17_3_CTG2  NT_187664.1 157099
KI270911.1  CHR_HSCHR18_ALT2_CTG2_1 chr18_KI270911v1_alt    HSCHR18_ALT2_CTG2_1 NT_187666.1 157710
KI270912.1  CHR_HSCHR18_ALT21_CTG2_1    chr18_KI270912v1_alt    HSCHR18_ALT21_CTG2_1    NT_187665.1 174061
KI270913.1  CHR_HSCHRX_2_CTG3   chrX_KI270913v1_alt HSCHRX_2_CTG3   NT_187667.1 274009
KI270914.1  CHR_HSCHR19KIR_RSH_BA2_HAP_CTG3_1   chr19_KI270914v1_alt    HSCHR19KIR_RSH_BA2_HAP_CTG3_1   NT_187668.1 205194
KI270915.1  CHR_HSCHR19KIR_T7526_A_HAP_CTG3_1   chr19_KI270915v1_alt    HSCHR19KIR_T7526_A_HAP_CTG3_1   NT_187669.1 170665
KI270916.1  CHR_HSCHR19KIR_T7526_BDEL_HAP_CTG3_1    chr19_KI270916v1_alt    HSCHR19KIR_T7526_BDEL_HAP_CTG3_1    NT_187670.1 184516
KI270917.1  CHR_HSCHR19KIR_ABC08_A1_HAP_CTG3_1  chr19_KI270917v1_alt    HSCHR19KIR_ABC08_A1_HAP_CTG3_1  NT_187671.1 190932
KI270918.1  CHR_HSCHR19KIR_ABC08_AB_HAP_C_P_CTG3_1  chr19_KI270918v1_alt    HSCHR19KIR_ABC08_AB_HAP_C_P_CTG3_1  NT_187672.1 123111
KI270919.1  CHR_HSCHR19KIR_ABC08_AB_HAP_T_P_CTG3_1  chr19_KI270919v1_alt    HSCHR19KIR_ABC08_AB_HAP_T_P_CTG3_1  NT_187673.1 170701
KI270920.1  CHR_HSCHR19KIR_FH05_A_HAP_CTG3_1    chr19_KI270920v1_alt    HSCHR19KIR_FH05_A_HAP_CTG3_1    NT_187674.1 198005
KI270921.1  CHR_HSCHR19KIR_FH05_B_HAP_CTG3_1    chr19_KI270921v1_alt    HSCHR19KIR_FH05_B_HAP_CTG3_1    NT_187675.1 282224
KI270922.1  CHR_HSCHR19KIR_FH06_A_HAP_CTG3_1    chr19_KI270922v1_alt    HSCHR19KIR_FH06_A_HAP_CTG3_1    NT_187676.1 187935
KI270923.1  CHR_HSCHR19KIR_FH06_BA1_HAP_CTG3_1  chr19_KI270923v1_alt    HSCHR19KIR_FH06_BA1_HAP_CTG3_1  NT_187677.1 189352
KI270924.1  CHR_HSCHR3_4_CTG3   chr3_KI270924v1_alt HSCHR3_4_CTG3   NT_187678.1 166540
KI270925.1  CHR_HSCHR4_7_CTG12  chr4_KI270925v1_alt HSCHR4_7_CTG12  NT_187679.1 555799
KI270926.1  CHR_HSCHR8_7_CTG1   chr8_KI270926v1_alt HSCHR8_7_CTG1   NT_187680.1 229282
KI270927.1  CHR_HSCHR11_3_CTG1  chr11_KI270927v1_alt    HSCHR11_3_CTG1  NT_187681.1 218612
KI270928.1  CHR_HSCHR22_3_CTG1  chr22_KI270928v1_alt    HSCHR22_3_CTG1  NT_187682.1 176103
KI270929.1  CHR_HSCHR19KIR_FH08_A_HAP_CTG3_1    chr19_KI270929v1_alt    HSCHR19KIR_FH08_A_HAP_CTG3_1    NT_187683.1 186203
KI270930.1  CHR_HSCHR19KIR_FH08_BAX_HAP_CTG3_1  chr19_KI270930v1_alt    HSCHR19KIR_FH08_BAX_HAP_CTG3_1  NT_187684.1 200773
KI270931.1  CHR_HSCHR19KIR_FH13_A_HAP_CTG3_1    chr19_KI270931v1_alt    HSCHR19KIR_FH13_A_HAP_CTG3_1    NT_187685.1 170148
KI270932.1  CHR_HSCHR19KIR_FH13_BA2_HAP_CTG3_1  chr19_KI270932v1_alt    HSCHR19KIR_FH13_BA2_HAP_CTG3_1  NT_187686.1 215732
KI270933.1  CHR_HSCHR19KIR_FH15_A_HAP_CTG3_1    chr19_KI270933v1_alt    HSCHR19KIR_FH15_A_HAP_CTG3_1    NT_187687.1 170537
KI270934.1  CHR_HSCHR3_5_CTG3   chr3_KI270934v1_alt HSCHR3_5_CTG3   NT_187688.1 163458
KI270935.1  CHR_HSCHR3_6_CTG3   chr3_KI270935v1_alt HSCHR3_6_CTG3   NT_187689.1 197351
KI270936.1  CHR_HSCHR3_7_CTG3   chr3_KI270936v1_alt HSCHR3_7_CTG3   NT_187690.1 164170
KI270937.1  CHR_HSCHR3_8_CTG3   chr3_KI270937v1_alt HSCHR3_8_CTG3   NT_187691.1 165607
KI270938.1  CHR_HSCHR19_4_CTG3_1    chr19_KI270938v1_alt    HSCHR19_4_CTG3_1    NT_187693.1 1066800

Preparation

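As of 2018-05-02, the UCSC GRCh38/hg38 genome version is GCA_000001405.15 (Dec. 2013), so this has to be matched against Ensembl's BioMart. There is probably a good tool for this somewhere, but I'll do it the hard way while looking at the data.

The UCSC page shows: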

UCSC Genome Browser assembly ID: hg38
Sequencing/Assembly provider ID: Genome Reference Consortium Human GRCh38 (GCA_000001405.15)
Assembly date: Dec. 2013
Accession ID: GCA_000001405.15

So the assembly's Accession ID is GCA_000001405.15.

Look it up at NCBI.
https://www.ncbi.nlm.nih.gov/assembly/GCA_000001405.15/

The corresponding RefSeq assembly is GCF_000001405.26. Both the page and assembly_report.txt state:

# GenBank assembly accession: GCA_000001405.15
# RefSeq assembly accession: GCF_000001405.26
# RefSeq assembly and GenBank assemblies identical: yes

That is, the GenBank and RefSeq assemblies are identical.

The Ensembl archive list looks like this. (As of 2018-05-02 the latest is release 92, i.e. GRCh38.p12.)

Ensembl GRCh37: Full Feb 2014 archive with BLAST, VEP and BioMart
Ensembl 91: Dec 2017 (GRCh38.p10)
Ensembl 90: Aug 2017 (GRCh38.p10) - patched/updated gene set Jun 2017
Ensembl 89: May 2017 (GRCh38.p10) - patched/updated gene set Jan 2017
Ensembl 88: Mar 2017 (GRCh38.p10)
Ensembl 87: Dec 2016 (GRCh38.p7)
Ensembl 86: Oct 2016 (GRCh38.p7)
Ensembl 85: Jul 2016 (GRCh38.p7) - patched/updated gene set Jun 2016
Ensembl 84: Mar 2016 (GRCh38.p5)
Ensembl 83: Dec 2015 (GRCh38.p5) - patched/updated gene set Oct 2015
Ensembl 82: Sep 2015 (GRCh38.p3)
Ensembl 81: Jul 2015 (GRCh38.p3) - patched/updated gene set Jun 2015
Ensembl 80: May 2015 (GRCh38.p2) - patched/updated gene set Jan 2015
Ensembl 79: Mar 2015 (GRCh38.p2)
Ensembl 78: Dec 2014 (GRCh38)
Ensembl 77: Oct 2014 (GRCh38) - patched/updated gene set Aug 2014
Ensembl 76: Aug 2014 (GRCh38) - gene set updated Jul 2014
Ensembl 75: Feb 2014 (GRCh37.p13)
Ensembl 74: Dec 2013 (GRCh37.p13) - patched/updated gene set Sep 2013
Ensembl 67: May 2012 (GRCh37.p7) - patched/updated gene set Feb 2012
Ensembl 54: May 2009 (NCBI 36) - patched/updated gene set Oct 2008

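The last Ensembl release built on plain GRCh38/hg38 (no patch level) is Ensembl release 78.
Later releases carry patches, which is a hassle.

When I tried downloading data from the BioMart of the Ensembl archive, every name starting with "HSCHR" had a "CHR_" prefix (HSCHR10_1_CTG1 became CHR_HSCHR10_1_CTG1), there were "LRG_..." entries that left me wondering what they are, and there were even names like CHR_HG142_HG150_NOVEL_TEST and CHR_HG151_NOVEL_TEST.

On top of that, the chromosome names in Ensembl's GRCh38/hg38 differ slightly from NCBI's. What's more, they are the same in Ensembl's (or rather EBI ENA's) GCA_000001405.15_sequence_report.txt, yet they come out changed when downloaded through BioMart...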

Ensembl78 GRCh38/hg38        UCSC GRCh38/hg38         NCBI (Name, GenBank, RefSeq)
1                            chr1                     1, CM000663.2, NC_000001.11
MT                           chrM                     MT, J01415.2, NC_012920.1
CHR_HSCHR1_CTG1_UNLOCALIZED  chr1_KI270706v1_random   HSCHR1_CTG1_UNLOCALIZED, KI270706.1, NT_187361.1
CHR_HSCHR2_RANDOM_CTG1       chr2_KI270715v1_random   HSCHR2_RANDOM_CTG1, KI270715.1, NT_187370.1
GL000008.2                   chr4_GL000008v2_random   HSCHR4_RANDOM_CTG4, GL000008.2, NT_113793.3
GL000195.1                   chrUn_GL000195v1         HSCHRUN_RANDOM_CTG1, GL000195.1, NT_113901.1
KI270734.1                   chr22_KI270734v1_random  HSCHR22_UNLOCALIZED_CTG4, KI270734.1, NT_187389.1
KI270442.1                   chrUn_KI270442v1         HSCHRUN_RANDOM_124, KI270442.1, NT_187420.1

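Procedure

For now, reuse the information in the assembly_report of the matching GCF_000001405.26 at NCBI.
This is in fact identical to UCSC's assembly_report.txt, and it contains Sequence-Name, GenBank-Accn, RefSeq-Accn, Sequence-Length and UCSC-style-name all together. So presumably the only thing to watch out for is "HSCHR → CHR_HSCHR".

Looking at Ensembl's GCA_000001405.15_sequence_report.txt and NCBI's GCF_000001405.26_GRCh38_assembly_report.txt, both contain the GenBank accn. Both also have the length. UCSC has hg38.chrom.sizes.txt as well, so I can check that nothing has been dropped.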

When working in the shell, NCBI's GCF_000001405.26_GRCh38_assembly_report.txt has awkward line endings, so run it through nkf --unix or similar first.
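If nkf is not installed, something like the following might do the same job. This is only a sketch under the assumption that the problem is stray carriage returns (CRLF line endings); the sample file names and contents are made up for illustration.

```shell
# Hypothetical sample: two CRLF-terminated lines standing in for the report.
printf 'CM000663.2\tchr1\r\nCM000664.2\tchr2\r\n' > sample_crlf.txt

# Strip carriage returns so that downstream sort/join see clean keys.
tr -d '\r' < sample_crlf.txt > sample_lf.txt

# Count lines that still contain a carriage return (grep -c prints 0 when none).
remaining=$(grep -c "$(printf '\r')" sample_lf.txt || true)
echo "lines with CR after conversion: $remaining"
```

A leftover carriage return at the end of the last field is easy to miss, and it makes otherwise identical values compare as different, so it is worth checking before any join.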

The plan is roughly: join on GenBank accn, then check the lengths and the names.

 $ less NCBI_GRCh38_GCA_000001405.15/GCF_000001405.26_GRCh38_assembly_report.txt | nkf --unix | grep -v "^\#" | perl -ne 'chomp;@l=split(/\t/);print join("\t",@l[4,9,0,2,6,8])."\n";' | sort -u > temp_ncbi.txt

CM000663.2  chr1    1   1   NC_000001.11    248956422
CM000664.2  chr2    2   2   NC_000002.12    242193529
CM000665.2  chr3    3   3   NC_000003.12    198295559
CM000666.2  chr4    4   4   NC_000004.12    190214555
CM000667.2  chr5    5   5   NC_000005.10    181538259
...
KI270935.1  chr3_KI270935v1_alt HSCHR3_6_CTG3   3   NT_187689.1 197351
KI270936.1  chr3_KI270936v1_alt HSCHR3_7_CTG3   3   NT_187690.1 164170
KI270937.1  chr3_KI270937v1_alt HSCHR3_8_CTG3   3   NT_187691.1 165607
KI270938.1  chr19_KI270938v1_alt    HSCHR19_4_CTG3_1    19  NT_187693.1 1066800

 $ less Ensembl_release78_GRCh38_GCA_000001405.15/GCA_000001405.15_sequence_report.txt | nkf --unix | grep -v "^accession" | perl -ne 'chomp;@l=split(/\t/);$l[1]=~s/HSCHR/CHR_HSCHR/;print join("\t",@l[0,1,4,2])."\n";' | sort -u > temp_ensembl.txt

CM000663.2  1   1   248956422
CM000664.2  2   2   242193529
CM000665.2  3   3   198295559
CM000666.2  4   4   190214555
...
KI270935.1  CHR_HSCHR3_6_CTG3   3   197351
KI270936.1  CHR_HSCHR3_7_CTG3   3   164170
KI270937.1  CHR_HSCHR3_8_CTG3   3   165607
KI270938.1  CHR_HSCHR19_4_CTG3_1    19  1066800

 $ less UCSC_hg38_GCA_000001405.15/hg38.chrom.sizes.txt | sort -u > temp_ucsc.txt

chr1    248956422
chr10   133797422
chr10_GL383545v1_alt    179254
chr10_GL383546v1_alt    309802
...
chrX_KI270881v1_alt 144206
chrX_KI270913v1_alt 274009
chrY    57227415
chrY_KI270740v1_random  37240

 $ wc temp_*
     455    1820   18907 temp_ensembl.txt
     455    2730   31541 temp_ncbi.txt
     455     910   11672 temp_ucsc.txt
    1365    5460   60408 total

 $ join -a 1 -t $'\t' temp_ensembl.txt temp_ncbi.txt | perl -ne 'chomp;@l=split(/\t/);print join("\t",($l[4],@l))."\n";' | sort -u > temp_ens_ncbi.txt

 $ wc temp_*
     455    4550   54096 temp_ens_ncbi.txt
     455    1820   18907 temp_ensembl.txt
     455    2730   31541 temp_ncbi.txt
     455     910   11672 temp_ucsc.txt
    1820   10010  114504 total

 $ join -a 1 -t $'\t' temp_ens_ncbi.txt temp_ucsc.txt | cut -f 2- | sort -u > TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc.txt

 $ wc TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc.txt
     455    4550   48466 TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc.txt
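Since join -a 1 keeps lines from the first file even when they have no partner, a row that failed to match can be spotted by its lower field count. A small sketch of that idea, with made-up keys and file names (K3 is deliberately missing from the second file):

```shell
# Hypothetical inputs keyed on the first column; both sorted the same way.
printf 'K1\tnameA\nK2\tnameB\nK3\tnameC\n' | LC_ALL=C sort > sample_left.txt
printf 'K1\tchrA\nK2\tchrB\n'             | LC_ALL=C sort > sample_right.txt

# -a 1 keeps unpaired lines from the first file, so they come out with
# fewer fields than the paired ones. The delimiter is a single tab.
joined=$(LC_ALL=C join -a 1 -t "$(printf '\t')" sample_left.txt sample_right.txt)

# List the keys of rows that did not gain a third field.
unpaired=$(printf '%s\n' "$joined" | awk -F '\t' 'NF < 3 { print $1 }')
echo "unpaired keys: $unpaired"
```

In the real tables above, every row has the same number of fields (4550 words over 455 lines), which suggests that nothing was left unpaired.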

Just to be safe, check that the chromosome lengths agree. It's a hassle, so I take a shortcut.

 $ less TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc.txt | cut -f4,9,10
248956422   248956422   248956422
242193529   242193529   242193529
198295559   198295559   198295559
...
164170  164170  164170
165607  165607  165607
1066800 1066800 1066800

 $ diff --report <(less TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc.txt | cut -f4) <(less TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc.txt | cut -f9)
Files /dev/fd/11 and /dev/fd/12 are identical

 $ diff --report <(less TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc.txt | cut -f4) <(less TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc.txt | cut -f10)
Files /dev/fd/11 and /dev/fd/12 are identical
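The same check could also be done in one pass that prints only the offending rows. A minimal sketch, assuming the 10-column layout above with lengths in columns 4, 9 and 10; the sample rows are invented, and the second one carries a deliberately wrong value:

```shell
# Hypothetical miniature of the 10-column table.
printf 'A\tx\t1\t100\tchrA\tx\t1\tNC_A\t100\t100\n'  > sample_table.txt
printf 'B\ty\t2\t200\tchrB\ty\t2\tNC_B\t200\t999\n' >> sample_table.txt

# Print the key and the three lengths for every row where they differ.
mismatches=$(awk -F '\t' '$4 != $9 || $4 != $10 { print $1, $4, $9, $10 }' sample_table.txt)
echo "$mismatches"
```

Empty output would mean all three sources agree on every length.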

However, when I check this against actual BioMart data, quite a few names fail to match. I download something arbitrary from BioMart and compare.

 $ gzcat martquery_0501xxxxxx_xxx.txt.gz | grep -v "^Chromosome" | cut -f1 | sort -u > list_ens78_biomart_chr.txt

 $ less list_ens78_biomart_chr.txt | wc
      802     802    9134

 $ less list_ens78_biomart_chr.txt | grep -v LRG | grep -v _NOVEL_TEST | wc
     268     268    4931

To begin with, BioMart actually contains fewer entries.

 $ join -a 1 <(less list_ens78_biomart_chr.txt | sort -u) <(less TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc.txt | cut -f2 | sort -u | perl -pe 's/$/\tOK/;') | grep -v OK$ | grep -v LRG | grep -v _NOVEL_TEST > list_ens78_biomart_chr_unmatched.txt

 $ wc list_ens78_biomart_chr_unmatched.txt
      34      34     374 list_ens78_biomart_chr_unmatched.txt

GL000008.2
GL000009.2
GL000194.1
...
KI270744.1
KI270750.1
KI270752.1

So these are left as plain GenBank accn in BioMart. They need to be handled separately. The LRG entries will also need proper treatment at some point.

 $ less TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc.txt | grep -v -f list_ens78_biomart_chr_unmatched.txt > temp_TABLE_matched.txt

 $ less TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc.txt | grep -f list_ens78_biomart_chr_unmatched.txt > temp_TABLE_unmatched.txt

 $ wc temp_TABLE_matched.txt temp_TABLE_unmatched.txt 
     421    4210   44417 temp_TABLE_matched.txt
      34     340    4049 temp_TABLE_unmatched.txt
     455    4550   48466 total 

 $ cat temp_TABLE_matched.txt <(cat temp_TABLE_unmatched.txt | perl -ne 'chomp;@l=split(/\t/);print join("\t",@l[0,0,2..$#l])."\n";') | sort -u > TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc_modified.txt

 $ wc TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc_modified.txt
     455    4550   47939 TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc_modified.txt

 $ join <(less list_ens78_biomart_chr.txt | sort -u) <(less TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc_modified.txt | cut -f2 | sort -u | perl -pe 's/$/\tOK/;' | sort -u) | grep OK | wc
     268     536    5735

By this point I'm completely out of energy. Still, I think the TABLE is now good enough to convert names with...
Print the Table shown above.

 $ less TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc_modified.txt | cut -f1,2,5,6,8,9

Afterword

All that remains is to use this TABLE to build a hash lookup like the one below and swap the names.

 $ cut -f2,5 TABLE_hg38_GRCh38_chr_ens78_ncbi_ucsc_modified.txt | perl -pe 's/^/\"/;s/\t/\"\ \=\>\ \"/;s/$/\"\,/;'
"1" => "chr1",
"2" => "chr2",
"3" => "chr3",
...
"CHR_HSCHR3_7_CTG3" => "chr3_KI270936v1_alt",
"CHR_HSCHR3_8_CTG3" => "chr3_KI270937v1_alt",
"CHR_HSCHR19_4_CTG3_1" => "chr19_KI270938v1_alt",

For downloaded BioMart data, the idea is that anything without a matching hash key (LRG and so on) gets renamed to something like NotAvailable, and is then dropped in one go with grep -v or similar.
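That idea could look roughly like this in awk instead of a Perl hash. It is only a sketch: the map is assumed to be the two-column output of cut -f2,5 on the final table, the BioMart-like rows and gene IDs are invented, and the chromosome name is assumed to sit in column 1.

```shell
# Hypothetical two-column map (Ensembl name -> UCSC name).
printf '1\tchr1\nCHR_HSCHR3_7_CTG3\tchr3_KI270936v1_alt\n' > sample_map.txt

# Hypothetical BioMart-like rows: chromosome name, then a made-up gene ID.
printf '1\tGENE_A\nLRG_1\tGENE_B\nCHR_HSCHR3_7_CTG3\tGENE_C\n' > sample_biomart.txt

# First file loads the map; for the second file, column 1 is rewritten,
# using NotAvailable for names with no entry. grep -v then drops those rows.
converted=$(awk -F '\t' -v OFS='\t' '
    NR == FNR { map[$1] = $2; next }
    { $1 = ($1 in map) ? map[$1] : "NotAvailable"; print }
' sample_map.txt sample_biomart.txt | grep -v '^NotAvailable')
echo "$converted"
```

Keeping the NotAvailable step separate from the grep -v makes it possible to count how many rows would be discarded before actually dropping them.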

LRG and the like are future work. Maybe I'll convert them using data from here and here. Or is there perhaps a LiftOver-like feature for this in the first place...