DNA LLM and genome: survey of 2,200 papers


This list is a revised version of my earlier DNA LLM survey of 2,000 papers:
https://qiita.com/kaizen_nagoya/items/d528200aa52766a51b30

I have added 200 references from some of the articles.
I have also added a sequence number and reference numbers to every entry.
Where the same document appears more than once, I will describe it in a separate article so the duplicates can be removed from the list.
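
The No./S/R numbering used in the table below can be generated mechanically from each paper's reference list. The following is a minimal Python sketch of that scheme; the two sample entries are placeholders for illustration, not the actual source records of this list:

```python
# Minimal sketch (illustrative placeholder data, NOT the actual 2,200-entry source)
# of the numbering scheme used below: S numbers the surveyed papers, R numbers the
# references inside paper S (R = 0 is the paper itself), No. runs over all rows.

papers = [
    {
        "citation": "Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model",
        "references": [],
    },
    {
        "citation": "J. Amberger et al. McKusick's Online Mendelian Inheritance in Man (OMIM).",
        "references": [
            "McKusick VA. On the X Chromosome of Man. Quart. Rev. Biol. 1962;37:69-175.",
        ],
    },
]

no = 0
rows = []
for s, paper in enumerate(papers):  # S starts at 0, matching the first table row
    no += 1
    rows.append((no, s, 0, paper["citation"]))  # R = 0: the surveyed paper itself
    for r, ref in enumerate(paper["references"], start=1):
        no += 1
        rows.append((no, s, r, ref))  # R >= 1: that paper's own references, in order

for row in rows:
    print(*row)  # e.g. "1 0 0 Incentivizing Multimodal Biological Reasoning ..."
```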

Each row below is numbered as No. (running sequence number), S (paper number in this list), and R (reference number within paper S, where R = 0 is the surveyed paper itself), followed by the title and URL (new or Qiita link).

No. S R title and URL
1 0 0 Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model https://arxiv.org/abs/2505.23579 https://qiita.com/kaizen_nagoya/items/0718b214043a614deee0
2 1 0 J. Amberger, C. A. Bocchini, A. F. Scott, and A. Hamosh. McKusick's Online Mendelian Inheritance in Man (OMIM®). Nucleic Acids Research, 37:D793, 2008. ISSN 0305-1048. doi: 10.1093/nar/gkn665. URL https://pmc.ncbi.nlm.nih.gov/articles/PMC2686440/. https://qiita.com/kaizen_nagoya/items/c599d867201d1ffb1f4d
3 1 1 1. McKusick VA. On the X Chromosome of Man. Quart. Rev. Biol. 1962;37:69–175. doi: 10.1086/403631.
4 1 2 2. McKusick VA. Mendelian Inheritance in Man, A Catalog of Autosomal Dominant, Autosomal Recessive, and X-linked Phenotypes. 1st edn. Baltimore, MD: Johns Hopkins University Press; 1966.
5 1 3 3. McKusick VA. Mendelian Inheritance in Man, A Catalog of Human Genes and Genetic Disorders. 12th edn. Baltimore, MD: Johns Hopkins University Press; 1998.
6 1 4 4. Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA. Online Mendelian Inheritance in Man (OMIM®), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 2005;33:514–517. doi: 10.1093/nar/gki033.
7 2 0 Anthropic. Claude 3.7 Sonnet, February 2025. URL https://www.anthropic.com/news/claude-3-7-sonnet. Accessed: 2025-05-13. https://qiita.com/kaizen_nagoya/items/4364d9c475114353cf2a
8 3 0 G. Benegas, C. Ye, C. Albors, J. C. Li, and Y. S. Song. Genomic language models: Opportunities and challenges. ArXiv, page arXiv:2407.11435v2, 9 2024. ISSN 2331-8422. URL https://pmc.ncbi.nlm.nih.gov/articles/PMC11275703/ http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC11275703. https://pubmed.ncbi.nlm.nih.gov/39753409/ https://qiita.com/kaizen_nagoya/items/f797330e64e0c7d05f39
9 3 1 1. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. Attention is all you need. In: Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R., eds. Advances in Neural Information Processing Systems vol. 30. Curran Associates, Inc. (2017).
10 3 2 2. Gulati, A., Qin, J., Chiu, C.-C., Parmar, N., Zhang, Y., Yu, J., Han, W., Wang, S., Zhang, Z., Wu, Y. et al. (2020). Conformer: Convolution-augmented transformer for speech recognition. arXiv preprint arXiv:2005.08100. https://arxiv.org/abs/2005.08100.
11 3 3 3. Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F. L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S. et al. (2023). GPT-4 technical report. arXiv preprint arXiv:2303.08774. https://arxiv.org/abs/2303.08774.
12 3 4 4. Bateman, A., Martin, M.-J., Orchard, S., Magrane, M., Ahmad, S., Alpi, E., Bowler-Barnett, E. H., Britto, R., Cukura, A., Denny, P. et al. (2023). UniProt: the universal protein knowledgebase in 2023. Nucleic Acids Research 51, D523–D531.
13 3 5 5. Lin, Z., Akin, H., Rao, R., Hie, B., Zhu, Z., Lu, W., Smetanin, N., Verkuil, R., Kabeli, O., Shmueli, Y. et al. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130.
14 3 6 6. Meier, J., Rao, R., Verkuil, R., Liu, J., Sercu, T., and Rives, A. Language models enable zero-shot prediction of the effects of mutations on protein function. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., eds. Advances in Neural Information Processing Systems vol. 34. Curran Associates, Inc. (2021): 29287–29303. https://proceedings.neurips.cc/paper_files/paper/2021/file/f51338d736f95dd42427296047067694-Paper.pdf.
15 3 7 7. Truong Jr, T., and Bepler, T. PoET: A generative model of protein families as sequences-of-sequences. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., and Levine, S., eds. Advances in Neural Information Processing Systems vol. 36. Curran Associates, Inc. (2023): 77379–77415. https://proceedings.neurips.cc/paper_files/paper/2023/file/f4366126eba252699b280e8f93c0ab2f-Paper-Conference.pdf.
16 3 8 8. Bepler, T., and Berger, B. (2021). Learning the protein language: Evolution, structure, and function. Cell Systems 12, 654–669.
17 3 9 9. Ruffolo, J. A., and Madani, A. (2024). Designing proteins with language models. Nature Biotechnology 42, 200–202.
18 3 10 10. Riesselman, A. J., Ingraham, J. B., and Marks, D. S. (2018). Deep generative models of genetic variation capture the effects of mutations. Nature Methods 15, 816–822.
19 3 11 11. Frazer, J., Notin, P., Dias, M., Gomez, A., Min, J. K., Brock, K., Gal, Y., and Marks, D. S. (2021). Disease variant prediction with deep generative models of evolutionary data. Nature 599, 91–95.
20 3 12 12. Brandes, N., Goldman, G., Wang, C. H., Ye, C. J., and Ntranos, V. (2023). Genome-wide prediction of disease variant effects with a deep protein language model. Nature Genetics. https://doi.org/10.1038/s41588-023-01465-0. doi:10.1038/s41588-023-01465-0.
21 3 13 13. Benegas, G., Batra, S. S., and Song, Y. S. (2023). DNA language models are powerful predictors of genome-wide variant effects. Proceedings of the National Academy of Sciences 120,e2311219120.
22 3 14 14. Mendoza-Revilla, J., Trop, E., Gonzalez, L., Roller, M., Dalla-Torre, H., de Almeida, B. P., Richard, G., Caton, J., Lopez Carranza, N., Skwark, M., Laterre, A., Beguir, K., Pierrot, T., and Lopez, M. (2024). A foundational large language model for edible plant genomes. Communications Biology 7, 835. https://doi.org/10.1038/s42003-024-06465-2. doi:10.1038/s42003-024-06465-2.
23 3 15 15. Zhai, J., Gokaslan, A., Schiff, Y., Berthel, A., Liu, Z.-Y., Miller, Z. R., Scheben, A., Stitzer, M. C., Romay, C., Buckler, E. S., and Kuleshov, V. (2024). Cross-species plant genomes modeling at single nucleotide resolution using a pre-trained DNA language model. bioRxiv preprint. https://www.biorxiv.org/content/early/2024/06/05/2024.06.04.596709. doi:10.1101/2024.06.04.596709.
24 3 16 16. Dalla-Torre, H., Gonzalez, L., Mendoza Revilla, J., Lopez Carranza, N., Henryk Grywaczewski, A., Oteri, F., Dallago, C., Trop, E., Sirelkhatim, H., Richard, G. et al. (2023). The Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.01.11.523679v3.
25 3 17 17. Benegas, G., Albors, C., Aw, A. J., Ye, C., and Song, Y. S. (2023). GPN-MSA: an alignment-based DNA language model for genome-wide variant effect prediction. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.10.10.561776v2.
26 3 18 18. Hsu, C., Nisonoff, H., Fannjiang, C., and Listgarten, J. (2022). Learning protein fitness models from evolutionary and assay-labeled data. Nature Biotechnology 40, 1114–1122.
27 3 19 19. Tomaz da Silva, P., Karollus, A., Hingerl, J., Galindez, G., Wagner, N., Hernandez-Alias, X., Incarnato, D., and Gagneur, J. (2024). Nucleotide dependency analysis of DNA language models reveals genomic functional elements. bioRxiv preprint (2024–07). https://www.biorxiv.org/content/10.1101/2024.07.27.605418v1.
28 3 20 20. Siepel, A., Bejerano, G., Pedersen, J. S., Hinrichs, A. S., Hou, M., Rosenbloom, K., Clawson, H., Spieth, J., Hillier, L. W., Richards, S. et al. (2005). Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Research 15, 1034–1050.
29 3 21 21. Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R., and Siepel, A. (2010). Detection of nonneutral substitution rates on mammalian phylogenies. Genome Research 20, 110–121.
30 3 22 22. Avsec, Z., Agarwal, V., Visentin, D., Ledsam, J. R., Grabska-Barwinska, A., Taylor, K. R., Assael, Y., Jumper, J., Kohli, P., and Kelley, D. R. (2021). Effective gene expression prediction from sequence by integrating long-range interactions. Nature Methods 18, 1196–1203.
31 3 23 23. Jaganathan, K., Panagiotopoulou, S. K., McRae, J. F., Darbandi, S. F., Knowles, D., Li, Y. I., Kosmicki, J. A., Arbelaez, J., Cui, W., Schwartz, G. B. et al. (2019). Predicting splicing from primary sequence with deep learning. Cell 176, 535–548.
32 3 24 24. Schiff, Y., Kao, C.-H., Gokaslan, A., Dao, T., Gu, A., and Kuleshov, V. (2024). Caduceus: Bi-directional equivariant long-range DNA sequence modeling. arXiv preprint arXiv:2403.03234. https://arxiv.org/abs/2403.03234.
33 3 25 25. Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., Hesse, C., Chen, M., Sigler, E., Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., McCandlish, S., Radford, A., Sutskever, I., and Amodei, D. Language models are few-shot learners. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., eds. Advances in Neural Information Processing Systems vol. 33. Curran Associates, Inc. (2020): 1877–1901. https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf.
34 3 26 26. Madani, A., Krause, B., Greene, E. R., Subramanian, S., Mohr, B. P., Holton, J. M., Olmos, J. L., Xiong, C., Sun, Z. Z., Socher, R. et al. (2023). Large language models generate functional protein sequences across diverse families. Nature Biotechnology 41, 1099–1106.
35 3 27 27. Ingraham, J., Garg, V., Barzilay, R., and Jaakkola, T. Generative models for graph-based protein design. In: Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., and Garnett, R., eds. Advances in Neural Information Processing Systems vol. 32. Curran Associates, Inc. (2019): https://proceedings.neurips.cc/paper_files/paper/2019/file/f3a4ff4839c56a5f460c88cce3666a2b-Paper.pdf.
36 3 28 28. Hsu, C., Verkuil, R., Liu, J., Lin, Z., Hie, B., Sercu, T., Lerer, A., and Rives, A. Learning inverse folding from millions of predicted structures. In: International Conference on Machine Learning. PMLR (2022): 8946–8970.
37 3 29 29. Shin, J.-E., Riesselman, A. J., Kollasch, A. W., McMahon, C., Simon, E., Sander, C., Manglik, A., Kruse, A. C., and Marks, D. S. (2021). Protein design and variant prediction using autoregressive generative models. Nature Communications 12, 2403.
38 3 30 30. Lal, A., Garfield, D., Biancalani, T., and Eraslan, G. regLM: Designing realistic regulatory DNA with autoregressive language models. In: International Conference on Research in Computational Molecular Biology. Springer (2024): 332–335.
39 3 31 31. Nguyen, E., Poli, M., Durrant, M. G., Thomas, A. W., Kang, B., Sullivan, J., Ng, M. Y., Lewis, A., Patel, A., Lou, A. et al. (2024). Sequence modeling and design from molecular to genome scale with Evo. bioRxiv preprint (2024–02). https://www.biorxiv.org/content/10.1101/2024.02.27.582234v2.
40 3 32 32. Wang, Y., Wang, H., Wei, L., Li, S., Liu, L., and Wang, X. (2020). Synthetic promoter design in Escherichia coli based on a deep generative network. Nucleic Acids Research 48,6403–6412.
41 3 33 33. Jores, T., Tonnies, J., Wrightsman, T., Buckler, E. S., Cuperus, J. T., Fields, S., and Queitsch, C. (2021). Synthetic promoter designs enabled by a comprehensive analysis of plant core promoters. Nature Plants 7, 842–855.
42 3 34 34. de Almeida, B. P., Reiter, F., Pagani, M., and Stark, A. (2022). DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers. Nature Genetics 54, 613–624.
43 3 35 35. Nguyen, E., Poli, M., Faizi, M., Thomas, A., Wornow, M., Birch-Sykes, C., Massaroli, S., Patel, A., Rabideau, C., Bengio, Y., Ermon, S., Ré, C., and Baccus, S. HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., and Levine, S., eds. Advances in Neural Information Processing Systems vol. 36. Curran Associates, Inc. (2023): 43177–43201.
45 3 37 36. Shao, B. (2023). A long-context language model for deciphering and generating bacteriophage genomes. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.12.18.572218v3.
46 3 38 37. Ratcliff, J. D. (2024). Transformer model generated bacteriophage genomes are compositionally distinct from natural sequences. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.03.19.585716v1.
47 3 39 38. Alipanahi, B., Delong, A., Weirauch, M. T., and Frey, B. J. (2015). Predicting the sequence specificities of DNA-and RNA-binding proteins by deep learning. Nature Biotechnology 33, 831–838.
48 3 40 39. Zhou, J., and Troyanskaya, O. G. (2015). Predicting effects of noncoding variants with deep learning–based sequence model. Nature Methods 12, 931–934.
49 3 41 40. Kelley, D. R., Snoek, J., and Rinn, J. L. (2016). Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Research 26, 990–999.
50 3 42 41. Kelley, D. R., Reshef, Y. A., Bileschi, M., Belanger, D., McLean, C. Y., and Snoek, J. (2018). Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Research 28, 739–750.
51 3 43 42. Zeng, T., and Li, Y. I. (2022). Predicting RNA splicing from DNA sequence using Pangolin. Genome Biology 23, 103. https://genomebiology.biomedcentral.com/articles/10.1186/s13059-022-02664-4. doi:10.1186/s13059-022-02664-4.
52 3 44 43. Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In: Burstein, J., Doran, C., and Solorio, T., eds. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics (2019): 4171–4186. https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.
53 3 45 44. Bommasani, R., Hudson, D. A. et al. (2021). On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258. https://arxiv.org/abs/2108.07258.
54 3 46 45. West-Roberts, J., Kravitz, J., Jha, N., Cornman, A., and Hwang, Y. (2024). Diverse genomic embedding benchmark for functional evaluation across the tree of life. bioRxiv (2024–07). https://www.biorxiv.org/content/10.1101/2024.07.10.602933v1.
55 3 47 46. de Almeida, B. P., Dalla-Torre, H., Richard, G., Blum, C., Hexemer, L., Gélard, M., Mendoza-Revilla, J., Pandey, P., Laurent, S., Lopez, M. et al. (2024). SegmentNT: annotating the genome at single-nucleotide resolution with DNA foundation models. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.03.14.584712v2.
56 3 48 47. Zhou, Z., Wu, W., Ho, H., Wang, J., Shi, L., Davuluri, R. V., Wang, Z., and Liu, H. (2024). DNABERT-S: Learning species-aware dna embedding with genome foundation models. arXiv preprint. https://arxiv.org/abs/2402.08777.
57 3 49 48. Zhou, Z., Ji, Y., Li, W., Dutta, P., Davuluri, R., and Liu, H. (2023). DNABERT-2: Efficient foundation model and benchmark for multi-species genome. arXiv preprint arXiv:2306.15006. https://arxiv.org/abs/2306.15006.
58 3 50 49. Garau-Luis, J. J., Bordes, P., Gonzalez, L., Roller, M., de Almeida, B. P., Hexemer, L., Blum, C., Laurent, S., Grzegorzewski, J., Lang, M. et al. (2024). Multi-modal transfer learning between biological foundation models. arXiv preprint arXiv:2406.14150.
59 3 51 50. Marin, F. I., Teufel, F., Horlacher, M., Madsen, D., Pultz, D., Winther, O., and Boomsma, W. BEND: Benchmarking DNA Language Models on Biologically Meaningful Tasks. In: International Conference on Learning Representations (2024).
60 3 52 51. Tang, Z., and Koo, P. K. (2024). Evaluating the representational power of pre-trained DNA language models for regulatory genomics. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.02.29.582810v1.
61 3 53 52. Li, F.-Z., Amini, A. P., Yue, Y., Yang, K. K., and Lu, A. X. (2024). Feature reuse and scaling: Understanding transfer learning with protein language models. bioRxiv preprint (2024–02).
62 3 54 53. Zaheer, M., Guruganesh, G., Dubey, K. A., Ainslie, J., Alberti, C., Ontanon, S., Pham, P., Ravula, A., Wang, Q., Yang, L., and Ahmed, A. Big Bird: Transformers for Longer Sequences. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., eds. Advances in Neural Information Processing Systems vol. 33. Curran Associates, Inc. (2020): 17283–17297.
63 3 55 54. Ji, Y., Zhou, Z., Liu, H., and Davuluri, R. V. (2021). DNABERT: pre-trained bidirectional encoder representations from Transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120.
64 3 56 55. Mo, S., Fu, X., Hong, C., Chen, Y., Zheng, Y., Tang, X., Lan, Y., Shen, Z., and Xing, E. Multi-modal Self-supervised Pre-training for Large-scale Genome Data. In: NeurIPS 2021 AI for Science Workshop (2021).
65 3 57 56. Trotter, M. V., Nguyen, C. Q., Young, S., Woodruff, R. T., and Branson, K. M. (2021). Epigenomic language models powered by Cerebras. arXiv preprint arXiv:2112.07571. https://arxiv.org/abs/2112.07571.
66 3 58 57. Zhang, Y., An, L., Yue, F., and Hardison, R. C. (2016). Jointly characterizing epigenetic dynamics across multiple human cell types. Nucleic Acids Research 44, 6721–6731.
67 3 59 58. Hoarfrost, A., Aptekmann, A., Farfañuk, G., and Bromberg, Y. (2022). Deep learning of a bacterial and archaeal universal language of life enables transfer learning and illuminates microbial dark matter. Nature Communications 13, 2606.
68 3 60 59. Yang, M., Huang, L., Huang, H., Tang, H., Zhang, N., Yang, H., Wu, J., and Mu, F. (2022). Integrating convolution and self-attention improves language model of human genome for interpreting non-coding regions at base-resolution. Nucleic Acids Research 50, e81–e81.
69 3 61 60. Gwak, H.-J., and Rho, M. (2022). ViBE: a hierarchical BERT model to identify eukaryotic viruses using metagenome sequencing data. Briefings in Bioinformatics 23, bbac204. doi:10.1093/bib/bbac204.
70 3 62 61. Levy, B., Xu, Z., Zhao, L., Kremling, K., Altman, R., Wong, P., and Tanner, C. FloraBERT: cross-species transfer learning with attention-based neural networks for gene expression prediction (2022). https://doi.org/10.21203/rs.3.rs-1927200/v1. doi:10.21203/rs.3.rs-1927200/v1.
71 3 63 62. Bai, Z., Zhang, Y.-z., Miyano, S., Yamaguchi, R., Fujimoto, K., Uematsu, S., and Imoto, S. (2022). Identification of bacteriophage genome sequences with representation learning. Bioinformatics, btac509.
72 3 64 63. Zvyagin, M., Brace, A., Hippe, K., Deng, Y., Zhang, B., Bohorquez, C. O., Clyde, A., Kale, B., Perez-Rivera, D., Ma, H. et al. (2023). GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics. The International Journal of High Performance Computing Applications 37, 683–705.
73 3 65 64. Chen, K., Zhou, Y., Ding, M., Wang, Y., Ren, Z., and Yang, Y. (2024). Self-supervised learning on millions of primary RNA sequences from 72 vertebrates improves sequence-based RNA splicing prediction. Briefings in Bioinformatics 25, bbae163.
74 3 66 65. Karollus, A., Hingerl, J., Gankin, D., Grosshauser, M., Klemon, K., and Gagneur, J. (2024). Species-aware DNA language models capture regulatory elements and their evolution. Genome Biology 25, 83.
75 3 67 66. Fishman, V., Kuratov, Y., Petrov, M., Shmelev, A., Shepelin, D., Chekanov, N., Kardymon, O., and Burtsev, M. (2023). GENA-LM: A Family of Open-Source Foundational Models for Long DNA Sequences. bioRxiv preprint. https://www.biorxiv.org/content/early/2023/06/13/2023.06.12.544594. doi:10.1101/2023.06.12.544594.
76 3 68 67. Sanabria, M., Hirsch, J., and Poetsch, A. R. (2023). The human genome’s vocabulary as proposed by the DNA language model GROVER. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.07.19.549677v2.
77 3 69 68. Zhang, D., Zhang, W., He, B., Zhang, J., Qin, C., and Yao, J. (2023). DNAGPT: A generalized pretrained tool for multiple DNA sequence analysis tasks. bioRxiv preprint. https://arxiv.org/abs/2307.05628.
78 3 70 69. Chu, Y., Yu, D., Li, Y., Huang, K., Shen, Y., Cong, L., Zhang, J., and Wang, M. (2024). A 5’ UTR language model for decoding untranslated regions of mRNA and function predictions. Nature Machine Intelligence 6, 449–460.
79 3 71 70. Lorenz, R., Bernhart, S. H., Höner zu Siederdissen, C., Tafer, H., Flamm, C., Stadler, P. F., and Hofacker, I. L. (2011). ViennaRNA Package 2.0. Algorithms for Molecular Biology 6, 1–14.
80 3 72 71. Robson, E. S., and Ioannidis, N. M. (2023). GUANinE v1.0: Benchmark Datasets for Genomic AI Sequence-to-Function Models. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.10.12.562113v3.
81 3 73 72. Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., and Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research 21, 1–67. http://jmlr.org/papers/v21/20-074.html.
82 3 74 73. Kudo, T. Subword regularization: Improving neural network translation models with multiple subword candidates. In: Gurevych, I., and Miyao, Y., eds. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Melbourne, Australia: Association for Computational Linguistics (2018): 66–75. https://aclanthology.org/P18-1007. doi:10.18653/v1/P18-1007.
83 3 75 74. Richard, G., de Almeida, B. P., Dalla-Torre, H., Blum, C., Hexemer, L., Pandey, P., Laurent, S., Lopez, M. P., Laterre, A., Lang, M. et al. (2024). ChatNT: A Multimodal Conversational Agent for DNA, RNA and Protein Tasks. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.04.30.591835v1.
84 3 76 75. Chiang, W.-L., Li, Z., Lin, Z., Sheng, Y., Wu, Z., Zhang, H., Zheng, L., Zhuang, S., Zhuang, Y., Gonzalez, J. E., Stoica, I., and Xing, E. P. Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality (2023). https://lmsys.org/blog/2023-03-30-vicuna/.
85 3 77 76. He, Y., Fang, P., Shan, Y., Pan, Y., Wei, Y., Chen, Y., Chen, Y., Liu, Y., Zeng, Z., Zhou, Z. et al. (2024). LucaOne: Generalized Biological Foundation Model with Unified Nucleic Acid and Protein Language. bioRxiv preprint (2024–05). https://www.biorxiv.org/content/10.1101/2024.05.10.592927v1.
86 3 78 77. Zhu, X., Qin, C., Wang, F., Yang, F., He, B., Zhao, Y., and Yao, J. (2024). CD-GPT: A Biological Foundation Model Bridging the Gap between Molecular Sequences Through Central Dogma. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.06.24.600337v1.
87 3 79 78. Cornman, A., West-Roberts, J., Camargo, A. P., Roux, S., Beracochea, M., Mirdita, M., Ovchinnikov, S., and Hwang, Y. (2024). The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling. bioRxiv preprint (2024–08). https://www.biorxiv.org/content/10.1101/2024.08.14.607850v1.
88 3 80 79. Markowitz, V. M., Chen, I.-M. A., Palaniappan, K., Chu, K., Szeto, E., Grechkin, Y., Ratner, A., Jacob, B., Huang, J., Williams, P. et al. (2012). IMG: the integrated microbial genomes database and comparative analysis system. Nucleic Acids Research 40, D115–D122.
89 3 81 80. Richardson, L., Allen, B., Baldi, G., Beracochea, M., Bileschi, M. L., Burdett, T., Burgin, J., Caballero-Pérez, J., Cochrane, G., Colwell, L. J. et al. (2023). MGnify: the microbiome sequence data analysis resource in 2023. Nucleic Acids Research 51, D753–D759.
90 3 82 81. Gao, L., Biderman, S., Black, S., Golding, L., Hoppe, T., Foster, C., Phang, J., He, H., Thite, A., Nabeshima, N., Presser, S., and Leahy, C. (2020). The Pile: An 800GB Dataset of Diverse Text for Language Modeling. arXiv preprint arXiv:2101.00027. https://arxiv.org/abs/2101.00027.
91 3 83 82. Longpre, S., Biderman, S., Albalak, A., Schoelkopf, H., McDuff, D., Kapoor, S., Klyman, K., Lo, K., Ilharco, G., San, N. et al. (2024). The responsible foundation model development cheatsheet: A review of tools & resources. arXiv preprint arXiv:2406.16746. https://arxiv.org/abs/2406.16746.
92 3 84 83. Sullivan, P. F., Meadows, J. R., Gazal, S., Phan, B. N., Li, X., Genereux, D. P., Dong, M. X., Bianchi, M., Andrews, G., Sakthikumar, S. et al. (2023). Leveraging base-pair mammalian constraint to understand genetic variation and human disease. Science 380, eabn2937.
93 3 85 84. Lee, K., Ippolito, D., Nystrom, A., Zhang, C., Eck, D., Callison-Burch, C., and Carlini, N. Deduplicating training data makes language models better. In: Muresan, S., Nakov, P., and Villavicencio, A., eds. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Dublin, Ireland: Association for Computational Linguistics (2022): 8424–8445. https://aclanthology.org/2022.acl-long.577. doi:10.18653/v1/2022.acl-long.577.
94 3 86 85. Schoenfelder, S., and Fraser, P. (2019). Long-range enhancer–promoter contacts in gene expression control. Nature Reviews Genetics 20, 437–455.
95 3 87 86. Karnuta, J. M., and Scacheri, P. C. (2018). Enhancers: bridging the gap between gene control and human disease. Human Molecular Genetics 27, R219–R227.
96 3 88 87. King, J. L., and Jukes, T. H. (1969). Non-darwinian evolution. Science 164, 788–798. doi:10.1126/science.164.3881.788.
97 3 89 88. Tay, Y., Dehghani, M., Gupta, J. P., Aribandi, V., Bahri, D., Qin, Z., and Metzler, D. Are pretrained convolutions better than pretrained transformers? In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (2021): 4349–4359. https://aclanthology.org/2021.acl-long.335/.
98 3 90 89. Yang, K. K., Fusi, N., and Lu, A. X. (2024). Convolutions are competitive with transformers for protein sequence pretraining. Cell Systems 15, 286–294.
99 3 91 90. Linder, J., Srivastava, D., Yuan, H., Agarwal, V., and Kelley, D. R. (2023). Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.08.30.555582v1.
100 3 92 91. Su, J., Ahmed, M., Lu, Y., Pan, S., Bo, W., and Liu, Y. (2024). Roformer: Enhanced transformer with rotary position embedding. Neurocomputing 568, 127063.
101 3 93 92. Dai, Z., Yang, Z., Yang, Y., Carbonell, J. G., Le, Q. V., and Salakhutdinov, R. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context. In: Korhonen, A., Traum, D. R., and Màrquez, L., eds. Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28–August 2, 2019, Volume 1: Long Papers. Association for Computational Linguistics (2019): 2978–2988. https://doi.org/10.18653/v1/p19-1285. doi:10.18653/V1/P19-1285.
102 3 94 93. Yu, L., Simig, D., Flaherty, C., Aghajanyan, A., Zettlemoyer, L., and Lewis, M. MEGABYTE: Predicting million-byte sequences with multiscale transformers. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., and Levine, S., eds. Advances in Neural Information Processing Systems vol. 36. Curran Associates, Inc. (2023): 78808–78823. https://proceedings.neurips.cc/paper_files/paper/2023/file/f8f78f8043f35890181a824e53a57134-Paper-Conference.pdf.
103 3 95 94. Gu, A., Goel, K., and Re, C. Efficiently modeling long sequences with structured state spaces. In: International Conference on Learning Representations (2022).
104 3 96 95. Poli, M., Massaroli, S., Nguyen, E., Fu, D. Y., Dao, T., Baccus, S., Bengio, Y., Ermon, S., and Ré, C. Hyena Hierarchy: Towards larger convolutional language models. In: International Conference on Machine Learning. PMLR (2023): 28043–28078.
105 3 97 96. Gu, A., and Dao, T. (2023). Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752. https://arxiv.org/abs/2312.00752.
106 3 98 97. Rives, A., Meier, J., Sercu, T., Goyal, S., Lin, Z., Liu, J., Guo, D., Ott, M., Zitnick, C. L., Ma, J., and Ferguson, A. L. (2021). Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proceedings of the National Academy of Sciences 118, e2016239118.
107 3 99 98. Cheng, X., Chen, B., Li, P., Gong, J., Tang, J., and Song, L. (2024). Training compute-optimal protein language models. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.06.06.597716v1.
108 3 100 99. Samuel, D. (2024). BERTs are Generative In-Context Learners. arXiv preprint arXiv:2406.04823. https://arxiv.org/abs/2406.04823.
109 3 101 100. Hayes, T., Rao, R., Akin, H., Sofroniew, N. J., Oktay, D., Lin, Z., Verkuil, R., Tran, V. Q., Deaton, J., Wiggert, M. et al. (2024). Simulating 500 million years of evolution with a language model. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.07.01.600583v1.
110 3 102 101. Sennrich, R., Haddow, B., and Birch, A. Neural machine translation of rare words with subword units. In: Erk, K., and Smith, N. A., eds. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Berlin, Germany: Association for Computational Linguistics (2016): 1715–1725. https://aclanthology.org/P16-1162. doi:10.18653/v1/P16-1162.
111 3 103 102. Blanchette, M., Kent, W. J., Riemer, C., Elnitski, L., Smit, A. F., Roskin, K. M., Baertsch, R., Rosenbloom, K., Clawson, H., Green, E. D. et al. (2004). Aligning multiple genomic sequences with the threaded blockset aligner. Genome Research 14, 708–715.
112 3 104 103. Armstrong, J., Hickey, G., Diekhans, M., Fiddes, I. T., Novak, A. M., Deran, A., Fang, Q., Xie, D., Feng, S., Stiller, J. et al. (2020). Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature 587, 246–251.
113 3 105 104. Song, B., Buckler, E. S., and Stitzer, M. C. (2024). New whole-genome alignment tools are needed for tapping into plant diversity. Trends in Plant Science 29, 355–369.
114 3 106 105. Phan, M. H., Zehnder, T. M., Puntieri, F., Lo, B.-W., Lenhard, B., Mueller, F., Vingron, M., and Ibrahim, D. M. (2024). Conservation of regulatory elements with highly diverged sequences across large evolutionary distances. bioRxiv preprint (2024–05). https://www.biorxiv.org/content/10.1101/2024.05.13.590087v1.
115 3 107 106. Linardatos, P., Papastefanopoulos, V., and Kotsiantis, S. (2020). Explainable AI: A review of machine learning interpretability methods. Entropy 23, 18.
116 3 108 107. Zhang, Y., Tiňo, P., Leonardis, A., and Tang, K. (2021). A survey on neural network interpretability. IEEE Transactions on Emerging Topics in Computational Intelligence 5, 726–742.
117 3 109 108. Talukder, A., Barham, C., Li, X., and Hu, H. (2021). Interpretation of deep learning in genomics and epigenomics. Briefings in Bioinformatics 22, bbaa177.
118 3 110 109. Shrikumar, A., Tian, K., Avsec, Z., Shcherbina, A., Banerjee, A., Sharmin, M., Nair, S., and Kundaje, A. (2018). Technical note on transcription factor motif discovery from importance scores (TF-MoDISco) version 0.5.6.5. arXiv preprint arXiv:1811.00416. https://arxiv.org/abs/1811.00416.
119 3 111 110. Fowler, D. M., Adams, D. J., Gloyn, A. L., Hahn, W. C., Marks, D. S., Muffley, L. A., Neal, J. T., Roth, F. P., Rubin, A. F., Starita, L. M., and Hurles, M. E. (2023). An Atlas of Variant Effects to understand the genome at nucleotide resolution. Genome Biology 24, 147.
120 3 112 111. Kircher, M., Xiong, C., Martin, B., Schubach, M., Inoue, F., Bell, R. J. A., Costello, J. F., Shendure, J., and Ahituv, N. (2019). Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution. Nature Communications 10. doi:10.1038/s41467-019-11526-w.
121 3 113 112. Findlay, G. M., Daza, R. M., Martin, B., Zhang, M. D., Leith, A. P., Gasperini, M., Janizek, J. D., Huang, X., Starita, L. M., and Shendure, J. (2018). Accurate classification of BRCA1 variants with saturation genome editing. Nature 562, 217–222.
122 3 114 113. Notin, P., Kollasch, A. W., Ritter, D., Niekerk, L. V., Paul, S., Spinner, H., Rollins, N. J., Shaw, A., Orenbuch, R., Weitzman, R., Frazer, J., Dias, M., Franceschi, D., Gal, Y., and Marks, D. S. ProteinGym: Large-Scale Benchmarks for Protein Fitness Prediction and Design. In: Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2023): https://openreview.net/forum?id=URoZHqAohf.
123 3 115 114. Landrum, M. J., Lee, J. M., Benson, M., Brown, G. R., Chao, C., Chitipiralla, S., Gu, B., Hart, J., Hoffman, D., Jang, W. et al. (2016). ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Research 44, D862–D868.
124 3 116 115. Stenson, P. D., Mort, M., Ball, E. V., Evans, K., Hayden, M., Heywood, S., Hussain, M., Phillips, A. D., and Cooper, D. N. (2017). The Human Gene Mutation Database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studies. Human Genetics 136, 665–677. doi:10.1007/s00439-017-1779-6.
125 3 117 116. Amberger, J. S., Bocchini, C. A., Schiettecatte, F., Scott, A. F., and Hamosh, A. (2015). OMIM.org: Online Mendelian Inheritance in Man (OMIM®), an online catalog of human genes and genetic disorders. Nucleic Acids Research 43, D789–D798.
126 3 118 117. Pritchard, J. K., and Cox, N. (2002). The allelic architecture of human disease genes: common disease–common variant...or not? Human Molecular Genetics 11, 2417–2423. doi:10.1093/hmg/11.20.2417.
127 3 119 118. Karczewski, K. J., Francioli, L. C., Tiao, G., Cummings, B. B., Alföldi, J., Wang, Q., Collins, R. L., Laricchia, K. M., Ganna, A., Birnbaum, D. P., Gauthier, L. D., Brand, H., Solomonson, M., Watts, N. A., Rhodes, D., Singer-Berk, M., England, E. M., Seaby, E. G., Kosmicki, J. A., Walters, R. K., Tashman, K., Farjoun, Y., Banks, E., Poterba, T., Consortium, G. A. D., and MacArthur, D. G. (2020). The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443. doi:10.1038/s41586-020-2308-7.
128 3 120 119. Vapnik, V. N. The Nature of Statistical Learning Theory. New York: Springer (1999).
129 3 121 120. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., and Fei-Fei, L. (2015). ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV) 115, 211–252. doi:10.1007/s11263-015-0816-y.
130 3 122 121. Moult, J., Fidelis, K., Kryshtafovych, A., Schwede, T., and Tramontano, A. (2018). Critical assessment of methods of protein structure prediction (CASP)—round XII. Proteins: Structure, Function, and Bioinformatics 86, 7–15.
131 3 123 122. Johnson, A. D., Handsaker, R. E., Pulit, S. L., Nizzari, M., O’Donnell, C. J., and de Bakker, P. I. (2017). CAGI: The Critical Assessment of Genome Interpretation. Genome Biology 18, 1–5.
132 3 124 123. Grimm, D. G., Azencott, C.-A., Aicheler, F., Gieraths, U., MacArthur, D. G., Samocha, K. E., Cooper, D. N., Stenson, P. D., Daly, M. J., Smoller, J. W. et al. (2015). The evaluation of tools used to predict the impact of missense variants is hindered by two types of circularity. Human Mutation 36, 513–523.
133 3 125 124. Hartl, D. L., Clark, A. G., and Clark, A. G. Principles of population genetics vol. 116. Sinauer associates Sunderland, MA (1997).
134 3 126 125. Livesey, B. J., Badonyi, M., Dias, M., Frazer, J., Kumar, S., Lindorff-Larsen, K., McCandlish, D. M., Orenbuch, R., Shearer, C. A., Muffley, L. et al. (2024). Guidelines for releasing a variant effect predictor. arXiv preprint. https://arxiv.org/abs/2404.10807.
135 3 127 126. Gupta, A., Lal, A., Gunsalus, L. M., Biancalani, T., and Eraslan, G. (2023). Polygraph: A software framework for the systematic assessment of synthetic regulatory DNA elements. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.11.27.568764v2.
136 3 128 127. Consortium, E. P. et al. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74.
137 3 129 128. Kundaje, A., Meuleman, W., Ernst, J. et al. (2015). Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330.
138 3 130 129. Grešová, K., Martinek, V., Čechák, D., Simeček, P., and Alexiou, P. (2023). Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genomic Data 24, Article number: 25.
139 3 131 130. Helfrich, G. (2024). The harms of terminology: why we should reject so-called "frontier AI". AI and Ethics (1–7).
140 4 0 G. Benegas, C. Albors, A. J. Aw, C. Ye, and Y. S. Song. A DNA language model based on multispecies alignment predicts the effects of genome-wide variants. Nature Biotechnology, pages 1–6, 1 2025. ISSN 1546-1696. doi: 10.1038/s41587-024-02511-w. URL https://www.nature.com/articles/s41587-024-02511-w. https://www.researchgate.net/publication/387673319_A_DNA_language_model_based_on_multispecies_alignment_predicts_the_effects_of_genome-wide_variants https://qiita.com/kaizen_nagoya/items/6e8858c2395dcc98804a
141 4 1 1. Goldfeder, R. L., Wall, D. P., Khoury, M. J., Ioannidis, J. P. & Ashley, E. A. Human genome sequencing at the population scale: a primer on high-throughput DNA sequencing and analysis. Am. J. Epidemiol. 186, 1000–1009 (2017).
142 4 2 2. Marwaha, S., Knowles, J. W. & Ashley, E. A. A guide for the diagnosis of rare and undiagnosed disease: beyond the exome. Genome Med. 14, 23 (2022).
143 4 3 3. Lee, S., Abecasis, G. R., Boehnke, M. & Lin, X. Rare-variant association analysis: study designs and statistical tests. Am. J. Hum. Genet. 95, 5–23 (2014).
144 4 4 4. Trajanoska, K. et al. From target discovery to clinical drug development with human genetics. Nature 620, 737–745 (2023).
145 4 5 5. Meier, J. et al. Language models enable zero-shot prediction of the effects of mutations on protein function. In Proc. Advances in Neural Information Processing Systems 34 (eds Ranzato, M. et al.) 29287–29303 (Curran Associates, Inc., 2021).
146 4 6 6. Brandes, N., Goldman, G., Wang, C. H., Ye, C. J. & Ntranos, V. Genome-wide prediction of disease variant effects with a deep protein language model. Nat. Genet. 55, 1512–1522 (2023).
147 4 7 7. Jagota, M. et al. Cross-protein transfer learning substantially improves disease variant prediction. Genome Biol. 24, 182 (2023).
148 4 8 8. Benegas, G., Batra, S. S. & Song, Y. S. DNA language models are powerful predictors of genome-wide variant effects. Proc. Natl Acad. Sci. USA 120, e2311219120 (2023).
149 4 9 9. Dalla-Torre, H. et al. Nucleotide Transformer: building and evaluating robust foundation models for human genomics. Nat. Methods https://doi.org/10.1038/s41592-024-02523-z (2024).
150 4 10 10. Vaswani, A. et al. Attention is all you need. In Proc. Advances in Neural Information Processing Systems 30 (eds Guyon, S. et al.) 6000–6010 (Curran Associates, Inc., 2017).
151 4 11 11. Armstrong, J., Fiddes, I. T., Diekhans, M. & Paten, B. Whole-genome alignment and comparative annotation. Annu. Rev. Anim. Biosci. 7, 41–64 (2019).
152 4 12 12. Nguyen, E. et al. HyenaDNA: long-range genomic sequence modeling at single nucleotide resolution. In Proc. 37th International Conference on Neural Information Processing Systems (eds Oh, A. et al.) 43177–43201 (Curran Associates, Inc., 2023).
153 4 13 13. Rentzsch, P., Schubach, M., Shendure, J. & Kircher, M. CADD-Splice: improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med. 13, 31 (2021).
154 4 14 14. Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R. & Siepel, A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 20, 110–121 (2010).
155 4 15 15. Sullivan, P. F. et al. Leveraging base-pair mammalian constraint to understand genetic variation and human disease. Science 380, eabn2937 (2023).
156 4 16 16. Rives, A. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl Acad. Sci. USA 118, e2016239118 (2021).
157 4 17 17. Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).
158 4 18 18. Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548.e24 (2019).
159 4 19 19. Rao, R. M. et al. MSA Transformer. In Proceedings of the 38th International Conference on Machine Learning (eds Meila, M. & Zhang, T.) (PMLR, 2021).
160 4 20 20. Paten, B. et al. Genome-wide nucleotide-level mammalian ancestor reconstruction. Genome Res. 18, 1829–1843 (2008).
161 4 21 21. Siepel, A. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005).
162 4 22 22. Landrum, M. J. et al. ClinVar: improvements to accessing data. Nucleic Acids Res. 48, D835–D844 (2020).
163 4 23 23. Chen, S. et al. A genomic mutational constraint map using variation in 76,156 human genomes. Nature 625, 92–100 (2024).
164 4 24 24. Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886–D894 (2019).
165 4 25 25. Tate, J. G. et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 47, D941–D947 (2019).
166 4 26 26. Notin, P. et al. ProteinGym: large-scale benchmarks for protein fitness prediction and design. In Proceedings of the Advances in Neural Information Processing Systems 37 (eds Oh, A. et al.) (NeurIPS, 2023).
167 4 27 27. Smedley, D. et al. A whole-genome analysis framework for effective identification of pathogenic regulatory variants in mendelian disease. Am. J. Hum. Genet. 99, 595–606 (2016).
168 4 28 28. Albuisson, J. et al. Identification of two novel mutations in Shh long-range regulator associated with familial pre-axial polydactyly. Clin. Genet. 79, 371–377 (2011).
169 4 29 29. Kvon, E. Z. et al. Comprehensive in vivo interrogation reveals phenotypic impact of human enhancer variants. Cell 180, 1262–1271.e15 (2020).
170 4 30 30. Arbini, A. A., Pollak, E. S., Bayleran, J. K., High, K. A. & Bauer, K. A. Severe factor VII deficiency due to a mutation disrupting a hepatocyte nuclear factor 4 binding site in the factor VII promoter. Blood 89, 176–182 (1997).
171 4 31 31. Gao, H. et al. The landscape of tolerated genetic variation in humans and primates. Science 380, eabn8153 (2023).
172 4 32 32. The Dependency Map Consortium. DepMap 23Q4 public. figshare https://doi.org/10.25452/figshare.plus.24667905.v2 (2023).
173 4 33 33. Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
174 4 34 34. Agarwal, I., Fuller, Z. L., Myers, S. R. & Przeworski, M. Relating pathogenic loss-of-function mutations in humans to their evolutionary fitness costs. eLife 12, e83172 (2023).
175 4 35 35. Zeng, T., Spence, J. P., Mostafavi, H. & Pritchard, J. K. Bayesian estimation of gene constraint from an evolutionary model with gene features. Nat. Genet. 56, 1632–1643 (2024).
176 4 36 36. Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–424 (2015).
177 4 37 37. Schneider, T. D. & Stephens, R. M. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 18, 6097–6100 (1990).
178 4 38 38. Kent, W. J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002).
179 4 39 39. Nair, S. et al. The dynseq browser track shows context-specific features at nucleotide resolution. Nat. Genet. 54, 1581–1583 (2022).
180 4 40 40. Fishman, V. et al. GENA-LM: a family of open-source foundational models for long DNA sequences. Preprint at bioRxiv https://doi.org/10.1101/2023.06.12.544594 (2023).
181 4 41 41. Borgeaud, S. et al. Improving language models by retrieving from trillions of tokens. In Proc. 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) 2206–2240 (PMLR, 2022).
182 4 42 42. Weiner, D. J. et al. Polygenic architecture of rare coding variation across 394,783 exomes. Nature 614, 492–499 (2023).
183 4 43 43. Weissbrod, O. et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat. Genet. 52, 1355–1363 (2020).
184 4 44 44. Márquez-Luna, C. et al. Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets. Nat. Commun. 12, 6052 (2021).
185 4 45 45. Aw, A. J., McRae, J., Rahmani, E. & Song, Y. S. Highly parameterized polygenic scores tend to overfit to population stratification via random effects. Preprint at bioRxiv https://doi.org/10.1101/2024.01.27.577589 (2024).
186 5 0 G. Brixi, M. G. Durrant, J. Ku, M. Poli, G. Brockman, D. Chang, G. A. Gonzalez, S. H. King, D. B. Li, A. T. Merchant, M. Naghipourfar, E. Nguyen, C. Ricci-Tam, D. W. Romero, G. Sun, A. Taghibakhshi, A. Vorontsov, B. Yang, M. Deng, L. Gorton, N. Nguyen, N. K. Wang, E. Adams, S. A. Baccus, S. Dillmann, S. Ermon, D. Guo, R. Ilango, K. Janik, A. X. Lu, R. Mehta, M. R. Mofrad, M. Y. Ng, J. Pannu, C. Ré, J. C. Schmok, J. S. John, J. Sullivan, K. Zhu, G. Zynda, D. Balsam, P. Collison, A. B. Costa, T. Hernandez-Boussard, E. Ho, M.-Y. Liu, T. McGrath, K. Powell, D. P. Burke, H. Goodarzi, P. D. Hsu, and B. L. Hie. Genome modeling and design across all domains of life with Evo 2. bioRxiv, 2025. doi: 10.1101/2025.02.18.638918. URL https://www.biorxiv.org/content/early/2025/02/21/2025.02.18.638918. https://qiita.com/kaizen_nagoya/items/eecda74f758008633ee2
187 5 1 J. Abramson, J. Adler, J. Dunger, R. Evans, T. Green, A. Pritzel, O. Ronneberger, L. Willmore, A. J. Ballard, J. Bambrick, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 2024.
188 5 2 C. D. Allis and T. Jenuwein. The molecular hallmarks of epigenetic control. Nature Reviews Genetics, 17(8):487–500, 2016.
189 5 3 E. Almazrouei, H. Alobeidli, A. Alshamsi, A. Cappelli, R. Cojocaru, M. Debbah, É. Goffinet, D. Hesslow, J. Launay, Q. Malartic, et al. The Falcon series of open language models. arXiv preprint arXiv:2311.16867, 2023.
190 5 4 A. Andonian, Q. Anthony, S. Biderman, S. Black, P. Gali, L. Gao, E. Hallahan, J. Levy-Kramer, C. Leahy, L. Nestler, K. Parker, M. Pieler, J. Phang, S. Purohit, H. Schoelkopf, D. Stander, T. Songz, C. Tigges, B. Thérien, P. Wang, and S. Weinbach. GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch, 9 2023. URL https://www.github.com/eleutherai/gpt-neox.
191 5 5 Ž. Avsec, V. Agarwal, D. Visentin, J. R. Ledsam, A. Grabska-Barwinska, K. R. Taylor, Y. Assael, J. Jumper, P. Kohli, and D. R. Kelley. Effective gene expression prediction from sequence by integrating long-range interactions. Nature Methods, 18(10):1196–1203, 2021.
192 5 6 G. Benegas, S. S. Batra, and Y. S. Song. DNA language models are powerful predictors of genome-wide variant effects. Proceedings of the National Academy of Sciences, 120(44):e2311219120, 2023. doi: 10.1073/pnas.2311219120. URL https://www.pnas.org/doi/abs/10.1073/pnas.2311219120.
193 5 7 G. Benegas, C. Albors, A. J. Aw, C. Ye, and Y. S. Song. A DNA language model based on multispecies alignment predicts the effects of genome-wide variants. Nature Biotechnology, pages 1–6, 2025.
194 5 8 S. Biderman, H. Schoelkopf, Q. G. Anthony, H. Bradley, K. O’Brien, E. Hallahan, M. A. Khan, S. Purohit, U. S. Prashanth, E. Raff, et al. Pythia: A suite for analyzing large language models across training and scaling. In International Conference on Machine Learning, pages 2397–2430. PMLR, 2023.
195 5 9 D. Bloomfield, J. Pannu, A. W. Zhu, M. Y. Ng, A. Lewis, E. Bendavid, S. M. Asch, T. Hernandez-Boussard, A. Cicero, and T. Inglesby. AI and biosecurity: The need for governance. Science, 385(6711):831–833, 2024. doi: 10.1126/science.adq1977. URL https://www.science.org/doi/abs/10.1126/science.adq1977.
196 5 10 N. Brandes, G. Goldman, C. H. Wang, C. J. Ye, and V. Ntranos. Genome-wide prediction of disease variant effects with a deep protein language model. Nature Genetics, 55(9):1512–1522, Sept. 2023. ISSN 1546-1718. doi: 10.1038/s41588-023-01465-0. URL https://doi.org/10.1038/s41588-023-01465-0.
197 5 11 S. Brenner, A. Stretton, and S. Kaplan. Genetic code: the ‘nonsense’ triplets for chain termination and their suppression. Nature, 206(4988):994–998, 1965.
198 5 12 T. Bricken, A. Templeton, J. Batson, B. Chen, A. Jermyn, T. Conerly, N. Turner, C. Anil, C. Denison, A. Askell, R. Lasenby, Y. Wu, S. Kravec, N. Schiefer, T. Maxwell, N. Joseph, Z. Hatfield-Dodds, A. Tamkin, K. Nguyen, B. McLean, J. E. Burke, T. Hume, S. Carter, T. Henighan, and C. Olah. Towards monosemanticity: Decomposing language models with dictionary learning. Transformer Circuits Thread, 2023. https://transformer-circuits.pub/2023/monosemantic-features/index.html.
199 5 13 J. E. Brownell, J. Zhou, T. Ranalli, R. Kobayashi, D. G. Edmondson, S. Y. Roth, and C. D. Allis. Tetrahymena histone acetyltransferase A: a homolog to yeast Gcn5p linking histone acetylation to gene activation. Cell, 84(6):843–851, 1996.
200 5 14 B. Bussmann, P. Leask, and N. Nanda. BatchTopK sparse autoencoders. arXiv, 2412.06410, 2024.
201 5 A. P. Camargo, S. Roux, F. Schulz, M. Babinski, Y. Xu, B. Hu, P. S. G. Chain, S. Nayfach, and N. C. Kyrpides. Identification of mobile genetic elements with geNomad. Nat. Biotechnol., 42(8):1303–1312, Aug. 2024.
202 5 15 B. Chen, X. Cheng, P. Li, Y.-a. Geng, J. Gong, S. Li, Z. Bei, X. Tan, B. Wang, X. Zeng, et al. xTrimoPGLM: unified 100B-scale pre-trained transformer for deciphering the language of protein. arXiv preprint arXiv:2401.06199, 2024.
203 5 16 I.-M. A. Chen, K. Chu, K. Palaniappan, A. Ratner, J. Huang, M. Huntemann, P. Hajek, S. Ritter, N. Varghese, R. Seshadri, S. Roux, T. Woyke, E. A. Eloe-Fadrosh, N. N. Ivanova, and N. C. Kyrpides. The IMG/M data management and analysis system v.6.0: new tools and advanced capabilities. Nucleic Acids Res., 49(D1):D751–D763, Jan. 2021.
204 5 17 J. Chen, Z. Hu, S. Sun, Q. Tan, Y. Wang, Q. Yu, L. Zong, L. Hong, J. Xiao, T. Shen, I. King, and Y. Li. Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions, 2022.
205 5 18 S. Chen, S. Wong, L. Chen, and Y. Tian. Extending context window of large language models via positional interpolation. URL https://arxiv.org/abs/2306.15595, 2023.
206 5 19 J. Cheng, G. Novati, J. Pan, C. Bycroft, A. Žemgulytė, T. Applebaum, A. Pritzel, L. H. Wong, M. Zielinski, T. Sargeant, et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science, 381(6664):eadg7492, 2023.
207 5 20 L. T. Chow, R. E. Gelinas, T. R. Broker, and R. J. Roberts. An amazing sequence arrangement at the 5′ ends of adenovirus 2 messenger RNA. Cell, 12(1):1–8, 1977.
208 5 21 G. Csárdi, T. Nepusz, K. Müller, S. Horvát, V. Traag, F. Zanini, and D. Noom. igraph for R: R interface of the igraph library for graph theory and network analysis, Dec. 2024.
209 5 22 H. Cunningham, A. Ewart, L. Riggs, R. Huben, and L. Sharkey. Sparse autoencoders find highly interpretable features in language models. arXiv, 2309.08600, 2023.
210 5 23 H. Dalla-Torre, L. Gonzalez, J. Mendoza-Revilla, N. Lopez Carranza, A. H. Grzywaczewski, F. Oteri, C. Dallago, E. Trop, B. P. de Almeida, H. Sirelkhatim, et al. Nucleotide Transformer: Building and evaluating robust foundation models for human genomics. Nature Methods, pages 1–11, 2024.
211 5 24 C. Darwin. On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life. John Murray, London, 1859.
212 5 25 T. Dobzhansky. Genetics and the Origin of Species. Columbia University Press, 1951.
213 5 26 A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, A. Mathur, A. Schelten, A. Yang, A. Fan, et al. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024.
214 5 27 M. G. Durrant, N. T. Perry, J. J. Pai, A. R. Jangid, J. S. Athukoralage, M. Hiraizumi, J. P. McSpedon, A. Pawluk, H. Nishimasu, S. Konermann, and P. D. Hsu. Bridge RNAs direct programmable recombination of target and donor DNA. Nature, 630(8018):984–993, June 2024.
215 5 28 A. A. Egorov and G. C. Atkinson. Lovis4u: Locus visualisation tool for comparative genomics. bioRxiv, 2024. doi: 10.1101/2024.09.11.612399. URL https://www.biorxiv.org/content/early/2024/09/14/2024.09.11.612399.
216 5 29 G. M. Findlay, R. M. Daza, B. Martin, M. D. Zhang, A. P. Leith, M. Gasperini, J. D. Janizek, X. Huang, L. M. Starita, and J. Shendure. Accurate classification of BRCA1 variants with saturation genome editing. Nature, 562:217–222, 10 2018. ISSN 0028-0836. doi: 10.1038/s41586-018-0461-z.
217 5 30 L. Gao, T. D. la Tour, H. Tillman, G. Goh, R. Troll, A. Radford, I. Sutskever, J. Leike, and J. Wu. Scaling and evaluating sparse autoencoders. arXiv, 2024a. URL https://arxiv.org/abs/2406.04093.
218 5 31 T. Gao, A. Wettig, H. Yen, and D. Chen. How to train long-context language models (effectively). arXiv preprint arXiv:2410.02660, 2024b.
219 5 32 D. G. Gibson, G. A. Benders, C. Andrews-Pfannkoch, E. A. Denisova, H. Baden-Tillson, J. Zaveri, T. B. Stockwell, A. Brownley, D. W. Thomas, M. A. Algire, C. Merryman, L. Young, V. N. Noskov, J. I. Glass, J. C. Venter, C. A. Hutchison, and H. O. Smith. Complete chemical synthesis, assembly, and cloning of a Mycoplasma genitalium genome. Science, 319(5867):1215–1220, 2008.
220 5 33 S. Gupta, J. A. Stamatoyannopoulos, T. L. Bailey, and W. S. Noble. Quantifying similarity between motifs. Genome Biology, 8(2):R24, Feb. 2007. ISSN 1474-760X. doi: 10.1186/gb-2007-8-2-r24. URL https://doi.org/10.1186/gb-2007-8-2-r24.
221 5 34 P. W. Harrison, M. R. Amode, O. Austine-Orimoloye, A. G. Azov, M. Barba, I. Barnes, A. Becker, R. Bennett, A. Berry, J. Bhai, S. K. Bhurji, S. Boddu, P. R. Branco Lins, L. Brooks, S. B. Ramaraju, L. I. Campbell, M. C. Martinez, M. Charkhchi, K. Chougule, A. Cockburn, C. Davidson, N. H. De Silva, K. Dodiya, S. Donaldson, B. El Houdaigui, T. E. Naboulsi, R. Fatima, C. G. Giron, T. Genez, D. Grigoriadis, G. S. Ghattaoraya, J. G. Martinez, T. A. Gurbich, M. Hardy, Z. Hollis, T. Hourlier, T. Hunt, M. Kay, V. Kaykala, T. Le, D. Lemos, D. Lodha, D. Marques-Coelho, G. Maslen, G. A. Merino, L. P. Mirabueno, A. Mushtaq, S. N. Hossain, D. N. Ogeh, M. P. Sakthivel, A. Parker, M. Perry, I. Piližota, D. Poppleton, I. Prosovetskaia, S. Raj, J. G. Pérez-Silva, A. I. A. Salam, S. Saraf, N. Saraiva-Agostinho, D. Sheppard, S. Sinha, B. Sipos, V. Sitnik, W. Stark, E. Steed, M.-M. Suner, L. Surapaneni, K. Sutinen, F. F. Tricomi, D. Urbina-Gómez, A. Veidenberg, T. A. Walsh, D. Ware, E. Wass, N. L. Willhoft, J. Allen, J. Alvarez-Jarreta, M. Chakiachvili, B. Flint, S. Giorgetti, L. Haggerty, G. R. Ilsley, J. Keatley, J. E. Loveland, B. Moore, J. M. Mudge, G. Naamati, J. Tate, S. J. Trevanion, A. Winterbottom, A. Frankish, S. E. Hunt, F. Cunningham, S. Dyer, R. D. Finn, F. J. Martin, and A. D. Yates. Ensembl 2024. Nucleic Acids Res., 52(D1):D891–D899, Jan. 2024.
222 5 35 S. Hartmann, D. Lu, J. Phillips, and T. J. Vision. Phytome: a platform for plant comparative genomics. Nucleic Acids Res., 34(Database issue):D724–30, Jan. 2006.
223 5 36 T. Hayes, R. Rao, H. Akin, N. J. Sofroniew, D. Oktay, Z. Lin, R. Verkuil, V. Q. Tran, J. Deaton, M. Wiggert, et al. Simulating 500 million years of evolution with a language model. Science, eads0018, 2025.
224 5 37 S. B. Hedges. The origin and evolution of model organisms. Nat. Rev. Genet., 3(11):838–849, Nov. 2002.
225 5 38 J. C. Hingerl, A. Karollus, and J. Gagneur. Flashzoi: An enhanced Borzoi model for accelerated genomic analysis. bioRxiv, pages 2024–12, 2024.
226 5 39 H. Huang, C. Hu, J. Na, S. N. Hart, R. D. Gnanaolivu, M. Abozaid, T. Rao, Y. A. Tecleab, T. Pesaran, et al. Functional evaluation and clinical classification of BRCA2 variants. Nature, pages 1–10, 2025.
227 5 40 L. A. Hug, B. J. Baker, K. Anantharaman, C. T. Brown, A. J. Probst, C. J. Castelle, C. N. Butterfield, A. W. Hernsdorf, Y. Amano, K. Ise, Y. Suzuki, N. Dudek, D. A. Relman, K. M. Finstad, R. Amundson, B. C. Thomas, and J. F. Banfield. A new view of the tree of life. Nat. Microbiol., 1(5):16048, Apr. 2016.
228 5 41 D. Hyatt, G.-L. Chen, P. F. Locascio, M. L. Land, F. W. Larimer, and L. J. Hauser. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics, 11:119, Mar. 2010.
229 5 42 F. Jacob and J. Monod. Genetic regulatory mechanisms in the synthesis of proteins. Journal of Molecular Biology, 3(3):318–356, 1961.
230 5 43 Y. Ji, Z. Zhou, H. Liu, and R. V. Davuluri. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics, 37(15):2112–2120, 02 2021. ISSN 1367-4803. doi: 10.1093/bioinformatics/btab083. URL https://doi.org/10.1093/bioinformatics/btab083.
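DNABERT represents a genome as overlapping k-mer tokens. A small, plain-Python sketch of that tokenization step (k = 6 and the sequence are illustrative; this is not the authors' released code):

```python
def kmer_tokenize(sequence: str, k: int = 6) -> list[str]:
    """Split a DNA sequence into overlapping k-mer tokens (DNABERT-style)."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

print(kmer_tokenize("ATGGCGTACGT", k=6))
# ['ATGGCG', 'TGGCGT', 'GGCGTA', 'GCGTAC', 'CGTACG', 'GTACGT']
```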
231 5 44 W. Kabsch and C. Sander. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers: Original Research on Biomolecules, 22(12):2577–2637, 1983.
232 5 45 J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess, R. Child, S. Gray, A. Radford, J. Wu, and D. Amodei. Scaling laws for neural language models, 2020. URL https://arxiv.org/abs/2001.08361.
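Kaplan et al. summarize loss as a power law in (non-embedding) parameter count, roughly L(N) = (N_c / N)^α_N, with reported constants around N_c ≈ 8.8 × 10^13 and α_N ≈ 0.076. A quick illustration of that formula (treat the constants as approximate):

```python
def power_law_loss(n_params: float, n_c: float = 8.8e13, alpha: float = 0.076) -> float:
    """Power-law loss vs. parameter count: L(N) = (N_c / N) ** alpha."""
    return (n_c / n_params) ** alpha

# loss improves slowly but predictably as parameters grow by 100x
for n in (1e8, 1e9, 1e10):
    print(f"N={n:.0e}  L={power_law_loss(n):.3f}")
```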
233 5 46 J. R. Karr, J. C. Sanghvi, D. N. Macklin, M. V. Gutschow, J. M. Jacobs, B. Bolival, N. Assad-Garcia, J. I. Glass, and M. W. Covert. A whole-cell computational model predicts phenotype from genotype. Cell, 150(2):389–401, 2012.
234 5 47 S. Knudsen. Promoter2.0: for the recognition of PolII promoter sequences. Bioinformatics (Oxford, England), 15(5):356–361, 1999.
235 5 48 M. Kozak. The scanning model for translation: an update. The Journal of cell biology, 108(2):229–241, 1989.
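Kozak's scanning model underlies the common "strong context" check: a purine at position −3 and a G at +4 around the ATG. A simplified classifier of that convention (real start-site strength is more graded than this):

```python
def kozak_strength(seq: str, atg_index: int) -> str:
    """Classify the Kozak context of the ATG at atg_index.

    Strong: purine (A/G) at -3 AND G at +4 (the base right after ATG).
    """
    minus3 = seq[atg_index - 3] if atg_index >= 3 else ""
    plus4 = seq[atg_index + 3:atg_index + 4]
    purine = minus3 in "AG"
    g_plus4 = plus4 == "G"
    if purine and g_plus4:
        return "strong"
    if purine or g_plus4:
        return "adequate"
    return "weak"

print(kozak_strength("GCCACCATGGCG", 6))  # 'strong': A at -3, G at +4
```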
236 5 49 J. Ku, E. Nguyen, D. Romero, G. Brixi, B. Yang, A. Vorontsov, A. Taghibakhshi, A. Lu, D. Burke, G. Brockman, S. Massaroli, C. Re, P. Hsu, B. Hie, S. Ermon, and M. Poli. Systems and algorithms for convolutional multi-hybrid language models at scale. 2025.
237 5 50 P. Kunzmann, T. D. Müller, M. Greil, J. H. Krumbach, J. M. Anter, D. Bauer, F. Islam, and K. Hamacher. Biotite: new tools for a versatile python bioinformatics library. BMC Bioinformatics, 24(1):236, June 2023.
238 5 51 M. J. Landrum, J. M. Lee, G. R. Riley, W. Jang, W. S. Rubinstein, D. M. Church, and D. R. Maglott. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Research, 42 (D1):D980–D985, 11 2013. ISSN 0305-1048. doi: 10.1093/nar/gkt1113. URL https://doi.org/10.1093/nar/gkt1113.
239 5 52 S. Li, S. Moayedpour, R. Li, M. Bailey, S. Riahi, M. Miladi, J. Miner, D. Zheng, J. Wang, A. Balsubramani, K. Tran, M. Zacharia, M. Wu, X. Gu, R. Clinton, C. Asquith, J. Skalesk, L. Boeglin, S. Chivukula, A. Dias, F. U. Montoya, V. Agarwal, Z. Bar-Joseph, and S. Jager. CodonBERT: Large language models for mRNA design and optimization. bioRxiv, 2023. doi: 10.1101/2023.09.09.556981. URL https://www.biorxiv.org/content/early/2023/09/12/2023.09.09.556981.
240 5 53 W.-W. Liang, S. Müller, S. K. Hart, H.-H. Wessels, A. Méndez-Mancilla, A. Sookdeo, O. Choi, C. M. Caragine, A. Corman, L. Lu, O. Kolumba, B. Williams, and N. E. Sanjana. Transcriptome-scale RNA-targeting CRISPR screens reveal essential lncRNAs in human cells. Cell, 187(26):7637–7654.e29, Dec. 2024.
241 5 54 Z. Lin, H. Akin, R. Rao, B. Hie, Z. Zhu, W. Lu, N. Smetanin, R. Verkuil, O. Kabeli, Y. Shmueli, A. dos Santos Costa, M. Fazel-Zarandi, T. Sercu, S. Candido, and A. Rives. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379:1123–1130, 3 2023. ISSN 0036-8075. doi:10.1126/science.ade2574.
242 5 55 J. Linder, D. Srivastava, H. Yuan, V. Agarwal, and D. R. Kelley. Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation. Nature Genetics, pages 1–13, 2025.
243 5 56 A. Liu, B. Feng, B. Xue, B. Wang, B. Wu, C. Lu, C. Zhao, C. Deng, C. Zhang, C. Ruan, et al. DeepSeek-v3 technical report. arXiv preprint arXiv:2412.19437, 2024.
244 5 57 B. J. Livesey and J. A. Marsh. Updated benchmarking of variant effect predictors using deep mutational scanning. Molecular Systems Biology, 19, 8 2023. ISSN 1744-4292. doi: 10.15252/msb.202211474.
245 5 58 A. Lozhkov, R. Li, L. B. Allal, F. Cassano, J. Lamy-Poirier, N. Tazi, A. Tang, D. Pykhtar, J. Liu, Y. Wei, et al. Starcoder 2 and the stack v2: The next generation. arXiv preprint arXiv:2402.19173, 2024.
246 5 59 A. Makhzani and B. Frey. k-sparse autoencoders. arXiv, 2014. URL https://arxiv.org/abs/1312.5663.
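The k-sparse autoencoder keeps only the k largest hidden activations and zeroes the rest. A NumPy sketch of that top-k step (shapes and k are illustrative):

```python
import numpy as np

def topk_activation(h: np.ndarray, k: int) -> np.ndarray:
    """Zero all but the k largest entries per row (k-sparse autoencoder hidden layer)."""
    out = np.zeros_like(h)
    idx = np.argpartition(h, -k, axis=-1)[..., -k:]   # indices of the k largest entries
    np.put_along_axis(out, idx, np.take_along_axis(h, idx, axis=-1), axis=-1)
    return out

h = np.array([[0.1, 0.9, -0.3, 0.5]])
print(topk_activation(h, k=2))  # [[0.  0.9 0.  0.5]]
```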
247 5 60 G. Marçais and C. Kingsford. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics, 27(6):764–770, 2011.
248 5 61 K. Marcker and F. Sanger. N-formyl-methionyl-s-RNA. Journal of Molecular Biology, 8(6):835–IN8, 1964. ISSN 0022-2836. doi: https://doi.org/10.1016/S0022-2836(64)80164-9. URL https://www.sciencedirect.com/science/article/pii/S0022283664801649.
249 5 62 L. McInnes, J. Healy, and J. Melville. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv, 1802.03426, 2018.
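Many papers in this list project model embeddings to 2-D with UMAP. The umap-learn package (assumed installed) exposes a scikit-learn-style interface; random data stands in for real embeddings here:

```python
import numpy as np
import umap  # pip install umap-learn

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 64))   # stand-in for model embeddings

reducer = umap.UMAP(n_components=2, n_neighbors=15, random_state=42)
coords = reducer.fit_transform(embeddings)
print(coords.shape)  # (200, 2)
```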
250 5 63 J. Meier, R. Rao, R. Verkuil, J. Liu, T. Sercu, and A. Rives. Language models enable zero-shot prediction of the effects of mutations on protein function. bioRxiv, 2021. doi: 10.1101/2021.07.09.450648. URL https://www.biorxiv.org/content/10.1101/2021.07.09.450648v1.
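The zero-shot recipe in Meier et al. scores a variant by the difference in log-likelihood between the mutant and wild-type token at a masked position. The sketch below only shows that scoring arithmetic; the per-position log-probabilities would come from whichever masked language model you use (the array here is toy data, not any specific model's output):

```python
import numpy as np

def zero_shot_score(log_probs: np.ndarray, pos: int, wt_idx: int, mut_idx: int) -> float:
    """Masked-marginal score: log p(mutant) - log p(wild type) at a masked position.

    log_probs: (seq_len, vocab) per-position log-probabilities from a masked LM
    (hypothetical here; obtain them from your model of choice).
    """
    return float(log_probs[pos, mut_idx] - log_probs[pos, wt_idx])

# toy example: vocab of 4 nucleotides
log_probs = np.log(np.full((10, 4), 0.25))
log_probs[3] = np.log([0.7, 0.1, 0.1, 0.1])   # model strongly prefers token 0 at pos 3
print(zero_shot_score(log_probs, pos=3, wt_idx=0, mut_idx=2))  # negative => likely deleterious
```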
251 5 64 G. Mendel. Versuche über Pflanzen-Hybriden [Experiments on plant hybrids]. Verhandlungen des naturforschenden Vereines in Brünn, 4:3–47, 1866.
252 5 65 E. C. Meng, T. D. Goddard, E. F. Pettersen, G. S. Couch, Z. J. Pearson, J. H. Morris, and T. E. Ferrin. Ucsf chimerax: Tools for structure building and analysis. Protein Science, 32(11):e4792, 2023. doi: https://doi.org/10.1002/pro.4792. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/pro.4792.
253 5 66 G. Meng, Y. Li, C. Yang, and S. Liu. MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization. Nucleic Acids Research, 47(11):e63–e63, 2019.
254 5 67 A. T. Merchant, S. H. King, E. Nguyen, and B. L. Hie. Semantic mining of functional de novo genes from a genomic language model. bioRxiv, 2024. doi: 10.1101/2024.12.17.628962. URL https://www.biorxiv.org/content/early/2024/12/18/2024.12.17.628962.
255 5 68 F. Meyer, D. Paarmann, M. D’Souza, R. Olson, E. M. Glass, M. Kubal, T. Paczian, A. Rodriguez, R. Stevens, A. Wilke, J. Wilkening, and R. A. Edwards. The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics, 9(1):386, Sept. 2008.
256 5 69 J. Mistry, S. Chuguransky, L. Williams, M. Qureshi, G. A. Salazar, E. L. Sonnhammer, S. C. Tosatto, L. Paladin, S. Raj, L. J. Richardson, et al. Pfam: The protein families database in 2021. Nucleic acids research, 49(D1): D412–D419, 2021
257 5 70 A. L. Mitchell, A. Almeida, M. Beracochea, M. Boland, J. Burgin, G. Cochrane, M. R. Crusoe, V. Kale, S. C. Potter, L. J. Richardson, E. Sakharova, M. Scheremetjew, A. Korobeynikov, A. Shlemov, O. Kunyavskaya, A. Lapidus, and R. D. Finn. MGnify: the microbiome analysis resource in 2020. Nucleic Acids Res., 48(D1):D570–D578, Jan. 2020.
258 5 71 M. Naghipourfar, S. Chen, M. Howard, C. Macdonald, A. Saberi, T. Hagen, M. Mofrad, W. Coyote-Maestas, and H. Goodarzi. A suite of foundation models captures the contextual interplay between codons. bioRxiv, Oct. 2024. doi: 10.1101/2024.10.10.617568. URL http://dx.doi.org/10.1101/2024.10.10.617568.
259 5 72 E. Nguyen, M. Poli, M. G. Durrant, B. Kang, D. Katrekar, D. B. Li, L. J. Bartie, A. W. Thomas, S. H. King, G. Brixi, J. Sullivan, M. Y. Ng, A. Lewis, A. Lou, S. Ermon, S. A. Baccus, T. Hernandez-Boussard, C. Ré, P. D. Hsu, and B. L. Hie. Sequence modeling and design from molecular to genome scale with Evo. Science, 386(6723):eado9336, Nov. 2024a.
260 5 73 E. Nguyen, M. Poli, M. Faizi, A. Thomas, M. Wornow, C. Birch-Sykes, S. Massaroli, A. Patel, C. Rabideau, Y. Bengio, et al. HyenaDNA: Long-range genomic sequence modeling at single nucleotide resolution. Advances in neural information processing systems, 36, 2024b.
261 5 74 E. Nijkamp, J. A. Ruffolo, E. N. Weinstein, N. Naik, and A. Madani. ProGen2: exploring the boundaries of protein language models. Cell Systems, 14(11):968–978, 2023.
262 5 75 M. W. Nirenberg and J. H. Matthaei. The dependence of cell-free protein synthesis in E. coli upon naturally occurring or synthetic polyribonucleotides. Proceedings of the National Academy of Sciences, 47(10):1588–1602, 1961.
263 5 76 P. Notin, A. W. Kollasch, D. Ritter, L. van Niekerk, S. Paul, H. Spinner, N. Rollins, A. Shaw, R. Weitzman, J. Frazer, M. Dias, D. Franceschi, R. Orenbuch, Y. Gal, and D. S. Marks. ProteinGym: Large-scale benchmarks for protein design and fitness prediction. bioRxiv, page 2023.12.07.570727, 1 2023. doi:10.1101/2023.12.07.570727. URL http://biorxiv.org/content/early/2023/12/08/2023.12.07.570727.abstract.
264 5 77 N. A. O’Leary, M. W. Wright, J. R. Brister, S. Ciufo, D. Haddad, R. McVeigh, B. Rajput, B. Robbertse, B. Smith-White, D. Ako-Adjei, et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Research, 44(D1):D733–D745, 2016.
265 5 78 B. D. Ondov, T. J. Treangen, P. Melsted, A. B. Mallonee, N. H. Bergman, S. Koren, and A. M. Phillippy. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol., 17(1):132, June 2016.
266 5 81 R. Overbeek, M. Fonstein, M. D’souza, G. D. Pusch, and N. Maltsev. The use of gene clusters to infer functional coupling. Proceedings of the National Academy of Sciences, 96(6):2896–2901, 1999.
267 5 82 D. H. Parks, M. Chuvochina, C. Rinke, A. J. Mussig, P.-A. Chaumeil, and P. Hugenholtz. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res., 50(D1):D785–D794, Jan. 2022.
268 5 83 A. Patel, A. Singhal, A. Wang, A. Pampari, M. Kasowski, and A. Kundaje. DART-Eval: A comprehensive DNA language model evaluation benchmark on regulatory DNA. Advances in Neural Information Processing Systems, 37:62024–62061, 2024.
269 5 84 A. K. Pathak, N. Bora, M. Badonyi, B. J. Livesey, S. Consortium, J. Ngeow, and J. A. Marsh. Pervasive ancestry bias in variant effect predictors. bioRxiv, pages 2024–05, 2024.
270 5 85 F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, et al. Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research, 12:2825–2830, 2011.
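A recurring evaluation pattern across the benchmarks in this list is a simple scikit-learn probe fit on frozen embeddings. A minimal version of that pattern, with random data standing in for real embeddings and labels:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 128))          # stand-in for frozen LM embeddings
y = (X[:, 0] > 0).astype(int)            # synthetic binary labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(accuracy_score(y_te, clf.predict(X_te)))
```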
271 5 86 R. J. Penić, T. Vlašić, R. G. Huber, Y. Wan, and M. Šikić. RiNALMo: General-purpose RNA language models can generalize well on structure prediction tasks. arXiv, 2403.00043, 2024.
272 5 87 D. Piya, N. Nolan, M. L. Moore, L. A. R. Hernandez, B. F. Cress, R. Young, A. P. Arkin, and V. K. Mutalik. Systematic and scalable genome-wide essentiality mapping to identify nonessential genes in phages. PLOS Biology, 21:e3002416, 12 2023. ISSN 1545-7885. doi: 10.1371/journal.pbio.3002416.
273 5 88 M. Poli, S. Massaroli, E. Nguyen, D. Y. Fu, T. Dao, S. Baccus, Y. Bengio, S. Ermon, and C. Ré. Hyena hierarchy: Towards larger convolutional language models. In International Conference on Machine Learning, pages 28043–28078. PMLR, 2023.
274 5 89 M. Poli, A. W. Thomas, E. Nguyen, P. Ponnusamy, B. Deiseroth, K. Kersting, T. Suzuki, B. Hie, S. Ermon, C. Ré, et al. Mechanistic design and scaling of hybrid architectures. arXiv preprint arXiv:2403.17844, 2024.
275 5 90 K. S. Pollard, M. J. Hubisz, K. R. Rosenbloom, and A. Siepel. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Research, 20(1):110–21, 2010. doi: 10.1101/gr.097857.109.
276 5 91 E. Proux-Wéra, D. Armisén, K. P. Byrne, and K. H. Wolfe. A pipeline for automated annotation of yeast genome sequences by a conserved-synteny approach. BMC Bioinformatics, 13:1–12, 2012.
277 5 92 A. R. Quinlan and I. M. Hall. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26(6):841–842, Mar. 2010.
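BEDTools itself is a command-line suite; from Python, the pybedtools wrapper (an assumption here, not part of the cited paper) exposes the same operations. A sketch, assuming a.bed and b.bed exist on disk and the bedtools binary is installed:

```python
from pybedtools import BedTool  # pip install pybedtools; requires bedtools on PATH

a = BedTool("a.bed")
b = BedTool("b.bed")

# intervals of a that overlap b at least once (equivalent to `bedtools intersect -u`)
overlapping = a.intersect(b, u=True)
print(overlapping.count())
```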
278 5 93 A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, et al. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.
279 5 94 S. Rajbhandari, J. Rasley, O. Ruwase, and Y. He. ZeRO: Memory optimizations toward training trillion parameter models. In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1–16. IEEE, 2020.
280 5 95 Responsible AI x Biodesign. Community values, guiding principles, and commitments for the responsible development of AI for protein design, March 8 2024. URL https://responsiblebiodesign.ai/#values-and-principles.
281 5 96 A. Rives, J. Meier, T. Sercu, S. Goyal, Z. Lin, J. Liu, D. Guo, M. Ott, C. L. Zitnick, J. Ma, and R. Fergus. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proceedings of the National Academy of Sciences, 118:e2016239118, 2021. ISSN 0027-8424. doi:10.1073/pnas.2016239118.
282 5 97 J. T. Robinson, H. Thorvaldsdottir, D. Turner, and J. P. Mesirov. igv.js: an embeddable JavaScript implementation of the Integrative Genomics Viewer (IGV). Bioinformatics, 39(1):btac830, 12 2022. ISSN 1367-4811. doi: 10.1093/bioinformatics/btac830. URL https://doi.org/10.1093/bioinformatics/btac830.
283 5 98 M. Sandoval-Velasco, O. Dudchenko, J. A. Rodríguez, C. P. Estrada, M. Dehasque, C. Fontsere, S. S. Mak, R. Khan, V. G. Contessoto, A. B. O. Junior, et al. Three-dimensional genome architecture persists in a 52,000-year-old woolly mammoth skin sample. Cell, 187(14):3541–3562, 2024.
284 5 99 I. Sarropoulos, R. Marin, M. Cardoso-Moreira, and H. Kaessmann. Developmental dynamics of lncRNAs across mammalian organs and species. Nature, 571(7766):510–514, July 2019.
285 5 100 V. A. Schneider, T. Graves-Lindsay, K. Howe, N. Bouk, H.-C. Chen, P. A. Kitts, T. D. Murphy, K. D. Pruitt, F. Thibaud-Nissen, D. Albracht, R. S. Fulton, M. Kremitzki, V. Magrini, C. Markovic, S. McGrath, K. M. Steinberg, K. Auger, W. Chow, J. Collins, G. Harden, T. Hubbard, S. Pelan, J. T. Simpson, G. Threadgold, J. Torrance, J. M. Wood, L. Clarke, S. Koren, M. Boitano, P. Peluso, H. Li, C.-S. Chin, A. M. Phillippy, R. Durbin, R. K. Wilson, P. Flicek, E. E. Eichler, and D. M. Church. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Research, 27(5):849–864, May 2017. ISSN 1088-9051, 1549-5469. doi: 10.1101/gr.213611.116. URL https://genome.cshlp.org/content/27/5/849.
286 5 101 M. Schubach, T. Maass, L. Nazaretyan, S. Röner, and M. Kircher. CADD v1.7: using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic Acids Research, 52(D1):D1143–D1154, 01 2024. ISSN 0305-1048. doi: 10.1093/nar/gkad989. URL https://doi.org/10.1093/nar/gkad989.
287 5 102 O. Schwengers, L. Jelonek, M. A. Dieckmann, S. Beyvers, J. Blom, and A. Goesmann. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb. Genom., 7 (11), Nov. 2021.
288 5 103 T. Shen, Z. Hu, S. Sun, D. Liu, F. Wong, J. Wang, J. Chen, Y. Wang, L. Hong, J. Xiao, et al. Accurate RNA 3D structure prediction using a language model-based deep learning approach. Nature Methods, pages 1–12, 2024.
289 5 104 J. Shine and L. Dalgarno. The 3′-terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proceedings of the National Academy of Sciences, 71(4):1342–1346, 1974.
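Shine and Dalgarno describe a purine-rich motif (canonically AGGAGG) a short distance upstream of prokaryotic start codons, complementary to the 3′ end of 16S rRNA. A naive exact-match scan for it, as a toy illustration (real sites tolerate partial matches and variable spacing):

```python
def find_shine_dalgarno(seq: str, start_codon_pos: int, window: int = 20) -> int:
    """Return the index of an AGGAGG motif upstream of a start codon, or -1.

    Scans `window` bases upstream for the canonical Shine-Dalgarno core; real
    sites tolerate partial matches, which this simple check ignores.
    """
    lo = max(0, start_codon_pos - window)
    upstream = seq[lo:start_codon_pos].upper()
    hit = upstream.find("AGGAGG")
    return -1 if hit == -1 else lo + hit

seq = "TTTAGGAGGTTTTACATGGCT"
print(find_shine_dalgarno(seq, seq.find("ATG")))  # 3
```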
290 5 105 J. Söding, A. Biegert, and A. N. Lupas. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Research, 33(suppl_2):W244–W248, 2005.
291 5 106 M. Steinegger and J. Söding. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol., 35(11):1026–1028, Nov. 2017.
292 5 107 J. Su, M. Ahmed, Y. Lu, S. Pan, W. Bo, and Y. Liu. Roformer: Enhanced transformer with rotary position embedding. Neurocomputing, 568:127063, 2024.
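Rotary position embedding (RoPE), cited above and used by several genomic language models in this list, rotates pairs of embedding dimensions by position-dependent angles. A NumPy sketch of the rotate-half formulation (the base of 10000 is the usual convention; treat this as illustrative, not any specific model's config):

```python
import numpy as np

def rotary_embed(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embedding to x of shape (seq_len, dim), dim even."""
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)        # per-pair rotation frequency
    angles = np.outer(np.arange(seq_len), freqs)     # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]                # split dims into rotation pairs
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

x = np.ones((4, 8))
print(rotary_embed(x).shape)  # (4, 8)
```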
293 5 108 P. J. Sullivan, J. M. Quinn, W. Wu, M. Pinese, and M. J. Cowley. SpliceVarDB: A comprehensive database of experimentally validated human splicing variants. The American Journal of Human Genetics, 111(10):2164–2175, Oct. 2024. ISSN 0002-9297. doi: 10.1016/j.ajhg.2024.08.002. URL https://doi.org/10.1016/j.ajhg.2024.08.002.
294 5 109 S. Sunagawa, L. P. Coelho, S. Chaffron, J. R. Kultima, K. Labadie, G. Salazar, B. Djahanschiri, G. Zeller, D. R. Mende, A. Alberti, F. M. Cornejo-Castillo, P. I. Costea, C. Cruaud, F. d’Ovidio, S. Engelen, I. Ferrera, J. M. Gasol, L. Guidi, F. Hildebrand, F. Kokoszka, C. Lepoivre, G. Lima-Mendez, J. Poulain, B. T. Poulos, M. Royo-Llonch, H. Sarmento, S. Vieira-Silva, C. Dimier, M. Picheral, S. Searson, S. Kandels-Lewis, Tara Oceans coordinators, C. Bowler, C. de Vargas, G. Gorsky, N. Grimsley, P. Hingamp, D. Iudicone, O. Jaillon, F. Not, H. Ogata, S. Pesant, S. Speich, L. Stemmann, M. B. Sullivan, J. Weissenbach, P. Wincker, E. Karsenti, J. Raes, S. G. Acinas, and P. Bork. Structure and function of the global ocean microbiome. Science, 348(6237):1261359, May 2015.
295 5 110 E. Szathmáry and J. M. Smith. The major evolutionary transitions. Nature, 374(6519):227–232, 1995.
296 5 111 A. Tareen and J. B. Kinney. Logomaker: beautiful sequence logos in python. Bioinformatics, 36(7):2272–2274, 12 2019. ISSN 1367-4803. doi: 10.1093/bioinformatics/btz921. URL https://doi.org/10.1093/bioinformatics/btz921.
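Logomaker draws sequence logos from a position matrix. A minimal example with made-up counts (requires the logomaker and pandas packages):

```python
import pandas as pd
import logomaker  # pip install logomaker

# toy position-count matrix: rows = positions, columns = bases
counts = pd.DataFrame(
    {"A": [8, 1, 0], "C": [1, 1, 9], "G": [0, 7, 1], "T": [1, 1, 0]}
)
# convert counts to information content, then draw the logo
info = logomaker.transform_matrix(counts, from_type="counts", to_type="information")
logo = logomaker.Logo(info)
logo.fig.savefig("logo.png")
```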
297 5 112 Team OLMo, P. Walsh, L. Soldaini, D. Groeneveld, K. Lo, S. Arora, A. Bhagia, Y. Gu, S. Huang, M. Jordan, et al. 2 OLMo 2 furious. arXiv preprint arXiv:2501.00656, 2024.
298 5 113 A. Templeton, T. Conerly, J. Marcus, J. Lindsey, T. Bricken, B. Chen, A. Pearce, C. Citro, E. Ameisen, A. Jones, H. Cunningham, N. L. Turner, C. McDougall, M. MacDiarmid, C. D. Freeman, T. R. Sumers, E. Rees, J. Batson, A. Jermyn, S. Carter, C. Olah, and T. Henighan. Scaling monosemanticity: Extracting interpretable features from Claude 3 Sonnet. Transformer Circuits Thread, 2024. URL https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html.
299 5 114 A. W. Thomas, R. Parnichkun, A. Amini, S. Massaroli, and M. Poli. STAR: Synthesis of tailored architectures. arXiv, 2411.17800, 2024.
300 5 115 K. Tunyasuvunakool, J. Adler, Z. Wu, T. Green, M. Zielinski, A. Žídek, A. Bridgland, A. Cowie, C. Meyer, A. Laydon, et al. Highly accurate protein structure prediction for the human proteome. Nature, 596(7873): 590–596, 2021.
301 5 116 A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. Gomez, L. Kaiser, and I. Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 2017.
302 5 117 I. E. Vorontsov, I. A. Eliseeva, A. Zinkevich, M. Nikonov, S. Abramov, A. Boytsov, V. Kamenets, A. Kasianova, S. Kolmykov, I. S. Yevshin, A. Favorov, Y. A. Medvedeva, A. Jolma, F. Kolpakov, V. J. Makeev, and I. V. Kulakovskiy. HOCOMOCO in 2024: a rebuild of the curated collection of binding models for human and mouse transcription factors. Nucleic Acids Research, 52(D1):D154–D163, November 2023. ISSN 0305-1048. doi: 10.1093/nar/gkad1077. URL https://doi.org/10.1093/nar/gkad1077.
303 5 118 N. Wang, J. Bian, Y. Li, X. Li, S. Mumtaz, L. Kong, and H. Xiong. Multi-purpose RNA language modelling with motif-aware pretraining and type-guided fine-tuning. Nature Machine Intelligence, pages 1–10, 2024.
304 5 119 J. Watson and F. Crick. Molecular structure of nucleic acids: A structure for deoxyribose nucleic acid. Nature, 171:737–738, 4 1953. ISSN 0028-0836. doi: 10.1038/171737a0.
305 5 120 W. Xiong, J. Liu, I. Molybog, H. Zhang, P. Bhargava, R. Hou, L. Martin, R. Rungta, K. A. Sankararaman, B. Oguz, et al. Effective long-context scaling of foundation models. arXiv preprint arXiv:2309.16039, 2023.
306 5 121 K. K. Yang, N. Fusi, and A. X. Lu. Convolutions are competitive with transformers for protein sequence pretraining. Cell Systems, 15(3):286–294, 2024.
307 5 122 N. D. Youngblut, J. de la Cuesta-Zuluaga, G. H. Reischer, S. Dauser, N. Schuster, C. Walzer, G. Stalder, A. H. Farnleitner, and R. E. Ley. Large-scale metagenome assembly reveals novel animal-associated microbial genomes, biosynthetic gene clusters, and other genetic diversity, 2020.
308 5 123 R. Zhang. DEG: a database of essential genes. Nucleic Acids Research, 32:D271–D272, 1 2004. ISSN 1362-4962. doi: 10.1093/nar/gkh024.
309 5 124 Z. Zhang, H. K. Wayment-Steele, G. Brixi, H. Wang, D. Kern, and S. Ovchinnikov. Protein language models learn evolutionary statistics of interacting sequence motifs. Proceedings of the National Academy of Sciences, 121(45):e2406285121, 2024.
310 5 125 M. Zvyagin, A. Brace, K. Hippe, Y. Deng, B. Zhang, C. O. Bohorquez, A. Clyde, B. Kale, D. Perez-Rivera, H. Ma, C. M. Mann, M. Irvin, D. G. Ozgulbas, N. Vassilieva, J. G. Pauloski, L. Ward, V. Hayot-Sasson, M. Emani, S. Foreman, Z. Xie, D. Lin, M. Shukla, W. Nie, J. Romero, C. Dallago, A. Vahdat, C. Xiao, T. Gibbs, I. Foster, J. J. Davis, M. E. Papka, T. Brettin, R. Stevens, A. Anandkumar, V. Vishwanath, and A. Ramanathan. GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics. The International Journal of High Performance Computing Applications, 37:683–705, 11 2023. ISSN 1094-3420. doi: 10.1177/10943420231201154.
311 ] W. Brown. Granular format rewards for eliciting mathematical reasoning capabilities in small language models. https://gist.github.com/willccbb/4676755236bb08cab5f4e54a0475d6fb. GitHub Gist. https://qiita.com/kaizen_nagoya/items/98ef27a02a07c4c03d6e
312 7 0 ] S. Chen, L. C. Francioli, J. K. Goodrich, R. L. Collins, M. Kanai, Q. Wang, J. Alföldi, N. A. Watts, C. Vittal, L. D. Gauthier, T. Poterba, M. W. Wilson, Y. Tarasova, W. Phu, R. Grant, M. T. Yohannes, Z. Koenig, Y. Farjoun, E. Banks, S. Donnelly, S. Gabriel, N. Gupta, S. Ferriera, C. Tolonen, S. Novod, L. Bergelson, D. Roazen, V. Ruano-Rubio, M. Covarrubias, C. Llanwarne, N. Petrillo, G. Wade, T. Jeandet, R. Munshi, K. Tibbetts, M. Abreu, C. A. A. Salinas, T. Ahmad, C. M. Albert, D. Ardissino, I. M. Armean, E. G. Atkinson, G. Atzmon, J. Barnard, S. M. Baxter, L. Beaugerie, E. J. Benjamin, D. Benjamin, M. Boehnke, L. L. Bonnycastle, E. P. Bottinger, D. W. Bowden, M. J. Bown, H. Brand, S. Brant, T. Brookings, S. Bryant, S. E. Calvo, H. Campos, J. C. Chambers, J. C. Chan, K. R. Chao, S. Chapman, D. I. Chasman, R. Chisholm, J. Cho, R. Chowdhury, M. K. Chung, W. K. Chung, K. Cibulskis, B. Cohen, K. M. Connolly, A. Correa, B. B. Cummings, D. Dabelea, J. Danesh, D. Darbar, P. Darnowsky, J. Denny, R. Duggirala, J. Dupuis, P. T. Ellinor, R. Elosua, J. Emery, E. England, J. Erdmann, T. Esko, E. Evangelista, D. Fatkin, J. Florez, A. Franke, J. Fu, M. Färkkilä, K. Garimella, J. Gentry, G. Getz, D. C. Glahn, B. Glaser, S. J. Glatt, D. Goldstein, C. Gonzalez, L. Groop, S. Gudmundsson, A. Haessly, C. Haiman, I. Hall, C. L. Hanis, M. Harms, M. Hiltunen, M. M. Holi, C. M. Hultman, C. Jalas, M. Kallela, D. Kaplan, J. Kaprio, S. Kathiresan, E. E. Kenny, B. J. Kim, Y. J. Kim, D. King, G. Kirov, J. Kooner, S. Koskinen, H. M. Krumholz, S. Kugathasan, S. H. Kwak, M. Laakso, N. Lake, T. Langsford, K. M. Laricchia, T. Lehtimäki, M. Lek, E. Lipscomb, R. J. Loos, W. Lu, S. A. Lubitz, T. T. Luna, R. C. Ma, G. M. Marcus, J. Marrugat, K. M. Mattila, S. McCarroll, M. I. McCarthy, J. L. McCauley, D. McGovern, R. McPherson, J. B. Meigs, O. Melander, A. Metspalu, D. Meyers, E. V. Minikel, B. D. Mitchell, V. K. Mootha, A. Naheed, S. Nazarian, P. M. Nilsson, M. C. O’Donovan, Y. Okada, D. Ongur, L. Orozco, M. J. Owen, C. Palmer, N. D. Palmer, A. Palotie, K. S. Park, C. Pato, A. E. Pulver, D. Rader, N. Rahman, A. Reiner, A. M. Remes, D. Rhodes, S. Rich, J. D. Rioux, S. Ripatti, D. M. Roden, J. I. Rotter, N. Sahakian, D. Saleheen, V. Salomaa, A. Saltzman, N. J. Samani, K. E. Samocha, A. Sanchis-Juan, J. Scharf, M. Schleicher, H. Schunkert, S. Schönherr, E. G. Seaby, S. H. Shah, M. Shand, T. Sharpe, M. B. Shoemaker, T. Shyong, E. K. Silverman, M. Singer-Berk, P. Sklar, J. T. Smith, J. G. Smith, H. Soininen, H. Sokol, R. G. Son, J. Soto, T. Spector, C. Stevens, N. O. Stitziel, P. F. Sullivan, J. Suvisaari, E. S. Tai, K. D. Taylor, Y. Y. Teo, M. Tsuang, T. Tuomi, D. Turner, T. Tusie-Luna, E. Vartiainen, M. Vawter, L. Wang, A. Wang, J. S. Ware, H. Watkins, R. K. Weersma, B. Weisburd, M. Wessman, N. Whiffin, J. G. Wilson, R. J. Xavier, A. O’Donnell-Luria, M. Solomonson, C. Seed, A. R. Martin, M. E. Talkowski, H. L. Rehm, M. J. Daly, G. Tiao, B. M. Neale, D. G. MacArthur, and K. J. Karczewski. A genomic mutational constraint map using variation in 76,156 human genomes. Nature, 625:92–100, 12 2023. ISSN 1476-4687. doi: 10.1038/s41586-023-06045-0. URL https://www.nature.com/articles/s41586-023-06045-0. https://qiita.com/kaizen_nagoya/items/e799ad85ee98bb2a8cf6
313 7 1 Short, P. J. et al. De novo mutations in regulatory elements in neurodevelopmental disorders. Nature 555, 611–616 (2018).
314 7 2 Satterstrom, F. K. et al. Large-scale exome sequencing study implicates both developmental and functional changes in the neurobiology of autism. Cell 180, 568–584.e523 (2020).
315 7 3 Singh, T. et al. The contribution of rare variants to risk of schizophrenia in individuals with and without intellectual disability. Nat. Genet. 49, 1167–1173 (2017).
316 7 4 Ganna, A. et al. Quantifying the impact of rare and ultra-rare coding variation across the phenotypic spectrum. Am. J. Hum. Genet. 102, 1204–1211 (2018).
317 7 5 Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
318 7 6 Petrovski, S., Wang, Q., Heinzen, E. L., Allen, A. S. & Goldstein, D. B. Genic intolerance to functional variation and the interpretation of personal genomes. PLoS Genet. 9, e1003709 (2013).
319 7 7 Samocha, K. E. et al. A framework for the interpretation of de novo mutation in human disease. Nat. Genet. 46, 944–950 (2014).
320 7 8 Hindorff, L. A. et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl Acad. Sci. USA 106, 9362–9367 (2009).
321 7 9 Lanyi, J. K. Photochromism of halorhodopsin. cis/trans isomerization of the retinal around the 13–14 double bond. J. Biol. Chem. 261, 14025–14030 (1986).
322 7 10 Mathelier, A., Shi, W. & Wasserman, W. W. Identification of altered cis-regulatory elements in human disease. Trends Genet. 31, 67–76 (2015).
323 7 11 Spielmann, M. & Mundlos, S. Looking beyond the genes: the role of non-coding variants in human disease. Hum. Mol. Genet. 25, R157–R165 (2016).
324 7 12 Zhang, F. & Lupski, J. R. Non-coding genetic variants in human disease. Hum. Mol. Genet. 24, R102–R110 (2015).
325 7 13 Seplyarskiy, V. B. & Sunyaev, S. The origin of human mutation in light of genomic data. Nat. Rev. Genet. 22, 672–686 (2021).
326 7 14 Seplyarskiy, V. B. et al. Population sequencing data reveal a compendium of mutational processes in the human germ line. Science 373, 1030–1035 (2021).
327 7 15 Gussow, A. B. et al. Orion: Detecting regions of the human non-coding genome that are intolerant to variation using population genetics. PLoS ONE 12, e0181604 (2017).
328 7 16 di Iulio, J. et al. The human noncoding genome defined by genetic diversity. Nat. Genet. 50, 333–337 (2018).
329 7 17 Halldorsson, B. V. et al. The sequences of 150,119 genomes in the UK Biobank. Nature 607, 732–740 (2022).
330 7 18 Ritchie, G. et al. Functional annotation of noncoding sequence variants. Nat. Methods 11, 294–296 (2014).
331 7 19 Vitsios, D., Dhindsa, R. S., Middleton, L., Gussow, A. B. & Petrovski, S. Prioritizing non-coding regions based on human genomic constraint and sequence context with deep learning. Nat. Commun. 12, 1504 (2021).
332 7 20 Siepel, A. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005).
333 7 21 Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R. & Siepel, A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 20, 110–121 (2010).
334 7 22 Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
335 7 23 Halldorsson, B. V. et al. Characterizing mutagenic effects of recombination through a sequence-level genetic map. Science 363, eaau1043 (2019).
336 7 24 An, J. Y. et al. Genome-wide de novo risk score implicates promoter variation in autism spectrum disorder. Science 362, eaat6576 (2018).
337 7 25 Collins, R. L. et al. A structural variation reference for medical and population genetics. Nature 581, 444–451 (2020).
338 7 26 The ENCODE Project Consortium. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).
339 7 27 Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
340 7 28 Jiang, Y. et al. SEdb: a comprehensive human super-enhancer database. Nucleic Acids Res. 47, D235–D243 (2019).
341 7 29 Pott, S. & Lieb, J. D. What are super-enhancers? Nat. Genet. 47, 8–12 (2015).
342 7 30 Bartel, D. P. Metazoan microRNAs. Cell 173, 20–51 (2018).
343 7 31 Welter, D. et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 42, D1001–D1006 (2014).
344 7 32 Kanai, M. et al. Insights from complex trait fine-mapping across diverse populations. Preprint at medRxiv https://doi.org/10.1101/2021.09.03.21262975 (2021).
345 7 33 Jung, R. G. et al. Association between plasminogen activator inhibitor-1 and cardiovascular events: a systematic review and meta-analysis. Thromb. J. 16, 12 (2018).
346 7 34 Song, C., Burgess, S., Eicher, J. D., O’Donnell, C. J. & Johnson, A. D. Causal effect of plasminogen activator inhibitor type 1 on coronary heart disease. J. Am. Heart Assoc. 6, e004918 (2017).
347 7 35 Schaefer, A. S. et al. Genetic evidence for PLASMINOGEN as a shared genetic risk factor of coronary artery disease and periodontitis. Circ. Cardiovasc. Genet. 8, 159–167 (2015).
348 7 36 Li, Y. Y. Plasminogen activator inhibitor-1 4G/5G gene polymorphism and coronary artery disease in the Chinese Han population: a meta-analysis. PLoS ONE 7, e33511 (2012).
349 7 37 Drinane, M. C., Sherman, J. A., Hall, A. E., Simons, M. & Mulligan-Kehoe, M. J. Plasminogen and plasmin activity in patients with coronary artery disease. J. Thromb. Haemost. 4, 1288–1295 (2006).
350 7 38 Lowe, G. D. et al. Tissue plasminogen activator antigen and coronary heart disease. Prospective study and meta-analysis. Eur. Heart J. 25, 252–259 (2004).
351 7 39 Wang, Q. S. et al. Leveraging supervised learning for functionally informed fine-mapping of cis-eQTLs identifies an additional 20,913 putative causal eQTLs. Nat. Commun. 12, 3394 (2021).
352 7 40 Landrum, M. J. et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067 (2018).
353 7 41 Stenson, P. D. et al. Human Gene Mutation Database (HGMD): 2003 update. Hum. Mutat. 21, 577–581 (2003).
354 7 42 Davydov, E. V. et al. Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput. Biol. 6, e1001025 (2010).
355 7 43 Greenway, S. C. et al. De novo copy number variants identify new genes and loci in isolated sporadic tetralogy of Fallot. Nat. Genet. 41, 931–935 (2009).
356 7 44 Mefford, H. C. et al. Recurrent reciprocal genomic rearrangements of 17q12 are associated with renal disease, diabetes, and epilepsy. Am. J. Hum. Genet. 81, 1057–1069 (2007).
357 7 45 Sebat, J. et al. Strong association of de novo copy number mutations with autism. Science 316, 445–449 (2007).
358 7 46 Stefansson, H. et al. Large recurrent microdeletions associated with schizophrenia. Nature 455, 232–236 (2008).
359 7 47 Walsh, T. et al. Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science 320, 539–543 (2008).
360 7 48 Wright, C. F. et al. Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data. Lancet 385, 1305–1314 (2015).
361 7 49 Spielmann, M., Lupianez, D. G. & Mundlos, S. Structural variation in the 3D genome. Nat. Rev. Genet. 19, 453–467 (2018).
362 7 50 Spielmann, M. & Mundlos, S. Structural variations, the regulatory landscape of the genome and their alteration in human disease. Bioessays 35, 533–543 (2013).
363 7 51 Coe, B. P. et al. Refining analyses of copy number variation identifies specific genes associated with developmental delay. Nat. Genet. 46, 1063–1071 (2014).
364 7 52 Cooper, G. M. et al. A copy number variation morbidity map of developmental delay. Nat. Genet. 43, 838–846 (2011).
365 7 53 Klopocki, E. et al. Copy-number variations involving the IHH locus are associated with syndactyly and craniosynostosis. Am. J. Hum. Genet. 88, 70–75 (2011).
366 7 54 Barroso, E. et al. Identification of the fourth duplication of upstream IHH regulatory elements, in a family with craniosynostosis Philadelphia type, helps to define the phenotypic characterization of these regulatory elements. Am. J. Med. Genet. A 167A, 902–906 (2015).
367 7 55 Will, A. J. et al. Composition and dosage of a multipartite enhancer cluster control developmental expression of Ihh (Indian hedgehog). Nat. Genet. 49, 1539–1545 (2017).
368 7 56 Roadmap Epigenomics Consortium. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
369 7 57 Rehm, H. L. et al. ClinGen—the Clinical Genome Resource. N. Engl. J. Med. 372, 2235–2242 (2015).
370 7 58 Blake, J. A. et al. The Mouse Genome Database (MGD): premier model organism resource for mammalian genomics and genetics. Nucleic Acids Res. 39, D842–D848 (2011).
371 7 59 McKusick, V. A. Mendelian Inheritance in Man and its online version, OMIM. Am. J. Hum. Genet. 80, 588–604 (2007).
372 7 60 GTEx Consortium. The Genotype–Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
373 7 61 Xu, H. et al. Elevated ASCL2 expression in breast cancer is associated with the poor prognosis of patients. Am. J. Cancer Res. 7, 955–961 (2017).
374 7 62 Jubb, A. M. et al. Achaete-scute like 2 (ascl2) is a target of Wnt signalling and is upregulated in intestinal neoplasia. Oncogene 25, 3445–3457 (2006).
375 7 63 Tian, Y. et al. MicroRNA-200 (miR-200) cluster regulation by achaete scute-like 2 (Ascl2): impact on the epithelial-mesenchymal transition in colon cancer cells. J. Biol. Chem. 289, 36101–36115 (2014).
376 7 64 Guo, M. H. et al. Inferring compound heterozygosity from large-scale exome sequencing data. Nat. Genet. https://doi.org/10.1038/s41588-023-01608-3 (2023).
377 7 65 Zhu, P. et al. Single-cell DNA methylome sequencing of human preimplantation embryos. Nat. Genet. 50, 12–19 (2018).
378 7 66 Tang, W. W. et al. A unique gene regulatory network resets the human germline epigenome for development. Cell 161, 1453–1467 (2015).
379 7 67 Ross, D. A., Lim, J., Lin, R.-S. & Yang, M.-H. Incremental learning for robust visual tracking. Int. J. Comput. Vision 77, 125–141 (2008).
380 7 68 Karolchik, D. et al. The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 32, D493–D496 (2004).
381 7 69 Li, H. Toward better understanding of artifacts in variant calling from high-coverage samples. Bioinformatics 30, 2843–2851 (2014).
382 7 70 Davis, C. A. et al. The Encyclopedia of DNA elements (ENCODE): data portal update. Nucleic Acids Res. 46, D794–D801 (2018).
383 7 71 Goldmann, J. M. et al. Germline de novo mutation clusters arise during oocyte aging in genomic regions with high double-strand-break incidence. Nat. Genet. 50, 487–492 (2018).
384 7 72 Zhao, H. et al. CrossMap: a versatile tool for coordinate conversion between genome assemblies. Bioinformatics 30, 1006–1007 (2014).
385 7 73 Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
386 7 74 Kent, W. J., Zweig, A. S., Barber, G., Hinrichs, A. S. & Karolchik, D. BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics 26, 2204–2207 (2010).
387 7 75 Koenig, Z. et al. A harmonized public resource of deeply sequenced diverse human genomes. Preprint at bioRxiv https://doi.org/10.1101/2023.01.23.525248 (2023).
388 7 76 Harrow, J. et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22, 1760–1774 (2012).
389 7 77 Hon, C. C. et al. An atlas of human long non-coding RNAs with accurate 5′ ends. Nature 543, 199–204 (2017).
390 7 78 Wang, G., Sarkar, A., Carbonetto, P. & Stephens, M. A simple new approach to variable selection in regression, with application to genetic fine-mapping. J. R. Stat. Soc. B 82, 1273–1300 (2020).
391 7 79 Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021).
392 7 80 Budescu, D. V. Dominance analysis: a new approach to the problem of relative importance of predictors in multiple regression. Psych. Bull. 114, 542 (1993).
393 7 81 Azen, R. & Budescu, D. V. The dominance analysis approach for comparing predictors in multiple regression. Psych. Methods 8, 129 (2003).
394 7 82 Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011).
395 7 83 Liu, Y., Sarkar, A., Kheradpour, P., Ernst, J. & Kellis, M. Evidence of reduced recombination rate in human regulatory domains. Genome Biol. 18, 193 (2017).
396 7 84 Robin, X. et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12, 1–8 (2011).
397 7 85 Bergstrom, A. et al. Insights into human genetic variation and population history from 929 diverse genomes. Science 367, eaay5012 (2020).
398 7 86 The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
399 8 0 ] M. E. Consens, B. Li, A. R. Poetsch, and S. Gilbert. Genomic language models could transform medicine but not yet. npj Digital Medicine, 8:1–4, 12 2025. ISSN 2398-6352. doi: 10.1038/s41746-025-01603-4. URL https://www.nature.com/articles/s41746-025-01603-4. https://qiita.com/kaizen_nagoya/items/f797330e64e0c7d05f39
400 8 1 Brixi, G. et al. Genome modeling and design across all domains of life with Evo 2. 2025.02.18.638918. Preprint at https://doi.org/10.1101/2025.02.18.638918 (2025).
401 8 2 Callaway, E. Biggest-ever AI biology model writes DNA on demand. Nature 638, 868–869 (2025).
402 8 3 Park, E. G. et al. Genomic analyses of non-coding RNAs overlapping transposable elements and its implication to human diseases. Int. J. Mol. Sci. 23, 8950 (2022).
403 8 4 Alonso, M.E., Pernaute, B., Crespo, M., Gómez-Skarmeta, J.L. & Manzanares, M. Understanding the regulatory genome. Int. J. Develop. Biol. https://ijdb.ehu.eus/article/072428ma (2009).
404 8 5 Consens, M. E. et al. Transformers and genome language models. Nat. Machine Intell. https://www.nature.com/articles/s42256-025-01007-9 (2025).
405 8 6 Dalla-Torre, H. et al. Nucleotide transformer: building and evaluating robust foundation models for human genomics. Nat. Methods 22, 287–297 (2025).
406 8 7 Nguyen, E. et al. Sequence modeling and design from molecular to genome scale with Evo. Science 386, eado9336 (2024).
407 8 8 Comfort, N. Genetics: we are the 98%. Nature 520, 615–616 (2015).
408 8 9 Benegas, G., Albors, C., Aw, A.J. et al. A DNA language model based on multispecies alignment predicts the effects of genome-wide variants. Nat Biotechnol. https://doi.org/10.1038/s41587-024-02511-w (2025).
409 8 10 National Institutes of Health (US) & Biological Sciences Curriculum Study. Understanding Human Genetic Variation. In NIH Curriculum Supplement Series [Internet] (National Institutes of Health, 2007).
410 8 11 Gresova, K., Martinek, V., Cechak, D., Simecek, P. & Alexiou, P. Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genomic Data 24, 25 (2023).
411 8 12 Marin, F. I. et al. BEND: Benchmarking DNA language models on biologically meaningful tasks. Preprint at https://doi.org/10.48550/arXiv.2311.12570 (2024).
412 8 13 Ji, Y., Zhou, Z., Liu, H. & Davuluri, R. V. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120 (2021).
413 8 14 Sanabria, M., Hirsch, J. & Poetsch, A. R. Distinguishing word identity and sequence context in DNA language models. BMC Bioinformatics 25, 301 (2024).
414 8 15 Sanabria, M., Hirsch, J., Joubert, P. M. & Poetsch, A. R. DNA language model GROVER learns sequence context in the human genome. Nat. Mach. Intell. 6, 911–923 (2024).
415 8 16 Bloomfield, D. et al. AI and biosecurity: the need for governance. Science 385, 831–833 (2024).
416 8 17 Riedemann, L., Labonne, M. & Gilbert, S. The path forward for large language models in medicine is open. Npj Digit. Med. 7, 1–5 (2024).
417 8 18 Poetsch, A. R. KI-Modell analysiert und generiert DNA-Strukturen [AI model analyzes and generates DNA structures] (SMC, 2024).
418 8 19 Derraz, B. et al. New regulatory thinking is needed for AI-based personalised drug and cell therapies in precision oncology. Npj Precis. Oncol. 8, 1–11 (2024).
419 8 20 Harishbhai Tilala, M. et al. Ethical considerations in the use of artificial intelligence and machine learning in health care: a comprehensive review. Cureus 16, e62443 (2024).
420 8 21 Zhou, Z. et al. DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome. (2023).
421 8 22 Nguyen, E. et al. HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution. (2023).
422 9 0 ] H. Dalla-Torre, L. Gonzalez, J. Mendoza-Revilla, N. L. Carranza, A. H. Grzywaczewski, F. Oteri, C. Dallago, E. Trop, B. P. de Almeida, H. Sirelkhatim, G. Richard, M. Skwark, K. Beguir, M. Lopez, and T. Pierrot. Nucleotide transformer: building and evaluating robust foundation models for human genomics. Nature Methods, 22:287–297, 2 2025. ISSN 1548-7105. doi: 10.1038/s41592-024-02523-z. URL https://www.nature.com/articles/s41592-024-02523-z. https://qiita.com/kaizen_nagoya/items/1c147c2b095364f04ef7
423 9 1 Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. Bert: pre-training of deep bidirectional transformers for language understanding. Preprint at https://arxiv.org/abs/1810.04805 (2018).
424 9 2 Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).
425 9 3 Jumper, J. et al. Highly accurate protein structure prediction with alphafold. Nature 596, 583–589 (2021).
426 9 4 Rives, A. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl Acad. Sci. USA 118, e2016239118 (2021).
427 9 5 Elnaggar, A. et al. ProtTrans: towards cracking the language of life’s code through self-supervised deep learning and high performance computing. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7112–7127 (2022).
428 9 6 Littmann, M., Heinzinger, M., Dallago, C., Olenyi, T. & Rost, B. Embeddings from deep learning transfer go annotations beyond homology. Sci. Rep. 11, 1–14 (2021).
429 9 7 Marquet, C. et al. Embeddings from protein language models predict conservation and variant effects. Hum. Genet. 141, 1629–1647 (2022).
430 9 8 Littmann, M., Heinzinger, M., Dallago, C., Weissenow, K. & Rost, B. Protein embeddings and deep learning predict binding residues for various ligand classes. Sci. Rep. 11, 23916 (2021).
431 9 9 Avsec, Ž. et al. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat. Genet. 53, 354–366 (2021).
432 9 10 Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning–based sequence model. Nat. Methods 12, 931–934 (2015).
433 9 11 Mateo, L. J., Sinnott-Armstrong, N. & Boettiger, A. N. Tracing dna paths and rna profiles in cultured cells and tissues with orca. Nat. Protoc. 16, 1647–1713 (2021).
434 9 12 de Almeida, B. P., Reiter, F., Pagani, M. & Stark, A. DeepSTARR predicts enhancer activity from dna sequence and enables the de novo design of synthetic enhancers. Nat. Genet. 54, 613–624 (2022).
435 9 13 Eraslan, G., Avsec, Ž., Gagneur, J. & Theis, F. J. Deep learning: new computational modelling techniques for genomics. Nat. Rev. Genet. 20, 389–403 (2019).
436 9 14 Zhou, J. et al. Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk. Nat. Genet. 50, 1171–1179 (2018).
437 9 15 Kelley, D. R. Cross-species regulatory sequence activity prediction. PLOS Comput. Biol. 16, e1008050 (2020).
438 9 16 Kelley, D. R. et al. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 28, 739–750 (2018).
439 9 17 Agarwal, V. & Shendure, J. Predicting mRNA abundance directly from genomic sequence using deep convolutional neural networks. Cell Rep. 31, 107663 (2020).
440 9 18 Chen, K. M., Wong, A. K., Troyanskaya, O. G. & Zhou, J. A sequence-based global map of regulatory activity for deciphering human genetics. Nat. Genet. 54, 940–949 (2022).
441 9 19 Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).
442 9 20 Ji, Y., Zhou, Z., Liu, H. & Davuluri, R. V. Dnabert: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120 (2021).
443 9 21 Zvyagin, M. T. et al. Genslms: genome-scale language models reveal SARS-CoV-2 evolutionary dynamics. Preprint at bioRxiv https://doi.org/10.1101/2022.10.10.511571 (2022).
444 9 22 Outeiral, C. & Deane, C. M. Codon language embeddings provide strong signals for protein engineering. Nat. Mach. Intell. 6, 170–179 (2024).
445 9 23 Zhou, Z. et al. Dnabert-2: efficient foundation model and benchmark for multi-species genome. in Proceedings of the Twelfth International Conference on Learning Representations https://openreview.net/pdf?id=oMLQB4EZE1 (ICLR, 2024).
446 9 24 Fishman, V. et al. Gena-lm: A family of open-source foundational models for long dna sequences. Preprint at bioRxiv https://doi.org/10.1101/2023.06.12.544594 (2023).
447 9 25 Nguyen, E. et al. Hyenadna: Long-range genomic sequence modeling at single nucleotide resolution. in 37th Conference on Neural Information Processing Systems https://openreview.net/pdf?id=ubzNoJjOKj (NeurIPS, 2023).
448 9 26 Mendoza-Revilla, J. et al. A foundational large language model for edible plant genomes. Commun. Biol. 7, 835 (2024).
449 9 27 Rae, J. W. et al. Scaling language models: methods, analysis & insights from training gopher. Preprint at https://arxiv.org/abs/2112.11446 (2021).
450 9 28 The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68 (2015).
451 9 29 Harrow, J. et al. GENCODE: the reference human genome annotation for the encode project. Genome Res. 22, 1760–1774 (2012).
452 9 30 Meylan, P., Dreos, R., Ambrosini, G., Groux, R. & Bucher, P. Epd in 2020: enhanced data visualization and extension to ncRNA promoters. Nucleic Acids Res. 48, D65–D69 (2020).
453 9 31 ENCODE. An integrated encyclopedia of dna elements in the human genome. Nature 489, 57–74 (2012).
454 9 32 The ENCODE Project Consortium. Expanded encyclopaedias of dna elements in the human and mouse genomes. Nature 583, 699–710 (2020).
455 9 33 Meuleman, W. et al. Index and biological spectrum of human DNase I hypersensitive sites. Nature 584, 244–251 (2020).
456 9 34 Li, F.-Z., Amini, A. P., Yang, K. K. & Lu, A. X. Pretrained protein language model transfer learning: is the final layer representation what we want? in Machine Learning for Structural Biology Workshop (NeurIPS, 2022).
457 9 35 Liu, H. et al. Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning. in 36th Conference on Neural Information Processing Systems (NeurIPS, 2022).
458 9 36 Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548 (2019).
459 9 37 Benegas, G., Batra, S. S. & Song, Y. S. DNA language models are powerful zero-shot predictors of non-coding variant effects. Proc. Natl Acad. Sci. USA 120, e2311219120 (2023).
460 9 38 Vig, J. et al. BERTology meets biology: interpreting attention in protein language models. in Proceedings of the International Conference on Learning Representations 2021 https://openreview.net/pdf?id=YWtLZvLmud7 (ICLR, 2021).
461 9 39 Braun, S. et al. Decoding a cancer-relevant splicing decision in the ron proto-oncogene using high-throughput mutagenesis. Nat. Commun. 9, 3315 (2018).
462 9 40 McLaren, W. et al. The Ensembl variant effect predictor. Genome Biol. 17, 1–14 (2016).
463 9 41 Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).
464 9 42 GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
465 9 43 Võsa, U. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat. Genet. 53, 1300–1310 (2021).
466 9 44 Chowdhery, A. et al. PaLM: scaling language modeling with pathways. J. Mach. Learn. Res. 24, 11324–11436 (2021).
467 9 45 Hoffmann, J. et al. Training compute-optimal large language models. in 36th Conference on Neural Information Processing Systems https://proceedings.neurips.cc/paper_files/paper/2022/file/c1e2faff6f588870935f114ebe04a3e5-Paper-Conference.pdf (NeurIPS, 2022).
468 9 46 Rogers, A., Kovaleva, O. & Rumshisky, A. A primer in BERTology: what we know about how bert works. Trans. Assoc. Comput. Linguist. 8, 842–866 (2020).
469 9 47 Stärk, H., Dallago, C., Heinzinger, M. & Rost, B. Light attention predicts protein location from the language of life. Bioinform. Adv. 1, vbab035 (2021).
470 9 48 Zou, J. et al. A primer on deep learning in genomics. Nat. Genetics 51, 12–18 (2019).
471 9 49 Wang, A. et al. Superglue: a stickier benchmark for general-purpose language understanding systems. in 33rd Conference on Neural Information Processing Systems https://papers.nips.cc/paper_files/paper/2019/file/4496bf24afe7fab6f046bf4923da8de6-Paper.pdf (NeurIPS, 2019).
472 9 50 Hendrycks, D. & Gimpel, K. Gaussian error linear units (gelus). Preprint at https://arxiv.org/abs/1606.08415 (2016).
473 9 51 Su, J. et al. Roformer: enhanced transformer with rotary position embedding. Neurocomputing 568, 127063 (2024).
474 9 52 Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980v5 (2015).
475 9 53 Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
476 9 54 Bergstra, J., Bardenet, R., Bengio, Y. & Kégl, B. Algorithms for hyper-parameter optimization. in Advances in Neural Information Processing Systems 24 https://papers.nips.cc/paper_files/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf (NeurIPS, 2011).
477 9 55 Byrska-Bishop, M. et al. High-coverage whole-genome sequencing of the expanded 1000 genomes project cohort including 602 trios. Cell 185, 3426–3440 (2022).
478 9 56 Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
479 9 57 Leslie, R., O’Donnell, C. J. & Johnson, A. D. GRASP: analysis of genotype–phenotype results from 1390 genome-wide association studies and corresponding open access database. Bioinformatics 30, i185–i194 (2014).
480 9 58 Landrum, M. J. et al. Clinvar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067 (2018).
481 9 59 Stenson, P. D. et al. The Human Gene Mutation Database (HGMD®): optimizing its use in a clinical diagnostic or research setting. Hum. Genet. 139, 1197–1207 (2020).
482 10 0 H. Dalla-Torre, L. Gonzalez, J. Mendoza-Revilla, N. Lopez Carranza, A. H. Grzywaczewski, F. Oteri, C. Dallago, E. Trop, B. P. De Almeida, H. Sirelkhatim, G. Richard, M. Skwark, K. Beguir, M. Lopez, and T. Pierrot. Nucleotide transformer: building and evaluating robust foundation models for human genomics. Nature Methods, 22(2):287–297, Feb. 2025. ISSN 1548-7091, 1548-7105. doi: 10.1038/s41592-024-02523-z. URL https://www.nature.com/articles/s41592-024-02523-z. https://qiita.com/kaizen_nagoya/items/fe607a3aaf7ffb309d33
483 10 1 Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. Bert: pre-training of deep bidirectional transformers for language understanding. Preprint at https://arxiv.org/abs/1810.04805 (2018).
484 10 2 Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).
485 10 3 Jumper, J. et al. Highly accurate protein structure prediction with alphafold. Nature 596, 583–589 (2021).
486 10 4 Rives, A. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl Acad. Sci. USA 118, e2016239118 (2021).
487 10 5 Elnaggar, A. et al. ProtTrans: towards cracking the language of life’s code through self-supervised deep learning and high performance computing. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7112–7127 (2022).
488 10 6 Littmann, M., Heinzinger, M., Dallago, C., Olenyi, T. & Rost, B. Embeddings from deep learning transfer go annotations beyond homology. Sci. Rep. 11, 1–14 (2021).
489 10 7 Marquet, C. et al. Embeddings from protein language models predict conservation and variant effects. Hum. Genet. 141, 1629–1647 (2022).
490 10 8 Littmann, M., Heinzinger, M., Dallago, C., Weissenow, K. & Rost, B. Protein embeddings and deep learning predict binding residues for various ligand classes. Sci. Rep. 11, 23916 (2021).
491 10 9 Avsec, Ž. et al. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat. Genet. 53, 354–366 (2021).
492 10 10 Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning–based sequence model. Nat. Methods 12, 931–934 (2015).
493 10 11 Mateo, L. J., Sinnott-Armstrong, N. & Boettiger, A. N. Tracing dna paths and rna profiles in cultured cells and tissues with orca. Nat. Protoc. 16, 1647–1713 (2021).
494 10 12 de Almeida, B. P., Reiter, F., Pagani, M. & Stark, A. DeepSTARR predicts enhancer activity from dna sequence and enables the de novo design of synthetic enhancers. Nat. Genet. 54, 613–624 (2022).
495 10 13 Eraslan, G., Avsec, Ž., Gagneur, J. & Theis, F. J. Deep learning: new computational modelling techniques for genomics. Nat. Rev. Genet. 20, 389–403 (2019).
496 10 14 Zhou, J. et al. Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk. Nat. Genet. 50, 1171–1179 (2018).
497 10 15 Kelley, D. R. Cross-species regulatory sequence activity prediction. PLOS Comput. Biol. 16, e1008050 (2020).
498 10 16 Kelley, D. R. et al. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 28, 739–750 (2018).
499 10 17 Agarwal, V. & Shendure, J. Predicting mRNA abundance directly from genomic sequence using deep convolutional neural networks. Cell Rep. 31, 107663 (2020).
500 10 18 Chen, K. M., Wong, A. K., Troyanskaya, O. G. & Zhou, J. A sequence-based global map of regulatory activity for deciphering human genetics. Nat. Genet. 54, 940–949 (2022).
501 10 19 Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).
502 10 20 Ji, Y., Zhou, Z., Liu, H. & Davuluri, R. V. Dnabert: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120 (2021).
503 10 21 Zvyagin, M. T. et al. Genslms: genome-scale language models reveal SARS-CoV-2 evolutionary dynamics. Preprint at bioRxiv https://doi.org/10.1101/2022.10.10.511571 (2022).
504 10 22 Outeiral, C. & Deane, C. M. Codon language embeddings provide strong signals for protein engineering. Nat. Mach. Intell. 6, 170–179 (2024).
505 10 23 Zhou, Z. et al. Dnabert-2: efficient foundation model and benchmark for multi-species genome. in Proceedings of the Twelfth International Conference on Learning Representations https://openreview.net/pdf?id=oMLQB4EZE1 (ICLR, 2024).
506 10 24 Fishman, V. et al. Gena-lm: A family of open-source foundational models for long dna sequences. Preprint at bioRxiv https://doi.org/10.1101/2023.06.12.544594 (2023).
507 10 25 Nguyen, E. et al. Hyenadna: Long-range genomic sequence modeling at single nucleotide resolution. in 37th Conference on Neural Information Processing Systems https://openreview.net/pdf?id=ubzNoJjOKj (NeurIPS, 2023).
508 10 26 Mendoza-Revilla, J. et al. A foundational large language model for edible plant genomes. Commun. Biol. 7, 835 (2024).
509 10 27 Rae, J. W. et al. Scaling language models: methods, analysis & insights from training gopher. Preprint at https://arxiv.org/abs/2112.11446 (2021).
510 10 28 The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
511 10 29 Harrow, J. et al. GENCODE: the reference human genome annotation for the ENCODE project. Genome Res. 22, 1760–1774 (2012).
512 10 30 Meylan, P., Dreos, R., Ambrosini, G., Groux, R. & Bucher, P. EPD in 2020: enhanced data visualization and extension to ncRNA promoters. Nucleic Acids Res. 48, D65–D69 (2020).
513 10 31 The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
514 10 32 The ENCODE Project Consortium. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).
515 10 33 Meuleman, W. et al. Index and biological spectrum of human DNase I hypersensitive sites. Nature 584, 244–251 (2020).
516 10 34 Li, F.-Z., Amini, A. P., Yang, K. K. & Lu, A. X. Pretrained protein language model transfer learning: is the final layer representation what we want? in Machine Learning for Structural Biology Workshop (NeurIPS, 2022).
517 10 35 Liu, H. et al. Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning. in 36th Conference on Neural Information Processing Systems (NeurIPS, 2022).
518 10 36 Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548 (2019).
519 10 37 Benegas, G., Batra, S. S. & Song, Y. S. DNA language models are powerful zero-shot predictors of non-coding variant effects. Proc. Natl Acad. Sci. USA 120, e2311219120 (2023).
520 10 38 Vig, J. et al. BERTology meets biology: interpreting attention in protein language models. in Proceedings of the International Conference on Learning Representations 2021 https://openreview.net/pdf?id=YWtLZvLmud7 (ICLR, 2021).
521 10 39 Braun, S. et al. Decoding a cancer-relevant splicing decision in the RON proto-oncogene using high-throughput mutagenesis. Nat. Commun. 9, 3315 (2018).
522 10 40 McLaren, W. et al. The Ensembl variant effect predictor. Genome Biol. 17, 1–14 (2016).
523 10 41 Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).
524 10 42 The GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
525 10 43 Võsa, U. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat. Genet. 53, 1300–1310 (2021).
526 10 44 Chowdhery, A. et al. PaLM: scaling language modeling with pathways. J. Mach. Learn. Res. 24, 1–113 (2023).
527 10 45 Hoffmann, J. et al. Training compute-optimal large language models. in 36th Conference on Neural Information Processing Systems https://proceedings.neurips.cc/paper_files/paper/2022/file/c1e2faff6f588870935f114ebe04a3e5-Paper-Conference.pdf (NeurIPS, 2022).
528 10 46 Rogers, A., Kovaleva, O. & Rumshisky, A. A primer in BERTology: what we know about how BERT works. Trans. Assoc. Comput. Linguist. 8, 842–866 (2020).
529 10 47 Stärk, H., Dallago, C., Heinzinger, M. & Rost, B. Light attention predicts protein location from the language of life. Bioinform. Adv. 1, vbab035 (2021).
530 10 48 Zou, J. et al. A primer on deep learning in genomics. Nat. Genet. 51, 12–18 (2019).
531 10 49 Wang, A. et al. SuperGLUE: a stickier benchmark for general-purpose language understanding systems. in 33rd Conference on Neural Information Processing Systems https://papers.nips.cc/paper_files/paper/2019/file/4496bf24afe7fab6f046bf4923da8de6-Paper.pdf (NeurIPS, 2019).
532 10 50 Hendrycks, D. & Gimpel, K. Gaussian error linear units (GELUs). Preprint at https://arxiv.org/abs/1606.08415 (2016).
533 10 51 Su, J. et al. RoFormer: enhanced transformer with rotary position embedding. Neurocomputing 568, 127063 (2024).
534 10 52 Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980v5 (2015).
535 10 53 Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
536 10 54 Bergstra, J., Bardenet, R., Bengio, Y. & Kégl, B. Algorithms for hyper-parameter optimization. in Advances in Neural Information Processing Systems 24 https://papers.nips.cc/paper_files/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf (NeurIPS, 2011).
537 10 55 Byrska-Bishop, M. et al. High-coverage whole-genome sequencing of the expanded 1000 genomes project cohort including 602 trios. Cell 185, 3426–3440 (2022).
538 10 56 Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
539 10 57 Leslie, R., O’Donnell, C. J. & Johnson, A. D. GRASP: analysis of genotype–phenotype results from 1390 genome-wide association studies and corresponding open access database. Bioinformatics 30, i185–i194 (2014).
540 10 58 Landrum, M. J. et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067 (2018).
541 10 59 Stenson, P. D. et al. The Human Gene Mutation Database (HGMD®): optimizing its use in a clinical diagnostic or research setting. Hum. Genet. 139, 1197–1207 (2020).
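Several entries in the block above (refs 34, 35, and 53) concern how frozen language-model representations are probed with lightweight downstream classifiers. The following is a minimal, hypothetical sketch of that probing pipeline only: random arrays stand in for real model embeddings, so nothing here reproduces any cited model or dataset.

```python
# Minimal sketch of linear probing on frozen DNA-LM embeddings (assumption:
# embeddings would come from a frozen model such as those in refs 20-26 above;
# here they are random stand-ins so only the probing mechanics are shown).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import matthews_corrcoef

rng = np.random.default_rng(0)

n_seqs, seq_len, hidden = 500, 128, 64          # toy sizes (assumptions)
token_embeddings = rng.normal(size=(n_seqs, seq_len, hidden))
labels = rng.integers(0, 2, size=n_seqs)        # e.g. promoter vs. non-promoter

# Mean-pool over the sequence axis to get one vector per sequence.
pooled = token_embeddings.mean(axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    pooled, labels, test_size=0.2, random_state=0)

# A lightweight classifier trained on top of the frozen representations.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("MCC:", matthews_corrcoef(y_test, probe.predict(X_test)))
```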
542 11 0 DeepSeek-AI, D. Guo, D. Yang, H. Zhang, J. Song, R. Zhang, R. Xu, Q. Zhu, S. Ma, P. Wang, X. Bi, X. Zhang, X. Yu, Y. Wu, Z. F. Wu, Z. Gou, Z. Shao, Z. Li, Z. Gao, A. Liu, B. Xue, B. Wang, B. Wu, B. Feng, C. Lu, C. Zhao, C. Deng, C. Zhang, C. Ruan, D. Dai, D. Chen, D. Ji, E. Li, F. Lin, F. Dai, F. Luo, G. Hao, G. Chen, G. Li, H. Zhang, H. Bao, H. Xu, H. Wang, H. Ding, H. Xin, H. Gao, H. Qu, H. Li, J. Guo, J. Li, J. Wang, J. Chen, J. Yuan, J. Qiu, J. Li, J. L. Cai, J. Ni, J. Liang, J. Chen, K. Dong, K. Hu, K. Gao, K. Guan, K. Huang, K. Yu, L. Wang, L. Zhang, L. Zhao, L. Wang, L. Zhang, L. Xu, L. Xia, M. Zhang, M. Zhang, M. Tang, M. Li, M. Wang, M. Li, N. Tian, P. Huang, P. Zhang, Q. Wang, Q. Chen, Q. Du, R. Ge, R. Zhang, R. Pan, R. Wang, R. J. Chen, R. L. Jin, R. Chen, S. Lu, S. Zhou, S. Chen, S. Ye, S. Wang, S. Yu, S. Zhou, S. Pan, S. S. Li, S. Zhou, S. Wu, S. Ye, T. Yun, T. Pei, T. Sun, T. Wang, W. Zeng, W. Zhao, W. Liu, W. Liang, W. Gao, W. Yu, W. Zhang, W. L. Xiao, W. An, X. Liu, X. Wang, X. Chen, X. Nie, X. Cheng, X. Liu, X. Xie, X. Liu, X. Yang, X. Li, X. Su, X. Lin, X. Q. Li, X. Jin, X. Shen, X. Chen, X. Sun, X. Wang, X. Song, X. Zhou, X. Wang, X. Shan, Y. K. Li, Y. Q. Wang, Y. X. Wei, Y. Zhang, Y. Xu, Y. Li, Y. Zhao, Y. Sun, Y. Wang, Y. Yu, Y. Zhang, Y. Shi, Y. Xiong, Y. He, Y. Piao, Y. Wang, Y. Tan, Y. Ma, Y. Liu, Y. Guo, Y. Ou, Y. Wang, Y. Gong, Y. Zou, Y. He, Y. Xiong, Y. Luo, Y. You, Y. Liu, Y. Zhou, Y. X. Zhu, Y. Xu, Y. Huang, Y. Li, Y. Zheng, Y. Zhu, Y. Ma, Y. Tang, Y. Zha, Y. Yan, Z. Z. Ren, Z. Ren, Z. Sha, Z. Fu, Z. Xu, Z. Xie, Z. Zhang, Z. Hao, Z. Ma, Z. Yan, Z. Wu, Z. Gu, Z. Zhu, Z. Liu, Z. Li, Z. Xie, Z. Song, Z. Pan, Z. Huang, Z. Xu, Z. Zhang, and Z. Zhang. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning. 1 2025. URL https://arxiv.org/pdf/2501.12948. https://qiita.com/kaizen_nagoya/items/bb5ee9f17c03e07659d8
543 11 1 AI@Meta. Llama 3.1 model card, 2024. URL https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md.
544 11 2 Anthropic. Claude 3.5 sonnet, 2024. URL https://www.anthropic.com/news/claude-3-5-sonnet.
545 11 3 M. Chen, J. Tworek, H. Jun, Q. Yuan, H. P. de Oliveira Pinto, J. Kaplan, H. Edwards, Y. Burda, N. Joseph, G. Brockman, A. Ray, R. Puri, G. Krueger, M. Petrov, H. Khlaaf, G. Sastry, P. Mishkin, B. Chan, S. Gray, N. Ryder, M. Pavlov, A. Power, L. Kaiser, M. Bavarian, C. Winter, P. Tillet, F. P. Such, D. Cummings, M. Plappert, F. Chantzis, E. Barnes, A. Herbert-Voss, W. H. Guss, A. Nichol, A. Paino, N. Tezak, J. Tang, I. Babuschkin, S. Balaji, S. Jain, W. Saunders, C. Hesse, A. N. Carr, J. Leike, J. Achiam, V. Misra, E. Morikawa, A. Radford, M. Knight, M. Brundage, M. Murati, K. Mayer, P. Welinder, B. McGrew, D. Amodei, S. McCandlish, I. Sutskever, and W. Zaremba. Evaluating large language models trained on code. CoRR, abs/2107.03374, 2021. URL https://arxiv.org/abs/2107.03374.
547 11 5 A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, A. Mathur, A. Schelten, A. Yang, A. Fan, et al. The llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024.
548 11 6 Y. Dubois, B. Galambosi, P. Liang, and T. B. Hashimoto. Length-controlled alpacaeval: A simple way to debias automatic evaluators. arXiv preprint arXiv:2404.04475, 2024.
549 11 7 X. Feng, Z. Wan, M. Wen, S. M. McAleer, Y. Wen, W. Zhang, and J. Wang. Alphazero-like tree-search can guide large language model decoding and training, 2024. URL https://arxiv.org/abs/2309.17179.
550 11 8 L. Gao, J. Schulman, and J. Hilton. Scaling laws for reward model overoptimization, 2022. URL https://arxiv.org/abs/2210.10760.
551 11 9 A. P. Gema, J. O. J. Leang, G. Hong, A. Devoto, A. C. M. Mancino, R. Saxena, X. He, Y. Zhao, X. Du, M. R. G. Madani, C. Barale, R. McHardy, J. Harris, J. Kaddour, E. van Krieken, and P. Minervini. Are we done with mmlu? CoRR, abs/2406.04127, 2024. URL https://doi.org/10.48550/arXiv.2406.04127.
552 11 10 Google. Our next-generation model: Gemini 1.5, 2024. URL https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024.
553 11 11 Y. He, S. Li, J. Liu, Y. Tan, W. Wang, H. Huang, X. Bu, H. Guo, C. Hu, B. Zheng, et al. Chinese simpleqa: A chinese factuality evaluation for large language models. arXiv preprint arXiv:2411.07140, 2024.
554 11 12 D. Hendrycks, C. Burns, S. Basart, A. Zou, M. Mazeika, D. Song, and J. Steinhardt. Measuring massive multitask language understanding. arXiv preprint arXiv:2009.03300, 2020.
555 11 13 Y. Huang, Y. Bai, Z. Zhu, J. Zhang, J. Zhang, T. Su, J. Liu, C. Lv, Y. Zhang, J. Lei, et al. C-Eval: A multi-level multi-discipline chinese evaluation suite for foundation models. arXiv preprint arXiv:2305.08322, 2023.
556 11 14 N. Jain, K. Han, A. Gu, W. Li, F. Yan, T. Zhang, S. Wang, A. Solar-Lezama, K. Sen, and I. Stoica. Livecodebench: Holistic and contamination free evaluation of large language models for code. CoRR, abs/2403.07974, 2024. URL https://doi.org/10.48550/arXiv.2403.07974.
557 11 15 S. Krishna, K. Krishna, A. Mohananey, S. Schwarcz, A. Stambler, S. Upadhyay, and M. Faruqui. Fact, fetch, and reason: A unified evaluation of retrieval-augmented generation. CoRR, abs/2409.12941, 2024. doi: 10.48550/ARXIV.2409.12941. URL https://doi.org/10.48550/arXiv.2409.12941.
558 11 16 A. Kumar, V. Zhuang, R. Agarwal, Y. Su, J. D. Co-Reyes, A. Singh, K. Baumli, S. Iqbal, C. Bishop, R. Roelofs, et al. Training language models to self-correct via reinforcement learning. arXiv preprint arXiv:2409.12917, 2024.
559 11 17 H. Li, Y. Zhang, F. Koto, Y. Yang, H. Zhao, Y. Gong, N. Duan, and T. Baldwin. CMMLU: Measuring massive multitask language understanding in Chinese. arXiv preprint arXiv:2306.09212, 2023.
560 11 18 T. Li, W.-L. Chiang, E. Frick, L. Dunlap, T. Wu, B. Zhu, J. E. Gonzalez, and I. Stoica. From crowdsourced data to high-quality benchmarks: Arena-hard and benchbuilder pipeline. arXiv preprint arXiv:2406.11939, 2024.
561 11 19 H. Lightman, V. Kosaraju, Y. Burda, H. Edwards, B. Baker, T. Lee, J. Leike, J. Schulman, I. Sutskever, and K. Cobbe. Let’s verify step by step. arXiv preprint arXiv:2305.20050, 2023.
562 11 20 B. Y. Lin. ZeroEval: A Unified Framework for Evaluating Language Models, July 2024. URL https://github.com/WildEval/ZeroEval.
563 11 21 MAA. American invitational mathematics examination - aime. In American Invitational Mathematics Examination - AIME 2024, February 2024. URL https://maa.org/math-competitions/american-invitational-mathematics-examination-aime.
564 11 22 OpenAI. Hello GPT-4o, 2024a. URL https://openai.com/index/hello-gpt-4o/.
565 11 23 OpenAI. Learning to reason with llms, 2024b. URL https://openai.com/index/learning-to-reason-with-llms/.
566 11 24 OpenAI. Introducing SimpleQA, 2024c. URL https://openai.com/index/introducing-simpleqa/.
567 11 25 OpenAI. Introducing SWE-bench verified we’re releasing a human-validated subset of swebench that more, 2024d. URL https://openai.com/index/introducing-swe-bench-verified/.
568 11 26 Qwen. Qwq: Reflect deeply on the boundaries of the unknown, 2024a. URL https://qwenlm.github.io/blog/qwq-32b-preview/.
569 11 27 Qwen. Qwen2.5: A party of foundation models, 2024b. URL https://qwenlm.github.io/blog/qwen2.5.
570 11 28 D. Rein, B. L. Hou, A. C. Stickland, J. Petty, R. Y. Pang, J. Dirani, J. Michael, and S. R. Bowman. GPQA: A graduate-level google-proof q&a benchmark. arXiv preprint arXiv:2311.12022, 2023.
571 11 29 Z. Shao, P. Wang, Q. Zhu, R. Xu, J. Song, M. Zhang, Y. Li, Y. Wu, and D. Guo. Deepseekmath: Pushing the limits of mathematical reasoning in open language models. arXiv preprint arXiv:2402.03300, 2024.
572 11 30 D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai, A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel, T. P. Lillicrap, K. Simonyan, and D. Hassabis. Mastering chess and shogi by self-play with a general reinforcement learning algorithm. CoRR, abs/1712.01815, 2017a. URL http://arxiv.org/abs/1712.01815.
573 11 31 D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, Y. Chen, T. P. Lillicrap, F. Hui, L. Sifre, G. van den Driessche, T. Graepel, and D. Hassabis. Mastering the game of go without human knowledge. Nat., 550(7676):354–359, 2017b. doi: 10.1038/NATURE24270. URL https://doi.org/10.1038/nature24270.
574 11 32 C. Snell, J. Lee, K. Xu, and A. Kumar. Scaling llm test-time compute optimally can be more effective than scaling model parameters, 2024. URL https://arxiv.org/abs/2408.03314.
575 11 33 J. Uesato, N. Kushman, R. Kumar, F. Song, N. Siegel, L. Wang, A. Creswell, G. Irving, and I. Higgins. Solving math word problems with process- and outcome-based feedback. arXiv preprint arXiv:2211.14275, 2022.
576 11 34 T. Trinh, Y. Wu, Q. Le, H. He, and T. Luong. Solving olympiad geometry without human demonstrations. Nature, 2024. doi: 10.1038/s41586-023-06747-5.
577 11 35 P. Wang, L. Li, Z. Shao, R. Xu, D. Dai, Y. Li, D. Chen, Y. Wu, and Z. Sui. Math-shepherd: A label-free step-by-step verifier for llms in mathematical reasoning. arXiv preprint arXiv:2312.08935, 2023.
578 11 36 X. Wang, J. Wei, D. Schuurmans, Q. Le, E. Chi, S. Narang, A. Chowdhery, and D. Zhou. Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171, 2022.
579 11 37 Y. Wang, X. Ma, G. Zhang, Y. Ni, A. Chandra, S. Guo, W. Ren, A. Arulraj, X. He, Z. Jiang, T. Li, M. Ku, K. Wang, A. Zhuang, R. Fan, X. Yue, and W. Chen. Mmlu-pro: A more robust and challenging multi-task language understanding benchmark. CoRR, abs/2406.01574, 2024. URL https://doi.org/10.48550/arXiv.2406.01574.
580 11 38 C. S. Xia, Y. Deng, S. Dunn, and L. Zhang. Agentless: Demystifying llm-based software engineering agents. arXiv preprint, 2024.
581 11 39 H. Xin, Z. Z. Ren, J. Song, Z. Shao, W. Zhao, H. Wang, B. Liu, L. Zhang, X. Lu, Q. Du, W. Gao, Q. Zhu, D. Yang, Z. Gou, Z. F. Wu, F. Luo, and C. Ruan. Deepseek-prover-v1.5: Harnessing proof assistant feedback for reinforcement learning and monte-carlo tree search, 2024. URL https://arxiv.org/abs/2408.08152.
582 11 40 J. Zhou, T. Lu, S. Mishra, S. Brahma, S. Basu, Y. Luan, D. Zhou, and L. Hou. Instruction-following evaluation for large language models. arXiv preprint arXiv:2311.07911, 2023.
583 12 0 A. Fallahpour, V. Gureghian, G. J. Filion, A. B. Lindner, and A. Pandi. Codontransformer: A multispecies codon optimizer using context-aware neural networks. Nature Communications, 16(1), Apr 2025. doi: 10.1038/s41467-025-58588-7. https://qiita.com/kaizen_nagoya/items/d4be1d4dd9eb307f09cc
584 12 1 Pechmann, S. & Frydman, J. Evolutionary conservation of codon optimality reveals hidden signatures of cotranslational folding. Nat. Struct. Mol. Biol. 20, 237–243 (2013).
585 12 2 Rocha, E. P. C. Codon usage bias from tRNA’s point of view: redundancy, specialization, and efficient decoding for translation optimization. Genome Res 14, 2279–2286 (2004).
586 12 3 Ran, W., Kristensen, D. M. & Koonin, E. V. Coupling between protein level selection and codon usage optimization in the evolution of bacteria and archaea. MBio 5, e00956–14 (2014).
587 12 4 Plotkin, J. B. & Kudla, G. Synonymous but not the same: the causes and consequences of codon bias. Nat. Rev. Genet. 12, 32–42 (2011).
588 12 5 Deng, Y., de Lima Hedayioglu, F., Kalfon, J., Chu, D. & von der Haar, T. Hidden patterns of codon usage bias across kingdoms. J. R. Soc. Interface 17, 20190819 (2020).
589 12 6 Mauro, V. P. Codon Optimization in the Production of Recombinant Biotherapeutics: Potential Risks and Considerations. BioDrugs 32, 69–81 (2018).
590 12 7 Mauro, V. P. & Chappell, S. A. A critical analysis of codon optimization in human therapeutics. Trends Mol. Med 20, 604–613 (2014).
591 12 8 Sharp, P. M. & Li, W. H. The codon Adaptation Index-a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15, 1281–1295 (1987).
592 12 9 Quax, T. E. F., Claassens, N. J., Söll, D. & van der Oost, J. Codon bias as a means to fine-tune gene expression. Mol. Cell 59, 149–161 (2015).
593 12 10 Brule, C. E. & Grayhack, E. J. Synonymous codons: choose wisely for expression. Trends Genet 33, 283–297 (2017).
594 12 11 Khakzad, H. et al. A new age in protein design empowered by deep learning. Cell Syst. 14, 925–939 (2023).
595 12 12 Watson, J. L. et al. De novo design of protein structure and function with RFdiffusion. Nature 620, 1089–1100 (2023).
596 12 13 Ingraham, J. B. et al. Illuminating protein space with a programmable generative model. Nature 623, 1070–1078 (2023).
597 12 14 Madani, A. et al. Large language models generate functional protein sequences across diverse families. Nat. Biotechnol. 41, 1099–1106 (2023).
598 12 15 Ferruz, N., Schmidt, S. & Höcker, B. ProtGPT2 is a deep unsupervised language model for protein design. Nat. Commun. 13, 4348 (2022).
599 12 16 Johnson, S. R. et al. Computational scoring and experimental evaluation of enzymes generated by neural networks. Nat. Biotechnol. https://doi.org/10.1038/s41587-024-02214-2 (2024).
600 12 17 UniProtKB/Swiss-Prot Release 2024_04 statistics. https://web.expasy.org/docs/relnotes/relstat.html.
601 12 18 Jain, R., Jain, A., Mauro, E., LeShane, K. & Densmore, D. ICOR: improving codon optimization with recurrent neural networks. BMC Bioinforma. 24, 132 (2023).
602 12 19 Yang, Q. et al. eRF1 mediates codon usage effects on mRNA translation efficiency through premature termination at rare codons. Nucleic Acids Res. 47, 9243–9258 (2019).
603 12 20 Angov, E., Legler, P. M. & Mease, R. M. Adjustment of codon usage frequencies by codon harmonization improves protein expression and folding. Methods Mol. Biol. 705, 1–13 (2011).
604 12 21 Claassens, N. J. et al. Improving heterologous membrane protein production in Escherichia coli by combining transcriptional tuning and codon usage algorithms. PLoS ONE 12, e0184355 (2017).
605 12 22 [No title]. https://mlcb.github.io/mlcb2019_proceedings/papers/paper_29.pdf.
606 12 23 Fu, H. et al. Codon optimization with deep learning to enhance protein expression. Sci. Rep. 10, 17617 (2020).
607 12 24 Constant, D. A. et al. Deep learning-based codon optimization with large-scale synonymous variant datasets enables generalized tunable protein expression. Preprint at bioRxiv https://doi.org/10.1101/2023.02.11.528149 (2023).
608 12 25 Sabath, N., Wagner, A. & Karlin, D. Evolution of viral proteins originated de novo by overprinting. Mol. Biol. Evol. 29, 3767–3780 (2012).
609 12 26 Cho, K., van Merrienboer, B., Bahdanau, D. & Bengio, Y. On the properties of neural machine translation: Encoder-decoder approaches. https://doi.org/10.48550/ARXIV.1409.1259 (2014).
610 12 27 Vaswani, A. et al. Attention is all you need. https://doi.org/10.48550/ARXIV.1706.03762 (2017).
611 12 28 Brown, T. B. et al. Language models are few-shot learners. https://doi.org/10.48550/ARXIV.2005.14165 (2020).
612 12 29 Zaheer, M. et al. Big bird: Transformers for longer sequences. https://doi.org/10.48550/ARXIV.2007.14062 (2020).
613 12 30 Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional Transformers for language understanding. https://doi.org/10.48550/ARXIV.1810.04805 (2018)
614 12 31 Ranaghan, M. J., Li, J. J., Laprise, D. M. & Garvie, C. W. Assessing optimal: inequalities in codon optimization algorithms. BMC Biol. 19, 36 (2021).
615 12 32 Chandra, S. et al. The high mutational sensitivity of CcdA antitoxin is linked to codon optimality. Mol Biol Evol 39, (2022).
616 12 33 Liu, Y., Yang, Q. & Zhao, F. Synonymous But Not Silent: the Codon Usage Code for Gene Expression and Protein Folding. Annu. Rev. Biochem. 90, 375–401 (2021).
617 12 34 Liu, Y. A code within the genetic code: codon usage regulates co-translational protein folding. Cell Commun. Signal. 18, 1–9 (2020).
618 12 35 Lyu, X. & Liu, Y. Nonoptimal codon usage is critical for protein structure and function of the master general amino acid control regulator CPC-1. MBio 11, (2020).
619 12 36 Walsh, I. M., Bowman, M. A., Soto Santarriaga, I. F., Rodriguez, A. & Clark, P. L. Synonymous codon substitutions perturb cotranslational protein folding in vivo and impair cell fitness. Proc. Natl Acad. Sci. USA 117, 3528–3534 (2020).
620 12 37 Pechmann, S., Chartron, J. W. & Frydman, J. Local slowdown of translation by nonoptimal codons promotes nascent-chain recognition by SRP in vivo. Nat. Struct. Mol. Biol. 21, 1100–1105 (2014).
621 12 38 Zhou, T., Weems, M. & Wilke, C. O. Translationally optimal codons associate with structurally sensitive sites in proteins. Mol. Biol. Evol. 26, 1571–1580 (2009).
622 12 39 Zhou, M., Wang, T., Fu, J., Xiao, G. & Liu, Y. Nonoptimal codon usage influences protein structure in intrinsically disordered regions. Mol. Microbiol. 97, 974–987 (2015).
623 12 40 Zhou, M. et al. Non-optimal codon usage affects expression, structure and function of clock protein FRQ. Nature 495, 111–115 (2013).
624 12 41 Clarke, T. F. 4th & Clark, P. L. Rare codons cluster. PLoS ONE 3, e3412 (2008).
625 12 42 Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).
626 12 43 Real, R. & Vargas, J. M. The probabilistic basis of jaccard’s index of similarity. Syst. Biol. 45, 380–385 (1996).
627 12 44 Montgomery, K. T., Tardiff, J., Reid, L. M. & Krauter, K. S. Negative and positive cis-acting elements control the expression of murine alpha 1-protease inhibitor genes. Mol. Cell. Biol. 10, 2625–2637 (1990).
628 12 45 Medina-Muñoz, S. G. et al. Crosstalk between codon optimality and cis-regulatory elements dictates mRNA stability. Genome Biol. 22, 14 (2021).
629 12 46 Shabalina, S. A., Spiridonov, N. A. & Kashina, A. Sounds of silence: synonymous nucleotides as a key to biological regulation and complexity. Nucleic Acids Res. 41, 2073–2094 (2013).
630 12 47 Nuryana, I. et al. Codon optimization of a gene encoding DNA polymerase from Pyrococcus furiosus and its expression in Escherichia coli. J. Genet. Eng. Biotechnol. 21, 129 (2023).
631 12 48 Moss, M. J., Chamness, L. M. & Clark, P. L. The effects of codon usage on protein structure and folding. Annu. Rev. Biophys. 53, 87–108 (2024).
632 12 49 Barrington, C. L. et al. Synonymous codon usage regulates translation initiation. Cell Rep. 42, 113413 (2023).
633 12 50 Outeiral, C. & Deane, C. M. Codon language embeddings provide strong signals for use in protein engineering. Nat. Mach. Intell. 6, 170–179 (2024).
634 12 51 Lin, B. C., Kaissarian, N. M. & Kimchi-Sarfaty, C. Implementing computational methods in tandem with synonymous gene recoding for therapeutic development. Trends Pharmacol. Sci. 44, 73–84 (2023).
635 12 52 Bio.Data.CodonTable module—Biopython 1.75 documentation. https://biopython.org/docs/1.75/api/Bio.Data.CodonTable.html.
636 12 53 Fallahpour, A., Alinoori, M., Afkanpour, A. & Krishnan, A. EHRMamba: towards generalizable and scalable foundation models for Electronic Health Records. https://doi.org/10.48550/ARXIV.2405.14567 (2024).
637 12 54 Wolf, T. et al. HuggingFace’s transformers: State-of-the-art natural language processing. https://doi.org/10.48550/ARXIV.1910.03771 (2019).
638 12 55 Lee, B. D. Python Implementation of Codon Adaptation Index. J. Open Source Softw. 3, 905 (2018).
639 12 56 Codon Usage Database. https://www.kazusa.or.jp/codon/.
640 12 57 Sakoe, H. & Chiba, S. Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoust. 26, 43–49 (1978).
641 12 58 Dynamic Time Warping. in Information Retrieval for Music and Motion, 69–84 (Springer Berlin Heidelberg, Berlin, Heidelberg, 2007).
642 12 59 Giorgino, T. Computing and visualizing dynamic time warping alignments in R: The dtw Package. J. Stat. Softw. 31, 1–24 (2009).
643 12 60 Górska, A., Plucinska, M., Pedersen, L., Kielpinski, L., Tehler, D. & Hagedorn, P. XNAString: efficient manipulation of modified oligonucleotide sequences. R package version 1.14.0. https://doi.org/10.18129/B9.BIOC.XNASTRING. (2024).
644 12 61 Fallahpour, A. et al. CodonTransformer: a multispecies codon optimizer using context-aware neural networks. Adibvafa/CodonTransformer. https://doi.org/10.5281/ZENODO.15000833 (Zenodo, 2025).
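Ref 8 in the block above (Sharp & Li) defines the codon adaptation index that much of this literature builds on, and ref 55 cites a Python implementation. As a worked illustration only, here is a minimal sketch of the formula under a toy, single-amino-acid usage table (the numbers are invented): each codon's relative adaptiveness w is its frequency divided by that of the most-used synonymous codon, and CAI is the geometric mean of w over the gene.

```python
# Minimal sketch of the Codon Adaptation Index of Sharp & Li (ref 8 above).
# The usage table covers only the four alanine codons and the counts are
# illustrative; real tables come from sources like the Codon Usage Database
# (ref 56 above) or Lee's CAI package (ref 55 above).
from math import exp, log

usage = {"GCU": 26.0, "GCC": 18.0, "GCA": 22.0, "GCG": 34.0}  # assumed counts

# Relative adaptiveness: w(codon) = f(codon) / max f over its synonyms.
max_in_family = max(usage.values())
w = {codon: freq / max_in_family for codon, freq in usage.items()}

def cai(codons):
    """Geometric mean of w over the coding sequence's codons."""
    return exp(sum(log(w[c]) for c in codons) / len(codons))

print(cai(["GCG", "GCU", "GCA", "GCG"]))        # toy coding sequence
```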
645 13 0 A. Fallahpour, J. Ma, A. Munim, H. Lyu, and B. Wang. Medrax: Medical reasoning agent for chest x-ray, 2025. URL https://arxiv.org/abs/2502.02673.
646 13 1 Adibi, A., Cao, X., Ji, Z., Kaur, J. N., Chen, W., Healey, E., Nuwagira, B., Ye, W., Woollard, G., Xu, M. A., Cui, H., Xi, J., Chang, T., Bikia, V., Zhang, N., Noori, A., Xia, Y., Hossain, M. B., Frank, H. A., Peluso, A., Pu, Y., Shen, S. Z., Wu, J., Fallahpour, A., Mahbub, S., Duncan, R., Zhang, Y., Cao, Y., Xu, Z., Craig, M., Krishnan, R. G., Beheshti, R., Rehg, J. M., Karim, M. E., Coffee, M., Celi, L. A., Fries, J. A., Sadatsafavi, M., Shung, D., McWeeney, S., Dafflon, J., and Jabbour, S. Recent advances, applications and open challenges in machine learning for health: Reflections from research roundtables at ml4h 2024 symposium, 2025.
647 13 2 Ahn, J. S., Ebrahimian, S., McDermott, S., Lee, S., Naccarato, L., Di Capua, J. F., Wu, M. Y., Zhang, E. W., Muse, V., Miller, B., et al. Association of artificial intelligence–aided chest radiograph interpretation with reader performance and efficiency. JAMA Network Open, 5(8):e2229289–e2229289, 2022.
648 13 3 Baghbanzadeh, N., Fallahpour, A., Parhizkar, Y., Ogidi, F., Roy, S., Ashkezari, S., Khazaie, V. R., Colacci, M., Etemad, A., Afkanpour, A., and Dolatabadi, E. Advancing medical representation learning through high-quality data, 2025.
649 13 4 Bahl, S., Ramzan, T., and Maraj, R. Interpretation and documentation of chest x-rays in the acute medical unit. Clinical Medicine, 20(2):s73, 2020.
650 13 5 Baltruschat, I., Steinmeister, L., Nickisch, H., Saalbach, A., Grass, M., Adam, G., Knopp, T., and Ittrich, H. Smart chest x-ray worklist prioritization using artificial intelligence: a clinical workflow simulation. European radiology, 31:3837–3845, 2021.
651 13 6 Bannur, S., Bouzid, K., Castro, D. C., Schwaighofer, A., Thieme, A., Bond-Taylor, S., Ilse, M., Pérez-García, F., Salvatelli, V., Sharma, H., Meissen, F., Ranjit, M., Srivastav, S., Gong, J., Codella, N. C. F., Falck, F., Oktay, O., Lungren, M. P., Wetscherek, M. T., Alvarez-Valle, J., and Hyland, S. L. Maira-2: Grounded radiology report generation, 2024.
652 13 7 Bansal, H., Israel, D., Zhao, S., Li, S., Nguyen, T., and Grover, A. Medmax: Mixed-modal instruction tuning for training biomedical assistants, 2024.
653 13 8 Chambon, P., Bluethgen, C., Delbrouck, J.-B., der Sluijs, R. V., Połacin, M., Chaves, J. M. Z., Abraham, T. M., Purohit, S., Langlotz, C. P., and Chaudhari, A. Roentgen: Vision-language foundation model for chest x-ray generation, 2022.
654 13 9 Chambon, P., Delbrouck, J.-B., Sounack, T., Huang, S.-C., Chen, Z., Varma, M., Truong, S. Q., Chuong, C. T., and Langlotz, C. P. Chexpert plus: Augmenting a large chest x-ray dataset with text radiology reports, patient demographics and additional image formats, 2024.
655 13 10 Chen, Z., Varma, M., Delbrouck, J.-B., Paschali, M., Blankemeier, L., Van Veen, D., Valanarasu, J. M. J., Youssef, A., Cohen, J. P., Reis, E. P., et al. Chexagent: Towards a foundation model for chest x-ray interpretation. arXiv preprint arXiv:2401.12208, 2024a.
657 13 12 Chen, Z., Varma, M., Xu, J., Paschali, M., Veen, D. V., Johnston, A., Youssef, A., Blankemeier, L., Bluethgen, C., Altmayer, S., Valanarasu, J. M. J., Muneer, M. S. E., Reis, E. P., Cohen, J. P., Olsen, C., Abraham, T. M., Tsai, E. B., Beaulieu, C. F., Jitsev, J., Gatidis, S., Delbrouck, J.-B., Chaudhari, A. S., and Langlotz, C. P. A vision-language foundation model to enhance efficiency of chest x-ray interpretation, 2024b.
658 13 13 Cohen, J. P., Hashir, M., Brooks, R., and Bertrand, H. On the limits of cross-domain generalization in automated x-ray prediction. In Medical Imaging with Deep Learning, 2020.
659 13 14 Cohen, J. P., Viviano, J. D., Bertin, P., Morrison, P., Torabian, P., Guarrera, M., Lungren, M. P., Chaudhari, A., Brooks, R., Hashir, M., and Bertrand, H. TorchXRayVision: A library of chest X-ray datasets and models. In Medical Imaging with Deep Learning, 2022.
660 13 15 Erdal, B. S., Gupta, V., Demirer, M., Fair, K. H., White, R. D., Blair, J., Deichert, B., Lafleur, L., Qin, M. M., Bericat, D., and Genereaux, B. Integration and implementation strategies for ai algorithm deployment with smart routing rules and workflow management, 2023.
661 13 16 Eriksen, A. V., Möller, S., and Ryg, J. Use of gpt-4 to diagnose complex clinical cases, 2024.
662 13 17 Fallahpour, A., Alinoori, M., Ye, W., Cao, X., Afkanpour, A., and Krishnan, A. Ehrmamba: Towards generalizable and scalable foundation models for electronic health records, 2024.
663 13 18 Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X., et al. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning. arXiv preprint arXiv:2501.12948, 2025.
664 13 19 Huang, J., Neill, L., Wittbrodt, M., Melnick, D., Klug, M., Thompson, M., Bailitz, J., Loftus, T., Malik, S., Phull, A., et al. Generative artificial intelligence for chest radiograph interpretation in the emergency department. JAMA network open, 6(10):e2336100–e2336100, 2023.
665 13 20 Hyland, S. L., Bannur, S., Bouzid, K., Castro, D. C., Ranjit, M., Schwaighofer, A., Pérez-García, F., Salvatelli, V., Srivastav, S., Thieme, A., Codella, N., Lungren, M. P., Wetscherek, M. T., Oktay, O., and Alvarez-Valle, J. Maira-1: A specialised large multimodal model for radiology report generation, 2024.
666 13 21 Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., Seekins, J., Mong, D. A., Halabi, S. S., Sandberg, J. K., Jones, R., Larson, D. B., Langlotz, C. P., Patel, B. N., Lungren, M. P., and Ng, A. Y. Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison, 2019.
667 13 22 Jaech, A., Kalai, A., Lerer, A., Richardson, A., El-Kishky, A., Low, A., Helyar, A., Madry, A., Beutel, A., Carney, A., et al. Openai o1 system card. arXiv preprint arXiv:2412.16720, 2024.
668 13 23 Javan, R., Kim, T., and Mostaghni, N. Gpt-4 vision: Multimodal evolution of chatgpt and potential role in radiology. Cureus, 16(8):e68298, 2024.
669 13 24 Jiang, Y., Black, K. C., Geng, G., Park, D., Ng, A. Y., and Chen, J. H. Medagentbench: Dataset for benchmarking llms as agents in medical applications, 2025.
670 13 25 Jimenez, C. E., Yang, J., Wettig, A., Yao, S., Pei, K., Press, O., and Narasimhan, K. Swe-bench: Can language models resolve real-world github issues? arXiv preprint arXiv:2310.06770, 2023.
671 13 26 Kim, Y., Park, C., Jeong, H., Chan, Y. S., Xu, X., McDuff, D., Lee, H., Ghassemi, M., Breazeal, C., and Park, H. W. Mdagents: An adaptive collaboration of llms for medical decision-making. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.
672 13 27 Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A. C., Lo, W.-Y., Dollár, P., and Girshick, R. Segment anything, 2023.
673 13 28 Li, B., Yan, T., Pan, Y., Luo, J., Ji, R., Ding, J., Xu, Z., Liu, S., Dong, H., Lin, Z., et al. Mmedagent: Learning to use medical tools with multi-modal agent. arXiv preprint arXiv:2407.02483, 2024a.
674 13 29 Li, C., Wong, C., Zhang, S., Usuyama, N., Liu, H., Yang, J., Naumann, T., Poon, H., and Gao, J. Llava-med: Training a large language-and-vision assistant for biomedicine in one day. Advances in Neural Information Processing Systems, 36, 2024b.
675 13 30 Lian, J., Liu, J., Zhang, S., Gao, K., Liu, X., Zhang, D., and Yu, Y. A Structure-Aware Relation Network for Thoracic Diseases Detection and Segmentation. IEEE Transactions on Medical Imaging, 2021. doi: 10.48550/arxiv.2104.10326.
676 13 31 Liu, B., Zhan, L.-M., Xu, L., Ma, L., Yang, Y., and Wu, X.-M. Slake: A semantically-labeled knowledge-enhanced dataset for medical visual question answering, 2021.
677 13 32 Liu, X., Yu, H., Zhang, H., Xu, Y., Lei, X., Lai, H., Gu, Y., Ding, H., Men, K., Yang, K., et al. Agentbench: Evaluating llms as agents. arXiv preprint arXiv:2308.03688, 2023.
678 13 33 Ma, J., He, Y., Li, F., Han, L., You, C., and Wang, B. Segment anything in medical images. Nature Communications, 15(1), January 2024. ISSN 2041-1723. doi:10.1038/s41467-024-44824-z.
679 13 34 Ma, J., Yang, Z., Kim, S., Chen, B., Baharoon, M., Fallahpour, A., Asakereh, R., Lyu, H., and Wang, B. Medsam2: Segment anything in 3d medical images and videos, 2025.
680 13 35 Masterman, T., Besen, S., Sawtell, M., and Chao, A. The landscape of emerging ai agent architectures for reasoning, planning, and tool calling: A survey. arXiv preprint arXiv:2404.11584, 2024.
681 13 36 Nori, H., King, N., McKinney, S. M., Carignan, D., and Horvitz, E. Capabilities of gpt-4 on medical challenge problems. arXiv preprint arXiv:2303.13375, 2023.
682 13 37 Ouis, M. Y. and Akhloufi, M. A. Chestbiox-gen: contextual biomedical report generation from chest x-ray images using biogpt and co-attention mechanism. Frontiers in Imaging, 3:1373420, 2024.
683 13 38 Park, J., Kim, S., Yoon, B., Hyun, J., and Choi, K. M4cxr: Exploring multi-task potentials of multi-modal large language models for chest x-ray interpretation, 2024.
684 13 39 Pellegrini, C., Keicher, M., Ozsoy, E., and Navab, N. Radrestruct: A novel vqa benchmark and method for structured radiology reporting, 2023.
685 13 40 Pellegrini, C., Ozsoy, E., Busam, B., Navab, N., and Keicher, M. Radialog: A large vision-language model for radiology report generation and conversational assistance, 2025.
686 13 41 Pham, H. H., Nguyen, H. Q., Nguyen, H. T., Le, L. T., and Khanh, L. An accurate and explainable deep learning system improves interobserver agreement in the interpretation of chest radiograph. IEEE Access, 10:104512–104531, 2022.
687 13 42 Schmidgall, S., Ziaei, R., Harris, C., Reis, E., Jopling, J., and Moor, M. Agentclinic: a multimodal agent benchmark to evaluate ai in simulated clinical environments, 2024.
688 13 43 Shin, H. J., Han, K., Ryu, L., and Kim, E.-K. The impact of artificial intelligence on the reading times of radiologists for chest radiographs. NPJ Digital Medicine, 6(1):82, 2023.
689 13 44 Tanno, R., Barrett, D. G., Sellergren, A., Ghaisas, S., Dathathri, S., See, A., Welbl, J., Lau, C., Tu, T., Azizi, S., et al. Collaboration between clinicians and vision–language models in radiology report generation. Nature Medicine, pp. 1–10, 2024.
690 13 45 Tu, T., Azizi, S., Driess, D., Schaekermann, M., Amin, M., Chang, P.-C., Carroll, A., Lau, C., Tanno, R., Ktena, I., Mustafa, B., Chowdhery, A., Liu, Y., Kornblith, S., Fleet, D., Mansfield, P., Prakash, S., Wong, R., Virmani, S., Semturs, C., Mahdavi, S. S., Green, B., Dominowska, E., y Arcas, B. A., Barral, J., Webster, D., Corrado, G. S., Matias, Y., Singhal, K., Florence, P., Karthikesalingam, A., and Natarajan, V. Towards generalist biomedical ai, 2023.
691 13 46 United Nations Scientific Committee on the Effects of Atomic Radiation. Sources, Effects and Risks of Ionizing Radiation: UNSCEAR 2020/2021 Report, Volume I. United Nations, New York, 2022. ISBN 978-92-1-139206-7.
692 13 47 Wu, C., Zhang, X., Zhang, Y., Wang, Y., and Xie, W. Towards generalist foundation model for radiology by leveraging web-scale 2d and 3d medical data, 2023.
693 13 48 Xi, Z., Chen, W., Guo, X., He, W., Ding, Y., Hong, B., Zhang, M., Wang, J., Jin, S., Zhou, E., et al. The rise and potential of large language model based agents: A survey. Science China Information Sciences, 68(2):121101, 2025.
694 13 49 Yan, Z., Zhang, K., Zhou, R., He, L., Li, X., and Sun, L. Multimodal chatgpt for medical applications: an experimental study of gpt-4v. arXiv preprint arXiv:2310.19061, 2023.
695 13 50 Yang, H. M., Duan, T., Ding, D., Bagul, A., Langlotz, C., Shpanskaya, K., et al. Chexnet: radiologist-level pneumonia detection on chest x-rays with deep learning. arXiv preprint arXiv:1711.05225, 2017.
696 13 51 Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., and Cao, Y. React: Synergizing reasoning and acting in language models, 2023.
697 13 52 Yin, G., Bai, H., Ma, S., Nan, F., Sun, Y., Xu, Z., Ma, S., Lu, J., Kong, X., Zhang, A., et al. Mmau: A holistic benchmark of agent capabilities across diverse domains. arXiv preprint arXiv:2407.18961, 2024.
698 13 53 Zambrano Chaves, J., Huang, S.-C., Xu, Y., Xu, H., Usuyama, N., Zhang, S., et al. Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation. arXiv preprint arXiv:2403.08002, 2024.
699 13 54 Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. Pyramid scene parsing network, 2017.
700 13 55 Zhao, P., Jin, Z., and Cheng, N. An in-depth survey of large language model-based artificial intelligence agents. arXiv preprint arXiv:2309.14365, 2023.
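Ref 51 in the block above (ReAct) describes the reason-act loop that tool-using agents such as MedRAX build on. The following is a minimal, hypothetical sketch of that loop with a stubbed policy and a stubbed tool; nothing here reproduces MedRAX's actual tools, prompts, or APIs.

```python
# Minimal ReAct-style loop (ref 51 above): the policy alternates between
# choosing an action and reading the resulting observation. "stub_model" and
# "lookup_finding" are invented placeholders; a real agent would query an LLM
# and real imaging tools instead.
from typing import Callable, Dict

def lookup_finding(query: str) -> str:
    """Stub tool standing in for, e.g., a CXR classifier or report index."""
    return f"no acute finding for '{query}'"

TOOLS: Dict[str, Callable[[str], str]] = {"lookup_finding": lookup_finding}

def stub_model(transcript: str):
    """Pretend policy: call the tool once, then finish."""
    if "Observation:" in transcript:
        return ("finish", "examination unremarkable")
    return ("lookup_finding", "cardiomegaly")

def react_loop(question: str, max_steps: int = 4) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        action, arg = stub_model(transcript)
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)       # act, then record the result
        transcript += f"\nAction: {action}({arg})\nObservation: {observation}"
    return "step budget exhausted"

print(react_loop("Is the heart enlarged?"))
```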
701 14 0 H. Feng, L. Wu, B. Zhao, C. Huff, J. Zhang, J. Wu, L. Lin, P. Wei, and C. Wu. Benchmarking DNA foundation models for genomic sequence classification. doi: 10.1101/2024.08.16.608288. URL https://doi.org/10.1101/2024.08.16.608288. https://qiita.com/kaizen_nagoya/items/01e3dde0d8274fee0fd8
702 14 1 OpenAI et al. GPT-4 Technical Report. Preprint at https://doi.org/10.48550/arXiv.2303.08774 (2024).
703 14 2 Touvron, H. et al. Llama 2: Open Foundation and Fine-Tuned Chat Models. Preprint at https://doi.org/10.48550/arXiv.2307.09288 (2023).
704 14 3 Jiang, A. Q. et al. Mistral 7B. Preprint at https://doi.org/10.48550/arXiv.2310.06825 (2023).
705 14 4 Chen, M. et al. Evaluating Large Language Models Trained on Code. Preprint at https://doi.org/10.48550/arXiv.2107.03374 (2021).
706 14 5 Madani, A. et al. Large language models generate functional protein sequences across diverse families. Nat Biotechnol 41, 1099–1106 (2023).
707 14 6 Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat Methods 1–11 (2024) doi:10.1038/s41592-024-02201-0.
708 14 7 Lin, Z. et al. Language models of protein sequences at the scale of evolution enable accurate structure prediction. Preprint at https://doi.org/10.1101/2022.07.20.500902 (2022).
709 14 8 Gershman, A. et al. Epigenetic patterns in a complete human genome. Science 376, eabj5089 (2022).
710 14 9 Wang, G. et al. Understanding Transcription Factor Regulation by Integrating Gene Expression and DNase I Hypersensitive Sites. Biomed Res Int 2015, 757530 (2015).
711 14 10 Zhou, Z. et al. DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome. Preprint at https://doi.org/10.48550/arXiv.2306.15006 (2024).
712 14 11 Dalla-Torre, H. et al. The Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics. Preprint at https://doi.org/10.1101/2023.01.11.523679 (2023).
713 14 12 Nguyen, E. et al. HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution. Preprint at https://doi.org/10.48550/arXiv.2306.15794 (2023).
714 14 13 Genome Reference Consortium. Genome Reference Consortium Human Build 38 (GRCh38). National Center for Biotechnology Information, 2013. URL https://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.26/
715 14 14 M. Byrska-Bishop, U. S. Evani, X. Zhao, A. O. Basile, H. J. Abel, A. A. Regier, A. Corvelo, W. E. Clarke, R. Musunuri, K. Nagulapalli, et al., “High-coverage whole-genome sequencing of the expanded 1000 genomes project cohort including 602 trios,” Cell, vol. 185, no. 18, pp. 3426– 3440, 2022.
716 14 15 Hu, E. J. et al. LoRA: Low-Rank Adaptation of Large Language Models. Preprint at https://doi.org/10.48550/arXiv.2106.09685 (2021).
717 14 16 Liu, H. et al. Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning. Preprint at https://doi.org/10.48550/arXiv.2205.05638 (2022).
718 14 17 Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Preprint at https://doi.org/10.48550/arXiv.1810.04805 (2019).
719 14 18 Xu, H., Jia, P. & Zhao, Z. Deep4mC: systematic assessment and computational prediction for DNA N4-methylcytosine sites by deep learning. Briefings in Bioinformatics 22, bbaa099 (2021).
720 14 19 Liu, B., Long, R. & Chou, K.-C. iDHS-EL: identifying DNase I hypersensitive sites by fusing three different modes of pseudo nucleotide composition into an ensemble learning framework. Bioinformatics 32, 2411–2418 (2016).
721 14 20 Jin, J. et al. iDNA-ABF: multi-scale deep biological language learning model for the interpretable prediction of DNA methylations. Genome Biology 23, 219 (2022).
722 14 21 Zhang, P., Zhang, H. & Wu, H. iPro-WAEL: a comprehensive and robust framework for identifying promoters in multiple species. Nucleic Acids Research 50, 10278–10289 (2022).
723 14 22 Grinsztajn, L., Oyallon, E. & Varoquaux, G. Why do tree-based models still outperform deep learning on typical tabular data? Advances in Neural Information Processing Systems 35, 507–520 (2022).
724 14 23 Gillioz, A., Casas, J., Mugellini, E. & Khaled, O. A. Overview of the Transformer-based Models for NLP Tasks. in Annals of Computer Science and Information Systems vol. 21 179–183 (2020).
725 14 24 Zhang, H. & Shafiq, M. O. Survey of transformers and towards ensemble learning using transformers for natural language processing. Journal of Big Data 11, 25 (2024).
726 14 25 Yang, X., Huang, J. Y., Zhou, W. & Chen, M. Parameter-Efficient Tuning with Special Token Adaptation. Preprint at https://doi.org/10.48550/arXiv.2210.04382 (2023).
727 14 26 Hubert, L. & Arabie, P. Comparing partitions. Journal of Classification 2, 193–218 (1985).
728 14 27 Vinh, N. X., Epps, J. & Bailey, J. Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance. Journal of Machine Learning Research 11, 2837–2854 (2010).
729 14 28 Rousseeuw, P. J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics 20, 53–65 (1987).
730 14 29 Matthews, B. W. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA) - Protein Structure 405, 442–451 (1975).
731 14 30 Marin, F. I. et al. BEND: Benchmarking DNA Language Models on biologically meaningful tasks. Preprint at https://doi.org/10.48550/arXiv.2311.12570 (2024).
732 14 31 Lester, B., Al-Rfou, R. & Constant, N. The Power of Scale for Parameter-Efficient Prompt Tuning. Preprint at https://doi.org/10.48550/arXiv.2104.08691 (2021).
733 14 32 Chen, T. & Guestrin, C. XGBoost: A Scalable Tree Boosting System. in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (Association for Computing Machinery, New York, NY, USA, 2016). doi:10.1145/2939672.2939785.
734 14 33 Breiman, L. Random Forests. Machine Learning 45, 5–32 (2001).
735 14 34 DeLong, Elizabeth R., et al. “Comparing the Areas under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach.” Biometrics, vol. 44, no. 3, 1988, pp. 837–45. JSTOR, https://doi.org/10.2307/2531595.
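Refs 26-29 in the block above are the evaluation metrics this benchmarking paper leans on: adjusted Rand index and normalized mutual information for comparing clusterings, silhouette score for cluster cohesion, and Matthews correlation coefficient for classification. A minimal sketch of computing all four with scikit-learn (the arrays are random placeholders, not data from the paper):

```python
# Minimal sketch of the clustering/classification metrics cited above
# (refs 26-29), on synthetic stand-in data.
import numpy as np
from sklearn.metrics import (adjusted_rand_score,
                             normalized_mutual_info_score,
                             silhouette_score,
                             matthews_corrcoef)

rng = np.random.default_rng(0)
true_clusters = rng.integers(0, 3, size=200)
pred_clusters = rng.integers(0, 3, size=200)
points = rng.normal(size=(200, 16))             # toy embedding vectors

print("ARI :", adjusted_rand_score(true_clusters, pred_clusters))
print("NMI :", normalized_mutual_info_score(true_clusters, pred_clusters))
print("Sil :", silhouette_score(points, pred_clusters))

y_true = rng.integers(0, 2, size=200)
y_pred = rng.integers(0, 2, size=200)
print("MCC :", matthews_corrcoef(y_true, y_pred))
```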
736 15 0 E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen. Lora: Low-rank adaptation of large language models, 2021. URL https://arxiv.org/abs/2106.09685. https://qiita.com/kaizen_nagoya/items/877058f681d77808b44c
737 15 1 Armen Aghajanyan, Luke Zettlemoyer, and Sonal Gupta. Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning. arXiv:2012.13255 [cs], December 2020. URL http://arxiv.org/abs/2012.13255.
738 15 2 Zeyuan Allen-Zhu and Yuanzhi Li. What Can ResNet Learn Efficiently, Going Beyond Kernels? In NeurIPS, 2019. Full version available at http://arxiv.org/abs/1905.10337.
739 15 3 Zeyuan Allen-Zhu and Yuanzhi Li. Backward feature correction: How deep learning performs deep learning. arXiv preprint arXiv:2001.04413, 2020a.
740 15 4 Zeyuan Allen-Zhu and Yuanzhi Li. Feature purification: How adversarial training performs robust deep learning. arXiv preprint arXiv:2005.10190, 2020b.
741 15 5 Zeyuan Allen-Zhu, Yuanzhi Li, and Zhao Song. A convergence theory for deep learning via overparameterization. In ICML, 2019. Full version available at http://arxiv.org/abs/1811.03962.
742 15 6 Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton. Layer normalization, 2016.
743 15 7 Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language Models are Few-Shot Learners. arXiv:2005.14165 [cs], July 2020. URL http://arxiv.org/abs/2005.14165.
744 15 8 Jian-Feng Cai, Emmanuel J. Candès, and Zuowei Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4):1956–1982, 2010.
745 15 9 Daniel Cer, Mona Diab, Eneko Agirre, Inigo Lopez-Gazpio, and Lucia Specia. Semeval-2017 task 1: Semantic textual similarity multilingual and crosslingual focused evaluation. Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), 2017. doi: 10.18653/v1/s17-2001. URL http://dx.doi.org/10.18653/v1/S17-2001.
746 15 10 Ronan Collobert and Jason Weston. A unified architecture for natural language processing: deep neural networks with multitask learning. In Proceedings of the 25th international conference on Machine learning, ICML ’08, pp. 160–167, New York, NY, USA, July 2008. Association for Computing Machinery. ISBN 978-1-60558-205-4. doi: 10.1145/1390156.1390177. URL https://doi.org/10.1145/1390156.1390177.
747 15 11 Misha Denil, Babak Shakibi, Laurent Dinh, Marc’Aurelio Ranzato, and Nando de Freitas. Predicting parameters in deep learning, 2014.
748 15 12 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding, 2019a.
749 15 13 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs], May 2019b. URL http://arxiv.org/abs/1810.04805. arXiv: 1810.04805.
750 15 14 William B. Dolan and Chris Brockett. Automatically constructing a corpus of sentential paraphrases. In Proceedings of the Third International Workshop on Paraphrasing (IWP2005), 2005. URL https://aclanthology.org/I05-5002.
751 15 15 Claire Gardent, Anastasia Shimorina, Shashi Narayan, and Laura Perez-Beltrachini. The webnlg challenge: Generating text from rdf data. In Proceedings of the 10th International Conference on Natural Language Generation, pp. 124–133, 2017.
752 15 16 Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, and Andrea Montanari. When do neural networks outperform kernel methods? arXiv preprint arXiv:2006.13409, 2020.
753 15 17 Bogdan Gliwa, Iwona Mochol, Maciej Biesek, and Aleksander Wawer. Samsum corpus: A human-annotated dialogue dataset for abstractive summarization. CoRR, abs/1911.12237, 2019. URL http://arxiv.org/abs/1911.12237.
754 15 18 Lars Grasedyck, Daniel Kressner, and Christine Tobler. A literature survey of low-rank tensor approximation techniques. GAMM-Mitteilungen, 36(1):53–78, 2013.
755 15 19 Jihun Ham and Daniel D. Lee. Grassmann discriminant analysis: a unifying view on subspace-based learning. In ICML, pp. 376–383, 2008. URL https://doi.org/10.1145/1390156.1390204.
756 15 20 Karen Hambardzumyan, Hrant Khachatrian, and Jonathan May. WARP: Word-level Adversarial ReProgramming. arXiv:2101.00121 [cs], December 2020. URL http://arxiv.org/abs/2101.00121. arXiv: 2101.00121.
757 15 21 Pengcheng He, Xiaodong Liu, Jianfeng Gao, and Weizhu Chen. Deberta: Decoding-enhanced bert with disentangled attention, 2021.
758 15 22 Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, and Sylvain Gelly. Parameter-Efficient Transfer Learning for NLP. arXiv:1902.00751 [cs, stat], June 2019. URL http://arxiv.org/abs/1902.00751.
759 15 23 Max Jaderberg, Andrea Vedaldi, and Andrew Zisserman. Speeding up convolutional neural networks with low rank expansions. arXiv preprint arXiv:1405.3866, 2014.
760 15 24 Mikhail Khodak, Neil Tenenholtz, Lester Mackey, and Nicolò Fusi. Initialization and regularization of factorized neural layers, 2021.
761 15 25 Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization, 2017.
762 15 26 Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. Gshard: Scaling giant models with conditional computation and automatic sharding, 2020.
763 15 27 Brian Lester, Rami Al-Rfou, and Noah Constant. The Power of Scale for Parameter-Efficient Prompt Tuning. arXiv:2104.08691 [cs], April 2021. URL http://arxiv.org/abs/2104.08691. arXiv: 2104.08691.
764 15 28 Chunyuan Li, Heerad Farkhoor, Rosanne Liu, and Jason Yosinski. Measuring the Intrinsic Dimension of Objective Landscapes. arXiv:1804.08838 [cs, stat], April 2018a. URL http://arxiv.org/abs/1804.08838. arXiv: 1804.08838.
765 15 29 Xiang Lisa Li and Percy Liang. Prefix-Tuning: Optimizing Continuous Prompts for Generation. arXiv:2101.00190 [cs], January 2021. URL http://arxiv.org/abs/2101.00190.
766 15 30 Yuanzhi Li and Yingyu Liang. Learning overparameterized neural networks via stochastic gradient descent on structured data. In Advances in Neural Information Processing Systems, 2018.
767 15 31 Yuanzhi Li, Yingyu Liang, and Andrej Risteski. Recovery guarantee of weighted low-rank approximation via alternating minimization. In International Conference on Machine Learning, pp. 2358–2367. PMLR, 2016.
768 15 32 Yuanzhi Li, Tengyu Ma, and Hongyang Zhang. Algorithmic regularization in over-parameterized matrix sensing and neural networks with quadratic activations. In Conference On Learning Theory, pp. 2–47. PMLR, 2018b.
769 15 33 Zhaojiang Lin, Andrea Madotto, and Pascale Fung. Exploring versatile generative language model via parameter-efficient transfer learning. In Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 441–459, Online, November 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.findings-emnlp.41. URL https://aclanthology.org/2020.findings-emnlp.41.
770 15 34 Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, and Jie Tang. GPT Understands, Too. arXiv:2103.10385 [cs], March 2021. URL http://arxiv.org/abs/2103.10385. arXiv: 2103.10385.
771 15 35 Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. Roberta: A robustly optimized bert pretraining approach, 2019.
772 15 36 Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
773 15 37 Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization, 2019.
774 15 38 Rabeeh Karimi Mahabadi, James Henderson, and Sebastian Ruder. Compacter: Efficient low-rank hypercomplex adapter layers, 2021.
775 15 39 Linyong Nan, Dragomir Radev, Rui Zhang, Amrit Rau, Abhinand Sivaprasad, Chiachun Hsieh, Xiangru Tang, Aadit Vyas, Neha Verma, Pranav Krishna, et al. Dart: Open-domain structured data record to text generation. arXiv preprint arXiv:2007.02871, 2020.
776 15 40 Jekaterina Novikova, Ondřej Dušek, and Verena Rieser. The e2e dataset: New challenges for end-to-end generation. arXiv preprint arXiv:1706.09254, 2017.
777 15 41 Samet Oymak, Zalan Fabian, Mingchen Li, and Mahdi Soltanolkotabi. Generalization guarantees for neural networks via harnessing the low-rank structure of the jacobian. arXiv preprint arXiv:1906.05392, 2019.
778 15 42 Jonas Pfeiffer, Aishwarya Kamath, Andreas Rücklé, Kyunghyun Cho, and Iryna Gurevych. AdapterFusion: Non-destructive task composition for transfer learning, 2021.
779 15 43 Daniel Povey, Gaofeng Cheng, Yiming Wang, Ke Li, Hainan Xu, Mahsa Yarmohammadi, and Sanjeev Khudanpur. Semi-orthogonal low-rank matrix factorization for deep neural networks. In Interspeech, pp. 3743–3747, 2018.
780 15 44 Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. Improving Language Understanding by Generative Pre-Training. pp. 12, a.
781 15 45 Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. Language Models are Unsupervised Multitask Learners. pp. 24, b.
782 15 46 Pranav Rajpurkar, Robin Jia, and Percy Liang. Know what you don’t know: Unanswerable questions for squad. CoRR, abs/1806.03822, 2018. URL http://arxiv.org/abs/1806.03822.
783 15 47 Sylvestre-Alvise Rebuffi, Hakan Bilen, and Andrea Vedaldi. Learning multiple visual domains with residual adapters. arXiv:1705.08045 [cs, stat], November 2017. URL http://arxiv.org/abs/1705.08045. arXiv: 1705.08045.
784 15 48 Andreas Rücklé, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, and Iryna Gurevych. Adapterdrop: On the efficiency of adapters in transformers, 2020.
785 15 49 Tara N Sainath, Brian Kingsbury, Vikas Sindhwani, Ebru Arisoy, and Bhuvana Ramabhadran. Low-rank matrix factorization for deep neural network training with high-dimensional output targets. In 2013 IEEE international conference on acoustics, speech and signal processing, pp. 6655–6659. IEEE, 2013.
786 15 50 Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, and Bryan Catanzaro. Megatron-lm: Training multi-billion parameter language models using model parallelism, 2020.
787 15 51 Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Ng, and Christopher Potts. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1631–1642, Seattle, Washington, USA, October 2013. Association for Computational Linguistics. URL https://aclanthology.org/D13-1170.
788 15 52 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6000–6010, 2017.
789 15 53 Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. Glue: A multi-task benchmark and analysis platform for natural language understanding, 2019.
790 15 54 Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. Superglue: A stickier benchmark for general-purpose language understanding systems, 2020.
791 15 55 Alex Warstadt, Amanpreet Singh, and Samuel R Bowman. Neural network acceptability judgments. arXiv preprint arXiv:1805.12471, 2018.
792 15 56 Adina Williams, Nikita Nangia, and Samuel Bowman. A broad-coverage challenge corpus for sentence understanding through inference. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 1112–1122, New Orleans, Louisiana, June 2018. Association for Computational Linguistics. doi: 10.18653/v1/N18-1101. URL https://www.aclweb.org/anthology/N18-1101.
793 15 57 Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander M. Rush. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45, Online, October 2020. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/2020.emnlp-demos.6.
794 15 58 Greg Yang and Edward J. Hu. Feature Learning in Infinite-Width Neural Networks. arXiv:2011.14522 [cond-mat], May 2021. URL http://arxiv.org/abs/2011.14522.
795 15 59 Elad Ben Zaken, Shauli Ravfogel, and Yoav Goldberg. Bitfit: Simple parameter-efficient fine-tuning for transformer-based masked language-models, 2021.
796 15 60 Yu Zhang, Ekapol Chuangsuwanich, and James Glass. Extracting deep neural network bottleneck features using low-rank matrix factorization. In 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp. 185–189. IEEE, 2014.
797 15 61 Yong Zhao, Jinyu Li, and Yifan Gong. Low-rank plus diagonal adaptation for deep neural networks. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5005–5009. IEEE, 2016.
798 15 62 Victor Zhong, Caiming Xiong, and Richard Socher. Seq2sql: Generating structured queries from natural language using reinforcement learning. CoRR, abs/1709.00103, 2017. URL http://arxiv.org/abs/1709.00103.
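Several references in this block (15 41, 15 43, 15 49, 15 60, 15 61) apply low-rank matrix factorization to neural network weights. A minimal sketch of that core idea in plain NumPy; the truncated-SVD construction and all names here are ours, not code from any cited paper:

```python
# Hedged sketch: approximate a dense weight matrix W by two thin factors
# A @ B (rank r), the idea shared by the low-rank references above.
import numpy as np

def low_rank_factorize(W: np.ndarray, rank: int):
    """Return thin factors A (m x r) and B (r x n) with A @ B ~= W,
    using the truncated SVD (optimal in Frobenius norm)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # absorb singular values into A's columns
    B = Vt[:rank, :]
    return A, B

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((512, 512))
    A, B = low_rank_factorize(W, rank=32)
    # 512*512 = 262144 parameters vs. 2*512*32 = 32768 (~8x fewer)
    err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
    print(A.shape, B.shape, f"relative error {err:.3f}")
```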
799 16 0 E. Huckvale and H. N. Moseley. kegg_pull: a software package for the RESTful access and pulling from the Kyoto Encyclopedia of Gene and Genomes. BMC Bioinformatics, 24:1–17, 12 2023. ISSN 14712105. doi: 10.1186/s12859-023-05208-0. URL https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-023-05208-0. https://qiita.com/kaizen_nagoya/items/05be40565793f2b4f7f3 (a minimal KEGG REST sketch follows this reference block)
800 16 1 Kawashima S, Katayama T, Sato Y, Kanehisa M. KEGG API: a web service using SOAP/WSDL to Access the KEGG System. Genome Inform. 2003;14:673.
801 16 2 Kanehisa M. Toward understanding the origin and evolution of cellular organisms. Protein Sci. 2019;28:1947–51.
802 16 3 Kanehisa M, Furumichi M, Sato Y, Ishiguro-Watanabe M, Tanabe M. KEGG: integrating viruses and cellular organisms. Nucleic Acids Res. 2021;49:D545–51.
803 16 4 The Kyoto Encyclopedia of Genes and Genomes—KEGG. Yeast. 2000;1:48–55.
804 16 5 Fielding RT. Representational state transfer. Architectural Styles and the Design of Network-Based Software Architectures. Doctoral dissertation. University of California Irvine, Irvine, CA, USA; 2000.
805 16 6 Reitz K. requests. Computer software. Pypi; 2013.
806 16 7 Christudas B. cURL and Postman. In: Practical Microservices Architectural Patterns: Event-Based Java Microservices with Spring Boot and Spring Cloud. Berkeley, CA: Apress. 2019;847–55.
807 16 8 R Core Team, editor. R: A Language and environment for statistical computing. 2018.
808 16 9 Rossum GV, Drake FL. Python 3 Reference Manual. CreateSpace; 2009.
809 16 10 Wilkinson MD, Dumontier M, Aalbersberg IJJ, Appleton G, Axton M, Baak A, et al. The FAIR guiding principles for scientific data management and stewardship. Sci Data. 2016;3:160018.
810 16 11 Tenenbaum D, Volkening J. KEGGREST. Computer software. Bioconductor Package Maintainer; 2022.
811 16 12 Castelli FM. KEGGutils v04.1. Computer software. 2022. Zenodo. https://doi.org/10.5281/zenodo.7482523.
812 16 13 Cock PJA. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Computer software. PyPi; 2009.
813 16 14 Giampieri E. keggrest. Computer software. PyPi; 2013.
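kegg_pull (16 0) and the KEGGREST/keggrest packages (16 11, 16 14) all wrap KEGG's plain REST interface, built from the pieces cited above (REST, 16 5; requests, 16 6). A minimal sketch, assuming only the public https://rest.kegg.jp endpoints and the requests library; this is not the kegg_pull API itself:

```python
# Hedged sketch of the KEGG REST interface that kegg_pull wraps.
import requests

BASE = "https://rest.kegg.jp"

def kegg_get(entry_id: str) -> str:
    """Fetch one flat-file entry, e.g. a human gene or a pathway."""
    resp = requests.get(f"{BASE}/get/{entry_id}", timeout=30)
    resp.raise_for_status()
    return resp.text

def kegg_list(database: str) -> list[str]:
    """List entry IDs in a KEGG database such as 'pathway' or 'hsa'."""
    resp = requests.get(f"{BASE}/list/{database}", timeout=30)
    resp.raise_for_status()
    return resp.text.splitlines()

if __name__ == "__main__":
    print(kegg_get("hsa:7157")[:200])   # TP53 entry, first 200 characters
    print(len(kegg_list("pathway")))    # number of reference pathways
```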
814 17 0 Q. Jin, Y. Yang, Q. Chen, and Z. Lu. Genegpt: augmenting large language models with domain tools for improved access to biomedical information. Bioinformatics, 40, 2 2024. ISSN 13674811. doi: 10.1093/BIOINFORMATICS/BTAE075. URL https://dx.doi.org/10.1093/bioinformatics/btae075. https://qiita.com/kaizen_nagoya/items/8897792ff52fb5e68a46
815 17 1 Altschul SF, Gish W, Miller W et al. Basic local alignment search tool. J Mol Biol 1990;215:403–10.
816 17 2 Boratyn GM, Camacho C, Cooper PS et al. Blast: a more efficient report with usability improvements. Nucleic Acids Res 2013;41:W29–W33.
817 17 3 Borgeaud S, Mensch A, Hoffmann J et al. Improving language models by retrieving from trillions of tokens. In: International conference on machine learning, Baltimore, Maryland, USA, p. 2206–40. PMLR, 2022.
818 17 4 Brown T, Mann B, Ryder N et al. Language models are few-shot learners. Advances in Neural Information Processing Systems 2020;33:1877–901.
819 17 5 Chen M, Tworek J, Jun H et al. Evaluating large language models trained on code. arXiv, arXiv:2107.03374, 2021, preprint: not peer reviewed.
820 17 6 Chowdhery A, Narang S, Devlin J et al. Palm: scaling language modeling with pathways. arXiv, arXiv:2204.02311, 2022, preprint: not peer reviewed.
821 17 7 Ely JW, Osheroff JA, Chambliss ML et al. Answering physicians’ clinical questions: obstacles and potential solutions. J Am Med Inform Assoc 2005;12:217–24.
822 17 8 Gao L, Madaan A, Zhou S et al. Pal: program-aided language models. arXiv, arXiv:2211.10435, 2022, preprint: not peer reviewed.
823 17 9 Guu K, Lee K, Tung Z et al. Retrieval augmented language model pre-training. In: International conference on machine learning, p. 3929–3938. PMLR, 2020.
824 17 10 Hou W, Ji Z. Geneturing tests gpt models in genomics. bioRxiv 2023:2023–03.
825 17 11 Ji Z, Lee N, Frieske R et al. Survey of hallucination in natural language generation. ACM Comput Surv 2023;55:1–38.
826 17 12 Jin Q, Leaman R, Lu Z. Retrieve, summarize, and verify: how will chatgpt impact information seeking from the medical literature? J Am Soc Nephrol 2023a;34:1302–4.
827 17 13 Jin Q, Wang Z, Floudas CS et al. Matching patients to clinical trials with large language models. arXiv, arXiv:2307.15051, 2023b, preprint: not peer reviewed.
828 17 14 Jin Q, Yuan Z, Xiong G et al. Biomedical question answering: a survey of approaches and challenges. ACM Comput Surv 2022;55:1–36.
829 17 15 Kaplan J, McCandlish S, Henighan T et al. Scaling laws for neural language models. arXiv, arXiv:2001.08361, 2020, preprint: not peer reviewed.
830 17 16 Lewis P, Perez E, Piktus A et al. Retrieval-augmented generation for knowledge-intensive nlp tasks. Adv Neural Inform Process Syst 2020;33:9459–74.
831 17 17 Liévin V, Hother CE, Winther O. Can large language models reason about medical questions? arXiv, arXiv:2207.08143, 2022, preprint: not peer reviewed.
832 17 18 Luo R, Sun L, Xia Y et al. Biogpt: generative pre-trained transformer for biomedical text generation and mining. Brief Bioinform 2022;23. https://doi.org/10.1093/bib/bbac409.
833 17 19 Mialon G, Dessì R, Lomeli M et al. Augmented language models: a survey. arXiv, arXiv:2302.07842, 2023, preprint: not peer reviewed.
834 17 20 Nori H, King N, McKinney SM et al. Capabilities of gpt-4 on medical challenge problems. arXiv, arXiv:2303.13375, 2023, preprint: not peer reviewed.
835 17 21 OpenAI. GPT-4 technical report. CoRR abs/2303.08774, 2023.
836 17 22 Parisi A, Zhao Y, Fiedel N. Talm: tool augmented language models. arXiv, arXiv:2205.12255, 2022, preprint: not peer reviewed.
837 17 23 Qin Y, Hu S, Lin Y et al. Tool learning with foundation models. arXiv, arXiv:2304.08354, 2023, preprint: not peer reviewed. http://arxiv.org/pdf/2304.08354.pdf.
838 17 24 Radford A, Narasimhan K, Salimans T et al. Improving language understanding by generative pre-training, OpenAI Blog, 2018.
839 17 25 Radford A, Wu J, Child R et al. Language models are unsupervised multitask learners. OpenAI Blog 2019;1:9.
840 17 26 Sayers EW, Agarwala R, Bolton EE et al. Database resources of the national center for biotechnology information. Nucleic Acids Res 2019;47:D23–D28.
841 17 27 Schick T, Dwivedi-Yu J, Dessì R et al. Toolformer: language models can teach themselves to use tools. arXiv, arXiv:2302.04761, 2023, preprint: not peer reviewed.
842 17 28 Schuler G, Epstein J, Ohkawa H et al. Entrez: molecular biology database and retrieval system. Methods Enzymol 1996;266:141–62.
843 17 29 Singhal K, Azizi S, Tu T et al. Large language models encode clinical knowledge. arXiv, arXiv:2212.13138, 2022, preprint: not peer reviewed.
844 17 30 Tian S, Jin Q, Yeganova L et al. Opportunities and challenges for chatgpt and large language models in biomedicine and health. Brief Bioinform 2024;25(1). https://doi.org/10.1093/bib/bbad493.
845 17 31 Wei J, Tay Y, Bommasani R et al. Emergent abilities of large language models. arXiv, arXiv:2206.07682, 2022a, preprint: not peer reviewed.
846 17 32 Wei J, Wang X, Schuurmans D et al. Chain of thought prompting elicits reasoning in large language models. arXiv, arXiv:2201.11903, 2022b, preprint: not peer reviewed.
847 17 33 Wong C, Zheng S, Gu Y et al. Scaling clinical trial matching using large language models: a case study in oncology. arXiv, arXiv:2308.02180, 2023, preprint: not peer reviewed.
848 17 34 Yao S, Zhao J, Yu D et al. React: synergizing reasoning and acting in language models. arXiv, arXiv:2210.03629, 2022, preprint: not peer reviewed.
849 18 0 M. Kanehisa, M. Furumichi, Y. Sato, Y. Matsuura, and M. Ishiguro-Watanabe. Kegg: biological systems database as a model of the real world. Nucleic Acids Research, 53:D672–D677, 1 2025. ISSN 0305-1048. doi: 10.1093/NAR/GKAE909. URL https://dx.doi.org/10.1093/nar/gkae909. https://qiita.com/kaizen_nagoya/items/f63573043eaf8f9c6a2c
850 18 1 1. Kanehisa, M. (2019) Toward understanding the origin and evolution of cellular organisms. Protein Sci., 28, 1947–1951.
851 18 2 2. Kanehisa, M., Furumichi, M., Sato, Y., Kawashima, M. and Ishiguro-Watanabe, M. (2023) KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res., 51, D587–D592.
852 18 3 3. Jin, Z., Sato, Y., Kawashima, M. and Kanehisa, M. (2023) KEGG tools for classification and analysis of viral proteins. Protein Sci., 32, e4840.
853 18 4 4. Fujibuchi, W., Goto, S., Migimatsu, H., Uchiyama, I., Ogiwara, A., Akiyama, Y. and Kanehisa, M. (1998) DBGET/LinkDB: an integrated database retrieval system. Pac. Symp. Biocomput., 683–694.
854 18 5 5. Kanehisa, M. and Sato, Y. (2020) KEGG Mapper for inferring cellular functions from protein sequences. Protein Sci., 29, 28–35.
855 18 6 6. Kanehisa, M., Sato, Y. and Kawashima, M. (2022) KEGG mapping tools for uncovering hidden features in biological data. Protein Sci., 31, 47–53.
856 18 7 7. Haft, D.H., Badretdin, A., Coulouris, G., DiCuccio, M., Durkin, A.S., Jovenitti, E., Li, W., Mersha, M., O’Neill, K.R., Virothaisakun, J., et al. (2024) RefSeq and the prokaryotic genome annotation pipeline in the age of metagenomes. Nucleic Acids Res., 52, D762–D769.
857 18 8 8. Goad, W.B. and Kanehisa, M.I. (1982) Pattern recognition in nucleic acid sequences. I. A general method for finding local homologies and symmetries. Nucleic Acids Res., 10, 247–263.
858 18 9 9. Smith, T.F. and Waterman, M.S. (1981) Identification of common molecular subsequences. J. Mol. Biol., 147, 195–197.
859 18 10 10. Schoch, C.L., Ciufo, S., Domrachev, M., Hotton, C.L., Kannan, S., Khovanskaya, R., Leipe, D., Mcveigh, R., O’Neill, K., Robbertse, B., et al. (2020) NCBI Taxonomy: a comprehensive update on curation, resources and tools. Database, 2020, baaa062.
860 18 11 11. Siddell, S.G., Smith, D.B., Adriaenssens, E., Alfenas-Zerbini, P., Dutilh, B.E., Garcia, M.L., Junglen, S., Krupovic, M., Kuhn, J.H., Lambert, A.J., et al. (2023) Virus taxonomy and the role of the International Committee on Taxonomy of Viruses (ICTV). J. Gen. Virol., 104, 001840.
861 19 0 J. Kans. Entrez Direct: E-utilities on the Unix command line. In: Entrez Programming Utilities Help. NCBI Bookshelf, 4 2013. URL https://www.ncbi.nlm.nih.gov/books/NBK179288/. https://qiita.com/kaizen_nagoya/items/cc4bbde566e67abc93d9 (a minimal E-utilities sketch follows this reference block)
862 19 1 The Smithsonian Online Collections Databases are provided by the National Museum of Natural History, Smithsonian Institution, 10th and Constitution Ave. N.W., Washington, DC 20560-0193. https://collections.nmnh.si.edu/.
863 19 2 den Dunnen JT, Dalgleish R, Maglott DR, Hart RK, Greenblatt MS, McGowan-Jordan J, Roux AF, Smith T, Antonarakis SE, Taschner PE. HGVS Recommendations for the Description of Sequence Variants: 2016 Update. Hum Mutat. 2016. https://doi.org/10.1002/humu.22981. (PMID 26931183.)
864 19 3 Holmes JB, Moyer E, Phan L, Maglott D, Kattman B. SPDI: data model for variants and applications at NCBI. Bioinformatics. 2020. https://doi.org/10.1093/bioinformatics/btz856. (PMID 31738401.)
865 19 4 Hutchins BI, Baker KL, Davis MT, Diwersy MA, Haque E, Harriman RM, Hoppe TA, Leicht SA, Meyer P, Santangelo GM. The NIH Open Citation Collection: A public access, broad coverage resource. PLoS Biol. 2019. https://doi.org/10.1371/journal.pbio.3000385. (PMID 31600197.)
866 19 5 Kim S, Thiessen PA, Cheng T, Yu B, Bolton EE. An update on PUG-REST: RESTful interface for programmatic access to PubChem. Nucleic Acids Res. 2018. https://doi.org/10.1093/nar/gky294. (PMID 29718389.)
867 19 6 Mitchell JA, Aronson AR, Mork JG, Folk LC, Humphrey SM, Ward JM. Gene indexing: characterization and analysis of NLM's GeneRIFs. AMIA Annu Symp Proc. 2003:460-4. (PMID 14728215.)
868 19 7 Ostell JM, Wheelan SJ, Kans JA. The NCBI data model. Methods Biochem Anal. 2001. https://doi.org/10.1002/0471223921.ch2. (PMID 11449725.)
869 19 8 Schuler GD, Epstein JA, Ohkawa H, Kans JA. Entrez: molecular biology database and retrieval system. Methods Enzymol. 1996. https://doi.org/10.1016/s0076-6879(96)66012-1. (PMID 8743683.)
870 19 9 Wei C-H, Allot A, Leaman R, Lu Z. PubTator central: automated concept annotation for biomedical full text articles. Nucleic Acids Res. 2019. https://doi.org/10.1093/nar/gkz389. (PMID 31114887.)
871 19 10 Wu C, Macleod I, Su AI. BioGPS and MyGene.info: organizing online, gene-centric information. Nucleic Acids Res. 2013. https://doi.org/10.1093/nar/gks1114. (PMID 23175613.)
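Entrez Direct (19 0) pipes esearch into efetch on the Unix command line; the same pair of E-utilities calls can be reproduced over plain HTTP. A minimal sketch, assuming NCBI's public esearch.fcgi and efetch.fcgi endpoints; the query term is an arbitrary example, not one from the book:

```python
# Hedged sketch of the two E-utilities calls behind an
# "esearch ... | efetch -format fasta" Entrez Direct pipeline.
import requests

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch(db: str, term: str, retmax: int = 5) -> list[str]:
    """Return up to retmax record IDs matching an Entrez query."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    r = requests.get(f"{EUTILS}/esearch.fcgi", params=params, timeout=30)
    r.raise_for_status()
    return r.json()["esearchresult"]["idlist"]

def efetch_fasta(db: str, ids: list[str]) -> str:
    """Fetch the FASTA records for a list of IDs."""
    params = {"db": db, "id": ",".join(ids), "rettype": "fasta", "retmode": "text"}
    r = requests.get(f"{EUTILS}/efetch.fcgi", params=params, timeout=30)
    r.raise_for_status()
    return r.text

if __name__ == "__main__":
    ids = esearch("nuccore", "BRCA1[gene] AND human[orgn]", retmax=2)
    print(efetch_fasta("nuccore", ids)[:300])
```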
872 20 0 M. J. Landrum, J. M. Lee, G. R. Riley, W. Jang, W. S. Rubinstein, D. M. Church, and D. R. Maglott. Clinvar: Public archive of relationships among sequence variation and human phenotype. Nucleic Acids Research, 42, 1 2014. ISSN 03051048. doi: 10.1093/NAR/GKT1113. URL https://pubmed.ncbi.nlm.nih.gov/24234437/. https://qiita.com/kaizen_nagoya/items/8149b7a5a4f930490fad
873 20 1 NCBI Resource Coordinators. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2013;41:D8–D20.
874 20 2 Lappalainen I, Lopez J, Skipper L, Hefferon T, Spalding JD, Garner J, Chen C, Maguire M, Corbett M, Zhou G, et al. DbVar and DGVa: public archives for genomic structural variation. Nucleic Acids Res. 2013;41:D936–D941.
875 20 3 Rubinstein WS, Maglott DR, Lee JM, Kattman BL, Malheiro AJ, Ovetsky M, Hem V, Gorelenkov V, Song G, Wallin C, et al. The NIH genetic testing registry: a new, centralized database of genetic tests to enable access to comprehensive information and improve transparency. Nucleic Acids Res. 2013;41:D925–D935.
876 20 4 den Dunnen JT, Antonarakis SE. Mutation nomenclature extensions and suggestions to describe complex mutations: a discussion. Hum. Mutation. 2000;15:7–12.
877 20 5 Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell C, Hart J, Landrum MJ, McGarvey KM, et al. RefSeq: an update on mammalian reference sequences. Nucleic Acids Res. 2014;42:D756–D763.
878 21 0 J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, and J. Kang. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36:1234–1240, 2 2020. ISSN 1367-4803. doi: 10.1093/BIOINFORMATICS/BTZ682. URL https://dx.doi.org/10.1093/bioinformatics/btz682. https://qiita.com/kaizen_nagoya/items/63781eb6db1fc2ded80a (a minimal usage sketch follows this reference block)
879 21 1 Alsentzer E. et al. (2019) Publicly available clinical bert embeddings. In: Proceedings of the 2nd Clinical Natural Language Processing Workshop, Minneapolis, MN, USA. pp. 72–78. Association for Computational Linguistics. https://www.aclweb.org/anthology/W19-1909.
880 21 2 Bhasuran B., Natarajan J. (2018) Automatic extraction of gene-disease associations from literature using joint ensemble learning. PLoS One, 13, e0200699.
881 21 3 Bravo À. et al. (2015) Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research. BMC Bioinformatics, 16, 55.
882 21 4 Devlin J. et al. (2019) Bert: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA. pp. 4171–4186. Association for Computational Linguistics. https://www.aclweb.org/anthology/N19-1423.
883 21 5 Doğan R.I. et al. (2014) NCBI disease corpus: a resource for disease name recognition and concept normalization. J. Biomed. Inform., 47, 1–10.
884 21 6 Gerner M. et al. (2010) Linnaeus: a species name identification system for biomedical literature. BMC Bioinformatics, 11, 85.
885 21 7 Giorgi J.M., Bader G.D. (2018) Transfer learning for biomedical named entity recognition with neural networks. Bioinformatics, 34, 4087.
886 21 8 Habibi M. et al. (2017) Deep learning with word embeddings improves biomedical named entity recognition. Bioinformatics, 33, i37–i48.
887 21 9 Kim J.-D. et al. (2004) Introduction to the bio-entity recognition task at JNLPBA. In: Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA/BioNLP), Geneva, Switzerland. pp. 73–78. COLING. https://www.aclweb.org/anthology/W04-1213.
888 21 10 Krallinger M. et al. (2015) The chemdner corpus of chemicals and drugs and its annotation principles. J. Cheminform., 7.
889 21 11 Krallinger M. et al. (2017) Overview of the BioCreative VI chemical-protein interaction track. In: Proceedings of the BioCreative VI Workshop, Bethesda, MD, USA, pp. 141–146. https://academic.oup.com/database/article/doi/10.1093/database/bay073/5055578.
890 21 12 Li J. et al. (2016) Biocreative V CDR task corpus: a resource for chemical disease relation extraction. Database, 2016.
891 21 13 Lim S., Kang J. (2018) Chemical–gene relation extraction using recursive neural network. Database, 2018.
892 21 14 Lin C. et al. (2019) A bert-based universal model for both within-and cross-sentence clinical temporal relation extraction. In: Proceedings of the 2nd Clinical Natural Language Processing Workshop, Minneapolis, MN, USA. pp. 65–71. Association for Computational Linguistics. https://www.aclweb.org/anthology/W19-1908.
893 21 15 Lou Y. et al. (2017) A transition-based joint model for disease named entity recognition and normalization. Bioinformatics, 33, 2363–2371.
894 21 16 Luo L. et al. (2018) An attention-based BiLSTM-CRF approach to document-level chemical named entity recognition. Bioinformatics, 34, 1381–1388.
895 21 17 McCann B. et al. (2017) Learned in translation: contextualized word vectors. In: Guyon,I. et al. (eds.), Advances in Neural Information Processing Systems 30, Curran Associates, Inc., pp. 6294–6305. http://papers.nips.cc/paper/7209-learned-in-translation-contextualized-word-vectors.pdf.
896 21 18 Mikolov T. et al. (2013) Distributed representations of words and phrases and their compositionality. In: Burges,C.J.C. (eds.), Advances in Neural Information Processing Systems 26, Curran Associates, Inc., pp. 3111–3119. http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf.
897 21 19 Mohan S., Li D. (2019) Medmentions: a large biomedical corpus annotated with UMLS concepts. arXiv preprint arXiv: 1902.09476.
898 21 20 Pafilis E. et al. (2013) The species and organisms resources for fast and accurate identification of taxonomic names in text. PLoS One, 8, e65390.
899 21 21 Pennington J. et al. (2014) Glove: Global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar. pp. 1532–1543. Association for Computational Linguistics. https://www.aclweb.org/anthology/D14-1162.
900 21 22 Peters M.E. et al. (2018) Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), New Orleans, LA. pp. 2227–2237. Association for Computational Linguistics. https://www.aclweb.org/anthology/N18-1202.
901 21 23 Pyysalo S. et al. (2013) Distributional semantics resources for biomedical text processing. In: Proceedings of the 5th International Symposium on Languages in Biology and Medicine, Tokyo, Japan, pp. 39–43. https://academic.oup.com/bioinformatics/article/33/14/i37/3953940.
902 21 24 Rajpurkar P. et al. (2016) Squad: 100,000+ questions for machine comprehension of text. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX. pp. 2383–2392. Association for Computational Linguistics. https://www.aclweb.org/anthology/D16-1264.
903 21 25 Sachan D.S. et al. (2018) Effective use of bidirectional language modeling for transfer learning in biomedical named entity recognition. In: Finale,D.-V. et al. (eds.), Proceedings of Machine Learning Research, Palo Alto, CA, Vol. 85, pp. 383–402. PMLR. http://proceedings.mlr.press/v85/sachan18a.html.
904 21 26 Smith L. et al. (2008) Overview of biocreative ii gene mention recognition. Genome Biol., 9, S2.
905 21 27 Sousa D. et al. (2019) A silver standard corpus of human phenotype-gene relations. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN. pp. 1487–1492. Association for Computational Linguistics. https://www.aclweb.org/anthology/N19-1152.
906 21 28 Sung N. et al. (2017) NSML: A machine learning platform that enables you to focus on your models. arXiv preprint arXiv: 1712.05902.
907 21 29 Tsatsaronis G. et al. (2015) An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition. BMC Bioinformatics, 16, 138.
908 21 30 Uzuner Ö. et al. (2011) 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. J. Am. Med. Inform. Assoc., 18, 552–556.
909 21 31 Van Mulligen E.M. et al. (2012) The EU-ADR corpus: annotated drugs, diseases, targets, and their relationships. J. Biomed. Inform., 45, 879–884.
910 21 32 Vaswani A. et al. (2017) Attention is all you need. In: Guyon,I. et al. (eds.), Advances in Neural Information Processing Systems, pp. 5998–6008. Curran Associates, Inc. http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf.
911 21 33 Wang X. et al. (2018) Cross-type biomedical named entity recognition with deep multi-task learning. Bioinformatics, 35, 1745–1752.
912 21 34 Wiese G. et al. (2017) Neural domain adaptation for biomedical question answering. In: Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), Vancouver, Canada. pp. 281–289. Association for Computational Linguistics. https://www.aclweb.org/anthology/K17-1029.
913 21 35 Wu Y. et al. (2016) Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv: 1609.08144.
914 21 36 Xu K. et al. (2019) Document-level attention-based BiLSTM-CRF incorporating disease dictionary for disease named entity recognition. Comput. Biol. Med., 108, 122–132.
915 21 37 Yoon W. et al. (2019) Collabonet: collaboration of deep neural networks for biomedical named entity recognition. BMC Bioinformatics, 20, 249.
916 21 38 Zhu H. et al. (2018) Clinical concept extraction with contextual word embedding. NIPS Machine Learning for Health Workshop. http://par.nsf.gov/biblio/10098080.
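BioBERT (21 0) is distributed as ordinary BERT weights, so it can be loaded as a feature extractor before attaching the task-specific heads (NER, relation extraction, QA) the paper evaluates. A minimal sketch, assuming the dmis-lab/biobert-v1.1 checkpoint on the Hugging Face Hub and the transformers library:

```python
# Hedged sketch: extract a sentence embedding from a BioBERT checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

name = "dmis-lab/biobert-v1.1"  # assumed public checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

sentence = "BRCA1 mutations increase the risk of breast cancer."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# [CLS] embedding as a sentence-level representation; downstream tasks
# would add a classification head and fine-tune on labeled data.
cls_embedding = outputs.last_hidden_state[:, 0, :]
print(cls_embedding.shape)  # torch.Size([1, 768])
```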
917 22 0 Q. Li, Z. Hu, Y. Wang, L. Li, Y. Fan, I. King, G. Jia, S. Wang, L. Song, and Y. Li. Progress and opportunities of foundation models in bioinformatics. Briefings in Bioinformatics, 25:548, 9 2024. ISSN 14774054. doi: 10.1093/BIB/BBAE548. URL https://dx.doi.org/10.1093/bib/bbae548. https://qiita.com/kaizen_nagoya/items/6ef20eaf796532fed6f8 (a minimal k-mer tokenization sketch follows this reference block)
918 22 1 Hughes JP, Rees S, Kalindjian SB. et al. Principles of early drug discovery. Br J Pharmacol 2011;162:1239–49. 10.1111/j.1476-5381.2010.01127.x.
919 22 2 Bommasani R, Hudson DA, Adeli E. et al. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258, 2021.
920 22 3 Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med 2019;25:44–56. 10.1038/s41591-018-0300-7.
921 22 4 Park YS, Lek S. Artificial Neural Networks: Multilayer Perceptron for Ecological Modeling. In: Jørgensen SE, (ed.), Developments in Environmental Modeling. Netherlands: Elsevier, 2016;28:123–40, 10.1016/B978-0-444-63623-2.00007-4.
922 22 5 Wang M, Tai CEW, Wei L. DeFine: deep convolutional neural networks accurately quantify intensities of transcription factor-DNA binding and facilitate evaluation of functional non-coding variants. Nucleic Acids Res 2018;46:e69–9. 10.1093/nar/gky215.
923 22 6 Shen J, Liu F, Tu Y. et al. Finding gene network topologies for given biological function with recurrent neural network. Nat Commun 2021;12:3125. 10.1038/s41467-021-23420-5.
924 22 7 Whalen S, Truty RM, Pollard KS. Enhancer–promoter interactions are encoded by complex genomic signatures on looping chromatin. Nat Genet 2016;48:488–96. 10.1038/ng.3539.
925 22 8 Forster DT, Li SC, Yashiroda Y. et al. BIONIC: biological network integration using convolutions. Nat Methods 2022;19:1250–61. 10.1038/s41592-022-01616-x.
926 22 9 Dong K, Zhang S. Deciphering spatial domains from spatially resolved transcriptomics with an adaptive graph attention auto-encoder. Nat Commun 2022;13:1739. 10.1038/s41467-022-29439-6.
927 22 10 Mahmud M, Kaiser MS, Hussain A. et al. Applications of deep learning and reinforcement learning to biological data. IEEE Trans Neural Netw Learn Syst 2018;29:2063–79. 10.1109/TNNLS.2018.2790388.
928 22 11 Wiggins WF, Tejani AS. On the opportunities and risks of foundation models for natural language processing in radiology. Radiol Artif Intell 2022;4:e220119. 10.1148/ryai.220119.
929 22 12 Baker B, Akkaya I, Zhokov P. et al. Video pretraining (vpt): learning to act by watching unlabeled online videos. Adv Neural Inf Process Syst 2022;35:24639–54.
930 22 13 Tack A, Piech C. The AI teacher test: measuring the pedagogical ability of blender and GPT-3 in educational dialogues. arXiv preprint arXiv:2205.07540, 2022.
931 22 14 Moor M, Banerjee O, Abad ZSH. et al. Foundation models for generalist medical artificial intelligence. Nature 2023;616:259–65. 10.1038/s41586-023-05881-4.
932 22 15 Rao R M, Liu J, Verkuil R. et al. MSA transformer. International Conference on Machine Learning PMLR 2021;139:8844–56.
933 22 16 Sapoval N, Aghazadeh A, Nute MG. et al. Current progress and open challenges for applying deep learning across the biosciences. Nat Commun 2022;13:1728. 10.1038/s41467-022-29268-7.
934 22 17 Theodoris CV, Xiao L, Chopra A. et al. Transfer learning enables predictions in network biology. Nature 2023;618:616–24. 10.1038/s41586-023-06139-9.
935 22 18 Zou J, Huss M, Abid A. et al. A primer on deep learning in genomics. Nat Genet 2019;51:12–8. 10.1038/s41588-018-0295-5.
936 22 19 Uhlmann V, Donati L, Sage D. A practical guide to supervised deep learning for bioimage analysis: challenges and good practices. IEEE Signal Process Mag 2022;39:73–86. 10.1109/MSP.2021.3123589.
937 22 20 Wasserman WW, Sandelin A. Applied bioinformatics for the identification of regulatory elements. Nat Rev Genet 2004;5:276–87. 10.1038/nrg1315.
938 22 21 Howard J, Ruder S. Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146, 2018.
939 22 22 Yuan L, Chen D, Chen YL. et al. Florence: a new foundation model for computer vision. arXiv preprint arXiv:2111.11432, 2021.
940 22 23 Devlin J, Chang MW, Lee K. et al. Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
941 22 24 Lee J, Yoon W, Kim S. et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 2020;36:1234–40. 10.1093/bioinformatics/btz682.
942 22 25 Gu Y, Tinn R, Cheng H. et al. Domain-specific language model pretraining for biomedical natural language processing. ACM Trans Comput Healthc 2021;3:1–23.
943 22 26 Ji Y, Zhou Z, Liu H. et al. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics 2021;37:2112–20. 10.1093/bioinformatics/btab083.
944 22 27 Brandes N, Ofer D, Peleg Y. et al. Proteinbert: a universal deep-learning model of protein sequence and function. Bioinformatics 2022;38:2102–10. 10.1093/bioinformatics/btac020.
945 22 28 Radford A, Wu J, Child R. et al. Language models are unsupervised multitask learners. OpenAI Blog 2019;1:9.
946 22 29 Wu Y, Wang S, Yang H. et al. An early evaluation of gpt-4v(ision). arXiv preprint arXiv:2310.16534, 2023.
947 22 30 Lin Z, Akin H, Rao R. et al. Language models of protein sequences at the scale of evolution enable accurate structure prediction. bioRxiv 500902, 2022.
948 22 31 Hayes T, Rao R, Akin H. et al. Simulating 500 million years of evolution with a language model. bioRxiv 600583, 2024.
949 22 32 Raffel C, Shazeer N, Roberts A. et al. Exploring the limits of transfer learning with a unified text-to-text transformer. J Mach Learn Res 2020;21:1–67.
950 22 33 Song K, Tan X, Qin T. et al. Mpnet: masked and permuted pre-training for language understanding. Adv Neural Inf Process Syst 2020;33:16857–67.
951 22 34 Avsec Ž, Agarwal V, Visentin D. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat Methods 2021;18:1196–203. 10.1038/s41592-021-01252-x.
952 22 35 Xu M, Yuan X, Miret S. et al. Protst: multi-modality learning of protein sequences and biomedical texts. arXiv preprint arXiv:2301.12040, 2023.
953 22 36 Ferruz N, Schmidt S, Höcker B. ProtGPT2 is a deep unsupervised language model for protein design. Nat Commun 2022;13:4348. 10.1038/s41467-022-32007-7.
954 22 37 Chen B, Cheng X, Geng Y. et al. xTrimoPGLM: unified 100B-scale pre-trained transformer for deciphering the language of protein. arXiv preprint arXiv:2401.06199, 2024.
955 22 38 Liu Y, Tian B. Protein-DNA binding sites prediction based on pre-trained protein language model and contrastive learning. arXiv preprint arXiv:2306.15912, 2023.
956 22 39 Azher ZL, Suvarna A, Chen JQ. et al. Assessment of emerging pretraining strategies in interpretable multimodal deep learning for cancer prognostication. BioData Min 2023;16:23. 10.1186/s13040-023-00338-w.
957 22 40 Liu Y, Tian B. Protein-DNA binding sites prediction based on pre-trained protein language model and contrastive learning. Briefings in Bioinformatics 2024;25.1:bbad488. 10.1093/bib/bbad488.
958 22 41 Nguyen E, Poli M, Faizi M. et al. Hyenadna: long-range genomic sequence modeling at single nucleotide resolution. Advances in Neural Information Processing Systems, 2024;36.
959 22 42 Cui H, Wang C, Maan H. et al. scGPT: towards building a foundation model for single-cell multi-omics using generative AI. Nature Methods 2024;1–11.
960 22 43 Madani A, Krause B, Greene ER. et al. Large language models generate functional protein sequences across diverse families. Nat Biotechnol 2023;41:1099–106. 10.1038/s41587-022-01618-2.
961 22 44 Senior AW, Evans R, Jumper J. et al. Improved protein structure prediction using potentials from deep learning. Nature 2020;577:706–10. 10.1038/s41586-019-1923-7.
962 22 45 Walsh B, Mohamed SK, Nováček V. Biokg: A knowledge graph for relational learning on biological data. In: d'Aquin PM, Dietze PS, (eds.), Proceedings of the 29th ACM International Conference on Information & Knowledge Management. ACM (Association for Computing Machinery), New York, NY, USA, 2020; 3173–3180.
963 22 46 Bernstein NJ, Fong NL, Lam I. et al. Solo: doublet identification in single-cell RNA-seq via semi-supervised deep learning. Cell Syst 2020;11:95–101.e5e5. 10.1016/j.cels.2020.05.010.
964 22 47 Brendel M, Su C, Bai Z. et al. Application of deep learning on single-cell RNA sequencing data analysis: a review. Genomics Proteomics Bioinformatics 2022;20:814–35. 10.1016/j.gpb.2022.11.011.
965 22 48 Arisdakessian C, Poirion O, Yunits B. et al. DeepImpute: an accurate, fast, and scalable deep neural network method to impute single-cell RNA-seq data. Genome Biol 2019;20:211. 10.1186/s13059-019-1837-6.
966 22 49 Tran HTN, Ang KS, Chevrier M. et al. A benchmark of batch-effect correction methods for single-cell RNA sequencing data. Genome Biol 2020;21:1–32. 10.1186/s13059-019-1850-9.
967 22 50 Clement L. Statistical methods for quantitative MS-based proteomics: part I. Preprocessing.
968 22 51 Mowoe MO, Garnett S, Lennard K. et al. Pro-MAP: a robust pipeline for the pre-processing of single channel protein microarray data. BMC Bioinformatics 2022;23:534. 10.1186/s12859-022-05095-x.
969 22 52 Hong L, Sun S, Zheng L, Tan Q X, and Li Y. fastmsa: Accelerating multiple sequence alignment with dense retrieval on protein language. bioRxiv 2021;2021–12.
970 22 53 Steinegger M, Meier M, Mirdita M. et al. HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics 2019;20:1–15. 10.1186/s12859-019-3019-7.
971 22 54 Stecher G, Tamura K, Kumar S. Molecular evolutionary genetics analysis (MEGA) for macOS. Mol Biol Evol 2020;37:1237–9. 10.1093/molbev/msz312.
972 22 55 Chen K, Zhao H, Yang Y. Capturing large genomic contexts for accurately predicting enhancer-promoter interactions. Brief Bioinform 2022; 23:bbab577. 10.1093/bib/bbab577.
973 22 56 Novakovsky G, Dexter N, Libbrecht MW. et al. Obtaining genetics insights from deep learning via explainable artificial intelligence. Nat Rev Genet 2023;24:125–37. 10.1038/s41576-022-00532-2.
974 22 57 Dalla-Torre H, Gonzalez L, Mendoza J. et al. The nucleotide transformer: building and evaluating robust foundation models for human genomics. bioRxiv 2023;2023–01.
975 22 58 Chen J, Hu Z, Sun S. et al. Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions. arXiv preprint arXiv:2204.00300, 2022.
976 22 59 Alipanahi B, Delong A, Weirauch MT. et al. Predicting the sequence specificities of DNA-and RNA-binding proteins by deep learning. Nat Biotechnol 2015;33:831–8. 10.1038/nbt.3300.
977 22 60 Liu P, Yuan W, Fu J. et al. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput Surv 2023;55:1–35. 10.1145/3560815.
978 22 61 Rentzsch P, Schubach M, Shendure J. et al. CADD-splice—improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med 2021;13:1–12. 10.1186/s13073-021-00835-9.
979 22 62 Mi H, Muruganujan A, Huang X. et al. Protocol update for large-scale genome and gene function analysis with the PANTHER classification system (v. 14.0). Nat Protoc 2019;14:703–21. 10.1038/s41596-019-0128-8.
980 22 63 Ernst J, Kheradpour P, Mikkelsen TS. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 2011;473:43–9. 10.1038/nature09906.
981 22 64 Tang Z, Li C, Kang B. et al. GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Res 2017;45:W98–102. 10.1093/nar/gkx247.
982 22 65 Saelens W, Cannoodt R, Todorov H. et al. A comparison of single-cell trajectory inference methods. Nat Biotechnol 2019;37:547–54. 10.1038/s41587-019-0071-9.
983 22 66 Kaddour J, Harris J, Mozes M. et al. Challenges and applications of large language models. arXiv preprint arXiv:2307.10169, 2023.
984 22 67 Liu W, Zhou P, Zhao Z. et al. K-bert: enabling language representation with knowledge graph. In: Proceedings of the AAAI Conference on Artificial Intelligence 2020;34:2901–8. 10.1609/aaai.v34i03.5681.
985 22 68 Brown TB, Mann B, Ryder N. et al. Language models are few-shot learners. Adv Neural Inf Process Syst 2020;33:1877–901.
986 22 69 Yasunaga M, Bosselut A, Ren H. et al. Deep bidirectional language-knowledge graph pretraining. Adv Neural Inf Process Syst 2022;35:37309–23.
987 22 70 Vrandečić D, Krötzsch M. Wikidata: a free collaborative knowledgebase. Communications of the ACM, 2014;57:78–85.
988 22 71 Zhu Y, Kiros R, Zemel R. et al. Aligning books and movies: towards story-like visual explanations by watching movies and reading books. arXiv preprint arXiv:1506.06724, 2015.
989 22 72 Speer R, Chin J, Havasi C. Conceptnet 5.5: an open multilingual graph of general knowledge. In Proceedings of the AAAI Conference on Artificial Intelligence 2017;31:4444–4451. 10.1609/aaai.v31i1.11164.
990 22 73 Jia G, Li Y, Zhong X. et al. The high-dimensional space of human diseases built from diagnosis records and mapped to genetic loci. Nat Comput Sci 2023;3:403–17. 10.1038/s43588-023-00453-y.
991 22 74 Jia G, Li Y, Zhang H. et al. Estimating heritability and genetic correlations from large health datasets in the absence of genetic data. Nat Commun 2019;10:5508. 10.1038/s41467-019-13455-0.
992 22 75 Singhal K, Azizi S, Tu T. et al. Large language models encode clinical knowledge. Nature 2023;620:172–80. 10.1038/s41586-023-06291-2.
993 22 76 Kanakarajan RK, Kundumani B, Sankarasubbu M. BioELECTRA: Pretrained biomedical text encoder using discriminators. In: Demner-Fushman D, Cohen KB, Ananiadou S, Tsujii J, (eds.), Proceedings of the 20th Workshop on Biomedical Language Processing. Association for Computational Linguistics, Online, 2021;143–154.
994 22 77 Babjac AN, Lu Z, Emrich SJ. CodonBERT: Using BERT for sentiment analysis to better predict genes with low expression. In: Wang MD, Byung-Jun Yoon P, (eds.), Proceedings of the 14th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. Association for Computing Machinery, New York, NY, United States, 2023; 1–6.
995 22 78 Yuan H, Yuan Z, Gan R. et al. BioBART: Pretraining and evaluation of a biomedical generative language model. arXiv preprint arXiv:2204.03905, 2022.
996 22 79 Rajpurkar P, Zhang J, Lopyrev K. et al. Squad: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250, 2016.
997 22 80 Fiorini N, Leaman R, Lipman DJ. et al. How user intelligence is improving PubMed. Nat Biotechnol 2018;36:937–45. 10.1038/nbt.4267.
998 22 81 Wu J, Fu R, Fang H. et al. Medical sam adapter: adapting segment anything model for medical image segmentation. arXiv preprint arXiv:2304.12620, 2023.
999 22 82 Pathak Y, Shukla PK, Tiwari A. et al. Deep transfer learning based classification model for COVID-19 disease. Ing Rech Biomed 2022;43:87–92. 10.1016/j.irbm.2020.05.003.
1000 22 83 Bolton E, Hall D, Yasunaga M. et al. Stanford CRFM introduces PubMedGPT 2.7B. 2022.
1001 22 84 Zhou Z, Ji Y, Li W. et al. DNABERT-2: efficient foundation model and benchmark for multi-species genome. arXiv preprint arXiv:2306.15006, 2023.
1002 22 85 Wang R, Wang Z, Wang J. et al. SpliceFinder: ab initio prediction of splice sites using convolutional neural network. BMC Bioinformatics 2019;20:1–13. 10.1186/s12859-019-3306-3.
1003 22 86 Repecka D, Jauniskis V, Karpus L. et al. Expanding functional protein sequence spaces using generative adversarial networks. Nat Mach Intell 2021;3:324–33. 10.1038/s42256-021-00310-5.
1004 22 87 Gresova K, Martinek V, Cechak D. et al. Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genomic Data, 2023;24:25.
1005 22 88 Wu R, Ding F, Wang R. et al. High-resolution de novo structure prediction from primary sequence. bioRxiv 2022; 2022-07.
1006 22 89 Lin Z, Akin H, Rao R. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 2023;379:1123–30. 10.1126/science.ade2574.
1007 22 90 Ruffolo JA, Chu LS, Mahajan SP. et al. Fast, accurate antibody structure prediction from deep learning on massive set of natural antibodies. Nat Commun 2023;14:2389. 10.1038/s41467-023-38063-x.
1008 22 91 Wang Y, Gong X, Li S, Yang B, Sun Y, Shi C, Wang Y, Yang C, Li H, and Song L. xTrimoABFold: de novo antibody structure prediction without MSA. arXiv preprint arXiv:2212.00735, 2022.
1009 22 92 Skinnider M, Johnston C, Gunabalasingam M. et al. Comprehensive prediction of secondary metabolite structure and biological activity from microbial genome sequences. Nat Commun 2020;11:6058. 10.1038/s41467-020-19986-1.
1010 22 93 Jumper J, Evans R, Pritzel A. et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021;596:583–9. 10.1038/s41586-021-03819-2.
1011 22 94 Rives A, Meier J, Sercu T. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci 2021;118:e2016239118. 10.1073/pnas.2016239118.
1012 22 95 Klausen MS, Jespersen MC, Nielsen H. et al. NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning. Proteins 2019;87:520–7. 10.1002/prot.25674.
1013 22 96 Elnaggar A, Heinzinger M, Dallago C. et al. Prottrans: toward understanding the language of life through self-supervised learning. IEEE Trans Pattern Anal Mach Intell 2021;44:7112–27. 10.1109/TPAMI.2021.3095381.
1014 22 97 Zhou G, Gao Z, Ding Q. et al. Uni-Mol: A universal 3d molecular representation learning framework. chemrxiv, 2023.
1015 22 98 Feynman R. The Character of Physical Law, with New Foreword. MIT Press, Cambridge, Massachusetts, USA, 2017, 10.7551/mitpress/11068.001.0001.
1016 22 99 Chowdhury R, Bouatta N, Biswas S. et al. Single-sequence protein structure prediction using a language model and deep learning. Nat Biotechnol 2022;40:1617–23. 10.1038/s41587-022-01432-w.
1017 22 100 Guo Y, Wu J, Ma H. et al. Self-supervised pre-training for protein embeddings using tertiary structures. Proceedings of the AAAI Conference on Artificial Intelligence 2022;36:6801–9. 10.1609/aaai.v36i6.20636.
1018 22 101 McDermott M, Yap B, Szolovits P. et al. Structure-inducing pre-training. Nat Mach Intell 2023;5:612–21. 10.1038/s42256-023-00647-z.
1019 22 102 Singh J, Hanson J, Paliwal K. et al. RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning. Nat Commun 2019;10:5407. 10.1038/s41467-019-13395-9.
1020 22 103 Fu L, Cao Y, Wu J. et al. UFold: fast and accurate RNA secondary structure prediction with deep learning. Nucleic Acids Res 2022;50:e14–4. 10.1093/nar/gkab1074.
1021 22 104 Zhu H, Hu J, Song XN. et al. DNAPred: accurate identification of DNA-binding sites from protein sequence by ensembled hyperplane-distance-based support vector machines. J Chem Inf Model 2019;59:3057–71. 10.1021/acs.jcim.8b00749.
1022 22 105 Zhang J, Chen Q, Liu B. NCBRPred: predicting nucleic acid binding residues in proteins based on multilabel learning. Brief Bioinform 2021;22:bbaa397. 10.1093/bib/bbaa397.
1023 22 106 Su H, Liu M, Sun S. et al. Improving the prediction of protein-nucleic acids binding residues via multiple sequence profiles and the consensus of complementary methods. Bioinformatics 2019;35:930–6. 10.1093/bioinformatics/bty756.
1024 22 107 Ashburner M, Ball C, Blake J. et al. Gene ontology: tool for the unification of biology. Nat Genet 2000;25:25–9. 10.1038/75556.
1025 22 108 Gligorijević V, Renfrew PD, Kosciolek T. et al. Structure-based protein function prediction using graph convolutional networks. Nat Commun 2021;12:3168. 10.1038/s41467-021-23303-9.
1026 22 109 Kulmanov M, Hoehndorf R. DeepGOZero: improving protein function prediction from sequence and zero-shot learning based on ontology axioms. Bioinformatics 2022;38:i238–45. 10.1093/bioinformatics/btac256.
1027 22 110 Yang F, Wang W, Wang F. et al. scBERT as a large-scale pre-trained deep language model for cell type annotation of single-cell RNA-seq data. Nat Mach Intell 2022;4:852–66. 10.1038/s42256-022-00534-z.
1028 22 111 Choromanski K, Likhosherstov V, Dohan D. et al. Rethinking attention with performers. arXiv preprint arXiv:2009.14794, 2020.
1029 22 112 Lu Y, Jiang X, Fang Y. et al. Learning to pre-train graph neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 2021;35:4276–84. 10.1609/aaai.v35i5.16552.
1030 22 113 Li C, Liu B, Kang B. et al. SciBet as a portable and fast single cell type identifier. Nat Commun 2020;11:1818. 10.1038/s41467-020-15523-2.
1031 22 114 Kiselev VY, Yiu A, Hemberg M. Scmap: projection of single-cell RNA-seq data across data sets. Nat Methods 2018;15:359–62. 10.1038/nmeth.4644.
1032 22 115 Yang X, Mann KK, Wu H. et al. scCross: a deep generative model for unifying single-cell multi-omics with seamless integration, cross-modal generation, and in silico exploration. Genome Biol 2024;25:198. 10.1186/s13059-024-03338-z.
1033 22 116 Hao M, Gong J, Zeng X. et al. Large-scale foundation model on single-cell transcriptomics. Nat Methods 2024;21:1481–1491.
1034 22 117 Saharia C, Chan W, Saxena S. et al. Photorealistic text-to-image diffusion models with deep language understanding. Adv Neural Inf Process Syst 2022;35:36479–94.
1035 22 118 Cao ZJ, Gao G. Multi-omics single-cell data integration and regulatory inference with graph-linked embedding. Nat Biotechnol 2022;40:1458–66. 10.1038/s41587-022-01284-4.
1036 22 119 Ciciani M, Demozzi M, Pedrazzoli E. et al. Automated identification of sequence-tailored Cas9 proteins using massive metagenomic data. Nat Commun 2022;13:6474. 10.1038/s41467-022-34213-9.
1037 22 120 Ruiz C, Zitnik M, Leskovec J. Identification of disease treatment mechanisms through the multiscale interactome. Nat Commun 2021;12:1796. 10.1038/s41467-021-21770-8.
1038 22 121 Eraslan G, Avsec Ž, Gagneur J. et al. Deep learning: new computational modeling techniques for genomics. Nat Rev Genet 2019;20:389–403. 10.1038/s41576-019-0122-6.
1039 22 122 Poli M, Massaroli S, Nguyen E. et al. Hyena hierarchy: towards larger convolutional language models. In: International Conference on Machine Learning. PMLR, 2023;28043–28078.
1040 22 123 Jeliazkov JR, del Alamo D, Karpiak JD. Esmfold hallucinates native-like protein sequences. bioRxiv 2023; 2023–05.
1041 22 124 Wang Z, Dai Z, Póczos B. et al. Characterizing and avoiding negative transfer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019; 11293–11302.
1042 22 125 Wang H, Kaddour J, Liu S. et al. Evaluating self-supervised learning for molecular graph embeddings. Advances in Neural Information Processing Systems, 2024;36.
1043 22 126 Zhou H, Zhang S, Peng J. et al. Informer: beyond efficient transformer for long sequence time-series forecasting. Proceedings of the AAAI Conference on Artificial Intelligence 2021;35:11106–15. 10.1609/aaai.v35i12.17325.
1044 22 127 Press O, Smith NA, Lewis M. Shortformer: better language modeling using shorter inputs. arXiv preprint arXiv:2012.15832, 2020.
1045 22 128 Li C, Zhang M, He Y. The stability-efficiency dilemma: investigating sequence length warmup for training GPT models. Adv Neural Inf Process Syst 2022;35:26736–50.
1046 22 129 Dao T, Fu D, Ermon S. et al. Flashattention: fast and memory-efficient exact attention with io-awareness. Adv Neural Inf Process Syst 2022;35:16344–59.
1047 22 130 Ainslie J, Lee-Thorp J, de Jong M. et al. GQA: training generalized multi-query transformer models from multi-head checkpoints. arXiv preprint arXiv:2305.13245, 2023.
1048 22 131 Hijma P, Heldens S, Sclocco A. et al. Optimization techniques for GPU programming. ACM Comput Surv 2023;55:1–81. 10.1145/3570638.
1049 22 132 Cui P, Athey S. Stable learning establishes some common ground between causal inference and machine learning. Nat Mach Intell 2022;4:110–5. 10.1038/s42256-022-00445-z.
1050 22 133 Jin Q, Yuan Z, Xiong G. et al. Biomedical question answering: a survey of approaches and challenges. ACM Comput Surv 2022;55:1–36. 10.1145/3490238.
1051 22 134 Danaee P, Rouches M, Wiley M. et al. bpRNA: large-scale automated annotation and analysis of RNA secondary structure. Nucleic Acids Res 2018;46:5381–94. 10.1093/nar/gky285.
1052 22 135 Moon I, LoPiccolo J, Baca SC. et al. Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary. Nat Med 2023;29:2057–67. 10.1038/s41591-023-02482-6.
1053 22 136 Wornow M, Xu Y, Thapa R. et al. The shaky foundations of large language models and foundation models for electronic health records. npj Digit Med 2023;6:135. 10.1038/s41746-023-00879-8.
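Among the genome models cited in this block, DNABERT (22 26) tokenizes DNA into overlapping k-mers before BERT-style pretraining, while DNABERT-2 (22 84) replaces k-mers with a learned BPE vocabulary. A minimal sketch of overlapping k-mer tokenization; the function name and example sequence are ours, not code from either paper:

```python
# Hedged sketch of overlapping k-mer tokenization (DNABERT-style).
def kmer_tokenize(sequence: str, k: int = 6) -> list[str]:
    """Slide a window of width k over the sequence, one base at a time."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

if __name__ == "__main__":
    print(kmer_tokenize("ATGGCGTACGTT", k=6))
    # ['ATGGCG', 'TGGCGT', 'GGCGTA', 'GCGTAC', 'CGTACG', 'GTACGT', 'TACGTT']
```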
1054 23 0 F. I. Marin, F. Teufel, M. Horlacher, D. Madsen, D. Pultz, O. Winther, and W. Boomsma. Bend: Benchmarking dna language models on biologically meaningful tasks. 12th International Conference on Learning Representations, ICLR 2024, 11 2023. URL https://arxiv.org/pdf/2311.12570. https://qiita.com/kaizen_nagoya/items/8417e72454d2107a9d06
1055 23 1 Babak Alipanahi, Andrew Delong, Matthew T Weirauch, and Brendan J Frey. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology, 33(8):831–838, August 2015. ISSN 1087-0156, 1546-1696. doi: 10.1038/nbt.3300. URL https://www.nature.com/articles/nbt.3300.
1056 23 2 Weizhi An, Yuzhi Guo, Yatao Bian, Hehuan Ma, Jinyu Yang, Chunyuan Li, and Junzhou Huang. MoDNA: motif-oriented pre-training for DNA language model. In Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, pp. 1–5, Northbrook Illinois, August 2022. ACM. ISBN 978-1-4503-9386-7. doi: 10.1145/3535508.3545512. URL https://dl.acm.org/doi/10.1145/3535508.3545512.
1057 23 3 Christof Angermueller, Heather J. Lee, Wolf Reik, and Oliver Stegle. DeepCpG: accurate prediction of single-cell DNA methylation states using deep learning. Genome Biology, 18(1):67, December 2017. ISSN 1474-760X. doi: 10.1186/s13059-017-1189-z. URL http://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1189-z.
1058 23 4 The 1000 Genomes Project Consortium (Adam Auton, Gonçalo R. Abecasis, David M. Altshuler, Richard M. Durbin, et al.). A global reference for human genetic variation. Nature, 526(7571):68–74, October 2015. ISSN 1476-4687. doi: 10.1038/nature15393. URL https://www.nature.com/articles/nature15393. Number: 7571 Publisher: Nature Publishing Group.
1060 23 6 Ziga Avsec, Vikram Agarwal, Daniel Visentin, Joseph R. Ledsam, Agnieszka Grabska-Barwinska, Kyle R. Taylor, Yannis Assael, John Jumper, Pushmeet Kohli, and David R. Kelley. Effective gene expression prediction from sequence by integrating long-range interactions. Nature Methods, 18(10):1196–1203, October 2021. ISSN 1548-7105. doi: 10.1038/s41592-021-01252-x. URL https://www.nature.com/articles/s41592-021-01252-x. Number: 10 Publisher: Nature Publishing Group.
1061 23 7 Andrew J Bannister and Tony Kouzarides. Regulation of chromatin by histone modifications. Cell Research, 21(3):381–395, March 2011. ISSN 1748-7838. doi: 10.1038/cr.2011.22. URL https://doi.org/10.1038/cr.2011.22.
1062 23 8 Gonzalo Benegas, Sanjit Singh Batra, and Yun S. Song. DNA language models are powerful zero-shot predictors of genome-wide variant effects. bioRxiv, pp. 2022.08.22.504706, January 2023. doi: 10.1101/2022.08.22.504706. URL http://biorxiv.org/content/early/2023/04/12/2022.08.22.504706.abstract.
1063 23 9 Tristan Bepler and Bonnie Berger. Learning the protein language: Evolution, structure, and function. Cell Systems, 12(6):654–669.e3, June 2021. ISSN 24054712. doi: 10.1016/j.cels.2021.05.017. URL https://linkinghub.elsevier.com/retrieve/pii/S2405471221002039.
1064 23 10 Kathleen M. Chen, Aaron K. Wong, Olga G. Troyanskaya, and Jian Zhou. A sequence-based global map of regulatory activity for deciphering human genetics. Nature Genetics, 54(7):940–949, July 2022. ISSN 1061-4036, 1546-1718. doi: 10.1038/s41588-022-01102-2. URL https://www.nature.com/articles/s41588-022-01102-2.
1065 23 11 Lei Cheng, Tong Yu, Tero Aittokallio, Jukka Corander, Ruslan Khalitov, and Zhirong Yang. Self-supervised learning for DNA sequences with circular dilated convolutional networks. preprint, Bioinformatics, February 2023. URL http://biorxiv.org/lookup/doi/10.1101/2023.01.30.526193.
1066 23 12 Hugo Dalla-Torre, Liam Gonzalez, Javier Mendoza Revilla, Nicolas Lopez Carranza, Adam Henryk Grzywaczewski, Francesco Oteri, Christian Dallago, Evan Trop, Hassan Sirelkhatim, Guillaume Richard, Marcin Skwark, Karim Beguir, Marie Lopez, and Thomas Pierrot. The Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics. bioRxiv, pp. 2023.01.11.523679, January 2023. doi: 10.1101/2023.01.11.523679. URL http://biorxiv.org/content/early/2023/01/15/2023.01.11.523679.abstract.
1067 23 13 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 2018. doi: 10.48550/ARXIV.1810.04805. URL https://arxiv.org/abs/1810.04805. Publisher: arXiv Version Number: 2.
1068 23 14 Ahmed Elnaggar, Michael Heinzinger, Christian Dallago, Ghalia Rehawi, Yu Wang, Llion Jones, Tom Gibbs, Tamas Feher, Christoph Angerer, Martin Steinegger, Debsindhu Bhowmik, and Burkhard Rost. ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):7112–7127, October 2022. ISSN 0162-8828, 2160-9292, 1939-3539. doi: 10.1109/TPAMI.2021.3095381. URL https://ieeexplore.ieee.org/document/9477085/.
1069 23 15 ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature, 489(7414):57–74, September 2012. ISSN 1476-4687. doi: 10.1038/nature11247.
1070 23 16 Veniamin Fishman, Yuri Kuratov, Maxim Petrov, Aleksei Shmelev, Denis Shepelin, Nikolay Chekanov, Olga Kardymon, and Mikhail Burtsev. GENA-LM: A Family of Open-Source Foundational Models for Long DNA Sequences, June 2023. URL https://www.biorxiv.org/content/10.1101/2023.06.12.544594v1. Pages: 2023.06.12.544594 Section: New Results.
1071 23 17 Adam Frankish, Mark Diekhans, Irwin Jungreis, Julien Lagarde, Jane E Loveland, Jonathan M Mudge, Cristina Sisu, James C Wright, Joel Armstrong, If Barnes, Andrew Berry, Alexandra Bignell, Carles Boix, Silvia Carbonell Sala, Fiona Cunningham, Tomás Di Domenico, Sarah Donaldson, Ian T Fiddes, Carlos García Girón, Jose Manuel Gonzalez, Tiago Grego, Matthew Hardy, Thibaut Hourlier, Kevin L Howe, Toby Hunt, Osagie G Izuogu, Rory Johnson, Fergal J Martin, Laura Martínez, Shamika Mohanan, Paul Muir, Fabio C P Navarro, Anne Parker, Baikang Pei, Fernando Pozo, Ferriol Calvet Riera, Magali Ruffier, Bianca M Schmitt, Eloise Stapleton, Marie-Marthe Suner, Irina Sycheva, Barbara Uszczynska-Ratajczak, Maxim Y Wolf, Jinuri Xu, Yucheng T Yang, Andrew Yates, Daniel Zerbino, Yan Zhang, Jyoti S Choudhary, Mark Gerstein, Roderic Guigó, Tim J P Hubbard, Manolis Kellis, Benedict Paten, Michael L Tress, and Paul Flicek. GENCODE 2021. Nucleic Acids Research, 49(D1):D916–D923, January 2021. ISSN 0305-1048. doi: 10.1093/nar/gkaa1087. URL https://doi.org/10.1093/nar/gkaa1087.
1072 23 18 Jonathan Frazer, Pascal Notin, Mafalda Dias, Aidan Gomez, Joseph K. Min, Kelly Brock, Yarin Gal, and Debora S. Marks. Disease variant prediction with deep generative models of evolutionary data. Nature, 599(7883):91–95, November 2021. ISSN 1476-4687. doi: 10.1038/s41586-021-04043-8. URL https://doi.org/10.1038/s41586-021-04043-8.
1073 23 19 Charles P. Fulco, Joseph Nasser, Thouis R. Jones, Glen Munson, Drew T. Bergman, Vidya Subramanian, Sharon R. Grossman, Rockwell Anyoha, Benjamin R. Doughty, Tejal A. Patwardhan, Tung H. Nguyen, Michael Kane, Elizabeth M. Perez, Neva C. Durand, Caleb A. Lareau, Elena K. Stamenova, Erez Lieberman Aiden, Eric S. Lander, and Jesse M. Engreitz. Activity-by-contact model of enhancer–promoter regulation from thousands of CRISPR perturbations. Nature Genetics, 51(12):1664–1669, December 2019. ISSN 1546-1718. doi: 10.1038/s41588-019-0538-0. URL https://www.nature.com/articles/s41588-019-0538-0. Number: 12 Publisher: Nature Publishing Group.
1074 23 20 Dennis Gankin, Alexander Karollus, Martin Grosshauser, Kristian Klemon, Johannes Hingerl, and Julien Gagneur. Species-aware DNA language modeling. bioRxiv, pp. 2023.01.26.525670, January 2023. doi: 10.1101/2023.01.26.525670. URL http://biorxiv.org/content/early/2023/01/27/2023.01.26.525670.abstract.
1075 23 21 Molly Gasperini, Andrew J. Hill, José L. McFaline-Figueroa, Beth Martin, Seungsoo Kim, Melissa D. Zhang, Dana Jackson, Anh Leith, Jacob Schreiber, William S. Noble, Cole Trapnell, Nadav Ahituv, and Jay Shendure. A Genome-wide Framework for Mapping Gene Regulation via Cellular Genetic Screens. Cell, 176(1):377–390.e19, January 2019. ISSN 0092-8674. doi: 10.1016/j.cell.2018.11.029. URL https://www.sciencedirect.com/science/article/pii/S009286741831554X.
1076 23 22 Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé, and Kate Crawford. Datasheets for Datasets. 2018. doi: 10.48550/ARXIV.1803.09010. URL https://arxiv.org/abs/1803.09010. Publisher: arXiv Version Number: 8.
1077 23 23 J. Gorodkin. Comparing two K-category assignments by a K-category correlation coefficient. Computational Biology and Chemistry, 28(5):367–374, December 2004. ISSN 1476-9271. doi: 10.1016/j.compbiolchem.2004.09.006. URL https://www.sciencedirect.com/science/article/pii/S1476927104000799.
1078 23 24 Katarína Grešová, Vlastimil Martinek, David Čechák, Petr Šimeček, and Panagiotis Alexiou. Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genomic Data, 24(1):25, May 2023. ISSN 2730-6844. doi: 10.1186/s12863-023-01123-8. URL https://doi.org/10.1186/s12863-023-01123-8.
1079 23 25 Benjamin C. Hitz, Jin-Wook Lee, Otto Jolanki, Meenakshi S. Kagda, Keenan Graham, Paul Sud, Idan Gabdank, J. Seth Strattan, Cricket A. Sloan, Timothy Dreszer, Laurence D. Rowe, Nikhil R. Podduturi, Venkat S. Malladi, Esther T. Chan, Jean M. Davidson, Marcus Ho, Stuart Miyasato, Matt Simison, Forrest Tanaka, Yunhai Luo, Ian Whaling, Eurie L. Hong, Brian T. Lee, Richard Sandstrom, Eric Rynes, Jemma Nelson, Andrew Nishida, Alyssa Ingersoll, Michael Buckley, Mark Frerker, Daniel S Kim, Nathan Boley, Diane Trout, Alex Dobin, Sorena Rahmanian, Dana Wyman, Gabriela Balderrama-Gutierrez, Fairlie Reese, Neva C. Durand, Olga Dudchenko, David Weisz, Suhas S. P. Rao, Alyssa Blackburn, Dimos Gkountaroulis, Mahdi Sadr, Moshe Olshansky, Yossi Eliaz, Dat Nguyen, Ivan Bochkov, Muhammad Saad Shamim, Ragini Mahajan, Erez Aiden, Tom Gingeras, Simon Heath, Martin Hirst, W. James Kent, Anshul Kundaje, Ali Mortazavi, Barbara Wold, and J. Michael Cherry. The ENCODE Uniform Analysis Pipelines. preprint, Bioinformatics, April 2023. URL http://biorxiv.org/lookup/doi/10.1101/2023.04.04.535623.
1080 23 26 Yanrong Ji, Zhihan Zhou, Han Liu, and Ramana V Davuluri. DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome. Bioinformatics, 37(15):2112–2120, August 2021. ISSN 1367-4803. doi: 10.1093/bioinformatics/btab083. URL https://doi.org/10.1093/bioinformatics/btab083.
1081 23 27 Meenakshi S. Kagda, Bonita Lam, Casey Litton, Corinn Small, Cricket A. Sloan, Emma Spragins, Forrest Tanaka, Ian Whaling, Idan Gabdank, Ingrid Youngworth, J. Seth Strattan, Jason Hilton, Jennifer Jou, Jessica Au, Jin-Wook Lee, Kalina Andreeva, Keenan Graham, Khine Lin, Matt Simison, Otto Jolanki, Paul Sud, Pedro Assis, Philip Adenekan, Eric Douglas, Mingjie Li, Pedro Assis, Keenan Graham, Paul Sud, Stuart Miyasato, Weiwei Zhong, Yunhai Luo, Zachary Myers, J. Michael Cherry, and Benjamin C. Hitz. Data navigation on the ENCODE portal. 2023. doi: 10.48550/ARXIV.2305.00006. URL https://arxiv.org/abs/2305.00006. Publisher: arXiv Version Number: 2.
1082 23 28 David R. Kelley, Jasper Snoek, and John L. Rinn. Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Research, 26(7):990–999, July 2016. ISSN 1549-5469. doi: 10.1101/gr.200535.115.
1083 23 29 David R. Kelley, Yakir A. Reshef, Maxwell Bileschi, David Belanger, Cory Y. McLean, and Jasper Snoek. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Research, 28(5):739–750, May 2018. ISSN 1088-9051, 1549-5469. doi: 10.1101/gr.227819.117. URL http://genome.cshlp.org/lookup/doi/10.1101/gr.227819.117.
1084 23 30 Andriy Kryshtafovych, Torsten Schwede, Maya Topf, Krzysztof Fidelis, and John Moult. Critical assessment of methods of protein structure prediction (CASP)—Round XIV. Proteins: Structure, Function, and Bioinformatics, 89(12):1607–1617, 2021. ISSN 1097-0134. doi: 10.1002/prot.26237. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/prot.26237. eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1002/prot.26237.
1086 23 32 Melissa J Landrum, Shanmuga Chitipiralla, Garth R Brown, Chao Chen, Baoshan Gu, Jennifer Hart, Douglas Hoffman, Wonhee Jang, Kuljeet Kaur, Chunlei Liu, Vitaly Lyoshin, Zenith Maddipatla, Rama Maiti, Joseph Mitchell, Nuala O’Leary, George R Riley, Wenyao Shi, George Zhou, Valerie Schneider, Donna Maglott, J Bradley Holmes, and Brandi L Kattman. ClinVar: improvements to accessing data. Nucleic Acids Research, 48(D1):D835–D844, January 2020. ISSN 0305-1048. doi: 10.1093/nar/gkz972. URL https://doi.org/10.1093/nar/gkz972.
1087 23 33 Richard Leslie, Christopher J. O’Donnell, and Andrew D. Johnson. GRASP: analysis of genotype–phenotype results from 1390 genome-wide association studies and corresponding open access database. Bioinformatics, 30(12):i185–i194, June 2014. ISSN 1367-4803. doi: 10.1093/bioinformatics/btu273. URL https://doi.org/10.1093/bioinformatics/btu273.
1088 23 34 Benjamin Levy, Zihao Xu, Liyang Zhao, Karl Kremling, Ross Altman, Phoebe Wong, and Chris Tanner. FloraBERT: cross-species transfer learning with attention-based neural networks for gene expression prediction. preprint, In Review, August 2022. URL https://www.researchsquare.com/article/rs-1927200/v1.
1089 23 35 Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Nikita Smetanin, Robert Verkuil, Ori Kabeli, Yaniv Shmueli, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Salvatore Candido, and Alexander Rives. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379(6637):1123–1130, March 2023. ISSN 0036-8075, 1095-9203. doi: 10.1126/science.ade2574. URL https://www.science.org/doi/10.1126/science.ade2574.
1090 23 36 Yunhai Luo, Benjamin C. Hitz, Idan Gabdank, Jason A. Hilton, Meenakshi S. Kagda, Bonita Lam, Zachary Myers, Paul Sud, Jennifer Jou, Khine Lin, Ulugbek K. Baymuradov, Keenan Graham, Casey Litton, Stuart R. Miyasato, J. Seth Strattan, Otto Jolanki, Jin-Wook Lee, Forrest Y. Tanaka, Philip Adenekan, Emma O’Neill, and J. Michael Cherry. New developments on the Encyclopedia of DNA Elements (ENCODE) data portal. Nucleic Acids Research, 48(D1):D882–D889, January 2020. ISSN 1362-4962. doi: 10.1093/nar/gkz1062.
1091 23 37 Ali Madani, Ben Krause, Eric R. Greene, Subu Subramanian, Benjamin P. Mohr, James M. Holton, Jose Luis Olmos, Caiming Xiong, Zachary Z. Sun, Richard Socher, James S. Fraser, and Nikhil Naik. Large language models generate functional protein sequences across diverse families. Nature Biotechnology, January 2023. ISSN 1087-0156, 1546-1696. doi: 10.1038/s41587-022-01618-2. URL https://www.nature.com/articles/s41587-022-01618-2.
1092 23 38 William McLaren, Laurent Gil, Sarah E. Hunt, Harpreet Singh Riat, Graham R. S. Ritchie, Anja Thormann, Paul Flicek, and Fiona Cunningham. The Ensembl Variant Effect Predictor. Genome Biology, 17(1):122, June 2016. ISSN 1474-760X. doi: 10.1186/s13059-016-0974-4. URL https://doi.org/10.1186/s13059-016-0974-4.
1093 23 39 Gil A. McVean, David M. Altshuler (Co-Chair), Richard M. Durbin (Co-Chair), Gonçalo R. Abecasis, David R. Bentley, Aravinda Chakravarti, Andrew G. Clark, Peter Donnelly, Evan E. Eichler, Paul Flicek, Stacey B. Gabriel, Richard A. Gibbs, Eric D. Green, Matthew E. Hurles, Bartha M. Knoppers, Jan O. Korbel, Eric S. Lander, Charles Lee, Hans Lehrach, Elaine R. Mardis, Gabor T. Marth, Gil A. McVean, Deborah A. Nickerson, Jeanette P. Schmidt, Stephen T. Sherry, Jun Wang, Richard K. Wilson, Richard A. Gibbs (Principal Investigator), Huyen Dinh, Christie Kovar, Sandra Lee, Lora Lewis, Donna Muzny, Jeff Reid, Min Wang, Jun Wang (Principal Investigator), Xiaodong Fang, Xiaosen Guo, Min Jian, Hui Jiang, Xin Jin, Guoqing Li, Jingxiang Li, Yingrui Li, Zhuo Li, Xiao Liu, Yao Lu, Xuedi Ma, Zhe Su, Shuaishuai Tai, Meifang Tang, Bo Wang, Guangbiao Wang, Honglong Wu, Renhua Wu, Ye Yin, Wenwei Zhang, Jiao Zhao, Meiru Zhao, Xiaole Zheng, Yan Zhou, Eric S. Lander (Principal Investigator), David M. Altshuler, Stacey B. Gabriel (Co-Chair), Namrata Gupta, Paul Flicek (Principal Investigator), Laura Clarke, Rasko Leinonen, Richard E. Smith, Xiangqun Zheng-Bradley, David R. Bentley (Principal Investigator), Russell Grocock, Sean Humphray, Terena James, Zoya Kingsbury, Hans Lehrach (Principal Investigator), Ralf Sudbrak (Project Leader), Marcus W. Albrecht, Vyacheslav S. Amstislavskiy, Tatiana A. Borodina, Matthias Lienhard, Florian Mertes, Marc Sultan, Bernd Timmermann, Marie-Laure Yaspo, Stephen T. Sherry (Principal Investigator), Gil A. McVean (Principal Investigator), Elaine R. Mardis (Co-Principal Investigator) (Co-Chair), Richard K. Wilson (Co-Principal Investigator), Lucinda Fulton, Robert Fulton, George M. Weinstock, Richard M. Durbin (Principal Investigator), Senduran Balasubramaniam, John Burton, Petr Danecek, Thomas M. Keane, Anja Kolb-Kokocinski, Shane McCarthy, James Stalker, Michael Quail, Jeanette P. Schmidt (Principal Investigator), Christopher J. Davies, Jeremy Gollub, Teresa Webster, Brant Wong, Yiping Zhan, Adam Auton (Principal Investigator), Richard A. Gibbs (Principal Investigator), Fuli Yu (Project Leader), Matthew Bainbridge, Danny Challis, Uday S. Evani, James Lu, Donna Muzny, Uma Nagaswamy, Jeff Reid, Aniko Sabo, Yi Wang, Jin Yu, Jun Wang (Principal Investigator), Lachlan J. M. Coin, Lin Fang, Xiaosen Guo, Xin Jin, Guoqing Li, Qibin Li, Yingrui Li, Zhenyu Li, Haoxiang Lin, Binghang Liu, Ruibang Luo, Nan Qin, Haojing Shao, Bingqiang Wang, Yinlong Xie, Chen Ye, Chang Yu, Fan Zhang, Hancheng Zheng, Hongmei Zhu, Gabor T. Marth (Principal Investigator), Erik P. Garrison, Deniz Kural, Wan-Ping Lee, Wen Fung Leong, Alistair N. Ward, Jiantao Wu, Mengyao Zhang, Charles Lee (Principal Investigator), Lauren Griffin, Chih-Heng Hsieh, Ryan E. Mills, Xinghua Shi, Marcin von Grotthuss, Chengsheng Zhang, Mark J. Daly (Principal Investigator), Mark A. DePristo (Project Leader), David M. Altshuler, Eric Banks, Gaurav Bhatia, Mauricio O. Carneiro, Guillermo del Angel, Stacey B. Gabriel, Giulio Genovese, Namrata Gupta, Robert E. Handsaker, Chris Hartl, Eric S. Lander, Steven A. McCarroll, James C. Nemesh, Ryan E. Poplin, Stephen F. Schaffner, Khalid Shakir, Seungtai C. Yoon (Principal Investigator), Jayon Lihm, Vladimir Makarov, Hanjun Jin (Principal Investigator), Wook Kim, Ki Cheol Kim, Jan O. Korbel (Principal Investigator), Tobias Rausch, Paul Flicek (Principal Investigator), Kathryn Beal, Laura Clarke, Fiona Cunningham, Javier Herrero, William M. McLaren, Graham R. S. Ritchie, Richard E. Smith, Xiangqun Zheng-Bradley, Andrew G. Clark (Principal Investigator), Srikanth Gottipati, Alon Keinan, Juan L. Rodriguez-Flores, Pardis C. Sabeti (Principal Investigator), Sharon R. Grossman, Shervin Tabrizi, Ridhi Tariyal, David N. Cooper (Principal Investigator), Edward V. Ball, Peter D. Stenson, David R. Bentley (Principal Investigator), Bret Barnes, Markus Bauer, R. Keira Cheetham, Tony Cox, Michael Eberle, Sean Humphray, Scott Kahn, Lisa Murray, John Peden, Richard Shaw, Kai Ye (Principal Investigator), Mark A. Batzer (Principal Investigator), Miriam K. Konkel, Jerilyn A. Walker, Daniel G. MacArthur (Principal Investigator), Monkol Lek, Ralf Sudbrak (Project Leader), Vyacheslav S. Amstislavskiy, Ralf Herwig, Mark D. Shriver (Principal Investigator), Carlos D. Bustamante (Principal Investigator), Jake K. Byrnes, Francisco M. De La Vega, Simon Gravel, Eimear E. Kenny, Jeffrey M. Kidd, Phil Lacroute, Brian K. Maples, Andres Moreno-Estrada, Fouad Zakharia, Eran Halperin (Principal Investigator), Yael Baran, David W. Craig (Principal Investigator), Alexis Christoforides, Nils Homer, Tyler Izatt, Ahmet A. Kurdoglu, Shripad A. Sinari, Kevin Squire, Stephen T. Sherry (Principal Investigator), Chunlin Xiao, Jonathan Sebat (Principal Investigator), Vineet Bafna, Kenny Ye, Esteban G. Burchard (Principal Investigator), Ryan D. Hernandez (Principal Investigator), Christopher R. Gignoux, David Haussler (Principal Investigator), Sol J. Katzman, W. James Kent, Bryan Howie, Andres Ruiz-Linares (Principal Investigator), The 1000 Genomes Project Consortium, Corresponding Author, Steering committee, Production group: Baylor College of Medicine, BGI-Shenzhen, Broad Institute of MIT and Harvard, European Bioinformatics Institute, Illumina, Max Planck Institute for Molecular Genetics, US National Institutes of Health, University of Oxford, Washington University in St Louis, Wellcome Trust Sanger Institute, Analysis group: Affymetrix, Albert Einstein College of Medicine, Boston College, Brigham and Women’s Hospital, Cold Spring Harbor Laboratory, Dankook University, European Molecular Biology Laboratory, Cornell University, Harvard University, Human Gene Mutation Database, Leiden University Medical Center, Louisiana State University, Massachusetts General Hospital, Pennsylvania State University, Stanford University, Tel-Aviv University, Translational Genomics Research Institute, University of California San Diego, University of California San Francisco, University of California Santa Cruz, University of Chicago, University College London, and University of Geneva. An integrated map of genetic variation from 1,092 human genomes. Nature, 491(7422):56–65, November 2012. ISSN 1476-4687. doi: 10.1038/nature11632. URL https://doi.org/10.1038/nature11632.
1095 23 41 Stephen Merity, Nitish Shirish Keskar, and Richard Socher. Regularizing and Optimizing LSTM Language Models. 2017. doi: 10.48550/ARXIV.1708.02182. URL https://arxiv.org/abs/1708.02182. Publisher: arXiv Version Number: 1.
1097 23 43 Eric Nguyen, Michael Poli, Marjan Faizi, Armin Thomas, Callum Birch-Sykes, Michael Wornow, Aman Patel, Clayton Rabideau, Stefano Massaroli, Yoshua Bengio, Stefano Ermon, Stephen A. Baccus, and Chris Ré. HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution. 2023. doi: 10.48550/ARXIV.2306.15794. URL https://arxiv.org/abs/2306.15794. Publisher: arXiv Version Number: 1.
1098 23 44 Michael Poli, Stefano Massaroli, Eric Nguyen, Daniel Y. Fu, Tri Dao, Stephen Baccus, Yoshua Bengio, Stefano Ermon, and Christopher Ré. Hyena Hierarchy: Towards Larger Convolutional Language Models, April 2023. URL http://arxiv.org/abs/2302.10866. arXiv:2302.10866 [cs].
1099 23 45 Ofir Press, Noah A. Smith, and Mike Lewis. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation, April 2022. URL http://arxiv.org/abs/2108.12409. arXiv:2108.12409 [cs].
1100 23 46 Roshan Rao, Nicholas Bhattacharya, Neil Thomas, Yan Duan, Xi Chen, John Canny, Pieter Abbeel, and Yun S. Song. Evaluating Protein Transfer Learning with TAPE. Advances in neural information processing systems, 32:9689–9701, December 2019. ISSN 1049-5258. URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7774645/.
1101 23 47 Roshan Rao, Joshua Meier, Tom Sercu, Sergey Ovchinnikov, and Alexander Rives. Transformer protein language models are unsupervised structure learners. preprint, Synthetic Biology, December 2020. URL http://biorxiv.org/lookup/doi/10.1101/2020.12.15.422761.
1102 23 48 Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, and Rob Fergus. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. preprint, Synthetic Biology, April 2019. URL http://biorxiv.org/lookup/doi/10.1101/622803.
1103 23 49 Marco Salvatore, Marc Horlacher, Annalisa Marsico, Ole Winther, and Robin Andersson. Transfer learning identifies sequence determinants of cell-type specific regulatory element accessibility. NAR Genomics and Bioinformatics, 5(2):lqad026, March 2023. ISSN 2631-9268. doi: 10.1093/nargab/lqad026. URL https://academic.oup.com/nargab/article/doi/10.1093/nargab/lqad026/7092956.
1104 23 50 Melissa Sanabria, Jonas Hirsch, and Anna R. Poetsch. The human genome’s vocabulary as proposed by the DNA language model GROVER, September 2023. URL https://www.biorxiv.org/content/10.1101/2023.07.19.549677v2. Pages: 2023.07.19.549677 Section: New Results.
1105 23 51 Nicolas Scalzitti, Anne Jeannin-Girardon, Pierre Collet, Olivier Poch, and Julie D. Thompson. A benchmark study of ab initio gene prediction methods in diverse eukaryotic organisms. BMC Genomics, 21(1):293, December 2020. ISSN 1471-2164. doi: 10.1186/s12864-020-6707-9. URL https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-020-6707-9.
1106 23 52 Valerie A. Schneider, Tina Graves-Lindsay, Kerstin Howe, Nathan Bouk, Hsiu-Chuan Chen, Paul A. Kitts, Terence D. Murphy, Kim D. Pruitt, Françoise Thibaud-Nissen, Derek Albracht, Robert S. Fulton, Milinn Kremitzki, Vincent Magrini, Chris Markovic, Sean McGrath, Karyn Meltz Steinberg, Kate Auger, William Chow, Joanna Collins, Glenn Harden, Timothy Hubbard, Sarah Pelan, Jared T. Simpson, Glen Threadgold, James Torrance, Jonathan M. Wood, Laura Clarke, Sergey Koren, Matthew Boitano, Paul Peluso, Heng Li, Chen-Shan Chin, Adam M. Phillippy, Richard Durbin, Richard K. Wilson, Paul Flicek, Evan E. Eichler, and Deanna M. Church. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Research, 27(5):849–864, May 2017. ISSN 1549-5469. doi: 10.1101/gr.213611.116.
1107 23 53 Ritambhara Singh, Jack Lanchantin, Gabriel Robins, and Yanjun Qi. DeepChrome: deep-learning for predicting gene expression from histone modifications. Bioinformatics, 32(17):i639–i648, September 2016. ISSN 1367-4803, 1367-4811. doi: 10.1093/bioinformatics/btw427. URL https://academic.oup.com/bioinformatics/article/32/17/i639/2450757.
1108 23 54 Mario Stanke and Stephan Waack. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics (Oxford, England), 19 Suppl 2:ii215–225, October 2003. ISSN 1367-4811. doi: 10.1093/bioinformatics/btg1080.
1109 23 55 Felix Teufel, Magnús Halldór Gíslason, José Juan Almagro Armenteros, Alexander Rosenberg Johansen, Ole Winther, and Henrik Nielsen. GraphPart: homology partitioning for biological sequence analysis. NAR Genomics and Bioinformatics, 5(4):lqad088, October 2023. ISSN 2631-9268. doi: 10.1093/nargab/lqad088. URL https://academic.oup.com/nargab/article/doi/10.1093/nargab/lqad088/7318077.
1110 23 56 Jesse Vig, Ali Madani, Lav R. Varshney, Caiming Xiong, Richard Socher, and Nazneen Fatema Rajani. BERTology Meets Biology: Interpreting Attention in Protein Language Models, March 2021. URL http://arxiv.org/abs/2006.15222. arXiv:2006.15222 [cs, q-bio].
1111 23 57 Minghao Xu, Zuobai Zhang, Jiarui Lu, Zhaocheng Zhu, Yangtian Zhang, Chang Ma, Runcheng Liu, and Jian Tang. PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding. Advances in Neural Information Processing Systems, 35:35156–35173, December 2022. URL https://proceedings.neurips.cc/paper_files/paper/2022/hash/e467582d42d9c13fa9603df16f31de6d-Abstract-Datasets_and_Benchmarks.html.
1112 23 58 Meng Yang, Haiping Huang, Lichao Huang, Nan Zhang, Jihong Wu, Huanming Yang, and Feng Mu. LOGO, a contextualized pre-trained language model of human genome flexibly adapts to various downstream tasks by fine-tuning. preprint, In Review, August 2021. URL https://www.researchsquare.com/article/rs-448927/v1.
1113 23 59 Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, and Amr Ahmed. Big Bird: Transformers for Longer Sequences. In Advances in Neural Information Processing Systems, volume 33, pp. 17283–17297. Curran Associates, Inc., 2020. URL https://proceedings.neurips.cc/paper/2020/hash/c8512d142a2d849725f31a9a7a361ab9-Abstract.html.
1114 23 60 Jian Zhou and Olga G Troyanskaya. Predicting effects of noncoding variants with deep learning–based sequence model. Nature Methods, 12(10):931–934, October 2015. ISSN 1548-7091, 1548-7105. doi: 10.1038/nmeth.3547. URL https://www.nature.com/articles/nmeth.3547.
1115 23 61 Naihui Zhou, Yuxiang Jiang, Timothy R. Bergquist, Alexandra J. Lee, Balint Z. Kacsoh, Alex W. Crocker, Kimberley A. Lewis, George Georghiou, Huy N. Nguyen, Md Nafiz Hamid, Larry Davis, Tunca Dogan, Volkan Atalay, Ahmet S. Rifaioglu, Alperen Dalkıran, Rengul Cetin Atalay, Chengxin Zhang, Rebecca L. Hurto, Peter L. Freddolino, Yang Zhang, Prajwal Bhat, Fran Supek, José M. Fernández, Branislava Gemovic, Vladimir R. Perovic, Radoslav S. Davidović, Neven Sumonja, Nevena Veljkovic, Ehsaneddin Asgari, Mohammad R.K. Mofrad, Giuseppe Profiti, Castrense Savojardo, Pier Luigi Martelli, Rita Casadio, Florian Boecker, Heiko Schoof, Indika Kahanda, Natalie Thurlby, Alice C. McHardy, Alexandre Renaux, Rabie Saidi, Julian Gough, Alex A. Freitas, Magdalena Antczak, Fabio Fabris, Mark N. Wass, Jie Hou, Jianlin Cheng, Zheng Wang, Alfonso E. Romero, Alberto Paccanaro, Haixuan Yang, Tatyana Goldberg, Chenguang Zhao, Liisa Holm, Petri Törönen, Alan J. Medlar, Elaine Zosa, Itamar Borukhov, Ilya Novikov, Angela Wilkins, Olivier Lichtarge, Po-Han Chi, Wei-Cheng Tseng, Michal Linial, Peter W. Rose, Christophe Dessimoz, Vedrana Vidulin, Saso Dzeroski, Ian Sillitoe, Sayoni Das, Jonathan Gill Lees, David T. Jones, Cen Wan, Domenico Cozzetto, Rui Fa, Mateo Torres, Alex Warwick Vesztrocy, Jose Manuel Rodriguez, Michael L. Tress, Marco Frasca, Marco Notaro, Giuliano Grossi, Alessandro Petrini, Matteo Re, Giorgio Valentini, Marco Mesiti, Daniel B. Roche, Jonas Reeb, David W. Ritchie, Sabeur Aridhi, Seyed Ziaeddin Alborzi, Marie-Dominique Devignes, Da Chen Emily Koo, Richard Bonneau, Vladimir Gligorijević, Meet Barot, Hai Fang, Stefano Toppo, Enrico Lavezzo, Marco Falda, Michele Berselli, Silvio C.E. Tosatto, Marco Carraro, Damiano Piovesan, Hafeez Ur Rehman, Qizhong Mao, Shanshan Zhang, Slobodan Vucetic, Gage S. Black, Dane Jo, Erica Suh, Jonathan B. Dayton, Dallas J. Larsen, Ashton R. Omdahl, Liam J. McGuffin, Danielle A. Brackenridge, Patricia C. Babbitt, Jeffrey M. Yunes, Paolo Fontana, Feng Zhang, Shanfeng Zhu, Ronghui You, Zihan Zhang, Suyang Dai, Shuwei Yao, Weidong Tian, Renzhi Cao, Caleb Chandler, Miguel Amezola, Devon Johnson, Jia-Ming Chang, Wen-Hung Liao, Yi-Wei Liu, Stefano Pascarelli, Yotam Frank, Robert Hoehndorf, Maxat Kulmanov, Imane Boudellioua, Gianfranco Politano, Stefano Di Carlo, Alfredo Benso, Kai Hakala, Filip Ginter, Farrokh Mehryary, Suwisa Kaewphan, Jari Björne, Hans Moen, Martti E.E. Tolvanen, Tapio Salakoski, Daisuke Kihara, Aashish Jain, Tomislav Smuc, Adrian Altenhoff, Asa Ben-Hur, Burkhard Rost, Steven E. Brenner, Christine A. Orengo, Constance J. Jeffery, Giovanni Bosco, Deborah A. Hogan, Maria J. Martin, Claire O’Donovan, Sean D. Mooney, Casey S. Greene, Predrag Radivojac, and Iddo Friedberg. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biology, 20(1):244, November 2019. ISSN 1474-760X. doi: 10.1186/s13059-019-1835-8. URL https://doi.org/10.1186/s13059-019-1835-8.
1116 23 62 Zhihan Zhou, Yanrong Ji, Weijian Li, Pratik Dutta, Ramana Davuluri, and Han Liu. DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome, June 2023. URL http://arxiv.org/abs/2306.15006. arXiv:2306.15006 [cs, q-bio].
1117 24 0 E. Nguyen, M. Poli, M. Faizi, A. W. Thomas, C. B. Sykes, M. Wornow, A. Patel, C. Rabideau, S. Massaroli, Y. Bengio, S. Ermon, S. A. Baccus, and C. Ré. Hyenadna: Long-range genomic sequence modeling at single nucleotide resolution. ArXiv, 6 2023. ISSN 2331-8422. URL https://arxiv.org/pdf/2306.15794. https://qiita.com/kaizen_nagoya/items/07e1ba1138b0825c8a73
1118 24 1 Ž. Avsec, V. Agarwal, D. Visentin, J. R. Ledsam, A. Grabska-Barwinska, K. R. Taylor, Y. Assael, J. Jumper, P. Kohli, and D. R. Kelley. Effective gene expression prediction from sequence by integrating long-range interactions. Nature methods, 18(10):1196–1203, 2021.
1119 24 2 D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.
1120 24 3 G. Benegas, S. S. Batra, and Y. S. Song. DNA language models are powerful zero-shot predictors of noncoding variant effects. bioRxiv, pages 2022–08, 2022.
1121 24 4 R. Bommasani, D. A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskill, et al. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258, 2021.
1122 24 5 N. Brandes, D. Ofer, Y. Peleg, N. Rappoport, and M. Linial. ProteinBERT: a universal deep-learning model of protein sequence and function. Bioinformatics, 38(8):2102–2110, 2022.
1123 24 6 T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020.
1124 24 7 T. Chen and C. Guestrin. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, pages 785–794, 2016.
1125 24 8 D. M. Church, V. A. Schneider, T. Graves, K. Auger, F. Cunningham, N. Bouk, H.-C. Chen, R. Agarwala, W. M. McLaren, G. R. Ritchie, et al. Modernizing reference genome assemblies. PLoS biology, 9(7): e1001091, 2011.
1126 24 9 F. Cunningham, J. E. Allen, J. Allen, J. Alvarez-Jarreta, M. R. Amode, I. M. Armean, O. Austine-Orimoloye, A. G. Azov, I. Barnes, R. Bennett, et al. Ensembl 2022. Nucleic acids research, 50(D1):D988–D995, 2022.
1127 24 10 H. Dalla-Torre, L. Gonzalez, J. Mendoza-Revilla, N. L. Carranza, A. H. Grzywaczewski, F. Oteri, C. Dallago, E. Trop, H. Sirelkhatim, G. Richard, M. Skwark, K. Beguir, M. Lopez, and T. Pierrot. The Nucleotide Transformer: Building and evaluating robust foundation models for human genomics. bioRxiv, 2023.
1128 24 11 T. Dao, D. Y. Fu, S. Ermon, A. Rudra, and C. Ré. FlashAttention: Fast and memory-efficient exact attention with IO-awareness. In Advances in Neural Information Processing Systems, 2022a.
1129 24 12 T. Dao, D. Y. Fu, K. K. Saab, A. W. Thomas, A. Rudra, and C. Ré. Hungry hungry hippos: Towards language modeling with state space models. arXiv preprint arXiv:2212.14052, 2022b.
1130 24 13 J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
1131 24 14 A. Elnaggar, M. Heinzinger, C. Dallago, G. Rehawi, Y. Wang, L. Jones, T. Gibbs, T. Feher, C. Angerer, M. Steinegger, et al. Prottrans: Toward understanding the language of life through self-supervised learning. IEEE transactions on pattern analysis and machine intelligence, 44(10):7112–7127, 2021.
1132 24 15 ENCODE Project Consortium. An integrated encyclopedia of dna elements in the human genome. Nature, 489(7414):57, 2012.
1133 24 16 ENCODE Project Consortium. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature, 583:699–710, 2020.
1134 24 17 N. Ferruz, S. Schmidt, and B. Höcker. ProtGPT2 is a deep unsupervised language model for protein design. Nature communications, 13(1):4348, 2022.
1135 24 18 Q. Fournier, G. M. Caron, and D. Aloise. A practical survey on faster and lighter transformers. ACM Computing Surveys, 2021.
1136 24 19 D. Y. Fu, E. L. Epstein, E. Nguyen, A. W. Thomas, M. Zhang, T. Dao, A. Rudra, and C. Ré. Simple hardware-efficient long convolutions for sequence modeling. arXiv preprint arXiv:2302.06646, 2023.
1137 24 20 D. Gankin, A. Karollus, M. Grosshauser, K. Klemon, J. Hingerl, and J. Gagneur. Species-aware DNA language modeling. bioRxiv, pages 2023–01, 2023.
1138 24 21 Q. Geng, R. Yang, and L. Zhang. A deep learning framework for enhancer prediction using word embedding and sequence generation. Biophysical Chemistry, 286:106822, 2022.
1139 24 22 Genome Reference Consortium. Genome reference consortium human build 38 (grch38). National Center for Biotechnology Information, 2013. URL https://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.26/.
1140 24 23 K. Gresova, V. Martinek, D. Cechak, P. Simecek, and P. Alexiou. Genomic Benchmarks: A collection of datasets for genomic sequence classification. bioRxiv, 2022.
1141 24 24 A. Gu, K. Goel, and C. Ré. Efficiently modeling long sequences with structured state spaces. arXiv preprint arXiv:2111.00396, 2021.
1142 24 25 Y. Ji, Z. Zhou, H. Liu, and R. V. Davuluri. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics, 37(15):2112–2120, 2021.
1143 24 26 W. J. Kent, C. W. Sugnet, T. S. Furey, K. M. Roskin, T. H. Pringle, A. M. Zahler, and D. Haussler. The human genome browser at ucsc. Genome research, 12(6):996–1006, 2002.
1144 24 27 B. Lester, R. Al-Rfou, and N. Constant. The power of scale for parameter-efficient prompt tuning. arXiv preprint arXiv:2104.08691, 2021.
1145 24 28 C. Li, M. Zhang, and Y. He. The stability-efficiency dilemma: Investigating sequence length warmup for training GPT models. In Advances in Neural Information Processing Systems, 2022.
1146 24 29 Z. Lin, H. Akin, R. Rao, B. Hie, Z. Zhu, W. Lu, A. dos Santos Costa, M. Fazel-Zarandi, T. Sercu, S. Candido, et al. Language models of protein sequences at the scale of evolution enable accurate structure prediction. BioRxiv, 2022.
1147 24 30 P. Liu, W. Yuan, J. Fu, Z. Jiang, H. Hayashi, and G. Neubig. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 55(9):1–35, 2023.
1148 24 31 A. Madani, B. Krause, E. R. Greene, S. Subramanian, B. P. Mohr, J. M. Holton, J. L. Olmos Jr, C. Xiong, Z. Z. Sun, R. Socher, et al. Large language models generate functional protein sequences across diverse families. Nature Biotechnology, pages 1–8, 2023.
1149 24 32 J. Meier, R. Rao, R. Verkuil, J. Liu, T. Sercu, and A. Rives. Language models enable zero-shot prediction of the effects of mutations on protein function. Advances in Neural Information Processing Systems, 34: 29287–29303, 2021.
1150 24 33 J. Nasser, D. T. Bergman, C. P. Fulco, P. Guckelberger, B. R. Doughty, T. A. Patwardhan, T. R. Jones, T. H. Nguyen, J. C. Ulirsch, F. Lekschas, K. Mualim, H. M. Natri, E. M. Weeks, G. Munson, M. Kane, H. Y. Kang, A. Cui, J. P. Ray, T. M. Eisenhaure, R. L. Collins, K. Dey, H. Pfister, A. L. Price, C. B. Epstein, A. Kundaje, R. J. Xavier, M. J. Daly, H. Huang, H. K. Finucane, N. Hacohen, E. S. Lander, and J. M. Engreitz. Genome-wide enhancer maps link risk variants to disease genes. Nature, 593:238–243, 2021.
1151 24 34 M. Oubounyt, Z. Louadi, H. Tayara, and K. T. Chong. DeePromoter: Robust promoter predictor using deep learning. Frontiers in Genetics, 10, 2019.
1152 24 35 T. H. Pham, D. H. Tran, T. B. H. Ho, K. Satou, and G. Valiente. Qualitatively predicting acetylation and methylation areas in DNA sequences. Genome Informatics, 16(2):3–11, 2005.
1153 24 36 D. K. Pokholok, C. T. Harbison, S. Levine, F. Lewitter, D. K. Gifford, and R. A. Young. Genome-wide map of nucleosome acetylation and methylation in yeast. Cell, 122(4):517–527, 2005.
1154 24 37 M. Poli, S. Massaroli, E. Nguyen, D. Y. Fu, T. Dao, S. Baccus, Y. Bengio, S. Ermon, and C. Ré. Hyena Hierarchy: Towards larger convolutional language models. arXiv preprint arXiv:2302.10866, 2023.
1155 24 38 O. Press, N. A. Smith, and M. Lewis. Shortformer: Better language modeling using shorter inputs. arXiv preprint arXiv:2012.15832, 2020.
1156 24 39 A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, et al. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021.
1157 24 40 A. Ramesh, M. Pavlov, G. Goh, S. Gray, C. Voss, A. Radford, M. Chen, and I. Sutskever. Zero-shot text-to-image generation. In International Conference on Machine Learning, pages 8821–8831. PMLR, 2021.
1158 24 41 R. Rao, J. Meier, T. Sercu, S. Ovchinnikov, and A. Rives. Transformer protein language models are unsupervised structure learners. Biorxiv, pages 2020–12, 2020.
1159 24 42 Roadmap Epigenomics Consortium. Integrative analysis of 111 reference human epigenomes. Nature, 518(7539):317–330, 2015.
1160 24 43 J. Rogers and R. A. Gibbs. Comparative primate genomics: emerging patterns of genome content and dynamics. Nature Reviews Genetics, 15(5):347–359, 2014.
1161 24 44 D. W. Romero, R.-J. Bruintjes, J. M. Tomczak, E. J. Bekkers, M. Hoogendoorn, and J. C. van Gemert. Flexconv: Continuous kernel convolutions with differentiable kernel sizes. arXiv preprint arXiv:2110.08059, 2021a.
1162 24 45 D. W. Romero, A. Kuzina, E. J. Bekkers, J. M. Tomczak, and M. Hoogendoorn. Ckconv: Continuous kernel convolution for sequential data. arXiv preprint arXiv:2102.02611, 2021b.
1163 24 46 N. Scalzitti, A. Kress, R. Orhand, T. Weber, L. Moulinier, A. Jeannin-Girardon, P. Collet, O. Poch, and J. D. Thompson. Spliceator: multi-species splice site prediction using convolutional neural networks. BMC Bioinformatics, 22(1):1–26, 2021.
1164 24 47 J. T. Smith, A. Warrington, and S. W. Linderman. Simplified state space layers for sequence modeling. arXiv preprint arXiv:2208.04933, 2022.
1165 24 48 Y. Tay, M. Dehghani, S. Abnar, Y. Shen, D. Bahri, P. Pham, J. Rao, L. Yang, S. Ruder, and D. Metzler. Long range arena: A benchmark for efficient transformers. arXiv preprint arXiv:2011.04006, 2020.
1166 24 49 Y. Tay, V. Q. Tran, S. Ruder, J. Gupta, H. W. Chung, D. Bahri, Z. Qin, S. Baumgartner, C. Yu, and D. Metzler. Charformer: Fast character transformers via gradient-based subword tokenization. arXiv preprint arXiv:2106.12672, 2021.
1167 24 50 L. Van der Maaten and G. Hinton. Visualizing data using t-sne. Journal of machine learning research, 9(11), 2008.
1168 24 51 A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.
1169 24 52 J. Wei, M. Bosma, V. Y. Zhao, K. Guu, A. W. Yu, B. Lester, N. Du, A. M. Dai, and Q. V. Le. Finetuned language models are zero-shot learners. arXiv preprint arXiv:2109.01652, 2021.
1170 24 53 F. Yang, W. Wang, F. Wang, Y. Fang, D. Tang, J. Huang, H. Lu, and J. Yao. scBERT as a large-scale pretrained deep language model for cell type annotation of single-cell RNA-seq data. Nature Machine Intelligence, 4(10):852–866, 2022.
1171 24 54 J. Yu, Z. Wang, V. Vasudevan, L. Yeung, M. Seyedhosseini, and Y. Wu. Coca: Contrastive captioners are image-text foundation models. arXiv preprint arXiv:2205.01917, 2022.
1172 24 55 M. Zaheer, G. Guruganesh, K. A. Dubey, J. Ainslie, C. Alberti, S. Ontanon, P. Pham, A. Ravula, Q. Wang, L. Yang, et al. Big bird: Transformers for longer sequences. Advances in neural information processing systems, 33:17283–17297, 2020.
1173 24 56 S. Zaina, E. L. Pérez-Luque, and G. Lund. Genetics talks to epigenetics? the interplay between sequence variants and chromatin structure. Current genomics, 11(5):359–367, 2010.
1174 24 57 J. Zhou and O. G. Troyanskaya. Predicting effects of noncoding variants with deep learning–based sequence model. Nature methods, 12(10):931–934, 2015.
1175 24 58 M. Zvyagin, A. Brace, K. Hippe, Y. Deng, B. Zhang, C. O. Bohorquez, A. Clyde, B. Kale, D. Perez-Rivera, H. Ma, et al. GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics. bioRxiv, pages 2022–10, 2022.
1176 25 0 OpenAI, A. Hurst, A. Lerer, A. P. Goucher, A. Perelman, A. Ramesh, A. Clark, A. Ostrow, A. Welihinda, A. Hayes, A. Radford, A. Mądry, A. Baker-Whitcomb, A. Beutel, A. Borzunov, A. Carney, A. Chow, A. Kirillov, A. Nichol, A. Paino, A. Renzin, A. T. Passos, A. Kirillov, A. Christakis, A. Conneau, A. Kamali, A. Jabri, A. Moyer, A. Tam, A. Crookes, A. Tootoochian, A. Tootoonchian, A. Kumar, A. Vallone, A. Karpathy, A. Braunstein, A. Cann, A. Codispoti, A. Galu, A. Kondrich, A. Tulloch, A. Mishchenko, A. Baek, A. Jiang, A. Pelisse, A. Woodford, A. Gosalia, A. Dhar, A. Pantuliano, A. Nayak, A. Oliver, B. Zoph, B. Ghorbani, B. Leimberger, B. Rossen, B. Sokolowsky, B. Wang, B. Zweig, B. Hoover, B. Samic, B. McGrew, B. Spero, B. Giertler, B. Cheng, B. Lightcap, B. Walkin, B. Quinn, B. Guarraci, B. Hsu, B. Kellogg, B. Eastman, C. Lugaresi, C. Wainwright, C. Bassin, C. Hudson, C. Chu, C. Nelson, C. Li, C. J. Shern, C. Conger, C. Barette, C. Voss, C. Ding, C. Lu, C. Zhang, C. Beaumont, C. Hallacy, C. Koch, C. Gibson, C. Kim, C. Choi, C. McLeavey, C. Hesse, C. Fischer, C. Winter, C. Czarnecki, C. Jarvis, C. Wei, C. Koumouzelis, D. Sherburn, D. Kappler, D. Levin, D. Levy, D. Carr, D. Farhi, D. Mely, D. Robinson, D. Sasaki, D. Jin, D. Valladares, D. Tsipras, D. Li, D. P. Nguyen, D. Findlay, E. Oiwoh, E. Wong, E. Asdar, E. Proehl, E. Yang, E. Antonow, E. Kramer, E. Peterson, E. Sigler, E. Wallace, E. Brevdo, E. Mays, F. Khorasani, F. P. Such, F. Raso, F. Zhang, F. von Lohmann, F. Sulit, G. Goh, G. Oden, G. Salmon, G. Starace, G. Brockman, H. Salman, H. Bao, H. Hu, H. Wong, H. Wang, H. Schmidt, H. Whitney, H. Jun, H. Kirchner, H. P. de Oliveira Pinto, H. Ren, H. Chang, H. W. Chung, I. Kivlichan, I. O’Connell, I. Osband, I. Silber, I. Sohl, I. Okuyucu, I. Lan, I. Kostrikov, I. Sutskever, I. Kanitscheider, I. Gulrajani, J. Coxon, J. Menick, J. Pachocki, J. Aung, J. Betker, J. Crooks, J. Lennon, J. Kiros, J. Leike, J. Park, J. Kwon, J. Phang, J. Teplitz, J. Wei, J. Wolfe, J. Chen, J. Harris, J. Varavva, J. G. Lee, J. Shieh, J. Lin, J. Yu, J. Weng, J. Tang, J. Yu, J. Jang, J. Q. Candela, J. Beutler, J. Landers, J. Parish, J. Heidecke, J. Schulman, J. Lachman, J. McKay, J. Uesato, J. Ward, J. W. Kim, J. Huizinga, J. Sitkin, J. Kraaijeveld, J. Gross, J. Kaplan, J. Snyder, J. Achiam, J. Jiao, J. Lee, J. Zhuang, J. Harriman, K. Fricke, K. Hayashi, K. Singhal, K. Shi, K. Karthik, K. Wood, K. Rimbach, K. Hsu, K. Nguyen, K. Gu-Lemberg, K. Button, K. Liu, K. Howe, K. Muthukumar, K. Luther, L. Ahmad, L. Kai, L. Itow, L. Workman, L. Pathak, L. Chen, L. Jing, L. Guy, L. Fedus, L. Zhou, L. Mamitsuka, L. Weng, L. McCallum, L. Held, L. Ouyang, L. Feuvrier, L. Zhang, L. Kondraciuk, L. Kaiser, L. Hewitt, L. Metz, L. Doshi, M. Aflak, M. Simens, M. Boyd, M. Thompson, M. Dukhan, M. Chen, M. Gray, M. Hudnall, M. Zhang, M. Aljubeh, M. Litwin, M. Zeng, M. Johnson, M. Shetty, M. Gupta, M. Shah, M. Yatbaz, M. J. Yang, M. Zhong, M. Glaese, M. Chen, M. Janner, M. Lampe, M. Petrov, M. Wu, M. Wang, M. Fradin, M. Pokrass, M. Castro, M. O. T. de Castro, M. Pavlov, M. Brundage, M. Wang, M. Khan, M. Murati, M. Bavarian, M. Lin, M. Yesildal, N. Soto, N. Gimelshein, N. Cone, N. Staudacher, N. Summers, N. LaFontaine, N. Chowdhury, N. Ryder, N. Stathas, N. Turley, N. Tezak, N. Felix, N. Kudige, N. Keskar, N. Deutsch, N. Bundick, N. Puckett, O. Nachum, O. Okelola, O. Boiko, O. Murk, O. Jaffe, O. Watkins, O. Godement, O. Campbell-Moore, P. Chao, P. McMillan, P. Belov, P. Su, P. Bak, P. Bakkum, P. Deng, P. Dolan, P. Hoeschele, P. Welinder, P. Tillet, P. Pronin, P. Dhariwal, Q. Yuan, R. Dias, R. Lim, R. Arora, R. Troll, R. Lin, R. G. Lopes, R. Puri, R. Miyara, R. Leike, R. Gaubert, R. Zamani, R. Wang, R. Donnelly, R. Honsby, R. Smith, R. Sahai, R. Ramchandani, R. Huet, R. Carmichael, R. Zellers, R. Chen, R. Chen, R. Nigmatullin, R. Cheu, S. Jain, S. Altman, S. Schoenholz, S. Toizer, S. Miserendino, S. Agarwal, S. Culver, S. Ethersmith, S. Gray, S. Grove, S. Metzger, S. Hermani, S. Jain, S. Zhao, S. Wu, S. Jomoto, S. Wu, Shuaiqi Xia, S. Phene, S. Papay, S. Narayanan, S. Coffey, S. Lee, S. Hall, S. Balaji, T. Broda, T. Stramer, T. Xu, T. Gogineni, T. Christianson, T. Sanders, T. Patwardhan, T. Cunningham, T. Degry, T. Dimson, T. Raoux, T. Shadwell, T. Zheng, T. Underwood, T. Markov, T. Sherbakov, T. Rubin, T. Stasi, T. Kaftan, T. Heywood, T. Peterson, T. Walters, T. Eloundou, V. Qi, V. Moeller, V. Monaco, V. Kuo, V. Fomenko, W. Chang, W. Zheng, W. Zhou, W. Manassra, W. Sheu, W. Zaremba, Y. Patil, Y. Qian, Y. Kim, Y. Cheng, Y. Zhang, Y. He, Y. Zhang, Y. Jin, Y. Dai, and Y. Malkov. Gpt-4o system card. 10 2024. URL https://arxiv.org/pdf/2410.21276. https://qiita.com/kaizen_nagoya/items/06e4c54af663456b49f9
1177 25 1 OpenAI, “Hello gpt-4o,” 2024.
1178 25 2 T. Stivers, N. J. Enfield, P. Brown, C. Englert, M. Hayashi, T. Heinemann, G. Hoymann, F. Rossano, J. P. de Ruiter, K. E. Yoon, and S. C. Levinson, “Universals and cultural variation in turn-taking in conversation,” Proceedings of the National Academy of Sciences, vol. 106, no. 26, pp. 10587–10592, 2009.
1179 25 3 The White House, “Fact sheet: Biden-harris administration secures voluntary commitments from leading artificial intelligence companies to manage the risks posed by ai,” 2023.
1180 25 4 OpenAI, “Openai preparedness framework beta,” 2023. https://cdn.openai.com/openai-preparedness-framework-beta.pdf.
1181 25 5 Shutterstock, “Shutterstock press release,” 2023.
1182 25 6 OpenAI, “Gpt-4 technical report,” 2023.
1183 25 7 OpenAI, “Gpt-4v(ision) system card.” https://openai.com/index/gpt-4v-system-card/, 2023. Accessed: 2024-07-22.
1184 25 8 OpenAI, “Navigating the challenges and opportunities of synthetic voices.” https://openai.com/index/navigating-the-challenges-and-opportunities-of-synthetic-voices/, 2024. Accessed: 2024-07-22.
1185 25 9 K. T. Mai, S. Bray, T. Davies, and L. D. Griffin, “Warning: Humans cannot reliably detect speech deepfakes,” PLoS One, vol. 18, p. e0285333, Aug. 2023.
1186 25 10 M. Mori, K. F. MacDorman, and N. Kageki, “The uncanny valley [from the field],” IEEE Robotics & automation magazine, vol. 19, no. 2, pp. 98–100, 2012.
1187 25 11 OpenAI, “How the voices for chatgpt were chosen,” 2024.
1188 25 12 I. Solaiman, Z. Talat, W. Agnew, L. Ahmad, D. Baker, S. L. Blodgett, C. Chen, H. Daumé III, J. Dodge, I. Duan, E. Evans, F. Friedrich, A. Ghosh, U. Gohar, S. Hooker, Y. Jernite, R. Kalluri, A. Lusoli, A. Leidinger, M. Lin, X. Lin, S. Luccioni, J. Mickel, M. Mitchell, J. Newman, A. Ovalle, M.-T. Png, S. Singh, A. Strait, L. Struppek, and A. Subramonian, “Evaluating the social impact of generative ai systems in systems and society,” 2024.
1189 25 13 R. Shelby, S. Rismani, K. Henne, A. Moon, N. Rostamzadeh, P. Nicholas, N. Yilla, J. Gallegos, A. Smart, E. Garcia, and G. Virk, “Sociotechnical harms of algorithmic systems: Scoping a taxonomy for harm reduction,” 2023.
1190 25 14 S. L. Blodgett, Q. V. Liao, A. Olteanu, R. Mihalcea, M. Muller, M. K. Scheuerman, C. Tan, and Q. Yang, “Responsible language technologies: Foreseeing and mitigating harms,” in Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems, CHI EA ’22, (New York, NY, USA), Association for Computing Machinery, 2022.
1191 25 15 H. Suresh and J. Guttag, “A framework for understanding sources of harm throughout the machine learning life cycle,” in Equity and Access in Algorithms, Mechanisms, and Optimization, EAAMO ’21, ACM, Oct. 2021.
1192 25 16 S. Shahriar, S. Allana, S. M. Hazratifard, and R. Dara, “A survey of privacy risks and mitigation strategies in the artificial intelligence life cycle,” IEEE Access, vol. 11, pp. 61829–61854, 2023.
1193 25 17 OpenAI, “Moderation overview,” 2024.
1194 25 18 A. Tamkin, M. Brundage, J. Clark, and D. Ganguli, “Understanding the capabilities, limitations, and societal impact of large language models,” 2021.
1195 25 19 B. Buchanan, A. Lohn, M. Musser, and K. Sedova, “Truth, lies, and automation: How language models could change disinformation,” May 2021.
1196 25 20 OpenAI, “Openai usage policies,” 2023. https://openai.com/policies/usage-policies/.
1197 25 21 OpenAI, “Building an early warning system for llm-aided biological threat creation,” 2024. https://openai.com/index/building-an-early-warning-system-for-llm-aided-biological-threat-creation/.
1198 25 22 Deloitte, “Deloitte acquires gryphon scientific business to expand security science and public health capabilities,” 2024. https://www2.deloitte.com/us/en/pages/about-deloitte/articles/press-releases/deloitte-acquires-gryphon-scientific-business-to-expand-security-science-and-public-health-capabilities.html.
1199 25 23 L. Weidinger, M. Rauh, N. Marchal, A. Manzini, L. A. Hendricks, J. Mateos-Garcia, S. Bergman, J. Kay, C. Griffin, B. Bariach, I. Gabriel, V. Rieser, and W. Isaac, “Sociotechnical safety evaluation of generative ai systems,” 2023.
1200 25 24 A. Tamkin, A. Askell, L. Lovitt, E. Durmus, N. Joseph, S. Kravec, K. Nguyen, J. Kaplan, and D. Ganguli, “Evaluating and mitigating discrimination in language model decisions,” 2023.
1201 25 25 J. A. Goldstein, G. Sastry, M. Musser, R. DiResta, M. Gentzel, and K. Sedova, “Generative language models and automated influence operations: Emerging threats and potential mitigations,” 2023.
1202 25 26 I. Pentina, T. Hancock, and T. Xie, “Exploring relationship development with social chatbots: A mixed-method study of replika,” Computers in Human Behavior, vol. 140, p. 107600, 2023.
1203 25 27 Y. Bengio, G. Hinton, A. Yao, D. Song, P. Abbeel, T. Darrell, Y. N. Harari, Y.-Q. Zhang, L. Xue, S. Shalev-Shwartz, G. Hadfield, J. Clune, T. Maharaj, F. Hutter, A. G. Baydin, S. McIlraith, Q. Gao, A. Acharya, D. Krueger, A. Dragan, P. Torr, S. Russell, D. Kahneman, J. Brauner, and S. Mindermann, “Managing extreme ai risks amid rapid progress,” Science, vol. 384, no. 6698, pp. 842–845, 2024.
1204 25 28 S. B. Johnson, J. R. Clark, M. C. Luetke, N. M. Butala, A. T. Pearson, J. M. Shapiro, D. M. Aleman, J. M. Lee, M. M. Beil, C. V. Winkle, M. C. Boudreaux, R. C. D’Cunha, H. J. Krouse, and C. Li, “Chatgpt in medical education: a workshop-based large language model-powered intervention for evidence-based clinical decision making in medical students,” Nature Medicine, vol. 29, pp. 1534–1542, 2023.
1205 25 29 K. Kavukcuoglu, “Real-world challenges for agi,” Nov 2021.
1206 25 30 S. Altman, “Planning for agi and beyond,” OpenAI, 2023.
1207 25 31 T. Eloundou, S. Manning, P. Mishkin, and D. Rock, “Gpts are gpts: An early look at the labor market impact potential of large language models,” arXiv preprint arXiv:2303.10130, 2023.
1208 25 32 L. Weidinger, M. Rauh, N. Marchal, A. Manzini, L. A. Hendricks, J. Mateos-Garcia, S. Bergman, J. Kay, C. Griffin, B. Bariach, et al., “Sociotechnical safety evaluation of generative ai systems,” arXiv preprint arXiv:2310.11986, 2023.
1209 25 33 S. Cox, M. Hammerling, J. Lála, J. Laurent, S. Rodriques, M. Rubashkin, and A. White, “Wikicrow: Automating synthesis of human scientific knowledge,” Future House, 2023.
1210 25 34 S. A. Athaluri, S. V. Manthena, V. S. R. K. M. Kesapragada, V. Yarlagadda, T. Dave, and R. T. S. Duddumpudi, “Exploring the boundaries of reality: Investigating the phenomenon of artificial intelligence hallucination in scientific writing through chatgpt references,” Cureus, vol. 15, no. 4, p. e37432, 2023.
1211 25 35 Z. Li, “The dark side of chatgpt: Legal and ethical challenges from stochastic parrots and hallucination,” 2023.
1212 25 36 M. Dubiel, A. Sergeeva, and L. A. Leiva, “Impact of voice fidelity on decision making: A potential dark pattern?,” 2024.
1213 25 37 B. Waber, M. Williams, J. S. Carroll, and A. S. Pentland, “A voice is worth a thousand words: The implications of the micro-coding of social signals in speech for trust research,” in Handbook of Research Methods on Trust (G. M. Fergus Lyon and M. N. Saunders, eds.), ch. 23, p. 320, New York: Edward Elgar Publishing, 2011.
1214 25 38 I. Pentina, B. Guo, and W. P. Fan, “Friend, mentor, lover: Does chatbot engagement lead to psychological dependence?,” Journal of Service Management, 2023.
1215 25 39 H. Nori, N. King, S. M. McKinney, D. Carignan, and E. Horvitz, “Capabilities of gpt-4 on medical challenge problems,” arXiv preprint arXiv:2303.13375, 2023.
1216 25 40 H. Nori, Y. T. Lee, S. Zhang, D. Carignan, R. Edgar, N. Fusi, N. King, J. Larson, Y. Li, W. Liu, et al., “Can generalist foundation models outcompete special-purpose tuning? case study in medicine,” arXiv preprint arXiv:2311.16452, 2023.
1217 25 41 K. Singhal, S. Azizi, T. Tu, S. S. Mahdavi, J. Wei, H. W. Chung, N. Scales, A. Tanwani, H. Cole-Lewis, S. Pfohl, P. Payne, M. Seneviratne, P. Gamble, C. Kelly, N. Scharli, A. Chowdhery, P. Mansfield, B. A. y Arcas, D. Webster, G. S. Corrado, Y. Matias, K. Chou, J. Gottweis, N. Tomasev, Y. Liu, A. Rajkomar, J. Barral, C. Semturs, A. Karthikesalingam, and V. Natarajan, “Large language models encode clinical knowledge,” 2022.
1218 25 42 K. Singhal, T. Tu, J. Gottweis, R. Sayres, E. Wulczyn, L. Hou, K. Clark, S. Pfohl, H. Cole-Lewis, D. Neal, M. Schaekermann, A. Wang, M. Amin, S. Lachgar, P. Mansfield, S. Prakash, B. Green, E. Dominowska, B. A. y Arcas, N. Tomasev, Y. Liu, R. Wong, C. Semturs, S. S. Mahdavi, J. Barral, D. Webster, G. S. Corrado, Y. Matias, S. Azizi, A. Karthikesalingam, and V. Natarajan, “Towards expert-level medical question answering with large language models,” 2023.
1219 25 43 K. Saab, T. Tu, W.-H. Weng, R. Tanno, D. Stutz, E. Wulczyn, F. Zhang, T. Strother, C. Park, E. Vedadi, J. Z. Chaves, S.-Y. Hu, M. Schaekermann, A. Kamath, Y. Cheng, D. G. T. Barrett, C. Cheung, B. Mustafa, A. Palepu, D. McDuff, L. Hou, T. Golany, L. Liu, J.-B. Alayrac, N. Houlsby, N. Tomasev, J. Freyberg, C. Lau, J. Kemp, J. Lai, S. Azizi, K. Kanada, S. Man, K. Kulkarni, R. Sun, S. Shakeri, L. He, B. Caine, A. Webson, N. Latysheva, M. Johnson, P. Mansfield, J. Lu, E. Rivlin, J. Anderson, B. Green, R. Wong, J. Krause, J. Shlens, E. Dominowska, S. M. A. Eslami, K. Chou, C. Cui, O. Vinyals, K. Kavukcuoglu, J. Manyika, J. Dean, D. Hassabis, Y. Matias, D. Webster, J. Barral, G. Corrado, C. Semturs, S. S. Mahdavi, J. Gottweis, A. Karthikesalingam, and V. Natarajan, “Capabilities of gemini models in medicine,” 2024.
1220 25 44 Epic Systems Corporation, “Epic and microsoft bring gpt-4 to ehrs,” Epic, 2023.
1221 25 45 D. Van Veen, C. Van Uden, L. Blankemeier, J.-B. Delbrouck, A. Aali, C. Bluethgen, A. Pareek, M. Polacin, E. P. Reis, A. Seehofnerová, et al., “Adapted large language models can outperform medical experts in clinical text summarization,” Nature medicine, vol. 30, no. 4, pp. 1134–1142, 2024.
1222 25 46 Epic, “Epic and microsoft bring gpt-4 to ehrs,” 2023.
1223 25 47 P. Garcia, S. P. Ma, S. Shah, M. Smith, Y. Jeong, A. Devon-Sand, M. Tai-Seale, K. Takazawa, D. Clutter, K. Vogt, C. Lugtu, M. Rojo, S. Lin, T. Shanafelt, M. A. Pfeffer, and C. Sharp, “Artificial Intelligence–Generated Draft Replies to Patient Inbox Messages,” JAMA Network Open, vol. 7, pp. e243201–e243201, 03 2024.
1224 25 48 OpenAI, “Paradigm: Improving patient access to clinical trials.” https://openai.com/index/paradigm/, 2024. Accessed: 2024-08-07.
1225 25 49 M. Hutson, “How ai is being used to accelerate clinical trials,” Nature, vol. 627, pp. S2–S5, 2024.
1226 25 50 OpenAI, “Using gpt-4o reasoning to transform cancer care.” https://openai.com/index/color-health/, 2024. Accessed: 2024-08-07.
1227 25 51 J. Varghese and J.-L. Chapiro, “Systematic analysis of chatgpt, google search and llama 2 for clinical decision support tasks,” Nature Communications, vol. 15, no. 1, p. 46411, 2024. Accessed: 2024-08-07.
1228 25 52 E. Schmidt, “Ai will transform science.” https://www.technologyreview.com/2023/07/05/1075865/eric-schmidt-ai-will-transform-science/, 2023. Accessed: 2024-08-07.
1229 25 53 N. Rosenberg, “Science, invention and economic growth,” The Economic Journal, vol. 84, no. 333, pp. 90–108, 1974.
1230 25 54 R. M. Atlas and M. Dando, “The dual-use dilemma for the life sciences: Perspectives, conundrums, and global solutions,” Biosecurity and Bioterrorism: Biodefense Strategy, Practice, and Science, vol. 4, no. 3, pp. 276–286, 2006. PMID: 16999588.
1231 25 55 X. Gu and M. Krenn, “Generation and human-expert evaluation of interesting research ideas using knowledge graphs and large language models,” 2024.
1232 25 56 A. Ghafarollahi and M. J. Buehler, “Atomagents: Alloy design and discovery through physics-aware multi-modal multi-agent artificial intelligence,” 2024.
1233 25 57 J. M. Laurent, J. D. Janizek, M. Ruzo, M. M. Hinks, M. J. Hammerling, S. Narayanan, M. Ponnapati, A. D. White, and S. G. Rodriques, “Lab-bench: Measuring capabilities of language models for biology research,” 2024.
1234 25 58 H. Cai, X. Cai, J. Chang, S. Li, L. Yao, C. Wang, Z. Gao, H. Wang, Y. Li, M. Lin, S. Yang, J. Wang, M. Xu, J. Huang, F. Xi, J. Zhuang, Y. Yin, Y. Li, C. Chen, Z. Cheng, Z. Zhao, L. Zhang, and G. Ke, “Sciassess: Benchmarking llm proficiency in scientific literature analysis,” 2024.
1235 25 59 P. Clark, I. Cowhey, O. Etzioni, T. Khot, A. Sabharwal, C. Schoenick, and O. Tafjord, “Think you have solved question answering? try arc, the AI2 reasoning challenge,” CoRR, vol. abs/1803.05457, 2018.
1236 25 60 S. Lin, J. Hilton, and O. Evans, “Truthfulqa: Measuring how models mimic human falsehoods,” CoRR, vol. abs/2109.07958, 2021.
1237 26 0 M. Poli, J. Wang, S. Massaroli, J. Quesnelle, R. Carlow, E. Nguyen, and A. Thomas. StripedHyena: Moving Beyond Transformers with Hybrid Signal Processing Models, 12 2023. URL https://github.com/togethercomputer/stripedhyena. https://qiita.com/kaizen_nagoya/items/423495ed798c1eaf89f1
1238 26 1 Marah I Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat S. Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, Ziyi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, and Xiren Zhou. Phi-3 technical report: A highly capable language model locally on your phone. CoRR, abs/2404.14219, 2024.
1239 26 2 Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, Robert Hero, Jining Huang, Vibhu Jawa, Joseph Jennings, Aastha Jhunjhunwala, John Kamalu, Sadaf Khan, Oleksii Kuchaiev, Patrick LeGresley, Hui Li, Jiwei Liu, Zihan Liu, Eileen Peters Long, Ameya Mahabaleshwarkar, Somshubra Majumdar, James Maki, Miguel Martinez, Maer Rodrigues de Melo, Ivan Moshkov, Deepak Narayanan, Sean Narenthiran, Jesus Navarro, Phong Nguyen, Osvald Nitski, Vahid Noroozi, Guruprasad Nutheti, Christopher Parisien, Jupinder Parmar, Mostofa Patwary, Krzysztof Pawelec, Wei Ping, Shrimai Prabhumoye, Rajarshi Roy, Trisha Saar, Vasanth Rao Naik Sabavat, Sanjeev Satheesh, Jane Polak Scowcroft, Jason D. Sewall, Pavel Shamis, Gerald Shen, Mohammad Shoeybi, Dave Sizer, Misha Smelyanskiy, Felipe Soares, Makesh Narsimhan Sreedhar, Dan Su, Sandeep Subramanian, Shengyang Sun, Shubham Toshniwal, Hao Wang, Zhilin Wang, Jiaxuan You, Jiaqi Zeng, Jimmy Zhang, Jing Zhang, Vivienne Zhang, Yian Zhang, and Chen Zhu. Nemotron-4 340B technical report. CoRR, abs/2406.11704, 2024.
1240 26 3 Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón, and Sumit Sanghai. GQA: Training generalized multi-query Transformer models from multi-head checkpoints. In EMNLP, pp. 4895–4901. Association for Computational Linguistics, 2023.
1241 26 4 Ebtesam Almazrouei, Hamza Alobeidli, Abdulaziz Alshamsi, Alessandro Cappelli, Ruxandra Cojocaru, Mérouane Debbah, Etienne Goffinet, Daniel Hesslow, Julien Launay, Quentin Malartic, Daniele Mazzotta, Badreddine Noune, Baptiste Pannier, and Guilherme Penedo. The Falcon series of open language models. CoRR, abs/2311.16867, 2023.
1242 26 5 Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, and Lingpeng Kong. Training-free long-context scaling of large language models. CoRR, abs/2402.17463, 2024.
1243 26 6 Anthropic. Introducing Claude, 2023a. URL https://www.anthropic.com/index/introducing-claude. Anthropic. Claude 2. Technical report, Anthropic, 2023b. URL https://www-files.anthropic.com/production/images/Model-Card-Claude-2.pdf.
1244 26 7 Anthropic. The Claude 3 model family: Opus, Sonnet, Haiku. Technical report, Anthropic, 2024. URL https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf.
1245 26 8 Jacob Austin, Augustus Odena, Maxwell I. Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie J. Cai, Michael Terry, Quoc V. Le, and Charles Sutton. Program synthesis with large language models. CoRR, abs/2108.07732, 2021.
1246 26 9 Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, and Tianhang Zhu. Qwen technical report. CoRR, abs/2309.16609, 2023.
1247 26 10 Yushi Bai, Xin Lv, Jiajie Zhang, Yuze He, Ji Qi, Lei Hou, Jie Tang, Yuxiao Dong, and Juanzi Li. LongAlign: A recipe for long context alignment of large language models. In EMNLP (Findings), pp. 1376–1395. Association for Computational Linguistics, 2024.
1248 26 11 Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, and Madian Khabsa. The Belebele benchmark: A parallel reading comprehension dataset in 122 language variants. CoRR, abs/2308.16884, 2023.
1249 26 12 Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language models are few-shot learners. In NeurIPS, 2020.
1250 26 13 Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, and Bowen Yu. Towards scalable automated alignment of LLMs: A survey. CoRR, abs/2406.01252, 2024.
1251 26 14 Federico Cassano, John Gouwar, Daniel Nguyen, Sydney Nguyen, Luna Phipps-Costin, Donald Pinckney, Ming-Ho Yee, Yangtian Zi, Carolyn Jane Anderson, Molly Q. Feldman, Arjun Guha, Michael Greenberg, and Abhinav Jangda. MultiPL-E: A scalable and polyglot approach to benchmarking neural code generation. IEEE Trans. Software Eng., 49(7):3675–3691, 2023.
1252 26 15 Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pondé de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Joshua Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, and Wojciech Zaremba. Evaluating large language models trained on code. CoRR, abs/2107.03374, 2021.
1253 26 16 Wenhu Chen, Ming Yin, Max Ku, Pan Lu, Yixin Wan, Xueguang Ma, Jianyu Xu, Xinyi Wang, and Tony Xia. TheoremQA: A theorem-driven question answering dataset. In EMNLP, pp. 7889–7901. Association for Computational Linguistics, 2023a.
1254 26 17 Zhihong Chen, Shuo Yan, Juhao Liang, Feng Jiang, Xiangbo Wu, Fei Yu, Guiming Hardy Chen, Junying Chen, Hongbo Zhang, Li Jianquan, Wan Xiang, and Benyou Wang. MultilingualSIFT: Multilingual supervised instruction fine-tuning, 2023b. URL https://github.com/FreedomIntelligence/MultilingualSIFT.
1255 26 18 Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord. Think you have solved question answering? Try ARC, the AI2 reasoning challenge. CoRR, abs/1803.05457, 2018.
1256 26 19 Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, and John Schulman. Training verifiers to solve math word problems. CoRR, abs/2110.14168, 2021.
1257 26 20 Damai Dai, Chengqi Deng, Chenggang Zhao, R. X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Y. Wu, Zhenda Xie, Y. K. Li, Panpan Huang, Fuli Luo, Chong Ruan, Zhifang Sui, and Wenfeng Liang. DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models. CoRR, abs/2401.06066, 2024.
1258 26 21 Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. Language modeling with gated convolutional networks. In ICML, volume 70 of Proceedings of Machine Learning Research, pp. 933–941. PMLR, 2017.
1259 26 22 Guanting Dong, Keming Lu, Chengpeng Li, Tingyu Xia, Bowen Yu, Chang Zhou, and Jingren Zhou. Self-play with execution feedback: Improving instruction-following capabilities of large language models. CoRR, abs/2406.13542, 2024.
1260 26 23 Shihan Dou, Jiazheng Zhang, Jianxiang Zang, Yunbo Tao, Haoxiang Jia, Shichun Liu, Yuming Yang, Shenxi Wu, Shaoqing Zhang, Muling Wu, et al. Multi-programming language sandbox for llms. CoRR, abs/2410.23074, 2024.
1261 26 24 Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurélien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Rozière, Bethany Biron, Binh Tang, Bobbie Chern, Charlotte Caucheteux, Chaya Nayak, Chloe Bi, Chris Marra, Chris McConnell, Christian Keller, Christophe Touret, Chunyang Wu, Corinne Wong, Cristian Canton Ferrer, Cyrus Nikolaidis, Damien Allonsius, Daniel Song, Danielle Pintz, Danny Livshits, David Esiobu, Dhruv Choudhary, Dhruv Mahajan, Diego Garcia-Olano, Diego Perino, Dieuwke Hupkes, Egor Lakomkin, Ehab AlBadawy, Elina Lobanova, Emily Dinan, Eric Michael Smith, Filip Radenovic, Frank Zhang, Gabriel Synnaeve, Gabrielle Lee, Georgia Lewis Anderson, Graeme Nail, Grégoire Mialon, Guan Pang, Guillem Cucurell, Hailey Nguyen, Hannah Korevaar, Hu Xu, Hugo Touvron, Iliyan Zarov, Imanol Arrieta Ibarra, Isabel M. Kloumann, Ishan Misra, Ivan Evtimov, Jade Copet, Jaewon Lee, Jan Geffert, Jana Vranes, Jason Park, Jay Mahadeokar, Jeet Shah, Jelmer van der Linde, Jennifer Billock, Jenny Hong, Jenya Lee, Jeremy Fu, Jianfeng Chi, Jianyu Huang, Jiawen Liu, Jie Wang, Jiecao Yu, Joanna Bitton, Joe Spisak, Jongsoo Park, Joseph Rocca, Joshua Johnstun, Joshua Saxe, Junteng Jia, Kalyan Vasuden Alwala, Kartikeya Upasani, Kate Plawiak, Ke Li, Kenneth Heafield, Kevin Stone, et al. The Llama 3 herd of models. CoRR, abs/2407.21783, 2024.
1264 26 27 William Fedus, Barret Zoph, and Noam Shazeer. Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res., 23:120:1–120:39, 2022.
1265 26 28 Alena Fenogenova, Artem Chervyakov, Nikita Martynov, Anastasia Kozlova, Maria Tikhonova, Albina Akhmetgareeva, Anton A. Emelyanov, Denis Shevelev, Pavel Lebedev, Leonid Sinev, Ulyana Isaeva, Katerina Kolomeytseva, Daniil Moskovskiy, Elizaveta Goncharova, Nikita Savushkin, Polina Mikhailova, Denis Dimitrov, Alexander Panchenko, and Sergey Markov. MERA: A comprehensive LLM evaluation in russian. CoRR, abs/2401.04531, 2024.
1266 26 29 Evan Frick, Peter Jin, Tianle Li, Karthik Ganesan, Jian Zhang, Jiantao Jiao, and Banghua Zhu. Athene-70b: Redefining the boundaries of post-training for open models, July 2024a. URL https://nexusflow.ai/blogs/athene.
1267 26 30 Evan Frick, Tianle Li, Connor Chen, Wei-Lin Chiang, Anastasios Nikolas Angelopoulos, Jiantao Jiao, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. How to evaluate reward models for RLHF. CoRR, abs/2410.14872, 2024b.
1268 26 31 Aryo Pradipta Gema, Joshua Ong Jun Leang, Giwon Hong, Alessio Devoto, Alberto Carlo Maria Mancino, Rohit Saxena, Xuanli He, Yu Zhao, Xiaotang Du, Mohammad Reza Ghasemi Madani, et al. Are we done with MMLU? CoRR, abs/2406.04127, 2024.
1269 26 32 Gemini Team. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. Technical report, Google, 2024. URL https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf.
1270 26 33 Gemma Team, Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, Léonard Hussenot, Thomas Mesnard, Bobak Shahriari, Alexandre Ramé, et al. Gemma 2: Improving open language models at a practical size. CoRR, abs/2408.00118, 2024.
1271 26 34 Naman Goyal, Cynthia Gao, Vishrav Chaudhary, Peng-Jen Chen, Guillaume Wenzek, Da Ju, Sanjana Krishnan, Marc’Aurelio Ranzato, Francisco Guzmán, and Angela Fan. The Flores-101 evaluation benchmark for low-resource and multilingual machine translation. Trans. Assoc. Comput. Linguistics, 10:522–538, 2022.
1272 26 35 Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt. Measuring massive multitask language understanding. In ICLR. OpenReview.net, 2021a.
1273 26 36 Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, and Jacob Steinhardt. Measuring mathematical problem solving with the MATH dataset. In NeurIPS Datasets and Benchmarks, 2021b.
1274 26 37 Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, et al. Training compute-optimal large language models. CoRR, abs/2203.15556, 2022.
1275 26 38 Keith Hoskin. The “awful idea of accountability”: Inscribing people into the measurement of objects. Accountability: Power, ethos and the technologies of managing, 1996.
1276 26 39 Cheng-Ping Hsieh, Simeng Sun, Samuel Kriman, Shantanu Acharya, Dima Rekesh, Fei Jia, Yang Zhang, and Boris Ginsburg. RULER: What’s the real context size of your long-context language models? CoRR, abs/2404.06654, 2024.
1277 26 40 Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao, Xinrong Zhang, Zhen Leng Thai, Kai Zhang, Chongyi Wang, Yuan Yao, Chenyang Zhao, Jie Zhou, Jie Cai, Zhongwu Zhai, Ning Ding, Chao Jia, Guoyang Zeng, Dahai Li, Zhiyuan Liu, and Maosong Sun. MiniCPM: Unveiling the potential of small language models with scalable training strategies. CoRR, abs/2404.06395, 2024.
1278 26 41 Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Keming Lu, et al. Qwen2.5-Coder technical report. CoRR, abs/2409.12186, 2024.
1279 26 42 Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, and Ion Stoica. LiveCodeBench: Holistic and contamination free evaluation of large language models for code. CoRR, abs/2403.07974, 2024.
1280 26 43 Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mistral 7B. CoRR, abs/2310.06825, 2023a.
1281 26 44 Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mixtral of experts. CoRR, abs/2401.04088, 2024a.
1282 26 45 Huiqiang Jiang, Yucheng Li, Chengruidong Zhang, Qianhui Wu, Xufang Luo, Surin Ahn, Zhenhua Han, Amir H Abdi, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, and Lili Qiu. Minference 1.0: Accelerating pre-filling for long-context llms via dynamic sparse attention. arXiv preprint arXiv:2407.02490, 2024b.
1283 26 46 Zixuan Jiang, Jiaqi Gu, Hanqing Zhu, and David Z. Pan. Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and efficient pre-LN Transformers. CoRR, abs/2305.14858, 2023b.
1284 26 47 Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models. CoRR, abs/2001.08361, 2020.
1285 26 48 Fajri Koto, Nurul Aisyah, Haonan Li, and Timothy Baldwin. Large language models only pass primary school exams in Indonesia: A comprehensive test on IndoMMLU. In EMNLP, pp. 12359–12374. Association for Computational Linguistics, 2023.
1286 26 49 Nathan Lambert, Valentina Pyatkin, Jacob Daniel Morrison, Lester James Validad Miranda, Bill Yuchen Lin, Khyathi Raghavi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, and Hanna Hajishirzi. RewardBench: Evaluating reward models for language modeling. CoRR, abs/2403.13787, 2024.
1287 26 50 Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. GShard: Scaling giant models with conditional computation and automatic sharding. CoRR, abs/2006.16668, 2020.
1288 26 51 Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. From crowdsourced data to high-quality benchmarks: Arena-Hard and BenchBuilder pipeline. CoRR, abs/2406.11939, 2024.
1289 26 52 Stephanie Lin, Jacob Hilton, and Owain Evans. TruthfulQA: Measuring how models mimic human falsehoods. In ACL (1), pp. 3214–3252. Association for Computational Linguistics, 2022a.
1290 26 53 Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O’Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona T. Diab, Veselin Stoyanov, and Xian Li. Few-shot learning with multilingual generative language models. In EMNLP, pp. 9019–9052. Association for Computational Linguistics, 2022b.
1291 26 54 Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. Is your code generated by ChatGPT really correct? Rigorous evaluation of large language models for code generation. In NeurIPS, 2023.
1292 26 55 Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, and Chang Zhou. Online merging optimizers for boosting rewards and mitigating tax in alignment. CoRR, abs/2405.17931, 2024a.
1293 26 56 Keming Lu, Bowen Yu, Chang Zhou, and Jingren Zhou. Large language models are superpositions of all characters: Attaining arbitrary role-play via self-alignment. CoRR, abs/2401.12474, 2024b.
1295 26 58 Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M. Saiful Bari, Sheng Shen, Zheng Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, and Colin Raffel. Crosslingual generalization through multitask finetuning. In ACL (1), pp. 15991–16111. Association for Computational Linguistics, 2023.
1296 26 59 Junho Myung, Nayeon Lee, Yi Zhou, Jiho Jin, Rifki Afina Putri, Dimosthenis Antypas, Hsuvas Borkakoty, Eunsu Kim, Carla Pérez-Almendros, Abinew Ali Ayele, Víctor Gutiérrez-Basulto, Yazmín Ibáñez-García, Hwaran Lee, Shamsuddeen Hassan Muhammad, Ki-Woong Park, Anar Sabuhi Rzayev, Nina White, Seid Muhie Yimam, Mohammad Taher Pilehvar, Nedjma Ousidhoum, José Camacho-Collados, and Alice Oh. Blend: A benchmark for llms on everyday knowledge in diverse cultures and languages. CoRR, abs/2406.09948, 2024.
1297 26 60 OpenAI. GPT4 technical report. CoRR, abs/2303.08774, 2023.
1298 26 61 OpenAI. Hello GPT-4o, 2024a. URL https://openai.com/index/hello-gpt-4o/.
1299 26 62 OpenAI. Learning to reason with LLMs, 2024b. URL https://openai.com/index/learning-to-reason-with-llms/.
1300 26 63 Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul F. Christiano, Jan Leike, and Ryan Lowe. Training language models to follow instructions with human feedback. In NeurIPS, 2022.
1301 26 64 Bowen Peng, Jeffrey Quesnelle, Honglu Fan, and Enrico Shippole. YaRN: Efficient context window extension of large language models. CoRR, abs/2309.00071, 2023.
1302 26 65 Edoardo Maria Ponti, Goran Glavas, Olga Majewska, Qianchu Liu, Ivan Vulic, and Anna Korhonen. XCOPA: A multilingual dataset for causal commonsense reasoning. In EMNLP (1), pp. 2362–2376. Association for Computational Linguistics, 2020.
1304 26 67 Shanghaoran Quan, Tianyi Tang, Bowen Yu, An Yang, Dayiheng Liu, Bofei Gao, Jianhong Tu, Yichang Zhang, Jingren Zhou, and Junyang Lin. Language models can self-lengthen to generate long texts. CoRR, abs/2410.23933, 2024.
1305 26 68 Qwen Team. Code with CodeQwen1.5, 2024a. URL https://qwenlm.github.io/blog/codeqwen1.5/.
1306 26 69 Qwen Team. Introducing Qwen1.5, 2024b. URL https://qwenlm.github.io/blog/qwen1.5/. Qwen Team. Introducing Qwen2-Math, 2024c. URL https://qwenlm.github.io/blog/qwen2-math/.
1307 26 70 Qwen Team. QwQ: Reflect deeply on the boundaries of the unknown, 2024d. URL https://qwenlm.github.io/blog/qwq-32b-preview/.
1308 26 71 Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever, et al. Improving language understanding by generative pre-training. Technical report, OpenAI, 2018.
1309 26 72 Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. In NeurIPS, 2023.
1310 26 73 Samyam Rajbhandari, Conglong Li, Zhewei Yao, Minjia Zhang, Reza Yazdani Aminabadi, Ammar Ahmad Awan, Jeff Rasley, and Yuxiong He. DeepSpeed-MoE: Advancing mixture-of-experts inference and training to power next-generation AI scale. In ICML, volume 162 of Proceedings of Machine Learning Research, pp. 18332–18346. PMLR, 2022.
1311 26 74 David Rein, Betty Li Hou, Asa Cooper Stickland, Jackson Petty, Richard Yuanzhe Pang, Julien Dirani, Julian Michael, and Samuel R. Bowman. GPQA: A graduate-level Google-proof Q&A benchmark. CoRR, abs/2311.12022, 2023.
1312 26 75 Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, and Yejin Choi. WinoGrande: An adversarial winograd schema challenge at scale. Commun. ACM, 64(9):99–106, 2021.
1313 26 76 Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. In ACL (1). The Association for Computer Linguistics, 2016.
1314 26 77 Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Mingchuan Zhang, Y. K. Li, Y. Wu, and Daya Guo. Deepseekmath: Pushing the limits of mathematical reasoning in open language models. CoRR, abs/2402.03300, 2024.
1315 26 78 Jianlin Su. The magical effect of the Bias term: RoPE + Bias = better length extrapolation, 2023. URL https://spaces.ac.cn/archives/9577.
1316 26 79 Jianlin Su, Murtadha H. M. Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. Roformer: Enhanced Transformer with rotary position embedding. Neurocomputing, 568:127063, 2024.
1317 26 80 Mirac Suzgun, Nathan Scales, Nathanael Schärli, Sebastian Gehrmann, Yi Tay, Hyung Won Chung, Aakanksha Chowdhery, Quoc V. Le, Ed H. Chi, Denny Zhou, and Jason Wei. Challenging BIG-Bench tasks and whether chain-of-thought can solve them. In ACL (Findings), pp. 13003–13051. Association for Computational Linguistics, 2023.
1318 26 81 Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurélien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. LLaMA: Open and efficient foundation language models. CoRR, abs/2302.13971, 2023a.
1319 26 82 Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton-Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurélien Rodriguez, Robert Stojnic, Sergey Edunov, and Thomas Scialom. Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288, 2023b.
1320 26 83 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In NIPS, pp. 5998–6008, 2017.
1321 26 84 Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, et al. Secrets of RLHF in large language models part II: Reward modeling. CoRR, abs/2401.06080, 2024a.
1322 26 85 Changhan Wang, Kyunghyun Cho, and Jiatao Gu. Neural machine translation with byte-level subwords. In AAAI, pp. 9154–9160. AAAI Press, 2020.
1323 26 86 Yubo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, Tianle Li, Max Ku, Kai Wang, Alex Zhuang, Rongqi Fan, Xiang Yue, and Wenhu Chen. MMLU-Pro: A more robust and challenging multi-task language understanding benchmark. CoRR, abs/2406.01574, 2024b.
1324 26 87 Zhilin Wang, Alexander Bukharin, Olivier Delalleau, Daniel Egert, Gerald Shen, Jiaqi Zeng, Oleksii Kuchaiev, and Yi Dong. HelpSteer2-Preference: Complementing ratings with preferences. CoRR, abs/2410.01257, 2024c.
1325 26 88 Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Benjamin Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, Chinmay Hegde, Yann LeCun, Tom Goldstein, Willie Neiswanger, and Micah Goldblum. LiveBench: A challenging, contamination-free LLM benchmark. CoRR, abs/2406.19314, 2024.
1326 26 89 Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Le Sun, Jingren Zhou, and Junyang Lin. Aligning large language models via self-steering optimization. CoRR, abs/2410.17131, 2024.
1327 26 90 Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, and Hao Ma. Effective long-context scaling of foundation models. CoRR, abs/2309.16039, 2023.
1328 26 91 An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, Kai Dang, Keming Lu, Keqin Chen, Kexin Yang, Mei Li, Mingfeng Xue, Na Ni, Pei Zhang, Peng Wang, Ru Peng, Rui Men, Ruize Gao, Runji Lin, Shijie Wang, Shuai Bai, Sinan Tan, Tianhang Zhu, Tianhao Li, Tianyu Liu, Wenbin Ge, Xiaodong Deng, Xiaohuan Zhou, Xingzhang Ren, Xinyu Zhang, Xipin Wei, Xuancheng Ren, Xuejing Liu, Yang Fan, Yang Yao, Yichang Zhang, Yu Wan, Yunfei Chu, Yuqiong Liu, Zeyu Cui, Zhenru Zhang, Zhifang Guo, and Zhihao Fan. Qwen2 technical report. CoRR, abs/2407.10671, 2024a.
1329 26 92 An Yang, Beichen Zhang, Binyuan Hui, Bofei Gao, Bowen Yu, Chengpeng Li, Dayiheng Liu, Jianhong Tu, Jingren Zhou, Junyang Lin, et al. Qwen2.5-Math technical report: Toward mathematical expert model via self-improvement. CoRR, abs/2409.12122, 2024b.
1330 26 93 Jian Yang, Jiaxi Yang, Ke Jin, Yibo Miao, Lei Zhang, Liqun Yang, Zeyu Cui, Yichang Zhang, Binyuan Hui, and Junyang Lin. Evaluating and aligning codellms on human preference. CoRR, abs/2412.05210, 2024c.
1331 26 94 Yinfei Yang, Yuan Zhang, Chris Tar, and Jason Baldridge. PAWS-X: A cross-lingual adversarial dataset for paraphrase identification. In EMNLP/IJCNLP (1), pp. 3685–3690. Association for Computational Linguistics, 2019.
1332 26 95 Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu, Jianqun Chen, Jing Chang, Kaidong Yu, Peng Liu, Qiang Liu, Shawn Yue, Senbin Yang, Shiming Yang, Tao Yu, Wen Xie, Wenhao Huang, Xiaohui Hu, Xiaoyi Ren, Xinyao Niu, Pengcheng Nie, Yuchi Xu, Yudong Liu, Yue Wang, Yuxuan Cai, Zhenyu Gu, Zhiyuan Liu, and Zonghong Dai. Yi: Open foundation models by 01.AI. CoRR, abs/2403.04652, 2024.
1333 26 96 Tao Yuan, Xuefei Ning, Dong Zhou, Zhijie Yang, Shiyao Li, Minghui Zhuang, Zheyue Tan, Zhuyu Yao, Dahua Lin, Boxun Li, Guohao Dai, Shengen Yan, and Yu Wang. LV-Eval: A balanced long-context benchmark with 5 length levels up to 256K. CoRR, abs/2402.05136, 2024.
1334 26 97 Zheng Yuan, Hongyi Yuan, Chengpeng Li, Guanting Dong, Chuanqi Tan, and Chang Zhou. Scaling relationship on learning mathematical reasoning with large language models. CoRR, abs/2308.01825, 2023.
1335 26 98 Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, and Yejin Choi. HellaSwag: Can a machine really finish your sentence? In ACL (1), pp. 4791–4800. Association for Computational Linguistics, 2019.
1336 26 99 Yidan Zhang, Boyi Deng, Yu Wan, Baosong Yang, Haoran Wei, Fei Huang, Bowen Yu, Junyang Lin, and Jingren Zhou. P-MMEval: A parallel multilingual multitask benchmark for consistent evaluation of LLMs. CoRR, abs/2411.09116, 2024.
1337 26 100 Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, and Ion Stoica. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. In NeurIPS, 2023.
1338 26 101 Enyu Zhou, Guodong Zheng, Bing Wang, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, and Xuanjing Huang. RMB: Comprehensively benchmarking reward models in LLM alignment. CoRR, abs/2410.09893, 2024.
1339 26 102 Jeffrey Zhou, Tianjian Lu, Swaroop Mishra, Siddhartha Brahma, Sujoy Basu, Yi Luan, Denny Zhou, and Le Hou. Instruction-following evaluation for large language models. CoRR, abs/2311.07911, 2023.
1340 26 103 Barret Zoph, Irwan Bello, Sameer Kumar, Nan Du, Yanping Huang, Jeff Dean, Noam Shazeer, and William Fedus. ST-MoE: Designing stable and transferable sparse expert models. CoRR, abs/2202.08906, 2022.
1341 27 0 Qwen: A. Yang, B. Yang, B. Zhang, B. Hui, B. Zheng, B. Yu, C. Li, D. Liu, F. Huang, H. Wei, H. Lin, J. Yang, J. Tu, J. Zhang, J. Yang, J. Yang, J. Zhou, J. Lin, K. Dang, K. Lu, K. Bao, K. Yang, L. Yu, M. Li, M. Xue, P. Zhang, Q. Zhu, R. Men, R. Lin, T. Li, T. Tang, T. Xia, X. Ren, X. Ren, Y. Fan, Y. Su, Y. Zhang, Y. Wan, Y. Liu, Z. Cui, Z. Zhang, and Z. Qiu. Qwen2.5 technical report. 12 2024. URL https://arxiv.org/pdf/2412.15115. https://qiita.com/kaizen_nagoya/items/f87fac7f9a83f54328fe
1342 27 1 https://huggingface.co/togethercomputer/StripedHyena-Hessian-7B
1343 27 2 Hyena Hierarchy: Towards Larger Convolutional Language Models Michael Poli, Stefano Massaroli, Eric Nguyen, Daniel Y. Fu, Tri Dao, Stephen Baccus, Yoshua Bengio, Stefano Ermon, Christopher Ré https://arxiv.org/abs/2302.10866
1344 27 3 Sequence modeling and design from molecular to genome scale with Evo. Eric Nguyen, Michael Poli, Matthew G. Durrant, Armin W. Thomas, Brian Kang, Jeremy Sullivan, Madelena Y. Ng, Ashley Lewis, Aman Patel, Aaron Lou, Stefano Ermon, Stephen A. Baccus, Tina Hernandez-Boussard, Christopher Ré, Patrick D. Hsu, and Brian L. Hie. https://www.biorxiv.org/content/10.1101/2024.02.27.582234v1.full.pdf
1345 27 4 Paving the way to efficient architectures: StripedHyena-7B, open source models offering a glimpse into a world beyond Transformers https://www.together.ai/blog/stripedhyena-7b
1346 27 5 https://api.together.xyz/playground/togethercomputer/StripedHyena-Hessian-7B
1347 27 6 https://github.com/HazyResearch/flash-fft-conv
1348 28 0 Z. Shao, P. Wang, Q. Zhu, R. Xu, J. Song, X. Bi, H. Zhang, M. Zhang, Y. K. Li, Y. Wu, and D. Guo. Deepseekmath: Pushing the limits of mathematical reasoning in open language models, 2024. URL https://arxiv.org/abs/2402.03300. https://qiita.com/kaizen_nagoya/items/2add7a80056850b9ce87
1349 28 1 R. Anil, S. Borgeaud, Y. Wu, J. Alayrac, J. Yu, R. Soricut, J. Schalkwyk, A. M. Dai, A. Hauth, K. Millican, D. Silver, S. Petrov, M. Johnson, I. Antonoglou, J. Schrittwieser, A. Glaese, J. Chen, E. Pitler, T. P. Lillicrap, A. Lazaridou, O. Firat, J. Molloy, M. Isard, P. R. Barham, T. Hennigan, B. Lee, F. Viola, M. Reynolds, Y. Xu, R. Doherty, E. Collins, C. Meyer, E. Rutherford, E. Moreira, K. Ayoub, M. Goel, G. Tucker, E. Piqueras, M. Krikun, I. Barr, N. Savinov, I. Danihelka, B. Roelofs, A. White, A. Andreassen, T. von Glehn, L. Yagati, M. Kazemi, L. Gonzalez, M. Khalman, J. Sygnowski, et al. Gemini: A family of highly capable multimodal models. CoRR, abs/2312.11805, 2023. doi: 10.48550/ARXIV.2312.11805. URL https://doi.org/10.48550/arXiv.2312.11805.
1350 28 2 J. Austin, A. Odena, M. Nye, M. Bosma, H. Michalewski, D. Dohan, E. Jiang, C. Cai, M. Terry, Q. Le, et al. Program synthesis with large language models. arXiv preprint arXiv:2108.07732, 2021.
1351 28 3 Z. Azerbayev, H. Schoelkopf, K. Paster, M. D. Santos, S. McAleer, A. Q. Jiang, J. Deng, S. Biderman, and S. Welleck. Llemma: An open language model for mathematics. arXiv preprint arXiv:2310.10631, 2023.
1352 28 4 J. Bai, S. Bai, Y. Chu, Z. Cui, K. Dang, X. Deng, Y. Fan, W. Ge, Y. Han, F. Huang, et al. Qwen technical report. arXiv preprint arXiv:2309.16609, 2023. C. Burns, P. Izmailov, J. H. Kirchner, B. Baker, L. Gao, L. Aschenbrenner, Y. Chen, A. Ecoffet, M. Joglekar, J. Leike, et al. Weak-to-strong generalization: Eliciting strong capabilities with weak supervision. arXiv preprint arXiv:2312.09390, 2023.
1353 28 5 ChatGLM3 Team. Chatglm3 series: Open bilingual chat llms, 2023. URL https://github.com/THUDM/ChatGLM3.
1354 28 6 M. Chen, J. Tworek, H. Jun, Q. Yuan, H. P. de Oliveira Pinto, J. Kaplan, H. Edwards, Y. Burda, N. Joseph, G. Brockman, A. Ray, R. Puri, G. Krueger, M. Petrov, H. Khlaaf, G. Sastry, P. Mishkin, B. Chan, S. Gray, N. Ryder, M. Pavlov, A. Power, L. Kaiser, M. Bavarian, C. Winter, P. Tillet, F. P. Such, D. Cummings, M. Plappert, F. Chantzis, E. Barnes, A. Herbert-Voss, W. H. Guss, A. Nichol, A. Paino, N. Tezak, J. Tang, I. Babuschkin, S. Balaji, S. Jain, W. Saunders, C. Hesse, A. N. Carr, J. Leike, J. Achiam, V. Misra, E. Morikawa, A. Radford, M. Knight, M. Brundage, M. Murati, K. Mayer, P. Welinder, B. McGrew, D. Amodei, S. McCandlish, I. Sutskever, and W. Zaremba. Evaluating large language models trained on code. CoRR, abs/2107.03374, 2021. URL https://arxiv.org/abs/2107.03374.
1355 28 7 W. Chen, X. Ma, X. Wang, and W. W. Cohen. Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. CoRR, abs/2211.12588, 2022. doi: 10.48550/ARXIV.2211.12588. URL https://doi.org/10.48550/arXiv.2211.12588.
1356 28 8 K. Cobbe, V. Kosaraju, M. Bavarian, M. Chen, H. Jun, L. Kaiser, M. Plappert, J. Tworek, J. Hilton, R. Nakano, et al. Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168, 2021.
1357 28 9 T. Computer. Redpajama: an open dataset for training large language models, Oct. 2023. URL https://github.com/togethercomputer/RedPajama-Data.
1358 28 10 DeepSeek-AI. Deepseek LLM: scaling open-source language models with longtermism. CoRR, abs/2401.02954, 2024. doi: 10.48550/ARXIV.2401.02954. URL https://doi.org/10.48550/arXiv.2401.02954.
1359 28 11 Z. Du, Y. Qian, X. Liu, M. Ding, J. Qiu, Z. Yang, and J. Tang. Glm: General language model pretraining with autoregressive blank infilling. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 320–335, 2022.
1360 28 12 L. Gao, A. Madaan, S. Zhou, U. Alon, P. Liu, Y. Yang, J. Callan, and G. Neubig. PAL: program-aided language models. In A. Krause, E. Brunskill, K. Cho, B. Engelhardt, S. Sabato, and J. Scarlett, editors, International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA, volume 202 of Proceedings of Machine Learning Research, pages 10764–10799. PMLR, 2023. URL https://proceedings.mlr.press/v202/gao23f.html.
1361 28 13 Z. Gou, Z. Shao, Y. Gong, Y. Shen, Y. Yang, M. Huang, N. Duan, and W. Chen. Tora: A tool-integrated reasoning agent for mathematical problem solving. CoRR, abs/2309.17452, 2023. doi: 10.48550/ARXIV.2309.17452. URL https://doi.org/10.48550/arXiv.2309.17452.
1362 28 14 D. Guo, Q. Zhu, D. Yang, Z. Xie, K. Dong, W. Zhang, G. Chen, X. Bi, Y. Wu, Y. K. Li, F. Luo, Y. Xiong, and W. Liang. Deepseek-coder: When the large language model meets programming – the rise of code intelligence, 2024.
1363 28 15 D. Hendrycks, C. Burns, S. Basart, A. Zou, M. Mazeika, D. Song, and J. Steinhardt. Measuring massive multitask language understanding. arXiv preprint arXiv:2009.03300, 2020.
1364 28 16 D. Hendrycks, C. Burns, S. Kadavath, A. Arora, S. Basart, E. Tang, D. Song, and J. Steinhardt. Measuring mathematical problem solving with the math dataset. arXiv preprint arXiv:2103.03874, 2021.
1365 28 17 High-flyer. HAI-LLM: An efficient and lightweight training tool for large models, 2023. URL https://www.high-flyer.cn/en/blog/hai-llm.
1366 28 18 Inflection AI. Inflection-2, 2023. URL https://inflection.ai/inflection-2.
1367 28 19 A. Q. Jiang, S. Welleck, J. P. Zhou, W. Li, J. Liu, M. Jamnik, T. Lacroix, Y. Wu, and G. Lample. Draft, sketch, and prove: Guiding formal theorem provers with informal proofs. arXiv preprint arXiv:2210.12283, 2022.
1368 28 20 A. Q. Jiang, A. Sablayrolles, A. Mensch, C. Bamford, D. S. Chaplot, D. d. l. Casas, F. Bressand, G. Lengyel, G. Lample, L. Saulnier, et al. Mistral 7b. arXiv preprint arXiv:2310.06825, 2023.
1369 28 21 A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jégou, and T. Mikolov. Fasttext. zip: Compressing text classification models. arXiv preprint arXiv:1612.03651, 2016.
1370 28 22 W. Kwon, Z. Li, S. Zhuang, Y. Sheng, L. Zheng, C. H. Yu, J. E. Gonzalez, H. Zhang, and I. Stoica. Efficient memory management for large language model serving with pagedattention. In Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles, 2023.
1371 28 23 Y. Leviathan, M. Kalman, and Y. Matias. Fast inference from transformers via speculative decoding. In International Conference on Machine Learning, pages 19274–19286. PMLR, 2023.
1372 28 24 A. Lewkowycz, A. Andreassen, D. Dohan, E. Dyer, H. Michalewski, V. Ramasesh, A. Slone, C. Anil, I. Schlag, T. Gutman-Solo, et al. Solving quantitative reasoning problems with language models. Advances in Neural Information Processing Systems, 35:3843–3857, 2022a.
1373 28 25 A. Lewkowycz, A. Andreassen, D. Dohan, E. Dyer, H. Michalewski, V. V. Ramasesh, A. Slone, C. Anil, I. Schlag, T. Gutman-Solo, Y. Wu, B. Neyshabur, G. Gur-Ari, and V. Misra. Solving quantitative reasoning problems with language models. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022, 2022b. URL http://papers.nips.cc/paper_files/paper/2022/hash/18abbeef8cfe9203fdf9053c9c4fe191-Abstract-Conference.html.
1374 28 26 H. Lightman, V. Kosaraju, Y. Burda, H. Edwards, B. Baker, T. Lee, J. Leike, J. Schulman, I. Sutskever, and K. Cobbe. Let’s verify step by step. arXiv preprint arXiv:2305.20050, 2023.
1375 28 27 I. Loshchilov and F. Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
1376 28 28 H. Luo, Q. Sun, C. Xu, P. Zhao, J. Lou, C. Tao, X. Geng, Q. Lin, S. Chen, and D. Zhang. Wizardmath: Empowering mathematical reasoning for large language models via reinforced evol-instruct. arXiv preprint arXiv:2308.09583, 2023.
1377 28 29 S. Mishra, M. Finlayson, P. Lu, L. Tang, S. Welleck, C. Baral, T. Rajpurohit, O. Tafjord, A. Sabharwal, P. Clark, and A. Kalyan. LILA: A unified benchmark for mathematical reasoning. In Y. Goldberg, Z. Kozareva, and Y. Zhang, editors, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi, United Arab Emirates, December 7-11, 2022, pages 5807–5832. Association for Computational Linguistics, 2022. doi: 10.18653/V1/2022.EMNLP-MAIN.392. URL https://doi.org/10.18653/v1/2022.emnlp-main.392.
1378 28 30 X. Nguyen, W. Zhang, X. Li, M. M. Aljunied, Q. Tan, L. Cheng, G. Chen, Y. Deng, S. Yang, C. Liu, H. Zhang, and L. Bing. Seallms - large language models for southeast asia. CoRR, abs/2312.00738, 2023. doi: 10.48550/ARXIV.2312.00738. URL https://doi.org/10.48550/arXiv.2312.00738.
1379 28 31 OpenAI. GPT4 technical report. arXiv preprint arXiv:2303.08774, 2023.
1380 28 32 L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, et al. Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35:27730–27744, 2022.
1381 28 33 K. Paster, M. D. Santos, Z. Azerbayev, and J. Ba. Openwebmath: An open dataset of high-quality mathematical web text. CoRR, abs/2310.06786, 2023. doi: 10.48550/ARXIV.2310.06786. URL https://doi.org/10.48550/arXiv.2310.06786.
1382 28 34 L. C. Paulson. Three years of experience with sledgehammer, a practical link between automatic and interactive theorem provers. In R. A. Schmidt, S. Schulz, and B. Konev, editors, Proceedings of the 2nd Workshop on Practical Aspects of Automated Reasoning, PAAR-2010, Edinburgh, Scotland, UK, July 14, 2010, volume 9 of EPiC Series in Computing, pages 1–10. EasyChair, 2010. doi: 10.29007/TNFD. URL https://doi.org/10.29007/tnfd.
1383 28 35 S. Polu and I. Sutskever. Generative language modeling for automated theorem proving. CoRR, abs/2009.03393, 2020. URL https://arxiv.org/abs/2009.03393.
1384 28 36 R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, and C. Finn. Direct preference optimization: Your language model is secretly a reward model. 2023.
1385 28 37 J. Schulman. Approximating kl divergence, 2020. URL http://joschu.net/blog/kl-approx.html.
1386 28 38 J. Schulman, P. Moritz, S. Levine, M. Jordan, and P. Abbeel. High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438, 2015.
1387 28 39 J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.
1388 28 40 F. Shi, M. Suzgun, M. Freitag, X. Wang, S. Srivats, S. Vosoughi, H. W. Chung, Y. Tay, S. Ruder, D. Zhou, D. Das, and J. Wei. Language models are multilingual chain-of-thought reasoners. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023. URL https://openreview.net/pdf?id=fR3wGCk-IXp.
1389 28 41 F. Song, B. Yu, M. Li, H. Yu, F. Huang, Y. Li, and H. Wang. Preference ranking optimization for human alignment. arXiv preprint arXiv:2306.17492, 2023.
1390 28 42 M. Suzgun, N. Scales, N. Schärli, S. Gehrmann, Y. Tay, H. W. Chung, A. Chowdhery, Q. V. Le, E. H. Chi, D. Zhou, et al. Challenging big-bench tasks and whether chain-of-thought can solve them. arXiv preprint arXiv:2210.09261, 2022.
1391 28 43 T. Tao. Embracing change and resetting expectations, 2023. URL https://unlocked.microsoft.com/ai-anthology/terence-tao/.
1392 28 44 H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale, D. Bikel, L. Blecher, C. Canton-Ferrer, M. Chen, G. Cucurull, D. Esiobu, J. Fernandes, J. Fu, W. Fu, B. Fuller, C. Gao, V. Goswami, N. Goyal, A. Hartshorn, S. Hosseini, R. Hou, H. Inan, M. Kardas, V. Kerkez, M. Khabsa, I. Kloumann, A. Korenev, P. S. Koura, M. Lachaux, T. Lavril, J. Lee, D. Liskovich, Y. Lu, Y. Mao, X. Martinet, T. Mihaylov, P. Mishra, I. Molybog, Y. Nie, A. Poulton, J. Reizenstein, R. Rungta, K. Saladi, A. Schelten, R. Silva, E. M. Smith, R. Subramanian, X. E. Tan, B. Tang, R. Taylor, A. Williams, J. X. Kuan, P. Xu, Z. Yan, I. Zarov, Y. Zhang, A. Fan, M. Kambadur, S. Narang, A. Rodriguez, R. Stojnic, S. Edunov, and T. Scialom. Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288, 2023. doi: 10.48550/arXiv.2307.09288. URL https://doi.org/10.48550/arXiv.2307.09288.
1393 28 45 T. H. Trinh, Y. Wu, Q. V. Le, H. He, and T. Luong. Solving olympiad geometry without human demonstrations. Nature, 625(7995):476–482, 2024.
1394 28 46 P. Wang, L. Li, L. Chen, F. Song, B. Lin, Y. Cao, T. Liu, and Z. Sui. Making large language models better reasoners with alignment. arXiv preprint arXiv:2309.02144, 2023a.
1395 28 47 P. Wang, L. Li, Z. Shao, R. Xu, D. Dai, Y. Li, D. Chen, Y. Wu, and Z. Sui. Math-shepherd: Verify and reinforce llms step-by-step without human annotations. CoRR, abs/2312.08935, 2023b.
1396 28 48 Z. Wang, R. Xia, and P. Liu. Generative AI for math: Part I - mathpile: A billion-token-scale pretraining corpus for math. CoRR, abs/2312.17120, 2023c. doi: 10.48550/ARXIV.2312.17120. URL https://doi.org/10.48550/arXiv.2312.17120.
1397 28 49 J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. H. Chi, Q. V. Le, and D. Zhou. Chain-of-thought prompting elicits reasoning in large language models. In NeurIPS, 2022. URL http://papers.nips.cc/paper_files/paper/2022/hash/9d5609613524ecf4f15af0f7b31abca4-Abstract-Conference.html.
1398 28 50 T. Wei, J. Luan, W. Liu, S. Dong, and B. Wang. Cmath: Can your language model pass chinese elementary school math test?, 2023.
1399 28 51 M. Wenzel, L. C. Paulson, and T. Nipkow. The isabelle framework. In O. A. Mohamed, C. A. Muñoz, and S. Tahar, editors, Theorem Proving in Higher Order Logics, 21st International Conference, TPHOLs 2008, Montreal, Canada, August 18-21, 2008. Proceedings, volume 5170 of Lecture Notes in Computer Science, pages 33–38. Springer, 2008. doi: 10.1007/978-3-540-71067-7_7. URL https://doi.org/10.1007/978-3-540-71067-7_7.
1400 28 52 H. Xia, T. Ge, P. Wang, S.-Q. Chen, F. Wei, and Z. Sui. Speculative decoding: Exploiting speculative execution for accelerating seq2seq generation. In H. Bouamor, J. Pino, and K. Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, pages 3909–3925, Singapore, Dec. 2023. Association for Computational Linguistics. doi: 10.18653/v1/2023.findings-emnlp.257. URL https://aclanthology.org/2023.findings-emnlp.257.
1401 28 53 H. Xia, Z. Yang, Q. Dong, P. Wang, Y. Li, T. Ge, T. Liu, W. Li, and Z. Sui. Unlocking efficiency in large language model inference: A comprehensive survey of speculative decoding. arXiv preprint arXiv:2401.07851, 2024.
1402 28 54 S. Yao, D. Yu, J. Zhao, I. Shafran, T. L. Griffiths, Y. Cao, and K. Narasimhan. Tree of thoughts: Deliberate problem solving with large language models. arXiv preprint arXiv:2305.10601, 2023.
1403 28 55 L. Yu, W. Jiang, H. Shi, J. Yu, Z. Liu, Y. Zhang, J. T. Kwok, Z. Li, A. Weller, and W. Liu. Metamath: Bootstrap your own mathematical questions for large language models. CoRR, abs/2309.12284, 2023. doi: 10.48550/ARXIV.2309.12284. URL https://doi.org/10.48550/arXiv.2309.12284.
1404 28 56 Z. Yuan, H. Yuan, C. Li, G. Dong, C. Tan, and C. Zhou. Scaling relationship on learning mathematical reasoning with large language models. arXiv preprint arXiv:2308.01825, 2023a.
1405 28 57 Z. Yuan, H. Yuan, C. Tan, W. Wang, S. Huang, and F. Huang. Rrhf: Rank responses to align language models with human feedback without tears. arXiv preprint arXiv:2304.05302, 2023b.
1406 28 58 X. Yue, X. Qu, G. Zhang, Y. Fu, W. Huang, H. Sun, Y. Su, and W. Chen. Mammoth: Building math generalist models through hybrid instruction tuning. CoRR, abs/2309.05653, 2023. doi: 10.48550/ARXIV.2309.05653. URL https://doi.org/10.48550/arXiv.2309.05653.
1407 28 59 K. Zheng, J. M. Han, and S. Polu. Minif2f: a cross-system benchmark for formal olympiad-level mathematics. arXiv preprint arXiv:2109.00110, 2021.
1408 28 60 W. Zhong, R. Cui, Y. Guo, Y. Liang, S. Lu, Y. Wang, A. Saied, W. Chen, and N. Duan. AGIEval: A human-centric benchmark for evaluating foundation models. CoRR, abs/2304.06364, 2023. doi: 10.48550/arXiv.2304.06364. URL https://doi.org/10.48550/arXiv.2304.06364.
1409 29 0 S. T. Sherry, M. H. Ward, M. Kholodov, J. Baker, L. Phan, E. M. Smigielski, and K. Sirotkin. dbSNP: the NCBI database of genetic variation. Nucleic Acids Research, 29:308–311, 1 2001. ISSN 0305-1048. doi: 10.1093/NAR/29.1.308. URL https://dx.doi.org/10.1093/nar/29.1.308. https://qiita.com/kaizen_nagoya/items/756da32e4c0868d84da0
1410 29 1 1 Kruglyak,L. (1999) Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nature Genet., 22, 139–144.
1411 29 2 2 Carulli,J.P., Artinger,M., Swain,P.M., Root,C.D., Chee,L., Tulig,C., Guerin,J., Osborne,M., Stein,G., Lian,J. and Lomedico,P.T. (1998) High throughput analysis of differential gene expression. J. Cell. Biochem., 30–31 (Suppl.), 286–296.
1412 29 3 3 Cavalli-Sforza,L.L. (1998) The DNA revolution in population genetics. Trends Genet., 14, 60–65.
1413 29 4 4 Collins,F.S. (1999) Shattuck lecture–medical and societal consequences of the Human Genome Project. N. Engl. J. Med., 341, 28–37.
1414 29 5 5 Buetow,K.H., Edmonson,M.N. and Cassidy,A.B. (1999) Reliable identification of large numbers of candidate SNPs from public EST data. Nature Genet., 21, 323–325.
1415 29 6 6 Masood,E. (1999) As consortium plans free SNP map of human genome. Nature, 398, 545–546.
1416 29 7 7 Brookes,A.J., Lehväslaiho,H., Siegfried,M., Boehm,J.G., Yuan,Y.P., Sarkar,C.M., Bork,P. and Ortigao,F. (2000) HGBASE: a database of SNPs and other variations in and around human genes. Nucleic Acids Res., 28, 356–360.
1417 30 0 Z. Sondka, N. B. Dhir, D. Carvalho-Silva, S. Jupe, Madhumita, K. McLaren, M. Starkey, S. Ward, J. Wilding, M. Ahmed, J. Argasinska, D. Beare, M. S. Chawla, S. Duke, I. Fasanella, A. G. Neogi, S. Haller, B. Hetenyi, L. Hodges, A. Holmes, R. Lyne, T. Maurel, S. Nair, H. Pedro, A. Sangrador-Vegas, H. Schuilenburg, Z. Sheard, S. Y. Yong, and J. Teague. Cosmic: a curated database of somatic variants and clinical data for cancer. Nucleic Acids Research, 52:D1210–D1217, 1 2024. ISSN 0305-1048. doi: 10.1093/NAR/GKAD986. URL https://dx.doi.org/10.1093/nar/gkad986. https://qiita.com/kaizen_nagoya/items/2b0960d4e1ff26a9b01f
1418 30 1 Chang K., Creighton C.J., Davis C., Donehower L., Drummond J., Wheeler D., Ally A., Balasundaram M., Birol I., Butterfield Y.S.N.et al. . The cancer genome atlas pan-cancer analysis project. Nat. Genet. 2013; 45:1113–1120.
1419 30 2 ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes. Nature. 2020; 578:82–93.
1420 30 3 Pugh T.J., Bell J.L., Bruce J.P., Doherty G.J., Galvin M., Green M.F., Hunter-Zinck H., Kumari P., Lenoue-Newton M.L., Li M.M.et al. . AACR Project GENIE: 100,000 cases and beyond. Cancer Discov. 2022; 12:2044–2057.
1421 30 4 Forbes S.A., Beare D., Boutselakis H., Bamford S., Bindal N., Tate J., Cole C.G., Ward S., Dawson E., Ponting L.et al. . COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res. 2017; 45:D777–D783.
1422 30 5 Tate J.G., Bamford S., Jubb H.C., Sondka Z., Beare D.M., Bindal N., Boutselakis H., Cole C.G., Creatore C., Dawson E.et al. . COSMIC: the catalogue of somatic mutations In cancer. Nucleic Acids Res. 2019; 47:D941–D947.
1423 30 6 Bamford S., Dawson E., Forbes S., Clements J., Pettett R., Dogan A., Flanagan A., Teague J., Futreal P.A., Stratton M.R.et al. . The COSMIC (Catalogue of Somatic Mutations in Cancer) database and website. Br. J. Cancer. 2004; 91:355–358.
1424 30 7 Sondka Z., Bamford S., Cole C.G., Ward S.A., Dunham I., Forbes S.A. The COSMIC Cancer Gene Census: describing genetic dysfunction across all human cancers. Nat. Rev. Cancer. 2018; 18:696–705.
1425 30 8 Wilkinson M.D., Dumontier M., Aalbersberg I.J., Appleton G., Axton M., Baak A., Blomberg N., Boiten J.-W., da Silva Santos L.B., Bourne P.E.et al. . The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data. 2016; 3:160018.
1426 30 9 Hudson T.J., Anderson W., Aretz A., Barker A.D., Bell C., Bernabé R.R., Bhan M.K., Calvo F., Eerola I., Gerhard D.S.et al. . International network of cancer genome projects. Nature. 2010; 464:993–998.
1427 30 10 Seal R.L., Braschi B., Gray K., Jones T.E.M., Tweedie S., Haim-Vilmovsky L., Bruford E.A. Genenames.org: the HGNC resources in 2023. Nucleic Acids Res. 2023; 51:D1003–D1009.
1428 30 11 Eilbeck K., Lewis S.E., Mungall C.J., Yandell M., Stein L., Durbin R., Ashburner M. The Sequence ontology: a tool for the unification of genome annotations. Genome Biol. 2005; 6:R44.
1429 30 12 den Dunnen J.T., Dalgleish R., Maglott D.R., Hart R.K., Greenblatt M.S., McGowan-Jordan J., Roux A.F., Smith T., Antonarakis S.E., Taschner P.E. HGVS recommendations for the description of sequence variants: 2016 update. Hum. Mutat. 2016; 37:564–569.
1430 30 13 Martin F.J., Amode M.R., Aneja A., Austine-Orimoloye O., Azov A.G., Barnes I., Becker A., Bennett R., Berry A., Bhai J.et al. . Ensembl 2023. Nucleic Acids Res. 2023; 51:D933–D941.
1431 30 14 Sioutos N., Coronado S.d., Haber M.W., Hartel F.W., Shaiu W.-L., Wright L.W. NCI Thesaurus: a semantic model integrating cancer-related clinical and molecular information. J. Biomed. Inform. 2007; 40:30–43.
1432 30 15 Frankish A., Diekhans M., Jungreis I., Lagarde J., Loveland J.E., Mudge J.M., Sisu C., Wright J.C., Armstrong J., Barnes I.et al. . GENCODE 2021. Nucleic Acids Res. 2021; 49:D916–D923.
1433 30 16 McLaren W., Gil L., Hunt S.E., Riat H.S., Ritchie G.R.S., Thormann A., Flicek P., Cunningham F. The Ensembl variant effect predictor. Genome Biol. 2016; 17:122.
1434 30 17 Alexandrov L.B., Kim J., Haradhvala N.J., Huang M.N., Tian Ng A.W., Wu Y., Boot A., Covington K.R., Gordenin D.A., Bergstrom E.N.et al. . The repertoire of mutational signatures in human cancer. Nature. 2020; 578:94–101.
1435 30 18 Koh G., Degasperi A., Zou X., Momen S., Nik-Zainal S. Mutational signatures: emerging concepts, caveats and clinical applications. Nat. Rev. Cancer. 2021; 21:619–637.
1436 30 19 Islam S.M.A., Díaz-Gay M., Wu Y., Barnes M., Vangara R., Bergstrom E.N., He Y., Vella M., Wang J., Teague J.W.et al. . Uncovering novel mutational signatures by de novo extraction with SigProfilerExtractor. Cell Genomics. 2022; 2:100179. doi: 10.1016/j.xgen.2022.100179.
1437 30 20 Landrum M.J., Lee J.M., Benson M., Brown G.R., Chao C., Chitipiralla S., Gu B., Hart J., Hoffman D., Jang W.et al. . ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 2018; 46:D1062–D1067.
1438 30 21 Karczewski K.J., Francioli L.C., Tiao G., Cummings B.B., Alföldi J., Wang Q., Collins R.L., Laricchia K.M., Ganna A., Birnbaum D.P.et al. . The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020; 581:434–443.
1439 30 22 Martincorena I., Raine K.M., Gerstung M., Dawson K.J., Haase K., Van Loo P., Davies H., Stratton M.R., Campbell P.J. Universal patterns of selection in cancer and somatic tissues. Cell. 2017; 171:1029–1041.
1440 30 23 Cooper G.M., Stone E.A., Asimenos G., Green E.D., Batzoglou S., Sidow A. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 2005; 15:901–913.
1441 30 24 Vaser R., Adusumalli S., Leng S.N., Sikic M., Ng P.C. SIFT missense predictions for genomes. Nat. Protoc. 2016; 11:1–9.
1442 30 25 Hanahan D. Hallmarks of cancer: new dimensions. Cancer Discov. 2022; 12:31–46.
1443 30 26 Hanahan D., Weinberg R.A. The hallmarks of cancer. Cell. 2000; 100:57–70.
1444 30 27 Hanahan D., Weinberg R.A. Hallmarks of cancer: the next generation. Cell. 2011; 144:646–674.
1445 30 28 Eisenhauer E.A., Therasse P., Bogaerts J., Schwartz L.H., Sargent D., Ford R., Dancey J., Arbuck S., Gwyther S., Mooney M.et al. . New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur. J. Cancer. 2009; 45:228–247.
1446 30 29 Jubb H.C., Saini H.K., Verdonk M.L., Forbes S.A. COSMIC-3D provides structural perspectives on cancer genetics for drug discovery. Nat. Genet. 2018; 50:1200–1202.
1447 30 30 Ju D., Hui D., Hammond D.A., Wonkam A., Tishkoff S.A. Importance of including non-European populations in large human genetic studies to enhance precision medicine. Annu. Rev. Biomed. Data Sci. 2022; 5:321–339.
1448 30 31 Jia F., Teer J.K., Knepper T.C., Lee J.K., Zhou H.H., He Y.J., McLeod H.L. Discordance of somatic mutations between Asian and Caucasian patient populations with gastric cancer. Mol. Diagn. Ther. 2017; 21:179–185.
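Block 30 turns on somatic-variant nomenclature (entry 30-12, the HGVS recommendations, and entry 30-16, the Ensembl VEP), so a minimal illustrative sketch may help. It is not taken from any cited paper: the regex below handles only the simplest HGVS coding-substitution form such as `c.1624G>A`, while a real parser needs the full HGVS grammar.

```python
import re

# Toy parser for the simplest HGVS coding substitution, e.g. "c.1624G>A".
# Real HGVS (den Dunnen et al., 2016) also covers deletions, insertions,
# duplications, intronic offsets, and protein-level notation, all of which
# this sketch deliberately ignores.
HGVS_SUB = re.compile(r"^c\.(?P<pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$")

def parse_coding_substitution(hgvs: str) -> dict:
    m = HGVS_SUB.match(hgvs)
    if m is None:
        raise ValueError(f"not a simple coding substitution: {hgvs!r}")
    return {"position": int(m["pos"]), "ref": m["ref"], "alt": m["alt"]}

print(parse_coding_substitution("c.1624G>A"))
# {'position': 1624, 'ref': 'G', 'alt': 'A'}
```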
1449 31 0 J. Su, Y. Lu, S. Pan, A. Murtadha, B. Wen, and Y. Liu. Roformer: Enhanced transformer with rotary position embedding, 2023. URL https://arxiv.org/abs/2104.09864. https://qiita.com/kaizen_nagoya/items/a12a45518f28a5133af2
1450 31 1 Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann N Dauphin. Convolutional sequence to sequence learning. In International Conference on Machine Learning, pages 1243–1252. PMLR, 2017.
1451 31 2 Md. Amirul Islam, Sen Jia, and Neil D. B. Bruce. How much position information do convolutional neural networks encode? ArXiv, abs/2001.08248, 2020.
1452 31 3 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. URL https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
1453 31 4 J. Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT, 2019.
1454 31 5 A. Radford, Jeffrey Wu, R. Child, David Luan, Dario Amodei, and Ilya Sutskever. Language models are unsupervised multitask learners. 2019.
1455 31 6 Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank Reddi, and Sanjiv Kumar. Are transformers universal approximators of sequence-to-sequence functions? In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=ByxRM0Ntvr.
1456 31 7 Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. Albert: A lite bert for self-supervised learning of language representations. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=H1eA7AEtvS.
1457 31 8 Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. ELECTRA: Pre-training text encoders as discriminators rather than generators. In ICLR, 2020. URL https://openreview.net/pdf?id=r1xMH1BtvB.
1458 31 9 A. Radford and Karthik Narasimhan. Improving language understanding by generative pre-training. 2018.
1459 31 10 Ankur P. Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. A decomposable attention model for natural language inference. In EMNLP, 2016.
1460 31 11 Peter Shaw, Jakob Uszkoreit, and Ashish Vaswani. Self-attention with relative position representations. In NAACL-HLT, 2018.
1461 31 12 Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, I. Simon, C. Hawthorne, Andrew M. Dai, M. Hoffman, M. Dinculescu, and D. Eck. Music transformer. arXiv: Learning, 2018.
1462 31 13 Zihang Dai, Z. Yang, Yiming Yang, J. Carbonell, Quoc V. Le, and R. Salakhutdinov. Transformer-xl: Attentive language models beyond a fixed-length context. In ACL, 2019.
1463 31 14 Z. Yang, Zihang Dai, Yiming Yang, J. Carbonell, R. Salakhutdinov, and Quoc V. Le. Xlnet: Generalized autoregressive pretraining for language understanding. In NeurIPS, 2019.
1464 31 15 Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, W. Li, and Peter J. Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res., 21: 140:1–140:67, 2020.
1465 31 16 Guolin Ke, Di He, and T. Liu. Rethinking positional encoding in language pre-training. ArXiv, abs/2006.15595, 2020.
1466 31 17 Pengcheng He, Xiaodong Liu, Jianfeng Gao, and Weizhu Chen. Deberta: Decoding-enhanced bert with disentangled attention. ArXiv, abs/2006.03654, 2020.
1467 31 18 Zhiheng Huang, Davis Liang, Peng Xu, and Bing Xiang. Improve transformer models with better relative position embeddings. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 3327–3335, Online, November 2020. Association for Computational Linguistics. doi:10.18653/v1/2020.findings-emnlp.298. URL https://www.aclweb.org/anthology/2020.findings-emnlp.298.
1468 31 19 Xuanqing Liu, Hsiang-Fu Yu, Inderjit S. Dhillon, and Cho-Jui Hsieh. Learning to encode position for transformer with continuous dynamical model. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event, volume 119 of Proceedings of Machine Learning Research, pages 6327–6335. PMLR, 2020. URL http://proceedings.mlr.press/v119/liu20n.html.
1469 31 20 Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural ordinary differential equations. In Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicolò Cesa-Bianchi, and Roman Garnett, editors, Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada, pages 6572–6583, 2018a. URL https://proceedings.neurips.cc/paper/2018/hash/69386f6bb1dfed68692a24c8686939b9-Abstract.html.
1470 31 21 Benyou Wang, Donghao Zhao, Christina Lioma, Qiuchi Li, Peng Zhang, and Jakob Grue Simonsen. Encoding word order in complex embeddings. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=Hke-WTVtwr.
1471 31 22 Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, and François Fleuret. Transformers are rnns: Fast autoregressive transformers with linear attention. In International Conference on Machine Learning, pages 5156–5165. PMLR, 2020.
1472 31 23 Zhuoran Shen, Mingyuan Zhang, Haiyu Zhao, Shuai Yi, and Hongsheng Li. Efficient attention: Attention with linear complexities. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 3531–3539, 2021.
1473 31 24 Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel Bowman. GLUE: A multi-task benchmark and analysis platform for natural language understanding. 04 2018.
1474 31 25 Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, A. Gane, Tamás Sarlós, Peter Hawkins, J. Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, and Adrian Weller. Rethinking attention with performers. ArXiv, abs/2009.14794, 2020.
1475 31 26 Ondrej Bojar, Christian Buck, Christian Federmann, Barry Haddow, Philipp Koehn, Johannes Leveling, Christof Monz, Pavel Pecina, Matt Post, Herve Saint-Amand, Radu Soricut, Lucia Specia, and Aleš Tamchyna. Findings of the 2014 workshop on statistical machine translation. pages 12–58, 06 2014. doi:10.3115/v1/W14-3302.
1476 31 27 Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. 08 2015.
1477 31 28 Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, and Michael Auli. fairseq: A fast, extensible toolkit for sequence modeling. pages 48–53, 01 2019. doi:10.18653/v1/N19-4009.
1478 31 29 Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. Bleu: a method for automatic evaluation of machine translation. 10 2002. doi:10.3115/1073083.1073135.
1479 31 30 Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. Aligning books and movies: Towards story-like visual explanations by watching movies and reading books. In arXiv preprint arXiv:1506.06724, 2015.
1480 31 31 Wikimedia Foundation. Wikimedia downloads, https://dumps.wikimedia.org, 2021.
1481 31 32 Ilya Loshchilov and Frank Hutter. Decoupled Weight Decay Regularization. arXiv e-prints, art. arXiv:1711.05101, November 2017.
1482 31 33 William B. Dolan and Chris Brockett. Automatically constructing a corpus of sentential paraphrases. In Proceedings of the Third International Workshop on Paraphrasing (IWP2005), 2005. URL https://www.aclweb.org/anthology/I05-5002.
1483 31 34 Richard Socher, A. Perelygin, J.Y. Wu, J. Chuang, C.D. Manning, A.Y. Ng, and C. Potts. Recursive deep models for semantic compositionality over a sentiment treebank. EMNLP, 1631:1631–1642, 01 2013.
1484 31 35 Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. Squad: 100,000+ questions for machine comprehension of text. pages 2383–2392, 01 2016. doi:10.18653/v1/D16-1264.
1485 31 36 Hussein Al-Natsheh. UdL at SemEval-2017 Task 1: Semantic textual similarity estimation of English sentence pairs using regression model over pairwise features. 08 2017.
1486 31 37 Z. Chen, H. Zhang, X. Zhang, and L. Zhao. Quora question pairs, 2018b. URL https://www.kaggle.com/c/quora-question-pairs.
1487 31 38 Adina Williams, Nikita Nangia, and Samuel Bowman. A broad-coverage challenge corpus for sentence understanding through inference. pages 1112–1122, 01 2018. doi:10.18653/v1/N18-1101.
1488 31 39 Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander M. Rush. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online, October 2020. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/2020.emnlp-demos.6.
1489 31 40 Matt Mahoney. Large text compression benchmark, http://www.mattmahoney.net/dc/text.html, 2006.
1490 31 41 Jianlin Su. WoBERT: Word-based Chinese BERT model - ZhuiyiAI. Technical report, 2020. URL https://github.com/ZhuiyiTechnology/WoBERT.
1491 31 42 Victor Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen, and Qun Liu. NEZHA: Neural contextualized representation for Chinese language understanding. 08 2019.
1492 31 43 Chaojun Xiao, Haoxi Zhong, Zhipeng Guo, Cunchao Tu, Zhiyuan Liu, Maosong Sun, Tianyang Zhang, Xianpei Han, Zhen Hu, Heng Wang, and Jianfeng Xu. CAIL2019-SCM: A dataset of similar case matching in legal domain. 11 2019.
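Block 31 is the RoFormer paper, whose core idea is rotary position embedding (RoPE): each even/odd channel pair of a query or key vector is rotated by an angle proportional to its position, so the attention dot product depends on positions only through their offset. Below is a minimal NumPy sketch of that rotation, illustrative only and not the authors' code:

```python
import numpy as np

def rotary_embed(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embedding to x of shape (seq_len, dim)."""
    seq_len, dim = x.shape
    assert dim % 2 == 0, "RoPE pairs channels, so dim must be even"
    # One rotation frequency per channel pair, as in Su et al. (entry 31).
    inv_freq = base ** (-2.0 * np.arange(dim // 2) / dim)    # (dim/2,)
    angles = np.outer(np.arange(seq_len), inv_freq)          # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]                          # channel pairs
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin                       # 2-D rotation
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Rotating queries and keys this way makes q_m . k_n a function of m - n.
q = rotary_embed(np.random.randn(8, 64))
k = rotary_embed(np.random.randn(8, 64))
```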
1493 32 0 E. Wang, S. Schmidgall, P. F. Jaeger, F. Zhang, R. Pilgrim, Y. Matias, J. Barral, D. Fleet, and S. Azizi. TxGemma: Efficient and agentic LLMs for therapeutics. https://arxiv.org/pdf/2504.06196 https://qiita.com/kaizen_nagoya/items/e4eff5d51f926e943b9e
1494 32 1 Chen, J., Hu, Y., Wang, Y., Lu, Y., Cao, X., Lin, M., Xu, H., Wu, J., Xiao, C., Sun, J., et al. TrialBench: Multi-modal artificial intelligence-ready clinical trial datasets. arXiv preprint arXiv:2407.00631 (2024).
1495 32 2 Kuo, K.-T., Mao, T.-L., Jones, S., Veras, E., Ayhan, A., Wang, T.-L., Glas, R., Slamon, D., Velculescu, V. E., Kurman, R. J., et al. Frequent activating mutations of PIK3CA in ovarian clear cell carcinoma. The American journal of pathology 174, 1597–1601 (2009).
1496 32 3 Leontiadou, H., Galdadas, I., Athanasiou, C. & Cournia, Z. Insights into the mechanism of the PIK3CA E545K activating mutation using MD simulations. Scientific reports 8, 15544 (2018).
1497 32 4 Chen, H., Si, Y., Wen, J., Hu, C., Xia, E., Wang, Y. & Wang, O. P110α inhibitor alpelisib exhibits a synergistic effect with pyrotinib and reverses pyrotinib resistance in HER2+ breast cancer. Neoplasia 43, 100913 (2023).
1498 32 5 Fritsch, C., Huang, A., Chatenay-Rivauday, C., Schnell, C., Reddy, A., Liu, M., Kauffmann, A., Guthy, D., Erdmann, D., De Pover, A., et al. Characterization of the novel and specific PI3Kα inhibitor NVP-BYL719 and development of the patient stratification strategy for clinical trials. Molecular cancer therapeutics 13, 1117–1129 (2014).
1499 32 6 Narayan, P., Prowell, T. M., Gao, J. J., Fernandes, L. L., Li, E., Jiang, X., Qiu, J., Fan, J., Song, P., Yu, J., et al. FDA approval summary: alpelisib plus fulvestrant for patients with HR-positive, HER2-negative, PIK3CA-mutated, advanced or metastatic breast cancer. Clinical Cancer Research 27, 1842–1849 (2021).
1500 32 7 Passarelli, A., Carbone, V., Pignata, S., Mazzeo, R., Lorusso, D., Scambia, G., Canova, S., Di Palma, T., Tasca, G., Mantiero, M., et al. Alpelisib for PIK3CA-mutated advanced gynecological cancers: first clues of clinical activity. Gynecologic Oncology 183, 61–67 (2024).
1501 32 8 Thibault, B., Thole, A., D’Angelo, R., Basset, C. & Guillermet-Guibert, J. PI3Kα-specific inhibitor BYL-719 synergizes with cisplatin in vitro in PIK3CA-mutated ovarian cancer cells. Scientific Reports 15, 6265 (2025).
1502 32 9 Hu, X., Xia, M., Wang, J., Yu, H., Chai, J., Zhang, Z., Sun, Y., Su, J. & Sun, L. Dual PI3K/mTOR inhibitor PKI-402 suppresses the growth of ovarian cancer cells by degradation of Mcl-1 through autophagy. Biomedicine & Pharmacotherapy 129, 110397 (2020).
1503 32 10 Turon, G., Hlozek, J., Woodland, J. G., Kumar, A., Chibale, K. & Duran-Frigola, M. First fully-automated AI/ML virtual screening cascade implemented at a drug discovery centre in Africa. Nature Communications 14, 5736 (2023).
1504 32 11 Fontenot, R., Kathad, U., McDermott, J., Sturtevant, D., Sharma, P. & Carr, P. Predicting a Compound’s Blood-Brain-Barrier Permeability with Lantern Pharma’s AI and ML Platform, RADR 2023.
1505 32 12 Bera, S., Dent, J., Gill, G., Stolman, A. & Wu, B. SimGCN for TDC Benchmarks (2022).
1506 32 13 Plonka, W., Stork, C., Šícho, M. & Kirchmair, J. CYPlebrity: Machine learning models for the prediction of inhibitors of cytochrome P450 enzymes. Bioorganic & medicinal chemistry 46, 116388 (2021).
1507 32 14 Hu, W., Liu, B., Gomes, J., Zitnik, M., Liang, P., Pande, V. & Leskovec, J. Strategies for pre-training graph neural networks. arXiv preprint arXiv:1905.12265 (2019).
1508 32 15 Huang, K., Fu, T., Glass, L. M., Zitnik, M., Xiao, C. & Sun, J. DeepPurpose: a deep learning library for drug–target interaction prediction. Bioinformatics 36, 5545–5547 (2020).
1509 32 16 Lagunin, A., Filimonov, D., Zakharov, A., Xie, W., Huang, Y., Zhu, F., Shen, T., Yao, J. & Poroikov, V. Computer-aided prediction of rodent carcinogenicity by PASS and CISOC-PSCT. QSAR & Combinatorial Science 28, 806–810 (2009).
1510 32 17 Li, P., Li, Y., Hsieh, C.-Y., Zhang, S., Liu, X., Liu, H., Song, S. & Yao, X. TrimNet: learning molecular representation from triplet messages for biomedicine. Briefings in Bioinformatics 22, bbaa266 (2021).
1511 32 18 Huang, D., Chowdhuri, S. R., Li, A., Li, A., Agrawal, A., Gano, K. & Zhu, A. A Unified System for Molecular Property Predictions: Oloren ChemEngine and its Applications (2022).
1512 32 19 Li, J., Cai, D. & He, X. Learning graph-level representation for drug discovery. arXiv preprint arXiv:1709.03741 (2017).
1513 32 20 Raimondi, D., Simm, J., Arany, A. & Moreau, Y. A novel method for data fusion over entity-relation graphs and its application to protein–protein interaction prediction. Bioinformatics 37, 2275–2281 (2021).
1514 32 21 Gfeller, D., Schmidt, J., Croce, G., Guillaume, P., Bobisse, S., Genolet, R., Queiroz, L., Cesbron, J., Racle, J. & Harari, A. Improved predictions of antigen presentation and TCR recognition with MixMHCpred2.2 and PRIME2.0 reveal potent SARS-CoV-2 CD8+ T-cell epitopes. Cell Systems 14, 72–83 (2023).
1515 32 22 Motmaen, A., Dauparas, J., Baek, M., Abedi, M. H., Baker, D. & Bradley, P. Peptide-binding specificity prediction using fine-tuned protein structure prediction networks. Proceedings of the National Academy of Sciences 120, e2216697120 (2023).
1516 32 23 Siramshetty, V., Williams, J., Nguyen, Ð., Neyra, J., Southall, N., Mathé, E., Xu, X. & Shah, P. Validating ADME QSAR models using marketed drugs. SLAS DISCOVERY: Advancing the Science of Drug Discovery 26, 1326–1336 (2021).
1517 32 24 Haneczok, J. & Delijewski, M. Machine learning enabled identification of potential SARS-CoV-2 3CLpro inhibitors based on fixed molecular fingerprints and Graph-CNN neural representations. Journal of Biomedical Informatics 119, 103821 (2021).
1518 32 25 Liu, Y., Wu, Y., Shen, X. & Xie, L. COVID-19 multi-targeted drug repurposing using few-shot learning. Frontiers in Bioinformatics 1, 693177 (2021).
1519 32 26 Chen, X., Dougherty, T., Hong, C., Schibler, R., Zhao, Y. C., Sadeghi, R., Matasci, N., Wu, Y.-C. & Kerman, I. Predicting antibody developability from sequence using machine learning. biorxiv, 2020–06 (2020).
1520 32 27 Alves, V. M., Muratov, E., Fourches, D., Strickland, J., Kleinstreuer, N., Andrade, C. H. & Tropsha, A. Predicting chemically-induced skin reactions. Part I: QSAR models of skin sensitization and their application to identify potentially hazardous compounds. Toxicology and applied pharmacology 284, 262–272 (2015).
1521 32 28 Shermukhamedov, S., Mamurjonova, D. & Probst, M. Structure to Property: Chemical Element Embeddings and a Deep Learning Approach for Accurate Prediction of Chemical Properties. arXiv preprint arXiv:2309.09355 (2023).
1522 32 29 Vu, O., Mendenhall, J., Altarawy, D. & Meiler, J. BCL::Mol2D—a robust atom environment descriptor for QSAR modeling and lead optimization. Journal of computer-aided molecular design 33, 477–486 (2019).
1523 32 30 Karim, A., Lee, M., Balle, T. & Sattar, A. CardioTox net: a robust predictor for hERG channel blockade based on deep learning meta-feature ensembles. Journal of Cheminformatics 13, 1–13 (2021).
1524 32 31 Korotcov, A., Tkachenko, V., Russo, D. P. & Ekins, S. Comparison of deep learning with multiple machine learning methods and metrics using diverse drug discovery data sets. Molecular pharmaceutics 14, 4462–4475 (2017).
1525 32 32 Wong, L., You, Z.-H., Guo, Z.-H., Yi, H.-C., Chen, Z.-H. & Cao, M.-Y. MIPDH: a novel computational model for predicting microRNA–mRNA interactions by DeepWalk on a heterogeneous network. ACS omega 5, 17022–17032 (2020).
1526 32 33 Fu, T., Huang, K., Xiao, C., Glass, L. M. & Sun, J. HINT: Hierarchical interaction network for clinical-trial-outcome predictions. Patterns 3 (2022).
1527 32 34 Weber, A., Born, J. & Rodriguez Martínez, M. TITAN: T-cell receptor specificity prediction with bimodal attention networks. Bioinformatics 37, i237–i244 (2021).
1528 32 35 Lam, H. T., Sbodio, M. L., Galindo, M. M., Zayats, M., Fernandez-Diaz, R., Valls, V., Picco, G., Ramis, C. B. & Lopez, V. Otter-Knowledge: benchmarks of multimodal knowledge graph representation learning from different sources for drug discovery. arXiv preprint arXiv:2306.12802 (2023).
1529 32 36 Kinnings, S. L., Liu, N., Tonge, P. J., Jackson, R. M., Xie, L. & Bourne, P. E. A machine learning-based method to improve docking scoring functions and its application to drug repurposing. Journal of chemical information and modeling 51, 408–419 (2011).
1530 32 37 Kalemati, M., Zamani Emani, M. & Koohi, S. BiComp-DTA: Drug-target binding affinity prediction through complementary biological-related and compression-based featurization approach. PLOS Computational Biology 19, e1011036 (2023).
1531 32 38 Wei, B. & Gong, X. DeepPLA: a novel deep learning-based model for protein-ligand binding affinity prediction (2021).
1532 32 39 Probst, D., Schwaller, P. & Reymond, J.-L. Reaction classification and yield prediction using the differential reaction fingerprint DRFP. Digital discovery 1, 91–97 (2022).
1533 32 40 Rivera, Z. A., Tayo, L., Chen, B.-Y. & Tsai, P.-W. In silico Evaluation of the Feasibility of Magnolia officinalis Electron-shuttling Compounds as Parkinson’s Disease Remedy. Letters in Drug Design & Discovery 21, 3039–3048 (2024).
1534 32 41 Pei, Q., Wu, L., Zhu, J., Xia, Y., Xie, S., Qin, T., Liu, H., Liu, T.-Y. & Yan, R. Breaking the barriers of data scarcity in drug–target affinity prediction. Briefings in Bioinformatics 24, bbad386 (2023).
1535 32 42 Xia, F., Shukla, M., Brettin, T., Garcia-Cardona, C., Cohn, J., Allen, J. E., Maslov, S., Holbeck, S. L., Doroshow, J. H., Evrard, Y. A., et al. Predicting tumor cell line response to drug pairs with deep learning. BMC bioinformatics 19, 71–79 (2018).
1536 32 43 Lind, A. P. & Anderson, P. C. Predicting drug activity against cancer cells by random forest models based on minimal genomic information and chemical properties. PloS one 14, e0219774 (2019).
1537 32 44 Euclia. https://github.com/euclia/public-models. 2023.
1538 32 45 Leenay, R. T., Aghazadeh, A., Hiatt, J., Tse, D., Roth, T. L., Apathy, R., Shifrut, E., Hultquist, J. F., Krogan, N., Wu, Z., et al. Large dataset enables prediction of repair after CRISPR–Cas9 editing in primary T cells. Nature biotechnology 37, 1034–1037 (2019).
1539 32 46 Yang, K., Swanson, K., Jin, W., Coley, C., Eiden, P., Gao, H., Guzman-Perez, A., Hopper, T., Kelley, B., Mathea, M., et al. Analyzing learned molecular representations for property prediction. Journal of chemical information and modeling 59, 3370–3388 (2019).
1540 32 47 Preuer, K., Lewis, R. P., Hochreiter, S., Bender, A., Bulusu, K. C. & Klambauer, G. DeepSynergy: predicting anti-cancer drug synergy with Deep Learning. Bioinformatics 34, 1538–1546 (2018).
1541 32 48 Zheng, S., Rao, J., Zhang, Z., Xu, J. & Yang, Y. Predicting retrosynthetic reactions using self-corrected transformer neural networks. Journal of chemical information and modeling 60, 47–55 (2019).
1542 32 49 Boral, N., Ghosh, P., Goswami, A. & Bhattacharyya, M. Accountable prediction of drug ADMET Properties with molecular descriptors. bioRxiv, 2022–06 (2022).
1543 32 50 Hendrycks, D., Burns, C., Basart, S., Zou, A., Mazeika, M., Song, D. & Steinhardt, J. Measuring massive multitask language understanding. arXiv preprint arXiv:2009.03300 (2020).
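Many entries in block 32 are baseline models on ADMET-style property-prediction benchmarks, which are typically scored by ROC-AUC on a held-out split. The sketch below shows only the shape of that evaluation loop on invented stand-in data; real benchmarks such as the Therapeutics Data Commons tasks cited above supply curated molecules and scaffold splits instead.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins: random "fingerprint" features and binary labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 128))   # pretend molecular fingerprints
y = rng.integers(0, 2, size=500)      # pretend ADMET outcome (e.g., toxic/not)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"ROC-AUC on the held-out split: {auc:.3f}")  # ~0.5 on random labels
```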
1544 33 0 A. Yang, B. Yang, B. Hui, B. Zheng, B. Yu, C. Zhou, C. Li, C. Li, D. Liu, F. Huang, G. Dong, H. Wei, H. Lin, J. Tang, J. Wang, J. Yang, J. Tu, J. Zhang, J. Ma, J. Xu, J. Zhou, J. Bai, J. He, J. Lin, K. Dang, K. Lu, K. Chen, K. Yang, M. Li, M. Xue, N. Ni, P. Zhang, P. Wang, R. Peng, R. Men, R. Gao, R. Lin, S. Wang, S. Bai, S. Tan, T. Zhu, T. Li, T. Liu, W. Ge, X. Deng, X. Zhou, X. Ren, X. Zhang, X. Wei, X. Ren, Y. Fan, Y. Yao, Y. Zhang, Y. Wan, Y. Chu, Y. Liu, Z. Cui, Z. Zhang, and Z. Fan. Qwen2 technical report. arXiv preprint arXiv:2407.10671, 2024. https://arxiv.org/pdf/2407.10671 https://qiita.com/kaizen_nagoya/items/29a77b25282c8822011e
1545 33 1 Marah Abdin, Jyoti Aneja, Sebastien Bubeck, Caio César Teodoro Mendes, Weizhu Chen, Allie Del Giorno, Ronen Eldan, Sivakanth Gopi, Suriya Gunasekar, Mojan Javaheripi, Piero Kauffmann, Yin Tat Lee, Yuanzhi Li, Anh Nguyen, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Michael Santacroce, Harkirat Singh Behl, Adam Taumann Kalai, Xin Wang, Rachel Ward, Philipp Witte, Cyril Zhang, and Yi Zhang. Phi-2: The surprising power of small language models, 2024. URL https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/.
1546 33 2 AI@Meta. Llama 3 model card, 2024. URL https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md.
1547 33 3 Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón, and Sumit Sanghai. GQA: Training generalized multi-query Transformer models from multi-head checkpoints. In EMNLP, pp. 4895–4901. Association for Computational Linguistics, 2023.
1548 33 4 Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, and Lingpeng Kong. Training-free long-context scaling of large language models. CoRR, abs/2402.17463, 2024.
1549 33 5 Anthropic. The Claude 3 model family: Opus, Sonnet, Haiku. Technical report, Anthropic, AI, 2024. URL https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf.
1550 33 6 Jacob Austin, Augustus Odena, Maxwell I. Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie J. Cai, Michael Terry, Quoc V. Le, and Charles Sutton. Program synthesis with large language models. CoRR, abs/2108.07732, 2021.
1551 33 7 Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, and Tianhang Zhu. Qwen technical report. CoRR, abs/2309.16609, 2023a.
1552 33 8 Jinze Bai, Shuai Bai, Shusheng Yang, Shijie Wang, Sinan Tan, Peng Wang, Junyang Lin, Chang Zhou, and Jingren Zhou. Qwen-VL: A frontier large vision-language model with versatile abilities. CoRR, abs/2308.12966, 2023b.
1553 33 9 Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, Carol Chen, Catherine Olsson, Christopher Olah, Danny Hernandez, Dawn Drain, Deep Ganguli, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jamie Kerr, Jared Mueller, Jeffrey Ladish, Joshua Landau, Kamal Ndousse, Kamile Lukosiute, Liane Lovitt, Michael Sellitto, Nelson Elhage, Nicholas Schiefer, Noemí Mercado, Nova DasSarma, Robert Lasenby, Robin Larson, Sam Ringer, Scott Johnston, Shauna Kravec, Sheer El Showk, Stanislav Fort, Tamera Lanham, Timothy Telleen-Lawton, Tom Conerly, Tom Henighan, Tristan Hume, Samuel R. Bowman, Zac Hatfield-Dodds, Ben Mann, Dario Amodei, Nicholas Joseph, Sam McCandlish, Tom Brown, and Jared Kaplan. Constitutional AI: Harmlessness from AI feedback. CoRR, abs/2212.08073, 2022.
1554 33 10 Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, and Madian Khabsa. The Belebele benchmark: A parallel reading comprehension dataset in 122 language variants. CoRR, abs/2308.16884, 2023.
1555 33 11 Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, and Bowen Yu. Towards scalable automated alignment of LLMs: A survey. CoRR, abs/2406.01252, 2024.
1556 33 12 Federico Cassano, John Gouwar, Daniel Nguyen, Sydney Nguyen, Luna Phipps-Costin, Donald Pinckney, Ming-Ho Yee, Yangtian Zi, Carolyn Jane Anderson, Molly Q. Feldman, Arjun Guha, Michael Greenberg, and Abhinav Jangda. MultiPL-E: A scalable and polyglot approach to benchmarking neural code generation. IEEE Trans. Software Eng., 49(7):3675–3691, 2023.
1557 33 13 Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pondé de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Joshua Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, and Wojciech Zaremba. Evaluating large language models trained on code. CoRR, abs/2107.03374, 2021.
1558 33 14 Wenhu Chen, Ming Yin, Max Ku, Pan Lu, Yixin Wan, Xueguang Ma, Jianyu Xu, Xinyi Wang, and Tony Xia. TheoremQA: A theorem-driven question answering dataset. In EMNLP, pp. 7889–7901. Association for Computational Linguistics, 2023a.
1559 33 15 Zhihong Chen, Shuo Yan, Juhao Liang, Feng Jiang, Xiangbo Wu, Fei Yu, Guiming Hardy Chen, Junying Chen, Hongbo Zhang, Li Jianquan, Wan Xiang, and Benyou Wang. MultilingualSIFT: Multilingual supervised instruction fine-tuning, 2023b. URL https://github.com/FreedomIntelligence/MultilingualSIFT.
1560 33 16 Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael I. Jordan, Joseph E. Gonzalez, and Ion Stoica. Chatbot arena: An open platform for evaluating LLMs by human preference. CoRR, abs/2403.04132, 2024.
1561 33 17 Yunfei Chu, Jin Xu, Xiaohuan Zhou, Qian Yang, Shiliang Zhang, Zhijie Yan, Chang Zhou, and Jingren Zhou. Qwen-Audio: Advancing universal audio understanding via unified large-scale audio-language models. CoRR, abs/2311.07919, 2023.
1562 33 18 Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord. Think you have solved question answering? Try ARC, the AI2 reasoning challenge. CoRR, abs/1803.05457, 2018.
1563 33 19 Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, and John Schulman. Training verifiers to solve math word problems. CoRR, abs/2110.14168, 2021.
1564 33 20 Damai Dai, Chengqi Deng, Chenggang Zhao, R. X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Y. Wu, Zhenda Xie, Y. K. Li, Panpan Huang, Fuli Luo, Chong Ruan, Zhifang Sui, and Wenfeng Liang. DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models. CoRR, abs/2401.06066, 2024.
1565 33 21 Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. Language modeling with gated convolutional networks. In ICML, volume 70 of Proceedings of Machine Learning Research, pp. 933–941. PMLR, 2017.
1566 33 22 Guanting Dong, Hongyi Yuan, Keming Lu, Chengpeng Li, Mingfeng Xue, Dayiheng Liu, Wei Wang, Zheng Yuan, Chang Zhou, and Jingren Zhou. How abilities in large language models are affected by supervised fine-tuning data composition. CoRR, abs/2310.05492, 2023.
1567 33 23 Guanting Dong, Keming Lu, Chengpeng Li, Tingyu Xia, Bowen Yu, Chang Zhou, and Jingren Zhou. Self-play with execution feedback: Improving instruction-following capabilities of large language models. CoRR, abs/2406.13542, 2024.
1568 33 24 Alena Fenogenova, Artem Chervyakov, Nikita Martynov, Anastasia Kozlova, Maria Tikhonova, Albina Akhmetgareeva, Anton A. Emelyanov, Denis Shevelev, Pavel Lebedev, Leonid Sinev, Ulyana Isaeva, Katerina Kolomeytseva, Daniil Moskovskiy, Elizaveta Goncharova, Nikita Savushkin, Polina Mikhailova, Denis Dimitrov, Alexander Panchenko, and Sergey Markov. MERA: A comprehensive LLM evaluation in russian. CoRR, abs/2401.04531, 2024.
1569 33 25 Shahriar Golchin and Mihai Surdeanu. Time travel in LLMs: Tracing data contamination in large language models. In ICLR. OpenReview.net, 2024.
1570 33 26 Naman Goyal, Cynthia Gao, Vishrav Chaudhary, Peng-Jen Chen, Guillaume Wenzek, Da Ju, Sanjana Krishnan, Marc’Aurelio Ranzato, Francisco Guzmán, and Angela Fan. The Flores-101 evaluation benchmark for low-resource and multilingual machine translation. Trans. Assoc. Comput. Linguistics, 10:522–538, 2022.
1571 33 27 Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt. Measuring massive multitask language understanding. In ICLR. OpenReview.net, 2021a.
1572 33 28 Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, and Jacob Steinhardt. Measuring mathematical problem solving with the MATH dataset. In NeurIPS Datasets and Benchmarks, 2021b.
1573 33 29 Yuzhen Huang, Yuzhuo Bai, Zhihao Zhu, Junlei Zhang, Jinghan Zhang, Tangjun Su, Junteng Liu, Chuancheng Lv, Yikai Zhang, Jiayi Lei, Yao Fu, Maosong Sun, and Junxian He. C-Eval: A multi-level multi-discipline chinese evaluation suite for foundation models. In NeurIPS, 2023.
1574 33 30 Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, and Ion Stoica. LiveCodeBench: Holistic and contamination free evaluation of large language models for code. CoRR, abs/2403.07974, 2024.
1575 33 31 Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mistral 7B. CoRR, abs/2310.06825, 2023a.
1576 33 32 Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mixtral of experts. CoRR, abs/2401.04088, 2024.
1577 33 33 Zixuan Jiang, Jiaqi Gu, Hanqing Zhu, and David Z. Pan. Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and efficient pre-LN Transformers. CoRR, abs/2305.14858, 2023b.
1578 33 34 Gregory Kamradt. Needle in a haystack - pressure testing LLMs, 2023. URL https://github.com/gkamradt/LLMTest_NeedleInAHaystack.
1579 33 35 Aran Komatsuzaki, Joan Puigcerver, James Lee-Thorp, Carlos Riquelme Ruiz, Basil Mustafa, Joshua Ainslie, Yi Tay, Mostafa Dehghani, and Neil Houlsby. Sparse upcycling: Training mixture-of-experts from dense checkpoints. In ICLR. OpenReview.net, 2023.
1580 33 36 Fajri Koto, Nurul Aisyah, Haonan Li, and Timothy Baldwin. Large language models only pass primary school exams in Indonesia: A comprehensive test on IndoMMLU. In EMNLP, pp. 12359–12374. Association for Computational Linguistics, 2023.
1581 33 37 Haonan Li, Yixuan Zhang, Fajri Koto, Yifei Yang, Hai Zhao, Yeyun Gong, Nan Duan, and Timothy Baldwin. CMMLU: Measuring massive multitask language understanding in Chinese. CoRR, abs/2306.09212, 2023.
1582 33 38 Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. From crowdsourced data to high-quality benchmarks: Arena-Hard and BenchBuilder pipeline. CoRR, abs/2406.11939, 2024.
1583 33 39 Opher Lieber, Barak Lenz, Hofit Bata, Gal Cohen, Jhonathan Osin, Itay Dalmedigos, Erez Safahi, Shaked Meirom, Yonatan Belinkov, Shai Shalev-Shwartz, Omri Abend, Raz Alon, Tomer Asida, Amir Bergman, Roman Glozman, Michael Gokhman, Avashalom Manevich, Nir Ratner, Noam Rozen, Erez Shwartz, Mor Zusman, and Yoav Shoham. Jamba: A hybrid Transformer-Mamba language model. CoRR, abs/2403.19887, 2024.
1584 33 40 Stephanie Lin, Jacob Hilton, and Owain Evans. TruthfulQA: Measuring how models mimic human falsehoods. In ACL (1), pp. 3214–3252. Association for Computational Linguistics, 2022a.
1585 33 41 Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O’Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona T. Diab, Veselin Stoyanov, and Xian Li. Few-shot learning with multilingual generative language models. In EMNLP, pp. 9019–9052. Association for Computational Linguistics, 2022b.
1586 33 42 Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. Is your code generated by ChatGPT really correct? Rigorous evaluation of large language models for code generation. In NeurIPS, 2023a.
1587 33 43 Xiao Liu, Xuanyu Lei, Shengyuan Wang, Yue Huang, Zhuoer Feng, Bosi Wen, Jiale Cheng, Pei Ke, Yifan Xu, Weng Lam Tam, Xiaohan Zhang, Lichao Sun, Hongning Wang, Jing Zhang, Minlie Huang, Yuxiao Dong, and Jie Tang. AlignBench: Benchmarking Chinese alignment of large language models. CoRR, abs/2311.18743, 2023b.
1588 33 44 Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, and Chang Zhou. Online merging optimizers for boosting rewards and mitigating tax in alignment. CoRR, abs/2405.17931, 2024a.
1589 33 45 Keming Lu, Bowen Yu, Chang Zhou, and Jingren Zhou. Large language models are superpositions of all characters: Attaining arbitrary role-play via self-alignment. CoRR, abs/2401.12474, 2024b.
1590 33 46 Keming Lu, Hongyi Yuan, Zheng Yuan, Runji Lin, Junyang Lin, Chuanqi Tan, Chang Zhou, and Jingren Zhou. #InsTag: Instruction tagging for analyzing supervised fine-tuning of large language models. In ICLR. OpenReview.net, 2024c.
1591 33 47 Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Antonia Paterson, Beth Tsai, Bobak Shahriari, Charline Le Lan, Christopher A. Choquette-Choo, Clément Crepy, Daniel Cer, Daphne Ippolito, David Reid, Elena Buchatskaya, Eric Ni, Eric Noland, Geng Yan, George Tucker, George-Christian Muraru, Grigory Rozhdestvenskiy, Henryk Michalewski, Ian Tenney, Ivan Grishchenko, Jacob Austin, James Keeling, Jane Labanowski, Jean-Baptiste Lespiau, Jeff Stanway, Jenny Brennan, Jeremy Chen, Johan Ferret, Justin Chiu, Justin Mao-Jones, Katherine Lee, Kathy Yu, Katie Millican, Lars Lowe Sjoesund, Lisa Lee, Lucas Dixon, Machel Reid, Maciej Mikuła, Mateo Wirth, Michael Sharman, Nikolai Chinaev, Nithum Thain, Olivier Bachem, Oscar Chang, Oscar Wahltinez, Paige Bailey, Paul Michel, Petko Yotov, Rahma Chaabouni, Ramona Comanescu, Reena Jana, Rohan Anil, Ross McIlroy, Ruibo Liu, Ryan Mullins, Samuel L Smith, Sebastian Borgeaud, Sertan Girgin, Sholto Douglas, Shree Pandya, Siamak Shakeri, Soham De, Ted Klimenko, Tom Hennigan, Vlad Feinberg, Wojciech Stokowiec, Yu hui Chen, Zafarali Ahmed, Zhitao Gong, Tris Warkentin, Ludovic Peran, Minh Giang, Clément Farabet, Oriol Vinyals, Jeff Dean, Koray Kavukcuoglu, Demis Hassabis, Zoubin Ghahramani, Douglas Eck, Joelle Barral, Fernando Pereira, Eli Collins, Armand Joulin, Noah Fiedel, Evan Senter, Alek Andreev, and Kathleen Kenealy. Gemma: Open models based on Gemini research and technology. CoRR, abs/2403.08295, 2024.
1592 33 48 Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M. Saiful Bari, Sheng Shen, Zheng Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, and Colin Raffel. Crosslingual generalization through multitask finetuning. In ACL (1), pp. 15991–16111. Association for Computational Linguistics, 2023.
1593 33 49 Jinjie Ni, Fuzhao Xue, Xiang Yue, Yuntian Deng, Mahir Shah, Kabir Jain, Graham Neubig, and Yang You. MixEval: Deriving wisdom of the crowd from LLM benchmark mixtures. CoRR, abs/2406.06565, 2024.
1594 33 50 OpenAI. Introducing ChatGPT, 2022. URL https://openai.com/index/chatgpt/.
1595 33 51 OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
1596 33 52 OpenAI. Hello GPT-4o, 2024. URL https://openai.com/index/hello-gpt-4o/.
1597 33 53 OpenCompass Contributors. OpenCompass: A universal evaluation platform for foundation models, 2023. URL https://github.com/open-compass/opencompass.
1598 33 54 Bowen Peng, Jeffrey Quesnelle, Honglu Fan, and Enrico Shippole. YaRN: Efficient context window extension of large language models. CoRR, abs/2309.00071, 2023.
1599 33 55 Edoardo Maria Ponti, Goran Glavas, Olga Majewska, Qianchu Liu, Ivan Vulic, and Anna Korhonen. XCOPA: A multilingual dataset for causal commonsense reasoning. In EMNLP (1), pp. 2362–2376. Association for Computational Linguistics, 2020.
1600 33 56 Qwen Team. Introducing Qwen1.5, 2024a. URL https://qwenlm.github.io/blog/qwen1.5/.
1601 33 57 Qwen Team. Qwen1.5-110B: The first 100B+ model of the Qwen1.5 series, 2024b. URL https://qwenlm.github.io/blog/qwen1.5-110b/.
1602 33 58 Qwen Team. Qwen1.5-MoE: Matching 7B model performance with 1/3 activated parameters, 2024c. URL https://qwenlm.github.io/blog/qwen-moe/.
1603 33 59 Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. In NeurIPS, 2023.
1604 33 60 Samyam Rajbhandari, Conglong Li, Zhewei Yao, Minjia Zhang, Reza Yazdani Aminabadi, Ammar Ahmad Awan, Jeff Rasley, and Yuxiong He. DeepSpeed-MoE: Advancing mixture-of-experts inference and training to power next-generation AI scale. In ICML, volume 162 of Proceedings of Machine Learning Research, pp. 18332–18346. PMLR, 2022.
1605 33 61 Mathieu Ravaut, Bosheng Ding, Fangkai Jiao, Hailin Chen, Xingxuan Li, Ruochen Zhao, Chengwei Qin, Caiming Xiong, and Shafiq Joty. How much are LLMs contaminated? A comprehensive survey and the llmsanitize library. CoRR, abs/2404.00699, 2024.
1606 33 62 David Rein, Betty Li Hou, Asa Cooper Stickland, Jackson Petty, Richard Yuanzhe Pang, Julien Dirani, Julian Michael, and Samuel R. Bowman. GPQA: A graduate-level Google-proof Q&A benchmark. CoRR, abs/2311.12022, 2023.
1607 33 63 Oscar Sainz, Jon Ander Campos, Iker García-Ferrero, Julen Etxaniz, Oier Lopez de Lacalle, and Eneko Agirre. NLP evaluation in trouble: On the need to measure LLM data contamination for each benchmark. In EMNLP (Findings), pp. 10776–10787. Association for Computational Linguistics, 2023.
1608 33 64 Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, and Yejin Choi. WinoGrande: An adversarial winograd schema challenge at scale. Commun. ACM, 64(9):99–106, 2021.
1609 33 65 Jianlin Su. The magical effect of the Bias term: RoPE + Bias = better length extrapolation, 2023. URL https://spaces.ac.cn/archives/9577.
1610 33 66 Jianlin Su, Murtadha H. M. Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. Roformer: Enhanced Transformer with rotary position embedding. Neurocomputing, 568:127063, 2024.
1611 33 67 Mirac Suzgun, Nathan Scales, Nathanael Schärli, Sebastian Gehrmann, Yi Tay, Hyung Won Chung, Aakanksha Chowdhery, Quoc V. Le, Ed H. Chi, Denny Zhou, and Jason Wei. Challenging BIG-Bench tasks and whether chain-of-thought can solve them. In ACL (Findings), pp. 13003–13051. Association for Computational Linguistics, 2023.
1612 33 68 Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurélien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. LLaMA: Open and efficient foundation language models. CoRR, abs/2302.13971, 2023.
1613 33 69 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In NIPS, pp. 5998–6008, 2017.
1614 33 70 Yubo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, Tianle Li, Max Ku, Kai Wang, Alex Zhuang, Rongqi Fan, Xiang Yue, and Wenhu Chen. MMLU-Pro: A more robust and challenging multi-task language understanding benchmark. CoRR, abs/2406.01574, 2024.
1615 33 71 Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, and Hao Ma. Effective long-context scaling of foundation models. CoRR, abs/2309.16039, 2023.
1616 33 72 Yinfei Yang, Yuan Zhang, Chris Tar, and Jason Baldridge. PAWS-X: A cross-lingual adversarial dataset for paraphrase identification. In EMNLP/IJCNLP (1), pp. 3685–3690. Association for Computational Linguistics, 2019.
1617 33 73 Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu, Jianqun Chen, Jing Chang, Kaidong Yu, Peng Liu, Qiang Liu, Shawn Yue, Senbin Yang, Shiming Yang, Tao Yu, Wen Xie, Wenhao Huang, Xiaohui Hu, Xiaoyi Ren, Xinyao Niu, Pengcheng Nie, Yuchi Xu, Yudong Liu, Yue Wang, Yuxuan Cai, Zhenyu Gu, Zhiyuan Liu, and Zonghong Dai. Yi: Open foundation models by 01.AI. CoRR, abs/2403.04652, 2024.
1618 33 74 Tao Yuan, Xuefei Ning, Dong Zhou, Zhijie Yang, Shiyao Li, Minghui Zhuang, Zheyue Tan, Zhuyu Yao, Dahua Lin, Boxun Li, Guohao Dai, Shengen Yan, and Yu Wang. LV-Eval: A balanced long-context benchmark with 5 length levels up to 256K. CoRR, abs/2402.05136, 2024.
1619 33 75 Zheng Yuan, Hongyi Yuan, Chengpeng Li, Guanting Dong, Chuanqi Tan, and Chang Zhou. Scaling relationship on learning mathematical reasoning with large language models. CoRR, abs/2308.01825, 2023.
1620 33 76 Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, and Yejin Choi. Hellaswag: Can a machine really finish your sentence? In ACL (1), pp. 4791–4800. Association for Computational Linguistics, 2019.
1621 33 77 Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, Jing Zhang, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, Mingdao Liu, Minlie Huang, Peng Zhang, Qinkai Zheng, Rui Lu, Shuaiqi Duan, Shudan Zhang, Shulin Cao, Shuxun Yang, Weng Lam Tam, Wenyi Zhao, Xiao Liu, Xiao Xia, Xiaohan Zhang, Xiaotao Gu, Xin Lv, Xinghan Liu, Xinyi Liu, Xinyue Yang, Xixuan Song, Xunkai Zhang, Yifan An, Yifan Xu, Yilin Niu, Yuantao Yang, Yueyan Li, Yushi Bai, Yuxiao Dong, Zehan Qi, Zhaoyu Wang, Zhen Yang, Zhengxiao Du, Zhenyu Hou, and Zihan Wang. ChatGLM: A family of large language models from GLM-130B to GLM-4 all tools. CoRR, abs/2406.12793, 2024.
1622 33 78 Yingxiu Zhao, Bowen Yu, Binyuan Hui, Haiyang Yu, Minghao Li, Fei Huang, Nevin L. Zhang, and Yongbin Li. Tree-Instruct: A preliminary study of the intrinsic relationship between complexity and alignment. In LREC/COLING, pp. 16776–16789. ELRA and ICCL, 2024.
1623 33 79 Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, and Ion Stoica. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. In NeurIPS, 2023.
1624 33 80 Jeffrey Zhou, Tianjian Lu, Swaroop Mishra, Siddhartha Brahma, Sujoy Basu, Yi Luan, Denny Zhou, and Le Hou. Instruction-following evaluation for large language models. CoRR, abs/2311.07911, 2023.
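Block 33 (Qwen2) cites grouped-query attention (entry 33-3), in which several query heads share one key/value head so the KV cache shrinks without falling all the way back to multi-query attention. A minimal NumPy sketch of the head-sharing pattern follows; it is illustrative only, not the Qwen2 implementation:

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """GQA sketch: q is (n_q_heads, seq, d); k and v are (n_kv_heads, seq, d).

    Each group of n_q_heads // n_kv_heads query heads attends with the
    same key/value head, so only n_kv_heads KV tensors need caching.
    """
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group                                 # shared KV head index
        scores = q[h] @ k[kv].T / np.sqrt(d)            # (seq, seq)
        scores -= scores.max(axis=-1, keepdims=True)    # numerically stable
        w = np.exp(scores)
        w /= w.sum(axis=-1, keepdims=True)              # row-wise softmax
        out[h] = w @ v[kv]
    return out

# 8 query heads sharing 2 KV heads (4 query heads per KV head).
out = grouped_query_attention(np.random.randn(8, 16, 32),
                              np.random.randn(2, 16, 32),
                              np.random.randn(2, 16, 32))
```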
1625 34 0 A. Yang, B. Yang, B. Zhang, B. Hui, B. Zheng, B. Yu, C. Li, D. Liu, F. Huang, H. Wei, H. Lin, J. Yang, J. Tu, J. Zhang, J. Yang, J. Yang, J. Zhou, J. Lin, K. Dang, K. Lu, K. Bao, K. Yang, L. Yu, M. Li, M. Xue, P. Zhang, Q. Zhu, R. Men, R. Lin, T. Li, T. Xia, X. Ren, X. Ren, Y. Fan, Y. Su, Y. Zhang, Y. Wan, Y. Liu, Z. Cui, Z. Zhang, and Z. Qiu. Qwen2.5 technical report. arXiv preprint arXiv:2412.15115, 2024. https://arxiv.org/pdf/2412.15115
1626 34 1 Marah I Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat S. Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, Ziyi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, and Xiren Zhou. Phi-3 technical report: A highly capable language model locally on your phone. CoRR, abs/2404.14219, 2024.
1627 34 2 Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, Robert Hero, Jining Huang, Vibhu Jawa, Joseph Jennings, Aastha Jhunjhunwala, John Kamalu, Sadaf Khan, Oleksii Kuchaiev, Patrick LeGresley, Hui Li, Jiwei Liu, Zihan Liu, Eileen Peters Long, Ameya Mahabaleshwarkar, Somshubra Majumdar, James Maki, Miguel Martinez, Maer Rodrigues de Melo, Ivan Moshkov, Deepak Narayanan, Sean Narenthiran, Jesus Navarro, Phong Nguyen, Osvald Nitski, Vahid Noroozi, Guruprasad Nutheti, Christopher Parisien, Jupinder Parmar, Mostofa Patwary, Krzysztof Pawelec, Wei Ping, Shrimai Prabhumoye, Rajarshi Roy, Trisha Saar, Vasanth Rao Naik Sabavat, Sanjeev Satheesh, Jane Polak Scowcroft, Jason D. Sewall, Pavel Shamis, Gerald Shen, Mohammad Shoeybi, Dave Sizer, Misha Smelyanskiy, Felipe Soares, Makesh Narsimhan Sreedhar, Dan Su, Sandeep Subramanian, Shengyang Sun, Shubham Toshniwal, Hao Wang, Zhilin Wang, Jiaxuan You, Jiaqi Zeng, Jimmy Zhang, Jing Zhang, Vivienne Zhang, Yian Zhang, and Chen Zhu. Nemotron-4 340B technical report. CoRR, abs/2406.11704, 2024.
1628 34 3 Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón, and Sumit Sanghai. GQA: Training generalized multi-query Transformer models from multi-head checkpoints. In EMNLP, pp. 4895–4901. Association for Computational Linguistics, 2023.
1629 34 4 Ebtesam Almazrouei, Hamza Alobeidli, Abdulaziz Alshamsi, Alessandro Cappelli, Ruxandra Cojocaru, Mérouane Debbah, Étienne Goffinet, Daniel Hesslow, Julien Launay, Quentin Malartic, Daniele Mazzotta, Badreddine Noune, Baptiste Pannier, and Guilherme Penedo. The Falcon series of open language models. CoRR, abs/2311.16867, 2023.
1630 34 5 Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, and Lingpeng Kong. Training-free long-context scaling of large language models. CoRR, abs/2402.17463, 2024.
1631 34 6 Anthropic. Introducing Claude, 2023a. URL https://www.anthropic.com/index/introducing-claude. Anthropic. Claude 2. Technical report, Anthropic, 2023b. URL https://www-files.anthropic.com/production/images/Model-Card-Claude-2.pdf.
1632 34 7 Anthropic. The Claude 3 model family: Opus, Sonnet, Haiku. Technical report, Anthropic, AI, 2024. URL https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf.
1633 34 8 Jacob Austin, Augustus Odena, Maxwell I. Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie J. Cai, Michael Terry, Quoc V. Le, and Charles Sutton. Program synthesis with large language models. CoRR, abs/2108.07732, 2021.
1634 34 9 Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, and Tianhang Zhu. Qwen technical report. CoRR, abs/2309.16609, 2023.
1635 34 10 Yushi Bai, Xin Lv, Jiajie Zhang, Yuze He, Ji Qi, Lei Hou, Jie Tang, Yuxiao Dong, and Juanzi Li. LongAlign: A recipe for long context alignment of large language models. In EMNLP (Findings), pp. 1376–1395. Association for Computational Linguistics, 2024.
1636 34 11 Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, and Madian Khabsa. The Belebele benchmark: A parallel reading comprehension dataset in 122 language variants. CoRR, abs/2308.16884, 2023.
1637 34 12 Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language models are few-shot learners. In NeurIPS, 2020.
1638 34 13 Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, and Bowen Yu. Towards scalable automated alignment of LLMs: A survey. CoRR, abs/2406.01252, 2024.
1639 34 14 Federico Cassano, John Gouwar, Daniel Nguyen, Sydney Nguyen, Luna Phipps-Costin, Donald Pinckney, Ming-Ho Yee, Yangtian Zi, Carolyn Jane Anderson, Molly Q. Feldman, Arjun Guha, Michael Greenberg, and Abhinav Jangda. MultiPL-E: A scalable and polyglot approach to benchmarking neural code generation. IEEE Trans. Software Eng., 49(7):3675–3691, 2023.
1640 34 15 Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pondé de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Joshua Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, and Wojciech Zaremba. Evaluating large language models trained on code. CoRR, abs/2107.03374, 2021.
1641 34 16 Wenhu Chen, Ming Yin, Max Ku, Pan Lu, Yixin Wan, Xueguang Ma, Jianyu Xu, Xinyi Wang, and Tony Xia. TheoremQA: A theorem-driven question answering dataset. In EMNLP, pp. 7889–7901. Association for Computational Linguistics, 2023a.
1642 34 17 Zhihong Chen, Shuo Yan, Juhao Liang, Feng Jiang, Xiangbo Wu, Fei Yu, Guiming Hardy Chen, Junying Chen, Hongbo Zhang, Li Jianquan, Wan Xiang, and Benyou Wang. MultilingualSIFT: Multilingual supervised instruction fine-tuning, 2023b. URL https://github.com/FreedomIntelligence/MultilingualSIFT.
1643 34 18 Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord. Think you have solved question answering? Try ARC, the AI2 reasoning challenge. CoRR, abs/1803.05457, 2018.
1644 34 19 Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, and John Schulman. Training verifiers to solve math word problems. CoRR, abs/2110.14168, 2021.
1645 34 20 Damai Dai, Chengqi Deng, Chenggang Zhao, R. X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Y. Wu, Zhenda Xie, Y. K. Li, Panpan Huang, Fuli Luo, Chong Ruan, Zhifang Sui, and Wenfeng Liang. DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models. CoRR, abs/2401.06066, 2024.
1646 34 21 Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. Language modeling with gated convolutional networks. In ICML, volume 70 of Proceedings of Machine Learning Research, pp. 933–941. PMLR, 2017.
1647 34 22 Guanting Dong, Keming Lu, Chengpeng Li, Tingyu Xia, Bowen Yu, Chang Zhou, and Jingren Zhou. Self-play with execution feedback: Improving instruction-following capabilities of large language models. CoRR, abs/2406.13542, 2024.
1648 34 23 Shihan Dou, Jiazheng Zhang, Jianxiang Zang, Yunbo Tao, Haoxiang Jia, Shichun Liu, Yuming Yang, Shenxi Wu, Shaoqing Zhang, Muling Wu, et al. Multi-programming language sandbox for LLMs. CoRR, abs/2410.23074, 2024.
1649 34 24 Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurélien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Rozière, Bethany Biron, Binh Tang, Bobbie Chern, Charlotte Caucheteux, Chaya Nayak, Chloe Bi, Chris Marra, Chris McConnell, Christian Keller, Christophe Touret, Chunyang Wu, Corinne Wong, Cristian Canton Ferrer, Cyrus Nikolaidis, Damien Allonsius, Daniel Song, Danielle Pintz, Danny Livshits, David Esiobu, Dhruv Choudhary, Dhruv Mahajan, Diego Garcia-Olano, Diego Perino, Dieuwke Hupkes, Egor Lakomkin, Ehab AlBadawy, Elina Lobanova, Emily Dinan, Eric Michael Smith, Filip Radenovic, Frank Zhang, Gabriel Synnaeve, Gabrielle Lee, Georgia Lewis Anderson, Graeme Nail, Grégoire Mialon, Guan Pang, Guillem Cucurell, Hailey Nguyen, Hannah Korevaar, Hu Xu, Hugo Touvron, Iliyan Zarov, Imanol Arrieta Ibarra, Isabel M. Kloumann, Ishan Misra, Ivan Evtimov, Jade Copet, Jaewon Lee, Jan Geffert, Jana Vranes, Jason Park, Jay Mahadeokar, Jeet Shah, Jelmer van der Linde, Jennifer Billock, Jenny Hong, Jenya Lee, Jeremy Fu, Jianfeng Chi, Jianyu Huang, Jiawen Liu, Jie Wang, Jiecao Yu, Joanna Bitton, Joe Spisak, Jongsoo Park, Joseph Rocca, Joshua Johnstun, Joshua Saxe, Junteng Jia, Kalyan Vasuden Alwala, Kartikeya Upasani, Kate Plawiak, Ke Li, Kenneth Heafield, Kevin Stone, and et al. The Llama 3 herd of models. CoRR, abs/2407.21783, 2024.
1651 34 26 William Fedus, Barret Zoph, and Noam Shazeer. Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res., 23:120:1–120:39, 2022.
1652 34 27 Alena Fenogenova, Artem Chervyakov, Nikita Martynov, Anastasia Kozlova, Maria Tikhonova, Albina Akhmetgareeva, Anton A. Emelyanov, Denis Shevelev, Pavel Lebedev, Leonid Sinev, Ulyana Isaeva, Katerina Kolomeytseva, Daniil Moskovskiy, Elizaveta Goncharova, Nikita Savushkin, Polina Mikhailova, Denis Dimitrov, Alexander Panchenko, and Sergey Markov. MERA: A comprehensive LLM evaluation in Russian. CoRR, abs/2401.04531, 2024.
1653 34 28 Evan Frick, Peter Jin, Tianle Li, Karthik Ganesan, Jian Zhang, Jiantao Jiao, and Banghua Zhu. Athene-70b: Redefining the boundaries of post-training for open models, July 2024a. URL https://nexusflow.ai/blogs/athene.
1654 34 29 Evan Frick, Tianle Li, Connor Chen, Wei-Lin Chiang, Anastasios Nikolas Angelopoulos, Jiantao Jiao, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. How to evaluate reward models for RLHF. CoRR, abs/2410.14872, 2024b.
1655 34 30 Aryo Pradipta Gema, Joshua Ong Jun Leang, Giwon Hong, Alessio Devoto, Alberto Carlo Maria Mancino, Rohit Saxena, Xuanli He, Yu Zhao, Xiaotang Du, Mohammad Reza Ghasemi Madani, et al. Are we done with MMLU? CoRR, abs/2406.04127, 2024.
1656 34 31 Gemini Team. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. Technical report, Google, 2024. URL https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf.
1657 34 32 Gemma Team, Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, Léonard Hussenot, Thomas Mesnard, Bobak Shahriari, Alexandre Ramé, et al. Gemma 2: Improving open language models at a practical size. CoRR, abs/2408.00118, 2024.
1658 34 33 Naman Goyal, Cynthia Gao, Vishrav Chaudhary, Peng-Jen Chen, Guillaume Wenzek, Da Ju, Sanjana Krishnan, Marc’Aurelio Ranzato, Francisco Guzmán, and Angela Fan. The Flores-101 evaluation benchmark for low-resource and multilingual machine translation. Trans. Assoc. Comput. Linguistics, 10:522–538, 2022.
1659 34 34 Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt. Measuring massive multitask language understanding. In ICLR. OpenReview.net, 2021a.
1660 34 35 Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, and Jacob Steinhardt. Measuring mathematical problem solving with the MATH dataset. In NeurIPS Datasets and Benchmarks, 2021b.
1661 34 36 Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, et al. Training compute-optimal large language models. CoRR, abs/2203.15556, 2022.
1662 34 37 Keith Hoskin. The “awful idea of accountability”: Inscribing people into the measurement of objects. Accountability: Power, ethos and the technologies of managing, 1996.
1663 34 38 Cheng-Ping Hsieh, Simeng Sun, Samuel Kriman, Shantanu Acharya, Dima Rekesh, Fei Jia, Yang Zhang, and Boris Ginsburg. RULER: What’s the real context size of your long-context language models? CoRR, abs/2404.06654, 2024.
1664 34 39 Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao, Xinrong Zhang, Zhen Leng Thai, Kai Zhang, Chongyi Wang, Yuan Yao, Chenyang Zhao, Jie Zhou, Jie Cai, Zhongwu Zhai, Ning Ding, Chao Jia, Guoyang Zeng, Dahai Li, Zhiyuan Liu, and Maosong Sun. MiniCPM: Unveiling the potential of small language models with scalable training strategies. CoRR, abs/2404.06395, 2024.
1665 34 40 Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Keming Lu, et al. Qwen2.5-Coder technical report. CoRR, abs/2409.12186, 2024.
1666 34 41 Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, and Ion Stoica. LiveCodeBench: Holistic and contamination free evaluation of large language models for code. CoRR, abs/2403.07974, 2024.
1667 34 42 Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mistral 7B. CoRR, abs/2310.06825, 2023a.
1668 34 43 Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mixtral of experts. CoRR, abs/2401.04088, 2024a.
1669 34 44 Huiqiang Jiang, Yucheng Li, Chengruidong Zhang, Qianhui Wu, Xufang Luo, Surin Ahn, Zhenhua Han, Amir H Abdi, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, and Lili Qiu. MInference 1.0: Accelerating pre-filling for long-context LLMs via dynamic sparse attention. arXiv preprint arXiv:2407.02490, 2024b.
1670 34 45 Zixuan Jiang, Jiaqi Gu, Hanqing Zhu, and David Z. Pan. Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and efficient pre-LN Transformers. CoRR, abs/2305.14858, 2023b.
1671 34 46 Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models. CoRR, abs/2001.08361, 2020.
1672 34 47 Fajri Koto, Nurul Aisyah, Haonan Li, and Timothy Baldwin. Large language models only pass primary school exams in Indonesia: A comprehensive test on IndoMMLU. In EMNLP, pp. 12359–12374. Association for Computational Linguistics, 2023.
1673 34 48 Nathan Lambert, Valentina Pyatkin, Jacob Daniel Morrison, Lester James Validad Miranda, Bill Yuchen Lin, Khyathi Raghavi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, and Hanna Hajishirzi. RewardBench: Evaluating reward models for language modeling. CoRR, abs/2403.13787, 2024.
1674 34 49 Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. GShard: Scaling giant models with conditional computation and automatic sharding. CoRR, abs/2006.16668, 2020.
1675 34 50 Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. From crowdsourced data to high-quality benchmarks: Arena-Hard and BenchBuilder pipeline. CoRR, abs/2406.11939, 2024.
1676 34 51 Stephanie Lin, Jacob Hilton, and Owain Evans. TruthfulQA: Measuring how models mimic human falsehoods. In ACL (1), pp. 3214–3252. Association for Computational Linguistics, 2022a.
1677 34 52 Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O’Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona T. Diab, Veselin Stoyanov, and Xian Li. Few-shot learning with multilingual generative language models. In EMNLP, pp. 9019–9052. Association for Computational Linguistics, 2022b.
1678 34 53 Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. Is your code generated by ChatGPT really correct? Rigorous evaluation of large language models for code generation. In NeurIPS, 2023.
1679 34 54 Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, and Chang Zhou. Online merging optimizers for boosting rewards and mitigating tax in alignment. CoRR, abs/2405.17931, 2024a.
1680 34 55 Keming Lu, Bowen Yu, Chang Zhou, and Jingren Zhou. Large language models are superpositions of all characters: Attaining arbitrary role-play via self-alignment. CoRR, abs/2401.12474, 2024b.
1681 34 56 Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M. Saiful Bari, Sheng Shen, Zheng Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, and Colin Raffel. Crosslingual generalization through multitask finetuning. In ACL (1), pp. 15991–16111. Association for Computational Linguistics, 2023.
1682 34 57 Junho Myung, Nayeon Lee, Yi Zhou, Jiho Jin, Rifki Afina Putri, Dimosthenis Antypas, Hsuvas Borkakoty, Eunsu Kim, Carla Pérez-Almendros, Abinew Ali Ayele, Víctor Gutiérrez-Basulto, Yazmín Ibáñez-García, Hwaran Lee, Shamsuddeen Hassan Muhammad, Ki-Woong Park, Anar Sabuhi Rzayev, Nina White, Seid Muhie Yimam, Mohammad Taher Pilehvar, Nedjma Ousidhoum, José Camacho-Collados, and Alice Oh. BLEnD: A benchmark for LLMs on everyday knowledge in diverse cultures and languages. CoRR, abs/2406.09948, 2024.
1683 34 58 OpenAI. GPT-4 technical report. CoRR, abs/2303.08774, 2023.
1684 34 59 OpenAI. Hello GPT-4o, 2024a. URL https://openai.com/index/hello-gpt-4o/.
1685 34 60 OpenAI. Learning to reason with LLMs, 2024b. URL https://openai.com/index/learning-to-reason-with-llms/.
1686 34 61 Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul F. Christiano, Jan Leike, and Ryan Lowe. Training language models to follow instructions with human feedback. In NeurIPS, 2022.
1687 34 62 Bowen Peng, Jeffrey Quesnelle, Honglu Fan, and Enrico Shippole. YaRN: Efficient context window extension of large language models. CoRR, abs/2309.00071, 2023.
1688 34 63 Edoardo Maria Ponti, Goran Glavas, Olga Majewska, Qianchu Liu, Ivan Vulic, and Anna Korhonen. XCOPA: A multilingual dataset for causal commonsense reasoning. In EMNLP (1), pp. 2362–2376. Association for Computational Linguistics, 2020.
1689 34 64 Shanghaoran Quan, Tianyi Tang, Bowen Yu, An Yang, Dayiheng Liu, Bofei Gao, Jianhong Tu, Yichang Zhang, Jingren Zhou, and Junyang Lin. Language models can self-lengthen to generate long texts. CoRR, abs/2410.23933, 2024.
1690 34 65 Qwen Team. Code with CodeQwen1.5, 2024a. URL https://qwenlm.github.io/blog/codeqwen1.5/.
1691 34 66 Qwen Team. Introducing Qwen1.5, 2024b. URL https://qwenlm.github.io/blog/qwen1.5/.
1692 34 67 Qwen Team. Introducing Qwen2-Math, 2024c. URL https://qwenlm.github.io/blog/qwen2-math/.
1693 34 68 Qwen Team. QwQ: Reflect deeply on the boundaries of the unknown, 2024d. URL https://qwenlm.github.io/blog/qwq-32b-preview/.
1694 34 69 Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever, et al. Improving language understanding by generative pre-training. Technical report, OpenAI, 2018.
1695 34 70 Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. In NeurIPS, 2023.
1696 34 71 Samyam Rajbhandari, Conglong Li, Zhewei Yao, Minjia Zhang, Reza Yazdani Aminabadi, Ammar Ahmad Awan, Jeff Rasley, and Yuxiong He. DeepSpeed-MoE: Advancing mixture-of-experts inference and training to power next-generation AI scale. In ICML, volume 162 of Proceedings of Machine Learning Research, pp. 18332–18346. PMLR, 2022.
1697 34 72 David Rein, Betty Li Hou, Asa Cooper Stickland, Jackson Petty, Richard Yuanzhe Pang, Julien Dirani, Julian Michael, and Samuel R. Bowman. GPQA: A graduate-level Google-proof Q&A benchmark. CoRR, abs/2311.12022, 2023.
1698 34 73 Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, and Yejin Choi. WinoGrande: An adversarial winograd schema challenge at scale. Commun. ACM, 64(9):99–106, 2021.
1699 34 74 Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. In ACL (1). The Association for Computer Linguistics, 2016.
1700 34 75 Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Mingchuan Zhang, Y. K. Li, Y. Wu, and Daya Guo. DeepSeekMath: Pushing the limits of mathematical reasoning in open language models. CoRR, abs/2402.03300, 2024.
1701 34 76 Jianlin Su. The magical effect of the Bias term: RoPE + Bias = better length extrapolation, 2023. URL https://spaces.ac.cn/archives/9577.
1702 34 77 Jianlin Su, Murtadha H. M. Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. Roformer: Enhanced Transformer with rotary position embedding. Neurocomputing, 568:127063, 2024.
1703 34 78 Mirac Suzgun, Nathan Scales, Nathanael Schärli, Sebastian Gehrmann, Yi Tay, Hyung Won Chung, Aakanksha Chowdhery, Quoc V. Le, Ed H. Chi, Denny Zhou, and Jason Wei. Challenging BIG-Bench tasks and whether chain-of-thought can solve them. In ACL (Findings), pp. 13003–13051. Association for Computational Linguistics, 2023.
1704 34 79 Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurélien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. LLaMA: Open and efficient foundation language models. CoRR, abs/2302.13971, 2023a.
1705 34 80 Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton-Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurélien Rodriguez, Robert Stojnic, Sergey Edunov, and Thomas Scialom. Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288, 2023b.
1706 34 81 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In NIPS, pp. 5998–6008, 2017.
1707 34 82 Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, et al. Secrets of RLHF in large language models part II: Reward modeling. CoRR, abs/2401.06080, 2024a.
1708 34 83 Changhan Wang, Kyunghyun Cho, and Jiatao Gu. Neural machine translation with byte-level subwords. In AAAI, pp. 9154–9160. AAAI Press, 2020.
1709 34 84 Yubo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, Tianle Li, Max Ku, Kai Wang, Alex Zhuang, Rongqi Fan, Xiang Yue, and Wenhu Chen. MMLU-Pro: A more robust and challenging multi-task language understanding benchmark. CoRR, abs/2406.01574, 2024b.
1710 34 85 Zhilin Wang, Alexander Bukharin, Olivier Delalleau, Daniel Egert, Gerald Shen, Jiaqi Zeng, Oleksii Kuchaiev, and Yi Dong. HelpSteer2-Preference: Complementing ratings with preferences. CoRR, abs/2410.01257, 2024c.
1711 34 86 Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Benjamin Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, Chinmay Hegde, Yann LeCun, Tom Goldstein, Willie Neiswanger, and Micah Goldblum. LiveBench: A challenging, contamination-free LLM benchmark. CoRR, abs/2406.19314, 2024.
1712 34 87 Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Le Sun, Jingren Zhou, and Junyang Lin. Aligning large language models via self-steering optimization. CoRR, abs/2410.17131, 2024.
1713 34 88 Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, and Hao Ma. Effective long-context scaling of foundation models. CoRR, abs/2309.16039, 2023.
1714 34 89 An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, Kai Dang, Keming Lu, Keqin Chen, Kexin Yang, Mei Li, Mingfeng Xue, Na Ni, Pei Zhang, Peng Wang, Ru Peng, Rui Men, Ruize Gao, Runji Lin, Shijie Wang, Shuai Bai, Sinan Tan, Tianhang Zhu, Tianhao Li, Tianyu Liu, Wenbin Ge, Xiaodong Deng, Xiaohuan Zhou, Xingzhang Ren, Xinyu Zhang, Xipin Wei, Xuancheng Ren, Xuejing Liu, Yang Fan, Yang Yao, Yichang Zhang, Yu Wan, Yunfei Chu, Yuqiong Liu, Zeyu Cui, Zhenru Zhang, Zhifang Guo, and Zhihao Fan. Qwen2 technical report. CoRR, abs/2407.10671, 2024a.
1715 34 90 An Yang, Beichen Zhang, Binyuan Hui, Bofei Gao, Bowen Yu, Chengpeng Li, Dayiheng Liu, Jianhong Tu, Jingren Zhou, Junyang Lin, et al. Qwen2.5-Math technical report: Toward mathematical expert model via self-improvement. CoRR, abs/2409.12122, 2024b.
1716 34 91 Jian Yang, Jiaxi Yang, Ke Jin, Yibo Miao, Lei Zhang, Liqun Yang, Zeyu Cui, Yichang Zhang, Binyuan Hui, and Junyang Lin. Evaluating and aligning CodeLLMs on human preference. CoRR, abs/2412.05210, 2024c.
1717 34 92 Yinfei Yang, Yuan Zhang, Chris Tar, and Jason Baldridge. PAWS-X: A cross-lingual adversarial dataset for paraphrase identification. In EMNLP/IJCNLP (1), pp. 3685–3690. Association for Computational Linguistics, 2019.
1718 34 93 Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu, Jianqun Chen, Jing Chang, Kaidong Yu, Peng Liu, Qiang Liu, Shawn Yue, Senbin Yang, Shiming Yang, Tao Yu, Wen Xie, Wenhao Huang, Xiaohui Hu, Xiaoyi Ren, Xinyao Niu, Pengcheng Nie, Yuchi Xu, Yudong Liu, Yue Wang, Yuxuan Cai, Zhenyu Gu, Zhiyuan Liu, and Zonghong Dai. Yi: Open foundation models by 01.AI. CoRR, abs/2403.04652, 2024.
1719 34 94 Tao Yuan, Xuefei Ning, Dong Zhou, Zhijie Yang, Shiyao Li, Minghui Zhuang, Zheyue Tan, Zhuyu Yao, Dahua Lin, Boxun Li, Guohao Dai, Shengen Yan, and Yu Wang. LV-Eval: A balanced long-context benchmark with 5 length levels up to 256K. CoRR, abs/2402.05136, 2024.
1720 34 95 Zheng Yuan, Hongyi Yuan, Chengpeng Li, Guanting Dong, Chuanqi Tan, and Chang Zhou. Scaling relationship on learning mathematical reasoning with large language models. CoRR, abs/2308.01825, 2023.
1721 34 96 Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, and Yejin Choi. HellaSwag: Can a machine really finish your sentence? In ACL (1), pp. 4791–4800. Association for Computational Linguistics, 2019.
1722 34 97 Yidan Zhang, Boyi Deng, Yu Wan, Baosong Yang, Haoran Wei, Fei Huang, Bowen Yu, Junyang Lin, and Jingren Zhou. P-MMEval: A parallel multilingual multitask benchmark for consistent evaluation of LLMs. CoRR, abs/2411.09116, 2024.
1723 34 98 Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, and Ion Stoica. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. In NeurIPS, 2023.
1724 34 99 Enyu Zhou, Guodong Zheng, Bing Wang, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, and Xuanjing Huang. RMB: Comprehensively benchmarking reward models in LLM alignment. CoRR, abs/2410.09893, 2024.
1725 34 100 Jeffrey Zhou, Tianjian Lu, Swaroop Mishra, Siddhartha Brahma, Sujoy Basu, Yi Luan, Denny Zhou, and Le Hou. Instruction-following evaluation for large language models. CoRR, abs/2311.07911, 2023.
1726 34 101 Barret Zoph, Irwan Bello, Sameer Kumar, Nan Du, Yanping Huang, Jeff Dean, Noam Shazeer, and William Fedus. ST-MoE: Designing stable and transferable sparse expert models. CoRR, abs/2202.08906, 2022.
1727 35 0 Q. Zhang, K. Ding, T. Lyv, X. Wang, Q. Yin, Y. Zhang, J. Yu, Y. Wang, X. Li, Z. Xiang, K. Feng, X. Zhuang, Z. Wang, M. Qin, M. Zhang, J. Zhang, J. Cui, T. Huang, P. Yan, R. Xu, H. Chen, X. Li, X. Fan, H. Xing, and H. Chen. Scientific large language models: A survey on biological and chemical domains. arXiv preprint arXiv:2401.14656v2, January 2024. URL https://arxiv.org/pdf/2401.14656v2. https://qiita.com/kaizen_nagoya/items/6505717d7c4769a4ff31
1728 35 1 Hisham Abdel-Aty and Ian R Gould. 2022. Large-scale distributed training of transformers for chemical fingerprinting. Journal of Chemical Information and Modeling 62, 20 (2022), 4852–4862.
1729 35 2 Hadi Abdine, Michail Chatzianastasis, Costas Bouyioukos, and Michalis Vazirgiannis. 2023. Prot2Text: Multimodal Protein’s Function Generation with GNNs and Transformers. arXiv:2307.14367 [q-bio.QM]
1730 35 3 Sanjar Adilov. 2021. Generative pre-training from molecules. (2021).
1731 35 4 Walid Ahmad, Elana Simon, Seyone Chithrananda, Gabriel Grand, and Bharath Ramsundar. 2022. ChemBERTa-2: Towards chemical foundation models. arXiv preprint arXiv:2209.01712 (2022).
1732 35 5 Microsoft Research AI4Science and Microsoft Azure Quantum. 2023. The Impact of Large Language Models on Scientific Discovery: A Preliminary Study using GPT-4. arXiv preprint arXiv:2311.07361 (2023).
1733 35 6 Sultan Alrowili and Vijay Shanker. 2021. BioM-Transformers: Building Large Biomedical Language Models with BERT, ALBERT and ELECTRA. In Proceedings of the 20th Workshop on Biomedical Language Processing. 221–227.
1734 35 7 Weizhi An, Yuzhi Guo, Yatao Bian, Hehuan Ma, Jinyu Yang, Chunyuan Li, and Junzhou Huang. 2022. MoDNA: motif-oriented pre-training for DNA language model. In Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. 1–5.
1735 35 8 Michael Ashburner, Catherine A Ball, Judith A Blake, David Botstein, Heather Butler, J Michael Cherry, Allan P Davis, Kara Dolinski, Selina S Dwight, Janan T Eppig, et al. 2000. Gene ontology: tool for the unification of biology. Nature genetics 25, 1 (2000), 25–29.
1736 35 9 Timothy F. Truong Jr. and Tristan Bepler. 2023. PoET: A generative model of protein families as sequences-of-sequences. arXiv:2306.06156 [q-bio.QM]
1737 35 10 Žiga Avsec, Vikram Agarwal, Daniel Visentin, Joseph R Ledsam, Agnieszka Grabska-Barwinska, Kyle R Taylor, Yannis Assael, John Jumper, Pushmeet Kohli, and David R Kelley. 2021. Effective gene expression prediction from sequence by integrating long-range interactions. Nature methods 18, 10 (2021), 1196–1203.
1738 35 11 Simon Axelrod and Rafael Gomez-Bombarelli. 2022. GEOM, energy-annotated molecular conformations for property prediction and molecular generation. Scientific Data 9, 1 (2022), 185.
1739 35 12 Sarp Aykent and Tian Xia. 2022. GBPNet: Universal geometric representation learning on protein structures. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 4–14.
1740 35 13 Viraj Bagal, Rishal Aggarwal, PK Vinod, and U Deva Priyakumar. 2021. MolGPT: molecular generation using a transformer-decoder model. Journal of Chemical Information and Modeling 62, 9 (2021), 2064–2076.
1741 35 14 Baichuan. 2023. Baichuan 2: Open Large-scale Language Models. arXiv preprint arXiv:2309.10305 (2023).
1742 35 15 Amos Bairoch and Rolf Apweiler. 2000. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic acids research 28, 1 (2000), 45–48.
1743 35 16 Suryanarayanan Balaji, Rishikesh Magar, Yayati Jadhav, and Amir Barati Farimani. 2023. GPT-MolBERTa: GPT Molecular Features Language Model for molecular property prediction. arXiv:2310.03030 [physics.chem-ph]
1744 35 17 Dominique Beaini, Shenyang Huang, Joao Alex Cunha, Gabriela Moisescu-Pareja, Oleksandr Dymov, Samuel Maddrell-Mander, Callum McLean, Frederik Wenkel, Luis Müller, Jama Hussein Mohamud, et al. 2023. Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets. arXiv preprint arXiv:2310.04292 (2023).
1745 35 18 Iz Beltagy, Kyle Lo, and Arman Cohan. 2019. SciBERT: A Pretrained Language Model for Scientific Text. arXiv:1903.10676 [cs.CL]
1746 35 19 Asma Ben Abacha and Dina Demner-Fushman. 2019. A question-entailment approach to question answering. BMC bioinformatics 20 (2019), 1–23.
1747 35 20 Gonzalo Benegas, Carlos Albors, Alan J Aw, Chengzhong Ye, and Yun S Song. 2023. GPN-MSA: an alignment-based DNA language model for genome-wide variant effect prediction. bioRxiv (2023), 2023–10.
1748 35 21 Gonzalo Benegas, Sanjit Singh Batra, and Yun S Song. 2022. DNA language models are powerful zero-shot predictors of genome-wide variant effects. bioRxiv (2022), 2022–08.
1749 35 22 Gonzalo Benegas, Sanjit Singh Batra, and Yun S Song. 2023. DNA language models are powerful predictors of genome-wide variant effects. Proceedings of the National Academy of Sciences 120, 44 (2023), e2311219120.
1750 35 23 Mostapha Benhenda. 2017. ChemGAN challenge for drug discovery: can AI reproduce natural chemical diversity? (2017).
Tristan Bepler and Bonnie Berger. 2021. Learning the protein language: Evolution, structure, and function. Cell systems 12, 6 (2021), 654–669.
1751 35 24 Bishwaranjan Bhattacharjee, Aashka Trivedi, Masayasu Muraoka, Muthukumaran Ramasubramanian, Takuma Udagawa, Iksha Gurung, Rong Zhang, Bharath Dandala, Rahul Ramachandran, Manil Maskey, et al. 2024. INDUS: Effective and Efficient Language Models for Scientific Applications. arXiv preprint arXiv:2405.10725 (2024).
1753 35 26 Lorenz C Blum and Jean-Louis Reymond. 2009. 970 million druglike small molecules for virtual screening in the chemical universe database GDB-13. Journal of the American Chemical Society 131, 25 (2009), 8732–8733.
1754 35 27 Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. Advances in neural information processing systems 26 (2013).
1755 35 28 Emmanuel Boutet, Damien Lieberherr, Michael Tognolli, Michel Schneider, and Amos Bairoch. 2007. UniProtKB/Swiss-Prot: the manually annotated section of the UniProt KnowledgeBase. In Plant bioinformatics: methods and protocols. Springer, 89–112.
1756 35 29 Emmanuel Boutet, Damien Lieberherr, Michael Tognolli, Michel Schneider, Parit Bansal, Alan J Bridge, Sylvain Poux, Lydie Bougueleret, and Ioannis Xenarios. 2016. UniProtKB/Swiss-Prot, the manually annotated section of the UniProt KnowledgeBase: how to use the entry view. Plant bioinformatics: methods and protocols (2016), 23–54.
1757 35 30 Andres M Bran, Sam Cox, Oliver Schilter, Carlo Baldassari, Andrew D White, and Philippe Schwaller. 2023. ChemCrow: Augmenting large-language models with chemistry tools. arXiv:2304.05376 [physics.chem-ph]
1758 35 31 Andres M Bran and Philippe Schwaller. 2023. Transformers and Large Language Models for Chemistry and Drug Discovery. arXiv:2310.06083 [cs.LG]
1759 35 32 Nadav Brandes, Dan Ofer, Yam Peleg, Nadav Rappoport, and Michal Linial. 2022. ProteinBERT: a universal deep-learning model of protein sequence and function. Bioinformatics 38, 8 (2022), 2102–2110.
1760 35 33 Nathan Brown, Marco Fiscato, Marwin HS Segler, and Alain C Vaucher. 2019. GuacaMol: benchmarking models for de novo molecular design. Journal of chemical information and modeling 59, 3 (2019), 1096–1108.
1761 35 34 Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. Advances in neural information processing systems 33 (2020), 1877–1901.
1762 35 35 Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, et al. 2023. Sparks of artificial general intelligence: Early experiments with GPT-4. arXiv preprint arXiv:2303.12712 (2023).
1763 35 36 Hengxing Cai, Xiaochen Cai, Junhan Chang, Sihang Li, Lin Yao, Changxin Wang, Zhifeng Gao, Yongge Li, Mujie Lin, Shuwen Yang, et al. 2024. SciAssess: Benchmarking LLM proficiency in scientific literature analysis. arXiv preprint arXiv:2403.01976 (2024).
1764 35 37 Hengxing Cai, Xiaochen Cai, Shuwen Yang, Jiankun Wang, Lin Yao, Zhifeng Gao, Junhan Chang, Sihang Li, Mingjun Xu, Changxin Wang, et al. 2024. Uni-SMART: Universal Science Multimodal Analysis and Research Transformer. arXiv preprint arXiv:2403.10301 (2024).
1765 35 38 Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, et al. 2024. InternLM2 technical report. arXiv preprint arXiv:2403.17297 (2024).
1766 35 39 Kathi Canese and Sarah Weis. 2013. PubMed: the bibliographic database. The NCBI handbook 2, 1 (2013).
1767 35 40 Yue Cao, Payel Das, Vijil Chenthamarakshan, Pin-Yu Chen, Igor Melnyk, and Yang Shen. 2021. Fold2seq: A joint sequence (1d)-fold (3d) embedding-based generative model for protein design. In International Conference on Machine Learning. PMLR, 1261–1271.
1768 35 41 Antje Chang, Lisa Jeske, Sandra Ulbrich, Julia Hofmann, Julia Koblitz, Ida Schomburg, Meina Neumann-Schaal, Dieter Jahn, and Dietmar Schomburg. 2021. BRENDA, the ELIXIR core data resource in 2021: new developments and updates. Nucleic acids research 49, D1 (2021), D498–D508.
1769 35 42 Jinho Chang and Jong Chul Ye. 2024. Bidirectional generation of structure and properties through a single molecular foundation model. Nature Communications 15, 1 (2024), 2323.
1770 35 43 Bo Chen, Xingyi Cheng, Yangli-ao Geng, Shen Li, Xin Zeng, Boyan Wang, Jing Gong, Chiming Liu, Aohan Zeng, Yuxiao Dong, et al. 2023. xTrimoPGLM: Unified 100b-scale pre-trained transformer for deciphering the language of protein. bioRxiv (2023), 2023–07.
1771 35 44 Dong Chen, Kaifu Gao, Duc Duy Nguyen, Xin Chen, Yi Jiang, Guo-Wei Wei, and Feng Pan. 2021. Algebraic graph-assisted bidirectional transformers for molecular property prediction. Nature communications 12, 1 (2021), 3521.
1772 35 45 Hanjie Chen, Zhouxiang Fang, Yash Singla, and Mark Dredze. 2024. Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions. arXiv:2402.18060 [cs.CL]
1773 35 46 Ken Chen, Yue Zhou, Maolin Ding, Yu Wang, Zhixiang Ren, and Yuedong Yang. 2023. Self-supervised learning on millions of pre-mRNA sequences improves sequence-based RNA splicing prediction. bioRxiv (2023), 2023–01.
1774 35 47 Linqing Chen, Weilei Wang, Zilong Bai, Peng Xu, Yan Fang, Jie Fang, Wentao Wu, Lizhi Zhou, Ruiji Zhang, Yubin Xia, et al. 2024. PharmGPT: Domain-Specific Large Language Models for Bio-Pharmaceutical and Chemistry. arXiv preprint arXiv:2406.18045 (2024).
1775 35 48 Qiyuan Chen and Cheng Deng. 2023. Bioinfo-Bench: A Simple Benchmark Framework for LLM Bioinformatics Skills Evaluation. bioRxiv (2023), 2023–10.
1776 35 49 Yangyang Chen, Zixu Wang, Lei Wang, Jianmin Wang, Pengyong Li, Dongsheng Cao, Xiangxiang Zeng, Xiucai Ye, and Tetsuya Sakurai. 2023. Deep generative model for drug design from protein target sequence. Journal of Cheminformatics 15, 1 (2023), 38.
1777 35 50 Yirong Chen, Zhenyu Wang, Xiaofen Xing, Huimin Zheng, Zhipei Xu, Kai Fang, Junhong Wang, Sihang Li, Jieling Wu, Qi Liu, and Xiangmin Xu. 2023. BianQue: Balancing the Questioning and Suggestion Ability of Health LLMs with Multi-turn Health Conversations Polished by ChatGPT. arXiv:2310.15896 [cs.CL]
1778 35 51 Yiqun Chen and James Zou. 2023. GenePT: A Simple But Effective Foundation Model for Genes and Cells Built From ChatGPT. bioRxiv (2023).
1779 35 52 Zeming Chen, Alejandro Hernández Cano, Angelika Romanou, Antoine Bonnet, Kyle Matoba, Francesco Salvi, Matteo Pagliardini, Simin Fan, Andreas Köpf, Amirkeivan Mohtashami, Alexandre Sallinen, Alireza Sakhaeirad, Vinitra Swamy, Igor Krawczuk, Deniz Bayazit, Axel Marmet, Syrielle Montariol, Mary-Anne Hartley, Martin Jaggi, and Antoine Bosselut. 2023. MEDITRON-70B: Scaling Medical Pretraining for Large Language Models. arXiv:2311.16079 [cs.CL]
1780 35 53 Jun Cheng, Muhammed Hasan Çelik, Thi Yen Duong Nguyen, Žiga Avsec, and Julien Gagneur. 2019. CAGI 5 splicing challenge: improved exon skipping and intron retention predictions with MMSplice. Human mutation 40, 9 (2019), 1243–1251.
1781 35 54 Jun Cheng, Guido Novati, Joshua Pan, Clare Bycroft, Akvilė Žemgulytė, Taylor Applebaum, Alexander Pritzel, Lai Hong Wong, Michal Zielinski, Tobias Sargeant, et al. 2023. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381, 6664 (2023), eadg7492.
1782 35 55 Wei-Lin Chiang, Zhuohan Li, Zi Lin, Ying Sheng, Zhanghao Wu, Hao Zhang, Lianmin Zheng, Siyuan Zhuang, Yonghao Zhuang, Joseph E. Gonzalez, Ion Stoica, and Eric P. Xing. 2023. Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality. https://vicuna.lmsys.org
1783 35 56 Gayane Chilingaryan, Hovhannes Tamoyan, Ani Tevosyan, Nelly Babayan, Lusine Khondkaryan, Karen Hambardzumyan, Zaven Navoyan, Hrant Khachatrian, and Armen Aghajanyan. 2022. BARTSmiles: Generative masked language models for molecular representations. arXiv preprint arXiv:2211.16349 (2022).
1784 35 57 Seyone Chithrananda, Gabriel Grand, and Bharath Ramsundar. 2020. ChemBERTa: large-scale self-supervised pretraining for molecular property prediction. arXiv preprint arXiv:2010.09885 (2020).
1785 35 58 Kwang-Hwi Cho, Kyoung Tai No, et al. 2023. iupacGPT: IUPAC-based large-scale molecular pre-trained model for property prediction and molecule generation. (2023).
1786 35 59 Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, et al. 2023. Palm: Scaling language modeling with pathways. Journal of Machine Learning Research 24, 240 (2023), 1–113.
1787 35 60 Dimitrios Christofidellis, Giorgio Giannone, Jannis Born, Ole Winther, Teodoro Laino, and Matteo Manica. 2023. Unifying Molecular and Textual Representations via Multi-task Language Modelling. arXiv:2301.12586 [cs.LG]
1788 35 61 Simon Chu and Kathy Wei. 2023. Generative Antibody Design for Complementary Chain Pairing Sequences through Encoder-Decoder Language Model. In NeurIPS 2023 Generative AI and Biology (GenBio) Workshop.
Kevin Clark, Minh-Thang Luong, Quoc V Le, and Christopher D Manning. 2020. ELECTRA: Pre-training text encoders as discriminators rather than generators. arXiv preprint arXiv:2003.10555 (2020).
1789 35 62 Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord. 2018. Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge. arXiv:1803.05457 [cs.AI]
1790 35 63 Emily Clough and Tanya Barrett. 2016. The gene expression omnibus database. Statistical Genomics: Methods and Protocols (2016), 93–110.
1791 35 64 Micaela E Consens, Cameron Dufault, Michael Wainberg, Duncan Forster, Mehran Karimzadeh, Hani Goodarzi, Fabian J Theis, Alan Moses, and Bo Wang. 2023. To Transformers and Beyond: Large Language Models for the Genome. arXiv preprint arXiv:2311.07621 (2023).
1792 35 65 1000 Genomes Project Consortium et al. 2015. A global reference for human genetic variation. Nature 526, 7571 (2015), 68.
1793 35 66 The UniProt Consortium. 2021. UniProt: the universal protein knowledgebase in 2021. Nucleic acids research 49, D1 (2021), D480–D489.
1794 35 67 The UniProt Consortium. 2023. UniProt: the universal protein knowledgebase in 2023. Nucleic Acids Research 51, D1 (2023), D523–D531.
1795 35 68 UniProt Consortium. 2019. UniProt: a worldwide hub of protein knowledge. Nucleic acids research 47, D1 (2019), D506–D515.
1796 35 69 ENCODE Project Consortium. 2012. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 7414 (2012), 57–74.
1799 35 72 Jiyu Cui, Fang Wu, Wen Zhang, Lifeng Yang, Jianbo Hu, Yin Fang, Peng Ye, Qiang Zhang, Xian Suo, Yiming Mo, et al. 2023. Direct prediction of gas adsorption via spatial atom interaction learning. Nature Communications 14, 1 (2023), 7043.
1801 35 74 Zhanbei Cui, Yu Liao, Tongda Xu, and Yan Wang. 2022. Geneformer: Learned gene compression using transformer-based context modeling. arXiv preprint arXiv:2212.08379 (2022).
1802 35 75 Hugo Dalla-Torre, Liam Gonzalez, Javier Mendoza-Revilla, Nicolas Lopez Carranza, Adam Henryk Grzywaczewski, Francesco Oteri, Christian Dallago, Evan Trop, Bernardo P de Almeida, Hassan Sirelkhatim, et al. 2023. The nucleotide transformer: Building and evaluating robust foundation models for human genomics. bioRxiv (2023), 2023–01.
1803 35 76 Christian Dallago, Jody Mou, Kadina E Johnston, Bruce J Wittmann, Nicholas Bhattacharya, Samuel Goldman, Ali Madani, and Kevin K Yang. 2021. FLIP: Benchmark tasks in fitness landscape inference for proteins. bioRxiv (2021), 2021–11.
1805 35 78 Dina Demner-Fushman, Marc D Kohli, Marc B Rosenman, Sonya E Shooshan, Laritza Rodriguez, Sameer Antani, George R Thoma, and Clement J McDonald. 2016. Preparing a collection of radiology examinations for distribution and retrieval. Journal of the American Medical Informatics Association 23, 2 (2016), 304–310.
1806 35 79 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL, Jill Burstein, Christy Doran, and Thamar Solorio (Eds.). Association for Computational Linguistics, 4171–4186.
1807 35 80 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).
1808 35 81 C Domínguez Conde, C Xu, LB Jarvis, DB Rainbow, SB Wells, T Gomes, SK Howlett, O Suchanek, K Polanski, HW King, et al. 2022. Cross-tissue immune cell analysis reveals tissue-specific features in humans. Science 376, 6594 (2022), eabl5197.
1809 35 82 René Dreos, Giovanna Ambrosini, Rouayda Cavin Périer, and Philipp Bucher. 2013. EPD and EPDnew, high-quality promoter resources in the next-generation sequencing era. Nucleic acids research 41, D1 (2013), D157–D164.
1810 35 83 Yuanqi Du, Tianfan Fu, Jimeng Sun, and Shengchao Liu. 2022. MolGenSurvey: A systematic survey in machine learning models for molecule design. arXiv preprint arXiv:2203.14500 (2022).
1811 35 84 Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, and Jie Tang. 2021. GLM: General language model pretraining with autoregressive blank infilling. arXiv preprint arXiv:2103.10360 (2021).
1812 35 85 ChenRui Duan, Zelin Zang, Yongjie Xu, Hang He, Zihan Liu, Zijia Song, Ju-Sheng Zheng, and Stan Z. Li. 2024. FGBERT: Function-Driven Pre-trained Gene Language Model for Metagenomics. arXiv:2402.16901 [q-bio.GN]
1813 35 86 Carl Edwards, Tuan Lai, Kevin Ros, Garrett Honke, Kyunghyun Cho, and Heng Ji. 2022. Translation between Molecules and Natural Language. arXiv:2204.11817 [cs.CL]
1815 35 88 Carl Edwards, ChengXiang Zhai, and Heng Ji. 2021. Text2mol: Cross-modal molecule retrieval with natural language queries. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 595–607.
1816 35 89 Yasha Ektefaie, Andrew Shen, Daria Bykova, Maximillian Marin, Marinka Zitnik, and Maha R Farhat. 2024. Evaluating generalizability of artificial intelligence models for molecular datasets. bioRxiv (2024).
1817 35 90 Ahmed Elnaggar, Michael Heinzinger, Christian Dallago, Ghalia Rehawi, Yu Wang, Llion Jones, Tom Gibbs, Tamas Feher, Christoph Angerer, Martin Steinegger, Debsindhu Bhowmik, and Burkhard Rost. 2022. ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 44, 10 (2022), 7112–7127.
1818 35 91 Janan T Eppig, Cynthia L Smith, Judith A Blake, Martin Ringwald, James A Kadin, Joel E Richardson, and Carol J Bult. 2017. Mouse Genome Informatics (MGI): resources for mining mouse genetic, genomic, and biological data in support of primary and translational research. Systems Genetics: Methods and Protocols (2017), 47–73.
1819 35 92 Nicholas Evans and Stephen C Levinson. 2009. The myth of language universals: Language diversity and its importance for cognitive science. Behavioral and brain sciences 32, 5 (2009), 429–448.
1820 35 93 Benedek Fabian, Thomas Edlich, Héléna Gaspar, Marwin Segler, Joshua Meyers, Marco Fiscato, and Mohamed Ahmed. 2020. Molecular representation learning with language models and domain-relevant auxiliary tasks. arXiv preprint arXiv:2011.13230 (2020).
1821 35 94 Xiaomin Fang, Fan Wang, Lihang Liu, Jingzhou He, Dayong Lin, Yingfei Xiang, Kunrui Zhu, Xiaonan Zhang, Hua Wu, Hui Li, et al. 2023. A method for multiple-sequence-alignment-free protein structure prediction using a protein language model. Nature Machine Intelligence (2023), 1–10.
1822 35 95 Yin Fang, Xiaozhuan Liang, Ningyu Zhang, Kangwei Liu, Rui Huang, Zhuo Chen, Xiaohui Fan, and Huajun Chen. 2023. Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models. arXiv:2306.08018 [q-bio.QM]
1823 35 96 Yin Fang, Ningyu Zhang, Zhuo Chen, Xiaohui Fan, and Huajun Chen. 2023. Domain-agnostic molecular generation with self-feedback. CoRR, abs/2301.11259 (2023).
1824 35 97 Yin Fang, Qiang Zhang, Haihong Yang, Xiang Zhuang, Shumin Deng, Wen Zhang, Ming Qin, Zhuo Chen, Xiaohui Fan, and Huajun Chen. 2022. Molecular contrastive learning with chemical element knowledge graph. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 3968–3976.
1825 35 98 Yin Fang, Qiang Zhang, Ningyu Zhang, Zhuo Chen, Xiang Zhuang, Xin Shao, Xiaohui Fan, and Huajun Chen. 2023. Knowledge graph-enhanced molecular contrastive learning with functional prompt. Nature Machine Intelligence (2023), 1–12.
1826 35 99 Henri A Favre and Warren H Powell. 2013. Nomenclature of organic chemistry: IUPAC recommendations and preferred names 2013. Royal Society of Chemistry.
1827 35 100 EA Feingold, PJ Good, MS Guyer, S Kamholz, L Liefer, K Wetterstrand, FS Collins, TR Gingeras, D Kampa, EA Sekinger, et al. 2004. The ENCODE (ENCyclopedia of DNA elements) project. Science 306, 5696 (2004), 636–640.
1828 35 101 Kehua Feng, Keyan Ding, Weijie Wang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Yu Zhao, Jianhua Yao, Qiang Zhang, and Huajun Chen. 2024. SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models. arXiv preprint arXiv:2406.09098 (2024).
1829 35 102 Noelia Ferruz, Steffen Schmidt, and Birte Höcker. 2022. A deep unsupervised language model for protein design. bioRxiv (2022).
Robert D Finn, Jaina Mistry, Benjamin Schuster-Böckler, Sam Griffiths-Jones, Volker Hollich, Timo Lassmann, Simon Moxon, Mhairi Marshall, Ajay Khanna, Richard Durbin, et al. 2006. Pfam: clans, web tools and services. Nucleic acids research 34, suppl_1 (2006), D247–D251.
1830 35 103 Veniamin Fishman, Yuri Kuratov, Maxim Petrov, Aleksei Shmelev, Denis Shepelin, Nikolay Chekanov, Olga Kardymon, and Mikhail Burtsev. 2023. GENA-LM: A Family of Open-Source Foundational Models for Long DNA Sequences. bioRxiv (2023), 2023–06.
1831 35 104 Mary Forehand. 2010. Bloom’s taxonomy. Emerging perspectives on learning, teaching, and technology 41, 4 (2010), 47–56.
1832 35 105 Oscar Franzén, Li-Ming Gan, and Johan LM Björkegren. 2019. PanglaoDB: a web server for exploration of mouse and human single-cell RNA sequencing data. Database 2019 (2019), baz046.
1833 35 106 Nathan C Frey, Daniel Berenberg, Karina Zadorozhny, Joseph Kleinhenz, Julien Lafrance-Vanasse, Isidro Hotzel, Yan Wu, Stephen Ra, Richard Bonneau, Kyunghyun Cho, et al. 2023. Protein discovery with discrete walk-jump sampling. arXiv preprint arXiv:2306.12360 (2023).
1834 35 107 Dennis Gankin, Alexander Karollus, Martin Grosshauser, Kristian Klemon, Johannes Hingerl, and Julien Gagneur. 2023. Species-aware DNA language modeling. bioRxiv (2023), 2023–01.
1835 35 108 Bowen Gao, Bo Qiang, Haichuan Tan, Minsi Ren, Yinjun Jia, Minsi Lu, Jingjing Liu, Weiying Ma, and Yanyan Lan. 2023. DrugCLIP: Contrastive Protein-Molecule Representation Learning for Virtual Screening. arXiv:2310.06367 [cs.LG]
1836 35 109 Zijing Gao, Qiao Liu, Wanwen Zeng, Wing H Wong, and Rui Jiang. 2023. EpiGePT: a Pretrained Transformer model for epigenomics. bioRxiv (2023), 2023–07.
1837 35 110 Iker García-Ferrero, Rodrigo Agerri, Aitziber Atutxa Salazar, Elena Cabrio, Iker de la Iglesia, Alberto Lavelli, Bernardo Magnini, Benjamin Molinet, Johana Ramirez-Romero, German Rigau, Jose Maria Villa-Gonzalez, Serena Villata, and Andrea Zaninello. 2024. Medical mT5: An Open-Source Multilingual Text-to-Text LLM for The Medical Domain. arXiv:2404.07613
1838 35 111 Anna Gaulton, Louisa J Bellis, A Patricia Bento, Jon Chambers, Mark Davies, Anne Hersey, Yvonne Light, Shaun McGlinchey, David Michalovich, Bissan Al-Lazikani, et al. 2012. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic acids research 40, D1 (2012), D1100–D1107.
1839 35 112 IUPAC-IUB Joint Commission on Biochemical Nomenclature. 1984. Nomenclature and symbolism for amino acids and peptides. Pure and Applied Chemistry 56 (1984), 595–624.
1840 35 113 Michael K Gilson, Tiqing Liu, Michael Baitaluk, George Nicola, Linda Hwang, and Jenny Chong. 2016. BindingDB in 2015: a public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic acids research 44, D1 (2016), D1045–D1053.
1841 35 114 Yu Gu, Robert Tinn, Hao Cheng, Michael Lucas, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, and Hoifung Poon. 2021. Domain-specific language model pretraining for biomedical natural language processing. ACM Transactions on Computing for Healthcare (HEALTH) 3, 1 (2021), 1–23.
1842 35 115 Zhouhong Gu, Xiaoxuan Zhu, Haoning Ye, Lin Zhang, Jianchen Wang, Sihang Jiang, Zhuozhi Xiong, Zihan Li, Qianyu He, Rui Xu, Wenhao Huang, Zili Wang, Shusen Wang, Weiguo Zheng, Hongwei Feng, and Yanghua Xiao. 2023. Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation. arXiv:2306.05783 [cs.CL]
1843 35 116 Jiang Guo, A Santiago Ibanez-Lopez, Hanyu Gao, Victor Quach, Connor W Coley, Klavs F Jensen, and Regina Barzilay. 2021. Automated chemical reaction extraction from scientific literature. Journal of chemical information and modeling 62, 9 (2021), 2035–2045.
1844 35 117 Yan Guo, Yulin Dai, Hui Yu, Shilin Zhao, David C Samuels, and Yu Shyr. 2017. Improvements and impacts of GRCh38 human reference on high throughput sequencing data analysis. Genomics 109, 2 (2017), 83–90.
1845 35 118 Zhichun Guo, Kehan Guo, Bozhao Nan, Yijun Tian, Roshni G Iyer, Yihong Ma, Olaf Wiest, Xiangliang Zhang, Wei Wang, Chuxu Zhang, et al. 2023. Graph-based molecular representation learning. IJCAI (2023).
1846 35 119 Tanishq Gupta, Mohd Zaki, N. M. Anoop Krishnan, and Mausam. 2022. MatSciBERT: A Materials Domain Language Model for Text Mining and Information Extraction. NPJ Computational Materials 8, 1 (May 2022), 102.
1847 35 120 Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, and Noah A. Smith. 2020. Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks. In Proceedings of ACL.
1848 35 121 Janna Hastings, Gareth Owen, Adriano Dekker, Marcus Ennis, Namrata Kale, Venkatesh Muthukrishnan, Steve Turner, Neil Swainston, Pedro Mendes, and Christoph Steinbeck. 2016. ChEBI in 2016: Improved services and an expanding collection of metabolites. Nucleic acids research 44, D1 (2016), D1214–D1219.
1849 35 122 Thomas Hayes, Roshan Rao, Halil Akin, Nicholas J. Sofroniew, Deniz Oktay, Zeming Lin, Robert Verkuil, Vincent Q. Tran, Jonathan Deaton, Marius Wiggert, Rohil Badkundri, Irhum Shafkat, Jun Gong, Alexander Derry, Raul S. Molina, Neil Thomas, Yousuf Khan, Chetan Mishra, Carolyn Kim, Liam J. Bartie, Matthew Nemeth, Patrick D. Hsu, Tom Sercu, Salvatore Candido, and Alexander Rives. 2024. Simulating 500 million years of evolution with a language model. bioRxiv: 2024.07.01.600583 (2024).
1850 35 123 Liang He, Shizhuo Zhang, Lijun Wu, Huanhuan Xia, Fusong Ju, He Zhang, Siyuan Liu, Yingce Xia, Jianwei Zhu, Pan Deng, Bin Shao, Tao Qin, and Tie-Yan Liu. 2021. Pre-training Co-evolutionary Protein Representation via A Pairwise Masked Language Model. CoRR abs/2110.15527 (2021).
1851 35 124 Shanshan He, Ruchir Bhatt, Brian Birditt, Carl Brown, Emily Brown, Kan Chantranuvatana, Patrick Danaher, Dwayne Dunaway, Brian Filanoski, Ryan G Garrison, et al. 2021. High-plex multiomic analysis in FFPE tissue at single-cellular and subcellular resolution by spatial molecular imaging. bioRxiv (2021), 2021–11.
1852 35 125 Xuehai He, Shu Chen, Zeqian Ju, Xiangyu Dong, Hongchao Fang, Sicheng Wang, Yue Yang, Jiaqi Zeng, Ruisi Zhang, Ruoyu Zhang, Meng Zhou, Penghui Zhu, and Pengtao Xie. 2020. MedDialog: Two Large-scale Medical Dialogue Datasets. arXiv:2004.03329 [cs.LG]
1853 35 126 Michael Heinzinger, Konstantin Weissenow, Joaquin Gomez Sanchez, Adrian Henkel, Martin Steinegger, and Burkhard Rost. 2023. ProstT5: Bilingual Language Model for Protein Sequence and Structure. bioRxiv (2023).
1854 35 127 Stephen Heller, Alan McNaught, Stephen Stein, Dmitrii Tchekhovskoi, and Igor Pletnev. 2013. InChI - the worldwide chemical structure identifier standard. Journal of Cheminformatics 5, 1 (2013), 1–9.
1855 35 128 Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt. 2021. Measuring Massive Multitask Language Understanding. arXiv:2009.03300 [cs.CY]
1856 35 129 Daniel Hesslow, Niccoló Zanichelli, Pascal Notin, Iacopo Poli, and Debora Marks. 2022. RITA: a Study on Scaling Up Generative Protein Sequence Models. arXiv (May 2022), arXiv:2205.05789.
1857 35 130 David Hiscock and Chris Upton. 2000. Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes. Bioinformatics 16, 5 (2000), 484–485.
1858 35 131 Shion Honda, Shoi Shi, and Hiroki R Ueda. 2019. SMILES transformer: Pre-trained molecular fingerprint for low data drug discovery. arXiv preprint arXiv:1911.04738 (2019).
1859 35 132 Zhi Hong, Aswathy Ajith, Gregory Pauloski, Eamon Duede, Kyle Chard, and Ian Foster. 2023. The Diminishing Returns of Masked Language Models to Science. arXiv:2205.11342 [cs.CL]
1860 35 133 3 Marc Horlacher, Giulia Cantini, Julian Hesse, Patrick Schinke, Nicolas Goedert, Shubhankar Londhe, Lambert Moyon, and Annalisa Marsico. 2023. A systematic benchmark of machine learning methods for protein–RNA interaction prediction. Briefings in Bioinformatics 24, 5 (2023), bbad307.
1861 35 134 3 Wenpin Hou and Zhicheng Ji. 2024. Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis. Nature Methods (2024), 1–4.
1862 35 135 3 Bozhen Hu, Jun Xia, Jiangbin Zheng, Cheng Tan, Yufei Huang, Yongjie Xu, and Stan Z Li. 2022. Protein language models and structure prediction: Connection and progression. arXiv preprint arXiv:2211.16742 (2022).
1863 35 136 3 Weihua Hu, Matthias Fey, Hongyu Ren, Maho Nakata, Yuxiao Dong, and Jure Leskovec. 2023. Ogb-lsc: A large-scale challenge for machine learning on graphs. arXiv preprint arXiv:2103.09430 (2023).
1864 35 137 3 Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020. Open graph benchmark: Datasets for machine learning on graphs. Advances in neural information processing systems 33 (2020), 22118–22133.
1865 35 138 3 Kexin Huang, Jaan Altosaar, and Rajesh Ranganath. 2020. ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission. arXiv:1904.05342 [cs.CL]
1866 35 139 3 Yuzhen Huang, Yuzhuo Bai, Zhihao Zhu, Junlei Zhang, Jinghan Zhang, Tangjun Su, Junteng Liu, Chuancheng Lv, Yikai Zhang, Jiayi Lei, Yao Fu, Maosong Sun, and Junxian He. 2023. C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models. In Advances in Neural Information Processing Systems.
1867 35 140 3 Yunha Hwang, Andre L Cornman, Elizabeth H Kellogg, Sergey Ovchinnikov, and Peter R Girguis. 2024. Genomic language model predicts protein co-regulation and function. Nature communications 15, 1 (2024), 2880.
1868 35 141 3 John J Irwin, Teague Sterling, Michael M Mysinger, Erin S Bolstad, and Ryan G Coleman. 2012. ZINC: a free tool to discover chemistry for biology. Journal of chemical information and modeling 52, 7 (2012), 1757–1768.
1869 35 142 3 John J Irwin, Khanh G Tang, Jennifer Young, Chinzorig Dandarchuluun, Benjamin R Wong, Munkhzul Khurelbaatar, Yurii S Moroz, John Mayfield, and Roger A Sayle. 2020. ZINC20—a free ultralarge-scale chemical database for ligand discovery. Journal of chemical information and modeling 60, 12 (2020), 6065–6073.
1870 35 143 4 Ross Irwin, Spyridon Dimitriadis, Jiazhen He, and Esben Jannik Bjerrum. 2022. Chemformer: a pre-trained transformer for computational chemistry. Machine Learning: Science and Technology 3, 1 (2022), 015022.
1871 35 144 4 Nikita Janakarajan, Tim Erdmann, Sarath Swaminathan, Teodoro Laino, and Jannis Born. 2023. Language models in molecular discovery. arXiv preprint arXiv:2309.16235 (2023).
1872 35 145 4 Bijay Jassal, Lisa Matthews, Guilherme Viteri, Chuqiao Gong, Pascual Lorente, Antonio Fabregat, Konstantinos Sidiropoulos, Justin Cook, Marc Gillespie, Robin Haw, et al. 2020. The reactome pathway knowledgebase. Nucleic acids research 48, D1 (2020), D498–D503.
1873 35 146 4 Yanrong Ji, Zhihan Zhou, Han Liu, and Ramana V Davuluri. 2021. DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome. Bioinformatics 37, 15 (2021), 2112–2120.
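A quick illustration of the k-mer tokenization idea behind DNABERT: the genome is split into overlapping k-mers before BERT-style masked pre-training. This is a minimal sketch in plain Python, not DNABERT's actual code; k = 6 is just the commonly reported setting.

```python
def kmer_tokenize(seq: str, k: int = 6) -> list[str]:
    """Split a DNA sequence into overlapping k-mer tokens (DNABERT-style input)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

# A 10-bp sequence yields 5 overlapping 6-mers.
print(kmer_tokenize("ACGTACGTAC"))
# ['ACGTAC', 'CGTACG', 'GTACGT', 'TACGTA', 'ACGTAC']
```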
1874 35 147 4 Minghui Jiang, Ying Xu, and Binhai Zhu. 2008. Protein structure–structure alignment with discrete Fréchet distance. Journal of bioinformatics and computational biology 6, 01 (2008), 51–64.
1875 35 148 4 Di Jin, Eileen Pan, Nassim Oufattole, Wei-Hung Weng, Hanyi Fang, and Peter Szolovits. 2020. What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams. arXiv:2009.13081
1876 35 149 4 Junru Jin, Yingying Yu, Ruheng Wang, Xin Zeng, Chao Pang, Yi Jiang, Zhongshen Li, Yutong Dai, Ran Su, Quan Zou, et al. 2022. iDNA-ABF: multi-scale deep biological language learning model for the interpretable prediction of DNA methylations. Genome biology 23, 1 (2022), 1–23.
1877 35 150 4 Qiao Jin, Bhuwan Dhingra, William W. Cohen, and Xinghua Lu. 2019. Probing Biomedical Embeddings from Language Models. arXiv:1904.02181 [cs.CL]
1878 35 151 4 Qiao Jin, Bhuwan Dhingra, Zhengping Liu, William W. Cohen, and Xinghua Lu. 2019. PubMedQA: A Dataset for Biomedical Research Question Answering. arXiv:1909.06146 [cs.CL]
1879 35 152 4 Johannes Welbl, Nelson F. Liu, and Matt Gardner. 2017. Crowdsourcing Multiple Choice Science Questions. arXiv:1707.06209v1.
1880 35 153 5 Alistair Johnson, Tom Pollard, Steven Horng, Leo Anthony Celi, and Roger Mark. 2023. MIMIC-IV-Note: Deidentified free-text clinical notes (version 2.2). PhysioNet. https://doi.org/10.13026/1n74-ne17
1881 35 154 5 Alistair E. W. Johnson, Lucas Bulgarelli, Lu Shen, Alvin Gayles, Ayad Shammout, Steven Horng, Tom J. Pollard, Benjamin Moody, Brian Gow, Li-wei H. Lehman, Leo Anthony Celi, and Roger G. Mark. 2023. MIMIC-IV, a freely accessible electronic health record dataset. Scientific Data 10 (2023). https://api.semanticscholar.org/CorpusID:255439889
1882 35 155 5 Alistair E. W. Johnson, Tom J. Pollard, Seth J. Berkowitz, Nathaniel R. Greenbaum, Matthew P. Lungren, Chih-ying Deng, Roger G. Mark, and Steven Horng. 2019. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Scientific Data 6 (2019). https://api.semanticscholar.org/CorpusID:209342303
1883 35 156 5 Alistair E. W. Johnson, Tom J. Pollard, Lu Shen, Li-wei H. Lehman, Mengling Feng, Mohammad Mahdi Ghassemi, Benjamin Moody, Peter Szolovits, Leo Anthony Celi, and Roger G. Mark. 2016. MIMIC-III, a freely accessible critical care database. Scientific Data 3 (2016). https://api.semanticscholar.org/CorpusID:33285731
1884 35 157 5 John Jumper, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn Tunyasuvunakool, Russ Bates, Augustin Žídek, Anna Potapenko, et al. 2021. Highly accurate protein structure prediction with AlphaFold. Nature 596, 7873 (2021), 583–589.
1885 35 158 5 Donna Karolchik, Robert Baertsch, Mark Diekhans, Terrence S Furey, Angie Hinrichs, YT Lu, Krishna M Roskin, Matt Schwartz, Charles W Sugnet, Daryl J Thomas, et al. 2003. The UCSC genome browser database. Nucleic acids research 31, 1 (2003), 51–54.
1886 35 159 5 Pavel Karpov, Guillaume Godin, and Igor V Tetko. 2019. A transformer model for retrosynthesis. In International Conference on Artificial Neural Networks. Springer, 817–830.
1887 35 160 5 Panagiotis Katsonis and Olivier Lichtarge. 2019. CAGI5: Objective performance assessments of predictions based on the Evolutionary Action equation. Human mutation 40, 9 (2019), 1436–1454.
1888 35 161 5 Eunji Kim, Dongseon Lee, Youngchun Kwon, Min Sik Park, and Youn-Suk Choi. 2021. Valid, plausible, and diverse retrosynthesis using tied two-way transformers with latent variables. Journal of Chemical Information and Modeling 61, 1 (2021), 123–133.
1889 35 162 5 Hyunjae Kim, Hyeon Hwang, Jiwoo Lee, Sihyeon Park, Dain Kim, Taewhoo Lee, Chanwoong Yoon, Jiwoong Sohn, Donghee Choi, and Jaewoo Kang. 2024. Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks. arXiv:2404.00376 [cs.CL]
1890 35 163 6 Hyunseung Kim, Jonggeol Na, and Won Bo Lee. 2021. Generative chemical transformer: neural machine learning of molecular geometric structures from chemical language via attention. Journal of chemical information and modeling 61, 12 (2021), 5804–5814.
1891 35 164 6 Sunghwan Kim, Jie Chen, Tiejun Cheng, Asta Gindulyte, Jia He, Siqian He, Qingliang Li, Benjamin A Shoemaker, Paul A Thiessen, Bo Yu, et al. 2019. PubChem 2019 update: improved access to chemical data. Nucleic acids research 47, D1 (2019), D1102–D1109.
1892 35 165 6 Sunghwan Kim, Jie Chen, Tiejun Cheng, Asta Gindulyte, Jia He, Siqian He, Qingliang Li, Benjamin A Shoemaker, Paul A Thiessen, Bo Yu, et al. 2021. PubChem in 2021: new data content and improved web interfaces. Nucleic acids research 49, D1 (2021), D1388–D1395.
1893 35 166 6 Sunghwan Kim, Jie Chen, Tiejun Cheng, Asta Gindulyte, Jia He, Siqian He, Qingliang Li, Benjamin A Shoemaker, Paul A Thiessen, Bo Yu, et al. 2023. PubChem 2023 update. Nucleic acids research 51, D1 (2023), D1373–D1380.
1894 35 167 6 Sunghwan Kim, Paul A Thiessen, Evan E Bolton, Jie Chen, Gang Fu, Asta Gindulyte, Lianyi Han, Jane He, Siqian He, Benjamin A Shoemaker, et al. 2016. PubChem substance and compound databases. Nucleic acids research 44, D1 (2016), D1202–D1213.
1895 35 168 6 David R Krathwohl. 2002. A revision of Bloom’s taxonomy: An overview. Theory into practice 41, 4 (2002), 212–218.
1896 35 169 6 Mario Krenn, Florian Häse, AkshatKumar Nigam, Pascal Friederich, and Alan Aspuru-Guzik. 2020. Self-referencing embedded strings (SELFIES): A 100% robust molecular string representation. Machine Learning: Science and Technology 1, 4 (2020), 045024.
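SELFIES is built so that any string the grammar produces decodes to a syntactically valid molecule, which makes it attractive for generative models. A minimal round-trip sketch, assuming the authors' open-source `selfies` package (`pip install selfies`) is installed; the aspirin SMILES is an arbitrary example.

```python
import selfies as sf

smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"  # aspirin
encoded = sf.encoder(smiles)          # SMILES -> SELFIES token string
decoded = sf.decoder(encoded)         # SELFIES -> SMILES

print(encoded)  # e.g. '[C][C][=Branch1]...' (token string)
print(decoded)  # a SMILES string for the same molecule
```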
1897 35 170 6 Jens Kringelum, Sonny Kim Kjaerulff, Søren Brunak, Ole Lund, Tudor I Oprea, and Olivier Taboureau. 2016. ChemProt-3.0: a global chemical biology diseases mapping. Database 2016 (2016), bav123.
1898 35 171 6 Andriy Kryshtafovych, Torsten Schwede, Maya Topf, Krzysztof Fidelis, and John Moult. 2021. Critical assessment of methods of protein structure prediction (CASP)—Round XIV. Proteins: Structure, Function, and Bioinformatics 89, 12 (2021), 1607–1617.
1899 35 172 6 Yanis Labrak, Adrien Bazoge, Emmanuel Morin, Pierre-Antoine Gourraud, Mickael Rouvier, and Richard Dufour. 2024. Biomistral: A collection of open-source pretrained large language models for medical domains. arXiv preprint arXiv:2402.10373 (2024).
1900 35 173 7 Alexander Lachmann, Denis Torre, Alexandra B Keenan, Kathleen M Jagodnik, Hoyjin J Lee, Lily Wang, Moshe C Silverstein, and Avi Ma’ayan. 2018. Massive mining of publicly available RNA-seq data from human and mouse. Nature communications 9, 1 (2018), 1366.
1901 35 174 7 Philippe Lamesch, Tanya Z Berardini, Donghui Li, David Swarbreck, Christopher Wilks, Rajkumar Sasidharan, Robert Muller, Kate Dreher, Debbie L Alexander, Margarita Garcia-Hernandez, et al. 2012. The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic acids research 40, D1 (2012), D1202–D1210.
1902 35 175 7 Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. 2019. Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019).
1903 35 176 7 Ursula K Le Guin. 2004. The Wave in the Mind: Talks and Essays on the Writer, the Reader, and the Imagination. Shambhala Publications.
Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. 2020. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36, 4 (2020), 1234–1240.
1904 35 177 7 Minji Lee, Luiz Felipe Vecchietti, Hyunkyu Jung, Hyunjoo Ro, Meeyoung Cha, and Ho Min Kim. 2022. Protein sequence design in a latent space via model-based reinforcement learning. (2022).
1905 35 178 7 Youhan Lee, Hasun Yu, Jaemyung Lee, and Jaehoon Kim. 2023. Pre-training Sequence, Structure, and Surface Features for Comprehensive Protein Representation Learning. In The Twelfth International Conference on Learning Representations.
1906 35 179 7 Daniel Levine, Sacha Lévy, Syed Asad Rizvi, Nazreen Pallikkavaliyaveetil, Xingyu Chen, David Zhang, Sina Ghadermarzi, Ruiming Wu, Zihe Zheng, Ivan Vrkic, et al. 2023. Cell2sentence: Teaching large language models the language of biology. bioRxiv (2023), 2023–09.
1907 35 180 7 Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, and Luke Zettlemoyer. 2020. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020. 7871–7880.
1908 35 181 7 Han Li, Dan Zhao, and Jianyang Zeng. 2022. KPGT: knowledge-guided pre-training of graph transformer for molecular property prediction. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 857–867.
1909 35 182 8 Hong-Liang Li, Yi-He Pang, and Bin Liu. 2021. BioSeq-BLM: a platform for analyzing DNA, RNA and protein sequences based on biological language models. Nucleic acids research 49, 22 (2021), e129–e129.
1910 35 183 8 Juncai Li and Xiaofei Jiang. 2021. Mol-BERT: an effective molecular representation with BERT for molecular property prediction. Wireless Communications and Mobile Computing 2021 (2021), 1–7.
1911 35 184 8 Junnan Li, Dongxu Li, Silvio Savarese, and Steven Hoi. 2023. BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. arXiv preprint arXiv:2301.12597 (2023).
1912 35 185 8 Jiatong Li, Yunqing Liu, Wenqi Fan, Xiao-Yong Wei, Hui Liu, Jiliang Tang, and Qing Li. 2023. Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective. arXiv:2306.06615 [cs.CL]
1913 35 186 8 Jiao Li, Yueping Sun, Robin J Johnson, Daniela Sciaky, Chih-Hsuan Wei, Robert Leaman, Allan Peter Davis, Carolyn J Mattingly, Thomas C Wiegers, and Zhiyong Lu. 2016. BioCreative V CDR task corpus: a resource for chemical disease relation extraction. Database 2016 (2016).
1914 35 187 8 Jianquan Li, Xidong Wang, Xiangbo Wu, Zhiyi Zhang, Xiaolong Xu, Jie Fu, Prayag Tiwari, Xiang Wan, and Benyou Wang. 2023. Huatuo-26M, a Large-scale Chinese Medical QA Dataset. arXiv:2305.01526 [cs.CL]
1915 35 188 8 Jiahao Li, Zhourun Wu, Wenhao Lin, Jiawei Luo, Jun Zhang, Qingcai Chen, and Junjie Chen. 2023. iEnhancer-ELM: improve enhancer identification by extracting position-related multiscale contextual information based on enhancer language models. Bioinformatics Advances 3, 1 (2023), vbad043.
1916 35 189 8 Yuesen Li, Chengyi Gao, Xin Song, Xiangyu Wang, Yungang Xu, and Suxia Han. 2023. Druggpt: A gpt-based strategy for designing potential ligands targeting specific proteins. bioRxiv (2023), 2023–06.
1917 35 190 8 Yunxiang Li, Zihan Li, Kai Zhang, Ruilong Dan, Steve Jiang, and You Zhang. 2023. ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge. arXiv:2303.14070 [cs.CL]
1918 35 191 8 Zhongshen Li, Junru Jin, Wentao Long, and Leyi Wei. 2023. PLPMpro: Enhancing promoter sequence prediction with prompt-learning based pre-trained language model. Computers in Biology and Medicine 164 (2023), 107260.
1919 35 192 9 Youwei Liang, Ruiyi Zhang, Li Zhang, and Pengtao Xie. 2023. DrugChat: towards enabling ChatGPT-like capabilities on drug molecule graphs. arXiv preprint arXiv:2309.03907 (2023).
1920 35 193 9 Chin-Yew Lin. 2004. Rouge: A package for automatic evaluation of summaries. In Text summarization branches out. 74–81.
1921 35 194 9 Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Sal Candido, et al. 2022. Language models of protein sequences at the scale of evolution enable accurate structure prediction. BioRxiv 2022 (2022), 500902.
1922 35 195 9 Huaqing Liu, Shuxian Zhou, Peiyi Chen, Jiahui Liu, Ku-Geng Huo, and Lanqing Han. 2024. Exploring Genomic Large Language Models: Bridging the Gap between Natural Language and Gene Sequences. bioRxiv (2024), 2024–02.
1923 35 196 9 June M. Liu, Donghao Li, He Cao, Tianhe Ren, Zeyi Liao, and Jiamin Wu. 2023. ChatCounselor: A Large Language Models for Mental Health Support. arXiv:2309.15461 [cs.CL]
1924 35 197 9 Pengfei Liu, Yiming Ren, and Zhixiang Ren. 2023. GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text. arXiv:2308.06911 [cs.LG]
1925 35 198 9 Shengchao Liu, Weili Nie, Chengpeng Wang, Jiarui Lu, Zhuoran Qiao, Ling Liu, Jian Tang, Chaowei Xiao, and Anima Anandkumar. 2023. Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing. arXiv:2212.10789 [cs.LG]
1926 35 199 9 Shengchao Liu, Hanchen Wang, Weiyang Liu, Joan Lasenby, Hongyu Guo, and Jian Tang. 2021. Pre-training molecular graph representation with 3d geometry. arXiv preprint arXiv:2110.07728 (2021).
1927 35 200 9 Shengchao Liu, Jiongxiao Wang, Yijin Yang, Chengpeng Wang, Ling Liu, Hongyu Guo, and Chaowei Xiao. 2023. ChatGPT-powered Conversational Drug Editing Using Retrieval and Domain Feedback. arXiv:2305.18090 [q-bio.BM]
1928 35 201 9 Shengchao Liu, Yutao Zhu, Jiarui Lu, Zhao Xu, Weili Nie, Anthony Gitter, Chaowei Xiao, Jian Tang, Hongyu Guo, and Anima Anandkumar. 2023. A Text-guided Protein Design Framework. arXiv:2302.04611 [cs.LG]
1929 35 202 0 Xianggen Liu, Yan Guo, Haoran Li, Jin Liu, Shudong Huang, Bowen Ke, and Jiancheng Lv. 2024. DrugLLM: Open Large Language Model for Few-shot Molecule Generation. arXiv preprint arXiv:2405.06690 (2024).
1930 35 203 0 Yuyan Liu, Sirui Ding, Sheng Zhou, Wenqi Fan, and Qiaoyu Tan. 2024. MolecularGPT: Open Large Language Model (LLM) for Few-Shot Molecular Property Prediction. arXiv preprint arXiv:2406.12950 (2024).
1931 35 204 0 Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019).
1932 35 205 0 Yunwu Liu, Ruisheng Zhang, Tongfeng Li, Jing Jiang, Jun Ma, and Ping Wang. 2023. MolRoPE-BERT: An enhanced molecular representation with Rotary Position Embedding for molecular property prediction. Journal of Molecular Graphics and Modelling 118 (2023), 108344.
1933 35 206 0 Zicheng Liu, Jiahui Li, Siyuan Li, Zelin Zang, Cheng Tan, Yufei Huang, Yajing Bai, and Stan Z Li. 2024. GenBench: A Benchmarking Suite for Systematic Evaluation of Genomic Foundation Models. arXiv preprint arXiv:2406.01627 (2024).
1934 35 207 0 Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. 2021. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF international conference on computer vision. 10012–10022.
1935 35 208 0 Zhiyuan Liu, An Zhang, Hao Fei, Enzhi Zhang, Xiang Wang, Kenji Kawaguchi, and Tat-Seng Chua. 2024. ProtT3: Protein-to-Text Generation for Text-based Protein Understanding. arXiv preprint arXiv:2405.12564 (2024).
1936 35 209 0 Zequn Liu, Wei Zhang, Yingce Xia, Lijun Wu, Shufang Xie, Tao Qin, Ming Zhang, and Tie-Yan Liu. 2023. MolXPT: Wrapping Molecules with Text for Generative Pre-training. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. 1606–1616.
1937 35 210 0 Kyle Lo, Lucy Lu Wang, Mark Neumann, Rodney Kinney, and Dan S. Weld. 2020. S2ORC: The Semantic Scholar Open Research Corpus. arXiv:1911.02782 [cs.CL]
1938 35 211 0 Loredana Lo Conte, Bart Ailey, Tim JP Hubbard, Steven E Brenner, Alexey G Murzin, and Cyrus Chothia. 2000. SCOP: a structural classification of proteins database. Nucleic acids research 28, 1 (2000), 257–259.
1939 35 212 1 Daniel Mark Lowe. 2012. Extraction of chemical structures and reactions from the literature. Ph. D. Dissertation. University of Cambridge.
1940 35 213 1 Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, and Ashwin Kalyan. 2022. Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering. arXiv:2209.09513 [cs.CL]
1941 35 214 1 Xingyu Lu, He Cao, Zijing Liu, Shengyuan Bai, Leqing Chen, Yuan Yao, Hai-Tao Zheng, and Yu Li. 2024. MoleculeQA: A Dataset to Evaluate Factual Accuracy in Molecular Comprehension. arXiv preprint arXiv:2403.08192 (2024).
1942 35 215 1 Hanyu Luo, Cheng Chen, Wenyu Shan, Pingjian Ding, and Lingyun Luo. 2022. iEnhancer-BERT: A novel transfer learning architecture based on DNA-Language model for identifying enhancers and their strength. In International Conference on Intelligent Computing. Springer, 153–165.
1943 35 216 1 Hanyu Luo, Wenyu Shan, Cheng Chen, Pingjian Ding, and Lingyun Luo. 2023. Improving language model of human genome for DNA–protein binding prediction based on task-specific pre-training. Interdisciplinary Sciences: Computational Life Sciences 15, 1 (2023), 32–43.
1944 35 217 1 Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon, and Tie-Yan Liu. 2022. BioGPT: generative pre-trained transformer for biomedical text generation and mining. Briefings in Bioinformatics 23, 6 (Sept. 2022).
1945 35 218 1 Yizhen Luo, Kai Yang, Massimo Hong, Xing Yi Liu, and Zaiqing Nie. 2023. MolFM: A Multimodal Molecular Foundation Model. arXiv:2307.09484 [q-bio.BM]
1946 35 219 1 Yizhen Luo, Jiahuan Zhang, Siqi Fan, Kai Yang, Yushuai Wu, Mu Qiao, and Zaiqing Nie. 2023. Biomedgpt: Open multimodal generative pre-trained transformer for biomedicine. arXiv preprint arXiv:2308.09442 (2023).
1947 35 220 1 Rachel K. Luu and Markus J. Buehler. 2023. BioinspiredLLM: Conversational Large Language Model for the Mechanics of Biological and Bio-inspired Materials. arXiv:2309.08788 [cond-mat.mtrl-sci]
1948 35 221 1 Liuzhenghao Lv, Zongying Lin, Hao Li, Yuyang Liu, Jiaxi Cui, Calvin Yu-Chian Chen, Li Yuan, and Yonghong Tian. 2024. ProLLaMA: A Protein Large Language Model for Multi-Task Protein Language Processing. arXiv preprint arXiv:2402.16445 (2024).
1949 35 222 2 Chang Ma, Haiteng Zhao, Lin Zheng, Jiayi Xin, Qintong Li, Lijun Wu, Zhihong Deng, Yang Lu, Qi Liu, and Lingpeng Kong. 2023. Retrieved Sequence Augmentation for Protein Representation Learning. bioRxiv (2023), 2023–02.
1950 35 223 2 Ali Madani, Bryan McCann, Nikhil Naik, Nitish Shirish Keskar, Namrata Anand, Raphael R. Eguchi, Po-Ssu Huang, and Richard Socher. 2020. ProGen: Language Modeling for Protein Generation. arXiv e-prints (March 2020), 2004.03497.
1951 35 224 2 Aditya Malusare, Harish Kothandaraman, Dipesh Tamboli, Nadia A. Lanman, and Vaneet Aggarwal. 2023. Understanding the Natural Language of DNA using Encoder-Decoder Foundation Models with Byte-level Precision. arXiv:2311.02333 [cs.LG]
1952 35 225 2 Vipul Mann and Venkat Venkatasubramanian. 2021. Predicting chemical reaction outcomes: A grammar ontology-based transformer framework. AIChE Journal 67, 3 (2021), e17190.
1953 35 226 2 Sanaa Mansoor, Minkyung Baek, Umesh Madan, and Eric Horvitz. 2021. Toward more general embeddings for protein design: Harnessing joint representations of sequence and structure. bioRxiv (2021), 2021–09.
1954 35 227 2 Kelong Mao, Xi Xiao, Tingyang Xu, Yu Rong, Junzhou Huang, and Peilin Zhao. 2021. Molecular graph enhanced transformer for retrosynthesis prediction. Neurocomputing 457 (2021), 193–202.
1955 35 228 2 Valerio Mariani, Marco Biasini, Alessandro Barbato, and Torsten Schwede. 2013. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics 29, 21 (2013), 2722–2728.
1956 35 229 2 Fergal J Martin, M Ridwan Amode, Alisha Aneja, Olanrewaju Austine-Orimoloye, Andrey G Azov, If Barnes, Arne Becker, Ruth Bennett, Andrew Berry, Jyothish Bhai, et al. 2023. Ensembl 2023. Nucleic acids research 51, D1 (2023), D933–D941.
1957 35 230 2 Łukasz Maziarka, Tomasz Danel, Sławomir Mucha, Krzysztof Rataj, Jacek Tabor, and Stanisław Jastrzębski. 2020. Molecule attention transformer. arXiv preprint arXiv:2002.08264 (2020).
1958 35 231 2 Łukasz Maziarka, Dawid Majchrowski, Tomasz Danel, Piotr Gaiński, Jacek Tabor, Igor Podolak, Paweł Morkisz, and Stanisław Jastrzębski. 2024. Relative molecule self-attention transformer. Journal of Cheminformatics 16, 1 (2024), 3.
1959 35 232 3 Eyal Mazuz, Guy Shtar, Bracha Shapira, and Lior Rokach. 2023. Molecule generation using transformers and policy gradient reinforcement learning. Scientific Reports 13, 1 (2023), 8799.
1960 35 233 3 Andrew G. McDonald, Sinéad Boyce, and Keith F. Tipton. 2008. ExplorEnz: the primary source of the IUBMB enzyme list. Nucleic Acids Research 37, suppl_1 (09 2008), 593–597.
1961 35 234 3 Colin Megill, Bruce Martin, Charlotte Weaver, Sidney Bell, Lia Prins, Seve Badajoz, Brian McCandless, Angela Oliveira Pisco, Marcus Kinsella, Fiona Griffin, et al. 2021. Cellxgene: a performant, scalable exploration platform for high dimensional sparse matrices. bioRxiv (2021), 2021–04.
1962 35 235 3 Joshua Meier, Roshan Rao, Robert Verkuil, Jason Liu, Tom Sercu, and Alexander Rives. 2021. Language models enable zero-shot prediction of the effects of mutations on protein function. bioRxiv (2021).
1963 35 236 3 Arindam Mitra, Luciano Del Corro, Shweti Mahajan, Andres Codas, Clarisse Simoes, Sahaj Agrawal, Xuxi Chen, Anastasia Razdaibiedina, Erik Jones, Kriti Aggarwal, Hamid Palangi, Guoqing Zheng, Corby Rosset, Hamed Khanpour, and Ahmed Awadallah. 2023. Orca 2: Teaching Small Language Models How to Reason. arXiv:2311.11045 [cs.AI]
1964 35 237 3 S Moller, Ulf Leser, Wolfgang Fleischmann, and Rolf Apweiler. 1999. EDITtoTrEMBL: a distributed approach to high-quality automated protein sequence annotation. Bioinformatics (Oxford, England) 15, 3 (1999), 219–227.
1965 35 238 3 Geraldene Munsamy, Sebastian Lindner, Philipp Lorenz, and Noelia Ferruz. 2022. ZymCTRL: a conditional language model for the controllable generation of artificial enzymes. In Machine Learning for Structural Biology Workshop. NeurIPS 2022.
1966 35 239 3 Michael M Mysinger, Michael Carchia, John J Irwin, and Brian K Shoichet. 2012. Directory of useful decoys, enhanced (DUD-E): better ligands and decoys for better benchmarking. Journal of medicinal chemistry 55, 14 (2012), 6582–6594.
1967 35 240 3 Eric Nguyen, Michael Poli, Matthew G Durrant, Armin W Thomas, Brian Kang, Jeremy Sullivan, Madelena Y Ng, Ashley Lewis, Aman Patel, Aaron Lou, et al. 2024. Sequence modeling and design from molecular to genome scale with Evo. bioRxiv (2024), 2024–02.
1968 35 241 3 Eric Nguyen, Michael Poli, Marjan Faizi, Armin Thomas, Callum Birch-Sykes, Michael Wornow, Aman Patel, Clayton Rabideau, Stefano Massaroli, Yoshua Bengio, et al. 2023. Hyenadna: Long-range genomic sequence modeling at single nucleotide resolution. arXiv preprint arXiv:2306.15794 (2023).
1969 35 242 4 Erik Nijkamp, Jeffrey A Ruffolo, Eli N Weinstein, Nikhil Naik, and Ali Madani. 2023. ProGen2: exploring the boundaries of protein language models. Cell Systems 14, 11 (2023), 968–978.
1970 35 243 4 Pascal Notin, Mafalda Dias, Jonathan Frazer, Javier Marchena Hurtado, Aidan N Gomez, Debora Marks, and Yarin Gal. 2022. Tranception: Protein Fitness Prediction with Autoregressive Transformers and Inference-time Retrieval. In Proceedings of the 39th International Conference on Machine Learning, Vol. 162. 16990–17017.
1971 35 244 4 Pascal Notin, Aaron W. Kollasch, Daniel Ritter, Lood van Niekerk, Steffanie Paul, Hansen Spinner, Nathan Rollins, Ada Shaw, Ruben Weitzman, Jonathan Frazer, Mafalda Dias, Dinko Franceschi, Rose Orenbuch, Yarin Gal, and Debora S. Marks. 2023. ProteinGym: Large-Scale Benchmarks for Protein Design and Fitness Prediction. bioRxiv (2023).
1972 35 245 4 Pascal Notin, Ruben Weitzman, Debora S. Marks, and Yarin Gal. 2023. ProteinNPT: Improving Protein Property Prediction and Design with Non-Parametric Transformers. bioRxiv (2023).
1973 35 246 4 Robert D Olson, Rida Assaf, Thomas Brettin, Neal Conrad, Clark Cucinell, James J Davis, Donald M Dempsey, Allan Dickerman, Emily M Dietrich, Ronald W Kenyon, et al. 2023. Introducing the bacterial and viral bioinformatics resource center (BV-BRC): a resource combining PATRIC, IRD and ViPR. Nucleic acids research 51, D1 (2023), D678–D689.
1974 35 247 4 OpenAI. 2022. Introducing ChatGPT. OpenAI Blog (November 2022).
1975 35 248 4 OpenAI. 2023. GPT-4 Technical Report. OpenAI (2023).
1976 35 249 4 Christine A Orengo, Alex D Michie, Susan Jones, David T Jones, Mark B Swindells, and Janet M Thornton. 1997. CATH–a hierarchic classification of protein domain structures. Structure 5, 8 (1997), 1093–1109.
1977 35 250 4 Rose Oughtred, Chris Stark, Bobby-Joe Breitkreutz, Jennifer Rust, Lorrie Boucher, Christie Chang, Nadine Kolas, Lara O’Donnell, Genie Leung, Rochelle McAdam, et al. 2019. The BioGRID interaction database: 2019 update. Nucleic acids research 47, D1 (2019), D529–D541.
1978 35 251 4 Carlos Outeiral and Charlotte M Deane. 2024. Codon language embeddings provide strong signals for use in protein engineering. Nature Machine Intelligence 6, 2 (2024), 170–179.
1979 35 252 5 Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, and Ryan Lowe. 2022. Training language models to follow instructions with human feedback. cite arxiv:2203.02155.
1980 35 253 5 Ankit Pal, Logesh Kumar Umapathi, and Malaikannan Sankarasubbu. 2022. MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering. arXiv:2203.14371 [cs.CL]
1981 35 254 5 Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics. 311–318.
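BLEU combines clipped n-gram precisions (typically n = 1..4) with a brevity penalty. Below is a toy sketch of the clipped-precision core for a single n in plain Python; real evaluations normally use a library implementation such as NLTK or sacrebleu.

```python
from collections import Counter

def clipped_ngram_precision(candidate: list[str], reference: list[str], n: int) -> float:
    """Modified n-gram precision from Papineni et al. (2002), single reference."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    # Each candidate n-gram count is clipped by its count in the reference.
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

print(clipped_ngram_precision("the cat sat on the mat".split(),
                              "the cat is on the mat".split(), n=1))  # 5/6 ≈ 0.833
```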
1982 35 255 5 Typhaine Paysan-Lafosse, Matthias Blum, Sara Chuguransky, Tiago Grego, Beatriz Lázaro Pinto, Gustavo A Salazar, Maxwell L Bileschi, Peer Bork, Alan Bridge, Lucy Colwell, et al. 2023. InterPro in 2022. Nucleic acids research 51, D1 (2023), D418–D427.
1983 35 256 5 Qizhi Pei, Lijun Wu, Kaiyuan Gao, Xiaozhuan Liang, Yin Fang, Jinhua Zhu, Shufang Xie, Tao Qin, and Rui Yan. 2024. Biot5+: Towards generalized biological understanding with iupac integration and multi-task tuning. arXiv preprint arXiv:2402.17810 (2024).
1984 35 257 5 Qizhi Pei, Lijun Wu, Kaiyuan Gao, Jinhua Zhu, Yue Wang, Zun Wang, Tao Qin, and Rui Yan. 2024. Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey. arXiv preprint arXiv:2403.01528 (2024).
1985 35 258 5 Qizhi Pei, Lijun Wu, Kaiyuan Gao, Jinhua Zhu, and Rui Yan. 2024. 3D-MolT5: Towards Unified 3D Molecule-Text Modeling with 3D Molecular Tokenization. arXiv preprint arXiv:2406.05797 (2024).
1986 35 259 5 Qizhi Pei, Wei Zhang, Jinhua Zhu, Kehan Wu, Kaiyuan Gao, Lijun Wu, Yingce Xia, and Rui Yan. 2023. BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations. arXiv:2310.07276 [cs.CL]
1987 35 260 5 Cheng Peng, Xi Yang, Aokun Chen, Kaleb E. Smith, Nima PourNejatian, Anthony B. Costa, Cheryl Martin, Mona G. Flores, Ying Zhang, Tanja Magoc, Gloria Lipori, Duane A. Mitchell, Naykky S. Ospina, Mustafa M. Ahmed, William R. Hogan, Elizabeth A. Shenkman, Yi Guo, Jiang Bian, and Yonghui Wu. 2023. A study of generative large language model for medical research and healthcare. npj Digital Medicine 6, 1 (2023). https://doi.org/10.1038/s41746-023-00958-w
1988 35 261 5 Yifan Peng, Qingyu Chen, and Zhiyong Lu. 2020. An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining. arXiv:2005.02799 [cs.CL]
1989 35 262 6 Rafael Josip Penić, Tin Vlašić, Roland G Huber, Yue Wan, and Mile Šikić. 2024. RiNALMo: general-purpose RNA language models can generalize well on structure prediction tasks. arXiv preprint arXiv:2403.00043 (2024).
1990 35 263 6 Sara Pieri, Sahal Shaji Mullappilly, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman Khan, Timothy Baldwin, and Hisham Cholakkal. 2024. BiMediX: Bilingual Medical Mixture of Experts LLM. arXiv:2402.13253
1991 35 264 6 Tom J Pollard, Alistair EW Johnson, Jesse D Raffa, Leo A Celi, Roger G Mark, and Omar Badawi. 2018. The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Scientific data 5, 1 (2018), 1–13.
1992 35 265 6 Daniil Polykovskiy, Alexander Zhebrak, Benjamin Sanchez-Lengeling, Sergey Golovanov, Oktai Tatanov, Stanislav Belyaev, Rauf Kurbanov, Aleksey Artamonov, Vladimir Aladinskiy, Mark Veselov, et al. 2020. Molecular sets (MOSES): a benchmarking platform for molecular generation models. Frontiers in pharmacology 11 (2020), 565644.
1993 35 266 6 Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever, et al. 2018. Improving language understanding by generative pre-training. (2018).
1994 35 267 6 Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, et al. 2019. Language models are unsupervised multitask learners. OpenAI blog 1, 8 (2019), 9.
1995 35 268 6 Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research 21, 1 (2020), 5485–5551.
1996 35 269 6 Soumya Ram and Tristan Bepler. 2022. Few Shot Protein Generation. arXiv preprint arXiv:2204.01168 (2022).
1997 35 270 6 Mayk Caldas Ramos, Christopher J. Collison, and Andrew D. White. 2024. A Review of Large Language Models and Autonomous Agents in Chemistry. arXiv:2407.01603 [cs.LG]
1998 35 271 7 Frank P Ramsey. 1923. Tractatus Logico-Philosophicus.
1999 35 272 7 Roshan Rao, Nicholas Bhattacharya, Neil Thomas, Yan Duan, Xi Chen, John Canny, Pieter Abbeel, and Yun S Song. 2019. Evaluating Protein Transfer Learning with TAPE. In Advances in Neural Information Processing Systems.
2000 35 273 7 Roshan M Rao, Jason Liu, Robert Verkuil, Joshua Meier, John Canny, Pieter Abbeel, Tom Sercu, and Alexander Rives. 2021. MSA transformer. In International Conference on Machine Learning. PMLR, 8844–8856.
2001 35 274 7 Aviv Regev, Sarah A Teichmann, Eric S Lander, Ido Amit, Christophe Benoist, Ewan Birney, Bernd Bodenmiller, Peter Campbell, Piero Carninci, Menna Clatworthy, et al. 2017. The human cell atlas. elife 6 (2017), e27041.
2002 35 275 7 Yuchen Ren, Zhiyuan Chen, Lifeng Qiao, Hongtai Jing, Yuchen Cai, Sheng Xu, Peng Ye, Xinzhu Ma, Siqi Sun, Hongliang Yan, et al. 2024. BEACON: Benchmark for Comprehensive RNA Tasks and Language Models. arXiv preprint arXiv:2406.10391 (2024).
2003 35 276 7 Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, and Rob Fergus. 2019. Biological Structure and Function Emerge from Scaling Unsupervised Learning to 250 Million Protein Sequences. PNAS (2019).
2004 35 277 7 Yu Rong, Yatao Bian, Tingyang Xu, Weiyang Xie, Ying Wei, Wenbing Huang, and Junzhou Huang. 2020. Self-supervised graph transformer on large-scale molecular data. Advances in Neural Information Processing Systems 33 (2020), 12559–12571.
2005 35 278 7 Peter W Rose, Bojan Beran, Chunxiao Bi, Wolfgang F Bluhm, Dimitris Dimitropoulos, David S Goodsell, Andreas Prlić, Martha Quesada, Gregory B Quinn, John D Westbrook, et al. 2010. The RCSB Protein Data Bank: redesigned web site and web services. Nucleic acids research 39, suppl_1 (2010), D392–D401.
2006 35 279 7 Jerret Ross, Brian Belgodere, Vijil Chenthamarakshan, Inkit Padhi, Youssef Mroueh, and Payel Das. 2022. Large-scale chemical language representations capture molecular structure and properties. Nature Machine Intelligence 4, 12 (2022), 1256–1264.
2007 35 280 7 Lars Ruddigkeit, Ruud Van Deursen, Lorenz C Blum, and Jean-Louis Reymond. 2012. Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. Journal of chemical information and modeling 52, 11 (2012), 2864–2875.
2008 35 281 8 Nicole Rusk. 2018. Sequence-based prediction of variants’ effects. Nature Methods 15, 8 (2018), 571–571.
2009 35 282 8 Moritz Schaefer, Peter Peneder, Daniel Malzl, Anna Hakobyan, Varun S Sharma, Thomas Krausgruber, Jörg Menche, Eleni Tomazou, and Christoph Bock. [n. d.]. Joint Embedding of Transcriptomes and Text Enables Interactive Single-Cell RNA-seq Data Exploration via Natural Language. In ICLR 2024 Workshop on Machine Learning for Genomics Explorations.
2010 35 283 8 Martin H Schaefer, Jean-Fred Fontaine, Arunachalam Vinayagam, Pablo Porras, Erich E Wanker, and Miguel A Andrade-Navarro. 2012. HIPPIE: Integrating protein interaction networks with experiment based quality scores. PloS one 7, 2 (2012), e31826.
2011 35 284 8 Yair Schiff, Chia-Hsiang Kao, Aaron Gokaslan, Tri Dao, Albert Gu, and Volodymyr Kuleshov. 2024. Caduceus: Bi-directional equivariant long-range dna sequence modeling. arXiv preprint arXiv:2403.03234 (2024).
2012 35 285 8 Michael Schlichtkrull, Thomas N Kipf, Peter Bloem, Rianne Van Den Berg, Ivan Titov, and Max Welling. 2018. Modeling relational data with graph convolutional networks. In The Semantic Web: 15th International Conference, ESWC 2018, Heraklion, Crete, Greece, June 3–7, 2018, Proceedings 15. Springer, 593–607.
2013 35 286 8 Nadine Schneider, Nikolaus Stiefl, and Gregory A Landrum. 2016. What’s what: The (nearly) definitive guide to reaction role assignment. Journal of chemical information and modeling 56, 12 (2016), 2336–2346.
2014 35 287 8 Philippe Schwaller, Teodoro Laino, Théophile Gaudin, Peter Bolgar, Christopher A Hunter, Costas Bekas, and Alpha A Lee. 2019. Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction. ACS central science 5, 9 (2019), 1572–1583.
2015 35 288 8 Philippe Schwaller, Daniel Probst, Alain C Vaucher, Vishnu H Nair, David Kreutter, Teodoro Laino, and Jean-Louis Reymond. 2021. Mapping the space of chemical reactions using attention-based neural networks. Nature machine intelligence 3, 2 (2021), 144–152.
2016 35 289 8 Yaiza Serrano, Sergi Roda, Victor Guallar, and Alexis Molina. 2023. Efficient and accurate sequence generation with small-scale protein language models. bioRxiv (2023), 2023–08.
2017 35 290 8 Damiano Sgarbossa, Umberto Lupo, and Anne-Florence Bitbol. 2023. Generative power of a protein language model trained on multiple sequence alignments. Elife 12 (2023), e79854.
2018 35 291 9 Murray Shanahan. 2022. Talking About Large Language Models. CoRR abs/2212.03551 (2022).
2019 35 292 9 Soumya Sharma, Bishal Santra, Abhik Jana, TYSS Santosh, Niloy Ganguly, and Pawan Goyal. 2019. Incorporating domain knowledge into medical NLI using knowledge graphs. arXiv preprint arXiv:1909.00160 (2019).
2020 35 293 9 Yiqing Shen, Zan Chen, Michail Mamalakis, Luhan He, Haiyang Xia, Tianbin Li, Yanzhou Su, Junjun He, and Yu Guang Wang. 2024. A Fine-tuning Dataset and Benchmark for Large Language Models for Protein Understanding. arXiv preprint arXiv:2406.05540 (2024).
2021 35 294 9 Pranav Shetty, Arunkumar Chitteth Rajan, Chris Kuenneth, Sonakshi Gupta, Lakshmi Prerana Panchumarti, Lauren Holm, Chao Zhang, and Rampi Ramprasad. 2023. A general-purpose material property data extraction pipeline from large polymer corpora using natural language processing. npj Computational Materials 9, 1 (2023), 52.
2022 35 295 9 Hoo-Chang Shin, Yang Zhang, Evelina Bakhturina, Raul Puri, Mostofa Patwary, Mohammad Shoeybi, and Raghav Mani. 2020. BioMegatron: Larger Biomedical Domain Language Model. arXiv:2010.06060
2023 35 296 9 Ofir Ben Shoham and Nadav Rappoport. 2023. CPLLM: Clinical Prediction with Large Language Models. arXiv:2309.11295 [cs.CL] 
2024 35 297 9 Richard W Shuai, Jeffrey A Ruffolo, and Jeffrey J Gray. 2021. Generative language modeling for antibody design. bioRxiv (2021), 2021–12. 
2025 35 298 9 Christian JA Sigrist, Edouard De Castro, Lorenzo Cerutti, Béatrice A Cuche, Nicolas Hulo, Alan Bridge, Lydie Bougueleret, and Ioannis Xenarios. 2012. New and continuing developments at PROSITE. Nucleic acids research 41, D1 (2012), D344–D347.
2026 35 299 9 Karan Singhal, Shekoofeh Azizi, Tao Tu, S Sara Mahdavi, Jason Wei, Hyung Won Chung, Nathan Scales, Ajay Tanwani, Heather Cole-Lewis, Stephen Pfohl, et al. 2023. Large language models encode clinical knowledge. Nature 620, 7972 (2023), 172–180.
2027 35 300 9 Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Le Hou, Kevin Clark, Stephen Pfohl, Heather Cole-Lewis, Darlene Neal, Mike Schaekermann, Amy Wang, Mohamed Amin, Sami Lachgar, Philip Mansfield, Sushant Prakash, Bradley Green, Ewa Dominowska, Blaise Aguera y Arcas, Nenad Tomasev, Yun Liu, Renee Wong, Christopher Semturs, S. Sara Mahdavi, Joelle Barral, Dale Webster, Greg S. Corrado, Yossi Matias, Shekoofeh Azizi, Alan Karthikesalingam, and Vivek Natarajan. 2023. Towards Expert-Level Medical Question Answering with Large Language Models. arXiv:2305.09617 [cs.CL]
2028 35 301 0 Garrett A Soukup. 2001. Nucleic acids: General properties. eLS (2001).
2029 35 302 0 Martin Steinegger, Milot Mirdita, and Johannes Söding. 2019. Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold. Nature methods 16, 7 (2019), 603–606.
2030 35 303 0 Martin Steinegger and Johannes Söding. 2018. Clustering huge protein sequence sets in linear time. Nature communications 9, 1 (2018), 2542.
2031 35 304 0 Teague Sterling and John J Irwin. 2015. ZINC 15–ligand discovery for everyone. Journal of chemical information and modeling 55, 11 (2015), 2324–2337.
2032 35 305 0 Matt Sternke and Joel Karpiak. 2023. ProteinRL: Reinforcement learning with generative protein language models for property-directed sequence design. In NeurIPS 2023 Generative AI and Biology (GenBio) Workshop.
2033 35 306 0 Robert L Strausberg, Elise A Feingold, Richard D Klausner, and Francis S Collins. 1999. The mammalian gene collection. Science 286, 5439 (1999), 455–457.
2034 35 307 0 Bing Su, Dazhao Du, Zhao Yang, Yujie Zhou, Jiangmeng Li, Anyi Rao, Hao Sun, Zhiwu Lu, and Ji-Rong Wen. 2022. A molecular multimodal foundation model associating molecule graphs with natural language. arXiv preprint arXiv:2209.05481 (2022).
2035 35 308 0 Jin Su, Chenchen Han, Yuyang Zhou, Junjie Shan, Xibin Zhou, and Fajie Yuan. 2023. Saprot: Protein language modeling with structure-aware vocabulary. bioRxiv (2023), 2023–10.
2036 35 309 0 Jiangming Sun, Nina Jeliazkova, Vladimir Chupakhin, Jose-Felipe Golib-Dzib, Ola Engkvist, Lars Carlsson, Jörg Wegner, Hugo Ceulemans, Ivan Georgiev, Vedrin Jeliazkov, et al. 2017. ExCAPE-DB: an integrated large scale dataset facilitating Big Data analysis in chemogenomics. Journal of cheminformatics 9 (2017), 1–9.
2037 35 310 0 Liangtai Sun, Yang Han, Zihan Zhao, Da Ma, Zhennan Shen, Baocai Chen, Lu Chen, and Kai Yu. 2023. SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research. arXiv:2308.13149 [cs.CL]
2038 35 311 1 Baris E Suzek, Hongzhan Huang, Peter McGarvey, Raja Mazumder, and Cathy H Wu. 2007. UniRef: comprehensive and non-redundant UniProt reference clusters. Bioinformatics 23, 10 (2007), 1282–1288.
2039 35 312 1 Baris E Suzek, Yuqi Wang, Hongzhan Huang, Peter B McGarvey, Cathy H Wu, and UniProt Consortium. 2015. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31, 6 (2015), 926–932.
2040 35 313 1 Damian Szklarczyk, Rebecca Kirsch, Mikaela Koutrouli, Katerina Nastou, Farrokh Mehryary, Radja Hachilif, Annika L Gable, Tao Fang, Nadezhda T Doncheva, Sampo Pyysalo, et al. 2023. The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic acids research 51, D1 (2023), D638–D646.
2041 35 314 1 Ross Taylor, Marcin Kardas, Guillem Cucurull, Thomas Scialom, Anthony Hartshorn, Elvis Saravia, Andrew Poulton, Viktor Kerkez, and Robert Stojnic. 2022. Galactica: A Large Language Model for Science. CoRR abs/2211.09085 (2022).
2042 35 315 1 Ross Taylor, Marcin Kardas, Guillem Cucurull, Thomas Scialom, Anthony Hartshorn, Elvis Saravia, Andrew Poulton, Viktor Kerkez, and Robert Stojnic. 2022. Galactica: A Large Language Model for Science. arXiv:2211.09085 [cs.CL]
2043 35 316 1 Gemini Team. 2023. Gemini: A Family of Highly Capable Multimodal Models. arXiv:2312.11805 [cs.CL]
Igor V Tetko, Pavel Karpov, Ruud Van Deursen, and Guillaume Godin. 2020. State-of-the-art augmented NLP transformer models for direct and single-step retrosynthesis. Nature communications 11, 1 (2020), 5575.
2044 35 317 1 Amol Thakkar, Alain C Vaucher, Andrea Byekwaso, Philippe Schwaller, Alessandra Toniato, and Teodoro Laino. 2023. Unbiasing retrosynthesis language models with disconnection prompts. ACS Central Science 9, 7 (2023), 1488–1498.
2045 35 318 1 Augustin Toma, Patrick R. Lawler, Jimmy Ba, Rahul G. Krishnan, Barry B. Rubin, and Bo Wang. 2023. Clinical Camel: An Open Expert-Level Medical Language Model with Dialogue-Based Knowledge Encoding. arXiv:2305.12031 [cs.CL]
2046 35 319 1 Alessandra Toniato, Alain C Vaucher, Philippe Schwaller, and Teodoro Laino. 2023. Enhancing diversity in language based models for single-step retrosynthesis. Digital Discovery 2, 2 (2023), 489–501.
2047 35 320 2 Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurélien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. 2023. LLaMA: Open and Efficient Foundation Language Models. CoRR (2023).
2048 35 321 2 Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. 2023. Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023).
2049 35 322 2 Chau Tran, Siddharth Khadkikar, and Aleksey Porollo. 2023. Survey of Protein Sequence Embedding Models. International Journal of Molecular Sciences 24, 4 (2023), 3775.
2050 35 323 2 Tuan Tran and Chinwe Ekenna. 2023. Molecular Descriptors Property Prediction Using Transformer-Based Approach. International Journal of Molecular Sciences 24, 15 (2023), 11948.
2051 35 324 2 Zhengkai Tu and Connor W Coley. 2022. Permutation invariant graph-to-sequence model for template-free retrosynthesis and reaction prediction. Journal of chemical information and modeling 62, 15 (2022), 3503–3513.
2052 35 325 2 Umit V Ucak, Islambek Ashyrmamatov, Junsu Ko, and Juyong Lee. 2022. Retrosynthetic reaction pathway prediction through neural machine translation of atomic environments. Nature communications 13, 1 (2022), 1186.
2053 35 326 2 Gökçe Uludoğan, Elif Ozkirimli, Kutlu O. Ulgen, Nilgün Karalı, and Arzucan Özgür. 2022. Exploiting Pretrained Biochemical Language Models for Targeted Drug Design. arXiv:2209.00981 [cs.LG]
2054 35 327 2 Serbulent Unsal, Heval Atas, Muammer Albayrak, Kemal Turhan, Aybar C Acar, and Tunca Doğan. 2022. Learning functional properties of proteins with language models. Nature Machine Intelligence 4, 3 (2022), 227–245.
2055 35 328 2 Michel van Kempen, Stephanie S Kim, Charlotte Tumescheit, Milot Mirdita, Cameron LM Gilchrist, Johannes Söding, and Martin Steinegger. 2022. Foldseek: fast and accurate protein structure search. Biorxiv (2022), 2022–02.
2056 35 329 2 Mihaly Varadi, Stephen Anyango, Mandar Deshpande, Sreenath Nair, Cindy Natassia, Galabina Yordanova, David Yuan, Oana Stroe, Gemma Wood, Agata Laydon, Augustin Žídek, Tim Green, Kathryn Tunyasuvunakool, Stig Petersen, John Jumper, Ellen Clancy, Richard Green, Ankur Vora, Mira Lutfi, Michael Figurnov, Andrew Cowie, Nicole Hobbs, Pushmeet Kohli, Gerard Kleywegt, Ewan Birney, Demis Hassabis, and Sameer Velankar. 2021. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Research 50, D1 (11 2021), D439–D444.
2057 35 330 3 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA. 5998–6008.
2058 35 331 3 A Venigalla, Jonathan Frankle, and M Carbin. 2022. BioMedLM: a domain-specific large language model for biomedical text. MosaicML. Accessed: Dec 23, 2022.
2059 35 332 3 Guangyu Wang, Guoxing Yang, Zongxin Du, Longjun Fan, and Xiaohu Li. 2023. ClinicalGPT: Large Language Models Finetuned with Diverse Medical Data and Comprehensive Evaluation. arXiv:2306.09968 [cs.CL]
2060 35 333 3 Haochun Wang, Chi Liu, Nuwa Xi, Zewen Qiang, Sendong Zhao, Bing Qin, and Ting Liu. 2023. HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge. arXiv:2304.06975 [cs.CL]
2061 35 334 3 Lei Wang, Hui Zhang, Wei Xu, Zhidong Xue, and Yan Wang. 2023. Deciphering the protein landscape with ProtFlash: a lightweight language model. Cell Reports Physical Science 4, 10 (2023), 101600.
2062 35 335 3 Ning Wang, Jiang Bian, Yuchen Li, Xuhong Li, Shahid Mumtaz, Linghe Kong, and Haoyi Xiong. 2024. Multi-purpose RNA language modelling with motif-aware pretraining and type-guided fine-tuning. Nature Machine Intelligence (2024), 1–10.
2063 35 336 3 Renxiao Wang, Xueliang Fang, Yipin Lu, Chao-Yie Yang, and Shaomeng Wang. 2005. The PDBbind database: methodologies and updates. Journal of medicinal chemistry 48, 12 (2005), 4111–4119.
2064 35 337 3 Sheng Wang, Yuzhi Guo, Yuhong Wang, Hongmao Sun, and Junzhou Huang. 2019. Smiles-bert: large scale unsupervised pre-training for molecular property prediction. In Proceedings of the 10th ACM international conference on bioinformatics, computational biology and health informatics. 429–436.
2065 35 338 3 Wenlu Wang, Ye Wang, Honggang Zhao, and Simone Sciabola. 2022. A pre-trained conditional transformer for Target-specific De Novo Molecular Generation. (2022).
2066 35 339 3 Xin Wang, Xin Gao, Guohua Wang, and Dan Li. 2023. miProBERT: identification of microRNA promoters based on the pre-trained model BERT. Briefings in bioinformatics 24, 3 (2023), bbad093.
2067 35 340 4 Xi Wang, Ruichu Gu, Zhiyuan Chen, Yongge Li, Xiaohong Ji, Guolin Ke, and Han Wen. 2023. UNI-RNA: universal pre-trained models revolutionize RNA research. bioRxiv (2023), 2023–07.
2068 35 341 4 Xiaoxuan Wang, Ziniu Hu, Pan Lu, Yanqiao Zhu, Jieyu Zhang, Satyen Subramaniam, Arjun R. Loomba, Shichang Zhang, Yizhou Sun, and Wei Wang. 2023. SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models. arXiv:2307.10635 [cs.CL]
2069 35 342 4 Yanli Wang, Jewen Xiao, Tugba O Suzek, Jian Zhang, Jiyao Wang, and Stephen H Bryant. 2009. PubChem: a public information system for analyzing bioactivities of small molecules. Nucleic acids research 37, suppl_2 (2009), W623–W633.
2070 35 343 4 Yanli Wang, Jewen Xiao, Tugba O Suzek, Jian Zhang, Jiyao Wang, Zhigang Zhou, Lianyi Han, Karen Karapetyan, Svetlana Dracheva, Benjamin A Shoemaker, et al. 2012. PubChem’s BioAssay database. Nucleic acids research 40, D1 (2012), D400–D412.
2071 35 344 4 Yuhao Wang, Qiang Zhang, Ming Qin, Xiang Zhuang, Xiaotong Li, Zhichen Gong, Zeyuan Wang, Yu Zhao, Jianhua Yao, Keyan Ding, et al. [n. d.]. Knowledge-aware Reinforced Language Models for Protein Directed Evolution. In Forty-first International Conference on Machine Learning.
2072 35 345 4 Ye Wang, Honggang Zhao, Simone Sciabola, and Wenlu Wang. 2023. cMolGPT: A Conditional Generative Pre-Trained Transformer for Target- Specific De Novo Molecular Generation. Molecules 28, 11 (2023), 4430.
2073 35 346 4 Zichen Wang, Steven A Combs, Ryan Brand, Miguel Romero Calvo, Panpan Xu, George Price, Nataliya Golovach, Emmanuel O Salawu, Colby J Wise, Sri Priya Ponnapalli, et al. 2022. Lm-gvp: an extensible sequence and structure informed deep learning framework for protein property prediction. Scientific reports 12, 1 (2022), 6832.
2074 35 347 4 Zifeng Wang, Zichen Wang, Balasubramaniam Srinivasan, Vassilis N Ioannidis, Huzefa Rangwala, and Rishita Anubhai. 2023. BioBridge: Bridging Biomedical Foundation Models via Knowledge Graph. arXiv preprint arXiv:2310.03320 (2023).
2075 35 348 4 Zeyuan Wang, Qiang Zhang, Keyan Ding, Ming Qin, Xiang Zhuang, Xiaotong Li, and Huajun Chen. 2023. InstructProtein: Aligning Human and Protein Language via Knowledge Instruction. arXiv preprint arXiv:2310.03269 (2023).
2076 35 349 4 Zeyuan Wang, Qiang Zhang, Shuang-Wei HU, Haoran Yu, Xurui Jin, Zhichen Gong, and Huajun Chen. 2023. Multi-level Protein Structure Pre-training via Prompt Learning. In The Eleventh International Conference on Learning Representations.
2077 35 350 5 David Weininger. 1988. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. Journal of chemical information and computer sciences 28, 1 (1988), 31–36.
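SMILES linearizes a molecular graph into a character string (atoms, bonds, ring-closure digits, branch parentheses). A minimal parse-and-canonicalize sketch, assuming RDKit (`pip install rdkit`) is available; phenol is an arbitrary example molecule.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("c1ccccc1O")  # phenol: aromatic ring plus hydroxyl
print(Chem.MolToSmiles(mol))           # canonical form, e.g. 'Oc1ccccc1'
print(mol.GetNumAtoms())               # 7 heavy atoms
```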
2078 35 351 5 Johannes Welbl, Nelson F Liu, and Matt Gardner. 2017. Crowdsourcing multiple choice science questions. arXiv preprint arXiv:1707.06209 (2017).
2079 35 352 5 David L Wheeler, Tanya Barrett, Dennis A Benson, Stephen H Bryant, Kathi Canese, Vyacheslav Chetvernin, Deanna M Church, Michael DiCuccio, Ron Edgar, Scott Federhen, et al. 2007. Database resources of the national center for biotechnology information. Nucleic acids research 36, suppl_1 (2007), D13–D21.
2080 35 353 5 Jacob White. 2020. PubMed 2.0. Medical reference services quarterly 39, 4 (2020), 382–387.
2081 35 354 5 Mark D Wilkinson, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, Jan-Willem Boiten, Luiz Bonino da Silva Santos, Philip E Bourne, et al. 2016. The FAIR Guiding Principles for scientific data management and stewardship. Scientific data 3, 1 (2016), 1–9.
2082 35 355 5 David S Wishart, Yannick D Feunang, An C Guo, Elvis J Lo, Ana Marcu, Jason R Grant, Tanvir Sajed, Daniel Johnson, Carin Li, Zinat Sayeeda, et al. 2018. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic acids research 46, D1 (2018), D1074–D1082.
2083 35 356 5 Chaoyi Wu, Weixiong Lin, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, and Weidi Xie. 2023. PMC-LLaMA: Towards Building Open-source Language Models for Medicine. arXiv:2304.14454 [cs.CL]
2084 35 357 5 Cathy H Wu, Anastasia Nikolskaya, Hongzhan Huang, Lai-Su L Yeh, Darren A Natale, Cholanayakanahalli R Vinayaka, Zhang-Zhi Hu, Raja Mazumder, Sandeep Kumar, Panagiotis Kourtesis, et al. 2004. PIRSF: family classification system at the Protein Information Resource. Nucleic acids research 32, suppl_1 (2004), D112–D114.
2085 35 358 5 Fang Wu, Dragomir Radev, and Stan Z Li. 2023. Molformer: Motif-based transformer on 3d heterogeneous molecular graphs. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37. 5312–5320.
2086 35 359 5 Zhenqin Wu, Bharath Ramsundar, Evan N Feinberg, Joseph Gomes, Caleb Geniesse, Aneesh S Pappu, Karl Leswing, and Vijay Pande. 2018. MoleculeNet: a benchmark for molecular machine learning. Chemical science 9, 2 (2018), 513–530.
2087 35 360 6 wwPDB consortium. 2018. Protein Data Bank: the single global archive for 3D macromolecular structure data. Nucleic Acids Research 47, D1 (10 2018), D520–D528.
2088 35 361 6 Ioannis Xenarios, Lukasz Salwinski, Xiaoqun Joyce Duan, Patrick Higney, Sul-Min Kim, and David Eisenberg. 2002. DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic acids research 30, 1 (2002), 303–305.
2089 35 362 6 Jun Xia, Yanqiao Zhu, Yuanqi Du, Y Liu, and SZ Li. 2023. A Systematic Survey of Chemical Pre-trained Models. IJCAI.
2090 35 363 6 Qianqian Xie, Qingyu Chen, Aokun Chen, Cheng Peng, Yan Hu, Fongci Lin, Xueqing Peng, Jimin Huang, Jeffrey Zhang, Vipina Keloth, Xinyu Zhou, Huan He, Lucila Ohno-Machado, Yonghui Wu, Hua Xu, and Jiang Bian. 2024. Me LLaMA: Foundation Large Language Models for Medical Applications. arXiv:2402.12749
2091 35 364 6 Renchunzi Xie, Hongxin Wei, Lei Feng, and Bo An. 2022. Gearnet: Stepwise dual learning for weakly supervised domain adaptation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 8717–8725.
2092 35 365 6 Tong Xie, Yuwei Wan, Wei Huang, Zhenyu Yin, Yixuan Liu, Shaozhou Wang, Qingyuan Linghu, Chunyu Kit, Clara Grazian, Wenjie Zhang, Imran Razzak, and Bram Hoex. 2023. DARWIN Series: Domain Specific Large Language Models for Natural Science. arXiv:2308.13565 [cs.CL]
2093 35 366 6 Guoli Xiong, Zhenxing Wu, Jiacai Yi, Li Fu, Zhijiang Yang, Changyu Hsieh, Mingzhu Yin, Xiangxiang Zeng, Chengkun Wu, Aiping Lu, et al. 2021. ADMETlab 2.0: an integrated online platform for accurate and comprehensive predictions of ADMET properties. Nucleic Acids Research 49, W1 (2021), W5–W14.
2094 35 367 6 Honglin Xiong, Sheng Wang, Yitao Zhu, Zihao Zhao, Yuxiao Liu, Linlin Huang, Qian Wang, and Dinggang Shen. 2023. DoctorGLM: Fine-tuning your Chinese Doctor is not a Herculean Task. arXiv:2304.01097 [cs.CL]
2095 35 368 6 Canwen Xu, Daya Guo, Nan Duan, and Julian McAuley. 2023. Baize: An open-source chat model with parameter-efficient tuning on self-chat data. arXiv preprint arXiv:2304.01196 (2023).
2096 35 369 6 Guikun Xu, Yongquan Jiang, PengChuan Lei, Yan Yang, and Jim Chen. [n. d.]. GTMGC: Using Graph Transformer to Predict Molecule’s Ground-State Conformation. In The Twelfth International Conference on Learning Representations.
2097 35 370 7 Hanwen Xu and Sheng Wang. 2022. ProTranslator: zero-shot protein function prediction using textual description. In International Conference on Research in Computational Molecular Biology. Springer, 279–294.
2098 35 371 7 Hanwen Xu, Addie Woicik, Hoifung Poon, Russ B Altman, and Sheng Wang. 2023. Multilingual translation for zero-shot biomedical classification using BioTranslator. Nature Communications 14, 1 (2023), 738.
2099 35 372 7 Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2018. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826 (2018).
2100 35 373 7 Minghao Xu, Xinyu Yuan, Santiago Miret, and Jian Tang. 2023. ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts. arXiv:2301.12040 [q-bio.BM]
2101 35 374 7 Minghao Xu, Zuobai Zhang, Jiarui Lu, Zhaocheng Zhu, Yangtian Zhang, Chang Ma, Runcheng Liu, and Jian Tang. 2022. PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding. arXiv:2206.02096 [cs.LG]
2102 35 375 7 Zhao Xu, Youzhi Luo, Xuan Zhang, Xinyi Xu, Yaochen Xie, Meng Liu, Kaleb Dickerson, Cheng Deng, Maho Nakata, and Shuiwang Ji. 2021. Molecule3d: A benchmark for predicting 3d geometries from molecular graphs. arXiv preprint arXiv:2110.01717 (2021).
2103 35 376 7 Dongyu Xue, Han Zhang, Dongling Xiao, Yukang Gong, Guohui Chuai, Yu Sun, Hao Tian, Hua Wu, Yukun Li, and Qi Liu. 2020. X-MOL: large-scale pre-training for molecular understanding and diverse molecular analysis. bioRxiv (2020), 2020–12.
2104 35 377 7 Hideki Yamaguchi and Yutaka Saito. 2022. EvoOpt: an MSA-guided, fully unsupervised sequence optimization pipeline for protein design. In Machine Learning for Structural Biology Workshop, NeurIPS.
2105 35 378 7 Fan Yang, Wenchuan Wang, Fang Wang, Yuan Fang, Duyu Tang, Junzhou Huang, Hui Lu, and Jianhua Yao. 2022. scBERT as a large-scale pretrained deep language model for cell type annotation of single-cell RNA-seq data. Nature Machine Intelligence 4, 10 (2022), 852–866.
2106 35 379 7 Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, and Xia Hu. 2023. Harnessing the power of llms in practice: A survey on chatgpt and beyond. arXiv preprint arXiv:2304.13712 (2023).
2107 35 380 8 Jianyi Yang, Ambrish Roy, and Yang Zhang. 2012. BioLiP: a semi-manually curated database for biologically relevant ligand–protein interactions. Nucleic acids research 41, D1 (2012), D1096–D1103.
2108 35 381 8 Meng Yang, Haiping Huang, Lichao Huang, Nan Zhang, Jihong Wu, Huanming Yang, and Feng Mu. 2021. LOGO, a contextualized pre-trained language model of human genome flexibly adapts to various downstream tasks by fine-tuning. (2021).
2109 35 382 8 Songhua Yang, Hanjie Zhao, Senbin Zhu, Guangyu Zhou, Hongfei Xu, Yuxiang Jia, and Hongying Zan. 2023. Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-world Multi-turn Dialogue. arXiv:2308.03549 [cs.CL]
2110 35 383 8 Xi Yang, Aokun Chen, Nima PourNejatian, Hoo Chang Shin, Kaleb E Smith, Christopher Parisien, Colin Compas, Cheryl Martin, Mona G Flores, Ying Zhang, Tanja Magoc, Christopher A Harle, Gloria Lipori, Duane A Mitchell, William R Hogan, Elizabeth A Shenkman, Jiang Bian, and Yonghui Wu. 2022. GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records. arXiv:2203.03540 [cs.CL]
2111 35 384 8 Michihiro Yasunaga, Jure Leskovec, and Percy Liang. 2022. LinkBERT: Pretraining Language Models with Document Links. arXiv:2203.15827 [cs.CL]
2112 35 385 8 Qichen Ye, Junling Liu, Dading Chong, Peilin Zhou, Yining Hua, and Andrew Liu. 2023. Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model. arXiv:2310.09089 [cs.CL]
2113 35 386 8 Chengxuan Ying, Mingqi Yang, Shuxin Zheng, Guolin Ke, Shengjie Luo, Tianle Cai, Chenglin Wu, Yuxin Wang, Yanming Shen, and Di He. 2021. First place solution of KDD Cup 2021 & OGB large-scale challenge graph prediction track. arXiv preprint arXiv:2106.08279 (2021).
2114 35 387 8 Botao Yu, Frazier N Baker, Ziqi Chen, Xia Ning, and Huan Sun. 2024. Llasmol: Advancing large language models for chemistry with a large-scale, comprehensive, high-quality instruction tuning dataset. arXiv preprint arXiv:2402.09391 (2024).
2115 35 388 8 Chaohao Yuan, Songyou Li, Geyan Ye, Yikun Zhang, Long-Kai Huang, Wenbing Huang, Wei Liu, Jianhua Yao, and Yu Rong. 2024. Functional Protein Design with Local Domain Alignment. arXiv preprint arXiv:2404.16866 (2024).
2116 35 389 8 Atakan Yüksel, Erva Ulusoy, Atabey Ünlü, and Tunca Doğan. 2023. Selformer: Molecular representation learning via selfies language models. Machine Learning: Science and Technology (2023).
2117 35 390 9 Barbara Zdrazil, Eloy Felix, Fiona Hunter, Emma J Manners, James Blackshaw, Sybilla Corbett, Marleen de Veij, Harris Ioannidis, David Mendez Lopez, Juan F Mosquera, et al. 2023. The ChEMBL Database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Research (2023), gkad1004.
2118 35 391 9 Aohan Zeng, Xiao Liu, Zhengxiao Du, Zihan Wang, Hanyu Lai, Ming Ding, Zhuoyi Yang, Yifan Xu, Wendi Zheng, Xiao Xia, Weng Lam Tam, Zixuan Ma, Yufei Xue, Jidong Zhai, Wenguang Chen, Zhiyuan Liu, Peng Zhang, Yuxiao Dong, and Jie Tang. 2023. GLM-130B: An Open Bilingual Pre-trained Model. In The Eleventh International Conference on Learning Representations (ICLR).
2119 35 392 9 Haoyang Zeng, Matthew D Edwards, Ge Liu, and David K Gifford. 2016. Convolutional neural network architectures for predicting DNA–protein binding. Bioinformatics 32, 12 (2016), i121–i127.
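Zeng et al. (above) benchmark convolutional architectures for DNA–protein binding prediction. The sketch below shows the shared pattern those architectures follow, one-hot encoding followed by learned motif filters, pooling, and a classifier; it is a hedged illustration, not their exact model.

```python
# Illustrative 1D CNN for scoring one-hot DNA windows (not Zeng et al.'s exact net).
import torch
import torch.nn as nn

class BindingCNN(nn.Module):
    def __init__(self, num_filters: int = 16, motif_len: int = 12):
        super().__init__()
        self.conv = nn.Conv1d(4, num_filters, kernel_size=motif_len)  # 4 channels: A,C,G,T
        self.head = nn.Linear(num_filters, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 4, seq_len) one-hot DNA
        h = torch.relu(self.conv(x))        # motif scanning
        h = h.max(dim=-1).values            # global max pooling over positions
        return torch.sigmoid(self.head(h))  # binding probability

def one_hot_dna(seq: str) -> torch.Tensor:
    idx = {"A": 0, "C": 1, "G": 2, "T": 3}
    x = torch.zeros(4, len(seq))
    for i, base in enumerate(seq):
        x[idx[base], i] = 1.0
    return x

x = one_hot_dna("ACGTACGTACGTACGTACGT").unsqueeze(0)  # (1, 4, 20)
print(BindingCNN()(x))
```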
2120 35 393 9 Zheni Zeng, Yuan Yao, Zhiyuan Liu, and Maosong Sun. 2022. A deep-learning system bridging molecule structure and biomedical text with comprehension comparable to human professionals. Nature communications 13, 1 (2022), 862.
2121 35 394 9 Zheni Zeng, Bangchen Yin, Shipeng Wang, Jiarui Liu, Cheng Yang, Haishen Yao, Xingzhi Sun, Maosong Sun, Guotong Xie, and Zhiyuan Liu. 2023. Interactive Molecular Discovery with Natural Language. arXiv:2306.11976 [cs.CL]
2122 35 395 9 Dan Zhang, Ziniu Hu, Sining Zhoubian, Zhengxiao Du, Kaiyu Yang, Zihan Wang, Yisong Yue, Yuxiao Dong, and Jie Tang. 2024. Sciglm: Training scientific language models with self-reflective instruction annotation and tuning. arXiv preprint arXiv:2401.07950 (2024).
2123 35 396 9 Di Zhang, Wei Liu, Qian Tan, Jingdan Chen, Hang Yan, Yuliang Yan, Jiatong Li, Weiran Huang, Xiangyu Yue, Dongzhan Zhou, et al. 2024. Chemllm: A chemical large language model. arXiv preprint arXiv:2402.06852 (2024).
2124 35 397 9 Daoan Zhang, Weitong Zhang, Bing He, Jianguo Zhang, Chenchen Qin, and Jianhua Yao. 2023. DNAGPT: A Generalized Pretrained Tool for Multiple DNA Sequence Analysis Tasks. bioRxiv (2023), 2023–07.
2125 35 398 9 Hongbo Zhang, Junying Chen, Feng Jiang, Fei Yu, Zhihong Chen, Jianquan Li, Guiming Chen, Xiangbo Wu, Zhiyi Zhang, Qingying Xiao, Xiang Wan, Benyou Wang, and Haizhou Li. 2023. HuatuoGPT, towards Taming Language Model to Be a Doctor. arXiv:2305.15075 [cs.CL]
2126 35 399 9 Le Zhang, Jiayang Chen, Tao Shen, Yu Li, and Siqi Sun. 2023. Enhancing the Protein Tertiary Structure Prediction by Multiple Sequence Alignment Generation. arXiv preprint arXiv:2306.01824 (2023).
2127 35 400 0 Ningyu Zhang, Zhen Bi, Xiaozhuan Liang, Siyuan Cheng, Haosen Hong, Shumin Deng, Jiazhang Lian, Qiang Zhang, and Huajun Chen. 2022. OntoProtein: Protein Pretraining With Gene Ontology Embedding. arXiv preprint arXiv:2201.11147 (2022).
2128 35 401 0 Ningyu Zhang, Zhen Bi, Xiaozhuan Liang, Siyuan Cheng, Haosen Hong, Shumin Deng, Jiazhang Lian, Qiang Zhang, and Huajun Chen. 2022. Ontoprotein: Protein pretraining with gene ontology embedding. arXiv preprint arXiv:2201.11147 (2022).
2129 35 402 0 Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, et al. 2022. Opt: Open pre-trained transformer language models. arXiv preprint arXiv:2205.01068 (2022).
2130 35 403 0 Sheng Zhang, Xin Zhang, Hui Wang, Lixiang Guo, and Shanshan Liu. 2018. Multi-scale attentive interaction networks for Chinese medical question answer selection. IEEE Access 6 (2018), 74061–74071.
2131 35 404 0 Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q Weinberger, and Yoav Artzi. 2019. Bertscore: Evaluating text generation with bert. arXiv preprint arXiv:1904.09675 (2019).
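BERTScore (Zhang et al., above) replaces n-gram overlap with cosine similarity between contextual embeddings. A usage sketch with the authors' `bert_score` package (pip install bert-score) follows; defaults such as the underlying model and downloads vary by package version.

```python
# Hedged usage sketch of BERTScore via the `bert_score` package.
from bert_score import score

candidates = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

# Returns precision, recall, and F1 tensors computed from token-level
# cosine similarity between contextual BERT embeddings.
P, R, F1 = score(candidates, references, lang="en")
print(float(F1.mean()))
```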
2132 35 405 0 Xiaotian Zhang, Chunyang Li, Yi Zong, Zhengyu Ying, Liang He, and Xipeng Qiu. 2023. Evaluating the Performance of Large Language Models on GAOKAO Benchmark.
2133 35 406 0 Xiao-Chen Zhang, Cheng-Kun Wu, Zhi-Jiang Yang, Zhen-Xing Wu, Jia-Cai Yi, Chang-Yu Hsieh, Ting-Jun Hou, and Dong-Sheng Cao. 2021. MG-BERT: leveraging unsupervised atomic representation learning for molecular property prediction. Briefings in bioinformatics 22, 6 (2021), bbab152.
2134 35 407 0 Xiao-Chen Zhang, Cheng-Kun Wu, Jia-Cai Yi, Xiang-Xiang Zeng, Can-Qun Yang, Ai-Ping Lu, Ting-Jun Hou, and Dong-Sheng Cao. 2022. Pushing the Boundaries of Molecular Property Prediction for Drug Discovery with Multitask Learning BERT Enhanced by SMILES Enumeration. Research 2022 (2022), 0004.
2135 35 408 0 Ying Zhang, Fang Ge, Fuyi Li, Xibei Yang, Jiangning Song, and Dong-Jun Yu. 2023. Prediction of multiple types of RNA modifications via biological language model. IEEE/ACM Transactions on Computational Biology and Bioinformatics (2023).
2136 35 409 0 Yikun Zhang, Mei Lang, Jiuhong Jiang, Zhiqiang Gao, Fan Xu, Thomas Litfin, Ke Chen, Jaswinder Singh, Xiansong Huang, Guoli Song, et al. 2023. Multiple sequence-alignment-based RNA language model and its application to structural inference. bioRxiv (2023), 2023–03.
2137 35 410 1 Yang Zhang and Jeffrey Skolnick. 2007. Scoring function for automated assessment of protein structure template quality. Proteins 68, 4 (2007), 1020.
2138 35 411 1 Yikun Zhang, Geyan Ye, Chaohao Yuan, Bo Han, Long-Kai Huang, Jianhua Yao, Wei Liu, and Yu Rong. 2024. Atomas: Hierarchical Alignment on Molecule-Text for Unified Molecule Understanding and Generation. arXiv preprint arXiv:2404.16880 (2024).
2139 35 412 1 Zuobai Zhang, Chuanrui Wang, Minghao Xu, Vijil Chenthamarakshan, Aurélie Lozano, Payel Das, and Jian Tang. 2023. A Systematic Study of Joint Representation Learning on Protein Sequences and Structures. arXiv:2303.06275 [q-bio.QM]
2140 35 413 1 Zuobai Zhang, Minghao Xu, Vijil Chenthamarakshan, Aurélie Lozano, Payel Das, and Jian Tang. 2023. Enhancing protein language models with structure-based encoder and pre-training. arXiv preprint arXiv:2303.06275 (2023).
2141 35 414 1 Haiteng Zhao, Shengchao Liu, Chang Ma, Hannan Xu, Jie Fu, Zhi-Hong Deng, Lingpeng Kong, and Qi Liu. 2023. GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning. bioRxiv (2023), 2023–05.
2142 35 415 1 Suyuan Zhao, Jiahuan Zhang, Yizhen Luo, Yushuai Wu, and Zaiqing Nie. 2024. LangCell: Language-Cell Pre-training for Cell Identity Understanding. arXiv preprint arXiv:2405.06708 (2024).
2143 35 416 1 Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, et al. 2023. A survey of large language models. arXiv preprint arXiv:2303.18223 (2023).
2144 35 417 1 Zihan Zhao, Da Ma, Lu Chen, Liangtai Sun, Zihao Li, Hongshen Xu, Zichen Zhu, Su Zhu, Shuai Fan, Guodong Shen, et al. 2024. Chemdfm: Dialogue foundation model for chemistry. arXiv preprint arXiv:2401.14818 (2024).
2145 35 418 1 Kangjie Zheng, Siyu Long, Tianyu Lu, Junwei Yang, Xinyu Dai, Ming Zhang, Zaiqing Nie, Wei-Ying Ma, and Hao Zhou. 2024. Multi-Scale Protein Language Model for Unified Molecular Modeling. bioRxiv (2024), 2024–03.
2146 35 419 1 Shuangjia Zheng, Jiahua Rao, Zhongyue Zhang, Jun Xu, and Yuedong Yang. 2019. Predicting retrosynthetic reactions using self-corrected transformer neural networks. Journal of chemical information and modeling 60, 1 (2019), 47–55.
2147 35 420 2 Zaixiang Zheng, Yifan Deng, Dongyu Xue, Yi Zhou, Fei Ye, and Quanquan Gu. 2023. Structure-informed Language Models Are Protein Designers. bioRxiv (2023).
2148 35 421 2 Zhiling Zheng, Oufan Zhang, Christian Borgs, Jennifer T. Chayes, and Omar M. Yaghi. 2023. ChatGPT Chemistry Assistant for Text Mining and the Prediction of MOF Synthesis. Journal of the American Chemical Society 145, 32 (aug 2023), 18048–18062.
2149 35 422 2 Wanjun Zhong, Ruixiang Cui, Yiduo Guo, Yaobo Liang, Shuai Lu, Yanlin Wang, Amin Saied, Weizhu Chen, and Nan Duan. 2023. AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models. arXiv:2304.06364 [cs.CL]
2150 35 423 2 Gengmo Zhou, Zhifeng Gao, Qiankun Ding, Hang Zheng, Hongteng Xu, Zhewei Wei, Linfeng Zhang, and Guolin Ke. 2023. Uni-Mol: a universal 3D molecular representation learning framework. In The Eleventh International Conference on Learning Representations (ICLR 2023).
2151 35 424 2 Hong-Yu Zhou, Yunxiang Fu, Zhicheng Zhang, Cheng Bian, and Yizhou Yu. 2023. Protein Representation Learning via Knowledge Enhanced Primary Structure Modeling. arXiv:2301.13154 [cs.LG]
2152 35 425 2 Jian Zhou and Olga G Troyanskaya. 2015. Predicting effects of noncoding variants with deep learning–based sequence model. Nature methods 12, 10 (2015), 931–934.
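Zhou and Troyanskaya (above, DeepSEA) score noncoding variants by running the same sequence model on reference and alternate alleles and comparing the predicted chromatin-feature probabilities. A hedged sketch of that scoring idea follows; `model` is any callable mapping a one-hot (batch, 4, L) tensor to per-feature probabilities (for instance the toy BindingCNN from the sketch further above), while the real DeepSEA network is a specific multi-task CNN.

```python
# Sketch of DeepSEA-style variant scoring: log-odds difference between
# predictions for the alternate and reference alleles.
import torch

def one_hot_dna(seq: str) -> torch.Tensor:
    idx = {"A": 0, "C": 1, "G": 2, "T": 3}
    x = torch.zeros(4, len(seq))
    for i, base in enumerate(seq):
        x[idx[base], i] = 1.0
    return x

def variant_effect(model, ref_seq: str, pos: int, alt: str) -> torch.Tensor:
    var_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]   # substitute the alternate base
    ref = one_hot_dna(ref_seq).unsqueeze(0)
    var = one_hot_dna(var_seq).unsqueeze(0)
    with torch.no_grad():
        p_ref, p_var = model(ref), model(var)
    return torch.logit(p_var) - torch.logit(p_ref)      # per-feature log-odds shift
```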
2153 35 426 2 Zhihan Zhou, Yanrong Ji, Weijian Li, Pratik Dutta, Ramana Davuluri, and Han Liu. 2023. Dnabert-2: Efficient foundation model and benchmark for multi-species genome. arXiv preprint arXiv:2306.15006 (2023).
2154 35 427 2 Yanqiao Zhu, Jeehyun Hwang, Keir Adams, Zhen Liu, Bozhao Nan, Brock Stenfors, Yuanqi Du, Jatin Chauhan, Olaf Wiest, Olexandr Isayev, et al. 2023. Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks. arXiv preprint arXiv:2310.00115 (2023).
2155 35 428 2 Xiang Zhuang, Qiang Zhang, Keyan Ding, Yatao Bian, Xiao Wang, Jingsong Lv, Hongyang Chen, and Huajun Chen. 2023. Learning Invariant Molecular Representation in Latent Discrete Space.
2156 35 429 2 Xiang Zhuang, Qiang Zhang, Bin Wu, Keyan Ding, Yin Fang, and Huajun Chen. 2023. Graph Sampling-based Meta-Learning for Molecular Property Prediction.
2157 35 430 3 Rustam Zhumagambetov, Ferdinand Molnár, Vsevolod A Peshkov, and Siamac Fazli. 2021. Transmol: repurposing a language model for molecular generation. RSC advances 11, 42 (2021), 25921–25932.
2158 35 431 3 Le Zhuo, Zewen Chi, Minghao Xu, Heyan Huang, Heqi Zheng, Conghui He, Xian-Ling Mao, and Wentao Zhang. 2024. Protllm: An interleaved protein-language llm with protein-as-word pre-training. arXiv preprint arXiv:2403.07920 (2024).
2159 35 432 3 Maxim Zvyagin, Alexander Brace, Kyle Hippe, Yuntian Deng, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, et al. 2022. GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics. bioRxiv (2022).
2160 36 0 Z. Zhou, Y. Ji, W. Li, P. Dutta, R. V. Davuluri, and H. Liu. Dnabert-2: Efficient foundation model and benchmark for multi-species genome. In 12th International Conference on Learning Representations, ICLR 2024. URL https://arxiv.org/pdf/2306.15006. https://qiita.com/kaizen_nagoya/items/d711266990ec2bed35f2
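DNABERT-2 (above) replaces the fixed overlapping k-mers of the original DNABERT (sketched further below) with a learned byte-pair-encoding vocabulary over raw nucleotides. The sketch below uses the HuggingFace `tokenizers` library purely for illustration; the released model ships its own trained vocabulary, so this is an assumption, not the official tokenizer.

```python
# Hedged sketch of the idea behind DNABERT-2's tokenizer: learn BPE merges
# over nucleotides instead of using fixed k-mers.
from tokenizers import Tokenizer, models, trainers

corpus = ["ACGTACGTTAGCTAGCTAGGACGT", "TTTACGCGCGCGATATACGT"]  # toy sequences

tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
trainer = trainers.BpeTrainer(vocab_size=64, special_tokens=["[UNK]", "[CLS]", "[SEP]"])
tokenizer.train_from_iterator(corpus, trainer)

print(tokenizer.encode("ACGTACGTTAGC").tokens)  # variable-length learned subwords
```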
2161 36 1 Žiga Avsec, Vikram Agarwal, Daniel Visentin, Joseph R Ledsam, Agnieszka Grabska-Barwinska, Kyle R Taylor, Yannis Assael, John Jumper, Pushmeet Kohli, and David R Kelley. Effective gene expression prediction from sequence by integrating long-range interactions. Nature methods, 18 (10):1196–1203, 2021.
2162 36 2 Dennis A Benson, Mark Cavanaugh, Karen Clark, Ilene Karsch-Mizrachi, David J Lipman, James Ostell, and Eric W Sayers. Genbank. Nucleic acids research, 41(D1):D36–D42, 2012.
2163 36 3 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, and Percy Liang. On the opportunities and risks of foundation models, 2022.
2165 36 5 Marta Byrska-Bishop, Uday S. Evani, Xuefang Zhao, Anna O. Basile, Haley J. Abel, Allison A. Regier, André Corvelo, Wayne E. Clarke, Rajeeva Musunuri, Kshithija Nagulapalli, Susan Fairley, Alexi Runnels, Lara Winterkorn, Ernesto Lowy-Gallego, The Human Genome Structural Variation Consortium, Paul Flicek, Soren Germer, Harrison Brand, Ira M. Hall, Michael E. Talkowski, Giuseppe Narzisi, and Michael C. Zody. High coverage whole genome sequencing of the expanded 1000 genomes project cohort including 602 trios. bioRxiv, 2021. doi: 10.1101/2021.02.06.430068. URL https://www.biorxiv.org/content/early/2021/02/07/2021.02.06.430068.
2166 36 6 Ken Chen, Huiying Zhao, and Yuedong Yang. Capturing large genomic contexts for accurately predicting enhancer-promoter interactions. Brief. Bioinform., 23(2), March 2022.
2167 36 7 ENCODE Project Consortium et al. An integrated encyclopedia of dna elements in the human genome. Nature, 489(7414):57, 2012.
2168 36 8 Hugo Dalla-Torre, Liam Gonzalez, Javier Mendoza-Revilla, Nicolas Lopez Carranza, Adam Henryk Grzywaczewski, Francesco Oteri, Christian Dallago, Evan Trop, Hassan Sirelkhatim, Guillaume Richard, Marcin Skwark, Karim Beguir, Marie Lopez, and Thomas Pierrot. The nucleotide transformer: Building and evaluating robust foundation models for human genomics. bioRxiv, 2023. doi: 10.1101/2023.01.11.523679. URL https://www.biorxiv.org/content/early/2023/03/09/2023.01.11.523679.
2169 36 9 Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, and Christopher Ré. Flashattention: Fast and memory-efficient exact attention with io-awareness, 2022.
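FlashAttention (Dao et al., above) is a fused, IO-aware CUDA kernel that computes exact attention without materializing the full score matrix. A common way to benefit from it in PyTorch, version and hardware permitting, is the built-in `scaled_dot_product_attention`, which can dispatch to a FlashAttention-style backend; the sketch below is a usage example, not the kernel itself.

```python
# Usage sketch: PyTorch's SDPA picks a flash / memory-efficient backend
# automatically when one is available (CPU falls back to the math path).
import torch
import torch.nn.functional as F

batch, heads, seq, dim = 2, 8, 1024, 64
q = torch.randn(batch, heads, seq, dim)
k = torch.randn(batch, heads, seq, dim)
v = torch.randn(batch, heads, seq, dim)

out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 1024, 64])
```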
2170 36 10 Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. Language modeling with gated convolutional networks. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML’17, pp. 933–941. JMLR.org, 2017.
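Dauphin et al. (above) build language models from convolutions gated by a sigmoid, the gated linear unit (GLU): a convolution produces twice the channels, and half of them gate the other half. `torch.nn.functional.glu` implements exactly this split, as the short sketch shows.

```python
# Minimal GLU sketch: values * sigmoid(gates), split along the channel dim.
import torch
import torch.nn as nn
import torch.nn.functional as F

conv = nn.Conv1d(in_channels=32, out_channels=64, kernel_size=3, padding=1)
x = torch.randn(1, 32, 100)   # (batch, channels, time)
h = conv(x)                   # (1, 64, 100): values and gates stacked
out = F.glu(h, dim=1)         # (1, 32, 100): first half gated by sigmoid of second
print(out.shape)
```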
2171 36 11 René Dreos, Giovanna Ambrosini, Rouayda Cavin Périer, and Philipp Bucher. Epd and epdnew, high-quality promoter resources in the next-generation sequencing era. Nucleic acids research, 41 (D1):D157–D164, 2013.
2172 36 12 Katarína Grešová, Vlastimil Martinek, David Čechák, Petr Šimeček, and Panagiotis Alexiou. Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genomic Data, 24(1):25, 2023.
2173 36 13 Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. Lora: Low-rank adaptation of large language models, 2021.
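LoRA (Hu et al., above) freezes pretrained weights and learns a low-rank update B @ A scaled by alpha / r. The sketch below is the bare idea in PyTorch, not the `peft` library's implementation.

```python
# Minimal sketch of a LoRA-adapted linear layer.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the pretrained weight
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no update at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
print(layer(torch.randn(4, 768)).shape)  # torch.Size([4, 768])
```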
2174 36 14 Yanrong Ji, Zhihan Zhou, Han Liu, and Ramana V Davuluri. Dnabert: pre-trained bidirectional encoder representations from transformers model for dna-language in genome. Bioinformatics, 37(15):2112–2120, 2021.
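The original DNABERT (above) tokenizes DNA into overlapping k-mers before BERT-style pre-training; a tiny pure-Python sketch of that tokenization follows (k=6 matches one of the published DNABERT variants).

```python
# Overlapping k-mer tokenization as used by DNABERT.
def kmer_tokenize(seq: str, k: int = 6) -> list[str]:
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

print(kmer_tokenize("ACGTACGTAC", k=6))
# ['ACGTAC', 'CGTACG', 'GTACGT', 'TACGTA', 'ACGTAC']
```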
2175 36 15 Junru Jin, Yingying Yu, Ruheng Wang, Xin Zeng, Chao Pang, Yi Jiang, Zhongshen Li, Yutong Dai, Ran Su, Quan Zou, Kenta Nakai, and Leyi Wei. iDNA-ABF: multi-scale deep biological language learning model for the interpretable prediction of DNA methylations. Genome Biol., 23(1):219, October 2022.
2176 36 16 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT, pp. 4171–4186, 2019.
2177 36 17 Shruti Khare, Céline Gurry, Lucas Freitas, Mark B Schultz, Gunter Bach, Amadou Diallo, Nancy Akite, Joses Ho, Raphael TC Lee, Winston Yeo, et al. Gisaid’s role in pandemic response. China CDC weekly, 3(49):1049, 2021.
2178 36 18 Taku Kudo and John Richardson. Sentencepiece: A simple and language independent subword tokenizer and detokenizer for neural text processing, 2018.
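SentencePiece (Kudo and Richardson, above) trains a subword vocabulary directly from raw text, with no language-specific pre-tokenization. A hedged usage sketch of the `sentencepiece` package (pip install sentencepiece) follows; the corpus file, vocabulary size, and model type here are illustrative.

```python
# Usage sketch: train a small BPE model with SentencePiece and encode text.
import sentencepiece as spm

# Write a toy corpus; real use would point at a large text file.
with open("toy_corpus.txt", "w") as f:
    f.write("the quick brown fox jumps over the lazy dog\n" * 100)

spm.SentencePieceTrainer.train(
    input="toy_corpus.txt",
    model_prefix="toy_sp",   # produces toy_sp.model and toy_sp.vocab
    vocab_size=50,
    model_type="bpe",
)
sp = spm.SentencePieceProcessor(model_file="toy_sp.model")
print(sp.encode("the quick fox", out_type=str))
```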
2179 36 19 Nguyen Quoc Khanh Le, Quang-Thai Ho, Van-Nui Nguyen, and Jung-Su Chang. Bert-promoter: An improved sequence-based predictor of dna promoter using bert pre-trained model and shap feature selection. Computational Biology and Chemistry, 99:107732, 2022.
2180 36 20 Dohoon Lee, Jeewon Yang, and Sun Kim. Learning the histone codes with large genomic windows and three-dimensional chromatin interactions using transformer. Nature Communications, 13(1): 6678, 2022.
2181 36 21 Zhongxiao Li, Elva Gao, Juexiao Zhou, Wenkai Han, Xiaopeng Xu, and Xin Gao. Applications of deep learning in understanding gene regulation. Cell Reports Methods, 3(1):100384, 2023. ISSN 2667-2375. doi: https://doi.org/10.1016/j.crmeth.2022.100384. URL https://www.sciencedirect.com/science/article/pii/S2667237522002892.
2182 36 22 Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization, 2019.
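Loshchilov and Hutter (above) decouple weight decay from the gradient update; PyTorch exposes this directly as `torch.optim.AdamW`, as the short sketch shows.

```python
# AdamW usage sketch: decay is applied to the weights themselves,
# not folded into the gradient as with L2 regularization.
import torch

model = torch.nn.Linear(16, 4)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.01)

loss = model(torch.randn(8, 16)).pow(2).mean()
loss.backward()
opt.step()
opt.zero_grad()
```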
2183 36 23 Eric Nguyen, Michael Poli, Marjan Faizi, Armin Thomas, Michael Wornow, Callum Birch-Sykes, Stefano Massaroli, Aman Patel, Clayton Rabideau, Yoshua Bengio, et al. Hyenadna: Long-range genomic sequence modeling at single nucleotide resolution. Advances in neural information processing systems, 36, 2024.
2184 36 24 Yu Ni, Linqi Fan, Miao Wang, Ning Zhang, Yongchun Zuo, and Mingzhi Liao. EPI-Mind: Identifying enhancer-promoter interactions based on transformer mechanism. Interdiscip. Sci., 14(3):786–794, September 2022.
2185 36 25 OpenAI. Gpt-4 technical report, 2023.
2186 36 26 Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, and Ryan Lowe. Training language models to follow instructions with human feedback, 2022.
2187 36 27 Ofir Press, Noah A Smith, and Mike Lewis. Train short, test long: Attention with linear biases enables input length extrapolation. arXiv preprint arXiv:2108.12409, 2021.
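ALiBi (Press et al., above) drops position embeddings and instead adds a head-specific linear penalty to the attention logits, proportional to the query-key distance, which is what lets models extrapolate to longer inputs. A sketch of the bias computation follows, using the paper's geometric slope schedule for a power-of-two head count.

```python
# Sketch of the ALiBi attention bias.
import torch

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    # Slopes 2^(-8/n), 2^(-16/n), ... for n heads (the paper's schedule
    # when n is a power of two).
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)])
    pos = torch.arange(seq_len)
    rel = (pos[None, :] - pos[:, None]).float()  # rel[i, j] = j - i
    # In causal attention only j <= i is used; there rel <= 0, so the
    # bias is a penalty growing linearly with distance into the past.
    return slopes[:, None, None] * rel           # (heads, seq, seq)

bias = alibi_bias(num_heads=8, seq_len=128)
# scores = q @ k.transpose(-2, -1) / dim**0.5 + bias, then causal mask + softmax
print(bias.shape)  # torch.Size([8, 128, 128])
```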
2188 36 28 Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 21(1):5485–5551, 2020.
2189 36 29 Joel Rozowsky, Jiahao Gao, Beatrice Borsari, Yucheng T Yang, Timur Galeev, Gamze Gürsoy, Charles B Epstein, Kun Xiong, Jinrui Xu, Tianxiao Li, Jason Liu, Keyang Yu, Ana Berthel, Zhanlin Chen, Fabio Navarro, Maxwell S Sun, James Wright, Justin Chang, Christopher J F Cameron, Noam Shoresh, Elizabeth Gaskell, Jorg Drenkow, Jessika Adrian, Sergey Aganezov, François Aguet, Gabriela Balderrama-Gutierrez, Samridhi Banskota, Guillermo Barreto Corona, Sora Chee, Surya B Chhetri, Gabriel Conte Cortez Martins, Cassidy Danyko, Carrie A Davis, Daniel Farid, Nina P Farrell, Idan Gabdank, Yoel Gofin, David U Gorkin, Mengting Gu, Vivian Hecht, Benjamin C Hitz, Robbyn Issner, Yunzhe Jiang, Melanie Kirsche, Xiangmeng Kong, Bonita R Lam, Shantao Li, Bian Li, Xiqi Li, Khine Zin Lin, Ruibang Luo, Mark Mackiewicz, Ran Meng, Jill E Moore, Jonathan Mudge, Nicholas Nelson, Chad Nusbaum, Ioann Popov, Henry E Pratt, Yunjiang Qiu, Srividya Ramakrishnan, Joe Raymond, Leonidas Salichos, Alexandra Scavelli, Jacob M Schreiber, Fritz J Sedlazeck, Lei Hoon See, Rachel M Sherman, Xu Shi, Minyi Shi, Cricket Alicia Sloan, J Seth Strattan, Zhen Tan, Forrest Y Tanaka, Anna Vlasova, Jun Wang, Jonathan Werner, Brian Williams, Min Xu, Chengfei Yan, Lu Yu, Christopher Zaleski, Jing Zhang, Kristin Ardlie, J Michael Cherry, Eric M Mendenhall, William S Noble, Zhiping Weng, Morgan E Levine, Alexander Dobin, Barbara Wold, Ali Mortazavi, Bing Ren, Jesse Gillis, Richard M Myers, Michael P Snyder, Jyoti Choudhary, Aleksandar Milosavljevic, Michael C Schatz, Bradley E Bernstein, Roderic Guigó, Thomas R Gingeras, and Mark Gerstein. The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models. Cell, 186(7):1493–1511.e40, March 2023.
2191 36 31 Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1715–1725, Berlin, Germany, August 2016. Association for Computational Linguistics. doi: 10.18653/v1/P16-1162. URL https://aclanthology.org/P16-1162.
Noam Shazeer. Glu variants improve transformer, 2020.
2192 36 32 Gabrielle D Smith, Wan Hern Ching, Paola Cornejo-Páramo, and Emily S Wong. Decoding enhancer complexity with machine learning and high-throughput discovery. Genome Biol., 24(1):116, May 2023.
2193 36 33 John A Stamatoyannopoulos, Michael Snyder, Ross Hardison, Bing Ren, Thomas Gingeras, David M Gilbert, Mark Groudine, Michael Bender, Rajinder Kaul, Theresa Canfield, et al. An encyclopedia of mouse dna elements (mouse encode). Genome biology, 13(8):1–5, 2012.
2194 36 34 Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, and Yunfeng Liu. Roformer: Enhanced transformer with rotary position embedding. arXiv preprint arXiv:2104.09864, 2021.
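RoFormer (Su et al., above) introduces rotary position embedding (RoPE): each pair of query/key dimensions is rotated by an angle that grows with position, so dot products depend only on relative offsets. A hedged NumPy-free sketch in PyTorch follows.

```python
# Sketch of rotary position embedding (RoPE) applied to a (seq, dim) tensor.
import torch

def apply_rope(x: torch.Tensor) -> torch.Tensor:
    seq_len, dim = x.shape                 # dim must be even
    half = dim // 2
    freqs = 1.0 / (10000 ** (torch.arange(half) / half))      # per-pair frequencies
    angles = torch.arange(seq_len)[:, None] * freqs[None, :]  # (seq, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]
    # 2D rotation of each (x1, x2) pair by its position-dependent angle.
    rotated = torch.stack([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
    return rotated.reshape(seq_len, dim)

q = torch.randn(16, 64)
print(apply_rope(q).shape)  # torch.Size([16, 64])
```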
2195 36 35 Chi Sun, Xipeng Qiu, Yige Xu, and Xuanjing Huang. How to fine-tune bert for text classification?, 2020.
2196 36 36 The Mosaic ML Team. composer. https://github.com/mosaicml/composer/, 2021.
2197 36 37 Christina V Theodoris, Ling Xiao, Anant Chopra, Mark D Chaffin, Zeina R Al Sayed, Matthew C Hill, Helene Mantineo, Elizabeth M Brydon, Zexian Zeng, X Shirley Liu, and Patrick T Ellinor. Transfer learning enables predictions in network biology. Nature, 618(7965):616–624, June 2023.
2198 36 38 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.
2199 36 39 Ruohan Wang, Zishuai Wang, Jianping Wang, and Shuaicheng Li. Splicefinder: ab initio prediction of splice sites using convolutional neural network. BMC bioinformatics, 20:1–13, 2019.
2200 36 40 Zixuan Wang, Yongqing Zhang, Yuhang Liu, Shuwen Xiong, Maocheng Wang, Jiliu Zhou, and Meiqin Gong. Towards a better understanding of tf-dna binding prediction from genomic features. Computers in Biology and Medicine, pp. 105993, 2022.
2201 36 41 Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander M. Rush. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45, Online, October 2020. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/2020.emnlp-demos.6.
2202 36 42 Pengyu Zhang, Hongming Zhang, and Hao Wu. ipro-wael: a comprehensive and robust framework for identifying promoters in multiple species. Nucleic Acids Research, 50(18):10278–10289, 2022.