
DNA LLM for survey 2000 papers.

Posted at 2025-08-07

The revised version of this list is below:
DNA LLM and genome for survey 2200 papers.
https://qiita.com/kaizen_nagoya/items/a0ed467467ee74f66ce4

No. S R R.R title and URL new or qiita
1 0 Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model https://arxiv.org/abs/2505.23579 https://qiita.com/kaizen_nagoya/items/0718b214043a614deee0
2 1 0 [1] J. Amberger, C. A. Bocchini, A. F. Scott, and A. Hamosh. McKusick's Online Mendelian Inheritance in Man (OMIM®). Nucleic Acids Research, 37:D793, 2008. ISSN 0305-1048. doi: 10.1093/nar/gkn665. URL https://pmc.ncbi.nlm.nih.gov/articles/PMC2686440/. https://qiita.com/kaizen_nagoya/items/c599d867201d1ffb1f4d
3 1 1.McKusick VA. On the X Chromosome of Man. Quart. Rev. Biol. 1962;37:69–175. doi: 10.1086/403631.
4 2 2.McKusick VA. Mendelian Inheritance in Man, A Catalog of Autosomal Dominant, Autosomal Recessive, and X-linked Phenotypes. 1st edn. Baltimore, MD: Johns Hopkins University Press; 1966.
5 3 3.McKusick VA. Mendelian Inheritance in Man, A Catalog of Human Genes and Genetic Disorders. 12th edn. Baltimore, MD: Johns Hopkins University Press; 1998.
6 4 4.Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA. Online Mendelian Inheritance in Man (OMIM®), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 2005;33:514–517. doi: 10.1093/nar/gki033.
7 2 0 [2] Anthropic. Claude 3.7 sonnet, February 2025. URL https://www.anthropic.com/news/claude-3-7-sonnet. Accessed: 2025-05-13. https://qiita.com/kaizen_nagoya/items/4364d9c475114353cf2a
8 3 0 [3] G. Benegas, C. Ye, C. Albors, J. C. Li, and Y. S. Song. Genomic language models: Opportunities and challenges. ArXiv, page arXiv:2407.11435v2, 9 2024. ISSN 2331-8422. URL https://pmc.ncbi.nlm.nih.gov/articles/PMC11275703/ http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC11275703. https://pubmed.ncbi.nlm.nih.gov/39753409/ https://qiita.com/kaizen_nagoya/items/f797330e64e0c7d05f39
9 1 1. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. Attention is all you need. In: Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R., eds. Advances in Neural Information Processing Systems vol. 30. Curran Associates, Inc. (2017).
10 2 2. Gulati, A., Qin, J., Chiu, C.-C., Parmar, N., Zhang, Y., Yu, J., Han, W., Wang, S., Zhang, Z., Wu, Y. et al. (2020). Conformer: Convolution-augmented transformer for speech recognition. arXiv preprint arXiv:2005.08100. https://arxiv.org/abs/2005.08100.
11 3 3. Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F. L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S. et al. (2023). GPT-4 technical report. arXiv preprint arXiv:2303.08774. https://arxiv.org/abs/2303.08774.
12 4 4. Bateman, A., Martin, M.-J., Orchard, S., Magrane, M., Ahmad, S., Alpi, E., Bowler-Barnett, E. H., Britto, R., Cukura, A., Denny, P. et al. (2023). UniProt: the universal protein knowledgebase in 2023. Nucleic Acids Research 51, D523–D531.
13 5 5. Lin, Z., Akin, H., Rao, R., Hie, B., Zhu, Z., Lu, W., Smetanin, N., Verkuil, R., Kabeli, O., Shmueli, Y. et al. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130.
14 6 6. Meier, J., Rao, R., Verkuil, R., Liu, J., Sercu, T., and Rives, A. Language models enable zero-shot prediction of the effects of mutations on protein function. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., eds. Advances in Neural Information Processing Systems vol. 34. Curran Associates, Inc. (2021):( 29287–29303). https://proceedings.neurips.cc/paper_files/paper/2021/file/f51338d736f95dd42427296047067694-Paper.pdf.
15 7 7. Truong Jr, T., and Bepler, T. PoET: A generative model of protein families as sequences-of-sequences. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., and Levine, S., eds. Advances in Neural Information Processing Systems vol. 36. Curran Associates, Inc. (2023):( 77379–77415). https://proceedings.neurips.cc/paper_files/paper/2023/file/f4366126eba252699b280e8f93c0ab2f-Paper-Conference.pdf.
16 8 8. Bepler, T., and Berger, B. (2021). Learning the protein language: Evolution, structure, and function. Cell Systems 12, 654–669.
17 9 9. Ruffolo, J. A., and Madani, A. (2024). Designing proteins with language models. Nature Biotechnology 42, 200–202.
18 10 10. Riesselman, A. J., Ingraham, J. B., and Marks, D. S. (2018). Deep generative models of genetic variation capture the effects of mutations. Nature Methods 15, 816–822.
19 11 11. Frazer, J., Notin, P., Dias, M., Gomez, A., Min, J. K., Brock, K., Gal, Y., and Marks, D. S. (2021). Disease variant prediction with deep generative models of evolutionary data. Nature 599, 91–95.
20 12 12. Brandes, N., Goldman, G., Wang, C. H., Ye, C. J., and Ntranos, V. (2023). Genome-wide prediction of disease variant effects with a deep protein language model. Nature Genetics. https://doi.org/10.1038/s41588-023-01465-0. doi:10.1038/s41588-023-01465-0.
21 13 13. Benegas, G., Batra, S. S., and Song, Y. S. (2023). DNA language models are powerful predictors of genome-wide variant effects. Proceedings of the National Academy of Sciences 120,e2311219120.
22 14 14. Mendoza-Revilla, J., Trop, E., Gonzalez, L., Roller, M., Dalla-Torre, H., de Almeida, B. P., Richard, G., Caton, J., Lopez Carranza, N., Skwark, M., Laterre, A., Beguir, K., Pierrot, T., and Lopez, M. (2024). A foundational large language model for edible plant genomes. Communications Biology 7, 835. https://doi.org/10.1038/s42003-024-06465-2. doi:10.1038/s42003-024-06465-2.
23 15 15. Zhai, J., Gokaslan, A., Schiff, Y., Berthel, A., Liu, Z.-Y., Miller, Z. R., Scheben, A., Stitzer, M. C., Romay, C., Buckler, E. S., and Kuleshov, V. (2024). Cross-species plant genomes modeling at single nucleotide resolution using a pre-trained DNA language model. bioRxiv preprint. https://www.biorxiv.org/content/early/2024/06/05/2024.06.04.596709. doi:10.1101/2024.06.04.596709.
24 16 16. Dalla-Torre, H., Gonzalez, L., Mendoza Revilla, J., Lopez Carranza, N., Henryk Grzywaczewski, A., Oteri, F., Dallago, C., Trop, E., Sirelkhatim, H., Richard, G. et al. (2023). The Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.01.11.523679v3.
25 17 17. Benegas, G., Albors, C., Aw, A. J., Ye, C., and Song, Y. S. (2023). GPN-MSA: an alignment-based DNA language model for genome-wide variant effect prediction. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.10.10.561776v2.
26 18 18. Hsu, C., Nisonoff, H., Fannjiang, C., and Listgarten, J. (2022). Learning protein fitness models from evolutionary and assay-labeled data. Nature Biotechnology 40, 1114–1122.
27 19 19. Tomaz da Silva, P., Karollus, A., Hingerl, J., Galindez, G., Wagner, N., Hernandez-Alias, X., Incarnato, D., and Gagneur, J. (2024). Nucleotide dependency analysis of DNA language models reveals genomic functional elements. bioRxiv preprint ( 2024–07). https://www.biorxiv.org/content/10.1101/2024.07.27.605418v1.
28 20 20. Siepel, A., Bejerano, G., Pedersen, J. S., Hinrichs, A. S., Hou, M., Rosenbloom, K., Clawson, H., Spieth, J., Hillier, L. W., Richards, S. et al. (2005). Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Research 15, 1034–1050.
29 21 21. Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R., and Siepel, A. (2010). Detection of nonneutral substitution rates on mammalian phylogenies. Genome Research 20, 110–121.
30 22 22. Avsec, Z., Agarwal, V., Visentin, D., Ledsam, J. R., Grabska-Barwinska, A., Taylor, K. R., Assael, Y., Jumper, J., Kohli, P., and Kelley, D. R. (2021). Effective gene expression prediction from sequence by integrating long-range interactions. Nature Methods 18, 1196–1203.
31 23 23. Jaganathan, K., Panagiotopoulou, S. K., McRae, J. F., Darbandi, S. F., Knowles, D., Li, Y. I., Kosmicki, J. A., Arbelaez, J., Cui, W., Schwartz, G. B. et al. (2019). Predicting splicing from primary sequence with deep learning. Cell 176, 535–548.
32 24 24. Schiff, Y., Kao, C.-H., Gokaslan, A., Dao, T., Gu, A., and Kuleshov, V. (2024). Caduceus: Bi-directional equivariant long-range DNA sequence modeling. arXiv preprint arXiv:2403.03234.https://arxiv.org/abs/2403.03234.
33 25 25. Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., Hesse, C., Chen, M., Sigler, E., Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., McCandlish, S., Radford, A., Sutskever, I., and Amodei, D. Language models are few-shot learners. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., eds. Advances in Neural Information Processing Systems vol. 33. Curran Associates, Inc. (2020):( 1877–1901). https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf.
34 26 26. Madani, A., Krause, B., Greene, E. R., Subramanian, S., Mohr, B. P., Holton, J. M., Olmos, J. L., Xiong, C., Sun, Z. Z., Socher, R. et al. (2023). Large language models generate functional protein sequences across diverse families. Nature Biotechnology 41, 1099–1106.
35 27 27. Ingraham, J., Garg, V., Barzilay, R., and Jaakkola, T. Generative models for graph-based protein design. In: Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., and Garnett, R., eds. Advances in Neural Information Processing Systems vol. 32. Curran Associates, Inc. (2019): https://proceedings.neurips.cc/paper_files/paper/2019/file/f3a4ff4839c56a5f460c88cce3666a2b-Paper.pdf.
36 28 28. Hsu, C., Verkuil, R., Liu, J., Lin, Z., Hie, B., Sercu, T., Lerer, A., and Rives, A. Learning inverse folding from millions of predicted structures. In: International Conference on Machine Learning. PMLR (2022):( 8946–8970).
37 29 29. Shin, J.-E., Riesselman, A. J., Kollasch, A. W., McMahon, C., Simon, E., Sander, C., Manglik, A., Kruse, A. C., and Marks, D. S. (2021). Protein design and variant prediction using autoregressive generative models. Nature Communications 12, 2403.
38 30 30. Lal, A., Garfield, D., Biancalani, T., and Eraslan, G. regLM: Designing realistic regulatory DNA with autoregressive language models. In: International Conference on Research in Computational Molecular Biology. Springer (2024):( 332–335).
39 31 31. Nguyen, E., Poli, M., Durrant, M. G., Thomas, A. W., Kang, B., Sullivan, J., Ng, M. Y., Lewis, A., Patel, A., Lou, A. et al. (2024). Sequence modeling and design from molecular to genome scale with Evo. bioRxiv preprint ( 2024–02). https://www.biorxiv.org/content/10.1101/2024.02.27.582234v2.
40 32 32. Wang, Y., Wang, H., Wei, L., Li, S., Liu, L., and Wang, X. (2020). Synthetic promoter design in Escherichia coli based on a deep generative network. Nucleic Acids Research 48,6403–6412.
41 33 33. Jores, T., Tonnies, J., Wrightsman, T., Buckler, E. S., Cuperus, J. T., Fields, S., and Queitsch, C. (2021). Synthetic promoter designs enabled by a comprehensive analysis of plant core promoters. Nature Plants 7, 842–855.
42 34 34. de Almeida, B. P., Reiter, F., Pagani, M., and Stark, A. (2022). DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers. Nature Genetics 54, 613–624.
43 35 35. Nguyen, E., Poli, M., Faizi, M., Thomas, A., Wornow, M., Birch-Sykes, C., Massaroli, S., Patel, A., Rabideau, C., Bengio, Y., Ermon, S., Ré, C., and Baccus, S. HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., and Levine, S., eds. Advances in Neural Information Processing Systems vol. 36. Curran Associates, Inc. (2023):( 43177–43201).
45 37 36. Shao, B. (2023). A long-context language model for deciphering and generating bacteriophage genomes. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.12.18.572218v3.
46 38 37. Ratcliff, J. D. (2024). Transformer model generated bacteriophage genomes are compositionally distinct from natural sequences. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.03.19.585716v1.
47 39 38. Alipanahi, B., Delong, A., Weirauch, M. T., and Frey, B. J. (2015). Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology 33, 831–838.
48 40 39. Zhou, J., and Troyanskaya, O. G. (2015). Predicting effects of noncoding variants with deep learning–based sequence model. Nature Methods 12, 931–934.
49 41 40. Kelley, D. R., Snoek, J., and Rinn, J. L. (2016). Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Research 26, 990–999.
50 42 41. Kelley, D. R., Reshef, Y. A., Bileschi, M., Belanger, D., McLean, C. Y., and Snoek, J. (2018). Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Research 28, 739–750.
51 43 42. Zeng, T., and Li, Y. I. (2022). Predicting RNA splicing from DNA sequence using Pangolin. Genome Biology 23, 103. https://genomebiology.biomedcentral.com/articles/10.1186/s13059-022-02664-4. doi:10.1186/s13059-022-02664-4.
52 44 43. Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In: Burstein, J., Doran, C., and Solorio, T., eds. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics (2019):(4171–4186). https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.
53 45 44. Bommasani, R., Hudson, D. A. et al. (2021). On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258. https://arxiv.org/abs/2108.07258.
54 46 45. West-Roberts, J., Kravitz, J., Jha, N., Cornman, A., and Hwang, Y. (2024). Diverse genomic embedding benchmark for functional evaluation across the tree of life. bioRxiv ( 2024–07). https://www.biorxiv.org/content/10.1101/2024.07.10.602933v1.
55 47 46. de Almeida, B. P., Dalla-Torre, H., Richard, G., Blum, C., Hexemer, L., Gélard, M., Mendoza-Revilla, J., Pandey, P., Laurent, S., Lopez, M. et al. (2024). SegmentNT: annotating the genome at single-nucleotide resolution with DNA foundation models. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.03.14.584712v2.
56 48 47. Zhou, Z., Wu, W., Ho, H., Wang, J., Shi, L., Davuluri, R. V., Wang, Z., and Liu, H. (2024). DNABERT-S: Learning species-aware dna embedding with genome foundation models. arXiv preprint. https://arxiv.org/abs/2402.08777.
57 49 48. Zhou, Z., Ji, Y., Li, W., Dutta, P., Davuluri, R., and Liu, H. (2023). DNABERT-2: Efficient foundation model and benchmark for multi-species genome. arXiv preprint arXiv:2306.15006. https://arxiv.org/abs/2306.15006.
58 50 49. Garau-Luis, J. J., Bordes, P., Gonzalez, L., Roller, M., de Almeida, B. P., Hexemer, L., Blum, C., Laurent, S., Grzegorzewski, J., Lang, M. et al. (2024). Multi-modal transfer learning between biological foundation models. arXiv preprint arXiv:2406.14150.
59 51 50. Marin, F. I., Teufel, F., Horlacher, M., Madsen, D., Pultz, D., Winther, O., and Boomsma, W. BEND: Benchmarking DNA Language Models on Biologically Meaningful Tasks. In: International Conference on Learning Representations (2024).
60 52 51. Tang, Z., and Koo, P. K. (2024). Evaluating the representational power of pre-trained DNA language models for regulatory genomics. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.02.29.582810v1.
61 53 52. Li, F.-Z., Amini, A. P., Yue, Y., Yang, K. K., and Lu, A. X. (2024). Feature reuse and scaling: Understanding transfer learning with protein language models. bioRxiv preprint (2024-02).
62 54 53. Zaheer, M., Guruganesh, G., Dubey, K. A., Ainslie, J., Alberti, C., Ontanon, S., Pham, P., Ravula, A., Wang, Q., Yang, L., and Ahmed, A. Big Bird: Transformers for Longer Sequences. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., eds. Advances in Neural Information Processing Systems vol. 33. Curran Associates, Inc. (2020):(17283–17297).
63 55 54. Ji, Y., Zhou, Z., Liu, H., and Davuluri, R. V. (2021). DNABERT: pre-trained bidirectional encoder representations from Transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120.
64 56 55. Mo, S., Fu, X., Hong, C., Chen, Y., Zheng, Y., Tang, X., Lan, Y., Shen, Z., and Xing, E. Multi-modal Self-supervised Pre-training for Large-scale Genome Data. In: NeurIPS 2021 AI for Science Workshop (2021).
65 57 56. Trotter, M. V., Nguyen, C. Q., Young, S., Woodruff, R. T., and Branson, K. M. (2021). Epigenomic language models powered by Cerebras. arXiv preprint arXiv:2112.07571. https://arxiv.org/abs/2112.07571.
66 58 57. Zhang, Y., An, L., Yue, F., and Hardison, R. C. (2016). Jointly characterizing epigenetic dynamics across multiple human cell types. Nucleic Acids Research 44, 6721–6731.
67 59 58. Hoarfrost, A., Aptekmann, A., Farfañuk, G., and Bromberg, Y. (2022). Deep learning of a bacterial and archaeal universal language of life enables transfer learning and illuminates microbial dark matter. Nature Communications 13, 2606.
68 60 59. Yang, M., Huang, L., Huang, H., Tang, H., Zhang, N., Yang, H., Wu, J., and Mu, F. (2022). Integrating convolution and self-attention improves language model of human genome for interpreting non-coding regions at base-resolution. Nucleic Acids Research 50, e81–e81.
69 61 60. Gwak, H.-J., and Rho, M. (2022). ViBE: a hierarchical BERT model to identify eukaryotic viruses using metagenome sequencing data. Briefings in Bioinformatics 23. doi:10.1093/bib/bbac204. Bbac204.
70 62 61. Levy, B., Xu, Z., Zhao, L., Kremling, K., Altman, R., Wong, P., and Tanner, C. FloraBERT: cross-species transfer learning with attention-based neural networks for gene expression prediction (2022). https://doi.org/10.21203/rs.3.rs-1927200/v1. doi:10.21203/rs.3.rs-1927200/v1.
71 63 62. Bai, Z., Zhang, Y.-z., Miyano, S., Yamaguchi, R., Fujimoto, K., Uematsu, S., and Imoto, S. (2022). Identification of bacteriophage genome sequences with representation learning. Bioinformatics, btac509.
72 64 63. Zvyagin, M., Brace, A., Hippe, K., Deng, Y., Zhang, B., Bohorquez, C. O., Clyde, A., Kale, B., Perez-Rivera, D., Ma, H. et al. (2023). GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics. The International Journal of High Performance Computing Applications 37, 683–705.
73 65 64. Chen, K., Zhou, Y., Ding, M., Wang, Y., Ren, Z., and Yang, Y. (2024). Self-supervised learning on millions of primary RNA sequences from 72 vertebrates improves sequence-based RNA splicing prediction. Briefings in Bioinformatics 25, bbae163.
74 66 65. Karollus, A., Hingerl, J., Gankin, D., Grosshauser, M., Klemon, K., and Gagneur, J. (2024). Species-aware DNA language models capture regulatory elements and their evolution. Genome Biology 25, 83.
75 67 66. Fishman, V., Kuratov, Y., Petrov, M., Shmelev, A., Shepelin, D., Chekanov, N., Kardymon, O., and Burtsev, M. (2023). GENA-LM: A Family of Open-Source Foundational Models for Long DNA Sequences. bioRxiv preprint. https://www.biorxiv.org/content/early/2023/06/13/2023.06.12.544594. doi:10.1101/2023.06.12.544594.
76 68 67. Sanabria, M., Hirsch, J., and Poetsch, A. R. (2023). The human genome’s vocabulary as proposed by the DNA language model GROVER. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.07.19.549677v2.
77 69 68. Zhang, D., Zhang, W., He, B., Zhang, J., Qin, C., and Yao, J. (2023). DNAGPT: A generalized pretrained tool for multiple DNA sequence analysis tasks. arXiv preprint. https://arxiv.org/abs/2307.05628.
78 70 69. Chu, Y., Yu, D., Li, Y., Huang, K., Shen, Y., Cong, L., Zhang, J., and Wang, M. (2024). A 5’ UTR language model for decoding untranslated regions of mRNA and function predictions. Nature Machine Intelligence 6, 449–460.
79 71 70. Lorenz, R., Bernhart, S. H., Höner zu Siederdissen, C., Tafer, H., Flamm, C., Stadler, P. F., and Hofacker, I. L. (2011). ViennaRNA Package 2.0. Algorithms for Molecular Biology 6, 1–14.
80 72 71. Robson, E. S., and Ioannidis, N. M. (2023). GUANinE v1.0: Benchmark Datasets for Genomic AI Sequence-to-Function Models. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.10.12.562113v3.
81 73 72. Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., and Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research 21, 1–67. http://jmlr.org/papers/v21/20-074.html.
82 74 73. Kudo, T. Subword regularization: Improving neural network translation models with multiple subword candidates. In: Gurevych, I., and Miyao, Y., eds. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Melbourne, Australia: Association for Computational Linguistics (2018):( 66–75). https://aclanthology.org/P18-1007. doi:10.18653/v1/P18-1007.
83 75 74. Richard, G., de Almeida, B. P., Dalla-Torre, H., Blum, C., Hexemer, L., Pandey, P., Laurent, S., Lopez, M. P., Laterre, A., Lang, M. et al. (2024). ChatNT: A Multimodal Conversational Agent for DNA, RNA and Protein Tasks. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.04.30.591835v1.
84 76 75. Chiang, W.-L., Li, Z., Lin, Z., Sheng, Y., Wu, Z., Zhang, H., Zheng, L., Zhuang, S., Zhuang, Y., Gonzalez, J. E., Stoica, I., and Xing, E. P. Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality (2023). https://lmsys.org/blog/2023-03-30-vicuna/.
85 77 76. He, Y., Fang, P., Shan, Y., Pan, Y., Wei, Y., Chen, Y., Chen, Y., Liu, Y., Zeng, Z., Zhou, Z. et al. (2024). LucaOne: Generalized Biological Foundation Model with Unified Nucleic Acid and Protein Language. bioRxiv preprint ( 2024–05). https://www.biorxiv.org/content/10.1101/2024.05.10.592927v1.
86 78 77. Zhu, X., Qin, C., Wang, F., Yang, F., He, B., Zhao, Y., and Yao, J. (2024). CD-GPT: A Biological Foundation Model Bridging the Gap between Molecular Sequences Through Central Dogma. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.06.24.600337v1.
87 79 78. Cornman, A., West-Roberts, J., Camargo, A. P., Roux, S., Beracochea, M., Mirdita, M., Ovchinnikov, S., and Hwang, Y. (2024). The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling. bioRxiv preprint ( 2024–08). https://www.biorxiv.org/content/10.1101/2024.08.14.607850v1.
88 80 79. Markowitz, V. M., Chen, I.-M. A., Palaniappan, K., Chu, K., Szeto, E., Grechkin, Y., Ratner, A., Jacob, B., Huang, J., Williams, P. et al. (2012). IMG: the integrated microbial genomes database and comparative analysis system. Nucleic Acids Research 40, D115–D122.
89 81 80. Richardson, L., Allen, B., Baldi, G., Beracochea, M., Bileschi, M. L., Burdett, T., Burgin, J., Caballero-Pérez, J., Cochrane, G., Colwell, L. J. et al. (2023). MGnify: the microbiome sequence data analysis resource in 2023. Nucleic Acids Research 51, D753–D759.
90 82 81. Gao, L., Biderman, S., Black, S., Golding, L., Hoppe, T., Foster, C., Phang, J., He, H., Thite, A., Nabeshima, N., Presser, S., and Leahy, C. (2020). The Pile: An 800GB Dataset of Diverse Text for Language Modeling. arXiv preprint arXiv:2101.00027. https://arxiv.org/abs/2101.00027.
91 83 82. Longpre, S., Biderman, S., Albalak, A., Schoelkopf, H., McDuff, D., Kapoor, S., Klyman, K., Lo, K., Ilharco, G., San, N. et al. (2024). The responsible foundation model development cheatsheet: A review of tools & resources. arXiv preprint arXiv:2406.16746. https://arxiv.org/abs/2406.16746.
92 84 83. Sullivan, P. F., Meadows, J. R., Gazal, S., Phan, B. N., Li, X., Genereux, D. P., Dong, M. X., Bianchi, M., Andrews, G., Sakthikumar, S. et al. (2023). Leveraging base-pair mammalian constraint to understand genetic variation and human disease. Science 380, eabn2937.
93 85 84. Lee, K., Ippolito, D., Nystrom, A., Zhang, C., Eck, D., Callison-Burch, C., and Carlini, N. Deduplicating training data makes language models better. In: Muresan, S., Nakov, P., and Villavicencio, A., eds. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Dublin, Ireland: Association for Computational Linguistics (2022):( 8424–8445). https://aclanthology.org/2022.acl-long.577. doi:10.18653/v1/2022.acl-long.577.
94 86 85. Schoenfelder, S., and Fraser, P. (2019). Long-range enhancer–promoter contacts in gene expression control. Nature Reviews Genetics 20, 437–455.
95 87 86. Karnuta, J. M., and Scacheri, P. C. (2018). Enhancers: bridging the gap between gene control and human disease. Human Molecular Genetics 27, R219–R227.
96 88 87. King, J. L., and Jukes, T. H. (1969). Non-darwinian evolution. Science 164, 788–798. doi:10.1126/science.164.3881.788.
97 89 88. Tay, Y., Dehghani, M., Gupta, J. P., Aribandi, V., Bahri, D., Qin, Z., and Metzler, D. Are pretrained convolutions better than pretrained transformers? In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (2021):( 4349–4359). https://aclanthology.org/2021.acl-long.335/.
98 90 89. Yang, K. K., Fusi, N., and Lu, A. X. (2024). Convolutions are competitive with transformers for protein sequence pretraining. Cell Systems 15, 286–294.
99 91 90. Linder, J., Srivastava, D., Yuan, H., Agarwal, V., and Kelley, D. R. (2023). Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.08.30.555582v1.
100 92 91. Su, J., Ahmed, M., Lu, Y., Pan, S., Bo, W., and Liu, Y. (2024). Roformer: Enhanced transformer with rotary position embedding. Neurocomputing 568, 127063.
101 93 92. Dai, Z., Yang, Z., Yang, Y., Carbonell, J. G., Le, Q. V., and Salakhutdinov, R. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context. In: Korhonen, A., Traum, D. R., and Màrquez, L., eds. Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28–August 2, 2019, Volume 1: Long Papers. Association for Computational Linguistics (2019):( 2978–2988). https://doi.org/10.18653/v1/p19-1285. doi:10.18653/v1/p19-1285.
102 94 93. Yu, L., Simig, D., Flaherty, C., Aghajanyan, A., Zettlemoyer, L., and Lewis, M. MEGABYTE: Predicting million-byte sequences with multiscale transformers. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., and Levine, S., eds. Advances in Neural Information Processing Systems vol. 36. Curran Associates, Inc. (2023):( 78808–78823). https://proceedings.neurips.cc/paper_files/paper/2023/file/f8f78f8043f35890181a824e53a57134-Paper-Conference.pdf.
103 95 94. Gu, A., Goel, K., and Ré, C. Efficiently modeling long sequences with structured state spaces. In: International Conference on Learning Representations (2022).
104 96 95. Poli, M., Massaroli, S., Nguyen, E., Fu, D. Y., Dao, T., Baccus, S., Bengio, Y., Ermon, S., and Ré, C. Hyena Hierarchy: Towards larger convolutional language models. In: International Conference on Machine Learning. PMLR (2023):( 28043–28078).
105 97 96. Gu, A., and Dao, T. (2023). Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752. https://arxiv.org/abs/2312.00752.
106 98 97. Rives, A., Meier, J., Sercu, T., Goyal, S., Lin, Z., Liu, J., Guo, D., Ott, M., Zitnick, C. L., Ma, J., and Fergus, R. (2021). Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proceedings of the National Academy of Sciences 118, e2016239118.
107 99 98. Cheng, X., Chen, B., Li, P., Gong, J., Tang, J., and Song, L. (2024). Training compute-optimal protein language models. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.06.06.597716v1.
108 100 99. Samuel, D. (2024). BERTs are Generative In-Context Learners. arXiv preprint arXiv:2406.04823. https://arxiv.org/abs/2406.04823.
109 101 100. Hayes, T., Rao, R., Akin, H., Sofroniew, N. J., Oktay, D., Lin, Z., Verkuil, R., Tran, V. Q., Deaton, J., Wiggert, M. et al. (2024). Simulating 500 million years of evolution with a language model. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2024.07.01.600583v1.
110 102 101. Sennrich, R., Haddow, B., and Birch, A. Neural machine translation of rare words with subword units. In: Erk, K., and Smith, N. A., eds. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Berlin, Germany: Association for Computational Linguistics (2016):( 1715–1725). https://aclanthology.org/P16-1162. doi:10.18653/v1/P16-1162.
111 103 102. Blanchette, M., Kent, W. J., Riemer, C., Elnitski, L., Smit, A. F., Roskin, K. M., Baertsch, R., Rosenbloom, K., Clawson, H., Green, E. D. et al. (2004). Aligning multiple genomic sequences with the threaded blockset aligner. Genome Research 14, 708–715.
112 104 103. Armstrong, J., Hickey, G., Diekhans, M., Fiddes, I. T., Novak, A. M., Deran, A., Fang, Q., Xie, D., Feng, S., Stiller, J. et al. (2020). Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature 587, 246–251.
113 105 104. Song, B., Buckler, E. S., and Stitzer, M. C. (2024). New whole-genome alignment tools are needed for tapping into plant diversity. Trends in Plant Science 29, 355–369.
114 106 105. Phan, M. H., Zehnder, T. M., Puntieri, F., Lo, B.-W., Lenhard, B., Mueller, F., Vingron, M., and Ibrahim, D. M. (2024). Conservation of regulatory elements with highly diverged sequences across large evolutionary distances. bioRxiv preprint ( 2024–05). https://www.biorxiv.org/content/10.1101/2024.05.13.590087v1.
115 107 106. Linardatos, P., Papastefanopoulos, V., and Kotsiantis, S. (2020). Explainable AI: A review of machine learning interpretability methods. Entropy 23, 18.
116 108 107. Zhang, Y., Tiňo, P., Leonardis, A., and Tang, K. (2021). A survey on neural network interpretability. IEEE Transactions on Emerging Topics in Computational Intelligence 5, 726–742.
117 109 108. Talukder, A., Barham, C., Li, X., and Hu, H. (2021). Interpretation of deep learning in genomics and epigenomics. Briefings in Bioinformatics 22, bbaa177.
118 110 109. Shrikumar, A., Tian, K., Avsec, Z., Shcherbina, A., Banerjee, A., Sharmin, M., Nair, S., and Kundaje, A. (2018). Technical note on transcription factor motif discovery from importance scores (TF-MoDISco) version 0.5.6.5. arXiv preprint arXiv:1811.00416. https://arxiv.org/abs/1811.00416.
119 111 110. Fowler, D. M., Adams, D. J., Gloyn, A. L., Hahn, W. C., Marks, D. S., Muffley, L. A., Neal, J. T., Roth, F. P., Rubin, A. F., Starita, L. M., and Hurles, M. E. (2023). An Atlas of Variant Effects to understand the genome at nucleotide resolution. Genome Biology 24, 147.
120 112 111. Kircher, M., Xiong, C., Martin, B., Schubach, M., Inoue, F., Bell, R. J. A., Costello, J. F., Shendure, J., and Ahituv, N. (2019). Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution. Nature Communications 10. doi:10.1038/s41467-019-11526-w.
121 113 112. Findlay, G. M., Daza, R. M., Martin, B., Zhang, M. D., Leith, A. P., Gasperini, M., Janizek, J. D., Huang, X., Starita, L. M., and Shendure, J. (2018). Accurate classification of BRCA1 variants with saturation genome editing. Nature 562, 217–222.
122 114 113. Notin, P., Kollasch, A. W., Ritter, D., Niekerk, L. V., Paul, S., Spinner, H., Rollins, N. J., Shaw, A., Orenbuch, R., Weitzman, R., Frazer, J., Dias, M., Franceschi, D., Gal, Y., and Marks, D. S. ProteinGym: Large-Scale Benchmarks for Protein Fitness Prediction and Design. In: Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2023): https://openreview.net/forum?id=URoZHqAohf.
123 115 114. Landrum, M. J., Lee, J. M., Benson, M., Brown, G. R., Chao, C., Chitipiralla, S., Gu, B., Hart, J., Hoffman, D., Jang, W. et al. (2016). ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Research 44, D862–D868.
124 116 115. Stenson, P. D., Mort, M., Ball, E. V., Evans, K., Hayden, M., Heywood, S., Hussain, M., Phillips, A. D., and Cooper, D. N. (2017). The Human Gene Mutation Database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studies. Human Genetics 136, 665–677. doi:10.1007/s00439-017-1779-6.
125 117 116. Amberger, J. S., Bocchini, C. A., Schiettecatte, F., Scott, A. F., and Hamosh, A. (2015). OMIM.org: Online Mendelian Inheritance in Man (OMIM®), an online catalog of human genes and genetic disorders. Nucleic Acids Research 43, D789–D798.
126 118 117. Pritchard, J. K., and Cox, N. (2002). The allelic architecture of human disease genes: common disease–common variant...or not? Human Molecular Genetics 11, 2417–2423. doi:10.1093/hmg/11.20.2417.
127 119 118. Karczewski, K. J., Francioli, L. C., Tiao, G., Cummings, B. B., Alföldi, J., Wang, Q., Collins, R. L., Laricchia, K. M., Ganna, A., Birnbaum, D. P., Gauthier, L. D., Brand, H., Solomonson, M., Watts, N. A., Rhodes, D., Singer-Berk, M., England, E. M., Seaby, E. G., Kosmicki, J. A., Walters, R. K., Tashman, K., Farjoun, Y., Banks, E., Poterba, T., Genome Aggregation Database Consortium, and MacArthur, D. G. (2020). The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443. doi:10.1038/s41586-020-2308-7.
128 120 119. Vapnik, V. N. The Nature of Statistical Learning Theory. New York: Springer (1999).
129 121 120. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., and Fei-Fei, L. (2015). ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV) 115, 211–252. doi:10.1007/s11263-015-0816-y.
130 122 121. Moult, J., Fidelis, K., Kryshtafovych, A., Schwede, T., and Tramontano, A. (2018). Critical assessment of methods of protein structure prediction (CASP)—round XII. Proteins: Structure, Function, and Bioinformatics 86, 7–15.
131 123 122. Johnson, A. D., Handsaker, R. E., Pulit, S. L., Nizzari, M., O’Donnell, C. J., and de Bakker, P. I. (2017). CAGI: The Critical Assessment of Genome Interpretation. Genome Biology 18, 1–5.
132 124 123. Grimm, D. G., Azencott, C.-A., Aicheler, F., Gieraths, U., MacArthur, D. G., Samocha, K. E., Cooper, D. N., Stenson, P. D., Daly, M. J., Smoller, J. W. et al. (2015). The evaluation of tools used to predict the impact of missense variants is hindered by two types of circularity. Human Mutation 36, 513–523.
133 125 124. Hartl, D. L., and Clark, A. G. Principles of Population Genetics vol. 116. Sinauer Associates, Sunderland, MA (1997).
134 126 125. Livesey, B. J., Badonyi, M., Dias, M., Frazer, J., Kumar, S., Lindorff-Larsen, K., McCandlish, D. M., Orenbuch, R., Shearer, C. A., Muffley, L. et al. (2024). Guidelines for releasing a variant effect predictor. arXiv preprint. https://arxiv.org/abs/2404.10807.
135 127 126. Gupta, A., Lal, A., Gunsalus, L. M., Biancalani, T., and Eraslan, G. (2023). Polygraph: A software framework for the systematic assessment of synthetic regulatory DNA elements. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/2023.11.27.568764v2.
136 128 127. ENCODE Project Consortium et al. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74.
137 129 128. Kundaje, A., Meuleman, W., Ernst, J. et al. (2015). Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330.
138 130 129. Grešová, K., Martinek, V., Cechák, D., Simeček, P., and Alexiou, P. (2023). Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genomic Data 24, Article number: 25.
139 131 130. Helfrich, G. (2024). The harms of terminology: why we should reject so-called “frontier AI”. AI and Ethics, 1–7.
140 4 0 [4] G. Benegas, C. Albors, A. J. Aw, C. Ye, and Y. S. Song. A DNA language model based on multispecies alignment predicts the effects of genome-wide variants. Nature Biotechnology, pages 1–6, 1 2025. ISSN 1546-1696. doi: 10.1038/s41587-024-02511-w. URL https://www.nature.com/articles/s41587-024-02511-w. https://www.researchgate.net/publication/387673319_A_DNA_language_model_based_on_multispecies_alignment_predicts_the_effects_of_genome-wide_variants https://qiita.com/kaizen_nagoya/items/6e8858c2395dcc98804a
141 1 1. Goldfeder, R. L., Wall, D. P., Khoury, M. J., Ioannidis, J. P. & Ashley, E. A. Human genome sequencing at the population scale: a primer on high-throughput DNA sequencing and analysis. Am. J. Epidemiol. 186, 1000–1009 (2017).
142 2 2. Marwaha, S., Knowles, J. W. & Ashley, E. A. A guide for the diagnosis of rare and undiagnosed disease: beyond the exome. Genome Med. 14, 23 (2022).
143 3 3. Lee, S., Abecasis, G. R., Boehnke, M. & Lin, X. Rare-variant association analysis: study designs and statistical tests. Am. J. Hum. Genet. 95, 5–23 (2014).
144 4 4. Trajanoska, K. et al. From target discovery to clinical drug development with human genetics. Nature 620, 737–745 (2023).
145 5 5. Meier, J. et al. Language models enable zero-shot prediction of the effects of mutations on protein function. In Proc. Advances in Neural Information Processing Systems 34 (eds Ranzato, M. et al.) 29287–29303 (Curran Associates, Inc., 2021).
146 6 6. Brandes, N., Goldman, G., Wang, C. H., Ye, C. J. & Ntranos, V. Genome-wide prediction of disease variant effects with a deep protein language model. Nat. Genet. 55, 1512–1522 (2023).
147 7 7. Jagota, M. et al. Cross-protein transfer learning substantially improves disease variant prediction. Genome Biol. 24, 182 (2023).
148 8 8. Benegas, G., Batra, S. S. & Song, Y. S. DNA language models are powerful predictors of genome-wide variant effects. Proc. Natl Acad. Sci. USA 120, e2311219120 (2023).
149 9 9. Dalla-Torre, H. et al. Nucleotide Transformer: building and evaluating robust foundation models for human genomics. Nat. Methods https://doi.org/10.1038/s41592-024-02523-z (2024).
150 10 10. Vaswani, A. et al. Attention is all you need. In Proc. Advances in Neural Information Processing Systems 30 (eds Guyon, S. et al.) 6000–6010 (Curran Associates, Inc., 2017).
151 11 11. Armstrong, J., Fiddes, I. T., Diekhans, M. & Paten, B. Whole-genome alignment and comparative annotation. Annu. Rev. Anim. Biosci. 7, 41–64 (2019).
152 12 12. Nguyen, E. et al. HyenaDNA: long-range genomic sequence modeling at single nucleotide resolution. In Proc. 37th International Conference on Neural Information Processing Systems (eds Oh, A. et al.) 43177–43201 (Curran Associates, Inc., 2023).
153 13 13. Rentzsch, P., Schubach, M., Shendure, J. & Kircher, M. CADD-Splice-improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med. 13, 31 (2021).
154 14 14. Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R. & Siepel, A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 20, 110–121 (2010).
155 15 15. Sullivan, P. F. et al. Leveraging base-pair mammalian constraint to understand genetic variation and human disease. Science 380, eabn2937 (2023).
156 16 16. Rives, A. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl Acad. Sci. USA 118, e2016239118 (2021).
157 17 17. Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).
158 18 18. Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548.e24 (2019).
159 19 19. Rao, R. M. et al. MSA Transformer. In Proceedings of the 38th International Conference on Machine Learning (eds Meila, M. & Zhang, T.) (PMLR, 2021).
160 20 20. Paten, B. et al. Genome-wide nucleotide-level mammalian ancestor reconstruction. Genome Res. 18, 1829–1843 (2008).
161 21 21. Siepel, A. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005).
162 22 22. Landrum, M. J. et al. ClinVar: improvements to accessing data. Nucleic Acids Res. 48, D835–D844 (2020).
163 23 23. Chen, S. et al. A genomic mutational constraint map using variation in 76,156 human genomes. Nature 625, 92–100 (2024).
164 24 24. Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886–D894 (2019).
165 25 25. Tate, J. G. et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 47, D941–D947 (2019).
166 26 26. Notin, P. et al. ProteinGym: large-scale benchmarks for protein fitness prediction and design. In Proceedings of the Advances in Neural Information Processing Systems 37 (eds Oh, A. et al.) (NeurIPS, 2023).
167 27 27. Smedley, D. et al. A whole-genome analysis framework for effective identification of pathogenic regulatory variants in mendelian disease. Am. J. Hum. Genet. 99, 595–606 (2016).
168 28 28. Albuisson, J. et al. Identification of two novel mutations in Shh long-range regulator associated with familial pre-axial polydactyly. Clin. Genet. 79, 371–377 (2011).
169 29 29. Kvon, E. Z. et al. Comprehensive in vivo interrogation reveals phenotypic impact of human enhancer variants. Cell 180, 1262–1271.e15 (2020).
170 30 30. Arbini, A. A., Pollak, E. S., Bayleran, J. K., High, K. A. & Bauer, K. A. Severe factor VII deficiency due to a mutation disrupting a hepatocyte nuclear factor 4 binding site in the factor VII promoter. Blood 89, 176–182 (1997).
171 31 31. Gao, H. et al. The landscape of tolerated genetic variation in humans and primates. Science 380, eabn8153 (2023).
172 32 32. The Dependency Map Consortium. DepMap 23Q4 public. figshare https://doi.org/10.25452/figshare.plus.24667905.v2 (2023).
173 33 33. Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
174 34 34. Agarwal, I., Fuller, Z. L., Myers, S. R. & Przeworski, M. Relating pathogenic loss-of-function mutations in humans to their evolutionary fitness costs. eLife 12, e83172 (2023).
175 35 35. Zeng, T., Spence, J. P., Mostafavi, H. & Pritchard, J. K. Bayesian estimation of gene constraint from an evolutionary model with gene features. Nat. Genet. 56, 1632–1643 (2024).
176 36 36. Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–424 (2015).
177 37 37. Schneider, T. D. & Stephens, R. M. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 18, 6097–6100 (1990).
178 38 38. Kent, W. J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002).
179 39 39. Nair, S. et al. The dynseq browser track shows context-specific features at nucleotide resolution. Nat. Genet. 54, 1581–1583 (2022).
180 40 40. Fishman, V. et al. GENA-LM: a family of open-source foundational models for long DNA sequences. Preprint at bioRxiv https://doi.org/10.1101/2023.06.12.544594 (2023).
181 41 41. Borgeaud, S. et al. Improving language models by retrieving from trillions of tokens. In Proc. 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) 2206–2240 (PMLR, 2022).
182 42 42. Weiner, D. J. et al. Polygenic architecture of rare coding variation across 394,783 exomes. Nature 614, 492–499 (2023).
183 43 43. Weissbrod, O. et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat. Genet. 52, 1355–1363 (2020).
184 44 44. Márquez-Luna, C. et al. Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets. Nat. Commun. 12, 6052 (2021).
185 45 45. Aw, A. J., McRae, J., Rahmani, E. & Song, Y. S. Highly parameterized polygenic scores tend to overfit to population stratification via random effects. Preprint at bioRxiv https://doi.org/10.1101/2024.01.27.577589 (2024).
186 5 0 [5] G. Brixi, M. G. Durrant, J. Ku, M. Poli, G. Brockman, D. Chang, G. A. Gonzalez, S. H. King, D. B. Li, A. T. Merchant, M. Naghipourfar, E. Nguyen, C. Ricci-Tam, D. W. Romero, G. Sun, A. Taghibakshi, A. Vorontsov, B. Yang, M. Deng, L. Gorton, N. Nguyen, N. K. Wang, E. Adams, S. A. Baccus, S. Dillmann, S. Ermon, D. Guo, R. Ilango, K. Janik, A. X. Lu, R. Mehta, M. R.Mofrad, M. Y. Ng, J. Pannu, C. Ré, J. C. Schmok, J. S. John, J. Sullivan, K. Zhu, G. Zynda, D. Balsam, P. Collison, A. B. Costa, T. Hernandez-Boussard, E. Ho, M.-Y. Liu, T. McGrath, K. Powell, D. P. Burke, H. Goodarzi, P. D. Hsu, and B. L. Hie. Genome modeling and design across all domains of life with evo 2. bioRxiv, 2025. doi: 10.1101/2025.02.18.638918. URL https://www.biorxiv.org/content/early/2025/02/21/2025.02.18.638918. https://qiita.com/kaizen_nagoya/items/eecda74f758008633ee2
187 1 J. Abramson, J. Adler, J. Dunger, R. Evans, T. Green, A. Pritzel, O. Ronneberger, L. Willmore, A. J. Ballard, J. Bambrick, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 2024.
188 2 C. D. Allis and T. Jenuwein. The molecular hallmarks of epigenetic control. Nature Reviews Genetics, 17(8):487–500, 2016.
189 3 E. Almazrouei, H. Alobeidli, A. Alshamsi, A. Cappelli, R. Cojocaru, M. Debbah, É. Goffinet, D. Hesslow, J. Launay, Q. Malartic, et al. The falcon series of open language models. arXiv preprint arXiv:2311.16867, 2023.
190 4 A. Andonian, Q. Anthony, S. Biderman, S. Black, P. Gali, L. Gao, E. Hallahan, J. Levy-Kramer, C. Leahy, L. Nestler, K. Parker, M. Pieler, J. Phang, S. Purohit, H. Schoelkopf, D. Stander, T. Songz, C. Tigges, B. Thérien, P. Wang, and S. Weinbach. GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch, 9 2023. URL https://www.github.com/eleutherai/gpt-neox.
191 5 Ž. Avsec, V. Agarwal, D. Visentin, J. R. Ledsam, A. Grabska-Barwinska, K. R. Taylor, Y. Assael, J. Jumper, P. Kohli, and D. R. Kelley. Effective gene expression prediction from sequence by integrating long-range interactions. Nature Methods, 18(10):1196–1203, 2021.
192 6 G. Benegas, S. S. Batra, and Y. S. Song. DNA language models are powerful predictors of genome-wide variant effects. Proceedings of the National Academy of Sciences, 120(44):e2311219120, 2023. doi: 10.1073/pnas.2311219120. URL https://www.pnas.org/doi/abs/10.1073/pnas.2311219120.
193 7 G. Benegas, C. Albors, A. J. Aw, C. Ye, and Y. S. Song. A DNA language model based on multispecies alignment predicts the effects of genome-wide variants. Nature Biotechnology, pages 1–6, 2025.
194 8 S. Biderman, H. Schoelkopf, Q. G. Anthony, H. Bradley, K. O’Brien, E. Hallahan, M. A. Khan, S. Purohit, U. S. Prashanth, E. Raff, et al. Pythia: A suite for analyzing large language models across training and scaling. In International Conference on Machine Learning, pages 2397–2430. PMLR, 2023.
195 9 D. Bloomfield, J. Pannu, A. W. Zhu, M. Y. Ng, A. Lewis, E. Bendavid, S. M. Asch, T. Hernandez-Boussard, A. Cicero, and T. Inglesby. AI and biosecurity: The need for governance. Science, 385(6711):831–833, 2024. doi: 10.1126/science.adq1977. URL https://www.science.org/doi/abs/10.1126/science.adq1977.
196 10 N. Brandes, G. Goldman, C. H. Wang, C. J. Ye, and V. Ntranos. Genome-wide prediction of disease variant effects with a deep protein language model. Nature Genetics, 55(9):1512–1522, Sept. 2023. ISSN 1546-1718. doi: 10.1038/s41588-023-01465-0. URL https://doi.org/10.1038/s41588-023-01465-0.
197 11 S. Brenner, A. Stretton, and S. Kaplan. Genetic code: the ‘nonsense’ triplets for chain termination and their suppression. Nature, 206(4988):994–998, 1965.
198 12 T. Bricken, A. Templeton, J. Batson, B. Chen, A. Jermyn, T. Conerly, N. Turner, C. Anil, C. Denison, A. Askell, R. Lasenby, Y. Wu, S. Kravec, N. Schiefer, T. Maxwell, N. Joseph, Z. Hatfield-Dodds, A. Tamkin, K. Nguyen, B. McLean, J. E. Burke, T. Hume, S. Carter, T. Henighan, and C. Olah. Towards monosemanticity: Decomposing language models with dictionary learning. Transformer Circuits Thread, 2023. https://transformer-circuits.pub/2023/monosemantic-features/index.html.
199 13 J. E. Brownell, J. Zhou, T. Ranalli, R. Kobayashi, D. G. Edmondson, S. Y. Roth, and C. D. Allis. Tetrahymena histone acetyltransferase A: a homolog to yeast Gcn5p linking histone acetylation to gene activation. Cell, 84(6):843–851, 1996.
200 14 B. Bussmann, P. Leask, and N. Nanda. BatchTopK sparse autoencoders. arXiv, 2412.06410, 2024.
201 A. P. Camargo, S. Roux, F. Schulz, M. Babinski, Y. Xu, B. Hu, P. S. G. Chain, S. Nayfach, and N. C. Kyrpides. Identification of mobile genetic elements with geNomad. Nat. Biotechnol., 42(8):1303–1312, Aug. 2024.
202 15 B. Chen, X. Cheng, P. Li, Y.-a. Geng, J. Gong, S. Li, Z. Bei, X. Tan, B. Wang, X. Zeng, et al. xTrimoPGLM: unified 100B-scale pre-trained transformer for deciphering the language of protein. arXiv preprint arXiv:2401.06199, 2024.
203 16 I.-M. A. Chen, K. Chu, K. Palaniappan, A. Ratner, J. Huang, M. Huntemann, P. Hajek, S. Ritter, N. Varghese, R. Seshadri, S. Roux, T. Woyke, E. A. Eloe-Fadrosh, N. N. Ivanova, and N. C. Kyrpides. The IMG/M data management and analysis system v.6.0: new tools and advanced capabilities. Nucleic Acids Res., 49(D1):D751–D763, Jan. 2021.
204 17 J. Chen, Z. Hu, S. Sun, Q. Tan, Y. Wang, Q. Yu, L. Zong, L. Hong, J. Xiao, T. Shen, I. King, and Y. Li. Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions, 2022.
205 18 S. Chen, S. Wong, L. Chen, and Y. Tian. Extending context window of large language models via positional interpolation. URL https://arxiv.org/abs/2306.15595, 2023.
206 19 J. Cheng, G. Novati, J. Pan, C. Bycroft, A. Žemgulytė, T. Applebaum, A. Pritzel, L. H. Wong, M. Zielinski, T. Sargeant, et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science, 381(6664):eadg7492, 2023.
207 20 L. T. Chow, R. E. Gelinas, T. R. Broker, and R. J. Roberts. An amazing sequence arrangement at the 5′ ends of adenovirus 2 messenger RNA. Cell, 12(1):1–8, 1977.
208 21 G. Csárdi, T. Nepusz, K. Müller, S. Horvát, V. Traag, F. Zanini, and D. Noom. igraph for R: R interface of the igraph library for graph theory and network analysis, Dec. 2024.
209 22 H. Cunningham, A. Ewart, L. Riggs, R. Huben, and L. Sharkey. Sparse autoencoders find highly interpretable features in language models. arXiv, 2309.08600, 2023.
210 23 H. Dalla-Torre, L. Gonzalez, J. Mendoza-Revilla, N. Lopez Carranza, A. H. Grzywaczewski, F. Oteri, C. Dallago, E. Trop, B. P. de Almeida, H. Sirelkhatim, et al. Nucleotide Transformer: Building and evaluating robust foundation models for human genomics. Nature Methods, pages 1–11, 2024.
211 24 C. Darwin. On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life. John Murray, London, 1859.
212 25 T. Dobzhansky. Genetics and the Origin of Species. Columbia University Press, 1951.
213 26 A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, A. Mathur, A. Schelten, A. Yang, A. Fan, et al. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024.
214 27 M. G. Durrant, N. T. Perry, J. J. Pai, A. R. Jangid, J. S. Athukoralage, M. Hiraizumi, J. P. McSpedon, A. Pawluk, H. Nishimasu, S. Konermann, and P. D. Hsu. Bridge RNAs direct programmable recombination of target and donor DNA. Nature, 630(8018):984–993, June 2024.
215 28 A. A. Egorov and G. C. Atkinson. Lovis4u: Locus visualisation tool for comparative genomics. bioRxiv, 2024. doi: 10.1101/2024.09.11.612399. URL https://www.biorxiv.org/content/early/2024/09/14/2024.09.11.612399.
216 29 G. M. Findlay, R. M. Daza, B. Martin, M. D. Zhang, A. P. Leith, M. Gasperini, J. D. Janizek, X. Huang, L. M. Starita, and J. Shendure. Accurate classification of BRCA1 variants with saturation genome editing. Nature, 562:217–222, 10 2018. ISSN 0028-0836. doi: 10.1038/s41586-018-0461-z.
217 30 L. Gao, T. D. la Tour, H. Tillman, G. Goh, R. Troll, A. Radford, I. Sutskever, J. Leike, and J. Wu. Scaling and evaluating sparse autoencoders. arXiv, 2024a. URL https://arxiv.org/abs/2406.04093.
218 31 T. Gao, A. Wettig, H. Yen, and D. Chen. How to train long-context language models (effectively). arXiv preprint arXiv:2410.02660, 2024b.
219 32 D. G. Gibson, G. A. Benders, C. Andrews-Pfannkoch, E. A. Denisova, H. Baden-Tillson, J. Zaveri, T. B. Stockwell, A. Brownley, D. W. Thomas, M. A. Algire, C. Merryman, L. Young, V. N. Noskov, J. I. Glass, J. C. Venter, C. A. Hutchison, and H. O. Smith. Complete chemical synthesis, assembly, and cloning of a Mycoplasma genitalium genome. Science, 319(5867):1215–1220, 2008.
220 33 S. Gupta, J. A. Stamatoyannopoulos, T. L. Bailey, and W. S. Noble. Quantifying similarity between motifs. Genome Biology, 8(2):R24, Feb. 2007. ISSN 1474-760X. doi: 10.1186/gb-2007-8-2-r24. URL https://doi.org/10.1186/gb-2007-8-2-r24.
221 34 P. W. Harrison, M. R. Amode, O. Austine-Orimoloye, A. G. Azov, M. Barba, I. Barnes, A. Becker, R. Bennett, A. Berry, J. Bhai, S. K. Bhurji, S. Boddu, P. R. Branco Lins, L. Brooks, S. B. Ramaraju, L. I. Campbell, M. C. Martinez, M. Charkhchi, K. Chougule, A. Cockburn, C. Davidson, N. H. De Silva, K. Dodiya, S. Donaldson, B. El Houdaigui, T. E. Naboulsi, R. Fatima, C. G. Giron, T. Genez, D. Grigoriadis, G. S. Ghattaoraya, J. G. Martinez, T. A. Gurbich, M. Hardy, Z. Hollis, T. Hourlier, T. Hunt, M. Kay, V. Kaykala, T. Le, D. Lemos, D. Lodha, D. Marques-Coelho, G. Maslen, G. A. Merino, L. P. Mirabueno, A. Mushtaq, S. N. Hossain, D. N. Ogeh, M. P. Sakthivel, A. Parker, M. Perry, I. Piližota, D. Poppleton, I. Prosovetskaia, S. Raj, J. G. Pérez-Silva, A. I. A. Salam, S. Saraf, N. Saraiva-Agostinho, D. Sheppard, S. Sinha, B. Sipos, V. Sitnik, W. Stark, E. Steed, M.-M. Suner, L. Surapaneni, K. Sutinen, F. F. Tricomi, D. Urbina-Gómez, A. Veidenberg, T. A. Walsh, D. Ware, E. Wass, N. L. Willhoft, J. Allen, J. Alvarez-Jarreta, M. Chakiachvili, B. Flint, S. Giorgetti, L. Haggerty, G. R. Ilsley, J. Keatley, J. E. Loveland, B. Moore, J. M. Mudge, G. Naamati, J. Tate, S. J. Trevanion, A. Winterbottom, A. Frankish, S. E. Hunt, F. Cunningham, S. Dyer, R. D. Finn, F. J. Martin, and A. D. Yates. Ensembl 2024. Nucleic Acids Res., 52(D1):D891–D899, Jan. 2024.
222 35 S. Hartmann, D. Lu, J. Phillips, and T. J. Vision. Phytome: a platform for plant comparative genomics. Nucleic Acids Res., 34(Database issue):D724–30, Jan. 2006.
223 36 T. Hayes, R. Rao, H. Akin, N. J. Sofroniew, D. Oktay, Z. Lin, R. Verkuil, V. Q. Tran, J. Deaton, M. Wiggert, et al. Simulating 500 million years of evolution with a language model. Science, eads0018, 2025.
224 37 S. B. Hedges. The origin and evolution of model organisms. Nat. Rev. Genet., 3(11):838–849, Nov. 2002.
225 38 J. C. Hingerl, A. Karollus, and J. Gagneur. Flashzoi: An enhanced Borzoi model for accelerated genomic analysis. bioRxiv, pages 2024–12, 2024.
226 39 H. Huang, C. Hu, J. Na, S. N. Hart, R. D. Gnanaolivu, M. Abozaid, T. Rao, Y. A. Tecleab, T. Pesaran, et al. Functional evaluation and clinical classification of BRCA2 variants. Nature, pages 1–10, 2025.
227 40 L. A. Hug, B. J. Baker, K. Anantharaman, C. T. Brown, A. J. Probst, C. J. Castelle, C. N. Butterfield, A. W. Hernsdorf, Y. Amano, K. Ise, Y. Suzuki, N. Dudek, D. A. Relman, K. M. Finstad, R. Amundson, B. C. Thomas, and J. F. Banfield. A new view of the tree of life. Nat. Microbiol., 1(5):16048, Apr. 2016.
228 41 D. Hyatt, G.-L. Chen, P. F. Locascio, M. L. Land, F. W. Larimer, and L. J. Hauser. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics, 11:119, Mar. 2010.
229 42 F. Jacob and J. Monod. Genetic regulatory mechanisms in the synthesis of proteins. Journal of Molecular Biology, 3(3):318–356, 1961.
230 43 Y. Ji, Z. Zhou, H. Liu, and R. V. Davuluri. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics, 37(15):2112–2120, 02 2021. ISSN 1367-4803. doi: 10.1093/bioinformatics/btab083. URL https://doi.org/10.1093/bioinformatics/btab083.
231 44 W. Kabsch and C. Sander. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers: Original Research on Biomolecules, 22(12):2577–2637, 1983.
232 45 J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess, R. Child, S. Gray, A. Radford, J. Wu, and D. Amodei. Scaling laws for neural language models, 2020. URL https://arxiv.org/abs/2001.08361.
233 46 J. R. Karr, J. C. Sanghvi, D. N. Macklin, M. V. Gutschow, J. M. Jacobs, B. Bolival, N. Assad-Garcia, J. I. Glass, and M. W. Covert. A whole-cell computational model predicts phenotype from genotype. Cell, 150(2):389–401, 2012.
234 47 S. Knudsen. Promoter2.0: for the recognition of PolII promoter sequences. Bioinformatics (Oxford, England), 15(5):356–361, 1999.
235 48 M. Kozak. The scanning model for translation: an update. The Journal of cell biology, 108(2):229–241, 1989.
236 49 J. Ku, E. Nguyen, D. Romero, G. Brixi, B. Yang, A. Vorontsov, A. Taghibakhshi, A. Lu, D. Burke, G. Brockman, S. Massaroli, C. Re, P. Hsu, B. Hie, S. Ermon, and M. Poli. Systems and algorithms for convolutional multi-hybrid language models at scale. 2025.
237 50 P. Kunzmann, T. D. Müller, M. Greil, J. H. Krumbach, J. M. Anter, D. Bauer, F. Islam, and K. Hamacher. Biotite: new tools for a versatile python bioinformatics library. BMC Bioinformatics, 24(1):236, June 2023.
238 51 M. J. Landrum, J. M. Lee, G. R. Riley, W. Jang, W. S. Rubinstein, D. M. Church, and D. R. Maglott. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Research, 42 (D1):D980–D985, 11 2013. ISSN 0305-1048. doi: 10.1093/nar/gkt1113. URL https://doi.org/10.1093/nar/gkt1113.
239 52 S. Li, S. Moayedpour, R. Li, M. Bailey, S. Riahi, M. Miladi, J. Miner, D. Zheng, J. Wang, A. Balsubramani, K. Tran, M. Zacharia, M. Wu, X. Gu, R. Clinton, C. Asquith, J. Skalesk, L. Boeglin, S. Chivukula, A. Dias, F. U. Montoya, V. Agarwal, Z. Bar-Joseph, and S. Jager. CodonBERT: Large language models for mRNA design and optimization. bioRxiv, 2023. doi: 10.1101/2023.09.09.556981. URL https://www.biorxiv.org/content/early/2023/09/12/2023.09.09.556981.
240 53 W.-W. Liang, S. Müller, S. K. Hart, H.-H. Wessels, A. Méndez-Mancilla, A. Sookdeo, O. Choi, C. M. Caragine, A. Corman, L. Lu, O. Kolumba, B. Williams, and N. E. Sanjana. Transcriptome-scale RNA-targeting CRISPR screens reveal essential lncRNAs in human cells. Cell, 187(26):7637–7654.e29, Dec. 2024.
241 54 Z. Lin, H. Akin, R. Rao, B. Hie, Z. Zhu, W. Lu, N. Smetanin, R. Verkuil, O. Kabeli, Y. Shmueli, A. dos Santos Costa, M. Fazel-Zarandi, T. Sercu, S. Candido, and A. Rives. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379:1123–1130, 3 2023. ISSN 0036-8075. doi:10.1126/science.ade2574.
242 55 J. Linder, D. Srivastava, H. Yuan, V. Agarwal, and D. R. Kelley. Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation. Nature Genetics, pages 1–13, 2025.
243 56 A. Liu, B. Feng, B. Xue, B. Wang, B. Wu, C. Lu, C. Zhao, C. Deng, C. Zhang, C. Ruan, et al. DeepSeek-v3 technical report. arXiv preprint arXiv:2412.19437, 2024.
244 57 B. J. Livesey and J. A. Marsh. Updated benchmarking of variant effect predictors using deep mutational scanning. Molecular Systems Biology, 19, 8 2023. ISSN 1744-4292. doi: 10.15252/msb.202211474.
245 58 A. Lozhkov, R. Li, L. B. Allal, F. Cassano, J. Lamy-Poirier, N. Tazi, A. Tang, D. Pykhtar, J. Liu, Y. Wei, et al. Starcoder 2 and the stack v2: The next generation. arXiv preprint arXiv:2402.19173, 2024.
246 59 A. Makhzani and B. Frey. k-sparse autoencoders. arXiv, 2014. URL https://arxiv.org/abs/1312.5663.
247 60 G. Marçais and C. Kingsford. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics, 27(6):764–770, 2011.
248 61 K. Marcker and F. Sanger. N-formyl-methionyl-s-RNA. Journal of Molecular Biology, 8(6):835–IN8, 1964. ISSN 0022-2836. doi: https://doi.org/10.1016/S0022-2836(64)80164-9. URL https://www.sciencedirect.com/science/article/pii/S0022283664801649.
249 62 L. McInnes, J. Healy, and J. Melville. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv, 1802.03426, 2018.
250 63 J. Meier, R. Rao, R. Verkuil, J. Liu, T. Sercu, and A. Rives. Language models enable zero-shot prediction of the effects of mutations on protein function. bioRxiv, 2021. doi: 10.1101/2021.07.09.450648. URL https://www.biorxiv.org/content/10.1101/2021.07.09.450648v1. (A minimal zero-shot scoring sketch follows this reference block.)
251 64 G. Mendel. Versuche über Pflanzen-Hybriden. Verhandlungen des naturforschenden Vereines in Brünn, 4:3–47, 1866.
252 65 E. C. Meng, T. D. Goddard, E. F. Pettersen, G. S. Couch, Z. J. Pearson, J. H. Morris, and T. E. Ferrin. UCSF ChimeraX: Tools for structure building and analysis. Protein Science, 32(11):e4792, 2023. doi: https://doi.org/10.1002/pro.4792. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/pro.4792.
253 66 G. Meng, Y. Li, C. Yang, and S. Liu. MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization. Nucleic Acids Research, 47(11):e63–e63, 2019.
254 67 A. T. Merchant, S. H. King, E. Nguyen, and B. L. Hie. Semantic mining of functional de novo genes from a genomic language model. bioRxiv, 2024. doi: 10.1101/2024.12.17.628962. URL https://www.biorxiv.org/content/early/2024/12/18/2024.12.17.628962.
255 68 F. Meyer, D. Paarmann, M. D’Souza, R. Olson, E. M. Glass, M. Kubal, T. Paczian, A. Rodriguez, R. Stevens, A. Wilke, J. Wilkening, and R. A. Edwards. The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics, 9(1):386, Sept. 2008.
256 69 J. Mistry, S. Chuguransky, L. Williams, M. Qureshi, G. A. Salazar, E. L. Sonnhammer, S. C. Tosatto, L. Paladin, S. Raj, L. J. Richardson, et al. Pfam: The protein families database in 2021. Nucleic Acids Research, 49(D1):D412–D419, 2021.
257 70 A. L. Mitchell, A. Almeida, M. Beracochea, M. Boland, J. Burgin, G. Cochrane, M. R. Crusoe, V. Kale, S. C. Potter, L. J. Richardson, E. Sakharova, M. Scheremetjew, A. Korobeynikov, A. Shlemov, O. Kunyavskaya, A. Lapidus, and R. D. Finn. MGnify: the microbiome analysis resource in 2020. Nucleic Acids Res., 48(D1):D570–D578, Jan. 2020.
258 71 M. Naghipourfar, S. Chen, M. Howard, C. Macdonald, A. Saberi, T. Hagen, M. Mofrad, W. Coyote-Maestas, and H. Goodarzi. A suite of foundation models captures the contextual interplay between codons. bioRxiv, Oct. 2024. doi: 10.1101/2024.10.10.617568. URL http://dx.doi.org/10.1101/2024.10.10.617568.
259 72 E. Nguyen, M. Poli, M. G. Durrant, B. Kang, D. Katrekar, D. B. Li, L. J. Bartie, A. W. Thomas, S. H. King, G. Brixi, J. Sullivan, M. Y. Ng, A. Lewis, A. Lou, S. Ermon, S. A. Baccus, T. Hernandez-Boussard, C. Ré, P. D. Hsu, and B. L. Hie. Sequence modeling and design from molecular to genome scale with Evo. Science, 386(6723):eado9336, Nov. 2024a.
260 73 E. Nguyen, M. Poli, M. Faizi, A. Thomas, M. Wornow, C. Birch-Sykes, S. Massaroli, A. Patel, C. Rabideau, Y. Bengio, et al. HyenaDNA: Long-range genomic sequence modeling at single nucleotide resolution. Advances in Neural Information Processing Systems, 36, 2024b.
261 74 E. Nijkamp, J. A. Ruffolo, E. N. Weinstein, N. Naik, and A. Madani. ProGen2: exploring the boundaries of protein language models. Cell Systems, 14(11):968–978, 2023.
262 75 M. W. Nirenberg and J. H. Matthaei. The dependence of cell-free protein synthesis in E. coli upon naturally occurring or synthetic polyribonucleotides. Proceedings of the National Academy of Sciences, 47(10):1588–1602, 1961.
263 76 P. Notin, A. W. Kollasch, D. Ritter, L. van Niekerk, S. Paul, H. Spinner, N. Rollins, A. Shaw, R. Weitzman, J. Frazer, M. Dias, D. Franceschi, R. Orenbuch, Y. Gal, and D. S. Marks. ProteinGym: Large-scale benchmarks for protein design and fitness prediction. bioRxiv, page 2023.12.07.570727, 1 2023. doi:10.1101/2023.12.07.570727. URL http://biorxiv.org/content/early/2023/12/08/2023.12.07.570727.abstract.
264 77 N. A. O’Leary, M. W. Wright, J. R. Brister, S. Ciufo, D. Haddad, R. McVeigh, B. Rajput, B. Robbertse, B. Smith-White, D. Ako-Adjei, et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Research, 44(D1):D733–D745, 2016.
265 78 B. D. Ondov, T. J. Treangen, P. Melsted, A. B. Mallonee, N. H. Bergman, S. Koren, and A. M. Phillippy. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol., 17(1):132, June 2016.
266 81 R. Overbeek, M. Fonstein, M. D’souza, G. D. Pusch, and N. Maltsev. The use of gene clusters to infer functional coupling. Proceedings of the National Academy of Sciences, 96(6):2896–2901, 1999.
267 82 D. H. Parks, M. Chuvochina, C. Rinke, A. J. Mussig, P.-A. Chaumeil, and P. Hugenholtz. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res., 50(D1):D785–D794, Jan. 2022.
268 83 A. Patel, A. Singhal, A. Wang, A. Pampari, M. Kasowski, and A. Kundaje. DART-Eval: A comprehensive DNA language model evaluation benchmark on regulatory DNA. Advances in Neural Information Processing Systems, 37:62024–62061, 2024.
269 84 A. K. Pathak, N. Bora, M. Badonyi, B. J. Livesey, S. Consortium, J. Ngeow, and J. A. Marsh. Pervasive ancestry bias in variant effect predictors. bioRxiv, 2024.
270 85 F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, et al. Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research, 12:2825–2830, 2011.
271 86 R. J. Penić, T. Vlašić, R. G. Huber, Y. Wan, and M. Šikić. RiNALMo: General-purpose RNA language models can generalize well on structure prediction tasks. arXiv, 2403.00043, 2024.
272 87 D. Piya, N. Nolan, M. L. Moore, L. A. R. Hernandez, B. F. Cress, R. Young, A. P. Arkin, and V. K. Mutalik. Systematic and scalable genome-wide essentiality mapping to identify nonessential genes in phages. PLOS Biology, 21:e3002416, 12 2023. ISSN 1545-7885. doi: 10.1371/journal.pbio.3002416.
273 88 M. Poli, S. Massaroli, E. Nguyen, D. Y. Fu, T. Dao, S. Baccus, Y. Bengio, S. Ermon, and C. Ré. Hyena hierarchy: Towards larger convolutional language models. In International Conference on Machine Learning, pages 28043–28078. PMLR, 2023.
274 89 M. Poli, A. W. Thomas, E. Nguyen, P. Ponnusamy, B. Deiseroth, K. Kersting, T. Suzuki, B. Hie, S. Ermon, C. Ré, et al. Mechanistic design and scaling of hybrid architectures. arXiv preprint arXiv:2403.17844, 2024.
275 90 K. S. Pollard, M. J. Hubisz, K. R. Rosenbloom, and A. Siepel. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Research, 20(1):110–21, 2010. doi: 10.1101/gr.097857.109.
276 91 E. Proux-Wéra, D. Armisén, K. P. Byrne, and K. H. Wolfe. A pipeline for automated annotation of yeast genome sequences by a conserved-synteny approach. BMC Bioinformatics, 13:1–12, 2012.
277 92 A. R. Quinlan and I. M. Hall. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26(6):841–842, Mar. 2010.
278 93 A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, et al. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.
279 94 S. Rajbhandari, J. Rasley, O. Ruwase, and Y. He. ZeRO: Memory optimizations toward training trillion parameter models. In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1–16. IEEE, 2020.
280 95 Responsible AI x Biodesign. Community values, guiding principles, and commitments for the responsible development of AI for protein design, March 8 2024. URL https://responsiblebiodesign.ai/#values-and-principles.
281 96 A. Rives, J. Meier, T. Sercu, S. Goyal, Z. Lin, J. Liu, D. Guo, M. Ott, C. L. Zitnick, J. Ma, and R. Fergus. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proceedings of the National Academy of Sciences, 118:e2016239118, 2021. ISSN 0027-8424. doi:10.1073/pnas.2016239118.
282 97 J. T. Robinson, H. Thorvaldsdottir, D. Turner, and J. P. Mesirov. igv.js: an embeddable JavaScript implementation of the Integrative Genomics Viewer (IGV). Bioinformatics, 39(1):btac830, 12 2022. ISSN 1367-4811. doi: 10.1093/bioinformatics/btac830. URL https://doi.org/10.1093/bioinformatics/btac830.
283 98 M. Sandoval-Velasco, O. Dudchenko, J. A. Rodríguez, C. P. Estrada, M. Dehasque, C. Fontsere, S. S. Mak, R. Khan, V. G. Contessoto, A. B. O. Junior, et al. Three-dimensional genome architecture persists in a 52,000-year-old woolly mammoth skin sample. Cell, 187(14):3541–3562, 2024.
284 99 I. Sarropoulos, R. Marin, M. Cardoso-Moreira, and H. Kaessmann. Developmental dynamics of lncRNAs across mammalian organs and species. Nature, 571(7766):510–514, July 2019.
285 100 V. A. Schneider, T. Graves-Lindsay, K. Howe, N. Bouk, H.-C. Chen, P. A. Kitts, T. D. Murphy, K. D. Pruitt, F. Thibaud-Nissen, D. Albracht, R. S. Fulton, M. Kremitzki, V. Magrini, C. Markovic, S. McGrath, K. M. Steinberg, K. Auger, W. Chow, J. Collins, G. Harden, T. Hubbard, S. Pelan, J. T. Simpson, G. Threadgold, J. Torrance, J. M. Wood, L. Clarke, S. Koren, M. Boitano, P. Peluso, H. Li, C.-S. Chin, A. M. Phillippy, R. Durbin, R. K. Wilson, P. Flicek, E. E. Eichler, and D. M. Church. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Research, 27(5):849–864, May 2017. ISSN 1088-9051, 1549-5469. doi: 10.1101/gr.213611.116. URL https://genome.cshlp.org/content/27/5/849.
286 101 M. Schubach, T. Maass, L. Nazaretyan, S. Röner, and M. Kircher. CADD v1.7: using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic Acids Research, 52(D1):D1143–D1154, 01 2024. ISSN 0305-1048. doi: 10.1093/nar/gkad989. URL https://doi.org/10.1093/nar/gkad989.
287 102 O. Schwengers, L. Jelonek, M. A. Dieckmann, S. Beyvers, J. Blom, and A. Goesmann. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb. Genom., 7 (11), Nov. 2021.
288 103 T. Shen, Z. Hu, S. Sun, D. Liu, F. Wong, J. Wang, J. Chen, Y. Wang, L. Hong, J. Xiao, et al. Accurate RNA 3D structure prediction using a language model-based deep learning approach. Nature Methods, pages 1–12, 2024.
289 104 J. Shine and L. Dalgarno. The 3′-terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proceedings of the National Academy of Sciences, 71(4):1342–1346, 1974.
290 105 J. Söding, A. Biegert, and A. N. Lupas. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Research, 33(suppl_2):W244–W248, 2005.
291 106 M. Steinegger and J. Söding. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol., 35(11):1026–1028, Nov. 2017.
292 107 J. Su, M. Ahmed, Y. Lu, S. Pan, W. Bo, and Y. Liu. RoFormer: Enhanced transformer with rotary position embedding. Neurocomputing, 568:127063, 2024.
293 108 P. J. Sullivan, J. M. Quinn, W. Wu, M. Pinese, and M. J. Cowley. SpliceVarDB: A comprehensive database of experimentally validated human splicing variants. The American Journal of Human Genetics, 111(10): 2164–2175, Oct. 2024. ISSN 0002-9297. doi: 10.1016/j.ajhg.2024.08.002. URL https://doi.org/10.1016/j.ajhg.2024.08.002. Publisher: Elsevier.
294 109 S. Sunagawa, L. P. Coelho, S. Chaffron, J. R. Kultima, K. Labadie, G. Salazar, B. Djahanschiri, G. Zeller, D. R. Mende, A. Alberti, F. M. Cornejo-Castillo, P. I. Costea, C. Cruaud, F. d’Ovidio, S. Engelen, I. Ferrera, J. M. Gasol, L. Guidi, F. Hildebrand, F. Kokoszka, C. Lepoivre, G. Lima-Mendez, J. Poulain, B. T. Poulos, M. Royo-Llonch, H. Sarmento, S. Vieira-Silva, C. Dimier, M. Picheral, S. Searson, S. Kandels-Lewis, Tara Oceans coordinators, C. Bowler, C. de Vargas, G. Gorsky, N. Grimsley, P. Hingamp, D. Iudicone, O. Jaillon, F. Not, H. Ogata, S. Pesant, S. Speich, L. Stemmann, M. B. Sullivan, J. Weissenbach, P. Wincker, E. Karsenti, J. Raes, S. G. Acinas, and P. Bork. Structure and function of the global ocean microbiome. Science, 348(6237):1261359, May 2015.
295 110 E. Szathmáry and J. M. Smith. The major evolutionary transitions. Nature, 374(6519):227–232, 1995.
296 111 A. Tareen and J. B. Kinney. Logomaker: beautiful sequence logos in python. Bioinformatics, 36(7):2272–2274, 12 2019. ISSN 1367-4803. doi: 10.1093/bioinformatics/btz921. URL https://doi.org/10.1093/bioinformatics/btz921.
297 112 Team OLMo, P. Walsh, L. Soldaini, D. Groeneveld, K. Lo, S. Arora, A. Bhagia, Y. Gu, S. Huang, M. Jordan, et al. 2 OLMo 2 Furious. arXiv preprint arXiv:2501.00656, 2024.
298 113 A. Templeton, T. Conerly, J. Marcus, J. Lindsey, T. Bricken, B. Chen, A. Pearce, C. Citro, E. Ameisen, A. Jones, H. Cunningham, N. L. Turner, C. McDougall, M. MacDiarmid, C. D. Freeman, T. R. Sumers, E. Rees, J. Batson, A. Jermyn, S. Carter, C. Olah, and T. Henighan. Scaling monosemanticity: Extracting interpretable features from Claude 3 Sonnet. Transformer Circuits Thread, 2024. URL https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html.
299 114 A. W. Thomas, R. Parnichkun, A. Amini, S. Massaroli, and M. Poli. STAR: Synthesis of tailored architectures. arXiv, 2411.17800, 2024.
300 115 K. Tunyasuvunakool, J. Adler, Z. Wu, T. Green, M. Zielinski, A. Žídek, A. Bridgland, A. Cowie, C. Meyer, A. Laydon, et al. Highly accurate protein structure prediction for the human proteome. Nature, 596(7873): 590–596, 2021.
301 116 A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. Gomez, L. Kaiser, and I. Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 2017.
302 117 I. E. Vorontsov, I. A. Eliseeva, A. Zinkevich, M. Nikonov, S. Abramov, A. Boytsov, V. Kamenets, A. Kasianova, S. Kolmykov, I. S. Yevshin, A. Favorov, Y. A. Medvedeva, A. Jolma, F. Kolpakov, V. J. Makeev, and I. V. Kulakovskiy. HOCOMOCO in 2024: a rebuild of the curated collection of binding models for human and mouse transcription factors. Nucleic Acids Research, 52(D1):D154–D163, November 2023. ISSN 0305-1048. doi: 10.1093/nar/gkad1077. URL https://doi.org/10.1093/nar/gkad1077.
303 118 N. Wang, J. Bian, Y. Li, X. Li, S. Mumtaz, L. Kong, and H. Xiong. Multi-purpose RNA language modelling with motif-aware pretraining and type-guided fine-tuning. Nature Machine Intelligence, pages 1–10, 2024.
304 119 J. Watson and F. Crick. Molecular structure of nucleic acids: A structure for deoxyribose nucleic acid. Nature, 171:737–738, 4 1953. ISSN 0028-0836. doi: 10.1038/171737a0.
305 120 W. Xiong, J. Liu, I. Molybog, H. Zhang, P. Bhargava, R. Hou, L. Martin, R. Rungta, K. A. Sankararaman, B. Oguz, et al. Effective long-context scaling of foundation models. arXiv preprint arXiv:2309.16039, 2023.
306 121 K. K. Yang, N. Fusi, and A. X. Lu. Convolutions are competitive with transformers for protein sequence pretraining. Cell Systems, 15(3):286–294, 2024.
307 122 N. D. Youngblut, J. de la Cuesta-Zuluaga, G. H. Reischer, S. Dauser, N. Schuster, C. Walzer, G. Stalder, A. H. Farnleitner, and R. E. Ley. Large-scale metagenome assembly reveals novel animal-associated microbial genomes, biosynthetic gene clusters, and other genetic diversity, 2020.
308 123 R. Zhang. DEG: a database of essential genes. Nucleic Acids Research, 32:D271–D272, 1 2004. ISSN 1362-4962. doi: 10.1093/nar/gkh024.
309 124 Z. Zhang, H. K. Wayment-Steele, G. Brixi, H. Wang, D. Kern, and S. Ovchinnikov. Protein language models learn evolutionary statistics of interacting sequence motifs. Proceedings of the National Academy of Sciences, 121(45):e2406285121, 2024.
310 125 M. Zvyagin, A. Brace, K. Hippe, Y. Deng, B. Zhang, C. O. Bohorquez, A. Clyde, B. Kale, D. Perez-Rivera, H. Ma, C. M. Mann, M. Irvin, D. G. Ozgulbas, N. Vassilieva, J. G. Pauloski, L. Ward, V. Hayot-Sasson, M. Emani, S. Foreman, Z. Xie, D. Lin, M. Shukla, W. Nie, J. Romero, C. Dallago, A. Vahdat, C. Xiao, T. Gibbs, I. Foster, J. J. Davis, M. E. Papka, T. Brettin, R. Stevens, A. Anandkumar, V. Vishwanath, and A. Ramanathan. GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics. The International Journal of High Performance Computing Applications, 37:683–705, 11 2023. ISSN 1094-3420. doi: 10.1177/10943420231201154.
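Several entries in the block above describe zero-shot variant-effect prediction with masked language models (e.g., 43 DNABERT and 63 Meier et al., evaluated against catalogs such as 51 ClinVar). As a rough illustration only, not code from any cited paper, the sketch below scores a single-nucleotide variant as the log-likelihood ratio between the alternate and reference base at a masked position. The checkpoint name is a placeholder, and it assumes a character-level DNA tokenizer; k-mer models such as DNABERT would need k-mer handling instead.

```python
# Hedged sketch of masked-LM variant scoring: mask the variant site, then
# compare the log-probabilities of the alternate vs. reference base there.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL = "example/dna-masked-lm"  # hypothetical checkpoint; substitute a real one

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL).eval()

def llr(context: str, pos: int, ref: str, alt: str) -> float:
    """log P(alt) - log P(ref) at 0-based position `pos`, with the site masked."""
    masked = context[:pos] + tok.mask_token + context[pos + 1:]
    ids = tok(masked, return_tensors="pt")
    with torch.no_grad():
        logits = model(**ids).logits
    # Locate the masked token and take its predicted base distribution.
    mask_idx = (ids["input_ids"][0] == tok.mask_token_id).nonzero().item()
    logp = torch.log_softmax(logits[0, mask_idx], dim=-1)
    return (logp[tok.convert_tokens_to_ids(alt)]
            - logp[tok.convert_tokens_to_ids(ref)]).item()

# A negative score suggests the model finds the alternate base less plausible.
print(llr("ACGTACGTACGTACGTACGT", 10, "G", "T"))
```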
311 [6] W. Brown. Granular format rewards for eliciting mathematical reasoning capabilities in small language models. https://gist.github.com/willccbb/4676755236bb08cab5f4e54a0475d6fb. GitHub Gist. https://qiita.com/kaizen_nagoya/items/98ef27a02a07c4c03d6e
312 7 0 [7] S. Chen, L. C. Francioli, J. K. Goodrich, R. L. Collins, M. Kanai, Q. Wang, J. Alföldi, N. A. Watts, C. Vittal, L. D. Gauthier, T. Poterba, M. W. Wilson, Y. Tarasova, W. Phu, R. Grant, M. T. Yohannes, Z. Koenig, Y. Farjoun, E. Banks, S. Donnelly, S. Gabriel, N. Gupta, S. Ferriera, C. Tolonen, S. Novod, L. Bergelson, D. Roazen, V. Ruano-Rubio, M. Covarrubias, C. Llanwarne, N. Petrillo, G. Wade, T. Jeandet, R. Munshi, K. Tibbetts, M. Abreu, C. A. A. Salinas, T. Ahmad, C. M. Albert, D. Ardissino, I. M. Armean, E. G. Atkinson, G. Atzmon, J. Barnard, S. M. Baxter, L. Beaugerie, E. J. Benjamin, D. Benjamin, M. Boehnke, L. L. Bonnycastle, E. P. Bottinger, D. W. Bowden, M. J. Bown, H. Brand, S. Brant, T. Brookings, S. Bryant, S. E. Calvo, H. Campos, J. C. Chambers, J. C. Chan, K. R. Chao, S. Chapman, D. I. Chasman, R. Chisholm, J. Cho, R. Chowdhury, M. K. Chung, W. K. Chung, K. Cibulskis, B. Cohen, K. M. Connolly, A. Correa, B. B. Cummings, D. Dabelea, J. Danesh, D. Darbar, P. Darnowsky, J. Denny, R. Duggirala, J. Dupuis, P. T. Ellinor, R. Elosua, J. Emery, E. England, J. Erdmann, T. Esko, E. Evangelista, D. Fatkin, J. Florez, A. Franke, J. Fu, M. Färkkilä, K. Garimella, J. Gentry, G. Getz, D. C. Glahn, B. Glaser, S. J. Glatt, D. Goldstein, C. Gonzalez, L. Groop, S. Gudmundsson, A. Haessly, C. Haiman, I. Hall, C. L. Hanis, M. Harms, M. Hiltunen, M. M. Holi, C. M. Hultman, C. Jalas, M. Kallela, D. Kaplan, J. Kaprio, S. Kathiresan, E. E. Kenny, B. J. Kim, Y. J. Kim, D. King, G. Kirov, J. Kooner, S. Koskinen, H. M. Krumholz, S. Kugathasan, S. H. Kwak, M. Laakso, N. Lake, T. Langsford, K. M. Laricchia, T. Lehtimäki, M. Lek, E. Lipscomb, R. J. Loos, W. Lu, S. A. Lubitz, T. T. Luna, R. C. Ma, G. M. Marcus, J. Marrugat, K. M. Mattila, S. McCarroll, M. I. McCarthy, J. L. McCauley, D. McGovern, R. McPherson, J. B. Meigs, O. Melander, A. Metspalu, D. Meyers, E. V. Minikel, B. D. Mitchell, V. K. Mootha, A. Naheed, S. Nazarian, P. M. Nilsson, M. C. O’Donovan, Y. Okada, D. Ongur, L. Orozco, M. J. Owen, C. Palmer, N. D. Palmer, A. Palotie, K. S. Park, C. Pato, A. E. Pulver, D. Rader, N. Rahman, A. Reiner, A. M. Remes, D. Rhodes, S. Rich, J. D. Rioux, S. Ripatti, D. M. Roden, J. I. Rotter, N. Sahakian, D. Saleheen, V. Salomaa, A. Saltzman, N. J. Samani, K. E. Samocha, A. Sanchis-Juan, J. Scharf, M. Schleicher, H. Schunkert, S. Schönherr, E. G. Seaby, S. H. Shah, M. Shand, T. Sharpe, M. B. Shoemaker, T. Shyong, E. K. Silverman, M. Singer-Berk, P. Sklar, J. T. Smith, J. G. Smith, H. Soininen, H. Sokol, R. G. Son, J. Soto, T. Spector, C. Stevens, N. O. Stitziel, P. F. Sullivan, J. Suvisaari, E. S. Tai, K. D. Taylor, Y. Y. Teo, M. Tsuang, T. Tuomi, D. Turner, T. Tusie-Luna, E. Vartiainen, M. Vawter, L. Wang, A. Wang, J. S. Ware, H. Watkins, R. K. Weersma, B. Weisburd, M. Wessman, N. Whiffin, J. G. Wilson, R. J. Xavier, A. O’Donnell-Luria, M. Solomonson, C. Seed, A. R. Martin, M. E. Talkowski, H. L. Rehm, M. J. Daly, G. Tiao, B. M. Neale, D. G. MacArthur, and K. J. Karczewski. A genomic mutational constraint map using variation in 76,156 human genomes. Nature, 625:92–100, 12 2023. ISSN 1476-4687. doi: 10.1038/s41586-023-06045-0. URL https://www.nature.com/articles/s41586-023-06045-0. https://qiita.com/kaizen_nagoya/items/e799ad85ee98bb2a8cf6
313 1 Short, P. J. et al. De novo mutations in regulatory elements in neurodevelopmental disorders. Nature 555, 611–616 (2018).
314 2 Satterstrom, F. K. et al. Large-scale exome sequencing study implicates both developmental and functional changes in the neurobiology of autism. Cell 180, 568–584.e23 (2020).
315 3 Singh, T. et al. The contribution of rare variants to risk of schizophrenia in individuals with and without intellectual disability. Nat. Genet. 49, 1167–1173 (2017).
316 4 Ganna, A. et al. Quantifying the impact of rare and ultra-rare coding variation across the phenotypic spectrum. Am. J. Hum. Genet. 102, 1204–1211 (2018).
317 5 Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
318 6 Petrovski, S., Wang, Q., Heinzen, E. L., Allen, A. S. & Goldstein, D. B. Genic intolerance to functional variation and the interpretation of personal genomes. PLoS Genet. 9, e1003709 (2013).
319 7 Samocha, K. E. et al. A framework for the interpretation of de novo mutation in human disease. Nat. Genet. 46, 944–950 (2014).
320 8 Hindorff, L. A. et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl Acad. Sci. USA 106, 9362–9367 (2009).
321 9 Lanyi, J. K. Photochromism of halorhodopsin. cis/trans isomerization of the retinal around the 13–14 double bond. J. Biol. Chem. 261, 14025–14030 (1986).
322 10 Mathelier, A., Shi, W. & Wasserman, W. W. Identification of altered cis-regulatory elements in human disease. Trends Genet. 31, 67–76 (2015).
323 11 Spielmann, M. & Mundlos, S. Looking beyond the genes: the role of non-coding variants in human disease. Hum. Mol. Genet. 25, R157–R165 (2016).
324 12 Zhang, F. & Lupski, J. R. Non-coding genetic variants in human disease. Hum. Mol. Genet. 24, R102–R110 (2015).
325 13 Seplyarskiy, V. B. & Sunyaev, S. The origin of human mutation in light of genomic data. Nat. Rev. Genet. 22, 672–686 (2021).
326 14 Seplyarskiy, V. B. et al. Population sequencing data reveal a compendium of mutational processes in the human germ line. Science 373, 1030–1035 (2021).
327 15 Gussow, A. B. et al. Orion: Detecting regions of the human non-coding genome that are intolerant to variation using population genetics. PLoS ONE 12, e0181604 (2017).
328 16 di Iulio, J. et al. The human noncoding genome defined by genetic diversity. Nat. Genet. 50, 333–337 (2018).
329 17 Halldorsson, B. V. et al. The sequences of 150,119 genomes in the UK Biobank. Nature 607, 732–740 (2022).
330 18 Ritchie, G. et al. Functional annotation of noncoding sequence variants. Nat. Methods 11, 294–296 (2014).
331 19 Vitsios, D., Dhindsa, R. S., Middleton, L., Gussow, A. B. & Petrovski, S. Prioritizing non-coding regions based on human genomic constraint and sequence context with deep learning. Nat. Commun. 12, 1504 (2021).
332 20 Siepel, A. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005).
333 21 Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R. & Siepel, A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 20, 110–121 (2010).
334 22 Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
335 23 Halldorsson, B. V. et al. Characterizing mutagenic effects of recombination through a sequence-level genetic map. Science 363, eaau1043 (2019).
336 24 An, J. Y. et al. Genome-wide de novo risk score implicates promoter variation in autism spectrum disorder. Science 362, eaat6576 (2018).
337 25 Collins, R. L. et al. A structural variation reference for medical and population genetics. Nature 581, 444–451 (2020).
338 26 The ENCODE Project Consortium. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).
339 27 Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
340 28 Jiang, Y. et al. SEdb: a comprehensive human super-enhancer database. Nucleic Acids Res. 47, D235–D243 (2019).
341 29 Pott, S. & Lieb, J. D. What are super-enhancers? Nat. Genet. 47, 8–12 (2015).
342 30 Bartel, D. P. Metazoan microRNAs. Cell 173, 20–51 (2018).
343 31 Welter, D. et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 42, D1001–D1006 (2014).
344 32 Kanai, M. et al. Insights from complex trait fine-mapping across diverse populations. Preprint at medRxiv https://doi.org/10.1101/2021.09.03.21262975 (2021).
345 33 Jung, R. G. et al. Association between plasminogen activator inhibitor-1 and cardiovascular events: a systematic review and meta-analysis. Thromb. J. 16, 12 (2018).
346 34 Song, C., Burgess, S., Eicher, J. D., O’Donnell, C. J. & Johnson, A. D. Causal effect of plasminogen activator inhibitor type 1 on coronary heart disease. J. Am. Heart Assoc. 6, e004918 (2017).
347 35 Schaefer, A. S. et al. Genetic evidence for PLASMINOGEN as a shared genetic risk factor of coronary artery disease and periodontitis. Circ. Cardiovasc. Genet. 8, 159–167 (2015).
348 36 Li, Y. Y. Plasminogen activator inhibitor-1 4G/5G gene polymorphism and coronary artery disease in the Chinese Han population: a meta-analysis. PLoS ONE 7, e33511 (2012).
349 37 Drinane, M. C., Sherman, J. A., Hall, A. E., Simons, M. & Mulligan-Kehoe, M. J. Plasminogen and plasmin activity in patients with coronary artery disease. J. Thromb. Haemost. 4, 1288–1295 (2006).
350 38 Lowe, G. D. et al. Tissue plasminogen activator antigen and coronary heart disease. Prospective study and meta-analysis. Eur. Heart J. 25, 252–259 (2004).
351 39 Wang, Q. S. et al. Leveraging supervised learning for functionally informed fine-mapping of cis-eQTLs identifies an additional 20,913 putative causal eQTLs. Nat. Commun. 12, 3394 (2021).
352 40 Landrum, M. J. et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067 (2018).
353 41 Stenson, P. D. et al. Human Gene Mutation Database (HGMD): 2003 update. Hum. Mutat. 21, 577–581 (2003).
354 42 Davydov, E. V. et al. Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput. Biol. 6, e1001025 (2010).
355 43 Greenway, S. C. et al. De novo copy number variants identify new genes and loci in isolated sporadic tetralogy of Fallot. Nat. Genet. 41, 931–935 (2009).
356 44 Mefford, H. C. et al. Recurrent reciprocal genomic rearrangements of 17q12 are associated with renal disease, diabetes, and epilepsy. Am. J. Hum. Genet. 81, 1057–1069 (2007).
357 45 Sebat, J. et al. Strong association of de novo copy number mutations with autism. Science 316, 445–449 (2007).
358 46 Stefansson, H. et al. Large recurrent microdeletions associated with schizophrenia. Nature 455, 232–236 (2008).
359 47 Walsh, T. et al. Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science 320, 539–543 (2008).
360 48 Wright, C. F. et al. Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data. Lancet 385, 1305–1314 (2015).
361 49 Spielmann, M., Lupianez, D. G. & Mundlos, S. Structural variation in the 3D genome. Nat. Rev. Genet. 19, 453–467 (2018).
362 50 Spielmann, M. & Mundlos, S. Structural variations, the regulatory landscape of the genome and their alteration in human disease. Bioessays 35, 533–543 (2013).
363 51 Coe, B. P. et al. Refining analyses of copy number variation identifies specific genes associated with developmental delay. Nat. Genet. 46, 1063–1071 (2014).
364 52 Cooper, G. M. et al. A copy number variation morbidity map of developmental delay. Nat. Genet. 43, 838–846 (2011).
365 53 Klopocki, E. et al. Copy-number variations involving the IHH locus are associated with syndactyly and craniosynostosis. Am. J. Hum. Genet. 88, 70–75 (2011).
366 54 Barroso, E. et al. Identification of the fourth duplication of upstream IHH regulatory elements, in a family with craniosynostosis Philadelphia type, helps to define the phenotypic characterization of these regulatory elements. Am. J. Med. Genet. A 167A, 902–906 (2015).
367 55 Will, A. J. et al. Composition and dosage of a multipartite enhancer cluster control developmental expression of Ihh (Indian hedgehog). Nat. Genet. 49, 1539–1545 (2017).
368 56 Roadmap Epigenomics Consortium. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
369 57 Rehm, H. L. et al. ClinGen—the Clinical Genome Resource. N. Engl. J. Med. 372, 2235–2242 (2015).
370 58 Blake, J. A. et al. The Mouse Genome Database (MGD): premier model organism resource for mammalian genomics and genetics. Nucleic Acids Res. 39, D842–D848 (2011).
371 59 McKusick, V. A. Mendelian Inheritance in Man and its online version, OMIM. Am. J. Hum. Genet. 80, 588–604 (2007).
372 60 GTEx Consortium. The Genotype–Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
373 61 Xu, H. et al. Elevated ASCL2 expression in breast cancer is associated with the poor prognosis of patients. Am. J. Cancer Res. 7, 955–961 (2017).
374 62 Jubb, A. M. et al. Achaete-scute like 2 (ascl2) is a target of Wnt signalling and is upregulated in intestinal neoplasia. Oncogene 25, 3445–3457 (2006).
375 63 Tian, Y. et al. MicroRNA-200 (miR-200) cluster regulation by achaete scute-like 2 (Ascl2): impact on the epithelial-mesenchymal transition in colon cancer cells. J. Biol. Chem. 289, 36101–36115 (2014).
376 64 Guo, M. H. et al. Inferring compound heterozygosity from large-scale exome sequencing data. Nat. Genet. https://doi.org/10.1038/s41588-023-01608-3 (2023).
377 65 Zhu, P. et al. Single-cell DNA methylome sequencing of human preimplantation embryos. Nat. Genet. 50, 12–19 (2018).
378 66 Tang, W. W. et al. A unique gene regulatory network resets the human germline epigenome for development. Cell 161, 1453–1467 (2015).
379 67 Ross, D. A., Lim, J., Lin, R.-S. & Yang, M.-H. Incremental learning for robust visual tracking. Int. J. Comput. Vision 77, 125–141 (2008).
380 68 Karolchik, D. et al. The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 32, D493–D496 (2004).
381 69 Li, H. Toward better understanding of artifacts in variant calling from high-coverage samples. Bioinformatics 30, 2843–2851 (2014).
382 70 Davis, C. A. et al. The Encyclopedia of DNA elements (ENCODE): data portal update. Nucleic Acids Res. 46, D794–D801 (2018).
383 71 Goldmann, J. M. et al. Germline de novo mutation clusters arise during oocyte aging in genomic regions with high double-strand-break incidence. Nat. Genet. 50, 487–492 (2018).
384 72 Zhao, H. et al. CrossMap: a versatile tool for coordinate conversion between genome assemblies. Bioinformatics 30, 1006–1007 (2014).
385 73 Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
386 74 Kent, W. J., Zweig, A. S., Barber, G., Hinrichs, A. S. & Karolchik, D. BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics 26, 2204–2207 (2010).
387 75 Koenig, Z. et al. A harmonized public resource of deeply sequenced diverse human genomes. Preprint at bioRxiv https://doi.org/10.1101/2023.01.23.525248 (2023).
388 76 Harrow, J. et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22, 1760–1774 (2012).
389 77 Hon, C. C. et al. An atlas of human long non-coding RNAs with accurate 5′ ends. Nature 543, 199–204 (2017).
390 78 Wang, G., Sarkar, A., Carbonetto, P. & Stephens, M. A simple new approach to variable selection in regression, with application to genetic fine-mapping. J. R. Stat. Soc. B 82, 1273–1300 (2020).
391 79 Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021).
392 80 Budescu, D. V. Dominance analysis: a new approach to the problem of relative importance of predictors in multiple regression. Psych. Bull. 114, 542 (1993).
393 81 Azen, R. & Budescu, D. V. The dominance analysis approach for comparing predictors in multiple regression. Psych. Methods 8, 129 (2003).
394 82 Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011).
395 83 Liu, Y., Sarkar, A., Kheradpour, P., Ernst, J. & Kellis, M. Evidence of reduced recombination rate in human regulatory domains. Genome Biol. 18, 193 (2017).
396 84 Robin, X. et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12, 1–8 (2011).
397 85 Bergstrom, A. et al. Insights into human genetic variation and population history from 929 diverse genomes. Science 367, eaay5012 (2020).
398 86 The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
399 8 0 [8] M. E. Consens, B. Li, A. R. Poetsch, and S. Gilbert. Genomic language models could transform medicine but not yet. npj Digital Medicine, 8:1–4, 12 2025. ISSN 2398-6352. doi: 10.1038/s41746-025-01603-4. URL https://www.nature.com/articles/s41746-025-01603-4. https://qiita.com/kaizen_nagoya/items/f797330e64e0c7d05f39
400 1 Brixi, G. et al. Genome modeling and design across all domains of life with Evo 2. 2025.02.18.638918. Preprint at https://doi.org/10.1101/2025.02.18.638918 (2025).
401 2 Callaway, E. Biggest-ever AI biology model writes DNA on demand. Nature 638, 868–869 (2025).
402 3 Park, E. G. et al. Genomic analyses of non-coding RNAs overlapping transposable elements and its implication to human diseases. Int. J. Mol. Sci. 23, 8950 (2022).
403 4 Alonso, M.E., Pernaute, B., Crespo, M., Gómez-Skarmeta, J.L. & Manzanares, M. Understanding the regulatory genome. Int. J. Develop. Biol. https://ijdb.ehu.eus/article/072428ma (2009).
404 5 Consens, M. E. et al. Transformers and genome language models. Nat. Machine Intell. https://www.nature.com/articles/s42256-025-01007-9 (2025).
405 6 Dalla-Torre, H. et al. Nucleotide transformer: building and evaluating robust foundation models for human genomics. Nat. Methods 22, 287–297 (2025).
406 7 Nguyen, E. et al. Sequence modeling and design from molecular to genome scale with Evo. Science 386, eado9336 (2024).
407 8 Comfort, N. Genetics: we are the 98%. Nature 520, 615–616 (2015).
408 9 Benegas, G., Albors, C., Aw, A.J. et al. A DNA language model based on multispecies alignment predicts the effects of genome-wide variants. Nat Biotechnol. https://doi.org/10.1038/s41587-024-02511-w (2025).
409 10 National Institutes of Health (US) & Biological Sciences Curriculum Study. Understanding Human Genetic Variation. in NIH Curriculum Supplement Series [Internet] (National Institutes of Health, 2007).
410 11 Gresova, K., Martinek, V., Cechak, D., Simecek, P. & Alexiou, P. Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genomic Data 24, 25 (2023).
411 12 Marin, F. I. et al. BEND: Benchmarking DNA language models on biologically meaningful tasks. Preprint at https://doi.org/10.48550/arXiv.2311.12570 (2024).
412 13 Ji, Y., Zhou, Z., Liu, H. & Davuluri, R. V. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120 (2021).
413 14 Sanabria, M., Hirsch, J. & Poetsch, A. R. Distinguishing word identity and sequence context in DNA language models. BMC Bioinformatics 25, 301 (2024).
414 15 Sanabria, M., Hirsch, J., Joubert, P. M. & Poetsch, A. R. DNA language model GROVER learns sequence context in the human genome. Nat. Mach. Intell. 6, 911–923 (2024).
415 16 Bloomfield, D. et al. AI and biosecurity: the need for governance. Science 385, 831–833 (2024).
416 17 Riedemann, L., Labonne, M. & Gilbert, S. The path forward for large language models in medicine is open. Npj Digit. Med. 7, 1–5 (2024).
417 18 Poetsch, A. R. KI-Modell analysiert und generiert DNA-Strukturen (SMC, 2024).
418 19 Derraz, B. et al. New regulatory thinking is needed for AI-based personalised drug and cell therapies in precision oncology. Npj Precis. Oncol. 8, 1–11 (2024).
419 20 Harishbhai Tilala, M. et al. Ethical considerations in the use of artificial intelligence and machine learning in health care: a comprehensive review. Cureus 16, e62443 (2024).
420 21 Zhou, Z. et al. DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome. (2023).
421 22 Nguyen, E. et al. HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution. (2023).
422 9 0 [9] H. Dalla-Torre, L. Gonzalez, J. Mendoza-Revilla, N. L. Carranza, A. H. Grzywaczewski, F. Oteri, C. Dallago, E. Trop, B. P. de Almeida, H. Sirelkhatim, G. Richard, M. Skwark, K. Beguir, M. Lopez, and T. Pierrot. Nucleotide transformer: building and evaluating robust foundation models for human genomics. Nature Methods, 22:287–297, 2 2024. ISSN 1548-7105. doi: 10.1038/s41592-024-02523-z. URL https://www.nature.com/articles/s41592-024-02523-z. https://qiita.com/kaizen_nagoya/items/1c147c2b095364f04ef7
423 1 Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. Preprint at https://arxiv.org/abs/1810.04805 (2018).
424 2 Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).
425 3 Jumper, J. et al. Highly accurate protein structure prediction with alphafold. Nature 596, 583–589 (2021).
426 4 Rives, A. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl Acad. Sci. USA 118, e2016239118 (2021).
427 5 Elnaggar, A. et al. ProtTrans: towards cracking the language of life’s code through self-supervised deep learning and high performance computing. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7112–7127 (2022).
428 6 Littmann, M., Heinzinger, M., Dallago, C., Olenyi, T. & Rost, B. Embeddings from deep learning transfer GO annotations beyond homology. Sci. Rep. 11, 1–14 (2021).
429 7 Marquet, C. et al. Embeddings from protein language models predict conservation and variant effects. Hum. Genet. 141, 1629–1647 (2022).
430 8 Littmann, M., Heinzinger, M., Dallago, C., Weissenow, K. & Rost, B. Protein embeddings and deep learning predict binding residues for various ligand classes. Sci. Rep. 11, 23916 (2021).
431 9 Avsec, Ž. et al. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat. Genet. 53, 354–366 (2021).
432 10 Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning–based sequence model. Nat. Methods 12, 931–934 (2015).
433 11 Mateo, L. J., Sinnott-Armstrong, N. & Boettiger, A. N. Tracing DNA paths and RNA profiles in cultured cells and tissues with ORCA. Nat. Protoc. 16, 1647–1713 (2021).
434 12 de Almeida, B. P., Reiter, F., Pagani, M. & Stark, A. DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers. Nat. Genet. 54, 613–624 (2022).
435 13 Eraslan, G., Avsec, Ž., Gagneur, J. & Theis, F. J. Deep learning: new computational modelling techniques for genomics. Nat. Rev. Genet. 20, 389–403 (2019).
436 14 Zhou, J. et al. Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk. Nat. Genet. 50, 1171–1179 (2018).
437 15 Kelley, D. R. Cross-species regulatory sequence activity prediction. PLOS Comput. Biol. 16, e1008050 (2020).
438 16 Kelley, D. R. et al. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 28, 739–750 (2018).
439 17 Agarwal, V. & Shendure, J. Predicting mRNA abundance directly from genomic sequence using deep convolutional neural networks. Cell Rep. 31, 107663 (2020).
440 18 Chen, K. M., Wong, A. K., Troyanskaya, O. G. & Zhou, J. A sequence-based global map of regulatory activity for deciphering human genetics. Nat. Genet. 54, 940–949 (2022).
441 19 Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).
442 20 Ji, Y., Zhou, Z., Liu, H. & Davuluri, R. V. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120 (2021).
443 21 Zvyagin, M. T. et al. GenSLMs: genome-scale language models reveal SARS-CoV-2 evolutionary dynamics. Preprint at bioRxiv https://doi.org/10.1101/2022.10.10.511571 (2022).
444 22 Outeiral, C. & Deane, C. M. Codon language embeddings provide strong signals for protein engineering. Nat. Mach. Intell. 6, 170–179 (2024).
445 23 Zhou, Z. et al. DNABERT-2: efficient foundation model and benchmark for multi-species genome. in Proceedings of the Twelfth International Conference on Learning Representations https://openreview.net/pdf?id=oMLQB4EZE1 (ICLR, 2024).
446 24 Fishman, V. et al. GENA-LM: a family of open-source foundational models for long DNA sequences. Preprint at bioRxiv https://doi.org/10.1101/2023.06.12.544594 (2023).
447 25 Nguyen, E. et al. HyenaDNA: long-range genomic sequence modeling at single nucleotide resolution. in 37th Conference on Neural Information Processing Systems https://openreview.net/pdf?id=ubzNoJjOKj (NeurIPS, 2023).
448 26 Mendoza-Revilla, J. et al. A foundational large language model for edible plant genomes. Commun. Biol. 7, 835 (2024).
449 27 Rae, J. W. et al. Scaling language models: methods, analysis & insights from training gopher. Preprint at https://arxiv.org/abs/2112.11446 (2021).
450 28 The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
451 29 Harrow, J. et al. GENCODE: the reference human genome annotation for the encode project. Genome Res. 22, 1760–1774 (2012).
452 30 Meylan, P., Dreos, R., Ambrosini, G., Groux, R. & Bucher, P. EPD in 2020: enhanced data visualization and extension to ncRNA promoters. Nucleic Acids Res. 48, D65–D69 (2020).
453 31 ENCODE. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
454 32 The ENCODE Project Consortium. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).
455 33 Meuleman, W. et al. Index and biological spectrum of human DNase I hypersensitive sites. Nature 584, 244–251 (2020).
456 34 Li, F.-Z., Amini, A. P., Yang, K. K. & Lu, A. X. Pretrained protein language model transfer learning: is the final layer representation what we want? in Machine Learning for Structural Biology Workshop (NeurIPS, 2022). (A minimal embedding-probe sketch follows this reference block.)
457 35 Liu, H. et al. Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning. in 36th Conference on Neural Information Processing Systems (NeurIPS, 2022).
458 36 Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548 (2019).
459 37 Benegas, G., Batra, S. S. & Song, Y. S. DNA language models are powerful zero-shot predictors of non-coding variant effects. Proc. Natl Acad. Sci. USA 120, e2311219120 (2023).
460 38 Vig, J. et al. BERTology meets biology: interpreting attention in protein language models. in Proceedings of the International Conference on Learning Representations 2021 https://openreview.net/pdf?id=YWtLZvLmud7 (ICLR, 2021).
461 39 Braun, S. et al. Decoding a cancer-relevant splicing decision in the ron proto-oncogene using high-throughput mutagenesis. Nat. Commun. 9, 3315 (2018).
462 40 McLaren, W. et al. The Ensembl variant effect predictor. Genome Biol. 17, 1–14 (2016).
463 41 Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).
464 42 GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
465 43 Võsa, U. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat. Genetics 53, 1300–1310 (2021).
466 44 Chowdhery, A. et al. PaLM: scaling language modeling with pathways. J. Mach. Learn. Technol. 24, 11324–11436 (2021).
467 45 Hoffmann, J. et al. Training compute-optimal large language models. in 36th Conference on Neural Information Processing Systems https://proceedings.neurips.cc/paper_files/paper/2022/file/c1e2faff6f588870935f114ebe04a3e5-Paper-Conference.pdf (NeurIPS, 2022).
468 46 Rogers, A., Kovaleva, O. & Rumshisky, A. A primer in BERTology: what we know about how bert works. Trans. Assoc. Comput. Linguist. 8, 842–866 (2020).
469 47 Stärk, H., Dallago, C., Heinzinger, M. & Rost, B. Light attention predicts protein location from the language of life. Bioinform. Adv. 1, vbab035 (2021).
470 48 Zou, J. et al. A primer on deep learning in genomics. Nat. Genetics 51, 12–18 (2019).
471 49 Wang, A. et al. Superglue: a stickier benchmark for general-purpose language understanding systems. in 33rd Conference on Neural Information Processing Systems https://papers.nips.cc/paper_files/paper/2019/file/4496bf24afe7fab6f046bf4923da8de6-Paper.pdf (NeurIPS, 2019).
472 50 Hendrycks, D. & Gimpel, K. Gaussian error linear units (GELUs). Preprint at https://arxiv.org/abs/1606.08415 (2016).
473 51 Su, J. et al. RoFormer: enhanced transformer with rotary position embedding. Neurocomputing 568, 127063 (2024).
474 52 Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980v5 (2015).
475 53 Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
476 54 Bergstra, J., Bardenet, R., Bengio, Y. & Kégl, B. Algorithms for hyper-parameter optimization. in Advances in Neural Information Processing Systems 24 https://papers.nips.cc/paper_files/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf (NeurIPS, 2011).
477 55 Byrska-Bishop, M. et al. High-coverage whole-genome sequencing of the expanded 1000 genomes project cohort including 602 trios. Cell 185, 3426–3440 (2022).
478 56 Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
479 57 Leslie, R., O’Donnell, C. J. & Johnson, A. D. GRASP: analysis of genotype–phenotype results from 1390 genome-wide association studies and corresponding open access database. Bioinformatics 30, i185–i194 (2014).
480 58 Landrum, M. J. et al. Clinvar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067 (2018).
481 59 Stenson, P. D. et al. The Human Gene Mutation Database (HGMD®): optimizing its use in a clinical diagnostic or research setting. Hum. Genet. 139, 1197–1207 (2020).
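Several of the entries above concern probing pretrained sequence models: pooling hidden-state representations (refs. 34 and 35) and evaluating them with simple classifiers built on scikit-learn (ref. 53). Below is a minimal sketch of that workflow, under stated assumptions: the checkpoint name is a placeholder and the two-sequence dataset is a stand-in, not data from any cited work.

```python
# Hedged sketch of an embedding probe: mean-pool last-layer hidden states,
# then fit a scikit-learn classifier (ref. 53) on the pooled vectors.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

MODEL = "example/nucleotide-lm"  # hypothetical checkpoint; substitute a real one

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL).eval()

def embed(seqs):
    """One mean-pooled last-layer embedding vector per input sequence."""
    vecs = []
    for s in seqs:
        ids = tok(s, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**ids).last_hidden_state  # (1, n_tokens, dim)
        vecs.append(hidden.mean(dim=1).squeeze(0).numpy())
    return np.stack(vecs)

seqs = ["ACGTACGTAC", "TTGGCCAATT"]  # placeholder sequences
labels = [0, 1]                      # placeholder labels
probe = LogisticRegression(max_iter=1000).fit(embed(seqs), labels)
print(probe.predict(embed(["ACGTACGTAC"])))
```

Mean pooling over the final layer is one common choice; ref. 34 above questions whether the final layer is always the representation we want, so intermediate layers are worth trying too.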
482 11 0 [11] DeepSeek-AI, D. Guo, D. Yang, H. Zhang, J. Song, R. Zhang, R. Xu, Q. Zhu, S. Ma, P. Wang, X. Bi, X. Zhang, X. Yu, Y. Wu, Z. F. Wu, Z. Gou, Z. Shao, Z. Li, Z. Gao, A. Liu, B. Xue, B. Wang, B. Wu, B. Feng, C. Lu, C. Zhao, C. Deng, C. Zhang, C. Ruan, D. Dai, D. Chen, D. Ji, E. Li, F. Lin, F. Dai, F. Luo, G. Hao, G. Chen, G. Li, H. Zhang, H. Bao, H. Xu, H. Wang, H. Ding, H. Xin, H. Gao, H. Qu, H. Li, J. Guo, J. Li, J. Wang, J. Chen, J. Yuan, J. Qiu, J. Li, J. L. Cai, J. Ni, J. Liang, J. Chen, K. Dong, K. Hu, K. Gao, K. Guan, K. Huang, K. Yu, L. Wang, L. Zhang, L. Zhao, L. Wang, L. Zhang, L. Xu, L. Xia, M. Zhang, M. Zhang, M. Tang, M. Li, M. Wang, M. Li, N. Tian, P. Huang, P. Zhang, Q. Wang, Q. Chen, Q. Du, R. Ge, R. Zhang, R. Pan, R. Wang, R. J. Chen, R. L. Jin, R. Chen, S. Lu, S. Zhou, S. Chen, S. Ye, S. Wang, S. Yu, S. Zhou, S. Pan, S. S. Li, S. Zhou, S. Wu, S. Ye, T. Yun, T. Pei, T. Sun, T. Wang, W. Zeng, W. Zhao, W. Liu, W. Liang, W. Gao, W. Yu, W. Zhang, W. L. Xiao, W. An, X. Liu, X. Wang, X. Chen, X. Nie, X. Cheng, X. Liu, X. Xie, X. Liu, X. Yang, X. Li, X. Su, X. Lin, X. Q. Li, X. Jin, X. Shen, X. Chen, X. Sun, X. Wang, X. Song, X. Zhou, X. Wang, X. Shan, Y. K. Li, Y. Q. Wang, Y. X. Wei, Y. Zhang, Y. Xu, Y. Li, Y. Zhao, Y. Sun, Y. Wang, Y. Yu, Y. Zhang, Y. Shi, Y. Xiong, Y. He, Y. Piao, Y. Wang, Y. Tan, Y. Ma, Y. Liu, Y. Guo, Y. Ou, Y. Wang, Y. Gong, Y. Zou, Y. He, Y. Xiong, Y. Luo, Y. You, Y. Liu, Y. Zhou, Y. X. Zhu, Y. Xu, Y. Huang, Y. Li, Y. Zheng, Y. Zhu, Y. Ma, Y. Tang, Y. Zha, Y. Yan, Z. Z. Ren, Z. Ren, Z. Sha, Z. Fu, Z. Xu, Z. Xie, Z. Zhang, Z. Hao, Z. Ma, Z. Yan, Z. Wu, Z. Gu, Z. Zhu, Z. Liu, Z. Li, Z. Xie, Z. Song, Z. Pan, Z. Huang, Z. Xu, Z. Zhang, and Z. Zhang. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning. 1 2025. URL https://arxiv.org/pdf/2501.12948. https://qiita.com/kaizen_nagoya/items/bb5ee9f17c03e07659d8
483 1 AI@Meta. Llama 3.1 model card, 2024. URL https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md.
484 2 Anthropic. Claude 3.5 sonnet, 2024. URL https://www.anthropic.com/news/claude-3-5-sonnet.
485 3 M. Chen, J. Tworek, H. Jun, Q. Yuan, H. P. de Oliveira Pinto, J. Kaplan, H. Edwards, Y. Burda, N. Joseph, G. Brockman, A. Ray, R. Puri, G. Krueger, M. Petrov, H. Khlaaf, G. Sastry, P. Mishkin, B. Chan, S. Gray, N. Ryder, M. Pavlov, A. Power, L. Kaiser, M. Bavarian, C. Winter, P. Tillet, F. P. Such, D. Cummings, M. Plappert, F. Chantzis, E. Barnes, A. Herbert-Voss, W. H. Guss, A. Nichol, A. Paino, N. Tezak, J. Tang, I. Babuschkin, S. Balaji, S. Jain, W. Saunders, C. Hesse, A. N. Carr, J. Leike, J. Achiam, V. Misra, E. Morikawa, A. Radford, M. Knight, M. Brundage, M. Murati, K. Mayer, P. Welinder, B. McGrew, D. Amodei, S. McCandlish, I. Sutskever, and W. Zaremba. Evaluating large language models trained on code. CoRR, abs/2107.03374, 2021. URL https://arxiv.org/abs/2107.03374.
487 5 A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, A. Mathur, A. Schelten, A. Yang, A. Fan, et al. The llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024.
488 6 Y. Dubois, B. Galambosi, P. Liang, and T. B. Hashimoto. Length-controlled alpacaeval: A simple way to debias automatic evaluators. arXiv preprint arXiv:2404.04475, 2024.
489 7 X. Feng, Z. Wan, M. Wen, S. M. McAleer, Y. Wen, W. Zhang, and J. Wang. Alphazero-like tree-search can guide large language model decoding and training, 2024. URL https://arxiv.org/abs/2309.17179.
490 8 L. Gao, J. Schulman, and J. Hilton. Scaling laws for reward model overoptimization, 2022. URL https://arxiv.org/abs/2210.10760.
491 9 A. P. Gema, J. O. J. Leang, G. Hong, A. Devoto, A. C. M. Mancino, R. Saxena, X. He, Y. Zhao, X. Du, M. R. G. Madani, C. Barale, R. McHardy, J. Harris, J. Kaddour, E. van Krieken, and P. Minervini. Are we done with MMLU? CoRR, abs/2406.04127, 2024. URL https://doi.org/10.48550/arXiv.2406.04127.
492 10 Google. Our next-generation model: Gemini 1.5, 2024. URL https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024.
493 11 Y. He, S. Li, J. Liu, Y. Tan, W. Wang, H. Huang, X. Bu, H. Guo, C. Hu, B. Zheng, et al. Chinese SimpleQA: A Chinese factuality evaluation for large language models. arXiv preprint arXiv:2411.07140, 2024.
494 12 D. Hendrycks, C. Burns, S. Basart, A. Zou, M. Mazeika, D. Song, and J. Steinhardt. Measuring massive multitask language understanding. arXiv preprint arXiv:2009.03300, 2020.
495 13 Y. Huang, Y. Bai, Z. Zhu, J. Zhang, J. Zhang, T. Su, J. Liu, C. Lv, Y. Zhang, J. Lei, et al. C-Eval: A multi-level multi-discipline chinese evaluation suite for foundation models. arXiv preprint arXiv:2305.08322, 2023.
496 14 N. Jain, K. Han, A. Gu, W. Li, F. Yan, T. Zhang, S. Wang, A. Solar-Lezama, K. Sen, and I. Stoica. LiveCodeBench: Holistic and contamination free evaluation of large language models for code. CoRR, abs/2403.07974, 2024. URL https://doi.org/10.48550/arXiv.2403.07974.
497 15 S. Krishna, K. Krishna, A. Mohananey, S. Schwarcz, A. Stambler, S. Upadhyay, and M. Faruqui. Fact, fetch, and reason: A unified evaluation of retrieval-augmented generation. CoRR, abs/2409.12941, 2024. doi: 10.48550/ARXIV.2409.12941. URL https://doi.org/10.48550/arXiv.2409.12941.
498 16 A. Kumar, V. Zhuang, R. Agarwal, Y. Su, J. D. Co-Reyes, A. Singh, K. Baumli, S. Iqbal, C. Bishop, R. Roelofs, et al. Training language models to self-correct via reinforcement learning. arXiv preprint arXiv:2409.12917, 2024.
499 17 H. Li, Y. Zhang, F. Koto, Y. Yang, H. Zhao, Y. Gong, N. Duan, and T. Baldwin. CMMLU: Measuring massive multitask language understanding in Chinese. arXiv preprint arXiv:2306.09212, 2023.
500 18 T. Li, W.-L. Chiang, E. Frick, L. Dunlap, T. Wu, B. Zhu, J. E. Gonzalez, and I. Stoica. From crowdsourced data to high-quality benchmarks: Arena-hard and benchbuilder pipeline. arXiv preprint arXiv:2406.11939, 2024.
501 19 H. Lightman, V. Kosaraju, Y. Burda, H. Edwards, B. Baker, T. Lee, J. Leike, J. Schulman, I. Sutskever, and K. Cobbe. Let’s verify step by step. arXiv preprint arXiv:2305.20050, 2023.
502 20 B. Y. Lin. ZeroEval: A Unified Framework for Evaluating Language Models, July 2024. URL https://github.com/WildEval/ZeroEval.
503 21 MAA. American Invitational Mathematics Examination - AIME 2024, February 2024. URL https://maa.org/math-competitions/american-invitational-mathematics-examination-aime.
504 22 OpenAI. Hello GPT-4o, 2024a. URL https://openai.com/index/hello-gpt-4o/.
505 23 OpenAI. Learning to reason with llms, 2024b. URL https://openai.com/index/learning-to-reason-with-llms/.
506 24 OpenAI. Introducing SimpleQA, 2024c. URL https://openai.com/index/introducing-simpleqa/.
507 25 OpenAI. Introducing SWE-bench Verified, 2024d. URL https://openai.com/index/introducing-swe-bench-verified/.
508 26 Qwen. Qwq: Reflect deeply on the boundaries of the unknown, 2024a. URL https://qwenlm.github.io/blog/qwq-32b-preview/.
509 27 Qwen. Qwen2.5: A party of foundation models, 2024b. URL https://qwenlm.github.io/blog/qwen2.5.
510 28 D. Rein, B. L. Hou, A. C. Stickland, J. Petty, R. Y. Pang, J. Dirani, J. Michael, and S. R. Bowman. GPQA: A graduate-level google-proof q&a benchmark. arXiv preprint arXiv:2311.12022, 2023.
511 29 Z. Shao, P. Wang, Q. Zhu, R. Xu, J. Song, M. Zhang, Y. Li, Y. Wu, and D. Guo. Deepseekmath: Pushing the limits of mathematical reasoning in open language models. arXiv preprint arXiv:2402.03300, 2024.
512 30 D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai, A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel, T. P. Lillicrap, K. Simonyan, and D. Hassabis. Mastering chess and shogi by self-play with a general reinforcement learning algorithm. CoRR, abs/1712.01815, 2017a. URL http://arxiv.org/abs/1712.01815.
513 31 D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, Y. Chen, T. P. Lillicrap, F. Hui, L. Sifre, G. van den Driessche, T. Graepel, and D. Hassabis. Mastering the game of go without human knowledge. Nat., 550(7676):354–359, 2017b. doi: 10.1038/NATURE24270. URL https://doi.org/10.1038/nature24270.
514 32 C. Snell, J. Lee, K. Xu, and A. Kumar. Scaling llm test-time compute optimally can be more effective than scaling model parameters, 2024. URL https://arxiv.org/abs/2408.03314.
515 33 J. Uesato, N. Kushman, R. Kumar, F. Song, N. Siegel, L. Wang, A. Creswell, G. Irving, and I. Higgins. Solving math word problems with process-and outcome-based feedback. arXiv preprint arXiv:2211.14275, 2022.
516 34 T. Trinh, Y. Wu, Q. Le, H. He, and T. Luong. Solving olympiad geometry without human demonstrations. Nature, 2024. doi: 10.1038/s41586-023-06747-5.
517 35 P. Wang, L. Li, Z. Shao, R. Xu, D. Dai, Y. Li, D. Chen, Y. Wu, and Z. Sui. Math-shepherd: A label-free step-by-step verifier for llms in mathematical reasoning. arXiv preprint arXiv:2312.08935, 2023.
518 36 X. Wang, J. Wei, D. Schuurmans, Q. Le, E. Chi, S. Narang, A. Chowdhery, and D. Zhou. Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171, 2022.
519 37 Y. Wang, X. Ma, G. Zhang, Y. Ni, A. Chandra, S. Guo, W. Ren, A. Arulraj, X. He, Z. Jiang, T. Li, M. Ku, K. Wang, A. Zhuang, R. Fan, X. Yue, and W. Chen. Mmlu-pro: A more robust and challenging multi-task language understanding benchmark. CoRR, abs/2406.01574, 2024. URL https://doi.org/10.48550/arXiv.2406.01574.
520 38 C. S. Xia, Y. Deng, S. Dunn, and L. Zhang. Agentless: Demystifying llm-based software engineering agents. arXiv preprint, 2024.
521 39 H. Xin, Z. Z. Ren, J. Song, Z. Shao, W. Zhao, H. Wang, B. Liu, L. Zhang, X. Lu, Q. Du, W. Gao, Q. Zhu, D. Yang, Z. Gou, Z. F. Wu, F. Luo, and C. Ruan. Deepseek-prover-v1.5: Harnessing proof assistant feedback for reinforcement learning and monte-carlo tree search, 2024. URL https://arxiv.org/abs/2408.08152.
522 40 J. Zhou, T. Lu, S. Mishra, S. Brahma, S. Basu, Y. Luan, D. Zhou, and L. Hou. Instruction-following evaluation for large language models. arXiv preprint arXiv:2311.07911, 2023.
523 12 0 [12] A. Fallahpour, V. Gureghian, G. J. Filion, A. B. Lindner, and A. Pandi. Codontransformer: A multispecies codon optimizer using context-aware neural networks. Nature Communications, 16(1), Apr 2025. doi: 10.1038/s41467-025-58588-7. https://qiita.com/kaizen_nagoya/items/d4be1d4dd9eb307f09cc (a minimal CAI sketch follows this entry's reference list)
524 1 Pechmann, S. & Frydman, J. Evolutionary conservation of codon optimality reveals hidden signatures of cotranslational folding. Nat. Struct. Mol. Biol. 20, 237–243 (2013).
525 2 Rocha, E. P. C. Codon usage bias from tRNA’s point of view: redundancy, specialization, and efficient decoding for translation optimization. Genome Res 14, 2279–2286 (2004).
526 3 Ran, W., Kristensen, D. M. & Koonin, E. V. Coupling between protein level selection and codon usage optimization in the evolution of bacteria and archaea. MBio 5, e00956–14 (2014).
527 4 Plotkin, J. B. & Kudla, G. Synonymous but not the same: the causes and consequences of codon bias. Nat. Rev. Genet. 12, 32–42 (2011).
528 5 Deng, Y., de Lima Hedayioglu, F., Kalfon, J., Chu, D. & von der Haar, T. Hidden patterns of codon usage bias across kingdoms. J. R. Soc. Interface 17, 20190819 (2020).
529 6 Mauro, V. P. Codon Optimization in the Production of Recombinant Biotherapeutics: Potential Risks and Considerations. BioDrugs 32, 69–81 (2018).
530 7 Mauro, V. P. & Chappell, S. A. A critical analysis of codon optimization in human therapeutics. Trends Mol. Med 20, 604–613 (2014).
531 8 Sharp, P. M. & Li, W. H. The codon Adaptation Index-a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15, 1281–1295 (1987).
532 9 Quax, T. E. F., Claassens, N. J., Söll, D. & van der Oost, J. Codon bias as a means to fine-tune gene expression. Mol. Cell 59, 149–161 (2015).
533 10 Brule, C. E. & Grayhack, E. J. Synonymous codons: choose wisely for expression. Trends Genet 33, 283–297 (2017).
534 11 Khakzad, H. et al. A new age in protein design empowered by deep learning. Cell Syst. 14, 925–939 (2023).
535 12 Watson, J. L. et al. De novo design of protein structure and function with RFdiffusion. Nature 620, 1089–1100 (2023).
536 13 Ingraham, J. B. et al. Illuminating protein space with a programmable generative model. Nature 623, 1070–1078 (2023).
537 14 Madani, A. et al. Large language models generate functional protein sequences across diverse families. Nat. Biotechnol. 41, 1099–1106 (2023).
538 15 Ferruz, N., Schmidt, S. & Höcker, B. ProtGPT2 is a deep unsupervised language model for protein design. Nat. Commun. 13, 4348 (2022).
539 16 Johnson, S. R. et al. Computational scoring and experimental evaluation of enzymes generated by neural networks. Nat. Biotechnol. https://doi.org/10.1038/s41587-024-02214-2 (2024).
540 17 UniProtKB/Swiss-Prot Release 2024_04 statistics. https://web.expasy.org/docs/relnotes/relstat.html.
541 18 Jain, R., Jain, A., Mauro, E., LeShane, K. & Densmore, D. ICOR: improving codon optimization with recurrent neural networks. BMC Bioinforma. 24, 132 (2023).
542 19 Yang, Q. et al. eRF1 mediates codon usage effects on mRNA translation efficiency through premature termination at rare codons. Nucleic Acids Res. 47, 9243–9258 (2019).
543 20 Angov, E., Legler, P. M. & Mease, R. M. Adjustment of codon usage frequencies by codon harmonization improves protein expression and folding. Methods Mol. Biol. 705, 1–13 (2011).
544 21 Claassens, N. J. et al. Improving heterologous membrane protein production in Escherichia coli by combining transcriptional tuning and codon usage algorithms. PLoS ONE 12, e0184355 (2017).
545 22 [No title]. https://mlcb.github.io/mlcb2019_proceedings/papers/paper_29.pdf.
546 23 Fu, H. et al. Codon optimization with deep learning to enhance protein expression. Sci. Rep. 10, 17617 (2020).
547 24 Constant, D. A. et al. Deep learning-based codon optimization with large-scale synonymous variant datasets enables generalized tunable protein expression. Preprint at bioRxiv https://doi.org/10.1101/2023.02.11.528149 (2023).
548 25 Sabath, N., Wagner, A. & Karlin, D. Evolution of viral proteins originated de novo by overprinting. Mol. Biol. Evol. 29, 3767–3780 (2012).
549 26 Cho, K., van Merrienboer, B., Bahdanau, D. & Bengio, Y. On the properties of neural machine translation: Encoder-decoder approaches. https://doi.org/10.48550/ARXIV.1409.1259 (2014).
550 27 Vaswani, A. et al. Attention is all you need. https://doi.org/10.48550/ARXIV.1706.03762 (2017).
551 28 Brown, T. B. et al. Language models are few-shot learners. https://doi.org/10.48550/ARXIV.2005.14165 (2020).
552 29 Zaheer, M. et al. Big bird: Transformers for longer sequences. https://doi.org/10.48550/ARXIV.2007.14062 (2020).
553 30 Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional Transformers for language understanding. https://doi.org/10.48550/ARXIV.1810.04805 (2018)
554 31 Ranaghan, M. J., Li, J. J., Laprise, D. M. & Garvie, C. W. Assessing optimal: inequalities in codon optimization algorithms. BMC Biol. 19, 36 (2021).
555 32 Chandra, S. et al. The high mutational sensitivity of CcdA antitoxin is linked to codon optimality. Mol. Biol. Evol. 39 (2022).
556 33 Liu, Y., Yang, Q. & Zhao, F. Synonymous But Not Silent: the Codon Usage Code for Gene Expression and Protein Folding. Annu. Rev. Biochem. 90, 375–401 (2021).
557 34 Liu, Y. A code within the genetic code: codon usage regulates co-translational protein folding. Cell Commun. Signal. 18, 1–9 (2020).
558 35 Lyu, X. & Liu, Y. Nonoptimal codon usage is critical for protein structure and function of the master general amino acid control regulator CPC-1. MBio 11 (2020).
559 36 Walsh, I. M., Bowman, M. A., Soto Santarriaga, I. F., Rodriguez, A. & Clark, P. L. Synonymous codon substitutions perturb cotranslational protein folding in vivo and impair cell fitness. Proc. Natl Acad. Sci. USA 117, 3528–3534 (2020).
560 37 Pechmann, S., Chartron, J. W. & Frydman, J. Local slowdown of translation by nonoptimal codons promotes nascent-chain recognition by SRP in vivo. Nat. Struct. Mol. Biol. 21, 1100–1105 (2014).
561 38 Zhou, T., Weems, M. & Wilke, C. O. Translationally optimal codons associate with structurally sensitive sites in proteins. Mol. Biol. Evol. 26, 1571–1580 (2009).
562 39 Zhou, M., Wang, T., Fu, J., Xiao, G. & Liu, Y. Nonoptimal codon usage influences protein structure in intrinsically disordered regions. Mol. Microbiol. 97, 974–987 (2015).
563 40 Zhou, M. et al. Non-optimal codon usage affects expression, structure and function of clock protein FRQ. Nature 495, 111–115 (2013).
564 41 Clarke, T. F. 4th & Clark, P. L. Rare codons cluster. PLoS ONE 3, e3412 (2008).
565 42 Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).
566 43 Real, R. & Vargas, J. M. The probabilistic basis of jaccard’s index of similarity. Syst. Biol. 45, 380–385 (1996).
567 44 Montgomery, K. T., Tardiff, J., Reid, L. M. & Krauter, K. S. Negative and positive cis-acting elements control the expression of murine alpha 1-protease inhibitor genes. Mol. Cell. Biol. 10, 2625–2637 (1990).
568 45 Medina-Muñoz, S. G. et al. Crosstalk between codon optimality and cis-regulatory elements dictates mRNA stability. Genome Biol. 22, 14 (2021).
569 46 Shabalina, S. A., Spiridonov, N. A. & Kashina, A. Sounds of silence: synonymous nucleotides as a key to biological regulation and complexity. Nucleic Acids Res. 41, 2073–2094 (2013).
570 47 Nuryana, I. et al. Codon optimization of a gene encoding DNA polymerase from Pyrococcus furiosus and its expression in Escherichia coli. J. Genet. Eng. Biotechnol. 21, 129 (2023).
571 48 Moss, M. J., Chamness, L. M. & Clark, P. L. The effects of codon usage on protein structure and folding. Annu. Rev. Biophys. 53, 87–108 (2024).
572 49 Barrington, C. L. et al. Synonymous codon usage regulates translation initiation. Cell Rep. 42, 113413 (2023).
573 50 Outeiral, C. & Deane, C. M. Codon language embeddings provide strong signals for use in protein engineering. Nat. Mach. Intell. 6, 170–179 (2024).
574 51 Lin, B. C., Kaissarian, N. M. & Kimchi-Sarfaty, C. Implementing computational methods in tandem with synonymous gene recoding for therapeutic development. Trends Pharmacol. Sci. 44, 73–84 (2023).
575 52 Bio.Data.CodonTable module—Biopython 1.75 documentation. https://biopython.org/docs/1.75/api/Bio.Data.CodonTable.html.
576 53 Fallahpour, A., Alinoori, M., Afkanpour, A. & Krishnan, A. EHRMamba: towards generalizable and scalable foundation models for Electronic Health Records. https://doi.org/10.48550/ARXIV.2405.14567 (2024).
577 54 Wolf, T. et al. HuggingFace’s transformers: State-of-the-art natural language processing. https://doi.org/10.48550/ARXIV.1910.03771 (2019).
578 55 Lee, B. D. Python Implementation of Codon Adaptation Index. J. Open Source Softw. 3, 905 (2018).
579 56 Codon Usage Database. https://www.kazusa.or.jp/codon/.
580 57 Sakoe, H. & Chiba, S. Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoust. 26, 43–49 (1978).
581 58 Dynamic Time Warping. in Information Retrieval for Music and Motion, 69–84 (Springer Berlin Heidelberg, Berlin, Heidelberg, 2007).
582 59 Giorgino, T. Computing and visualizing dynamic time warping alignments in R: The dtw Package. J. Stat. Softw. 31, 1–24 (2009).
583 60 Górska, A., Plucinska, M., Pedersen, L., Kielpinski, L., Tehler, D. & Hagedorn, P. XNAString: efficient manipulation of modified oligonucleotide sequences. R package version 1.14.0. https://doi.org/10.18129/B9.BIOC.XNASTRING. (2024).
584 61 Fallahpour, A. et al. CodonTransformer: a multispecies codon optimizer using context-aware neural networks. Adibvafa/CodonTransformer. https://doi.org/10.5281/ZENODO.15000833 (Zenodo, 2025).
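Refs 8 and 55 above concern the Codon Adaptation Index (CAI), the main codon-optimality score used in this paper. As a reading aid, here is a minimal sketch of the Sharp & Li definition: each codon's relative adaptiveness w is its frequency divided by that of the most frequent synonymous codon, and the CAI of a sequence is the geometric mean of the w values. The usage table below is invented for illustration; real tables come from a reference gene set such as the Kazusa database (ref 56).

```python
import math

# Toy codon-usage counts grouped by amino acid (invented numbers, for illustration only).
USAGE = {
    "Leu": {"CTG": 500, "CTC": 200, "TTA": 50},
    "Lys": {"AAA": 300, "AAG": 700},
}

# Relative adaptiveness: w(codon) = count(codon) / count(most frequent synonym).
W = {c: n / max(fam.values()) for fam in USAGE.values() for c, n in fam.items()}

def cai(codons):
    """Codon Adaptation Index: geometric mean of w (Sharp & Li, 1987)."""
    ws = [W[c] for c in codons]
    return math.exp(sum(math.log(w) for w in ws) / len(ws))

print(round(cai(["CTG", "AAG", "CTC"]), 3))  # 0.737 with this toy table
```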
585 13 0 [13] A. Fallahpour, J. Ma, A. Munim, H. Lyu, and B. Wang. Medrax: Medical reasoning agent for chest x-ray, 2025. URL https://arxiv.org/abs/2502.02673.
586 1 Adibi, A., Cao, X., Ji, Z., Kaur, J. N., Chen, W., Healey, E., Nuwagira, B., Ye, W., Woollard, G., Xu, M. A., Cui, H., Xi, J., Chang, T., Bikia, V., Zhang, N., Noori, A., Xia, Y., Hossain, M. B., Frank, H. A., Peluso, A., Pu, Y., Shen, S. Z., Wu, J., Fallahpour, A., Mahbub, S., Duncan, R., Zhang, Y., Cao, Y., Xu, Z., Craig, M., Krishnan, R. G., Beheshti, R., Rehg, J. M., Karim, M. E., Coffee, M., Celi, L. A., Fries, J. A., Sadatsafavi, M., Shung, D., McWeeney, S., Dafflon, J., and Jabbour, S. Recent advances, applications and open challenges in machine learning for health: Reflections from research roundtables at ml4h 2024 symposium, 2025.
587 2 Ahn, J. S., Ebrahimian, S., McDermott, S., Lee, S., Naccarato, L., Di Capua, J. F., Wu, M. Y., Zhang, E. W., Muse, V., Miller, B., et al. Association of artificial intelligence–aided chest radiograph interpretation with reader performance and efficiency. JAMA Network Open, 5(8):e2229289–e2229289, 2022.
588 3 Baghbanzadeh, N., Fallahpour, A., Parhizkar, Y., Ogidi, F., Roy, S., Ashkezari, S., Khazaie, V. R., Colacci, M., Etemad, A., Afkanpour, A., and Dolatabadi, E. Advancing medical representation learning through high-quality data, 2025.
589 4 Bahl, S., Ramzan, T., and Maraj, R. Interpretation and documentation of chest x-rays in the acute medical unit. Clinical Medicine, 20(2):s73, 2020.
590 5 Baltruschat, I., Steinmeister, L., Nickisch, H., Saalbach, A., Grass, M., Adam, G., Knopp, T., and Ittrich, H. Smart chest x-ray worklist prioritization using artificial intelligence: a clinical workflow simulation. European radiology, 31:3837–3845, 2021.
591 6 Bannur, S., Bouzid, K., Castro, D. C., Schwaighofer, A., Thieme, A., Bond-Taylor, S., Ilse, M., Pérez-García, F., Salvatelli, V., Sharma, H., Meissen, F., Ranjit, M., Srivastav, S., Gong, J., Codella, N. C. F., Falck, F., Oktay, O., Lungren, M. P., Wetscherek, M. T., Alvarez-Valle, J., and Hyland, S. L. Maira-2: Grounded radiology report generation, 2024.
592 7 Bansal, H., Israel, D., Zhao, S., Li, S., Nguyen, T., and Grover, A. Medmax: Mixed-modal instruction tuning for training biomedical assistants, 2024.
593 8 Chambon, P., Bluethgen, C., Delbrouck, J.-B., der Sluijs, R. V., Połacin, M., Chaves, J. M. Z., Abraham, T. M., Purohit, S., Langlotz, C. P., and Chaudhari, A. Roentgen: Vision-language foundation model for chest x-ray generation, 2022.
594 9 Chambon, P., Delbrouck, J.-B., Sounack, T., Huang, S.-C., Chen, Z., Varma, M., Truong, S. Q., Chuong, C. T., and Langlotz, C. P. Chexpert plus: Augmenting a large chest x-ray dataset with text radiology reports, patient demographics and additional image formats, 2024.
595 10 Chen, Z., Varma, M., Delbrouck, J.-B., Paschali, M., Blankemeier, L., Van Veen, D., Valanarasu, J. M. J., Youssef, A., Cohen, J. P., Reis, E. P., et al. Chexagent: Towards a foundation model for chest x-ray interpretation. arXiv preprint arXiv:2401.12208, 2024a.
597 12 Chen, Z., Varma, M., Xu, J., Paschali, M., Veen, D. V., Johnston, A., Youssef, A., Blankemeier, L., Bluethgen, C., Altmayer, S., Valanarasu, J. M. J., Muneer, M. S. E., Reis, E. P., Cohen, J. P., Olsen, C., Abraham, T. M., Tsai, E. B., Beaulieu, C. F., Jitsev, J., Gatidis, S., Delbrouck, J.-B., Chaudhari, A. S., and Langlotz, C. P. A vision-language foundation model to enhance efficiency of chest x-ray interpretation, 2024b.
598 13 Cohen, J. P., Hashir, M., Brooks, R., and Bertrand, H. On the limits of cross-domain generalization in automated x-ray prediction. In Medical Imaging with Deep Learning, 2020.
599 14 Cohen, J. P., Viviano, J. D., Bertin, P., Morrison, P., Torabian, P., Guarrera, M., Lungren, M. P., Chaudhari, A., Brooks, R., Hashir, M., and Bertrand, H. TorchXRayVision: A library of chest X-ray datasets and models. In Medical Imaging with Deep Learning, 2022.
600 15 Erdal, B. S., Gupta, V., Demirer, M., Fair, K. H., White, R. D., Blair, J., Deichert, B., Lafleur, L., Qin, M. M., Bericat, D., and Genereaux, B. Integration and implementation strategies for ai algorithm deployment with smart routing rules and workflow management, 2023.
601 16 Eriksen, A. V., Möller, S., and Ryg, J. Use of gpt-4 to diagnose complex clinical cases, 2024.
602 17 Fallahpour, A., Alinoori, M., Ye, W., Cao, X., Afkanpour, A., and Krishnan, A. Ehrmamba: Towards generalizable and scalable foundation models for electronic health records, 2024.
603 18 Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X., et al. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning. arXiv preprint arXiv:2501.12948, 2025.
604 19 Huang, J., Neill, L., Wittbrodt, M., Melnick, D., Klug, M., Thompson, M., Bailitz, J., Loftus, T., Malik, S., Phull, A., et al. Generative artificial intelligence for chest radiograph interpretation in the emergency department. JAMA network open, 6(10):e2336100–e2336100, 2023.
605 20 Hyland, S. L., Bannur, S., Bouzid, K., Castro, D. C., Ranjit, M., Schwaighofer, A., Pérez-García, F., Salvatelli, V., Srivastav, S., Thieme, A., Codella, N., Lungren, M. P., Wetscherek, M. T., Oktay, O., and Alvarez-Valle, J. Maira-1: A specialised large multimodal model for radiology report generation, 2024.
606 21 Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., Seekins, J., Mong, D. A., Halabi, S. S., Sandberg, J. K., Jones, R., Larson, D. B., Langlotz, C. P., Patel, B. N., Lungren, M. P., and Ng, A. Y. Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison, 2019.
607 22 Jaech, A., Kalai, A., Lerer, A., Richardson, A., El-Kishky, A., Low, A., Helyar, A., Madry, A., Beutel, A., Carney, A., et al. Openai o1 system card. arXiv preprint arXiv:2412.16720, 2024.
608 23 Javan, R., Kim, T., and Mostaghni, N. Gpt-4 vision: Multimodal evolution of chatgpt and potential role in radiology. Cureus, 16(8):e68298, 2024.
609 24 Jiang, Y., Black, K. C., Geng, G., Park, D., Ng, A. Y., and Chen, J. H. Medagentbench: Dataset for benchmarking llms as agents in medical applications, 2025.
610 25 Jimenez, C. E., Yang, J., Wettig, A., Yao, S., Pei, K., Press, O., and Narasimhan, K. Swe-bench: Can language models resolve real-world github issues? arXiv preprint arXiv:2310.06770, 2023.
611 26 Kim, Y., Park, C., Jeong, H., Chan, Y. S., Xu, X., McDuff, D., Lee, H., Ghassemi, M., Breazeal, C., and Park, H. W. Mdagents: An adaptive collaboration of llms for medical decision-making. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.
612 27 Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A. C., Lo, W.-Y., Dollár, P., and Girshick, R. Segment anything, 2023.
613 28 Li, B., Yan, T., Pan, Y., Luo, J., Ji, R., Ding, J., Xu, Z., Liu, S., Dong, H., Lin, Z., et al. Mmedagent: Learning to use medical tools with multi-modal agent. arXiv preprint arXiv:2407.02483, 2024a.
614 29 Li, C., Wong, C., Zhang, S., Usuyama, N., Liu, H., Yang, J., Naumann, T., Poon, H., and Gao, J. Llava-med: Training a large language-and-vision assistant for biomedicine in one day. Advances in Neural Information Processing Systems, 36, 2024b.
615 30 Lian, J., Liu, J., Zhang, S., Gao, K., Liu, X., Zhang, D., and Yu, Y. A Structure-Aware Relation Network for Thoracic Diseases Detection and Segmentation. IEEE Transactions on Medical Imaging, 2021. doi: 10.48550/arxiv.2104.10326.
616 31 Liu, B., Zhan, L.-M., Xu, L., Ma, L., Yang, Y., and Wu, X.-M. Slake: A semantically-labeled knowledge-enhanced dataset for medical visual question answering, 2021.
617 32 Liu, X., Yu, H., Zhang, H., Xu, Y., Lei, X., Lai, H., Gu, Y., Ding, H., Men, K., Yang, K., et al. Agentbench: Evaluating llms as agents. arXiv preprint arXiv:2308.03688, 2023.
618 33 Ma, J., He, Y., Li, F., Han, L., You, C., and Wang, B. Segment anything in medical images. Nature Communications, 15(1), January 2024. ISSN 2041-1723. doi:10.1038/s41467-024-44824-z.
619 34 Ma, J., Yang, Z., Kim, S., Chen, B., Baharoon, M., Fallahpour, A., Asakereh, R., Lyu, H., and Wang, B. Medsam2: Segment anything in 3d medical images and videos, 2025.
620 35 Masterman, T., Besen, S., Sawtell, M., and Chao, A. The landscape of emerging ai agent architectures for reasoning, planning, and tool calling: A survey. arXiv preprint arXiv:2404.11584, 2024.
621 36 Nori, H., King, N., McKinney, S. M., Carignan, D., and Horvitz, E. Capabilities of gpt-4 on medical challenge problems. arXiv preprint arXiv:2303.13375, 2023.
622 37 Ouis, M. Y. and Akhloufi, M. A. Chestbiox-gen: contextual biomedical report generation from chest x-ray images using biogpt and co-attention mechanism. Frontiers in Imaging, 3:1373420, 2024.
623 38 Park, J., Kim, S., Yoon, B., Hyun, J., and Choi, K. M4cxr: Exploring multi-task potentials of multi-modal large language models for chest x-ray interpretation, 2024.
624 39 Pellegrini, C., Keicher, M., Ozsoy, E., and Navab, N. Radrestruct: A novel vqa benchmark and method for structured radiology reporting, 2023.
625 40 Pellegrini, C., Ozsoy, E., Busam, B., Navab, N., and Keicher, M. Radialog: A large vision-language model for radiology report generation and conversational assistance, 2025.
626 41 Pham, H. H., Nguyen, H. Q., Nguyen, H. T., Le, L. T., and Khanh, L. An accurate and explainable deep learning system improves interobserver agreement in the interpretation of chest radiograph. IEEE Access, 10:104512–104531, 2022.
627 42 Schmidgall, S., Ziaei, R., Harris, C., Reis, E., Jopling, J., and Moor, M. Agentclinic: a multimodal agent benchmark to evaluate ai in simulated clinical environments, 2024.
628 43 Shin, H. J., Han, K., Ryu, L., and Kim, E.-K. The impact of artificial intelligence on the reading times of radiologists for chest radiographs. NPJ Digital Medicine, 6(1):82, 2023.
629 44 Tanno, R., Barrett, D. G., Sellergren, A., Ghaisas, S., Dathathri, S., See, A., Welbl, J., Lau, C., Tu, T., Azizi, S., et al. Collaboration between clinicians and vision–language models in radiology report generation. Nature Medicine, pp. 1–10, 2024.
630 45 Tu, T., Azizi, S., Driess, D., Schaekermann, M., Amin, M., Chang, P.-C., Carroll, A., Lau, C., Tanno, R., Ktena, I., Mustafa, B., Chowdhery, A., Liu, Y., Kornblith, S., Fleet, D., Mansfield, P., Prakash, S., Wong, R., Virmani, S., Semturs, C., Mahdavi, S. S., Green, B., Dominowska, E., y Arcas, B. A., Barral, J., Webster, D., Corrado, G. S., Matias, Y., Singhal, K., Florence, P., Karthikesalingam, A., and Natarajan, V. Towards generalist biomedical ai, 2023.
631 46 United Nations Scientific Committee on the Effects of Atomic Radiation. Sources, Effects and Risks of Ionizing Radiation: UNSCEAR 2020/2021 Report, Volume I. United Nations, New York, 2022. ISBN 978-92-1-139206-7.
632 47 Wu, C., Zhang, X., Zhang, Y., Wang, Y., and Xie, W. Towards generalist foundation model for radiology by leveraging web-scale 2d and 3d medical data, 2023.
633 48 Xi, Z., Chen, W., Guo, X., He, W., Ding, Y., Hong, B., Zhang, M., Wang, J., Jin, S., Zhou, E., et al. The rise and potential of large language model based agents: A survey. Science China Information Sciences, 68(2):121101, 2025.
634 49 Yan, Z., Zhang, K., Zhou, R., He, L., Li, X., and Sun, L. Multimodal chatgpt for medical applications: an experimental study of gpt-4v. arXiv preprint arXiv:2310.19061, 2023.
635 50 Yang, H. M., Duan, T., Ding, D., Bagul, A., Langlotz, C., Shpanskaya, K., et al. Chexnet: radiologist-level pneumonia detection on chest x-rays with deep learning. arXiv preprint arXiv:1711.05225, 2017.
636 51 Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., and Cao, Y. React: Synergizing reasoning and acting in language models, 2023.
637 52 Yin, G., Bai, H., Ma, S., Nan, F., Sun, Y., Xu, Z., Ma, S., Lu, J., Kong, X., Zhang, A., et al. Mmau: A holistic benchmark of agent capabilities across diverse domains. arXiv preprint arXiv:2407.18961, 2024.
638 53 Zambrano Chaves, J., Huang, S.-C., Xu, Y., Xu, H., Usuyama, N., Zhang, S., et al. Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation. arXiv preprint arXiv:2403.08002, 2024.
639 54 Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. Pyramid scene parsing network, 2017.
640 55 Zhao, P., Jin, Z., and Cheng, N. An in-depth survey of large language model-based artificial intelligence agents. arXiv preprint arXiv:2309.14365, 2023.
641 14 0 [14] H. Feng, L. Wu, B. Zhao, C. Huff, J. Zhang, J. Wu, L. Lin, P. Wei, and C. Wu. Benchmarking dna foundation models for genomic sequence classification. bioRxiv preprint. doi: 10.1101/2024.08.16.608288. URL https://doi.org/10.1101/2024.08.16.608288. https://qiita.com/kaizen_nagoya/items/01e3dde0d8274fee0fd8 (a minimal MCC sketch follows this entry's reference list)
642 1 [1] OpenAI et al. GPT-4 Technical Report. Preprint at https://doi.org/10.48550/arXiv.2303.08774 (2024).
643 2 [2] Touvron, H. et al. Llama 2: Open Foundation and Fine-Tuned Chat Models. Preprint at https://doi.org/10.48550/arXiv.2307.09288 (2023).
644 3 [3] Jiang, A. Q. et al. Mistral 7B. Preprint at https://doi.org/10.48550/arXiv.2310.06825 (2023).
645 4 [4] Chen, M. et al. Evaluating Large Language Models Trained on Code. Preprint at https://doi.org/10.48550/arXiv.2107.03374 (2021).
646 5 [5] Madani, A. et al. Large language models generate functional protein sequences across diverse families. Nat Biotechnol 41, 1099–1106 (2023).
647 6 [6] Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat Methods 1–11 (2024) doi:10.1038/s41592-024-02201-0.
648 7 [7] Lin, Z. et al. Language models of protein sequences at the scale of evolution enable accurate structure prediction. Preprint at https://doi.org/10.1101/2022.07.20.500902 (2022).
649 8 [8] Gershman, A. et al. Epigenetic patterns in a complete human genome. Science 376, eabj5089 (2022).
650 9 [9] Wang, G. et al. Understanding Transcription Factor Regulation by Integrating Gene Expression and DNase I Hypersensitive Sites. Biomed Res Int 2015, 757530 (2015).
651 10 [10] Zhou, Z. et al. DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome. Preprint at https://doi.org/10.48550/arXiv.2306.15006 (2024).
652 11 [11] Dalla-Torre, H. et al. The Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics. Preprint at https://doi.org/10.1101/2023.01.11.523679 (2023).
653 12 [12] Nguyen, E. et al. HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution. Preprint at https://doi.org/10.48550/arXiv.2306.15794 (2023).
654 13 [13] Genome Reference Consortium. Genome reference consortium human build 38 (grch38). National Center for Biotechnology Information, 2013. URL https://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.26/
655 14 [14] M. Byrska-Bishop, U. S. Evani, X. Zhao, A. O. Basile, H. J. Abel, A. A. Regier, A. Corvelo, W. E. Clarke, R. Musunuri, K. Nagulapalli, et al., “High-coverage whole-genome sequencing of the expanded 1000 genomes project cohort including 602 trios,” Cell, vol. 185, no. 18, pp. 3426– 3440, 2022.
656 15 [15] Hu, E. J. et al. LoRA: Low-Rank Adaptation of Large Language Models. Preprint at https://doi.org/10.48550/arXiv.2106.09685 (2021).
657 16 [16] Liu, H. et al. Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning. Preprint at https://doi.org/10.48550/arXiv.2205.05638 (2022).
658 17 [17] Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Preprint at https://doi.org/10.48550/arXiv.1810.04805 (2019).
659 18 [18] Xu, H., Jia, P. & Zhao, Z. Deep4mC: systematic assessment and computational prediction for DNA N4-methylcytosine sites by deep learning. Briefings in Bioinformatics 22, bbaa099 (2021).
660 19 [19] Liu, B., Long, R. & Chou, K.-C. iDHS-EL: identifying DNase I hypersensitive sites by fusing three different modes of pseudo nucleotide composition into an ensemble learning framework. Bioinformatics 32, 2411–2418 (2016).
661 20 [20] Jin, J. et al. iDNA-ABF: multi-scale deep biological language learning model for the interpretable prediction of DNA methylations. Genome Biology 23, 219 (2022).
662 21 [21] Zhang, P., Zhang, H. & Wu, H. iPro-WAEL: a comprehensive and robust framework for identifying promoters in multiple species. Nucleic Acids Research 50, 10278–10289 (2022).
663 22 [22] Grinsztajn, L., Oyallon, E. & Varoquaux, G. Why do tree-based models still outperform deep learning on typical tabular data? Advances in Neural Information Processing Systems 35, 507–520 (2022).
664 23 [23] Gillioz, A., Casas, J., Mugellini, E. & Khaled, O. A. Overview of the Transformer-based Models for NLP Tasks. in Annals of Computer Science and Information Systems vol. 21 179–183 (2020).
665 24 [24] Zhang, H. & Shafiq, M. O. Survey of transformers and towards ensemble learning using transformers for natural language processing. Journal of Big Data 11, 25 (2024).
666 25 [25] Yang, X., Huang, J. Y., Zhou, W. & Chen, M. Parameter-Efficient Tuning with Special Token Adaptation. Preprint at https://doi.org/10.48550/arXiv.2210.04382 (2023).
667 26 [26] Hubert, L. & Arabie, P. Comparing partitions. Journal of Classification 2, 193–218 (1985).
668 27 [27] Vinh, N. X., Epps, J. & Bailey, J. Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance. Journal of Machine Learning Research 11, 2837–2854 (2010).
669 28 [28] Rousseeuw, P. J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics 20, 53–65 (1987).
670 29 [29] Matthews, B. W. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA) - Protein Structure 405, 442–451 (1975).
671 30 [30] Marin, F. I. et al. BEND: Benchmarking DNA Language Models on biologically meaningful tasks. Preprint at https://doi.org/10.48550/arXiv.2311.12570 (2024).
672 31 [31] Lester, B., Al-Rfou, R. & Constant, N. The Power of Scale for Parameter-Efficient Prompt Tuning. Preprint at https://doi.org/10.48550/arXiv.2104.08691 (2021).
673 32 [32] Chen, T. & Guestrin, C. XGBoost: A Scalable Tree Boosting System. in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (Association for Computing Machinery, New York, NY, USA, 2016). doi:10.1145/2939672.2939785.
674 33 [33] Breiman, L. Random Forests. Machine Learning 45, 5–32 (2001).
675 34 [34] DeLong, Elizabeth R., et al. “Comparing the Areas under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach.” Biometrics, vol. 44, no. 3, 1988, pp. 837–45. JSTOR, https://doi.org/10.2307/2531595.
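Ref 29 above (Matthews, 1975) is the source of the Matthews correlation coefficient used as a headline metric in this benchmarking paper. A minimal sketch of the binary-classification form, straight from the confusion-matrix definition:

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion counts."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0  # convention: 0 when any margin is empty

print(round(mcc(tp=90, tn=80, fp=20, fn=10), 3))  # 0.704
```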
676 15 0 [15] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen. Lora: Low-rank adaptation of large language models, 2021. URL https://arxiv.org/abs/2106.09685. https://qiita.com/kaizen_nagoya/items/877058f681d77808b44c (a minimal LoRA sketch follows this entry's reference list)
677 Armen Aghajanyan, Luke Zettlemoyer, and Sonal Gupta. Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning. arXiv:2012.13255 [cs], December 2020. URL http://arxiv.org/abs/2012.13255.
678 Zeyuan Allen-Zhu and Yuanzhi Li. What Can ResNet Learn Efficiently, Going Beyond Kernels? In NeurIPS, 2019. Full version available at http://arxiv.org/abs/1905.10337.
679 Zeyuan Allen-Zhu and Yuanzhi Li. Backward feature correction: How deep learning performs deep learning. arXiv preprint arXiv:2001.04413, 2020a.
680 Zeyuan Allen-Zhu and Yuanzhi Li. Feature purification: How adversarial training performs robust deep learning. arXiv preprint arXiv:2005.10190, 2020b.
681 Zeyuan Allen-Zhu, Yuanzhi Li, and Zhao Song. A convergence theory for deep learning via overparameterization. In ICML, 2019. Full version available at http://arxiv.org/abs/1811.03962.
682 Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton. Layer normalization, 2016.
683 Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language Models are Few-Shot Learners. arXiv:2005.14165[cs], July 2020. URL http://arxiv.org/abs/2005.14165.
684 Jian-Feng Cai, Emmanuel J. Candès, and Zuowei Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4):1956–1982, 2010.
685 Daniel Cer, Mona Diab, Eneko Agirre, Inigo Lopez-Gazpio, and Lucia Specia. Semeval-2017 task 1: Semantic textual similarity multilingual and crosslingual focused evaluation. Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), 2017. doi: 10.18653/v1/s17-2001. URL http://dx.doi.org/10.18653/v1/S17-2001.
686 Ronan Collobert and Jason Weston. A unified architecture for natural language processing: deep neural networks with multitask learning. In Proceedings of the 25th international conference on Machine learning, ICML ’08, pp. 160–167, New York, NY, USA, July 2008. Association for Computing Machinery. ISBN 978-1-60558-205-4. doi: 10.1145/1390156.1390177. URL https://doi.org/10.1145/1390156.1390177.
687 Misha Denil, Babak Shakibi, Laurent Dinh, Marc’Aurelio Ranzato, and Nando de Freitas. Predicting parameters in deep learning, 2014.
688 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding, 2019a.
689 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs], May 2019b. URL http://arxiv.org/abs/1810.04805. arXiv: 1810.04805.
690 William B. Dolan and Chris Brockett. Automatically constructing a corpus of sentential paraphrases. In Proceedings of the Third International Workshop on Paraphrasing (IWP2005), 2005. URL https://aclanthology.org/I05-5002.
691 Claire Gardent, Anastasia Shimorina, Shashi Narayan, and Laura Perez-Beltrachini. The webnlg challenge: Generating text from rdf data. In Proceedings of the 10th International Conference on Natural Language Generation, pp. 124–133, 2017.
692 Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, and Andrea Montanari. When do neural networks outperform kernel methods? arXiv preprint arXiv:2006.13409, 2020.
693 Bogdan Gliwa, Iwona Mochol, Maciej Biesek, and Aleksander Wawer. Samsum corpus: A human-annotated dialogue dataset for abstractive summarization. CoRR, abs/1911.12237, 2019. URL http://arxiv.org/abs/1911.12237.
694 Lars Grasedyck, Daniel Kressner, and Christine Tobler. A literature survey of low-rank tensor approximation techniques. GAMM-Mitteilungen, 36(1):53–78, 2013.
695 Jihun Ham and Daniel D. Lee. Grassmann discriminant analysis: a unifying view on subspace-based learning. In ICML, pp. 376–383, 2008. URL https://doi.org/10.1145/1390156.1390204.
696 Karen Hambardzumyan, Hrant Khachatrian, and Jonathan May. WARP: Word-level Adversarial ReProgramming. arXiv:2101.00121 [cs], December 2020. URL http://arxiv.org/abs/2101.00121. arXiv: 2101.00121.
697 Pengcheng He, Xiaodong Liu, Jianfeng Gao, and Weizhu Chen. Deberta: Decoding-enhanced bert with disentangled attention, 2021.
698 Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, and Sylvain Gelly. Parameter-Efficient Transfer Learning for NLP. arXiv:1902.00751 [cs, stat], June 2019. URL http://arxiv.org/abs/1902.00751.
699 Max Jaderberg, Andrea Vedaldi, and Andrew Zisserman. Speeding up convolutional neural networks with low rank expansions. arXiv preprint arXiv:1405.3866, 2014.
700 Mikhail Khodak, Neil Tenenholtz, Lester Mackey, and Nicolò Fusi. Initialization and regularization of factorized neural layers, 2021.
701 Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization, 2017.
702 Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. Gshard: Scaling giant models with conditional computation and automatic sharding, 2020.
703 Brian Lester, Rami Al-Rfou, and Noah Constant. The Power of Scale for Parameter-Efficient Prompt Tuning. arXiv:2104.08691 [cs], April 2021. URL http://arxiv.org/abs/2104.08691. arXiv: 2104.08691.
704 Chunyuan Li, Heerad Farkhoor, Rosanne Liu, and Jason Yosinski. Measuring the Intrinsic Dimension of Objective Landscapes. arXiv:1804.08838 [cs, stat], April 2018a. URL http://arxiv.org/abs/1804.08838. arXiv: 1804.08838.
705 Xiang Lisa Li and Percy Liang. Prefix-Tuning: Optimizing Continuous Prompts for Generation. arXiv:2101.00190 [cs], January 2021. URL http://arxiv.org/abs/2101.00190.
706 Yuanzhi Li and Yingyu Liang. Learning overparameterized neural networks via stochastic gradient descent on structured data. In Advances in Neural Information Processing Systems, 2018.
707 Yuanzhi Li, Yingyu Liang, and Andrej Risteski. Recovery guarantee of weighted low-rank approximation via alternating minimization. In International Conference on Machine Learning, pp.2358–2367. PMLR, 2016.
708 Yuanzhi Li, Tengyu Ma, and Hongyang Zhang. Algorithmic regularization in over-parameterized matrix sensing and neural networks with quadratic activations. In Conference On Learning Theory, pp. 2–47. PMLR, 2018b.
709 Zhaojiang Lin, Andrea Madotto, and Pascale Fung. Exploring versatile generative language model via parameter-efficient transfer learning. In Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 441–459, Online, November 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.findings-emnlp.41. URL https://aclanthology.org/2020.findings-emnlp.41.
710 Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, and Jie Tang. GPT Understands, Too. arXiv:2103.10385 [cs], March 2021. URL http://arxiv.org/abs/2103.10385. arXiv: 2103.10385.
711 Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. Roberta: A robustly optimized bert pretraining approach, 2019.
712 Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
713 Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization, 2019.
714 Rabeeh Karimi Mahabadi, James Henderson, and Sebastian Ruder. Compacter: Efficient low-rank hypercomplex adapter layers, 2021.
715 Linyong Nan, Dragomir Radev, Rui Zhang, Amrit Rau, Abhinand Sivaprasad, Chiachun Hsieh, Xiangru Tang, Aadit Vyas, Neha Verma, Pranav Krishna, et al. Dart: Open-domain structured data record to text generation. arXiv preprint arXiv:2007.02871, 2020.
716 Jekaterina Novikova, Ondřej Dušek, and Verena Rieser. The e2e dataset: New challenges for end-to-end generation. arXiv preprint arXiv:1706.09254, 2017.
717 Samet Oymak, Zalan Fabian, Mingchen Li, and Mahdi Soltanolkotabi. Generalization guarantees for neural networks via harnessing the low-rank structure of the jacobian. arXiv preprint arXiv:1906.05392, 2019.
718 Jonas Pfeiffer, Aishwarya Kamath, Andreas Rücklé, Kyunghyun Cho, and Iryna Gurevych. Adapterfusion: Non-destructive task composition for transfer learning, 2021.
719 Daniel Povey, Gaofeng Cheng, Yiming Wang, Ke Li, Hainan Xu, Mahsa Yarmohammadi, and Sanjeev Khudanpur. Semi-orthogonal low-rank matrix factorization for deep neural networks. In Interspeech, pp. 3743–3747, 2018.
720 Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. Improving Language Understanding by Generative Pre-Training. pp. 12, a.
721 Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. Language Models are Unsupervised Multitask Learners. pp. 24, b.
722 Pranav Rajpurkar, Robin Jia, and Percy Liang. Know what you don’t know: Unanswerable questions for squad. CoRR, abs/1806.03822, 2018. URL http://arxiv.org/abs/1806.03822.
723 Sylvestre-Alvise Rebuffi, Hakan Bilen, and Andrea Vedaldi. Learning multiple visual domains with residual adapters. arXiv:1705.08045 [cs, stat], November 2017. URL http://arxiv.org/abs/1705.08045. arXiv: 1705.08045.
724 Andreas Rücklé, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, and Iryna Gurevych. Adapterdrop: On the efficiency of adapters in transformers, 2020.
725 Tara N Sainath, Brian Kingsbury, Vikas Sindhwani, Ebru Arisoy, and Bhuvana Ramabhadran. Low-rank matrix factorization for deep neural network training with high-dimensional output targets. In 2013 IEEE international conference on acoustics, speech and signal processing, pp. 6655–6659. IEEE, 2013.
726 Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, and Bryan Catanzaro. Megatron-lm: Training multi-billion parameter language models using model parallelism, 2020.
727 Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Ng, and Christopher Potts. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1631–1642, Seattle, Washington, USA, October 2013. Association for Computational Linguistics. URL https://aclanthology.org/D13-1170.
728 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6000–6010, 2017.
729 Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. Glue: A multi-task benchmark and analysis platform for natural language understanding, 2019.
730 Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. Superglue: A stickier benchmark for general-purpose language understanding systems, 2020.
731 Alex Warstadt, Amanpreet Singh, and Samuel R Bowman. Neural network acceptability judgments. arXiv preprint arXiv:1805.12471, 2018.
732 Adina Williams, Nikita Nangia, and Samuel Bowman. A broad-coverage challenge corpus for sentence understanding through inference. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 1112–1122, New Orleans, Louisiana, June 2018. Association for Computational Linguistics. doi: 10.18653/v1/N18-1101. URL https://www.aclweb.org/anthology/N18-1101.
733 Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander M. Rush. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45, Online, October 2020. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/2020.emnlp-demos.6.
734 Greg Yang and Edward J. Hu. Feature Learning in Infinite-Width Neural Networks. arXiv:2011.14522 [cond-mat], May 2021. URL http://arxiv.org/abs/2011.14522. arXiv: 2011.14522.
735 Elad Ben Zaken, Shauli Ravfogel, and Yoav Goldberg. Bitfit: Simple parameter-efficient fine-tuning for transformer-based masked language-models, 2021.
736 Yu Zhang, Ekapol Chuangsuwanich, and James Glass. Extracting deep neural network bottleneck features using low-rank matrix factorization. In 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp. 185–189. IEEE, 2014.
737 Yong Zhao, Jinyu Li, and Yifan Gong. Low-rank plus diagonal adaptation for deep neural networks. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5005–5009. IEEE, 2016.
738 Victor Zhong, Caiming Xiong, and Richard Socher. Seq2sql: Generating structured queries from natural language using reinforcement learning. CoRR, abs/1709.00103, 2017. URL http://arxiv.org/abs/1709.00103.
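As a reading aid for [15]: LoRA freezes the pretrained weight W0 and learns only a low-rank update BA, so the adapted layer computes h = W0 x + (α/r)·BAx, with B zero-initialized so training starts exactly at the base model. A minimal NumPy sketch under those definitions (shapes and seed are illustrative):

```python
import numpy as np

d_out, d_in, r, alpha = 64, 64, 8, 16      # rank r is much smaller than d

rng = np.random.default_rng(0)
W0 = rng.standard_normal((d_out, d_in))    # frozen pretrained weight
A = 0.01 * rng.standard_normal((r, d_in))  # trainable, random init
B = np.zeros((d_out, r))                   # trainable, zero init

def lora_forward(x):
    """h = W0 x + (alpha / r) * B A x; only A and B would receive gradients."""
    return W0 @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W0 @ x)  # identical to the base model at init
```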
739 16 0 [16] E. Huckvale and H. N. Moseley. kegg_pull: a software package for the restful access and pulling from the kyoto encyclopedia of gene and genomes. BMC Bioinformatics, 24:1–17, 12 2023. ISSN 14712105. doi: 10.1186/s12859-023-05208-0. URL https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-023-05208-0. https://qiita.com/kaizen_nagoya/items/05be40565793f2b4f7f3 (a minimal KEGG REST sketch follows this entry's reference list)
740 1 Kawashima S, Katayama T, Sato Y, Kanehisa M. KEGG API: a web service using SOAP/WSDL to Access the KEGG System. Genome Inform. 2003;14:673.
741 2 Kanehisa M. Toward understanding the origin and evolution of cellular organisms. Protein Sci. 2019;28:1947–51.
742 3 Kanehisa M, Furumichi M, Sato Y, Ishiguro-Watanabe M, Tanabe M. KEGG: integrating viruses and cellular organisms. Nucleic Acids Res. 2021;49:D545–51.
743 4 The kyoto encyclopedia of genes and genomes—kegg. Yeast. 2000;1:48–55.
744 5 Fielding RT. Representational state transfer. Architectural Styles and the Design of Network-Based Software Architectures. Doctoral dissertation. University of California Irvine, Irvine, CA, USA; 2000.
745 6 Reitz K. requests. Computer software. Pypi; 2013.
746 7 Christudas B. cURL and Postman. In: Practical Microservices Architectural Patterns: Event-Based Java Microservices with Spring Boot and Spring Cloud. Berkeley, CA: Apress. 2019;847–55.
747 8 R Core Team, editor. R: A Language and environment for statistical computing. 2018.
748 9 Rossum GV, Drake FL. Python 3 Reference Manual. CreateSpace; 2009.
749 10 Wilkinson MD, Dumontier M, Aalbersberg IJJ, Appleton G, Axton M, Baak A, et al. The FAIR guiding principles for scientific data management and stewardship. Sci Data. 2016;3:160018.
750 11 Tenenbaum D, Volkening J. KEGGREST. Computer software. Bioconductor Package Maintainer; 2022.
751 12 Castelli FM. KEGGutils v04.1. Computer software. 2022. Zenodo. https://doi.org/10.5281/zenodo.7482523.
752 13 Cock PJA. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Computer software. PyPi; 2009.
753 14 Giampieri E. keggrest. Computer software. PyPi; 2013.
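[16] wraps the KEGG REST API in a Python package. Independently of that package, the API it pulls from can be exercised with plain HTTP; a minimal sketch using the documented list and get operations (the entry ID hsa00010, human glycolysis, is just an example):

```python
import requests

BASE = "https://rest.kegg.jp"

# list: enumerate entries in a database, one tab-separated line each.
pathways = requests.get(f"{BASE}/list/pathway", timeout=30).text
print(pathways.splitlines()[0])

# get: pull one flat-file record by its KEGG identifier.
entry = requests.get(f"{BASE}/get/hsa00010", timeout=30).text
print(entry.splitlines()[0])  # the ENTRY line of the record
```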
754 17 0 [17] Q. Jin, Y. Yang, Q. Chen, and Z. Lu. Genegpt: augmenting large language models with domain tools for improved access to biomedical information. Bioinformatics, 40, 2 2024. ISSN 13674811. doi: 10.1093/BIOINFORMATICS/BTAE075. URL https://dx.doi.org/10.1093/bioinformatics/btae075. https://qiita.com/kaizen_nagoya/items/8897792ff52fb5e68a46 (a minimal E-utilities sketch follows this entry's reference list)
755 1 Altschul SF, Gish W, Miller W et al. Basic local alignment search tool. J Mol Biol 1990;215:403–10.
756 2 Boratyn GM, Camacho C, Cooper PS et al. Blast: a more efficient report with usability improvements. Nucleic Acids Res 2013;41:W29–W33.
757 3 Borgeaud S, Mensch A, Hoffmann J et al. Improving language models by retrieving from trillions of tokens. In: International conference on machine learning, Baltimore, Maryland, USA, p. 2206–40. PMLR, 2022.
758 4 Brown T, Mann B, Ryder N et al. Language models are few-shot learners. Advances in Neural Information Processing Systems 2020;33:1877–901.
759 5 Chen M, Tworek J, Jun H et al. Evaluating large language models trained on code. arXiv, arXiv:2107.03374, 2021, preprint: not peer reviewed.
760 6 Chowdhery A, Narang S, Devlin J et al. Palm: scaling language modeling with pathways. arXiv, arXiv:2204.02311, 2022, preprint: not peer reviewed.
761 7 Ely JW, Osheroff JA, Chambliss ML et al. Answering physicians’ clinical questions: obstacles and potential solutions. J Am Med Inform Assoc 2005;12:217–24.
762 8 Gao L, Madaan A, Zhou S et al. Pal: program-aided language models. arXiv, arXiv:2211.10435, 2022, preprint: not peer reviewed.
763 9 Guu K, Lee K, Tung Z et al. Retrieval augmented language model pre-training. In: International conference on machine learning, p. 3929–3938. PMLR, 2020.
764 10 Hou W, Ji Z. Geneturing tests gpt models in genomics. bioRxiv 2023:2023–03.
765 11 Ji Z, Lee N, Frieske R et al. Survey of hallucination in natural language generation. ACM Comput Surv 2023;55:1–38.
766 12 Jin Q, Leaman R, Lu Z. Retrieve, summarize, and verify: how will chatgpt impact information seeking from the medical literature? J Am Soc Nephrol 2023a;34:1302–4.
767 13 Jin Q, Wang Z, Floudas CS et al. Matching patients to clinical trials with large language models. arXiv, arXiv:2307.15051, 2023b, preprint: not peer reviewed.
768 14 Jin Q, Yuan Z, Xiong G et al. Biomedical question answering: a survey of approaches and challenges. ACM Comput Surv 2022;55:1–36.
769 15 Kaplan J, McCandlish S, Henighan T et al. Scaling laws for neural language models. arXiv, arXiv:2001.08361, 2020, preprint: not peer reviewed.
770 16 Lewis P, Perez E, Piktus A et al. Retrieval-augmented generation for knowledge-intensive nlp tasks. Adv Neural Inform Process Syst 2020;33:9459–74.
771 17 Liévin V, Hother CE, Winther O. Can large language models reason about medical questions? arXiv, arXiv:2207.08143, 2022, preprint: not peer reviewed.
772 18 Luo R, Sun L, Xia Y et al. Biogpt: generative pre-trained transformer for biomedical text generation and mining. Brief Bioinform 2022;23. https://doi.org/10.1093/bib/bbac409.
773 19 Mialon G, Dessì R, Lomeli M et al. Augmented language models: a survey. arXiv, arXiv:2302.07842, 2023, preprint: not peer reviewed.
774 20 Nori H, King N, McKinney SM et al. Capabilities of gpt-4 on medical challenge problems. arXiv, arXiv:2303.13375, 2023, preprint: not peer reviewed.
775 21 OpenAI. GPT-4 technical report. CoRR abs/2303.08774, 2023.
776 22 Parisi A, Zhao Y, Fiedel N. Talm: tool augmented language models. arXiv, arXiv:2205.12255, 2022, preprint: not peer reviewed.
777 23 Qin Y, Hu S, Lin Y et al. Tool learning with foundation models. arXiv, arXiv:2304.08354, 2023, preprint: not peer reviewed. http://arxiv.org/pdf/2304.08354.pdf.
778 24 Radford A, Narasimhan K, Salimans T et al. Improving language understanding by generative pre-training, OpenAI Blog, 2018.
779 25 Radford A, Wu J, Child R et al. Language models are unsupervised multitask learners. OpenAI Blog 2019;1:9.
780 26 Sayers EW, Agarwala R, Bolton EE et al. Database resources of the national center for biotechnology information. Nucleic Acids Res 2019;47:D23–D28.
781 27 Schick T, Dwivedi-Yu J, Dessì R et al. Toolformer: language models can teach themselves to use tools. arXiv, arXiv:2302.04761, 2023, preprint: not peer reviewed.
782 28 Schuler G, Epstein J, Ohkawa H et al. Entrez: molecular biology database and retrieval system. Methods Enzymol 1996;266:141–62.
783 29 Singhal K, Azizi S, Tu T et al. Large language models encode clinical knowledge. arXiv, arXiv:2212.13138, 2022, preprint: not peer reviewed.
784 30 Tian S, Jin Q, Yeganova L et al. Opportunities and challenges for chatgpt and large language models in biomedicine and health. Brief Bioinform 2024;25(1). https://doi.org/10.1093/bib/bbad493.
785 31 Wei J, Tay Y, Bommasani R et al. Emergent abilities of large language models. arXiv, arXiv:2206.07682, 2022a, preprint: not peer reviewed.
786 32 Wei J, Wang X, Schuurmans D et al. Chain of thought prompting elicits reasoning in large language models. arXiv, arXiv:2201.11903, 2022b, preprint: not peer reviewed.
787 33 Wong C, Zheng S, Gu Y et al. Scaling clinical trial matching using large language models: a case study in oncology. arXiv, arXiv:2308.02180, 2023, preprint: not peer reviewed.
788 34 Yao S, Zhao J, Yu D et al. React: synergizing reasoning and acting in language models. arXiv, arXiv:2210.03629, 2022, preprint: not peer reviewed.
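GeneGPT [17] works by having the model emit NCBI E-utilities and BLAST URLs (refs 26 and 28 above) and reading back the responses. The E-utilities half reduces to ordinary HTTP calls; a minimal sketch of an esearch-then-efetch round trip against the nucleotide database (the query string is illustrative):

```python
import requests

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# esearch: free-text query -> list of database UIDs (JSON output).
r = requests.get(
    f"{EUTILS}/esearch.fcgi",
    params={"db": "nuccore", "term": "BRCA1[gene] AND human[orgn]",
            "retmode": "json", "retmax": 1},
    timeout=30,
)
uid = r.json()["esearchresult"]["idlist"][0]

# efetch: UID -> full record, here as FASTA text.
fasta = requests.get(
    f"{EUTILS}/efetch.fcgi",
    params={"db": "nuccore", "id": uid, "rettype": "fasta", "retmode": "text"},
    timeout=30,
).text
print(fasta.splitlines()[0])  # FASTA header of the retrieved sequence
```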
789 18 0 [18] M. Kanehisa, M. Furumichi, Y. Sato, Y. Matsuura, and M. Ishiguro-Watanabe. Kegg: biological systems database as a model of the real world. Nucleic Acids Research, 53:D672–D677, 1 2025. ISSN 0305-1048. doi: 10.1093/NAR/GKAE909. URL https://dx.doi.org/10.1093/nar/gkae909. https://qiita.com/kaizen_nagoya/items/f63573043eaf8f9c6a2c
790 1 1. Kanehisa, M. (2019) Toward understanding the origin and evolution of cellular organisms. Protein Sci., 28, 1947–1951.
791 2 2. Kanehisa, M., Furumichi, M., Sato, Y., Kawashima, M. and Ishiguro-Watanabe, M. (2023) KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res., 51, D587–D592.
792 3 3. Jin, Z., Sato, Y., Kawashima, M. and Kanehisa, M. (2023) KEGG tools for classification and analysis of viral proteins. Protein Sci., 32, e4840.
793 4 4. Fujibuchi, W., Goto, S., Migimatsu, H., Uchiyama, I., Ogiwara, A., Akiyama, Y. and Kanehisa, M. (1998) DBGET/LinkDB: an integrated database retrieval system. Pac. Symp. Biocomput., 683–694.
794 5 5. Kanehisa, M. and Sato, Y. (2020) KEGG Mapper for inferring cellular functions from protein sequences. Protein Sci., 29, 28–35.
795 6 6. Kanehisa, M., Sato, Y. and Kawashima, M. (2022) KEGG mapping tools for uncovering hidden features in biological data. Protein Sci., 31, 47–53.
796 7 7. Haft, D.H., Badretdin, A., Coulouris, G., DiCuccio, M., Durkin, A.S., Jovenitti, E., Li, W., Mersha, M., O'Neill, K.R., Virothaisakun, J., et al. (2024) RefSeq and the prokaryotic genome annotation pipeline in the age of metagenomes. Nucleic Acids Res., 52, D762–D769.
797 8 8. Goad, W.B. and Kanehisa, M.I. (1982) Pattern recognition in nucleic acid sequences. I. A general method for finding local homologies and symmetries. Nucleic Acids Res., 10, 247–263.
798 9 9. Smith, T.F. and Waterman, M.S. (1981) Identification of common molecular subsequences. J. Mol. Biol., 147, 195–197.
799 10 10. Schoch, C.L., Ciufo, S., Domrachev, M., Hotton, C.L., Kannan, S., Khovanskaya, R., Leipe, D., Mcveigh, R., O'Neill, K., Robbertse, B., et al. (2020) NCBI Taxonomy: a comprehensive update on curation, resources and tools. Database, 2020, baaa062.
800 11 11. Siddell, S.G., Smith, D.B., Adriaenssens, E., Alfenas-Zerbini, P., Dutilh, B.E., Garcia, M.L., Junglen, S., Krupovic, M., Kuhn, J.H., Lambert, A.J., et al. (2023) Virus taxonomy and the role of the International Committee on Taxonomy of Viruses (ICTV). J. Gen. Virol., 104, 001840.
801 19 0 [19] J. Kans. Entrez direct: E-utilities on the unix command line - entrez programming utilities help - ncbi bookshelf, 4 2013. URL https://www.ncbi.nlm.nih.gov/books/NBK179288/. https://qiita.com/kaizen_nagoya/items/cc4bbde566e67abc93d9
802 1 The Smithsonian Online Collections Databases are provided by the National Museum of Natural History, Smithsonian Institution, 10th and Constitution Ave. N.W., Washington, DC 20560-0193. https://collections.nmnh.si.edu/.
803 2 den Dunnen JT, Dalgleish R, Maglott DR, Hart RK, Greenblatt MS, McGowan-Jordan J, Roux AF, Smith T, Antonarakis SE, Taschner PE. HGVS Recommendations for the Description of Sequence Variants: 2016 Update. Hum Mutat. 2016. https://doi.org/10.1002/humu.22981. (PMID 26931183.)
804 3 Holmes JB, Moyer E, Phan L, Maglott D, Kattman B. SPDI: data model for variants and applications at NCBI. Bioinformatics. 2020. https://doi.org/10.1093/bioinformatics/btz856. (PMID 31738401.)
805 4 Hutchins BI, Baker KL, Davis MT, Diwersy MA, Haque E, Harriman RM, Hoppe TA, Leicht SA, Meyer P, Santangelo GM. The NIH Open Citation Collection: A public access, broad coverage resource. PLoS Biol. 2019. https://doi.org/10.1371/journal.pbio.3000385. (PMID 31600197.)
806 5 Kim S, Thiessen PA, Cheng T, Yu B, Bolton EE. An update on PUG-REST: RESTful interface for programmatic access to PubChem. Nucleic Acids Res. 2018. https://doi.org/10.1093/nar/gky294. (PMID 29718389.)
807 6 Mitchell JA, Aronson AR, Mork JG, Folk LC, Humphrey SM, Ward JM. Gene indexing: characterization and analysis of NLM's GeneRIFs. AMIA Annu Symp Proc. 2003:460-4. (PMID 14728215.)
808 7 Ostell JM, Wheelan SJ, Kans JA. The NCBI data model. Methods Biochem Anal. 2001. https://doi.org/10.1002/0471223921.ch2. (PMID 11449725.)
809 8 Schuler GD, Epstein JA, Ohkawa H, Kans JA. Entrez: molecular biology database and retrieval system. Methods Enzymol. 1996. https://doi.org/10.1016/s0076-6879(96)66012-1. (PMID 8743683.)
810 9 Wei C-H, Allot A, Leaman R, Lu Z. PubTator central: automated concept annotation for biomedical full text articles. Nucleic Acids Res. 2019. https://doi.org/10.1093/nar/gkz389. (PMID 31114887.)
811 10 Wu C, Macleod I, Su AI. BioGPS and MyGene.info: organizing online, gene-centric information. Nucleic Acids Res. 2013. https://doi.org/10.1093/nar/gks1114. (PMID 23175613.)
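For context on [19]: Entrez Direct wraps NCBI's E-utilities in shell commands, and the underlying HTTP interface can be exercised directly. A minimal sketch, assuming only the documented `esearch.fcgi` endpoint and its standard `db`/`term`/`retmode`/`retmax` parameters; the query string is arbitrary.

```python
import json
import urllib.parse
import urllib.request

# ESearch: look up PubMed IDs for a free-text query via NCBI E-utilities.
base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
params = urllib.parse.urlencode({
    "db": "pubmed",               # any Entrez database name works here
    "term": "DNA language model",
    "retmode": "json",
    "retmax": 5,
})
with urllib.request.urlopen(f"{base}?{params}") as resp:
    result = json.load(resp)

# The JSON reply nests the matching IDs under esearchresult/idlist.
print(result["esearchresult"]["idlist"])
```

Entrez Direct's `esearch ... | efetch ...` pipelines are built from these same endpoints; the ClinVar archive of [20] below is reachable the same way with `db=clinvar`.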
812 20 0 [20] M. J. Landrum, J. M. Lee, G. R. Riley, W. Jang, W. S. Rubinstein, D. M. Church, and D. R. Maglott. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Research, 42, 1 2014. ISSN 03051048. doi: 10.1093/NAR/GKT1113. URL https://pubmed.ncbi.nlm.nih.gov/24234437/. https://qiita.com/kaizen_nagoya/items/8149b7a5a4f930490fad
813 1 NCBI Resource Coordinators. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2013;41:D8–D20.
814 2 Lappalainen I, Lopez J, Skipper L, Hefferon T, Spalding JD, Garner J, Chen C, Maguire M, Corbett M, Zhou G, et al. dbVar and DGVa: public archives for genomic structural variation. Nucleic Acids Res. 2013;41:D936–D941.
815 3 Rubinstein WS, Maglott DR, Lee JM, Kattman BL, Malheiro AJ, Ovetsky M, Hem V, Gorelenkov V, Song G, Wallin C, et al. The NIH genetic testing registry: a new, centralized database of genetic tests to enable access to comprehensive information and improve transparency. Nucleic Acids Res. 2013;41:D925–D935.
816 4 den Dunnen JT, Antonarakis SE. Mutation nomenclature extensions and suggestions to describe complex mutations: a discussion. Hum. Mutation. 2000;15:7–12.
817 5 Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell C, Hart J, Landrum MJ, McGarvey KM, et al. RefSeq: an update on mammalian reference sequences. Nucleic Acids Res. 2014;42:D756–D763.
818 21 0 [21] J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, and J. Kang. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36:1234–1240, 2 2020. ISSN 1367-4803. doi: 10.1093/BIOINFORMATICS/BTZ682. URL https://dx.doi.org/10.1093/bioinformatics/btz682. https://qiita.com/kaizen_nagoya/items/63781eb6db1fc2ded80a (A minimal loading sketch follows this entry's reference list below.)
819 1 Alsentzer E. et al. (2019) Publicly available clinical bert embeddings. In: Proceedings of the 2nd Clinical Natural Language Processing Workshop, Minneapolis, MN, USA. pp. 72–78. Association for Computational Linguistics. https://www.aclweb.org/anthology/W19-1909.
820 2 Bhasuran B. , Natarajan J. (2018) Automatic extraction of gene-disease associations from literature using joint ensemble learning. PLoS One, 13, e0200699.
821 3 Bravo À. et al. (2015) Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research. BMC Bioinformatics, 16, 55.
822 4 Devlin J. et al. (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA. pp. 4171–4186. Association for Computational Linguistics. https://www.aclweb.org/anthology/N19-1423.
823 5 Doğan R.I. et al. (2014) NCBI disease corpus: a resource for disease name recognition and concept normalization. J. Biomed. Inform., 47, 1–10.
824 6 Gerner M. et al. (2010) Linnaeus: a species name identification system for biomedical literature. BMC Bioinformatics, 11, 85.
825 7 Giorgi J.M. , Bader G.D. (2018) Transfer learning for biomedical named entity recognition with neural networks. Bioinformatics, 34, 4087.
826 8 Habibi M. et al. (2017) Deep learning with word embeddings improves biomedical named entity recognition. Bioinformatics, 33, i37–i48.
827 9 Kim J.-D. et al. (2004) Introduction to the bio-entity recognition task at JNLPBA. In: Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA/BioNLP), Geneva, Switzerland. pp. 73–78. COLING. https://www.aclweb.org/anthology/W04-1213.
828 10 Krallinger M. et al. (2015) The CHEMDNER corpus of chemicals and drugs and its annotation principles. J. Cheminform., 7.
829 11 Krallinger M. et al. (2017) Overview of the BioCreative VI chemical-protein interaction track. In: Proceedings of the BioCreative VI Workshop, Bethesda, MD, USA, pp. 141–146. https://academic.oup.com/database/article/doi/10.1093/database/bay073/5055578.
830 12 Li J. et al. (2016) BioCreative V CDR task corpus: a resource for chemical disease relation extraction. Database, 2016.
831 13 Lim S. , Kang J. (2018) Chemical–gene relation extraction using recursive neural network. Database, 2018.
832 14 Lin C. et al. (2019) A BERT-based universal model for both within- and cross-sentence clinical temporal relation extraction. In: Proceedings of the 2nd Clinical Natural Language Processing Workshop, Minneapolis, MN, USA. pp. 65–71. Association for Computational Linguistics. https://www.aclweb.org/anthology/W19-1908.
833 15 Lou Y. et al. (2017) A transition-based joint model for disease named entity recognition and normalization. Bioinformatics, 33, 2363–2371.
834 16 Luo L. et al. (2018) An attention-based BiLSTM-CRF approach to document-level chemical named entity recognition. Bioinformatics, 34, 1381–1388.
835 17 McCann B. et al. (2017) Learned in translation: contextualized word vectors. In: Guyon,I. et al. (eds.), Advances in Neural Information Processing Systems 30, Curran Associates, Inc., pp. 6294–6305. http://papers.nips.cc/paper/7209-learned-in-translation-contextualized-word-vectors.pdf.
836 18 Mikolov T. et al. (2013) Distributed representations of words and phrases and their compositionality. In: Burges,C.J.C. (eds.), Advances in Neural Information Processing Systems 26, Curran Associates, Inc., pp. 3111–3119. http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf.
837 19 Mohan S., Li D. (2019) MedMentions: a large biomedical corpus annotated with UMLS concepts. arXiv preprint arXiv: 1902.09476.
838 20 Pafilis E. et al. (2013) The species and organisms resources for fast and accurate identification of taxonomic names in text. PLoS One, 8, e65390.
839 21 Pennington J. et al. (2014) GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar. pp. 1532–1543. Association for Computational Linguistics. https://www.aclweb.org/anthology/D14-1162.
840 22 Peters M.E. et al. (2018) Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), New Orleans, LA. pp. 2227–2237. Association for Computational Linguistics. https://www.aclweb.org/anthology/N18-1202.
841 23 Pyysalo S. et al. (2013) Distributional semantics resources for biomedical text processing. In: Proceedings of the 5th International Symposium on Languages in Biology and Medicine, Tokyo, Japan, pp. 39–43. https://academic.oup.com/bioinformatics/article/33/14/i37/3953940.
842 24 Rajpurkar P. et al. (2016) SQuAD: 100,000+ questions for machine comprehension of text. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX. pp. 2383–2392. Association for Computational Linguistics. https://www.aclweb.org/anthology/D16-1264.
843 25 Sachan D.S. et al. (2018) Effective use of bidirectional language modeling for transfer learning in biomedical named entity recognition. In: Finale,D.-V. et al. (eds.), Proceedings of Machine Learning Research, Palo Alto, CA, Vol. 85, pp. 383–402. PMLR. http://proceedings.mlr.press/v85/sachan18a.html.
844 26 Smith L. et al. (2008) Overview of BioCreative II gene mention recognition. Genome Biol., 9, S2.
845 27 Sousa D. et al. (2019) A silver standard corpus of human phenotype-gene relations. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN. pp. 1487–1492. Association for Computational Linguistics. https://www.aclweb.org/anthology/N19-1152.
846 28 Sung N. et al. (2017) NSML: A machine learning platform that enables you to focus on your models. arXiv preprint arXiv: 1712.05902.
847 29 Tsatsaronis G. et al. (2015) An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition. BMC Bioinformatics, 16, 138.
848 30 Uzuner Ö. et al. (2011) 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. J. Am. Med. Inform. Assoc., 18, 552–556.
849 31 Van Mulligen E.M. et al. (2012) The EU-ADR corpus: annotated drugs, diseases, targets, and their relationships. J. Biomed. Inform., 45, 879–884.
850 32 Vaswani A. et al. (2017) Attention is all you need. In: Guyon,I. et al. (eds.), Advances in Neural Information Processing Systems, pp. 5998–6008. Curran Associates, Inc. http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf.
851 33 Wang X. et al. (2018) Cross-type biomedical named entity recognition with deep multi-task learning. Bioinformatics, 35, 1745–1752.
852 34 Wiese G. et al. (2017) Neural domain adaptation for biomedical question answering. In: Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), Vancouver, Canada. pp. 281–289. Association for Computational Linguistics. https://www.aclweb.org/anthology/K17-1029.
853 35 Wu Y. et al. (2016) Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv: 1609.08144.
854 36 Xu K. et al. (2019) Document-level attention-based BiLSTM-CRF incorporating disease dictionary for disease named entity recognition. Comput. Biol. Med., 108, 122–132.
855 37 Yoon W. et al. (2019) Collabonet: collaboration of deep neural networks for biomedical named entity recognition. BMC Bioinformatics, 20, 249.
856 38 Zhu H. et al. (2018) Clinical concept extraction with contextual word embedding. NIPS Machine Learning for Health Workshop. http://par.nsf.gov/biblio/10098080.
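[21] is the BioBERT paper itself. For readers who want to try the model, here is a minimal loading sketch using the Hugging Face `transformers` API; the checkpoint ID `dmis-lab/biobert-base-cased-v1.1` is the authors' group's published Hub name (assumed available here), and the input sentence is an arbitrary example.

```python
from transformers import AutoModel, AutoTokenizer

# Load a published BioBERT checkpoint (assumed Hugging Face Hub model ID).
name = "dmis-lab/biobert-base-cased-v1.1"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

# Encode one biomedical sentence and take the [CLS] token embedding,
# the usual input to a downstream NER / relation-extraction / QA head.
inputs = tokenizer(
    "BRCA1 mutations are associated with hereditary breast cancer.",
    return_tensors="pt",
)
outputs = model(**inputs)
cls_embedding = outputs.last_hidden_state[:, 0, :]
print(cls_embedding.shape)  # torch.Size([1, 768])
```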
857 22 0 [22] Q. Li, Z. Hu, Y. Wang, L. Li, Y. Fan, I. King, G. Jia, S. Wang, L. Song, and Y. Li. Progress and opportunities of foundation models in bioinformatics. Briefings in Bioinformatics, 25:548, 9 2024. ISSN 14774054. doi: 10.1093/BIB/BBAE548. URL https://dx.doi.org/10.1093/bib/bbae548. https://qiita.com/kaizen_nagoya/items/6ef20eaf796532fed6f8
858 1 Hughes JP, Rees S, Kalindjian SB. et al. Principles of early drug discovery. Br J Pharmacol 2011;162:1239–49. 10.1111/j.1476-5381.2010.01127.x.
859 2 Bommasani R, Hudson DA, Adeli E. et al. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258, 2021.
860 3 Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med 2019;25:44–56. 10.1038/s41591-018-0300-7.
861 4 Park YS, Lek S. Artificial Neural Networks: Multilayer Perceptron for Ecological Modeling. In: Jørgensen SE, (eds.), Developments in Environmental Modeling. Netherlands: Elsevier, 2016;28:123–40, 10.1016/B978-0-444-63623-2.00007-4.
862 5 Wang M, Tai CEW, Wei L. DeFine: deep convolutional neural networks accurately quantify intensities of transcription factor-DNA binding and facilitate evaluation of functional non-coding variants. Nucleic Acids Res 2018;46:e69–9. 10.1093/nar/gky215.
863 6 Shen J, Liu F, Tu Y. et al. Finding gene network topologies for given biological function with recurrent neural network. Nat Commun 2021;12:3125. 10.1038/s41467-021-23420-5.
864 7 Whalen S, Truty RM, Pollard KS. Enhancer–promoter interactions are encoded by complex genomic signatures on looping chromatin. Nat Genet 2016;48:488–96. 10.1038/ng.3539.
865 8 Forster DT, Li SC, Yashiroda Y. et al. BIONIC: biological network integration using convolutions. Nat Methods 2022;19:1250–61. 10.1038/s41592-022-01616-x.
866 9 Dong K, Zhang S. Deciphering spatial domains from spatially resolved transcriptomics with an adaptive graph attention auto-encoder. Nat Commun 2022;13:1739. 10.1038/s41467-022-29439-6.
867 10 Mahmud M, Kaiser MS, Hussain A. et al. Applications of deep learning and reinforcement learning to biological data. IEEE Trans Neural Netw Learn Syst 2018;29:2063–79. 10.1109/TNNLS.2018.2790388.
868 11 Wiggins WF, Tejani AS. On the opportunities and risks of foundation models for natural language processing in radiology. Radiol Artif Intell 2022;4:e220119. 10.1148/ryai.220119.
869 12 Baker B, Akkaya I, Zhokov P. et al. Video pretraining (VPT): learning to act by watching unlabeled online videos. Adv Neural Inf Process Syst 2022;35:24639–54.
870 13 Tack A, Piech C. The AI teacher test: measuring the pedagogical ability of Blender and GPT-3 in educational dialogues. arXiv preprint arXiv:2205.07540, 2022.
871 14 Moor M, Banerjee O, Abad ZSH. et al. Foundation models for generalist medical artificial intelligence. Nature 2023;616:259–65. 10.1038/s41586-023-05881-4.
872 15 Rao RM, Liu J, Verkuil R. et al. MSA Transformer. In: International Conference on Machine Learning. PMLR 2021;139:8844–56.
873 16 Sapoval N, Aghazadeh A, Nute MG. et al. Current progress and open challenges for applying deep learning across the biosciences. Nat Commun 2022;13:1728. 10.1038/s41467-022-29268-7.
874 17 Theodoris CV, Xiao L, Chopra A. et al. Transfer learning enables predictions in network biology. Nature 2023;618:616–24. 10.1038/s41586-023-06139-9.
875 18 Zou J, Huss M, Abid A. et al. A primer on deep learning in genomics. Nat Genet 2019;51:12–8. 10.1038/s41588-018-0295-5.
876 19 Uhlmann V, Donati L, Sage D. A practical guide to supervised deep learning for bioimage analysis: challenges and good practices. IEEE Signal Process Mag 2022;39:73–86. 10.1109/MSP.2021.3123589.
877 20 Wasserman WW, Sandelin A. Applied bioinformatics for the identification of regulatory elements. Nat Rev Genet 2004;5:276–87. 10.1038/nrg1315.
878 21 Howard J, Ruder S. Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146, 2018.
879 22 Yuan L, Chen D, Chen YL. et al. Florence: a new foundation model for computer vision. arXiv preprint arXiv:2111.11432, 2021.
880 23 Devlin J, Chang MW, Lee K. et al. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
881 24 Lee J, Yoon W, Kim S. et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 2020;36:1234–40. 10.1093/bioinformatics/btz682.
882 25 Gu Y, Tinn R, Cheng H. et al. Domain-specific language model pretraining for biomedical natural language processing. ACM Trans Comput Healthc 2021;3:1–23.
883 26 Ji Y, Zhou Z, Liu H. et al. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics 2021;37:2112–20. 10.1093/bioinformatics/btab083. (A k-mer tokenization sketch follows this reference list below.)
884 27 Brandes N, Ofer D, Peleg Y. et al. ProteinBERT: a universal deep-learning model of protein sequence and function. Bioinformatics 2022;38:2102–10. 10.1093/bioinformatics/btac020.
885 28 Radford A, Wu J, Child R. et al. Language models are unsupervised multitask learners. OpenAI Blog 2019;1:9.
886 29 Wu Y, Wang S, Yang H. et al. An early evaluation of GPT-4V(ision). arXiv preprint arXiv:2310.16534, 2023.
887 30 Lin Z, Akin H, Rao R. et al. Language models of protein sequences at the scale of evolution enable accurate structure prediction. bioRxiv 500902, 2022.
888 31 Hayes T, Rao R, Akin H. et al. Simulating 500 million years of evolution with a language model. bioRxiv 600583, 2024.
889 32 Raffel C, Shazeer N, Roberts A. et al. Exploring the limits of transfer learning with a unified text-to-text transformer. J Mach Learn Res 2020;21:1–67.
890 33 Song K, Tan X, Qin T. et al. MPNet: masked and permuted pre-training for language understanding. Adv Neural Inf Process Syst 2020;33:16857–67.
891 34 Avsec Ž, Agarwal V, Visentin D. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat Methods 2021;18:1196–203. 10.1038/s41592-021-01252-x.
892 35 Xu M, Yuan X, Miret S. et al. ProtST: multi-modality learning of protein sequences and biomedical texts. arXiv preprint arXiv:2301.12040, 2023.
893 36 Ferruz N, Schmidt S, Höcker B. ProtGPT2 is a deep unsupervised language model for protein design. Nat Commun 2022;13:4348. 10.1038/s41467-022-32007-7.
894 37 Chen B, Cheng X, Geng Y. et al. xTrimoPGLM: unified 100B-scale pre-trained transformer for deciphering the language of protein. arXiv preprint arXiv:2401.06199, 2024.
895 38 Liu Y, Tian B. Protein-DNA binding sites prediction based on pre-trained protein language model and contrastive learning. arXiv preprint arXiv:2306.15912, 2023.
896 39 Azher ZL, Suvarna A, Chen JQ. et al. Assessment of emerging pretraining strategies in interpretable multimodal deep learning for cancer prognostication. BioData Min 2023;16:23. 10.1186/s13040-023-00338-w.
897 40 Liu Y, Tian B. Protein-DNA binding sites prediction based on pre-trained protein language model and contrastive learning. Briefings in Bioinformatics 2024;25.1:bbad488. 10.1093/bib/bbad488.
898 41 Nguyen E, Poli M, Faizi M. et al. HyenaDNA: long-range genomic sequence modeling at single nucleotide resolution. Advances in Neural Information Processing Systems, 2024;36.
899 42 Cui H, Wang C, Maan H. et al. scGPT: towards building a foundation model for single-cell multi-omics using generative AI. Nature Methods 2024;1–11.
900 43 Madani A, Krause B, Greene ER. et al. Large language models generate functional protein sequences across diverse families. Nat Biotechnol 2023;41:1099–106. 10.1038/s41587-022-01618-2.
901 44 Senior AW, Evans R, Jumper J. et al. Improved protein structure prediction using potentials from deep learning. Nature 2020;577:706–10. 10.1038/s41586-019-1923-7.
902 45 Walsh B, Mohamed SK, Nováček V. Biokg: A knowledge graph for relational learning on biological data. In: d'Aquin PM, Dietze PS, (eds.), Proceedings of the 29th ACM International Conference on Information & Knowledge Management. ACM (Association for Computing Machinery), New York, NY, USA, 2020; 3173–3180.
903 46 Bernstein NJ, Fong NL, Lam I. et al. Solo: doublet identification in single-cell RNA-seq via semi-supervised deep learning. Cell Syst 2020;11:95–101.e5. 10.1016/j.cels.2020.05.010.
904 47 Brendel M, Su C, Bai Z. et al. Application of deep learning on single-cell RNA sequencing data analysis: a review. Genomics Proteomics Bioinformatics 2022;20:814–35. 10.1016/j.gpb.2022.11.011.
905 48 Arisdakessian C, Poirion O, Yunits B. et al. DeepImpute: an accurate, fast, and scalable deep neural network method to impute single-cell RNA-seq data. Genome Biol 2019;20:211. 10.1186/s13059-019-1837-6.
906 49 Tran HTN, Ang KS, Chevrier M. et al. A benchmark of batch-effect correction methods for single-cell RNA sequencing data. Genome Biol 2020;21:1–32. 10.1186/s13059-019-1850-9.
907 50 Clement L. Statistical methods for quantitative MS-based proteomics: part I. Preprocessing.
908 51 Mowoe MO, Garnett S, Lennard K. et al. Pro-MAP: a robust pipeline for the pre-processing of single channel protein microarray data. BMC Bioinformatics 2022;23:534. 10.1186/s12859-022-05095-x.
909 52 Hong L, Sun S, Zheng L, Tan Q X, and Li Y. fastmsa: Accelerating multiple sequence alignment with dense retrieval on protein language. bioRxiv 2021;2021–12.
910 53 Steinegger M, Meier M, Mirdita M. et al. HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics 2019;20:1–15. 10.1186/s12859-019-3019-7.
911 54 Stecher G, Tamura K, Kumar S. Molecular evolutionary genetics analysis (MEGA) for macOS. Mol Biol Evol 2020;37:1237–9. 10.1093/molbev/msz312.
912 55 Chen K, Zhao H, Yang Y. Capturing large genomic contexts for accurately predicting enhancer-promoter interactions. Brief Bioinform 2022; 23:bbab577. 10.1093/bib/bbab577.
913 56 Novakovsky G, Dexter N, Libbrecht MW. et al. Obtaining genetics insights from deep learning via explainable artificial intelligence. Nat Rev Genet 2023;24:125–37. 10.1038/s41576-022-00532-2.
914 57 Dalla-Torre H, Gonzalez L, Mendoza J. et al. The nucleotide transformer: building and evaluating robust foundation models for human genomics. bioRxiv 2023;2023–01.
915 58 Chen J, Hu Z, Sun S. et al. Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions. arXiv preprint arXiv:2204.00300, 2022.
916 59 Alipanahi B, Delong A, Weirauch MT. et al. Predicting the sequence specificities of DNA-and RNA-binding proteins by deep learning. Nat Biotechnol 2015;33:831–8. 10.1038/nbt.3300.
917 60 Liu P, Yuan W, Fu J. et al. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput Surv 2023;55:1–35. 10.1145/3560815.
918 61 Rentzsch P, Schubach M, Shendure J. et al. CADD-splice—improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med 2021;13:1–12. 10.1186/s13073-021-00835-9.
919 62 Mi H, Muruganujan A, Huang X. et al. Protocol update for large-scale genome and gene function analysis with the PANTHER classification system (v. 14.0). Nat Protoc 2019;14:703–21. 10.1038/s41596-019-0128-8.
920 63 Ernst J, Kheradpour P, Mikkelsen TS. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 2011;473:43–9. 10.1038/nature09906.
921 64 Tang Z, Li C, Kang B. et al. GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Res 2017;45:W98–102. 10.1093/nar/gkx247.
922 65 Saelens W, Cannoodt R, Todorov H. et al. A comparison of single-cell trajectory inference methods. Nat Biotechnol 2019;37:547–54. 10.1038/s41587-019-0071-9.
923 66 Kaddour J, Harris J, Mozes M. et al. Challenges and applications of large language models. arXiv preprint arXiv:2307.10169, 2023.
924 67 Liu W, Zhou P, Zhao Z. et al. K-BERT: enabling language representation with knowledge graph. In: Proceedings of the AAAI Conference on Artificial Intelligence 2020;34:2901–8. 10.1609/aaai.v34i03.5681.
925 68 Brown TB, Mann B, Ryder N. et al. Language models are few-shot learners. Adv Neural Inf Process Syst 2020;33:1877–901.
926 69 Yasunaga M, Bosselut A, Ren H. et al. Deep bidirectional language-knowledge graph pretraining. Adv Neural Inf Process Syst 2022;35:37309–23.
927 70 Vrandečić D, Krötzsch M. Wikidata: a free collaborative knowledgebase. Communications of the ACM, 2014;57:78–85.
928 71 Zhu Y, Kiros R, Zemel R. et al. Aligning books and movies: towards story-like visual explanations by watching movies and reading books. arXiv preprint arXiv:1506.06724, 2015.
929 72 Speer R, Chin J, Havasi C. ConceptNet 5.5: an open multilingual graph of general knowledge. In: Proceedings of the AAAI Conference on Artificial Intelligence 2017;31:4444–4451. 10.1609/aaai.v31i1.11164.
930 73 Jia G, Li Y, Zhong X. et al. The high-dimensional space of human diseases built from diagnosis records and mapped to genetic loci. Nat Comput Sci 2023;3:403–17. 10.1038/s43588-023-00453-y.
931 74 Jia G, Li Y, Zhang H. et al. Estimating heritability and genetic correlations from large health datasets in the absence of genetic data. Nat Commun 2019;10:5508. 10.1038/s41467-019-13455-0.
932 75 Singhal K, Azizi S, Tu T. et al. Large language models encode clinical knowledge. Nature 2023;620:172–80. 10.1038/s41586-023-06291-2.
933 76 Kanakarajan RK, Kundumani B, Sankarasubbu M. BioELECTRA: Pretrained biomedical text encoder using discriminators. In: Demner-Fushman D, Cohen KB, Ananiadou S, Tsujii J, (eds.), Proceedings of the 20th Workshop on Biomedical Language Processing. Association for Computational Linguistics, Online, 2021;143–154.
934 77 Babjac AN, Lu Z, Emrich SJ. CodonBERT: Using BERT for sentiment analysis to better predict genes with low expression. In: Wang MD, Byung-Jun Yoon P, (eds.), Proceedings of the 14th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. Association for Computing Machinery, New York, NY, United States, 2023; 1–6.
935 78 Yuan H, Yuan Z, Gan R. et al. BioBART: Pretraining and evaluation of a biomedical generative language model. arXiv preprint arXiv:2204.03905, 2022.
936 79 Rajpurkar P, Zhang J, Lopyrev K. et al. SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250, 2016.
937 80 Fiorini N, Leaman R, Lipman DJ. et al. How user intelligence is improving PubMed. Nat Biotechnol 2018;36:937–45. 10.1038/nbt.4267.
938 81 Wu J, Fu R, Fang H. et al. Medical SAM Adapter: adapting Segment Anything Model for medical image segmentation. arXiv preprint arXiv:2304.12620, 2023.
939 82 Pathak Y, Shukla PK, Tiwari A. et al. Deep transfer learning based classification model for COVID-19 disease. Ing Rech Biomed 2022;43:87–92. 10.1016/j.irbm.2020.05.003.
940 83 Bolton E, Hall D, Yasunaga M. et al. Stanford CRFM introduces PubMedGPT 2.7B. 2022.
941 84 Zhou Z, Ji Y, Li W. et al. DNABERT-2: efficient foundation model and benchmark for multi-species genome. arXiv preprint arXiv:2306.15006, 2023.
942 85 Wang R, Wang Z, Wang J. et al. SpliceFinder: ab initio prediction of splice sites using convolutional neural network. BMC Bioinformatics 2019;20:1–13. 10.1186/s12859-019-3306-3.
943 86 Repecka D, Jauniskis V, Karpus L. et al. Expanding functional protein sequence spaces using generative adversarial networks. Nat Mach Intell 2021;3:324–33. 10.1038/s42256-021-00310-5.
944 87 Gresova K, Martinek V, Cechak D. et al. Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genomic Data, 2023;24:25.
945 88 Wu R, Ding F, Wang R. et al. High-resolution de novo structure prediction from primary sequence. bioRxiv 2022; 2022-07.
946 89 Lin Z, Akin H, Rao R. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 2023;379:1123–30. 10.1126/science.ade2574.
947 90 Ruffolo JA, Chu LS, Mahajan SP. et al. Fast, accurate antibody structure prediction from deep learning on massive set of natural antibodies. Nat Commun 2023;14:2389. 10.1038/s41467-023-38063-x.
948 91 Wang Y, Gong X, Li S, Yang B, Sun Y, Shi C, Wang Y, Yang C, Li H, Song L. xTrimoABFold: de novo antibody structure prediction without MSA. arXiv preprint arXiv:2212.00735, 2022.
949 92 Skinnider M, Johnston C, Gunabalasingam M. et al. Comprehensive prediction of secondary metabolite structure and biological activity from microbial genome sequences. Nat Commun 2020;11:6058. 10.1038/s41467-020-19986-1.
950 93 Jumper J, Evans R, Pritzel A. et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021;596:583–9. 10.1038/s41586-021-03819-2.
951 94 Rives A, Meier J, Sercu T. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci 2021;118:e2016239118. 10.1073/pnas.2016239118.
952 95 Klausen MS, Jespersen MC, Nielsen H. et al. NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning. Proteins 2019;87:520–7. 10.1002/prot.25674.
953 96 Elnaggar A, Heinzinger M, Dallago C. et al. ProtTrans: toward understanding the language of life through self-supervised learning. IEEE Trans Pattern Anal Mach Intell 2021;44:7112–27. 10.1109/TPAMI.2021.3095381.
954 97 Zhou G, Gao Z, Ding Q. et al. Uni-Mol: a universal 3D molecular representation learning framework. ChemRxiv, 2023.
955 98 Feynman R. The Character of Physical Law, with New Foreword. MIT Press, Cambridge, Massachusetts, USA, 2017, 10.7551/mitpress/11068.001.0001.
956 99 Chowdhury R, Bouatta N, Biswas S. et al. Single-sequence protein structure prediction using a language model and deep learning. Nat Biotechnol 2022;40:1617–23. 10.1038/s41587-022-01432-w.
957 100 Guo Y, Wu J, Ma H. et al. Self-supervised pre-training for protein embeddings using tertiary structures. Proceedings of the AAAI Conference on Artificial Intelligence 2022;36:6801–9. 10.1609/aaai.v36i6.20636.
958 101 McDermott M, Yap B, Szolovits P. et al. Structure-inducing pre-training. Nat Mach Intell 2023;5:612–21. 10.1038/s42256-023-00647-z.
959 102 Singh J, Hanson J, Paliwal K. et al. RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning. Nat Commun 2019;10:5407. 10.1038/s41467-019-13395-9.
960 103 Fu L, Cao Y, Wu J. et al. UFold: fast and accurate RNA secondary structure prediction with deep learning. Nucleic Acids Res 2022;50:e14–4. 10.1093/nar/gkab1074.
961 104 Zhu H, Hu J, Song XN. et al. DNAPred: accurate identification of DNA-binding sites from protein sequence by ensembled hyperplane-distance-based support vector machines. J Chem Inf Model 2019;59:3057–71. 10.1021/acs.jcim.8b00749.
962 105 Zhang J, Chen Q, Liu B. NCBRPred: predicting nucleic acid binding residues in proteins based on multilabel learning. Brief Bioinform 2021;22:bbaa397. 10.1093/bib/bbaa397.
963 106 Su H, Liu M, Sun S. et al. Improving the prediction of protein-nucleic acids binding residues via multiple sequence profiles and the consensus of complementary methods. Bioinformatics 2019;35:930–6. 10.1093/bioinformatics/bty756.
964 107 Ashburner M, Ball C, Blake J. et al. Gene ontology: tool for the unification of biology. Nat Genet 2000;25:25–9. 10.1038/75556.
965 108 Gligorijević V, Renfrew PD, Kosciolek T. et al. Structure-based protein function prediction using graph convolutional networks. Nat Commun 2021;12:3168. 10.1038/s41467-021-23303-9.
966 109 Kulmanov M, Hoehndorf R. DeepGOZero: improving protein function prediction from sequence and zero-shot learning based on ontology axioms. Bioinformatics 2022;38:i238–45. 10.1093/bioinformatics/btac256.
967 110 Yang F, Wang W, Wang F. et al. scBERT as a large-scale pre-trained deep language model for cell type annotation of single-cell RNA-seq data. Nat Mach Intell 2022;4:852–66. 10.1038/s42256-022-00534-z.
968 111 Choromanski K, Likhosherstov V, Dohan D. et al. Rethinking attention with performers. arXiv preprint arXiv:2009.14794, 2020.
969 112 Lu Y, Jiang X, Fang Y. et al. Learning to pre-train graph neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 2021;35:4276–84. 10.1609/aaai.v35i5.16552.
970 113 Li C, Liu B, Kang B. et al. SciBet as a portable and fast single cell type identifier. Nat Commun 2020;11:1818. 10.1038/s41467-020-15523-2.
971 114 Kiselev VY, Yiu A, Hemberg M. scmap: projection of single-cell RNA-seq data across data sets. Nat Methods 2018;15:359–62. 10.1038/nmeth.4644.
972 115 Yang X, Mann KK, Wu H. et al. scCross: a deep generative model for unifying single-cell multi-omics with seamless integration, cross-modal generation, and in silico exploration. Genome Biol 2024;25:198. 10.1186/s13059-024-03338-z.
973 116 Hao M, Gong J, Zeng X. et al. Large-scale foundation model on single-cell transcriptomics. Nat Methods 2024;21:1481–1491.
974 117 Saharia C, Chan W, Saxena S. et al. Photorealistic text-to-image diffusion models with deep language understanding. Adv Neural Inf Process Syst 2022;35:36479–94.
975 118 Cao ZJ, Gao G. Multi-omics single-cell data integration and regulatory inference with graph-linked embedding. Nat Biotechnol 2022;40:1458–66. 10.1038/s41587-022-01284-4.
976 119 Ciciani M, Demozzi M, Pedrazzoli E. et al. Automated identification of sequence-tailored Cas9 proteins using massive metagenomic data. Nat Commun 2022;13:6474. 10.1038/s41467-022-34213-9.
977 120 Ruiz C, Zitnik M, Leskovec J. Identification of disease treatment mechanisms through the multiscale interactome. Nat Commun 2021;12:1796. 10.1038/s41467-021-21770-8.
978 121 Eraslan G, Avsec Ž, Gagneur J. et al. Deep learning: new computational modeling techniques for genomics. Nat Rev Genet 2019;20:389–403. 10.1038/s41576-019-0122-6.
979 122 Poli M, Massaroli S, Nguyen E. et al. Hyena hierarchy: towards larger convolutional language models. In: International Conference on Machine Learning. PMLR, 2023;28043–28078.
980 123 Jeliazkov JR, del Alamo D, Karpiak JD. Esmfold hallucinates native-like protein sequences. bioRxiv 2023; 2023–05.
981 124 Wang Z, Dai Z, Póczos B. et al. Characterizing and avoiding negative transfer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019; 11293–11302.
982 125 Wang H, Kaddour J, Liu S. et al. Evaluating self-supervised learning for molecular graph embeddings. Advances in Neural Information Processing Systems, 2024;36.
983 126 Zhou H, Zhang S, Peng J. et al. Informer: beyond efficient transformer for long sequence time-series forecasting. Proceedings of the AAAI Conference on Artificial Intelligence 2021;35:11106–15. 10.1609/aaai.v35i12.17325.
984 127 Press O, Smith NA, Lewis M. Shortformer: better language modeling using shorter inputs. arXiv preprint arXiv:2012.15832, 2020.
985 128 Li C, Zhang M, He Y. The stability-efficiency dilemma: investigating sequence length warmup for training GPT models. Adv Neural Inf Process Syst 2022;35:26736–50.
986 129 Dao T, Fu D, Ermon S. et al. FlashAttention: fast and memory-efficient exact attention with IO-awareness. Adv Neural Inf Process Syst 2022;35:16344–59.
987 130 Ainslie J, Lee-Thorp J, de Jong M. et al. GQA: training generalized multi-query transformer models from multi-head checkpoints. arXiv preprint arXiv:2305.13245, 2023.
988 131 Hijma P, Heldens S, Sclocco A. et al. Optimization techniques for GPU programming. ACM Comput Surv 2023;55:1–81. 10.1145/3570638.
989 132 Cui P, Athey S. Stable learning establishes some common ground between causal inference and machine learning. Nat Mach Intell 2022;4:110–5. 10.1038/s42256-022-00445-z.
990 133 Jin Q, Yuan Z, Xiong G. et al. Biomedical question answering: a survey of approaches and challenges. ACM Comput Surv 2022;55:1–36. 10.1145/3490238.
991 134 Danaee P, Rouches M, Wiley M. et al. bpRNA: large-scale automated annotation and analysis of RNA secondary structure. Nucleic Acids Res 2018;46:5381–94. 10.1093/nar/gky285.
992 135 Moon I, LoPiccolo J, Baca SC. et al. Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary. Nat Med 2023;29:2057–67. 10.1038/s41591-023-02482-6.
993 136 Wornow M, Xu Y, Thapa R. et al. The shaky foundations of large language models and foundation models for electronic health records. npj Digit Med 2023;6:135. 10.1038/s41746-023-00879-8.
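Several items in the list above (refs 26 and 84: DNABERT and DNABERT-2) hinge on how a nucleotide string is turned into tokens before transformer pre-training. The sketch below shows the overlapping k-mer scheme DNABERT describes; it is an illustrative reimplementation, not the models' own tokenizer code (DNABERT-2 in fact replaces k-mers with BPE).

```python
def kmer_tokenize(sequence: str, k: int = 6) -> list[str]:
    """Split a DNA sequence into overlapping k-mers (stride 1),
    the input representation DNABERT describes."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

# A sequence of length n yields n - k + 1 tokens.
seq = "ATGCGTACGTTAG"          # length 13
tokens = kmer_tokenize(seq, k=6)
print(tokens[:3])               # ['ATGCGT', 'TGCGTA', 'GCGTAC']
print(len(tokens))              # 8
```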
994 23 0 [23] F. I. Marin, F. Teufel, M. Horlacher, D. Madsen, D. Pultz, O. Winther, and W. Boomsma. BEND: benchmarking DNA language models on biologically meaningful tasks. 12th International Conference on Learning Representations, ICLR 2024, 11 2023. URL https://arxiv.org/pdf/2311.12570. https://qiita.com/kaizen_nagoya/items/8417e72454d2107a9d06
995 1 Babak Alipanahi, Andrew Delong, Matthew T Weirauch, and Brendan J Frey. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology, 33(8):831–838, August 2015. ISSN 1087-0156, 1546-1696. doi: 10.1038/nbt.3300. URL https://www.nature.com/articles/nbt.3300.
996 2 Weizhi An, Yuzhi Guo, Yatao Bian, Hehuan Ma, Jinyu Yang, Chunyuan Li, and Junzhou Huang. MoDNA: motif-oriented pre-training for DNA language model. In Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, pp. 1–5, Northbrook Illinois, August 2022. ACM. ISBN 978-1-4503-9386-7. doi: 10.1145/3535508.3545512. URL https://dl.acm.org/doi/10.1145/3535508.3545512.
997 3 Christof Angermueller, Heather J. Lee, Wolf Reik, and Oliver Stegle. DeepCpG: accurate prediction of single-cell DNA methylation states using deep learning. Genome Biology, 18(1):67, December 2017. ISSN 1474-760X. doi: 10.1186/s13059-017-1189-z. URL http://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1189-z.
998 4 Adam Auton, Gonçalo R. Abecasis, David M. Altshuler, Richard M. Durbin, et al. (The 1000 Genomes Project Consortium). A global reference for human genetic variation. Nature, 526(7571):68–74, October 2015. ISSN 1476-4687. doi: 10.1038/nature15393. URL https://www.nature.com/articles/nature15393. Number: 7571 Publisher: Nature Publishing Group.
1000 6 Ziga Avsec, Vikram Agarwal, Daniel Visentin, Joseph R. Ledsam, Agnieszka Grabska-Barwinska, Kyle R. Taylor, Yannis Assael, John Jumper, Pushmeet Kohli, and David R. Kelley. Effective gene expression prediction from sequence by integrating long-range interactions. Nature Methods, 18(10):1196–1203, October 2021. ISSN 1548-7105. doi: 10.1038/s41592-021-01252-x. URL https://www.nature.com/articles/s41592-021-01252-x. Number: 10 Publisher: Nature Publishing Group.
1001 7 Andrew J Bannister and Tony Kouzarides. Regulation of chromatin by histone modifications. Cell Research, 21(3):381–395, March 2011. ISSN 1748-7838. doi: 10.1038/cr.2011.22. URL https://doi.org/10.1038/cr.2011.22.
1002 8 Gonzalo Benegas, Sanjit Singh Batra, and Yun S. Song. DNA language models are powerful zero-shot predictors of genome-wide variant effects. bioRxiv, pp. 2022.08.22.504706, January 2023. doi: 10.1101/2022.08.22.504706. URL http://biorxiv.org/content/early/2023/04/12/2022.08.22.504706.abstract.
1003 9 Tristan Bepler and Bonnie Berger. Learning the protein language: Evolution, structure, and function. Cell Systems, 12(6):654–669.e3, June 2021. ISSN 24054712. doi: 10.1016/j.cels.2021.05.017. URL https://linkinghub.elsevier.com/retrieve/pii/S2405471221002039.
1004 10 Kathleen M. Chen, Aaron K. Wong, Olga G. Troyanskaya, and Jian Zhou. A sequence-based global map of regulatory activity for deciphering human genetics. Nature Genetics, 54(7):940–949, July 2022. ISSN 1061-4036, 1546-1718. doi: 10.1038/s41588-022-01102-2. URL https://www.nature.com/articles/s41588-022-01102-2.
1005 11 Lei Cheng, Tong Yu, Tero Aittokallio, Jukka Corander, Ruslan Khalitov, and Zhirong Yang. Self-supervised learning for DNA sequences with circular dilated convolutional networks. preprint, Bioinformatics, February 2023. URL http://biorxiv.org/lookup/doi/10.1101/2023.01.30.526193.
1006 12 Hugo Dalla-Torre, Liam Gonzalez, Javier Mendoza Revilla, Nicolas Lopez Carranza, Adam Henryk Grzywaczewski, Francesco Oteri, Christian Dallago, Evan Trop, Hassan Sirelkhatim, Guillaume Richard, Marcin Skwark, Karim Beguir, Marie Lopez, and Thomas Pierrot. The Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics. bioRxiv, pp. 2023.01.11.523679, January 2023. doi: 10.1101/2023.01.11.523679. URL http://biorxiv.org/content/early/2023/01/15/2023.01.11.523679.abstract.
1007 13 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 2018. doi: 10.48550/ARXIV.1810.04805. URL https://arxiv.org/abs/1810.04805. Publisher: arXiv Version Number: 2.
1008 14 Ahmed Elnaggar, Michael Heinzinger, Christian Dallago, Ghalia Rehawi, Yu Wang, Llion Jones, Tom Gibbs, Tamas Feher, Christoph Angerer, Martin Steinegger, Debsindhu Bhowmik, and Burkhard Rost. ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):7112–7127, October 2022. ISSN 0162-8828, 2160-9292, 1939-3539. doi: 10.1109/TPAMI.2021.3095381. URL https://ieeexplore.ieee.org/document/9477085/.
1009 15 ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature, 489(7414):57–74, September 2012. ISSN 1476-4687. doi: 10.1038/nature11247.
1010 16 Veniamin Fishman, Yuri Kuratov, Maxim Petrov, Aleksei Shmelev, Denis Shepelin, Nikolay Chekanov, Olga Kardymon, and Mikhail Burtsev. GENA-LM: A Family of Open-Source Foundational Models for Long DNA Sequences, June 2023. URL https://www.biorxiv.org/content/10.1101/2023.06.12.544594v1. Pages: 2023.06.12.544594 Section: New Results.
1011 17 Adam Frankish, Mark Diekhans, Irwin Jungreis, Julien Lagarde, Jane E Loveland, Jonathan M Mudge, Cristina Sisu, James C Wright, Joel Armstrong, If Barnes, Andrew Berry, Alexandra Bignell, Carles Boix, Silvia Carbonell Sala, Fiona Cunningham, Tomás Di Domenico, Sarah Donaldson, Ian T Fiddes, Carlos García Girón, Jose Manuel Gonzalez, Tiago Grego, Matthew Hardy, Thibaut Hourlier, Kevin L Howe, Toby Hunt, Osagie G Izuogu, Rory Johnson, Fergal J Martin, Laura Martínez, Shamika Mohanan, Paul Muir, Fabio C P Navarro, Anne Parker, Baikang Pei, Fernando Pozo, Ferriol Calvet Riera, Magali Ruffier, Bianca M Schmitt, Eloise Stapleton, Marie-Marthe Suner, Irina Sycheva, Barbara Uszczynska-Ratajczak, Maxim Y Wolf, Jinuri Xu, Yucheng T Yang, Andrew Yates, Daniel Zerbino, Yan Zhang, Jyoti S Choudhary, Mark Gerstein, Roderic Guigó, Tim J P Hubbard, Manolis Kellis, Benedict Paten, Michael L Tress, and Paul Flicek. GENCODE 2021. Nucleic Acids Research, 49(D1):D916–D923, January 2021. ISSN 0305-1048. doi: 10.1093/nar/gkaa1087. URL https://doi.org/10.1093/nar/gkaa1087.
1012 18 Jonathan Frazer, Pascal Notin, Mafalda Dias, Aidan Gomez, Joseph K. Min, Kelly Brock, Yarin Gal, and Debora S. Marks. Disease variant prediction with deep generative models of evolutionary data. Nature, 599(7883):91–95, November 2021. ISSN 1476-4687. doi: 10.1038/s41586-021-04043-8. URL https://doi.org/10.1038/s41586-021-04043-8.
1013 19 Charles P. Fulco, Joseph Nasser, Thouis R. Jones, Glen Munson, Drew T. Bergman, Vidya Subramanian, Sharon R. Grossman, Rockwell Anyoha, Benjamin R. Doughty, Tejal A. Patwardhan, Tung H. Nguyen, Michael Kane, Elizabeth M. Perez, Neva C. Durand, Caleb A. Lareau, Elena K. Stamenova, Erez Lieberman Aiden, Eric S. Lander, and Jesse M. Engreitz. Activity-by-contact model of enhancer–promoter regulation from thousands of CRISPR perturbations. Nature Genetics, 51(12):1664–1669, December 2019. ISSN 1546-1718. doi: 10.1038/s41588-019-0538-0. URL https://www.nature.com/articles/s41588-019-0538-0. Number: 12 Publisher: Nature Publishing Group.
1014 20 Dennis Gankin, Alexander Karollus, Martin Grosshauser, Kristian Klemon, Johannes Hingerl, and Julien Gagneur. Species-aware DNA language modeling. bioRxiv, pp. 2023.01.26.525670, January 2023. doi: 10.1101/2023.01.26.525670. URL http://biorxiv.org/content/early/2023/01/27/2023.01.26.525670.abstract.
1015 21 Molly Gasperini, Andrew J. Hill, José L. McFaline-Figueroa, Beth Martin, Seungsoo Kim, Melissa D. Zhang, Dana Jackson, Anh Leith, Jacob Schreiber, William S. Noble, Cole Trapnell, Nadav Ahituv, and Jay Shendure. A Genome-wide Framework for Mapping Gene Regulation via Cellular Genetic Screens. Cell, 176(1):377–390.e19, January 2019. ISSN 0092-8674. doi: 10.1016/j.cell.2018.11.029. URL https://www.sciencedirect.com/science/article/pii/S009286741831554X.
1016 22 Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé, and Kate Crawford. Datasheets for Datasets. 2018. doi: 10.48550/ARXIV.1803.09010. URL https://arxiv.org/abs/1803.09010. Publisher: arXiv Version Number: 8.
1017 23 J. Gorodkin. Comparing two K-category assignments by a K-category correlation coefficient. Computational Biology and Chemistry, 28(5):367–374, December 2004. ISSN 1476-9271. doi: 10.1016/j.compbiolchem.2004.09.006. URL https://www.sciencedirect.com/science/article/pii/S1476927104000799.
1018 24 Katarína Grešová, Vlastimil Martinek, David Čechák, Petr Šimeček, and Panagiotis Alexiou. Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genomic Data, 24(1):25, May 2023. ISSN 2730-6844. doi: 10.1186/s12863-023-01123-8. URL https://doi.org/10.1186/s12863-023-01123-8.
1019 25 Benjamin C. Hitz, Jin-Wook Lee, Otto Jolanki, Meenakshi S. Kagda, Keenan Graham, Paul Sud, Idan Gabdank, J. Seth Strattan, Cricket A. Sloan, Timothy Dreszer, Laurence D. Rowe, Nikhil R. Podduturi, Venkat S. Malladi, Esther T. Chan, Jean M. Davidson, Marcus Ho, Stuart Miyasato, Matt Simison, Forrest Tanaka, Yunhai Luo, Ian Whaling, Eurie L. Hong, Brian T. Lee, Richard Sandstrom, Eric Rynes, Jemma Nelson, Andrew Nishida, Alyssa Ingersoll, Michael Buckley, Mark Frerker, Daniel S Kim, Nathan Boley, Diane Trout, Alex Dobin, Sorena Rahmanian, Dana Wyman, Gabriela Balderrama-Gutierrez, Fairlie Reese, Neva C. Durand, Olga Dudchenko, David Weisz, Suhas S. P. Rao, Alyssa Blackburn, Dimos Gkountaroulis, Mahdi Sadr, Moshe Olshansky, Yossi Eliaz, Dat Nguyen, Ivan Bochkov, Muhammad Saad Shamim, Ragini Mahajan, Erez Aiden, Tom Gingeras, Simon Heath, Martin Hirst, W. James Kent, Anshul Kundaje, Ali Mortazavi, Barbara Wold, and J. Michael Cherry. The ENCODE Uniform Analysis Pipelines. preprint, Bioinformatics, April 2023. URL http://biorxiv.org/lookup/doi/10.1101/2023.04.04.535623.
1020 26 Yanrong Ji, Zhihan Zhou, Han Liu, and Ramana V Davuluri. DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome. Bioinformatics, 37(15):2112–2120, August 2021. ISSN 1367-4803. doi: 10.1093/bioinformatics/btab083. URL https://doi.org/10.1093/bioinformatics/btab083.
1021 27 Meenakshi S. Kagda, Bonita Lam, Casey Litton, Corinn Small, Cricket A. Sloan, Emma Spragins, Forrest Tanaka, Ian Whaling, Idan Gabdank, Ingrid Youngworth, J. Seth Strattan, Jason Hilton, Jennifer Jou, Jessica Au, Jin-Wook Lee, Kalina Andreeva, Keenan Graham, Khine Lin, Matt Simison, Otto Jolanki, Paul Sud, Pedro Assis, Philip Adenekan, Eric Douglas, Mingjie Li, Pedro Assis, Keenan Graham, Paul Sud, Stuart Miyasato, Weiwei Zhong, Yunhai Luo, Zachary Myers, J. Michael Cherry, and Benjamin C. Hitz. Data navigation on the ENCODE portal. 2023. doi: 10.48550/ARXIV.2305.00006. URL https://arxiv.org/abs/2305.00006. Publisher: arXiv Version Number: 2.
1022 28 David R. Kelley, Jasper Snoek, and John L. Rinn. Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Research, 26(7):990–999, July 2016. ISSN 1549-5469. doi: 10.1101/gr.200535.115.
1023 29 David R. Kelley, Yakir A. Reshef, Maxwell Bileschi, David Belanger, Cory Y. McLean, and Jasper Snoek. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Research, 28(5):739–750, May 2018. ISSN 1088-9051, 1549-5469. doi: 10.1101/gr.227819.117. URL http://genome.cshlp.org/lookup/doi/10.1101/gr.227819.117.
1024 30 Andriy Kryshtafovych, Torsten Schwede, Maya Topf, Krzysztof Fidelis, and John Moult. Critical assessment of methods of protein structure prediction (CASP)—Round XIV. Proteins: Structure, Function, and Bioinformatics, 89(12):1607–1617, 2021. ISSN 1097-0134. doi: 10.1002/prot.26237. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/prot.26237.
1026 32 Melissa J Landrum, Shanmuga Chitipiralla, Garth R Brown, Chao Chen, Baoshan Gu, Jennifer Hart, Douglas Hoffman, Wonhee Jang, Kuljeet Kaur, Chunlei Liu, Vitaly Lyoshin, Zenith Maddipatla, Rama Maiti, Joseph Mitchell, Nuala O’Leary, George R Riley, Wenyao Shi, George Zhou, Valerie Schneider, Donna Maglott, J Bradley Holmes, and Brandi L Kattman. ClinVar: improvements to accessing data. Nucleic Acids Research, 48(D1):D835–D844, January 2020. ISSN 0305-1048. doi: 10.1093/nar/gkz972. URL https://doi.org/10.1093/nar/gkz972.
1027 33 Richard Leslie, Christopher J. O'Donnell, and Andrew D. Johnson. GRASP: analysis of genotype–phenotype results from 1390 genome-wide association studies and corresponding open access database. Bioinformatics, 30(12):i185–i194, June 2014. ISSN 1367-4803. doi: 10.1093/bioinformatics/btu273. URL https://doi.org/10.1093/bioinformatics/btu273.
1028 34 Benjamin Levy, Zihao Xu, Liyang Zhao, Karl Kremling, Ross Altman, Phoebe Wong, and Chris Tanner. FloraBERT: cross-species transfer learning with attention-based neural networks for gene expression prediction. preprint, In Review, August 2022. URL https://www.researchsquare.com/article/rs-1927200/v1.
1029 35 Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Nikita Smetanin, Robert Verkuil, Ori Kabeli, Yaniv Shmueli, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Salvatore Candido, and Alexander Rives. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379(6637):1123–1130, March 2023. ISSN 0036-8075, 1095-9203. doi: 10.1126/science.ade2574. URL https://www.science.org/doi/10.1126/science.ade2574.
1030 36 Yunhai Luo, Benjamin C. Hitz, Idan Gabdank, Jason A. Hilton, Meenakshi S. Kagda, Bonita Lam, Zachary Myers, Paul Sud, Jennifer Jou, Khine Lin, Ulugbek K. Baymuradov, Keenan Graham, Casey Litton, Stuart R. Miyasato, J. Seth Strattan, Otto Jolanki, Jin-Wook Lee, Forrest Y. Tanaka, Philip Adenekan, Emma O’Neill, and J. Michael Cherry. New developments on the Encyclopedia of DNA Elements (ENCODE) data portal. Nucleic Acids Research, 48(D1):D882–D889, January 2020. ISSN 1362-4962. doi: 10.1093/nar/gkz1062.
1031 37 Ali Madani, Ben Krause, Eric R. Greene, Subu Subramanian, Benjamin P. Mohr, James M. Holton, Jose Luis Olmos, Caiming Xiong, Zachary Z. Sun, Richard Socher, James S. Fraser, and Nikhil Naik. Large language models generate functional protein sequences across diverse families. Nature Biotechnology, January 2023. ISSN 1087-0156, 1546-1696. doi: 10.1038/s41587-022-01618-2. URL https://www.nature.com/articles/s41587-022-01618-2.
1032 38 William McLaren, Laurent Gil, Sarah E. Hunt, Harpreet Singh Riat, Graham R. S. Ritchie, Anja Thormann, Paul Flicek, and Fiona Cunningham. The Ensembl Variant Effect Predictor. Genome Biology, 17(1):122, June 2016. ISSN 1474-760X. doi: 10.1186/s13059-016-0974-4. URL https://doi.org/10.1186/s13059-016-0974-4.
1033 39 The 1000 Genomes Project Consortium (Gil A. McVean, David M. Altshuler, Richard M. Durbin, et al.). An integrated map of genetic variation from 1,092 human genomes. Nature, 491(7422):56–65, November 2012. ISSN 1476-4687. doi: 10.1038/nature11632. URL https://doi.org/10.1038/nature11632.
1035 41 Stephen Merity, Nitish Shirish Keskar, and Richard Socher. Regularizing and Optimizing LSTM Language Models. 2017. doi: 10.48550/ARXIV.1708.02182. URL https://arxiv.org/abs/1708.02182. Publisher: arXiv Version Number: 1.
1037 43 Eric Nguyen, Michael Poli, Marjan Faizi, Armin Thomas, Callum Birch-Sykes, Michael Wornow, Aman Patel, Clayton Rabideau, Stefano Massaroli, Yoshua Bengio, Stefano Ermon, Stephen A. Baccus, and Chris Ré. HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution. 2023. doi: 10.48550/ARXIV.2306.15794. URL https://arxiv.org/abs/2306.15794. Publisher: arXiv Version Number: 1.
1038 44 Michael Poli, Stefano Massaroli, Eric Nguyen, Daniel Y. Fu, Tri Dao, Stephen Baccus, Yoshua Bengio, Stefano Ermon, and Christopher Ré. Hyena Hierarchy: Towards Larger Convolutional Language Models, April 2023. URL http://arxiv.org/abs/2302.10866. arXiv:2302.10866 [cs].
1039 45 Ofir Press, Noah A. Smith, and Mike Lewis. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation, April 2022. URL http://arxiv.org/abs/2108.12409. arXiv:2108.12409 [cs].
1040 46 Roshan Rao, Nicholas Bhattacharya, Neil Thomas, Yan Duan, Xi Chen, John Canny, Pieter Abbeel, and Yun S. Song. Evaluating Protein Transfer Learning with TAPE. Advances in neural information processing systems, 32:9689–9701, December 2019. ISSN 1049-5258. URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7774645/.
1041 47 Roshan Rao, Joshua Meier, Tom Sercu, Sergey Ovchinnikov, and Alexander Rives. Transformer protein language models are unsupervised structure learners. preprint, Synthetic Biology, December 2020. URL http://biorxiv.org/lookup/doi/10.1101/2020.12.15.422761.
1042 48 Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, and Rob Fergus. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. preprint, Synthetic Biology, April 2019. URL http://biorxiv.org/lookup/doi/10.1101/622803.
1043 49 Marco Salvatore, Marc Horlacher, Annalisa Marsico, Ole Winther, and Robin Andersson. Transfer learning identifies sequence determinants of cell-type specific regulatory element accessibility. NAR Genomics and Bioinformatics, 5(2):lqad026, March 2023. ISSN 2631-9268. doi: 10.1093/nargab/lqad026. URL https://academic.oup.com/nargab/article/doi/10.1093/nargab/lqad026/7092956.
1044 50 Melissa Sanabria, Jonas Hirsch, and Anna R. Poetsch. The human genome’s vocabulary as proposed by the DNA language model GROVER, September 2023. URL https://www.biorxiv.org/content/10.1101/2023.07.19.549677v2.
1045 51 Nicolas Scalzitti, Anne Jeannin-Girardon, Pierre Collet, Olivier Poch, and Julie D. Thompson. A benchmark study of ab initio gene prediction methods in diverse eukaryotic organisms. BMC Genomics, 21(1):293, December 2020. ISSN 1471-2164. doi:10.1186/s12864-020-6707-9. URL https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-020-6707-9.
1046 52 Valerie A. Schneider, Tina Graves-Lindsay, Kerstin Howe, Nathan Bouk, Hsiu-Chuan Chen, Paul A. Kitts, Terence D. Murphy, Kim D. Pruitt, Françoise Thibaud-Nissen, Derek Albracht, Robert S. Fulton, Milinn Kremitzki, Vincent Magrini, Chris Markovic, Sean McGrath, Karyn Meltz Steinberg, Kate Auger, William Chow, Joanna Collins, Glenn Harden, Timothy Hubbard, Sarah Pelan, Jared T. Simpson, Glen Threadgold, James Torrance, Jonathan M. Wood, Laura Clarke, Sergey Koren, Matthew Boitano, Paul Peluso, Heng Li, Chen-Shan Chin, Adam M. Phillippy, Richard Durbin, Richard K. Wilson, Paul Flicek, Evan E. Eichler, and Deanna M. Church. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Research, 27(5):849–864, May 2017. ISSN 1549-5469. doi: 10.1101/gr.213611.116.
1047 53 Ritambhara Singh, Jack Lanchantin, Gabriel Robins, and Yanjun Qi. DeepChrome: deep-learning for predicting gene expression from histone modifications. Bioinformatics, 32(17):i639–i648, September 2016. ISSN 1367-4803, 1367-4811. doi: 10.1093/bioinformatics/btw427. URL https://academic.oup.com/bioinformatics/article/32/17/i639/2450757.
1048 54 Mario Stanke and Stephan Waack. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics (Oxford, England), 19 Suppl 2:ii215–225, October 2003. ISSN 1367-4811. doi: 10.1093/bioinformatics/btg1080.
1049 55 Felix Teufel, Magnús Halldór Gíslason, José Juan Almagro Armenteros, Alexander Rosenberg Johansen, Ole Winther, and Henrik Nielsen. GraphPart: homology partitioning for biological sequence analysis. NAR Genomics and Bioinformatics, 5(4):lqad088, October 2023. ISSN 2631-9268. doi: 10.1093/nargab/lqad088. URL https://academic.oup.com/nargab/article/doi/10.1093/nargab/lqad088/7318077.
1050 56 Jesse Vig, Ali Madani, Lav R. Varshney, Caiming Xiong, Richard Socher, and Nazneen Fatema Rajani. BERTology Meets Biology: Interpreting Attention in Protein Language Models, March 2021. URL http://arxiv.org/abs/2006.15222. arXiv:2006.15222 [cs, q-bio].
1051 57 Minghao Xu, Zuobai Zhang, Jiarui Lu, Zhaocheng Zhu, Yangtian Zhang, Ma Chang, Runcheng Liu, and Jian Tang. PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding. Advances in Neural Information Processing Systems, 35:35156–35173, December 2022. URL https://proceedings.neurips.cc/paper_files/paper/2022/hash/e467582d42d9c13fa9603df16f31de6d-Abstract-Datasets_and_Benchmarks.html.
1052 58 Meng Yang, Haiping Huang, Lichao Huang, Nan Zhang, Jihong Wu, Huanming Yang, and Feng Mu. LOGO, a contextualized pre-trained language model of human genome flexibly adapts to various downstream tasks by fine-tuning. preprint, In Review, August 2021. URL https://www.researchsquare.com/article/rs-448927/v1.
1053 59 Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, and Amr Ahmed. Big Bird: Transformers for Longer Sequences. In Advances in Neural Information Processing Systems, volume 33, pp. 17283–17297. Curran Associates, Inc., 2020. URL https://proceedings.neurips.cc/paper/2020/hash/c8512d142a2d849725f31a9a7a361ab9-Abstract.html.
1054 60 Jian Zhou and Olga G Troyanskaya. Predicting effects of noncoding variants with deep learning–based sequence model. Nature Methods, 12(10):931–934, October 2015. ISSN 1548-7091, 1548-7105. doi: 10.1038/nmeth.3547. URL https://www.nature.com/articles/nmeth.3547.
1055 61 Naihui Zhou, Yuxiang Jiang, Timothy R. Bergquist, et al. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biology, 20(1):244, November 2019. ISSN 1474-760X. doi: 10.1186/s13059-019-1835-8. URL https://doi.org/10.1186/s13059-019-1835-8.
1056 62 Zhihan Zhou, Yanrong Ji, Weijian Li, Pratik Dutta, Ramana Davuluri, and Han Liu. DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome, June 2023. URL http://arxiv.org/abs/2306.15006. arXiv:2306.15006 [cs, q-bio].
1057 24 0 [24] E. Nguyen, M. Poli, M. Faizi, A. W. Thomas, C. B. Sykes, M. Wornow, A. Patel, C. Rabideau, S. Massaroli, Y. Bengio, S. Ermon, S. A. Baccus, and C. Ré. Hyenadna: Long-range genomic sequence modeling at single nucleotide resolution. ArXiv, 6 2023. ISSN 2331-8422. URL https://arxiv.org/pdf/2306.15794. https://qiita.com/kaizen_nagoya/items/07e1ba1138b0825c8a73
1058 1 Ž. Avsec, V. Agarwal, D. Visentin, J. R. Ledsam, A. Grabska-Barwinska, K. R. Taylor, Y. Assael, J. Jumper, P. Kohli, and D. R. Kelley. Effective gene expression prediction from sequence by integrating long-range interactions. Nature methods, 18(10):1196–1203, 2021.
1059 2 D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.
1060 3 G. Benegas, S. S. Batra, and Y. S. Song. DNA language models are powerful zero-shot predictors of noncoding variant effects. bioRxiv, pages 2022–08, 2022.
1061 4 R. Bommasani, D. A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskill, et al. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258, 2021.
1062 5 N. Brandes, D. Ofer, Y. Peleg, N. Rappoport, and M. Linial. ProteinBERT: a universal deep-learning model of protein sequence and function. Bioinformatics, 38(8):2102–2110, 2022.
1063 6 T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020.
1064 7 T. Chen and C. Guestrin. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, pages 785–794, 2016.
1065 8 D. M. Church, V. A. Schneider, T. Graves, K. Auger, F. Cunningham, N. Bouk, H.-C. Chen, R. Agarwala, W. M. McLaren, G. R. Ritchie, et al. Modernizing reference genome assemblies. PLoS biology, 9(7): e1001091, 2011.
1066 9 F. Cunningham, J. E. Allen, J. Allen, J. Alvarez-Jarreta, M. R. Amode, I. M. Armean, O. Austine-Orimoloye, A. G. Azov, I. Barnes, R. Bennett, et al. Ensembl 2022. Nucleic acids research, 50(D1):D988–D995, 2022.
1067 10 H. Dalla-Torre, L. Gonzalez, J. Mendoza-Revilla, N. L. Carranza, A. H. Grzywaczewski, F. Oteri, C. Dallago, E. Trop, H. Sirelkhatim, G. Richard, M. Skwark, K. Beguir, M. Lopez, and T. Pierrot. The Nucleotide Transformer: Building and evaluating robust foundation models for human genomics. bioRxiv, 2023.
1068 11 T. Dao, D. Y. Fu, S. Ermon, A. Rudra, and C. Ré. FlashAttention: Fast and memory-efficient exact attention with IO-awareness. In Advances in Neural Information Processing Systems, 2022a.
1069 12 T. Dao, D. Y. Fu, K. K. Saab, A. W. Thomas, A. Rudra, and C. Ré. Hungry hungry hippos: Towards language modeling with state space models. arXiv preprint arXiv:2212.14052, 2022b.
1070 13 J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
1071 14 A. Elnaggar, M. Heinzinger, C. Dallago, G. Rehawi, Y. Wang, L. Jones, T. Gibbs, T. Feher, C. Angerer, M. Steinegger, et al. Prottrans: Toward understanding the language of life through self-supervised learning. IEEE transactions on pattern analysis and machine intelligence, 44(10):7112–7127, 2021.
1072 15 ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature, 489(7414):57, 2012.
1073 16 ENCODE Project Consortium. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature, 583:699–710, 2020.
1074 17 N. Ferruz, S. Schmidt, and B. Höcker. ProtGPT2 is a deep unsupervised language model for protein design. Nature communications, 13(1):4348, 2022.
1075 18 Q. Fournier, G. M. Caron, and D. Aloise. A practical survey on faster and lighter transformers. ACM Computing Surveys, 2021.
1076 19 D. Y. Fu, E. L. Epstein, E. Nguyen, A. W. Thomas, M. Zhang, T. Dao, A. Rudra, and C. Ré. Simple hardware-efficient long convolutions for sequence modeling. arXiv preprint arXiv:2302.06646, 2023.
1077 20 D. Gankin, A. Karollus, M. Grosshauser, K. Klemon, J. Hingerl, and J. Gagneur. Species-aware DNA language modeling. bioRxiv, pages 2023–01, 2023.
1078 21 Q. Geng, R. Yang, and L. Zhang. A deep learning framework for enhancer prediction using word embedding and sequence generation. Biophysical Chemistry, 286:106822, 2022.
1079 22 Genome Reference Consortium. Genome reference consortium human build 38 (GRCh38). National Center for Biotechnology Information, 2013. URL https://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.26/.
1080 23 K. Gresova, V. Martinek, D. Cechak, P. Simecek, and P. Alexiou. Genomic Benchmarks: A collection of datasets for genomic sequence classification. bioRxiv, 2022.
1081 24 A. Gu, K. Goel, and C. Ré. Efficiently modeling long sequences with structured state spaces. arXiv preprint arXiv:2111.00396, 2021.
1082 25 Y. Ji, Z. Zhou, H. Liu, and R. V. Davuluri. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics, 37(15):2112–2120, 2021.
1083 26 W. J. Kent, C. W. Sugnet, T. S. Furey, K. M. Roskin, T. H. Pringle, A. M. Zahler, and D. Haussler. The human genome browser at UCSC. Genome research, 12(6):996–1006, 2002.
1084 27 B. Lester, R. Al-Rfou, and N. Constant. The power of scale for parameter-efficient prompt tuning. arXiv preprint arXiv:2104.08691, 2021.
1085 28 C. Li, M. Zhang, and Y. He. The stability-efficiency dilemma: Investigating sequence length warmup for training GPT models. In Advances in Neural Information Processing Systems, 2022.
1086 29 Z. Lin, H. Akin, R. Rao, B. Hie, Z. Zhu, W. Lu, A. dos Santos Costa, M. Fazel-Zarandi, T. Sercu, S. Candido, et al. Language models of protein sequences at the scale of evolution enable accurate structure prediction. BioRxiv, 2022.
1087 30 P. Liu, W. Yuan, J. Fu, Z. Jiang, H. Hayashi, and G. Neubig. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 55(9):1–35, 2023.
1088 31 A. Madani, B. Krause, E. R. Greene, S. Subramanian, B. P. Mohr, J. M. Holton, J. L. Olmos Jr, C. Xiong, Z. Z. Sun, R. Socher, et al. Large language models generate functional protein sequences across diverse families. Nature Biotechnology, pages 1–8, 2023.
1089 32 J. Meier, R. Rao, R. Verkuil, J. Liu, T. Sercu, and A. Rives. Language models enable zero-shot prediction of the effects of mutations on protein function. Advances in Neural Information Processing Systems, 34: 29287–29303, 2021.
1090 33 J. Nasser, D. T. Bergman, C. P. Fulco, P. Guckelberger, B. R. Doughty, T. A. Patwardhan, T. R. Jones, T. H. Nguyen, J. C. Ulirsch, F. Lekschas, K. Mualim, H. M. Natri, E. M. Weeks, G. Munson, M. Kane, H. Y. Kang, A. Cui, J. P. Ray, T. M. Eisenhaure, R. L. Collins, K. Dey, H. Pfister, A. L. Price, C. B. Epstein, A. Kundaje, R. J. Xavier, M. J. Daly, H. Huang, H. K. Finucane, N. Hacohen, E. S. Lander, and J. M. Engreitz. Genome-wide enhancer maps link risk variants to disease genes. Nature, 593:238–243, 2021.
1091 34 M. Oubounyt, Z. Louadi, H. Tayara, and K. T. Chong. DeePromoter: Robust promoter predictor using deep learning. Frontiers in Genetics, 10, 2019.
1092 35 T. H. Pham, D. H. Tran, T. B. H. Ho, K. Satou, and G. Valiente. Qualitatively predicting acetylation and methylation areas in DNA sequences. Genome Informatics, 16(2):3–11, 2005.
1093 36 D. K. Pokholok, C. T. Harbison, S. Levine, F. Lewitter, D. K. Gifford, and R. A. Young. Genome-wide map of nucleosome acetylation and methylation in yeast. Cell, 122(4):517–527, 2005.
1094 37 M. Poli, S. Massaroli, E. Nguyen, D. Y. Fu, T. Dao, S. Baccus, Y. Bengio, S. Ermon, and C. Ré. Hyena Hierarchy: Towards larger convolutional language models. arXiv preprint arXiv:2302.10866, 2023.
1095 38 O. Press, N. A. Smith, and M. Lewis. Shortformer: Better language modeling using shorter inputs. arXiv preprint arXiv:2012.15832, 2020.
1096 39 A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, et al. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021.
1097 40 A. Ramesh, M. Pavlov, G. Goh, S. Gray, C. Voss, A. Radford, M. Chen, and I. Sutskever. Zero-shot text-to-image generation. In International Conference on Machine Learning, pages 8821–8831. PMLR, 2021.
1098 41 R. Rao, J. Meier, T. Sercu, S. Ovchinnikov, and A. Rives. Transformer protein language models are unsupervised structure learners. Biorxiv, pages 2020–12, 2020.
1099 42 Roadmap Epigenomics Consortium. Integrative analysis of 111 reference human epigenomes. Nature, 518(7539):317–330, 2015.
1100 43 J. Rogers and R. A. Gibbs. Comparative primate genomics: emerging patterns of genome content and dynamics. Nature Reviews Genetics, 15(5):347–359, 2014.
1101 44 D. W. Romero, R.-J. Bruintjes, J. M. Tomczak, E. J. Bekkers, M. Hoogendoorn, and J. C. van Gemert. Flexconv: Continuous kernel convolutions with differentiable kernel sizes. arXiv preprint arXiv:2110.08059, 2021a.
1102 45 D. W. Romero, A. Kuzina, E. J. Bekkers, J. M. Tomczak, and M. Hoogendoorn. Ckconv: Continuous kernel convolution for sequential data. arXiv preprint arXiv:2102.02611, 2021b.
1103 46 N. Scalzitti, A. Kress, R. Orhand, T. Weber, L. Moulinier, A. Jeannin-Girardon, P. Collet, O. Poch, and J. D. Thompson. Spliceator: multi-species splice site prediction using convolutional neural networks. BMC Bioinformatics, 22(1):1–26, 2021.
1104 47 J. T. Smith, A. Warrington, and S. W. Linderman. Simplified state space layers for sequence modeling. arXiv preprint arXiv:2208.04933, 2022.
1105 48 Y. Tay, M. Dehghani, S. Abnar, Y. Shen, D. Bahri, P. Pham, J. Rao, L. Yang, S. Ruder, and D. Metzler. Long range arena: A benchmark for efficient transformers. arXiv preprint arXiv:2011.04006, 2020.
1106 49 Y. Tay, V. Q. Tran, S. Ruder, J. Gupta, H. W. Chung, D. Bahri, Z. Qin, S. Baumgartner, C. Yu, and D. Metzler. Charformer: Fast character transformers via gradient-based subword tokenization. arXiv preprint arXiv:2106.12672, 2021.
1107 50 L. Van der Maaten and G. Hinton. Visualizing data using t-SNE. Journal of machine learning research, 9(11), 2008.
1108 51 A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.
1109 52 J. Wei, M. Bosma, V. Y. Zhao, K. Guu, A. W. Yu, B. Lester, N. Du, A. M. Dai, and Q. V. Le. Finetuned language models are zero-shot learners. arXiv preprint arXiv:2109.01652, 2021.
1110 53 F. Yang, W. Wang, F. Wang, Y. Fang, D. Tang, J. Huang, H. Lu, and J. Yao. scBERT as a large-scale pretrained deep language model for cell type annotation of single-cell RNA-seq data. Nature Machine Intelligence, 4(10):852–866, 2022.
1111 54 J. Yu, Z. Wang, V. Vasudevan, L. Yeung, M. Seyedhosseini, and Y. Wu. Coca: Contrastive captioners are image-text foundation models. arXiv preprint arXiv:2205.01917, 2022.
1112 55 M. Zaheer, G. Guruganesh, K. A. Dubey, J. Ainslie, C. Alberti, S. Ontanon, P. Pham, A. Ravula, Q. Wang, L. Yang, et al. Big bird: Transformers for longer sequences. Advances in neural information processing systems, 33:17283–17297, 2020.
1113 56 S. Zaina, E. L. Pérez-Luque, and G. Lund. Genetics talks to epigenetics? the interplay between sequence variants and chromatin structure. Current genomics, 11(5):359–367, 2010.
1114 57 J. Zhou and O. G. Troyanskaya. Predicting effects of noncoding variants with deep learning–based sequence model. Nature methods, 12(10):931–934, 2015.
1115 58 M. Zvyagin, A. Brace, K. Hippe, Y. Deng, B. Zhang, C. O. Bohorquez, A. Clyde, B. Kale, D. Perez-Rivera, H. Ma, et al. GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics. bioRxiv, pages 2022–10, 2022.
1116 25 0 [25] OpenAI: A. Hurst, A. Lerer, A. P. Goucher, A. Perelman, A. Ramesh, A. Clark, et al. GPT-4o system card. 10 2024. URL https://arxiv.org/pdf/2410.21276. https://qiita.com/kaizen_nagoya/items/06e4c54af663456b49f9
1117 1 [1] OpenAI, “Hello gpt-4o,” 2024.
1118 2 [2] T. Stivers, N. J. Enfield, P. Brown, C. Englert, M. Hayashi, T. Heinemann, G. Hoymann, F. Rossano, J. P. de Ruiter, K. E. Yoon, and S. C. Levinson, “Universals and cultural variation in turn-taking in conversation,” Proceedings of the National Academy of Sciences, vol. 106, no. 26, pp. 10587–10592, 2009.
1119 3 [3] The White House, “Fact sheet: Biden-harris administration secures voluntary commitments from leading artificial intelligence companies to manage the risks posed by ai,” 2023.
1120 4 [4] OpenAI, “Openai preparedness framework beta,” 2023. https://cdn.openai.com/openai-preparedness-framework-beta.pdf.
1121 5 [5] Shutterstock, “Shutterstock press release,” 2023.
1122 6 [6] OpenAI, “Gpt-4 technical report,” 2023.
1123 7 [7] OpenAI, “Gpt-4v(ision) system card.” https://openai.com/index/gpt-4v-system-card/, 2023. Accessed: 2024-07-22.
1124 8 [8] OpenAI, “Navigating the challenges and opportunities of synthetic voices.” https://openai.com/index/navigating-the-challenges-and-opportunities-of-synthetic-voices/, 2024. Accessed: 2024-07-22.
1125 9 [9] K. T. Mai, S. Bray, T. Davies, and L. D. Griffin, “Warning: Humans cannot reliably detect speech deepfakes,” PLoS One, vol. 18, p. e0285333, Aug. 2023.
1126 10 [10] M. Mori, K. F. MacDorman, and N. Kageki, “The uncanny valley [from the field],” IEEE Robotics & automation magazine, vol. 19, no. 2, pp. 98–100, 2012.
1127 11 [11] OpenAI, “How the voices for chatgpt were chosen,” 2024.
1128 12 [12] I. Solaiman, Z. Talat, W. Agnew, L. Ahmad, D. Baker, S. L. Blodgett, C. Chen, H. Daumé III, J. Dodge, I. Duan, E. Evans, F. Friedrich, A. Ghosh, U. Gohar, S. Hooker, Y. Jernite, R. Kalluri, A. Lusoli, A. Leidinger, M. Lin, X. Lin, S. Luccioni, J. Mickel, M. Mitchell, J. Newman, A. Ovalle, M.-T. Png, S. Singh, A. Strait, L. Struppek, and A. Subramonian, “Evaluating the social impact of generative ai systems in systems and society,” 2024.
1129 13 [13] R. Shelby, S. Rismani, K. Henne, A. Moon, N. Rostamzadeh, P. Nicholas, N. Yilla, J. Gallegos, A. Smart, E. Garcia, and G. Virk, “Sociotechnical harms of algorithmic systems: Scoping a taxonomy for harm reduction,” 2023.
1130 14 [14] S. L. Blodgett, Q. V. Liao, A. Olteanu, R. Mihalcea, M. Muller, M. K. Scheuerman, C. Tan, and Q. Yang, “Responsible language technologies: Foreseeing and mitigating harms,” in Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems, CHI EA ’22, (New York, NY, USA), Association for Computing Machinery, 2022.
1131 15 [15] H. Suresh and J. Guttag, “A framework for understanding sources of harm throughout the machine learning life cycle,” in Equity and Access in Algorithms, Mechanisms, and Optimization, EAAMO ’21, ACM, Oct. 2021.
1132 16 [16] S. Shahriar, S. Allana, S. M. Hazratifard, and R. Dara, “A survey of privacy risks and mitigation strategies in the artificial intelligence life cycle,” IEEE Access, vol. 11, pp. 61829–61854, 2023.
1133 17 [17] OpenAI, “Moderation overview,” 2024.
1134 18 [18] A. Tamkin, M. Brundage, J. Clark, and D. Ganguli, “Understanding the capabilities, limitations, and societal impact of large language models,” 2021.
1135 19 [19] B. Buchanan, A. Lohn, M. Musser, and K. Sedova, “Truth, lies, and automation: How language models could change disinformation,” May 2021.
1136 20 [20] OpenAI, “Openai usage policies,” 2023. https://openai.com/policies/usagepolicies/.
1137 21 [21] OpenAI, “Building an early warning system for llm-aided biological threat creation,” 2024. https://openai.com/index/building-an-early-warning-system-for-llm-aided-biological-threat-creation/.
1138 22 [22] Deloitte, “Deloitte acquires gryphon scientific business to expand security science and public health capabilities,” 2024. https://www2.deloitte.com/us/en/pages/about-deloitte/articles/press-releases/deloitte-acquires-gryphon-scientific-business-to-expand-security-science-and-public-health-capabilities.html.
1139 23 [23] L. Weidinger, M. Rauh, N. Marchal, A. Manzini, L. A. Hendricks, J. Mateos-Garcia, S. Bergman, J. Kay, C. Griffin, B. Bariach, I. Gabriel, V. Rieser, and W. Isaac, “Sociotechnical safety evaluation of generative ai systems,” 2023.
1140 24 [24] A. Tamkin, A. Askell, L. Lovitt, E. Durmus, N. Joseph, S. Kravec, K. Nguyen, J. Kaplan, and D. Ganguli, “Evaluating and mitigating discrimination in language model decisions,” 2023.
1141 25 [25] J. A. Goldstein, G. Sastry, M. Musser, R. DiResta, M. Gentzel, and K. Sedova, “Generative language models and automated influence operations: Emerging threats and potential mitigations,” 2023.
1142 26 [26] I. Pentina, T. Hancock, and T. Xie, “Exploring relationship development with social chatbots: A mixed-method study of replika,” Computers in Human Behavior, vol. 140, p. 107600, 2023.
1143 27 [27] Y. Bengio, G. Hinton, A. Yao, D. Song, P. Abbeel, T. Darrell, Y. N. Harari, Y.-Q. Zhang, L. Xue, S. Shalev-Shwartz, G. Hadfield, J. Clune, T. Maharaj, F. Hutter, A. G. Baydin, S. McIlraith, Q. Gao, A. Acharya, D. Krueger, A. Dragan, P. Torr, S. Russell, D. Kahneman, J. Brauner, and S. Mindermann, “Managing extreme ai risks amid rapid progress,” Science, vol. 384, no. 6698, pp. 842–845, 2024.
1144 28 [28] S. B. Johnson, J. R. Clark, M. C. Luetke, N. M. Butala, A. T. Pearson, J. M. Shapiro, D. M. Aleman, J. M. Lee, M. M. Beil, C. V. Winkle, M. C. Boudreaux, R. C. D’Cunha, H. J. Krouse, and C. Li, “Chatgpt in medical education: a workshop-based large language model-powered intervention for evidence-based clinical decision making in medical students,” Nature Medicine, vol. 29, pp. 1534–1542, 2023.
1145 29 [29] K. Kavukcuoglu, “Real-world challenges for agi,” Nov 2021.
1146 30 [30] S. Altman, “Planning for agi and beyond,” OpenAI, 2023.
1147 31 [31] T. Eloundou, S. Manning, P. Mishkin, and D. Rock, “Gpts are gpts: An early look at the labor market impact potential of large language models,” arXiv preprint arXiv:2303.10130, 2023.
1148 32 [32] L. Weidinger, M. Rauh, N. Marchal, A. Manzini, L. A. Hendricks, J. Mateos-Garcia, S. Bergman, J. Kay, C. Griffin, B. Bariach, et al., “Sociotechnical safety evaluation of generative ai systems,” arXiv preprint arXiv:2310.11986, 2023.
1149 33 [33] S. Cox, M. Hammerling, J. Lála, J. Laurent, S. Rodriques, M. Rubashkin, and A. White, “Wikicrow: Automating synthesis of human scientific knowledge,” Future House, 2023.
1150 34 [34] S. A. Athaluri, S. V. Manthena, V. S. R. K. M. Kesapragada, V. Yarlagadda, T. Dave, and R. T. S. Duddumpudi, “Exploring the boundaries of reality: Investigating the phenomenon of artificial intelligence hallucination in scientific writing through chatgpt references,” Cureus, vol. 15, no. 4, p. e37432, 2023.
1151 35 [35] Z. Li, “The dark side of chatgpt: Legal and ethical challenges from stochastic parrots and hallucination,” 2023.
1152 36 [36] M. Dubiel, A. Sergeeva, and L. A. Leiva, “Impact of voice fidelity on decision making: A potential dark pattern?,” 2024.
1153 37 [37] B. Waber, M. Williams, J. S. Carroll, and A. S. Pentland, “A voice is worth a thousand words: The implications of the micro-coding of social signals in speech for trust research,” in Handbook of Research Methods on Trust (F. Lyon, G. Möllering, and M. N. Saunders, eds.), ch. 23, p. 320, New York: Edward Elgar Publishing, 2011.
1154 38 [38] I. Pentina, B. Guo, and W. P. Fan, “Friend, mentor, lover: Does chatbot engagement lead to psychological dependence?,” Journal of Service Management, 2023.
1155 39 [39] H. Nori, N. King, S. M. McKinney, D. Carignan, and E. Horvitz, “Capabilities of gpt-4 on medical challenge problems,” arXiv preprint arXiv:2303.13375, 2023.
1156 40 [40] H. Nori, Y. T. Lee, S. Zhang, D. Carignan, R. Edgar, N. Fusi, N. King, J. Larson, Y. Li, W. Liu, et al., “Can generalist foundation models outcompete special-purpose tuning? case study in medicine,” arXiv preprint arXiv:2311.16452, 2023.
1157 41 [41] K. Singhal, S. Azizi, T. Tu, S. S. Mahdavi, J. Wei, H. W. Chung, N. Scales, A. Tanwani, H. Cole-Lewis, S. Pfohl, P. Payne, M. Seneviratne, P. Gamble, C. Kelly, N. Scharli, A. Chowdhery, P. Mansfield, B. A. y Arcas, D. Webster, G. S. Corrado, Y. Matias, K. Chou, J. Gottweis, N. Tomasev, Y. Liu, A. Rajkomar, J. Barral, C. Semturs, A. Karthikesalingam, and V. Natarajan, “Large language models encode clinical knowledge,” 2022.
1158 42 [42] K. Singhal, T. Tu, J. Gottweis, R. Sayres, E. Wulczyn, L. Hou, K. Clark, S. Pfohl, H. Cole-Lewis, D. Neal, M. Schaekermann, A. Wang, M. Amin, S. Lachgar, P. Mansfield, S. Prakash, B. Green, E. Dominowska, B. A. y Arcas, N. Tomasev, Y. Liu, R. Wong, C. Semturs, S. S. Mahdavi, J. Barral, D. Webster, G. S. Corrado, Y. Matias, S. Azizi, A. Karthikesalingam, and V. Natarajan, “Towards expert-level medical question answering with large language models,” 2023.
1159 43 [43] K. Saab, T. Tu, W.-H. Weng, R. Tanno, D. Stutz, E. Wulczyn, F. Zhang, T. Strother, C. Park, E. Vedadi, J. Z. Chaves, S.-Y. Hu, M. Schaekermann, A. Kamath, Y. Cheng, D. G. T. Barrett, C. Cheung, B. Mustafa, A. Palepu, D. McDuff, L. Hou, T. Golany, L. Liu, J.-B. Alayrac, N. Houlsby, N. Tomasev, J. Freyberg, C. Lau, J. Kemp, J. Lai, S. Azizi, K. Kanada, S. Man, K. Kulkarni, R. Sun, S. Shakeri, L. He, B. Caine, A. Webson, N. Latysheva, M. Johnson, P. Mansfield, J. Lu, E. Rivlin, J. Anderson, B. Green, R. Wong, J. Krause, J. Shlens, E. Dominowska, S. M. A. Eslami, K. Chou, C. Cui, O. Vinyals, K. Kavukcuoglu, J. Manyika, J. Dean, D. Hassabis, Y. Matias, D. Webster, J. Barral, G. Corrado, C. Semturs, S. S. Mahdavi, J. Gottweis, A. Karthikesalingam, and V. Natarajan, “Capabilities of gemini models in medicine,” 2024.
1160 44 [44] Epic Systems Corporation, “Epic and microsoft bring gpt-4 to ehrs,” Epic, 2023.
1161 45 [45] D. Van Veen, C. Van Uden, L. Blankemeier, J.-B. Delbrouck, A. Aali, C. Bluethgen, A. Pareek, M. Polacin, E. P. Reis, A. Seehofnerová, et al., “Adapted large language models can outperform medical experts in clinical text summarization,” Nature medicine, vol. 30, no. 4, pp. 1134–1142, 2024.
1162 46 [46] Epic, “Epic and microsoft bring gpt-4 to ehrs,” 2023.
1163 47 [47] P. Garcia, S. P. Ma, S. Shah, M. Smith, Y. Jeong, A. Devon-Sand, M. Tai-Seale, K. Takazawa, D. Clutter, K. Vogt, C. Lugtu, M. Rojo, S. Lin, T. Shanafelt, M. A. Pfeffer, and C. Sharp, “Artificial Intelligence–Generated Draft Replies to Patient Inbox Messages,” JAMA Network Open, vol. 7, pp. e243201–e243201, 03 2024.
1164 48 [48] OpenAI, “Paradigm: Improving patient access to clinical trials.” https://openai.com/index/paradigm/, 2024. Accessed: 2024-08-07.
1165 49 [49] M. Hutson, “How ai is being used to accelerate clinical trials,” Nature, vol. 627, pp. S2–S5, 2024.
1166 50 [50] OpenAI, “Using gpt-4o reasoning to transform cancer care.” https://openai.com/index/color-health/, 2024. Accessed: 2024-08-07.
1167 51 [51] J. Varghese and J.-L. Chapiro, “Systematic analysis of chatgpt, google search and llama 2 for clinical decision support tasks,” Nature Communications, vol. 15, no. 1, p. 46411, 2024. Accessed: 2024-08-07.
1168 52 [52] E. Schmidt, “Ai will transform science.” https://www.technologyreview.com/2023/07/05/1075865/eric-schmidt-ai-will-transform-science/, 2023. Accessed: 2024-08-07.
1169 53 [53] N. Rosenberg, “Science, invention and economic growth,” The Economic Journal, vol. 84, no. 333, pp. 90–108, 1974.
1170 54 [54] R. M. Atlas and M. Dando, “The dual-use dilemma for the life sciences: Perspectives, conundrums, and global solutions,” Biosecurity and Bioterrorism: Biodefense Strategy, Practice, and Science, vol. 4, no. 3, pp. 276–286, 2006. PMID: 16999588.
1171 55 [55] X. Gu and M. Krenn, “Generation and human-expert evaluation of interesting research ideas using knowledge graphs and large language models,” 2024.
1172 56 [56] A. Ghafarollahi and M. J. Buehler, “Atomagents: Alloy design and discovery through physics-aware multi-modal multi-agent artificial intelligence,” 2024.
1173 57 [57] J. M. Laurent, J. D. Janizek, M. Ruzo, M. M. Hinks, M. J. Hammerling, S. Narayanan, M. Ponnapati, A. D. White, and S. G. Rodriques, “Lab-bench: Measuring capabilities of language models for biology research,” 2024.
1174 58 [58] H. Cai, X. Cai, J. Chang, S. Li, L. Yao, C. Wang, Z. Gao, H. Wang, Y. Li, M. Lin, S. Yang, J. Wang, M. Xu, J. Huang, F. Xi, J. Zhuang, Y. Yin, Y. Li, C. Chen, Z. Cheng, Z. Zhao, L. Zhang, and G. Ke, “Sciassess: Benchmarking llm proficiency in scientific literature analysis,” 2024.
1175 59 [59] P. Clark, I. Cowhey, O. Etzioni, T. Khot, A. Sabharwal, C. Schoenick, and O. Tafjord, “Think you have solved question answering? try arc, the AI2 reasoning challenge,” CoRR, vol. abs/1803.05457, 2018.
1176 60 [60] S. Lin, J. Hilton, and O. Evans, “Truthfulqa: Measuring how models mimic human falsehoods,” CoRR, vol. abs/2109.07958, 2021.
1177 26 0 [26] M. Poli, J. Wang, S. Massaroli, J. Quesnelle, R. Carlow, E. Nguyen, and A. Thomas. Striped-Hyena: Moving Beyond Transformers with Hybrid Signal Processing Models, 12 2023. URL https://github.com/togethercomputer/stripedhyena. https://qiita.com/kaizen_nagoya/items/f87fac7f9a83f54328fe
1178 1 [27] Qwen: A. Yang, B. Yang, B. Zhang, B. Hui, B. Zheng, B. Yu, C. Li, D. Liu, F. Huang, H. Wei, H. Lin, J. Yang, J. Tu, J. Zhang, J. Yang, J. Yang, J. Zhou, J. Lin, K. Dang, K. Lu, K. Bao, K. Yang, L. Yu, M. Li, M. Xue, P. Zhang, Q. Zhu, R. Men, R. Lin, T. Li, T. Tang, T. Xia, X. Ren, X. Ren, Y. Fan, Y. Su, Y. Zhang, Y. Wan, Y. Liu, Z. Cui, Z. Zhang, and Z. Qiu. Qwen2.5 technical report. 12 2024. URL https://arxiv.org/pdf/2412.15115.
1179 2 Marah I Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat S. Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, Ziyi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, and Xiren Zhou. Phi-3 technical report: A highly capable language model locally on your phone. CoRR, abs/2404.14219, 2024.
1180 3 Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, Robert Hero, Jining Huang, Vibhu Jawa, Joseph Jennings, Aastha Jhunjhunwala, John Kamalu, Sadaf Khan, Oleksii Kuchaiev, Patrick LeGresley, Hui Li, Jiwei Liu, Zihan Liu, Eileen Peters Long, Ameya Mahabaleshwarkar, Somshubra Majumdar, James Maki, Miguel Martinez, Maer Rodrigues de Melo, Ivan Moshkov, Deepak Narayanan, Sean Narenthiran, Jesus Navarro, Phong Nguyen, Osvald Nitski, Vahid Noroozi, Guruprasad Nutheti, Christopher Parisien, Jupinder Parmar, Mostofa Patwary, Krzysztof Pawelec, Wei Ping, Shrimai Prabhumoye, Rajarshi Roy, Trisha Saar, Vasanth Rao Naik Sabavat, Sanjeev Satheesh, Jane Polak Scowcroft, Jason D. Sewall, Pavel Shamis, Gerald Shen, Mohammad Shoeybi, Dave Sizer, Misha Smelyanskiy, Felipe Soares, Makesh Narsimhan Sreedhar, Dan Su, Sandeep Subramanian, Shengyang Sun, Shubham Toshniwal, Hao Wang, Zhilin Wang, Jiaxuan You, Jiaqi Zeng, Jimmy Zhang, Jing Zhang, Vivienne Zhang, Yian Zhang, and Chen Zhu. Nemotron-4 340B technical report. CoRR, abs/2406.11704, 2024.
1181 4 Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón, and Sumit Sanghai. GQA: Training generalized multi-query Transformer models from multi-head checkpoints. In EMNLP, pp. 4895–4901. Association for Computational Linguistics, 2023.
1182 5 Ebtesam Almazrouei, Hamza Alobeidli, Abdulaziz Alshamsi, Alessandro Cappelli, Ruxandra Cojocaru, Mérouane Debbah, Etienne Goffinet, Daniel Hesslow, Julien Launay, Quentin Malartic, Daniele Mazzotta, Badreddine Noune, Baptiste Pannier, and Guilherme Penedo. The Falcon series of open language models. CoRR, abs/2311.16867, 2023.
1183 6 Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, and Lingpeng Kong. Training-free long-context scaling of large language models. CoRR, abs/2402.17463, 2024.
1184 7 Anthropic. Introducing Claude, 2023a. URL https://www.anthropic.com/index/introducing-claude. Anthropic. Claude 2. Technical report, Anthropic, 2023b. URL https://www-files.anthropic.com/production/images/Model-Card-Claude-2.pdf.
1185 8 Anthropic. The Claude 3 model family: Opus, Sonnet, Haiku. Technical report, Anthropic AI, 2024. URL https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf.
1186 9 Jacob Austin, Augustus Odena, Maxwell I. Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie J. Cai, Michael Terry, Quoc V. Le, and Charles Sutton. Program synthesis with large language models. CoRR, abs/2108.07732, 2021.
1187 10 Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, and Tianhang Zhu. Qwen technical report. CoRR, abs/2309.16609, 2023.
1188 11 Yushi Bai, Xin Lv, Jiajie Zhang, Yuze He, Ji Qi, Lei Hou, Jie Tang, Yuxiao Dong, and Juanzi Li. LongAlign: A recipe for long context alignment of large language models. In EMNLP (Findings), pp. 1376–1395. Association for Computational Linguistics, 2024.
1189 12 Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, and Madian Khabsa. The Belebele benchmark: A parallel reading comprehension dataset in 122 language variants. CoRR, abs/2308.16884, 2023.
1190 13 Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language models are few-shot learners. In NeurIPS, 2020.
1191 14 Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, and Bowen Yu. Towards scalable automated alignment of LLMs: A survey. CoRR, abs/2406.01252, 2024.
1192 15 Federico Cassano, John Gouwar, Daniel Nguyen, Sydney Nguyen, Luna Phipps-Costin, Donald Pinckney, Ming-Ho Yee, Yangtian Zi, Carolyn Jane Anderson, Molly Q. Feldman, Arjun Guha, Michael Greenberg, and Abhinav Jangda. MultiPL-E: A scalable and polyglot approach to benchmarking neural code generation. IEEE Trans. Software Eng., 49(7):3675–3691, 2023.
1193 16 Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pondé de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Joshua Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, and Wojciech Zaremba. Evaluating large language models trained on code. CoRR, abs/2107.03374, 2021.
1194 17 Wenhu Chen, Ming Yin, Max Ku, Pan Lu, Yixin Wan, Xueguang Ma, Jianyu Xu, Xinyi Wang, and Tony Xia. TheoremQA: A theorem-driven question answering dataset. In EMNLP, pp. 7889–7901. Association for Computational Linguistics, 2023a.
1195 18 Zhihong Chen, Shuo Yan, Juhao Liang, Feng Jiang, Xiangbo Wu, Fei Yu, Guiming Hardy Chen, Junying Chen, Hongbo Zhang, Li Jianquan, Wan Xiang, and Benyou Wang. MultilingualSIFT: Multilingual supervised instruction fine-tuning, 2023b. URL https://github.com/FreedomIntelligence/MultilingualSIFT.
1196 19 Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord. Think you have solved question answering? Try ARC, the AI2 reasoning challenge. CoRR, abs/1803.05457, 2018.
1197 20 Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, and John Schulman. Training verifiers to solve math word problems. CoRR, abs/2110.14168, 2021.
1198 21 Damai Dai, Chengqi Deng, Chenggang Zhao, R. X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Y. Wu, Zhenda Xie, Y. K. Li, Panpan Huang, Fuli Luo, Chong Ruan, Zhifang Sui, and Wenfeng Liang. DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models. CoRR, abs/2401.06066, 2024.
1199 22 Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. Language modeling with gated convolutional networks. In ICML, volume 70 of Proceedings of Machine Learning Research, pp. 933–941. PMLR, 2017.
1200 23 Guanting Dong, Keming Lu, Chengpeng Li, Tingyu Xia, Bowen Yu, Chang Zhou, and Jingren Zhou. Self-play with execution feedback: Improving instruction-following capabilities of large language models. CoRR, abs/2406.13542, 2024.
1201 24 Shihan Dou, Jiazheng Zhang, Jianxiang Zang, Yunbo Tao, Haoxiang Jia, Shichun Liu, Yuming Yang, Shenxi Wu, Shaoqing Zhang, Muling Wu, et al. Multi-programming language sandbox for llms. CoRR, abs/2410.23074, 2024.
1202 25 Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, et al. The Llama 3 herd of models. CoRR, abs/2407.21783, 2024.
1205 28 William Fedus, Barret Zoph, and Noam Shazeer. Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res., 23:120:1–120:39, 2022.
1206 29 Alena Fenogenova, Artem Chervyakov, Nikita Martynov, Anastasia Kozlova, Maria Tikhonova, Albina Akhmetgareeva, Anton A. Emelyanov, Denis Shevelev, Pavel Lebedev, Leonid Sinev, Ulyana Isaeva, Katerina Kolomeytseva, Daniil Moskovskiy, Elizaveta Goncharova, Nikita Savushkin, Polina Mikhailova, Denis Dimitrov, Alexander Panchenko, and Sergey Markov. MERA: A comprehensive LLM evaluation in Russian. CoRR, abs/2401.04531, 2024.
1207 30 Evan Frick, Peter Jin, Tianle Li, Karthik Ganesan, Jian Zhang, Jiantao Jiao, and Banghua Zhu. Athene-70b: Redefining the boundaries of post-training for open models, July 2024a. URL https://nexusflow.ai/blogs/athene.
1208 31 Evan Frick, Tianle Li, Connor Chen, Wei-Lin Chiang, Anastasios Nikolas Angelopoulos, Jiantao Jiao, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. How to evaluate reward models for RLHF. CoRR, abs/2410.14872, 2024b.
1209 32 Aryo Pradipta Gema, Joshua Ong Jun Leang, Giwon Hong, Alessio Devoto, Alberto Carlo Maria Mancino, Rohit Saxena, Xuanli He, Yu Zhao, Xiaotang Du, Mohammad Reza Ghasemi Madani, et al. Are we done with MMLU? CoRR, abs/2406.04127, 2024.
1210 33 Gemini Team. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. Technical report, Google, 2024. URL https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf.
1211 34 Gemma Team, Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, Léonard Hussenot, Thomas Mesnard, Bobak Shahriari, Alexandre Ramé, et al. Gemma 2: Improving open language models at a practical size. CoRR, abs/2408.00118, 2024.
1212 35 Naman Goyal, Cynthia Gao, Vishrav Chaudhary, Peng-Jen Chen, Guillaume Wenzek, Da Ju, Sanjana Krishnan, Marc’Aurelio Ranzato, Francisco Guzmán, and Angela Fan. The Flores-101 evaluation benchmark for low-resource and multilingual machine translation. Trans. Assoc. Comput. Linguistics, 10:522–538, 2022.
1213 36 Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt. Measuring massive multitask language understanding. In ICLR. OpenReview.net, 2021a.
1214 37 Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, and Jacob Steinhardt. Measuring mathematical problem solving with the MATH dataset. In NeurIPS Datasets and Benchmarks, 2021b.
1215 38 Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, et al. Training compute-optimal large language models. CoRR, abs/2203.15556, 2022.
1216 39 Keith Hoskin. The “awful idea of accountability”: Inscribing people into the measurement of objects. Accountability: Power, ethos and the technologies of managing, 1996.
1217 40 Cheng-Ping Hsieh, Simeng Sun, Samuel Kriman, Shantanu Acharya, Dima Rekesh, Fei Jia, Yang Zhang, and Boris Ginsburg. RULER: What’s the real context size of your long-context language models? CoRR, abs/2404.06654, 2024.
1218 41 Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao, Xinrong Zhang, Zhen Leng Thai, Kai Zhang, Chongyi Wang, Yuan Yao, Chenyang Zhao, Jie Zhou, Jie Cai, Zhongwu Zhai, Ning Ding, Chao Jia, Guoyang Zeng, Dahai Li, Zhiyuan Liu, and Maosong Sun. MiniCPM: Unveiling the potential of small language models with scalable training strategies. CoRR, abs/2404.06395, 2024.
1219 42 Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Keming Lu, et al. Qwen2.5-Coder technical report. CoRR, abs/2409.12186, 2024.
1220 43 Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, and Ion Stoica. LiveCodeBench: Holistic and contamination free evaluation of large language models for code. CoRR, abs/2403.07974, 2024.
1221 44 Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mistral 7B. CoRR, abs/2310.06825, 2023a.
1222 45 Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mixtral of experts. CoRR, abs/2401.04088, 2024a.
1223 46 Huiqiang Jiang, Yucheng Li, Chengruidong Zhang, Qianhui Wu, Xufang Luo, Surin Ahn, Zhenhua Han, Amir H Abdi, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, and Lili Qiu. MInference 1.0: Accelerating pre-filling for long-context LLMs via dynamic sparse attention. arXiv preprint arXiv:2407.02490, 2024b.
1224 47 Zixuan Jiang, Jiaqi Gu, Hanqing Zhu, and David Z. Pan. Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and efficient pre-LN Transformers. CoRR, abs/2305.14858, 2023b.
1225 48 Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models. CoRR, abs/2001.08361, 2020.
1226 49 Fajri Koto, Nurul Aisyah, Haonan Li, and Timothy Baldwin. Large language models only pass primary school exams in Indonesia: A comprehensive test on IndoMMLU. In EMNLP, pp. 12359–12374. Association for Computational Linguistics, 2023.
1227 50 Nathan Lambert, Valentina Pyatkin, Jacob Daniel Morrison, Lester James Validad Miranda, Bill Yuchen Lin, Khyathi Raghavi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, and Hanna Hajishirzi. RewardBench: Evaluating reward models for language modeling. CoRR, abs/2403.13787, 2024.
1228 51 Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. GShard: Scaling giant models with conditional computation and automatic sharding. CoRR, abs/2006.16668, 2020.
1229 52 Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. From crowdsourced data to high-quality benchmarks: Arena-Hard and BenchBuilder pipeline. CoRR, abs/2406.11939, 2024.
1230 53 Stephanie Lin, Jacob Hilton, and Owain Evans. TruthfulQA: Measuring how models mimic human falsehoods. In ACL (1), pp. 3214–3252. Association for Computational Linguistics, 2022a.
1231 54 Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O’Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona T. Diab, Veselin Stoyanov, and Xian Li. Few-shot learning with multilingual generative language models. In EMNLP, pp. 9019–9052. Association for Computational Linguistics, 2022b.
1232 55 Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. Is your code generated by ChatGPT really correct? Rigorous evaluation of large language models for code generation. In NeurIPS, 2023.
1233 56 Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, and Chang Zhou. Online merging optimizers for boosting rewards and mitigating tax in alignment. CoRR, abs/2405.17931, 2024a.
1234 57 Keming Lu, Bowen Yu, Chang Zhou, and Jingren Zhou. Large language models are superpositions of all characters: Attaining arbitrary role-play via self-alignment. CoRR, abs/2401.12474, 2024b.
1236 59 Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M. Saiful Bari, Sheng Shen, Zheng Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, and Colin Raffel. Crosslingual generalization through multitask finetuning. In ACL (1), pp. 15991–16111. Association for Computational Linguistics, 2023.
1237 60 Junho Myung, Nayeon Lee, Yi Zhou, Jiho Jin, Rifki Afina Putri, Dimosthenis Antypas, Hsuvas Borkakoty, Eunsu Kim, Carla Pérez-Almendros, Abinew Ali Ayele, Víctor Gutiérrez-Basulto, Yazmín Ibáñez-García, Hwaran Lee, Shamsuddeen Hassan Muhammad, Ki-Woong Park, Anar Sabuhi Rzayev, Nina White, Seid Muhie Yimam, Mohammad Taher Pilehvar, Nedjma Ousidhoum, José Camacho-Collados, and Alice Oh. BLEnD: A benchmark for LLMs on everyday knowledge in diverse cultures and languages. CoRR, abs/2406.09948, 2024.
1238 61 OpenAI. GPT-4 technical report. CoRR, abs/2303.08774, 2023.
1239 62 OpenAI. Hello GPT-4o, 2024a. URL https://openai.com/index/hello-gpt-4o/.
1240 63 OpenAI. Learning to reason with LLMs, 2024b. URL https://openai.com/index/learning-to-reason-with-llms/.
1241 64 Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul F. Christiano, Jan Leike, and Ryan Lowe. Training language models to follow instructions with human feedback. In NeurIPS, 2022.
1242 65 Bowen Peng, Jeffrey Quesnelle, Honglu Fan, and Enrico Shippole. YaRN: Efficient context window extension of large language models. CoRR, abs/2309.00071, 2023.
1243 66 Edoardo Maria Ponti, Goran Glavas, Olga Majewska, Qianchu Liu, Ivan Vulic, and Anna Korhonen. XCOPA: A multilingual dataset for causal commonsense reasoning. In EMNLP (1), pp. 2362–2376. Association for Computational Linguistics, 2020.
1245 68 Shanghaoran Quan, Tianyi Tang, Bowen Yu, An Yang, Dayiheng Liu, Bofei Gao, Jianhong Tu, Yichang Zhang, Jingren Zhou, and Junyang Lin. Language models can self-lengthen to generate long texts. CoRR, abs/2410.23933, 2024.
1246 69 Qwen Team. Code with CodeQwen1.5, 2024a. URL https://qwenlm.github.io/blog/codeqwen1.5/.
1247 70 Qwen Team. Introducing Qwen1.5, 2024b. URL https://qwenlm.github.io/blog/qwen1.5/.
Qwen Team. Introducing Qwen2-Math, 2024c. URL https://qwenlm.github.io/blog/qwen2-math/.
1248 71 Qwen Team. QwQ: Reflect deeply on the boundaries of the unknown, 2024d. URL https://qwenlm.github.io/blog/qwq-32b-preview/.
1249 72 Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever, et al. Improving language understanding by generative pre-training. Technical report, OpenAI, 2018.
1250 73 Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. In NeurIPS, 2023.
1251 74 Samyam Rajbhandari, Conglong Li, Zhewei Yao, Minjia Zhang, Reza Yazdani Aminabadi, Ammar Ahmad Awan, Jeff Rasley, and Yuxiong He. DeepSpeed-MoE: Advancing mixture-of-experts inference and training to power next-generation AI scale. In ICML, volume 162 of Proceedings of Machine Learning Research, pp. 18332–18346. PMLR, 2022.
1252 75 David Rein, Betty Li Hou, Asa Cooper Stickland, Jackson Petty, Richard Yuanzhe Pang, Julien Dirani, Julian Michael, and Samuel R. Bowman. GPQA: A graduate-level Google-proof Q&A benchmark. CoRR, abs/2311.12022, 2023.
1253 76 Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, and Yejin Choi. WinoGrande: An adversarial winograd schema challenge at scale. Commun. ACM, 64(9):99–106, 2021.
1254 77 Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. In ACL (1). The Association for Computer Linguistics, 2016.
1255 78 Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Mingchuan Zhang, Y. K. Li, Y. Wu, and Daya Guo. Deepseekmath: Pushing the limits of mathematical reasoning in open language models. CoRR, abs/2402.03300, 2024.
1256 79 Jianlin Su. The magical effect of the Bias term: RoPE + Bias = better length extrapolation, 2023. URL https://spaces.ac.cn/archives/9577.
1257 80 Jianlin Su, Murtadha H. M. Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. Roformer: Enhanced Transformer with rotary position embedding. Neurocomputing, 568:127063, 2024.
1258 81 Mirac Suzgun, Nathan Scales, Nathanael Schärli, Sebastian Gehrmann, Yi Tay, Hyung Won Chung, Aakanksha Chowdhery, Quoc V. Le, Ed H. Chi, Denny Zhou, and Jason Wei. Challenging BIG-Bench tasks and whether chain-of-thought can solve them. In ACL (Findings), pp. 13003–13051. Association for Computational Linguistics, 2023.
1259 82 Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurélien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. LLaMA: Open and efficient foundation language models. CoRR, abs/2302.13971, 2023a.
1260 83 Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton-Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurélien Rodriguez, Robert Stojnic, Sergey Edunov, and Thomas Scialom. Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288, 2023b.
1261 84 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In NIPS, pp. 5998–6008, 2017.
1262 85 Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, et al. Secrets of RLHF in large language models part II: Reward modeling. CoRR, abs/2401.06080, 2024a.
1263 86 Changhan Wang, Kyunghyun Cho, and Jiatao Gu. Neural machine translation with byte-level subwords. In AAAI, pp. 9154–9160. AAAI Press, 2020.
1264 87 Yubo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, Tianle Li, Max Ku, Kai Wang, Alex Zhuang, Rongqi Fan, Xiang Yue, and Wenhu Chen. MMLU-Pro: A more robust and challenging multi-task language understanding benchmark. CoRR, abs/2406.01574, 2024b.
1265 88 Zhilin Wang, Alexander Bukharin, Olivier Delalleau, Daniel Egert, Gerald Shen, Jiaqi Zeng, Oleksii Kuchaiev, and Yi Dong. HelpSteer2-Preference: Complementing ratings with preferences. CoRR, abs/2410.01257, 2024c.
1266 89 Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Benjamin Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, Chinmay Hegde, Yann LeCun, Tom Goldstein, Willie Neiswanger, and Micah Goldblum. LiveBench: A challenging, contamination-free LLM benchmark. CoRR, abs/2406.19314, 2024.
1267 90 Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Le Sun, Jingren Zhou, and Junyang Lin. Aligning large language models via self-steering optimization. CoRR, abs/2410.17131, 2024.
1268 91 Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, and Hao Ma. Effective long-context scaling of foundation models. CoRR, abs/2309.16039, 2023.
1269 92 An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, Kai Dang, Keming Lu, Keqin Chen, Kexin Yang, Mei Li, Mingfeng Xue, Na Ni, Pei Zhang, Peng Wang, Ru Peng, Rui Men, Ruize Gao, Runji Lin, Shijie Wang, Shuai Bai, Sinan Tan, Tianhang Zhu, Tianhao Li, Tianyu Liu, Wenbin Ge, Xiaodong Deng, Xiaohuan Zhou, Xingzhang Ren, Xinyu Zhang, Xipin Wei, Xuancheng Ren, Xuejing Liu, Yang Fan, Yang Yao, Yichang Zhang, Yu Wan, Yunfei Chu, Yuqiong Liu, Zeyu Cui, Zhenru Zhang, Zhifang Guo, and Zhihao Fan. Qwen2 technical report. CoRR, abs/2407.10671, 2024a.
1270 93 An Yang, Beichen Zhang, Binyuan Hui, Bofei Gao, Bowen Yu, Chengpeng Li, Dayiheng Liu, Jianhong Tu, Jingren Zhou, Junyang Lin, et al. Qwen2.5-Math technical report: Toward mathematical expert model via self-improvement. CoRR, abs/2409.12122, 2024b.
1271 94 Jian Yang, Jiaxi Yang, Ke Jin, Yibo Miao, Lei Zhang, Liqun Yang, Zeyu Cui, Yichang Zhang, Binyuan Hui, and Junyang Lin. Evaluating and aligning codellms on human preference. CoRR, abs/2412.05210, 2024c.
1272 95 Yinfei Yang, Yuan Zhang, Chris Tar, and Jason Baldridge. PAWS-X: A cross-lingual adversarial dataset for paraphrase identification. In EMNLP/IJCNLP (1), pp. 3685–3690. Association for Computational Linguistics, 2019.
1273 96 Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu, Jianqun Chen, Jing Chang, Kaidong Yu, Peng Liu, Qiang Liu, Shawn Yue, Senbin Yang, Shiming Yang, Tao Yu, Wen Xie, Wenhao Huang, Xiaohui Hu, Xiaoyi Ren, Xinyao Niu, Pengcheng Nie, Yuchi Xu, Yudong Liu, Yue Wang, Yuxuan Cai, Zhenyu Gu, Zhiyuan Liu, and Zonghong Dai. Yi: Open foundation models by 01.AI. CoRR, abs/2403.04652, 2024.
1274 97 Tao Yuan, Xuefei Ning, Dong Zhou, Zhijie Yang, Shiyao Li, Minghui Zhuang, Zheyue Tan, Zhuyu Yao, Dahua Lin, Boxun Li, Guohao Dai, Shengen Yan, and Yu Wang. LV-Eval: A balanced long-context benchmark with 5 length levels up to 256K. CoRR, abs/2402.05136, 2024.
1275 98 Zheng Yuan, Hongyi Yuan, Chengpeng Li, Guanting Dong, Chuanqi Tan, and Chang Zhou. Scaling relationship on learning mathematical reasoning with large language models. CoRR, abs/2308.01825, 2023.
1276 99 Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, and Yejin Choi. HellaSwag: Can a machine really finish your sentence? In ACL (1), pp. 4791–4800. Association for Computational Linguistics, 2019.
1277 100 Yidan Zhang, Boyi Deng, Yu Wan, Baosong Yang, Haoran Wei, Fei Huang, Bowen Yu, Junyang Lin, and Jingren Zhou. P-MMEval: A parallel multilingual multitask benchmark for consistent evaluation of LLMs. CoRR, abs/2411.09116, 2024.
1278 101 Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, and Ion Stoica. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. In NeurIPS, 2023.
1279 102 Enyu Zhou, Guodong Zheng, Bing Wang, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, and Xuanjing Huang. RMB: Comprehensively benchmarking reward models in LLM alignment. CoRR, abs/2410.09893, 2024.
1280 103 Jeffrey Zhou, Tianjian Lu, Swaroop Mishra, Siddhartha Brahma, Sujoy Basu, Yi Luan, Denny Zhou, and Le Hou. Instruction-following evaluation for large language models. CoRR, abs/2311.07911, 2023.
1281 104 Barret Zoph, Irwan Bello, Sameer Kumar, Nan Du, Yanping Huang, Jeff Dean, Noam Shazeer, and William Fedus. ST-MoE: Designing stable and transferable sparse expert models. CoRR, abs/2202.08906, 2022.
1282 28 0 [28] Z. Shao, P. Wang, Q. Zhu, R. Xu, J. Song, X. Bi, H. Zhang, M. Zhang, Y. K. Li, Y. Wu, and D. Guo. Deepseekmath: Pushing the limits of mathematical reasoning in open language models, 2024. URL https://arxiv.org/abs/2402.03300. https://qiita.com/kaizen_nagoya/items/2add7a80056850b9ce87
1283 1 R. Anil, S. Borgeaud, Y. Wu, J. Alayrac, J. Yu, R. Soricut, J. Schalkwyk, A. M. Dai, A. Hauth, K. Millican, D. Silver, S. Petrov, M. Johnson, I. Antonoglou, J. Schrittwieser, A. Glaese, J. Chen, E. Pitler, T. P. Lillicrap, A. Lazaridou, O. Firat, J. Molloy, M. Isard, P. R. Barham, T. Hennigan, B. Lee, F. Viola, M. Reynolds, Y. Xu, R. Doherty, E. Collins, C. Meyer, E. Rutherford, E. Moreira, K. Ayoub, M. Goel, G. Tucker, E. Piqueras, M. Krikun, I. Barr, N. Savinov, I. Danihelka, B. Roelofs, A. White, A. Andreassen, T. von Glehn, L. Yagati, M. Kazemi, L. Gonzalez, M. Khalman, J. Sygnowski, and et al. Gemini: A family of highly capable multimodal models. CoRR, abs/2312.11805, 2023. doi: 10.48550/ARXIV.2312.11805. URL https://doi.org/10.48550/arXiv.2312.11805.
1284 2 J. Austin, A. Odena, M. Nye, M. Bosma, H. Michalewski, D. Dohan, E. Jiang, C. Cai, M. Terry, Q. Le, et al. Program synthesis with large language models. arXiv preprint arXiv:2108.07732, 2021.
1285 3 Z. Azerbayev, H. Schoelkopf, K. Paster, M. D. Santos, S. McAleer, A. Q. Jiang, J. Deng, S. Biderman, and S. Welleck. Llemma: An open language model for mathematics. arXiv preprint arXiv:2310.10631, 2023.
1286 4 J. Bai, S. Bai, Y. Chu, Z. Cui, K. Dang, X. Deng, Y. Fan, W. Ge, Y. Han, F. Huang, et al. Qwen technical report. arXiv preprint arXiv:2309.16609, 2023.
C. Burns, P. Izmailov, J. H. Kirchner, B. Baker, L. Gao, L. Aschenbrenner, Y. Chen, A. Ecoffet, M. Joglekar, J. Leike, et al. Weak-to-strong generalization: Eliciting strong capabilities with weak supervision. arXiv preprint arXiv:2312.09390, 2023.
1287 5 ChatGLM3 Team. ChatGLM3 series: Open bilingual chat LLMs, 2023. URL https://github.com/THUDM/ChatGLM3.
1288 6 M. Chen, J. Tworek, H. Jun, Q. Yuan, H. P. de Oliveira Pinto, J. Kaplan, H. Edwards, Y. Burda, N. Joseph, G. Brockman, A. Ray, R. Puri, G. Krueger, M. Petrov, H. Khlaaf, G. Sastry, P. Mishkin, B. Chan, S. Gray, N. Ryder, M. Pavlov, A. Power, L. Kaiser, M. Bavarian, C. Winter, P. Tillet, F. P. Such, D. Cummings, M. Plappert, F. Chantzis, E. Barnes, A. Herbert-Voss, W. H. Guss, A. Nichol, A. Paino, N. Tezak, J. Tang, I. Babuschkin, S. Balaji, S. Jain, W. Saunders, C. Hesse, A. N. Carr, J. Leike, J. Achiam, V. Misra, E. Morikawa, A. Radford, M. Knight, M. Brundage, M. Murati, K. Mayer, P. Welinder, B. McGrew, D. Amodei, S. McCandlish, I. Sutskever, and W. Zaremba. Evaluating large language models trained on code. CoRR, abs/2107.03374, 2021. URL https://arxiv.org/abs/2107.03374.
1289 7 W. Chen, X. Ma, X. Wang, and W. W. Cohen. Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. CoRR, abs/2211.12588, 2022. doi: 10.48550/ARXIV.2211.12588. URL https://doi.org/10.48550/arXiv.2211.12588.
1290 8 K. Cobbe, V. Kosaraju, M. Bavarian, M. Chen, H. Jun, L. Kaiser, M. Plappert, J. Tworek, J. Hilton, R. Nakano, et al. Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168, 2021.
1291 9 Together Computer. RedPajama: an open dataset for training large language models, Oct. 2023. URL https://github.com/togethercomputer/RedPajama-Data.
1292 10 DeepSeek-AI. Deepseek LLM: scaling open-source language models with longtermism. CoRR, abs/2401.02954, 2024. doi: 10.48550/ARXIV.2401.02954. URL https://doi.org/10.48550/arXiv.2401.02954.
1293 11 Z. Du, Y. Qian, X. Liu, M. Ding, J. Qiu, Z. Yang, and J. Tang. Glm: General language model pretraining with autoregressive blank infilling. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 320–335, 2022.
1294 12 L. Gao, A. Madaan, S. Zhou, U. Alon, P. Liu, Y. Yang, J. Callan, and G. Neubig. PAL: program-aided language models. In A. Krause, E. Brunskill, K. Cho, B. Engelhardt, S. Sabato, and J. Scarlett, editors, International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA, volume 202 of Proceedings of Machine Learning Research, pages 10764–10799. PMLR, 2023. URL https://proceedings.mlr.press/v202/gao23f.html.
1295 13 Z. Gou, Z. Shao, Y. Gong, Y. Shen, Y. Yang, M. Huang, N. Duan, and W. Chen. ToRA: A tool-integrated reasoning agent for mathematical problem solving. CoRR, abs/2309.17452, 2023. doi: 10.48550/ARXIV.2309.17452. URL https://doi.org/10.48550/arXiv.2309.17452.
1296 14 D. Guo, Q. Zhu, D. Yang, Z. Xie, K. Dong, W. Zhang, G. Chen, X. Bi, Y. Wu, Y. K. Li, F. Luo, Y. Xiong, and W. Liang. Deepseek-coder: When the large language model meets programming – the rise of code intelligence, 2024.
1297 15 D. Hendrycks, C. Burns, S. Basart, A. Zou, M. Mazeika, D. Song, and J. Steinhardt. Measuring massive multitask language understanding. arXiv preprint arXiv:2009.03300, 2020.
1298 16 D. Hendrycks, C. Burns, S. Kadavath, A. Arora, S. Basart, E. Tang, D. Song, and J. Steinhardt. Measuring mathematical problem solving with the math dataset. arXiv preprint arXiv:2103.03874, 2021.
1299 17 High-flyer. HAI-LLM: An efficient and lightweight tool for training large models, 2023. URL https://www.high-flyer.cn/en/blog/hai-llm.
1300 18 Inflection AI. Inflection-2, 2023. URL https://inflection.ai/inflection-2.
1301 19 A. Q. Jiang, S. Welleck, J. P. Zhou, W. Li, J. Liu, M. Jamnik, T. Lacroix, Y. Wu, and G. Lample. Draft, sketch, and prove: Guiding formal theorem provers with informal proofs. arXiv preprint arXiv:2210.12283, 2022.
1302 20 A. Q. Jiang, A. Sablayrolles, A. Mensch, C. Bamford, D. S. Chaplot, D. d. l. Casas, F. Bressand, G. Lengyel, G. Lample, L. Saulnier, et al. Mistral 7b. arXiv preprint arXiv:2310.06825, 2023.
1303 21 A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jégou, and T. Mikolov. FastText.zip: Compressing text classification models. arXiv preprint arXiv:1612.03651, 2016.
1304 22 W. Kwon, Z. Li, S. Zhuang, Y. Sheng, L. Zheng, C. H. Yu, J. E. Gonzalez, H. Zhang, and I. Stoica. Efficient memory management for large language model serving with pagedattention. In Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles, 2023.
1305 23 Y. Leviathan, M. Kalman, and Y. Matias. Fast inference from transformers via speculative decoding. In International Conference on Machine Learning, pages 19274–19286. PMLR, 2023.
1306 24 A. Lewkowycz, A. Andreassen, D. Dohan, E. Dyer, H. Michalewski, V. Ramasesh, A. Slone, C. Anil, I. Schlag, T. Gutman-Solo, et al. Solving quantitative reasoning problems with language models. Advances in Neural Information Processing Systems, 35:3843–3857, 2022a.
1307 25 A. Lewkowycz, A. Andreassen, D. Dohan, E. Dyer, H. Michalewski, V. V. Ramasesh, A. Slone, C. Anil, I. Schlag, T. Gutman-Solo, Y. Wu, B. Neyshabur, G. Gur-Ari, and V. Misra. Solving quantitative reasoning problems with language models. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022, 2022b. URL http://papers.nips.cc/paper_files/paper/2022/hash/18abbeef8cfe9203fdf9053c9c4fe191-Abstract-Conference.html.
1308 26 H. Lightman, V. Kosaraju, Y. Burda, H. Edwards, B. Baker, T. Lee, J. Leike, J. Schulman, I. Sutskever, and K. Cobbe. Let’s verify step by step. arXiv preprint arXiv:2305.20050, 2023.
1309 27 I. Loshchilov and F. Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
1310 28 H. Luo, Q. Sun, C. Xu, P. Zhao, J. Lou, C. Tao, X. Geng, Q. Lin, S. Chen, and D. Zhang. Wizardmath: Empowering mathematical reasoning for large language models via reinforced evol-instruct. arXiv preprint arXiv:2308.09583, 2023.
1311 29 S. Mishra, M. Finlayson, P. Lu, L. Tang, S. Welleck, C. Baral, T. Rajpurohit, O. Tafjord, A. Sabharwal, P. Clark, and A. Kalyan. LILA: A unified benchmark for mathematical reasoning. In Y. Goldberg, Z. Kozareva, and Y. Zhang, editors, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi, United Arab Emirates, December 7-11, 2022, pages 5807–5832. Association for Computational Linguistics, 2022. doi: 10.18653/V1/2022.EMNLP-MAIN.392. URL https://doi.org/10.18653/v1/2022.emnlp-main.392.
1312 30 X. Nguyen, W. Zhang, X. Li, M. M. Aljunied, Q. Tan, L. Cheng, G. Chen, Y. Deng, S. Yang, C. Liu, H. Zhang, and L. Bing. Seallms - large language models for southeast asia. CoRR, abs/2312.00738, 2023. doi: 10.48550/ARXIV.2312.00738. URL https://doi.org/10.48550/arXiv.2312.00738.
1313 31 OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
1314 32 L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, et al. Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35:27730–27744, 2022.
1315 33 K. Paster, M. D. Santos, Z. Azerbayev, and J. Ba. Openwebmath: An open dataset of high-quality mathematical web text. CoRR, abs/2310.06786, 2023. doi: 10.48550/ARXIV.2310.06786. URL https://doi.org/10.48550/arXiv.2310.06786.
1316 34 L. C. Paulson. Three years of experience with sledgehammer, a practical link between automatic and interactive theorem provers. In R. A. Schmidt, S. Schulz, and B. Konev, editors, Proceedings of the 2nd Workshop on Practical Aspects of Automated Reasoning, PAAR-2010, Edinburgh, Scotland, UK, July 14, 2010, volume 9 of EPiC Series in Computing, pages 1–10. EasyChair, 2010. doi: 10.29007/TNFD. URL https://doi.org/10.29007/tnfd.
1317 35 S. Polu and I. Sutskever. Generative language modeling for automated theorem proving. CoRR, abs/2009.03393, 2020. URL https://arxiv.org/abs/2009.03393.
1318 36 R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, and C. Finn. Direct preference optimization: Your language model is secretly a reward model. 2023.
1319 37 J. Schulman. Approximating kl divergence, 2020. URL http://joschu.net/blog/kl-approx.html.
1320 38 J. Schulman, P. Moritz, S. Levine, M. Jordan, and P. Abbeel. High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438, 2015.
1321 39 J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.
1322 40 F. Shi, M. Suzgun, M. Freitag, X. Wang, S. Srivats, S. Vosoughi, H. W. Chung, Y. Tay, S. Ruder, D. Zhou, D. Das, and J. Wei. Language models are multilingual chain-of-thought reasoners. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023. URL https://openreview.net/pdf?id=fR3wGCk-IXp.
1323 41 F. Song, B. Yu, M. Li, H. Yu, F. Huang, Y. Li, and H. Wang. Preference ranking optimization for human alignment. arXiv preprint arXiv:2306.17492, 2023.
1324 42 M. Suzgun, N. Scales, N. Schärli, S. Gehrmann, Y. Tay, H. W. Chung, A. Chowdhery, Q. V. Le, E. H. Chi, D. Zhou, et al. Challenging big-bench tasks and whether chain-of-thought can solve them. arXiv preprint arXiv:2210.09261, 2022.
1325 43 T. Tao. Embracing change and resetting expectations, 2023. URL https://unlocked.microsoft.com/ai-anthology/terence-tao/.
1326 44 H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale, D. Bikel, L. Blecher, C. Canton-Ferrer, M. Chen, G. Cucurull, D. Esiobu, J. Fernandes, J. Fu, W. Fu, B. Fuller, C. Gao, V. Goswami, N. Goyal, A. Hartshorn, S. Hosseini, R. Hou, H. Inan, M. Kardas, V. Kerkez, M. Khabsa, I. Kloumann, A. Korenev, P. S. Koura, M. Lachaux, T. Lavril, J. Lee, D. Liskovich, Y. Lu, Y. Mao, X. Martinet, T. Mihaylov, P. Mishra, I. Molybog, Y. Nie, A. Poulton, J. Reizenstein, R. Rungta, K. Saladi, A. Schelten, R. Silva, E. M. Smith, R. Subramanian, X. E. Tan, B. Tang, R. Taylor, A. Williams, J. X. Kuan, P. Xu, Z. Yan, I. Zarov, Y. Zhang, A. Fan, M. Kambadur, S. Narang, A. Rodriguez, R. Stojnic, S. Edunov, and T. Scialom. Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288, 2023. doi: 10.48550/arXiv.2307.09288. URL https://doi.org/10.48550/arXiv.2307.09288.
1327 45 T. H. Trinh, Y. Wu, Q. V. Le, H. He, and T. Luong. Solving olympiad geometry without human demonstrations. Nature, 625(7995):476–482, 2024.
1328 46 P. Wang, L. Li, L. Chen, F. Song, B. Lin, Y. Cao, T. Liu, and Z. Sui. Making large language models better reasoners with alignment. arXiv preprint arXiv:2309.02144, 2023a.
1329 47 P. Wang, L. Li, Z. Shao, R. Xu, D. Dai, Y. Li, D. Chen, Y. Wu, and Z. Sui. Math-shepherd: Verify and reinforce llms step-by-step without human annotations. CoRR, abs/2312.08935, 2023b.
1330 48 Z. Wang, R. Xia, and P. Liu. Generative AI for math: Part I - mathpile: A billion-token-scale pretraining corpus for math. CoRR, abs/2312.17120, 2023c. doi: 10.48550/ARXIV.2312.17120. URL https://doi.org/10.48550/arXiv.2312.17120.
1331 49 J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. H. Chi, Q. V. Le, and D. Zhou. Chain-of-thought prompting elicits reasoning in large language models. In NeurIPS, 2022. URL http://papers.nips.cc/paper_files/paper/2022/hash/9d5609613524ecf4f15af0f7b31abca4-Abstract-Conference.html.
1332 50 T. Wei, J. Luan, W. Liu, S. Dong, and B. Wang. Cmath: Can your language model pass chinese elementary school math test?, 2023.
1333 51 M. Wenzel, L. C. Paulson, and T. Nipkow. The isabelle framework. In O. A. Mohamed, C. A. Muñoz, and S. Tahar, editors, Theorem Proving in Higher Order Logics, 21st International Conference, TPHOLs 2008, Montreal, Canada, August 18-21, 2008. Proceedings, volume 5170 of Lecture Notes in Computer Science, pages 33–38. Springer, 2008. doi: 10.1007/978-3-540-71067-7_7. URL https://doi.org/10.1007/978-3-540-71067-7_7.
1334 52 H. Xia, T. Ge, P. Wang, S.-Q. Chen, F. Wei, and Z. Sui. Speculative decoding: Exploiting speculative execution for accelerating seq2seq generation. In H. Bouamor, J. Pino, and K. Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, pages 3909–3925, Singapore, Dec. 2023. Association for Computational Linguistics. doi: 10.18653/v1/2023.findings-emnlp.257. URL https://aclanthology.org/2023.findings-emnlp.257.
1335 53 H. Xia, Z. Yang, Q. Dong, P. Wang, Y. Li, T. Ge, T. Liu, W. Li, and Z. Sui. Unlocking efficiency in large language model inference: A comprehensive survey of speculative decoding. arXiv preprint arXiv:2401.07851, 2024.
1336 54 S. Yao, D. Yu, J. Zhao, I. Shafran, T. L. Griffiths, Y. Cao, and K. Narasimhan. Tree of thoughts: Deliberate problem solving with large language models. arXiv preprint arXiv:2305.10601, 2023.
1337 55 L. Yu, W. Jiang, H. Shi, J. Yu, Z. Liu, Y. Zhang, J. T. Kwok, Z. Li, A. Weller, and W. Liu. Metamath: Bootstrap your own mathematical questions for large language models. CoRR, abs/2309.12284, 2023. doi: 10.48550/ARXIV.2309.12284. URL https://doi.org/10.48550/arXiv.2309.12284.
1338 56 Z. Yuan, H. Yuan, C. Li, G. Dong, C. Tan, and C. Zhou. Scaling relationship on learning mathematical reasoning with large language models. arXiv preprint arXiv:2308.01825, 2023a.
1339 57 Z. Yuan, H. Yuan, C. Tan, W. Wang, S. Huang, and F. Huang. Rrhf: Rank responses to align language models with human feedback without tears. arXiv preprint arXiv:2304.05302, 2023b.
1340 58 X. Yue, X. Qu, G. Zhang, Y. Fu, W. Huang, H. Sun, Y. Su, and W. Chen. Mammoth: Building math generalist models through hybrid instruction tuning. CoRR, abs/2309.05653, 2023. doi: 10.48550/ARXIV.2309.05653. URL https://doi.org/10.48550/arXiv.2309.05653.
1341 59 K. Zheng, J. M. Han, and S. Polu. Minif2f: a cross-system benchmark for formal olympiad-level mathematics. arXiv preprint arXiv:2109.00110, 2021.
1342 60 W. Zhong, R. Cui, Y. Guo, Y. Liang, S. Lu, Y. Wang, A. Saied, W. Chen, and N. Duan. AGIEval: A human-centric benchmark for evaluating foundation models. CoRR, abs/2304.06364, 2023. doi: 10.48550/arXiv.2304.06364. URL https://doi.org/10.48550/arXiv.2304.06364.
1343 29 0 [29] S. T. Sherry, M. H. Ward, M. Kholodov, J. Baker, L. Phan, E. M. Smigielski, and K. Sirotkin. dbsnp: the ncbi database of genetic variation. Nucleic Acids Research, 29:308–311, 1 2001. ISSN 0305-1048. doi: 10.1093/NAR/29.1.308. URL https://dx.doi.org/10.1093/nar/29.1.308. https://qiita.com/kaizen_nagoya/items/756da32e4c0868d84da0
1344 1 1 Kruglyak,L. (1999) Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nature Genet., 22, 139–144.
1345 2 2 Carulli,J.P., Artinger,M., Swain,P.M., Root,C.D., Chee,L., Tulig,C., Guerin,J., Osborne,M., Stein,G., Lian,J. and Lomedico,P.T. (1998) High throughput analysis of differential gene expression. J. Cell. Biochem., 30–31 (Suppl.), 286–296.
1346 3 3 Cavalli-Sforza,L.L. (1998) The DNA revolution in population genetics. Trends Genet., 14, 60–65.
1347 4 4 Collins,F.S. (1999) Shattuck lecture–medical and societal consequences of the Human Genome Project. N. Engl. J. Med., 341, 28–37.
1348 5 5 Buetow,K.H., Edmonson,M.N. and Cassidy,A.B. (1999) Reliable identification of large numbers of candidate SNPs from public EST data. Nature Genet., 21, 323–325.
1349 6 6 Masood,E. (1999) As consortium plans free SNP map of human genome. Nature, 398, 545–546.
1350 7 7 Brookes,A.J., Lehväslaiho,H., Siegfried,M., Boehm,J.G., Yuan,Y.P., Sarkar,C.M., Bork,P. and Ortigao,F. (2000) HGBASE A Database of SNPs and other variations in and around human genes. Nucleic Acids Res., 28, 356–360.
1351 [30] Z. Sondka, N. B. Dhir, D. Carvalho-Silva, S. Jupe, Madhumita, K. McLaren, M. Starkey, S. Ward, J. Wilding, M. Ahmed, J. Argasinska, D. Beare, M. S. Chawla, S. Duke, I. Fasanella, A. G. Neogi, S. Haller, B. Hetenyi, L. Hodges, A. Holmes, R. Lyne, T. Maurel, S. Nair, H. Pedro, A. Sangrador-Vegas, H. Schuilenburg, Z. Sheard, S. Y. Yong, and J. Teague. Cosmic: a curated database of somatic variants and clinical data for cancer. Nucleic Acids Research, 52:D1210–D1217, 1 2024. ISSN 0305-1048. doi: 10.1093/NAR/GKAD986. URL https://dx.doi.org/10.1093/nar/gkad986. https://qiita.com/kaizen_nagoya/items/2b0960d4e1ff26a9b01f
1352 1 Chang K., Creighton C.J., Davis C., Donehower L., Drummond J., Wheeler D., Ally A., Balasundaram M., Birol I., Butterfield Y.S.N.et al. . The cancer genome atlas pan-cancer analysis project. Nat. Genet. 2013; 45:1113–1120.
1353 2 ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium Pan-cancer analysis of whole genomes. Nature. 2020; 578:82–93.
1354 3 Pugh T.J., Bell J.L., Bruce J.P., Doherty G.J., Galvin M., Green M.F., Hunter-Zinck H., Kumari P., Lenoue-Newton M.L., Li M.M.et al. . AACR Project GENIE: 100,000 cases and beyond. Cancer Discov. 2022; 12:2044–2057.
1355 4 Forbes S.A., Beare D., Boutselakis H., Bamford S., Bindal N., Tate J., Cole C.G., Ward S., Dawson E., Ponting L.et al. . COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res. 2017; 45:D777–D783.
1356 5 Tate J.G., Bamford S., Jubb H.C., Sondka Z., Beare D.M., Bindal N., Boutselakis H., Cole C.G., Creatore C., Dawson E.et al. . COSMIC: the catalogue of somatic mutations In cancer. Nucleic Acids Res. 2019; 47:D941–D947.
1357 6 Bamford S., Dawson E., Forbes S., Clements J., Pettett R., Dogan A., Flanagan A., Teague J., Futreal P.A., Stratton M.R.et al. . The COSMIC (Catalogue of Somatic Mutations in Cancer) database and website. Br. J. Cancer. 2004; 91:355–358.
1358 7 Sondka Z., Bamford S., Cole C.G., Ward S.A., Dunham I., Forbes S.A. The COSMIC Cancer Gene Census: describing genetic dysfunction across all human cancers. Nat. Rev. Cancer. 2018; 18:696–705.
1359 8 Wilkinson M.D., Dumontier M., Aalbersberg I.J., Appleton G., Axton M., Baak A., Blomberg N., Boiten J.-W., da Silva Santos L.B., Bourne P.E.et al. . The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data. 2016; 3:160018.
1360 9 Hudson T.J., Anderson W., Aretz A., Barker A.D., Bell C., Bernabé R.R., Bhan M.K., Calvo F., Eerola I., Gerhard D.S.et al. . International network of cancer genome projects. Nature. 2010; 464:993–998.
1361 10 Seal R.L., Braschi B., Gray K., Jones T.E.M., Tweedie S., Haim-Vilmovsky L., Bruford E.A. Genenames.Org: the HGNC resources in 2023. Nucleic Acids Res. 2023; 51:D1003–D1009.
1362 11 Eilbeck K., Lewis S.E., Mungall C.J., Yandell M., Stein L., Durbin R., Ashburner M. The Sequence ontology: a tool for the unification of genome annotations. Genome Biol. 2005; 6:R44.
1363 12 den Dunnen J.T., Dalgleish R., Maglott D.R., Hart R.K., Greenblatt M.S., McGowan-Jordan J., Roux A.F., Smith T., Antonarakis S.E., Taschner P.E. HGVS recommendations for the description of sequence variants: 2016 update. Hum. Mutat. 2016; 37:564–569.
1364 13 Martin F.J., Amode M.R., Aneja A., Austine-Orimoloye O., Azov A.G., Barnes I., Becker A., Bennett R., Berry A., Bhai J.et al. . Ensembl 2023. Nucleic Acids Res. 2023; 51:D933–D941.
1365 14 Sioutos N., Coronado S.d., Haber M.W., Hartel F.W., Shaiu W.-L., Wright L.W. NCI Thesaurus: a semantic model integrating cancer-related clinical and molecular information. J. Biomed. Inform. 2007; 40:30–43.
1366 15 Frankish A., Diekhans M., Jungreis I., Lagarde J., Loveland J.E., Mudge J.M., Sisu C., Wright J.C., Armstrong J., Barnes I.et al. . Gencode 2021. Nucleic Acids Res. 2021; 49:D916–D923.
1367 16 McLaren W., Gil L., Hunt S.E., Riat H.S., Ritchie G.R.S., Thormann A., Flicek P., Cunningham F. The Ensembl variant effect predictor. Genome Biol. 2016; 17:122.
1368 17 Alexandrov L.B., Kim J., Haradhvala N.J., Huang M.N., Tian Ng A.W., Wu Y., Boot A., Covington K.R., Gordenin D.A., Bergstrom E.N.et al. . The repertoire of mutational signatures in human cancer. Nature. 2020; 578:94–101.
1369 18 Koh G., Degasperi A., Zou X., Momen S., Nik-Zainal S. Mutational signatures: emerging concepts, caveats and clinical applications. Nat. Rev. Cancer. 2021; 21:619–637.
1370 19 Islam S.M.A., Díaz-Gay M., Wu Y., Barnes M., Vangara R., Bergstrom E.N., He Y., Vella M., Wang J., Teague J.W.et al. . Uncovering novel mutational signatures by de novo extraction with SigProfilerExtractor. Cell Genomics. 2022; 2:100179. doi: 10.1016/j.xgen.2022.100179.
1371 20 Landrum M.J., Lee J.M., Benson M., Brown G.R., Chao C., Chitipiralla S., Gu B., Hart J., Hoffman D., Jang W.et al. . ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 2018; 46:D1062–D1067.
1372 21 Karczewski K.J., Francioli L.C., Tiao G., Cummings B.B., Alföldi J., Wang Q., Collins R.L., Laricchia K.M., Ganna A., Birnbaum D.P.et al. . The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020; 581:434–443.
1373 22 Martincorena I., Raine K.M., Gerstung M., Dawson K.J., Haase K., Van Loo P., Davies H., Stratton M.R., Campbell P.J. Universal patterns of selection in cancer and somatic tissues. Cell. 2017; 171:1029–1041.
1374 23 Cooper G.M., Stone E.A., Asimenos G., Green E.D., Batzoglou S., Sidow A. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 2005; 15:901–913.
1375 24 Vaser R., Adusumalli S., Leng S.N., Sikic M., Ng P.C. SIFT missense predictions for genomes. Nat. Protoc. 2016; 11:1–9.
1376 25 Hanahan D. Hallmarks of cancer: new dimensions. Cancer Discov. 2022; 12:31–46.
1377 26 Hanahan D., Weinberg R.A. The hallmarks of cancer. Cell. 2000; 100:57–70.
1378 27 Hanahan D., Weinberg R.A. Hallmarks of cancer: the next generation. Cell. 2011; 144:646–674.
1379 28 Eisenhauer E.A., Therasse P., Bogaerts J., Schwartz L.H., Sargent D., Ford R., Dancey J., Arbuck S., Gwyther S., Mooney M.et al. . New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur. J. Cancer. 2009; 45:228–247.
1380 29 Jubb H.C., Saini H.K., Verdonk M.L., Forbes S.A. COSMIC-3D provides structural perspectives on cancer genetics for drug discovery. Nat. Genet. 2018; 50:1200–1202.
1381 30 Ju D., Hui D., Hammond D.A., Wonkam A., Tishkoff S.A. Importance of including Non-European populations in large Human genetic studies to enhance precision medicine. Annu. Rev. Biomed. Data Sci. 2022; 5:321–339.
1382 31 Jia F., Teer J.K., Knepper T.C., Lee J.K., Zhou H.H., He Y.J., McLeod H.L. Discordance of somatic mutations between Asian and Caucasian patient populations with gastric cancer. Mol. Diagn. Ther. 2017; 21:179–185.
1383 31 0 [31] J. Su, Y. Lu, S. Pan, A. Murtadha, B. Wen, and Y. Liu. Roformer: Enhanced transformer with rotary position embedding, 2023. URL https://arxiv.org/abs/2104.09864. https://qiita.com/kaizen_nagoya/items/a12a45518f28a5133af2
1384 1 Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann N Dauphin. Convolutional sequence to sequence learning. In International Conference on Machine Learning, pages 1243–1252. PMLR, 2017.
1385 2 Md. Amirul Islam, Sen Jia, and Neil D. B. Bruce. How much position information do convolutional neural networks encode? ArXiv, abs/2001.08248, 2020.
1386 3 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. URL https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
1387 4 J. Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT, 2019.
1388 5 A. Radford, Jeffrey Wu, R. Child, David Luan, Dario Amodei, and Ilya Sutskever. Language models are unsupervised multitask learners. 2019.
1389 6 Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank Reddi, and Sanjiv Kumar. Are transformers universal approximators of sequence-to-sequence functions? In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=ByxRM0Ntvr.
1390 7 Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. Albert: A lite bert for self-supervised learning of language representations. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=H1eA7AEtvS.
1391 8 Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. ELECTRA: Pre-training text encoders as discriminators rather than generators. In ICLR, 2020. URL https://openreview.net/pdf?id=r1xMH1BtvB.
1392 9 A. Radford and Karthik Narasimhan. Improving language understanding by generative pre-training. 2018.
1393 10 Ankur P. Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. A decomposable attention model for natural language inference. In EMNLP, 2016.
1394 11 Peter Shaw, Jakob Uszkoreit, and Ashish Vaswani. Self-attention with relative position representations. In NAACL-HLT, 2018.
1395 12 Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, I. Simon, C. Hawthorne, Andrew M. Dai, M. Hoffman, M. Dinculescu, and D. Eck. Music transformer. arXiv preprint arXiv:1809.04281, 2018.
1396 13 Zihang Dai, Z. Yang, Yiming Yang, J. Carbonell, Quoc V. Le, and R. Salakhutdinov. Transformer-xl: Attentive language models beyond a fixed-length context. In ACL, 2019.
1397 14 Z. Yang, Zihang Dai, Yiming Yang, J. Carbonell, R. Salakhutdinov, and Quoc V. Le. Xlnet: Generalized autoregressive pretraining for language understanding. In NeurIPS, 2019.
1398 15 Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, W. Li, and Peter J. Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res., 21: 140:1–140:67, 2020.
1399 16 Guolin Ke, Di He, and T. Liu. Rethinking positional encoding in language pre-training. ArXiv, abs/2006.15595, 2020.
1400 17 Pengcheng He, Xiaodong Liu, Jianfeng Gao, and Weizhu Chen. Deberta: Decoding-enhanced bert with disentangled attention. ArXiv, abs/2006.03654, 2020.
1401 18 Zhiheng Huang, Davis Liang, Peng Xu, and Bing Xiang. Improve transformer models with better relative position embeddings. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 3327–3335, Online, November 2020. Association for Computational Linguistics. doi:10.18653/v1/2020.findings-emnlp.298. URL https://www.aclweb.org/anthology/2020.findings-emnlp.298.
1402 19 Xuanqing Liu, Hsiang-Fu Yu, Inderjit S. Dhillon, and Cho-Jui Hsieh. Learning to encode position for transformer with continuous dynamical model. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event, volume 119 of Proceedings of Machine Learning Research, pages 6327–6335. PMLR, 2020. URL http://proceedings.mlr.press/v119/liu20n.html.
1403 20 Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural ordinary differential equations. In Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicolò Cesa-Bianchi, and Roman Garnett, editors, Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada, pages 6572–6583, 2018a. URL https://proceedings.neurips.cc/paper/2018/hash/69386f6bb1dfed68692a24c8686939b9-Abstract.html.
1404 21 Benyou Wang, Donghao Zhao, Christina Lioma, Qiuchi Li, Peng Zhang, and Jakob Grue Simonsen. Encoding word order in complex embeddings. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=Hke-WTVtwr.
1405 22 Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, and François Fleuret. Transformers are rnns: Fast autoregressive transformers with linear attention. In International Conference on Machine Learning, pages 5156–5165. PMLR, 2020.
1406 23 Zhuoran Shen, Mingyuan Zhang, Haiyu Zhao, Shuai Yi, and Hongsheng Li. Efficient attention: Attention with linear complexities. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 3531–3539, 2021.
1407 24 Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel Bowman. GLUE: A multi-task benchmark and analysis platform for natural language understanding. 04 2018.
1408 25 Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, A. Gane, Tamás Sarlós, Peter Hawkins, J. Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, and Adrian Weller. Rethinking attention with performers. ArXiv, abs/2009.14794, 2020.
1409 26 Ondrej Bojar, Christian Buck, Christian Federmann, Barry Haddow, Philipp Koehn, Johannes Leveling, Christof Monz, Pavel Pecina, Matt Post, Herve Saint-Amand, Radu Soricut, Lucia Specia, and Aleš Tamchyna. Findings of the 2014 workshop on statistical machine translation. pages 12–58, 06 2014. doi:10.3115/v1/W14-3302.
1410 27 Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. 08 2015.
1411 28 Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, and Michael Auli. fairseq: A fast, extensible toolkit for sequence modeling. pages 48–53, 01 2019. doi:10.18653/v1/N19-4009.
1412 29 Kishore Papineni, Salim Roukos, Todd Ward, and Wei Jing Zhu. Bleu: a method for automatic evaluation of machine translation. 10 2002. doi:10.3115/1073083.1073135.
1413 30 Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. Aligning books and movies: Towards story-like visual explanations by watching movies and reading books. In arXiv preprint arXiv:1506.06724, 2015.
1414 31 Wikimedia Foundation. Wikimedia downloads, https://dumps.wikimedia.org, 2021.
1415 32 Ilya Loshchilov and Frank Hutter. Decoupled Weight Decay Regularization. arXiv e-prints, art. arXiv:1711.05101, November 2017.
1416 33 William B. Dolan and Chris Brockett. Automatically constructing a corpus of sentential paraphrases. In Proceedings of the Third International Workshop on Paraphrasing (IWP2005), 2005. URL https://www.aclweb.org/anthology/I05-5002.
1417 34 Richard Socher, A. Perelygin, J.Y. Wu, J. Chuang, C.D. Manning, A.Y. Ng, and C. Potts. Recursive deep models for semantic compositionality over a sentiment treebank. EMNLP, 1631:1631–1642, 01 2013.
1418 35 Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. Squad: 100,000+ questions for machine comprehension of text. pages 2383–2392, 01 2016. doi:10.18653/v1/D16-1264.
1419 36 Hussein Al-Natsheh. Udl at semeval-2017 task 1: Semantic textual similarity estimation of english sentence pairs using regression model over pairwise features. 08 2017.
1420 37 Z. Chen, H. Zhang, X. Zhang, and L. Zhao. Quora question pairs, 2018b. URL https://www.kaggle.com/c/quora-question-pairs.
1421 38 Adina Williams, Nikita Nangia, and Samuel Bowman. A broad-coverage challenge corpus for sentence understanding through inference. pages 1112–1122, 01 2018. doi:10.18653/v1/N18-1101.
1422 39 Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander M. Rush. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online, October 2020. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/2020.emnlp-demos.6.
1423 40 Matt Mahoney. Large text compression benchmark, http://www.mattmahoney.net/dc/text.html, 2006.
1424 41 Jianlin Su. Wobert: Word-based chinese bert model - zhuiyiai. Technical report, 2020. URL https://github.com/ZhuiyiTechnology/WoBERT.
1425 42 Victor Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen, and Qun Liu. Nezha: Neural contextualized representation for chinese language understanding. 08 2019.
1426 43 Chaojun Xiao, Haoxi Zhong, Zhipeng Guo, Cunchao Tu, Zhiyuan Liu, Maosong Sun, Tianyang Zhang, Xianpei Han, Zhen hu, Heng Wang, and Jianfeng Xu. Cail2019-scm: A dataset of similar case matching in legal domain. 11 2019.
1427 32 0 [32] E. Wang, S. Schmidgall, P. F. Jaeger, F. Zhang, R. Pilgrim, Y. Matias, J. Barral, D. Fleet, and S. Azizi. Txgemma: Efficient and agentic llms for therapeutics. https://arxiv.org/pdf/2504.06196 https://qiita.com/kaizen_nagoya/items/e4eff5d51f926e943b9e
1428 1 Chen, J., Hu, Y., Wang, Y., Lu, Y., Cao, X., Lin, M., Xu, H., Wu, J., Xiao, C., Sun, J., et al. TrialBench: Multi-modal artificial intelligence-ready clinical trial datasets. arXiv preprint arXiv:2407.00631 (2024).
1429 2 Kuo, K.-T., Mao, T.-L., Jones, S., Veras, E., Ayhan, A., Wang, T.-L., Glas, R., Slamon, D., Velculescu, V. E., Kurman, R. J., et al. Frequent activating mutations of PIK3CA in ovarian clear cell carcinoma. The American journal of pathology 174, 1597–1601 (2009).
1430 3 Leontiadou, H., Galdadas, I., Athanasiou, C. & Cournia, Z. Insights into the mechanism of the PIK3CA E545K activating mutation using MD simulations. Scientific reports 8, 15544 (2018).
1431 4 Chen, H., Si, Y., Wen, J., Hu, C., Xia, E., Wang, Y. & Wang, O. P110α inhibitor alpelisib exhibits a synergistic effect with pyrotinib and reverses pyrotinib resistance in HER2+ breast cancer. Neoplasia 43, 100913 (2023).
1432 5 Fritsch, C., Huang, A., Chatenay-Rivauday, C., Schnell, C., Reddy, A., Liu, M., Kauffmann, A., Guthy, D., Erdmann, D., De Pover, A., et al. Characterization of the novel and specific PI3Kα inhibitor NVP-BYL719 and development of the patient stratification strategy for clinical trials. Molecular cancer therapeutics 13, 1117–1129 (2014).
1433 6 Narayan, P., Prowell, T. M., Gao, J. J., Fernandes, L. L., Li, E., Jiang, X., Qiu, J., Fan, J., Song, P., Yu, J., et al. FDA approval summary: alpelisib plus fulvestrant for patients with HR-positive, HER2-negative, PIK3CA-mutated, advanced or metastatic breast cancer. Clinical Cancer Research 27, 1842–1849 (2021).
1434 7 Passarelli, A., Carbone, V., Pignata, S., Mazzeo, R., Lorusso, D., Scambia, G., Canova, S., Di Palma, T., Tasca, G., Mantiero, M., et al. Alpelisib for PIK3CA-mutated advanced gynecological cancers: first clues of clinical activity. Gynecologic Oncology 183, 61–67 (2024).
1435 8 Thibault, B., Thole, A., D’Angelo, R., Basset, C. & Guillermet-Guibert, J. PI3Kα-specific inhibitor BYL-719 synergizes with cisplatin in vitro in PIK3CA-mutated ovarian cancer cells. Scientific Reports 15, 6265 (2025).
1436 9 Hu, X., Xia, M., Wang, J., Yu, H., Chai, J., Zhang, Z., Sun, Y., Su, J. & Sun, L. Dual PI3K/mTOR inhibitor PKI-402 suppresses the growth of ovarian cancer cells by degradation of Mcl-1 through autophagy. Biomedicine & Pharmacotherapy 129, 110397 (2020).
1437 10 Turon, G., Hlozek, J., Woodland, J. G., Kumar, A., Chibale, K. & Duran-Frigola, M. First fully-automated AI/ML virtual screening cascade implemented at a drug discovery centre in Africa. Nature Communications 14, 5736 (2023).
1438 11 Fontenot, R., Kathad, U., McDermott, J., Sturtevant, D., Sharma, P. & Carr, P. Predicting a Compound's Blood-Brain-Barrier Permeability with Lantern Pharma's AI and ML Platform, RADR, 2023.
1439 12 Bera, S., Dent, J., Gill, G., Stolman, A. & Wu, B. SimGCN for TDC Benchmarks (2022).
1440 13 Plonka, W., Stork, C., Šícho, M. & Kirchmair, J. CYPlebrity: Machine learning models for the prediction of inhibitors of cytochrome P450 enzymes. Bioorganic & medicinal chemistry 46, 116388 (2021).
1441 14 Hu, W., Liu, B., Gomes, J., Zitnik, M., Liang, P., Pande, V. & Leskovec, J. Strategies for pre-training graph neural networks. arXiv preprint arXiv:1905.12265 (2019).
1442 15 Huang, K., Fu, T., Glass, L. M., Zitnik, M., Xiao, C. & Sun, J. DeepPurpose: a deep learning library for drug–target interaction prediction. Bioinformatics 36, 5545–5547 (2020).
1443 16 Lagunin, A., Filimonov, D., Zakharov, A., Xie, W., Huang, Y., Zhu, F., Shen, T., Yao, J. & Poroikov, V. Computer-aided prediction of rodent carcinogenicity by PASS and CISOC-PSCT. QSAR & Combinatorial Science 28, 806–810 (2009).
1444 17 Li, P., Li, Y., Hsieh, C.-Y., Zhang, S., Liu, X., Liu, H., Song, S. & Yao, X. TrimNet: learning molecular representation from triplet messages for biomedicine. Briefings in Bioinformatics 22, bbaa266 (2021).
1445 18 Huang, D., Chowdhuri, S. R., Li, A., Li, A., Agrawal, A., Gano, K. & Zhu, A. A Unified System for Molecular Property Predictions: Oloren ChemEngine and its Applications (2022).
1446 19 Li, J., Cai, D. & He, X. Learning graph-level representation for drug discovery. arXiv preprint arXiv:1709.03741 (2017).
1447 20 Raimondi, D., Simm, J., Arany, A. & Moreau, Y. A novel method for data fusion over entity-relation graphs and its application to protein–protein interaction prediction. Bioinformatics 37, 2275–2281 (2021).
1448 21 Gfeller, D., Schmidt, J., Croce, G., Guillaume, P., Bobisse, S., Genolet, R., Queiroz, L., Cesbron, J., Racle, J. & Harari, A. Improved predictions of antigen presentation and TCR recognition with MixMHCpred2.2 and PRIME2.0 reveal potent SARS-CoV-2 CD8+ T-cell epitopes. Cell Systems 14, 72–83 (2023).
1449 22 Motmaen, A., Dauparas, J., Baek, M., Abedi, M. H., Baker, D. & Bradley, P. Peptide-binding specificity prediction using fine-tuned protein structure prediction networks. Proceedings of the National Academy of Sciences 120, e2216697120 (2023).
1450 23 Siramshetty, V., Williams, J., Nguyen, Ð., Neyra, J., Southall, N., Mathé, E., Xu, X. & Shah, P. Validating ADME QSAR models using marketed drugs. SLAS DISCOVERY: Advancing the Science of Drug Discovery 26, 1326–1336 (2021).
1451 24 Haneczok, J. & Delijewski, M. Machine learning enabled identification of potential SARS-CoV-2 3CLpro inhibitors based on fixed molecular fingerprints and Graph-CNN neural representations. Journal of Biomedical Informatics 119, 103821 (2021).
1452 25 Liu, Y., Wu, Y., Shen, X. & Xie, L. COVID-19 multi-targeted drug repurposing using few-shot learning. Frontiers in Bioinformatics 1, 693177 (2021).
1453 26 Chen, X., Dougherty, T., Hong, C., Schibler, R., Zhao, Y. C., Sadeghi, R., Matasci, N., Wu, Y.-C. & Kerman, I. Predicting antibody developability from sequence using machine learning. bioRxiv (2020).
1454 27 Alves, V. M., Muratov, E., Fourches, D., Strickland, J., Kleinstreuer, N., Andrade, C. H. & Tropsha, A. Predicting chemically-induced skin reactions. Part I: QSAR models of skin sensitization and their application to identify potentially hazardous compounds. Toxicology and applied pharmacology 284, 262–272 (2015).
1455 28 Shermukhamedov, S., Mamurjonova, D. & Probst, M. Structure to Property: Chemical Element Embeddings and a Deep Learning Approach for Accurate Prediction of Chemical Properties. arXiv preprint arXiv:2309.09355 (2023).
1456 29 Vu, O., Mendenhall, J., Altarawy, D. & Meiler, J. BCL::Mol2D: a robust atom environment descriptor for QSAR modeling and lead optimization. Journal of Computer-Aided Molecular Design 33, 477–486 (2019).
1457 30 Karim, A., Lee, M., Balle, T. & Sattar, A. CardioTox net: a robust predictor for hERG channel blockade based on deep learning meta-feature ensembles. Journal of Cheminformatics 13, 1–13 (2021).
1458 31 Korotcov, A., Tkachenko, V., Russo, D. P. & Ekins, S. Comparison of deep learning with multiple machine learning methods and metrics using diverse drug discovery data sets. Molecular Pharmaceutics 14, 4462–4475 (2017).
1459 32 Wong, L., You, Z.-H., Guo, Z.-H., Yi, H.-C., Chen, Z.-H. & Cao, M.-Y. MIPDH: a novel computational model for predicting microRNA–mRNA interactions by DeepWalk on a heterogeneous network. ACS Omega 5, 17022–17032 (2020).
1460 33 Fu, T., Huang, K., Xiao, C., Glass, L. M. & Sun, J. HINT: Hierarchical interaction network for clinical-trial-outcome predictions. Patterns 3 (2022).
1461 34 Weber, A., Born, J. & Rodriguez Martínez, M. TITAN: T-cell receptor specificity prediction with bimodal attention networks. Bioinformatics 37, i237–i244 (2021).
1462 35 Lam, H. T., Sbodio, M. L., Galindo, M. M., Zayats, M., Fernandez-Diaz, R., Valls, V., Picco, G., Ramis, C. B. & Lopez, V. Otter-Knowledge: benchmarks of multimodal knowledge graph representation learning from different sources for drug discovery. arXiv preprint arXiv:2306.12802 (2023).
1463 36 Kinnings, S. L., Liu, N., Tonge, P. J., Jackson, R. M., Xie, L. & Bourne, P. E. A machine learning-based method to improve docking scoring functions and its application to drug repurposing. Journal of Chemical Information and Modeling 51, 408–419 (2011).
1464 37 Kalemati, M., Zamani Emani, M. & Koohi, S. BiComp-DTA: Drug-target binding affinity prediction through complementary biological-related and compression-based featurization approach. PLOS Computational Biology 19, e1011036 (2023).
1465 38 Wei, B. & Gong, X. DeepPLA: a novel deep learning-based model for protein-ligand binding affinity prediction (2021).
1466 39 Probst, D., Schwaller, P. & Reymond, J.-L. Reaction classification and yield prediction using the differential reaction fingerprint DRFP. Digital discovery 1, 91–97 (2022).
1467 40 Rivera, Z. A., Tayo, L., Chen, B.-Y. & Tsai, P.-W. In silico Evaluation of the Feasibility of Magnolia officinalis Electron-shuttling Compounds as Parkinson's Disease Remedy. Letters in Drug Design & Discovery 21, 3039–3048 (2024).
1468 41 Pei, Q., Wu, L., Zhu, J., Xia, Y., Xie, S., Qin, T., Liu, H., Liu, T.-Y. & Yan, R. Breaking the barriers of data scarcity in drug–target affinity prediction. Briefings in Bioinformatics 24, bbad386 (2023).
1469 42 Xia, F., Shukla, M., Brettin, T., Garcia-Cardona, C., Cohn, J., Allen, J. E., Maslov, S., Holbeck, S. L., Doroshow, J. H., Evrard, Y. A., et al. Predicting tumor cell line response to drug pairs with deep learning. BMC Bioinformatics 19, 71–79 (2018).
1470 43 Lind, A. P. & Anderson, P. C. Predicting drug activity against cancer cells by random forest models based on minimal genomic information and chemical properties. PLoS ONE 14, e0219774 (2019).
1471 44 Euclia. https://github.com/euclia/public-models, 2023.
1472 45 Leenay, R. T., Aghazadeh, A., Hiatt, J., Tse, D., Roth, T. L., Apathy, R., Shifrut, E., Hultquist, J. F., Krogan, N., Wu, Z., et al. Large dataset enables prediction of repair after CRISPR–Cas9 editing in primary T cells. Nature Biotechnology 37, 1034–1037 (2019).
1473 46 Yang, K., Swanson, K., Jin, W., Coley, C., Eiden, P., Gao, H., Guzman-Perez, A., Hopper, T., Kelley, B., Mathea, M., et al. Analyzing learned molecular representations for property prediction. Journal of Chemical Information and Modeling 59, 3370–3388 (2019).
1474 47 Preuer, K., Lewis, R. P., Hochreiter, S., Bender, A., Bulusu, K. C. & Klambauer, G. DeepSynergy: predicting anti-cancer drug synergy with Deep Learning. Bioinformatics 34, 1538–1546 (2018).
1475 48 Zheng, S., Rao, J., Zhang, Z., Xu, J. & Yang, Y. Predicting retrosynthetic reactions using self-corrected transformer neural networks. Journal of Chemical Information and Modeling 60, 47–55 (2019).
1476 49 Boral, N., Ghosh, P., Goswami, A. & Bhattacharyya, M. Accountable prediction of drug ADMET properties with molecular descriptors. bioRxiv (2022).
1477 50 Hendrycks, D., Burns, C., Basart, S., Zou, A., Mazeika, M., Song, D. & Steinhardt, J. Measuring massive multitask language understanding. arXiv preprint arXiv:2009.03300 (2020).
1478 33 0 [33] A. Yang, B. Yang, B. Hui, B. Zheng, B. Yu, C. Zhou, C. Li, C. Li, D. Liu, F. Huang, G. Dong, H. Wei, H. Lin, J. Tang, J. Wang, J. Yang, J. Tu, J. Zhang, J. Ma, J. Xu, J. Zhou, J. Bai, J. He, J. Lin, K. Dang, K. Lu, K. Chen, K. Yang, M. Li, M. Xue, N. Ni, P. Zhang, P. Wang, R. Peng, R. Men, R. Gao, R. Lin, S. Wang, S. Bai, S. Tan, T. Zhu, T. Li, T. Liu, W. Ge, X. Deng, X. Zhou, X. Ren, X. Zhang, X. Wei, X. Ren, Y. Fan, Y. Yao, Y. Zhang, Y. Wan, Y. Chu, Y. Liu, Z. Cui, Z. Zhang, and Z. Fan. Qwen2 technical report. arXiv preprint arXiv:2407.10671, 2024. https://arxiv.org/pdf/2407.10671 https://qiita.com/kaizen_nagoya/items/29a77b25282c8822011e
1479 1 Marah Abdin, Jyoti Aneja, Sebastien Bubeck, Caio César Teodoro Mendes, Weizhu Chen, Allie Del Giorno, Ronen Eldan, Sivakanth Gopi, Suriya Gunasekar, Mojan Javaheripi, Piero Kauffmann, Yin Tat Lee, Yuanzhi Li, Anh Nguyen, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Michael Santacroce, Harkirat Singh Behl, Adam Tauman Kalai, Xin Wang, Rachel Ward, Philipp Witte, Cyril Zhang, and Yi Zhang. Phi-2: The surprising power of small language models, 2024. URL https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/.
1480 2 AI@Meta. Llama 3 model card, 2024. URL https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md.
1481 3 Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón, and Sumit Sanghai. GQA: Training generalized multi-query Transformer models from multi-head checkpoints. In EMNLP, pp. 4895–4901. Association for Computational Linguistics, 2023.
1482 4 Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, and Lingpeng Kong. Training-free long-context scaling of large language models. CoRR, abs/2402.17463, 2024.
1483 5 Anthropic. The Claude 3 model family: Opus, Sonnet, Haiku. Technical report, Anthropic, AI, 2024. URL https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf.
1484 6 Jacob Austin, Augustus Odena, Maxwell I. Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie J. Cai, Michael Terry, Quoc V. Le, and Charles Sutton. Program synthesis with large language models. CoRR, abs/2108.07732, 2021.
1485 7 Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, and Tianhang Zhu. Qwen technical report. CoRR, abs/2309.16609, 2023a.
1486 8 Jinze Bai, Shuai Bai, Shusheng Yang, Shijie Wang, Sinan Tan, Peng Wang, Junyang Lin, Chang Zhou, and Jingren Zhou. Qwen-VL: A frontier large vision-language model with versatile abilities. CoRR, abs/2308.12966, 2023b.
1487 9 Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, Carol Chen, Catherine Olsson, Christopher Olah, Danny Hernandez, Dawn Drain, Deep Ganguli, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jamie Kerr, Jared Mueller, Jeffrey Ladish, Joshua Landau, Kamal Ndousse, Kamile Lukosiute, Liane Lovitt, Michael Sellitto, Nelson Elhage, Nicholas Schiefer, Noemí Mercado, Nova DasSarma, Robert Lasenby, Robin Larson, Sam Ringer, Scott Johnston, Shauna Kravec, Sheer El Showk, Stanislav Fort, Tamera Lanham, Timothy Telleen-Lawton, Tom Conerly, Tom Henighan, Tristan Hume, Samuel R. Bowman, Zac Hatfield-Dodds, Ben Mann, Dario Amodei, Nicholas Joseph, Sam McCandlish, Tom Brown, and Jared Kaplan. Constitutional AI: Harmlessness from AI feedback. CoRR, abs/2212.08073, 2022.
1488 10 Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, and Madian Khabsa. The Belebele benchmark: A parallel reading comprehension dataset in 122 language variants. CoRR, abs/2308.16884, 2023.
1489 11 Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, and Bowen Yu. Towards scalable automated alignment of LLMs: A survey. CoRR, abs/2406.01252, 2024.
1490 12 Federico Cassano, John Gouwar, Daniel Nguyen, Sydney Nguyen, Luna Phipps-Costin, Donald Pinckney, Ming-Ho Yee, Yangtian Zi, Carolyn Jane Anderson, Molly Q. Feldman, Arjun Guha, Michael Greenberg, and Abhinav Jangda. MultiPL-E: A scalable and polyglot approach to benchmarking neural code generation. IEEE Trans. Software Eng., 49(7):3675–3691, 2023.
1491 13 Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pondé de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Joshua Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, and Wojciech Zaremba. Evaluating large language models trained on code. CoRR, abs/2107.03374, 2021.
1492 14 Wenhu Chen, Ming Yin, Max Ku, Pan Lu, Yixin Wan, Xueguang Ma, Jianyu Xu, Xinyi Wang, and Tony Xia. TheoremQA: A theorem-driven question answering dataset. In EMNLP, pp. 7889–7901. Association for Computational Linguistics, 2023a.
1493 15 Zhihong Chen, Shuo Yan, Juhao Liang, Feng Jiang, Xiangbo Wu, Fei Yu, Guiming Hardy Chen, Junying Chen, Hongbo Zhang, Li Jianquan, Wan Xiang, and Benyou Wang. MultilingualSIFT: Multilingual supervised instruction fine-tuning, 2023b. URL https://github.com/FreedomIntelligence/MultilingualSIFT.
1494 16 Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael I. Jordan, Joseph E. Gonzalez, and Ion Stoica. Chatbot arena: An open platform for evaluating LLMs by human preference. CoRR, abs/2403.04132, 2024.
1495 17 Yunfei Chu, Jin Xu, Xiaohuan Zhou, Qian Yang, Shiliang Zhang, Zhijie Yan, Chang Zhou, and Jingren Zhou. Qwen-Audio: Advancing universal audio understanding via unified large-scale audio-language models. CoRR, abs/2311.07919, 2023.
1496 18 Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord. Think you have solved question answering? Try ARC, the AI2 reasoning challenge. CoRR, abs/1803.05457, 2018.
1497 19 Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, and John Schulman. Training verifiers to solve math word problems. CoRR, abs/2110.14168, 2021.
1498 20 Damai Dai, Chengqi Deng, Chenggang Zhao, R. X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Y. Wu, Zhenda Xie, Y. K. Li, Panpan Huang, Fuli Luo, Chong Ruan, Zhifang Sui, and Wenfeng Liang. DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models. CoRR, abs/2401.06066, 2024.
1499 21 Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. Language modeling with gated convolutional networks. In ICML, volume 70 of Proceedings of Machine Learning Research, pp. 933–941. PMLR, 2017.
1500 22 Guanting Dong, Hongyi Yuan, Keming Lu, Chengpeng Li, Mingfeng Xue, Dayiheng Liu, Wei Wang, Zheng Yuan, Chang Zhou, and Jingren Zhou. How abilities in large language models are affected by supervised fine-tuning data composition. CoRR, abs/2310.05492, 2023.
1501 23 Guanting Dong, Keming Lu, Chengpeng Li, Tingyu Xia, Bowen Yu, Chang Zhou, and Jingren Zhou. Self-play with execution feedback: Improving instruction-following capabilities of large language models. CoRR, abs/2406.13542, 2024.
1502 24 Alena Fenogenova, Artem Chervyakov, Nikita Martynov, Anastasia Kozlova, Maria Tikhonova, Albina Akhmetgareeva, Anton A. Emelyanov, Denis Shevelev, Pavel Lebedev, Leonid Sinev, Ulyana Isaeva, Katerina Kolomeytseva, Daniil Moskovskiy, Elizaveta Goncharova, Nikita Savushkin, Polina Mikhailova, Denis Dimitrov, Alexander Panchenko, and Sergey Markov. MERA: A comprehensive LLM evaluation in russian. CoRR, abs/2401.04531, 2024.
1503 25 Shahriar Golchin and Mihai Surdeanu. Time travel in LLMs: Tracing data contamination in large language models. In ICLR. OpenReview.net, 2024.
1504 26 Naman Goyal, Cynthia Gao, Vishrav Chaudhary, Peng-Jen Chen, Guillaume Wenzek, Da Ju, Sanjana Krishnan, Marc'Aurelio Ranzato, Francisco Guzmán, and Angela Fan. The Flores-101 evaluation benchmark for low-resource and multilingual machine translation. Trans. Assoc. Comput. Linguistics, 10:522–538, 2022.
1505 27 Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt. Measuring massive multitask language understanding. In ICLR. OpenReview.net, 2021a.
1506 28 Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, and Jacob Steinhardt. Measuring mathematical problem solving with the MATH dataset. In NeurIPS Datasets and Benchmarks, 2021b.
1507 29 Yuzhen Huang, Yuzhuo Bai, Zhihao Zhu, Junlei Zhang, Jinghan Zhang, Tangjun Su, Junteng Liu, Chuancheng Lv, Yikai Zhang, Jiayi Lei, Yao Fu, Maosong Sun, and Junxian He. C-Eval: A multi-level multi-discipline chinese evaluation suite for foundation models. In NeurIPS, 2023.
1508 30 Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, and Ion Stoica. LiveCodeBench: Holistic and contamination free evaluation of large language models for code. CoRR, abs/2403.07974, 2024.
1509 31 Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mistral 7B. CoRR, abs/2310.06825, 2023a.
1510 32 Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mixtral of experts. CoRR, abs/2401.04088, 2024.
1511 33 Zixuan Jiang, Jiaqi Gu, Hanqing Zhu, and David Z. Pan. Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and efficient pre-LN Transformers. CoRR, abs/2305.14858, 2023b.
1512 34 Gregory Kamradt. Needle in a haystack - pressure testing LLMs, 2023. URL https://github.com/gkamradt/LLMTest_NeedleInAHaystack.
1513 35 Aran Komatsuzaki, Joan Puigcerver, James Lee-Thorp, Carlos Riquelme Ruiz, Basil Mustafa, Joshua Ainslie, Yi Tay, Mostafa Dehghani, and Neil Houlsby. Sparse upcycling: Training mixture-of-experts from dense checkpoints. In ICLR. OpenReview.net, 2023.
1514 36 Fajri Koto, Nurul Aisyah, Haonan Li, and Timothy Baldwin. Large language models only pass primary school exams in Indonesia: A comprehensive test on IndoMMLU. In EMNLP, pp. 12359–12374. Association for Computational Linguistics, 2023.
1515 37 Haonan Li, Yixuan Zhang, Fajri Koto, Yifei Yang, Hai Zhao, Yeyun Gong, Nan Duan, and Timothy Baldwin. CMMLU: Measuring massive multitask language understanding in Chinese. CoRR, abs/2306.09212, 2023.
1516 38 Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. From crowdsourced data to high-quality benchmarks: Arena-Hard and BenchBuilder pipeline. CoRR, abs/2406.11939, 2024.
1517 39 Opher Lieber, Barak Lenz, Hofit Bata, Gal Cohen, Jhonathan Osin, Itay Dalmedigos, Erez Safahi, Shaked Meirom, Yonatan Belinkov, Shai Shalev-Shwartz, Omri Abend, Raz Alon, Tomer Asida, Amir Bergman, Roman Glozman, Michael Gokhman, Avashalom Manevich, Nir Ratner, Noam Rozen, Erez Shwartz, Mor Zusman, and Yoav Shoham. Jamba: A hybrid Transformer-Mamba language model. CoRR, abs/2403.19887, 2024.
1518 40 Stephanie Lin, Jacob Hilton, and Owain Evans. TruthfulQA: Measuring how models mimic human falsehoods. In ACL (1), pp. 3214–3252. Association for Computational Linguistics, 2022a.
1519 41 Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O’Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona T. Diab, Veselin Stoyanov, and Xian Li. Few-shot learning with multilingual generative language models. In EMNLP, pp. 9019–9052. Association for Computational Linguistics, 2022b.
1520 42 Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. Is your code generated by ChatGPT really correct? Rigorous evaluation of large language models for code generation. In NeurIPS, 2023a.
1521 43 Xiao Liu, Xuanyu Lei, Shengyuan Wang, Yue Huang, Zhuoer Feng, Bosi Wen, Jiale Cheng, Pei Ke, Yifan Xu, Weng Lam Tam, Xiaohan Zhang, Lichao Sun, Hongning Wang, Jing Zhang, Minlie Huang, Yuxiao Dong, and Jie Tang. AlignBench: Benchmarking Chinese alignment of large language models. CoRR, abs/2311.18743, 2023b.
1522 44 Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, and Chang Zhou. Online merging optimizers for boosting rewards and mitigating tax in alignment. CoRR, abs/2405.17931, 2024a.
1523 45 Keming Lu, Bowen Yu, Chang Zhou, and Jingren Zhou. Large language models are superpositions of all characters: Attaining arbitrary role-play via self-alignment. CoRR, abs/2401.12474, 2024b.
1524 46 Keming Lu, Hongyi Yuan, Zheng Yuan, Runji Lin, Junyang Lin, Chuanqi Tan, Chang Zhou, and Jingren Zhou. #InsTag: Instruction tagging for analyzing supervised fine-tuning of large language models. In ICLR. OpenReview.net, 2024c.
1525 47 Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Antonia Paterson, Beth Tsai, Bobak Shahriari, Charline Le Lan, Christopher A. Choquette-Choo, Clément Crepy, Daniel Cer, Daphne Ippolito, David Reid, Elena Buchatskaya, Eric Ni, Eric Noland, Geng Yan, George Tucker, George-Christian Muraru, Grigory Rozhdestvenskiy, Henryk Michalewski, Ian Tenney, Ivan Grishchenko, Jacob Austin, James Keeling, Jane Labanowski, Jean-Baptiste Lespiau, Jeff Stanway, Jenny Brennan, Jeremy Chen, Johan Ferret, Justin Chiu, Justin Mao-Jones, Katherine Lee, Kathy Yu, Katie Millican, Lars Lowe Sjoesund, Lisa Lee, Lucas Dixon, Machel Reid, Maciej Mikuła, Mateo Wirth, Michael Sharman, Nikolai Chinaev, Nithum Thain, Olivier Bachem, Oscar Chang, Oscar Wahltinez, Paige Bailey, Paul Michel, Petko Yotov, Rahma Chaabouni, Ramona Comanescu, Reena Jana, Rohan Anil, Ross McIlroy, Ruibo Liu, Ryan Mullins, Samuel L Smith, Sebastian Borgeaud, Sertan Girgin, Sholto Douglas, Shree Pandya, Siamak Shakeri, Soham De, Ted Klimenko, Tom Hennigan, Vlad Feinberg, Wojciech Stokowiec, Yu hui Chen, Zafarali Ahmed, Zhitao Gong, Tris Warkentin, Ludovic Peran, Minh Giang, Clément Farabet, Oriol Vinyals, Jeff Dean, Koray Kavukcuoglu, Demis Hassabis, Zoubin Ghahramani, Douglas Eck, Joelle Barral, Fernando Pereira, Eli Collins, Armand Joulin, Noah Fiedel, Evan Senter, Alek Andreev, and Kathleen Kenealy. Gemma: Open models based on Gemini research and technology. CoRR, abs/2403.08295, 2024.
1526 48 Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M. Saiful Bari, Sheng Shen, Zheng Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, and Colin Raffel. Crosslingual generalization through multitask finetuning. In ACL (1), pp. 15991–16111. Association for Computational Linguistics, 2023.
1527 49 Jinjie Ni, Fuzhao Xue, Xiang Yue, Yuntian Deng, Mahir Shah, Kabir Jain, Graham Neubig, and Yang You. MixEval: Deriving wisdom of the crowd from LLM benchmark mixtures. CoRR, abs/2406.06565, 2024.
1528 50 OpenAI. Introducing ChatGPT, 2022. URL https://openai.com/index/chatgpt/.
1529 51 OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
1530 52 OpenAI. Hello GPT-4o, 2024. URL https://openai.com/index/hello-gpt-4o/.
1531 53 OpenCompass Contributors. OpenCompass: A universal evaluation platform for foundation models, 2023. URL https://github.com/open-compass/opencompass.
1532 54 Bowen Peng, Jeffrey Quesnelle, Honglu Fan, and Enrico Shippole. YaRN: Efficient context window extension of large language models. CoRR, abs/2309.00071, 2023.
1533 55 Edoardo Maria Ponti, Goran Glavas, Olga Majewska, Qianchu Liu, Ivan Vulic, and Anna Korhonen. XCOPA: A multilingual dataset for causal commonsense reasoning. In EMNLP (1), pp. 2362–2376. Association for Computational Linguistics, 2020.
1534 56 Qwen Team. Introducing Qwen1.5, 2024a. URL https://qwenlm.github.io/blog/qwen1.5/.
1535 57 Qwen Team. Qwen1.5-110B: The first 100B+ model of the Qwen1.5 series, 2024b. URL https://qwenlm.github.io/blog/qwen1.5-110b/.
1536 58 Qwen Team. Qwen1.5-MoE: Matching 7B model performance with 1/3 activated parameters, 2024c. URL https://qwenlm.github.io/blog/qwen-moe/.
1537 59 Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. In NeurIPS, 2023.
1538 60 Samyam Rajbhandari, Conglong Li, Zhewei Yao, Minjia Zhang, Reza Yazdani Aminabadi, Ammar Ahmad Awan, Jeff Rasley, and Yuxiong He. DeepSpeed-MoE: Advancing mixture-of-experts inference and training to power next-generation AI scale. In ICML, volume 162 of Proceedings of Machine Learning Research, pp. 18332–18346. PMLR, 2022.
1539 61 Mathieu Ravaut, Bosheng Ding, Fangkai Jiao, Hailin Chen, Xingxuan Li, Ruochen Zhao, Chengwei Qin, Caiming Xiong, and Shafiq Joty. How much are LLMs contaminated? A comprehensive survey and the LLMSanitize library. CoRR, abs/2404.00699, 2024.
1540 62 David Rein, Betty Li Hou, Asa Cooper Stickland, Jackson Petty, Richard Yuanzhe Pang, Julien Dirani, Julian Michael, and Samuel R. Bowman. GPQA: A graduate-level Google-proof Q&A benchmark. CoRR, abs/2311.12022, 2023.
1541 63 Oscar Sainz, Jon Ander Campos, Iker García-Ferrero, Julen Etxaniz, Oier Lopez de Lacalle, and Eneko Agirre. NLP evaluation in trouble: On the need to measure LLM data contamination for each benchmark. In EMNLP (Findings), pp. 10776–10787. Association for Computational Linguistics, 2023.
1542 64 Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, and Yejin Choi. WinoGrande: An adversarial winograd schema challenge at scale. Commun. ACM, 64(9):99–106, 2021.
1543 65 Jianlin Su. The magical effect of the Bias term: RoPE + Bias = better length extrapolation, 2023. URL https://spaces.ac.cn/archives/9577.
1544 66 Jianlin Su, Murtadha H. M. Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. RoFormer: Enhanced Transformer with rotary position embedding. Neurocomputing, 568:127063, 2024.
1545 67 Mirac Suzgun, Nathan Scales, Nathanael Schärli, Sebastian Gehrmann, Yi Tay, Hyung Won Chung, Aakanksha Chowdhery, Quoc V. Le, Ed H. Chi, Denny Zhou, and Jason Wei. Challenging BIG-Bench tasks and whether chain-of-thought can solve them. In ACL (Findings), pp. 13003–13051. Association for Computational Linguistics, 2023.
1546 68 Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurélien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. LLaMA: Open and efficient foundation language models. CoRR, abs/2302.13971, 2023.
1547 69 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In NIPS, pp. 5998–6008, 2017.
1548 70 Yubo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, Tianle Li, Max Ku, Kai Wang, Alex Zhuang, Rongqi Fan, Xiang Yue, and Wenhu Chen. MMLU-Pro: A more robust and challenging multi-task language understanding benchmark. CoRR, abs/2406.01574, 2024.
1549 71 Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, and Hao Ma. Effective long-context scaling of foundation models. CoRR, abs/2309.16039, 2023.
1550 72 Yinfei Yang, Yuan Zhang, Chris Tar, and Jason Baldridge. PAWS-X: A cross-lingual adversarial dataset for paraphrase identification. In EMNLP/IJCNLP (1), pp. 3685–3690. Association for Computational Linguistics, 2019.
1551 73 Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu, Jianqun Chen, Jing Chang, Kaidong Yu, Peng Liu, Qiang Liu, Shawn Yue, Senbin Yang, Shiming Yang, Tao Yu, Wen Xie, Wenhao Huang, Xiaohui Hu, Xiaoyi Ren, Xinyao Niu, Pengcheng Nie, Yuchi Xu, Yudong Liu, Yue Wang, Yuxuan Cai, Zhenyu Gu, Zhiyuan Liu, and Zonghong Dai. Yi: Open foundation models by 01.AI. CoRR, abs/2403.04652, 2024.
1552 74 Tao Yuan, Xuefei Ning, Dong Zhou, Zhijie Yang, Shiyao Li, Minghui Zhuang, Zheyue Tan, Zhuyu Yao, Dahua Lin, Boxun Li, Guohao Dai, Shengen Yan, and Yu Wang. LV-Eval: A balanced long-context benchmark with 5 length levels up to 256K. CoRR, abs/2402.05136, 2024.
1553 75 Zheng Yuan, Hongyi Yuan, Chengpeng Li, Guanting Dong, Chuanqi Tan, and Chang Zhou. Scaling relationship on learning mathematical reasoning with large language models. CoRR, abs/2308.01825, 2023.
1554 76 Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, and Yejin Choi. HellaSwag: Can a machine really finish your sentence? In ACL (1), pp. 4791–4800. Association for Computational Linguistics, 2019.
1555 77 Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, Jing Zhang, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, Mingdao Liu, Minlie Huang, Peng Zhang, Qinkai Zheng, Rui Lu, Shuaiqi Duan, Shudan Zhang, Shulin Cao, Shuxun Yang, Weng Lam Tam, Wenyi Zhao, Xiao Liu, Xiao Xia, Xiaohan Zhang, Xiaotao Gu, Xin Lv, Xinghan Liu, Xinyi Liu, Xinyue Yang, Xixuan Song, Xunkai Zhang, Yifan An, Yifan Xu, Yilin Niu, Yuantao Yang, Yueyan Li, Yushi Bai, Yuxiao Dong, Zehan Qi, Zhaoyu Wang, Zhen Yang, Zhengxiao Du, Zhenyu Hou, and Zihan Wang. ChatGLM: A family of large language models from GLM-130B to GLM-4 all tools. CoRR, abs/2406.12793, 2024.
1556 78 Yingxiu Zhao, Bowen Yu, Binyuan Hui, Haiyang Yu, Minghao Li, Fei Huang, Nevin L. Zhang, and Yongbin Li. Tree-Instruct: A preliminary study of the intrinsic relationship between complexity and alignment. In LREC/COLING, pp. 16776–16789. ELRA and ICCL, 2024.
1557 79 Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, and Ion Stoica. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. In NeurIPS, 2023.
1558 80 Jeffrey Zhou, Tianjian Lu, Swaroop Mishra, Siddhartha Brahma, Sujoy Basu, Yi Luan, Denny Zhou, and Le Hou. Instruction-following evaluation for large language models. CoRR, abs/2311.07911, 2023.
1559 34 0 [34] A. Yang, B. Yang, B. Zhang, B. Hui, B. Zheng, B. Yu, C. Li, D. Liu, F. Huang, H. Wei, H. Lin, J. Yang, J. Tu, J. Zhang, J. Yang, J. Yang, J. Zhou, J. Lin, K. Dang, K. Lu, K. Bao, K. Yang, L. Yu, M. Li, M. Xue, P. Zhang, Q. Zhu, R. Men, R. Lin, T. Li, T. Xia, X. Ren, X. Ren, Y. Fan, Y. Su, Y. Zhang, Y. Wan, Y. Liu, Z. Cui, Z. Zhang, and Z. Qiu. Qwen2.5 technical report. arXiv preprint arXiv:2412.15115, 2024. https://arxiv.org/pdf/2412.15115
1560 1 Marah I Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat S. Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, Ziyi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, and Xiren Zhou. Phi-3 technical report: A highly capable language model locally on your phone. CoRR, abs/2404.14219, 2024.
1561 2 Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, Robert Hero, Jining Huang, Vibhu Jawa, Joseph Jennings, Aastha Jhunjhunwala, John Kamalu, Sadaf Khan, Oleksii Kuchaiev, Patrick LeGresley, Hui Li, Jiwei Liu, Zihan Liu, Eileen Peters Long, Ameya Mahabaleshwarkar, Somshubra Majumdar, James Maki, Miguel Martinez, Maer Rodrigues de Melo, Ivan Moshkov, Deepak Narayanan, Sean Narenthiran, Jesus Navarro, Phong Nguyen, Osvald Nitski, Vahid Noroozi, Guruprasad Nutheti, Christopher Parisien, Jupinder Parmar, Mostofa Patwary, Krzysztof Pawelec, Wei Ping, Shrimai Prabhumoye, Rajarshi Roy, Trisha Saar, Vasanth Rao Naik Sabavat, Sanjeev Satheesh, Jane Polak Scowcroft, Jason D. Sewall, Pavel Shamis, Gerald Shen, Mohammad Shoeybi, Dave Sizer, Misha Smelyanskiy, Felipe Soares, Makesh Narsimhan Sreedhar, Dan Su, Sandeep Subramanian, Shengyang Sun, Shubham Toshniwal, Hao Wang, Zhilin Wang, Jiaxuan You, Jiaqi Zeng, Jimmy Zhang, Jing Zhang, Vivienne Zhang, Yian Zhang, and Chen Zhu. Nemotron-4 340B technical report. CoRR, abs/2406.11704, 2024.
1562 3 Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebr´ on, and Sumit Sanghai. GQA: Training generalized multi-query Transformer models from multi-head checkpoints. In EMNLP, pp. 4895–4901. Association for Computational Linguistics, 2023.
1563 4 Ebtesam Almazrouei, Hamza Alobeidli, Abdulaziz Alshamsi, Alessandro Cappelli, Ruxandra Cojocaru, Mérouane Debbah, Etienne Goffinet, Daniel Hesslow, Julien Launay, Quentin Malartic, Daniele Mazzotta, Badreddine Noune, Baptiste Pannier, and Guilherme Penedo. The Falcon series of open language models. CoRR, abs/2311.16867, 2023.
1564 5 Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, and Lingpeng Kong. Training-free long-context scaling of large language models. CoRR, abs/2402.17463, 2024.
1565 6 Anthropic. Introducing Claude, 2023a. URL https://www.anthropic.com/index/introducing-claude; Anthropic. Claude 2. Technical report, Anthropic, 2023b. URL https://www-files.anthropic.com/production/images/Model-Card-Claude-2.pdf.
1566 7 Anthropic. The Claude 3 model family: Opus, Sonnet, Haiku. Technical report, Anthropic, AI, 2024. URL https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf.
1567 8 Jacob Austin, Augustus Odena, Maxwell I. Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie J. Cai, Michael Terry, Quoc V. Le, and Charles Sutton. Program synthesis with large language models. CoRR, abs/2108.07732, 2021.
1568 9 Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, and Tianhang Zhu. Qwen technical report. CoRR, abs/2309.16609, 2023.
1569 10 Yushi Bai, Xin Lv, Jiajie Zhang, Yuze He, Ji Qi, Lei Hou, Jie Tang, Yuxiao Dong, and Juanzi Li. LongAlign: A recipe for long context alignment of large language models. In EMNLP (Findings), pp. 1376–1395. Association for Computational Linguistics, 2024.
1570 11 Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, and Madian Khabsa. The Belebele benchmark: A parallel reading comprehension dataset in 122 language variants. CoRR, abs/2308.16884, 2023.
1571 12 Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language models are few-shot learners. In NeurIPS, 2020.
1572 13 Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, and Bowen Yu. Towards scalable automated alignment of LLMs: A survey. CoRR, abs/2406.01252, 2024.
1573 14 Federico Cassano, John Gouwar, Daniel Nguyen, Sydney Nguyen, Luna Phipps-Costin, Donald Pinckney, Ming-Ho Yee, Yangtian Zi, Carolyn Jane Anderson, Molly Q. Feldman, Arjun Guha, Michael Greenberg, and Abhinav Jangda. MultiPL-E: A scalable and polyglot approach to benchmarking neural code generation. IEEE Trans. Software Eng., 49(7):3675–3691, 2023.
1574 15 Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pondé de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Joshua Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, and Wojciech Zaremba. Evaluating large language models trained on code. CoRR, abs/2107.03374, 2021.
1575 16 Wenhu Chen, Ming Yin, Max Ku, Pan Lu, Yixin Wan, Xueguang Ma, Jianyu Xu, Xinyi Wang, and Tony Xia. TheoremQA: A theorem-driven question answering dataset. In EMNLP, pp. 7889–7901. Association for Computational Linguistics, 2023a.
1576 17 Zhihong Chen, Shuo Yan, Juhao Liang, Feng Jiang, Xiangbo Wu, Fei Yu, Guiming Hardy Chen, Junying Chen, Hongbo Zhang, Li Jianquan, Wan Xiang, and Benyou Wang. MultilingualSIFT: Multilingual supervised instruction fine-tuning, 2023b. URL https://github.com/FreedomIntelligence/MultilingualSIFT.
1577 18 Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord. Think you have solved question answering? Try ARC, the AI2 reasoning challenge. CoRR, abs/1803.05457, 2018.
1578 19 Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, and John Schulman. Training verifiers to solve math word problems. CoRR, abs/2110.14168, 2021.
1579 20 Damai Dai, Chengqi Deng, Chenggang Zhao, R. X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Y. Wu, Zhenda Xie, Y. K. Li, Panpan Huang, Fuli Luo, Chong Ruan, Zhifang Sui, and Wenfeng Liang. DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models. CoRR, abs/2401.06066, 2024.
1580 21 Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. Language modeling with gated convolutional networks. In ICML, volume 70 of Proceedings of Machine Learning Research, pp. 933–941. PMLR, 2017.
1581 22 Guanting Dong, Keming Lu, Chengpeng Li, Tingyu Xia, Bowen Yu, Chang Zhou, and Jingren Zhou. Self-play with execution feedback: Improving instruction-following capabilities of large language models. CoRR, abs/2406.13542, 2024.
1582 23 Shihan Dou, Jiazheng Zhang, Jianxiang Zang, Yunbo Tao, Haoxiang Jia, Shichun Liu, Yuming Yang, Shenxi Wu, Shaoqing Zhang, Muling Wu, et al. Multi-programming language sandbox for LLMs. CoRR, abs/2410.23074, 2024.
1583 24 Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurélien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Rozière, Bethany Biron, Binh Tang, Bobbie Chern, Charlotte Caucheteux, Chaya Nayak, Chloe Bi, Chris Marra, Chris McConnell, Christian Keller, Christophe Touret, Chunyang Wu, Corinne Wong, Cristian Canton Ferrer, Cyrus Nikolaidis, Damien Allonsius, Daniel Song, Danielle Pintz, Danny Livshits, David Esiobu, Dhruv Choudhary, Dhruv Mahajan, Diego Garcia-Olano, Diego Perino, Dieuwke Hupkes, Egor Lakomkin, Ehab AlBadawy, Elina Lobanova, Emily Dinan, Eric Michael Smith, Filip Radenovic, Frank Zhang, Gabriel Synnaeve, Gabrielle Lee, Georgia Lewis Anderson, Graeme Nail, Grégoire Mialon, Guan Pang, Guillem Cucurell, Hailey Nguyen, Hannah Korevaar, Hu Xu, Hugo Touvron, Iliyan Zarov, Imanol Arrieta Ibarra, Isabel M. Kloumann, Ishan Misra, Ivan Evtimov, Jade Copet, Jaewon Lee, Jan Geffert, Jana Vranes, Jason Park, Jay Mahadeokar, Jeet Shah, Jelmer van der Linde, Jennifer Billock, Jenny Hong, Jenya Lee, Jeremy Fu, Jianfeng Chi, Jianyu Huang, Jiawen Liu, Jie Wang, Jiecao Yu, Joanna Bitton, Joe Spisak, Jongsoo Park, Joseph Rocca, Joshua Johnstun, Joshua Saxe, Junteng Jia, Kalyan Vasuden Alwala, Kartikeya Upasani, Kate Plawiak, Ke Li, Kenneth Heafield, Kevin Stone, et al. The Llama 3 herd of models. CoRR, abs/2407.21783, 2024.
1585 26 William Fedus, Barret Zoph, and Noam Shazeer. Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res., 23:120:1–120:39, 2022.
1586 27 Alena Fenogenova, Artem Chervyakov, Nikita Martynov, Anastasia Kozlova, Maria Tikhonova, Albina Akhmetgareeva, Anton A. Emelyanov, Denis Shevelev, Pavel Lebedev, Leonid Sinev, Ulyana Isaeva, Katerina Kolomeytseva, Daniil Moskovskiy, Elizaveta Goncharova, Nikita Savushkin, Polina Mikhailova, Denis Dimitrov, Alexander Panchenko, and Sergey Markov. MERA: A comprehensive LLM evaluation in russian. CoRR, abs/2401.04531, 2024.
1587 28 Evan Frick, Peter Jin, Tianle Li, Karthik Ganesan, Jian Zhang, Jiantao Jiao, and Banghua Zhu. Athene-70b: Redefining the boundaries of post-training for open models, July 2024a. URL https://nexusflow.ai/blogs/athene.
1588 29 Evan Frick, Tianle Li, Connor Chen, Wei-Lin Chiang, Anastasios Nikolas Angelopoulos, Jiantao Jiao, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. How to evaluate reward models for RLHF. CoRR, abs/2410.14872, 2024b.
1589 30 Aryo Pradipta Gema, Joshua Ong Jun Leang, Giwon Hong, Alessio Devoto, Alberto Carlo Maria Mancino, Rohit Saxena, Xuanli He, Yu Zhao, Xiaotang Du, Mohammad Reza Ghasemi Madani, et al. Are we done with MMLU? CoRR, abs/2406.04127, 2024.
1590 31 Gemini Team. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. Technical report, Google, 2024. URL https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf.
1591 32 Gemma Team, Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, Léonard Hussenot, Thomas Mesnard, Bobak Shahriari, Alexandre Ramé, et al. Gemma 2: Improving open language models at a practical size. CoRR, abs/2408.00118, 2024.
1592 33 Naman Goyal, Cynthia Gao, Vishrav Chaudhary, Peng-Jen Chen, Guillaume Wenzek, Da Ju, Sanjana Krishnan, Marc'Aurelio Ranzato, Francisco Guzmán, and Angela Fan. The Flores-101 evaluation benchmark for low-resource and multilingual machine translation. Trans. Assoc. Comput. Linguistics, 10:522–538, 2022.
1593 34 Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt. Measuring massive multitask language understanding. In ICLR. OpenReview.net, 2021a.
1594 35 Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, and Jacob Steinhardt. Measuring mathematical problem solving with the MATH dataset. In NeurIPS Datasets and Benchmarks, 2021b.
1595 36 Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, et al. Training compute-optimal large language models. CoRR, abs/2203.15556, 2022.
1596 37 Keith Hoskin. The “awful idea of accountability”: Inscribing people into the measurement of objects. Accountability: Power, ethos and the technologies of managing, 1996.
1597 38 Cheng-Ping Hsieh, Simeng Sun, Samuel Kriman, Shantanu Acharya, Dima Rekesh, Fei Jia, Yang Zhang, and Boris Ginsburg. RULER: What’s the real context size of your long-context language models? CoRR, abs/2404.06654, 2024.
1598 39 Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao, Xinrong Zhang, Zhen Leng Thai, Kai Zhang, Chongyi Wang, Yuan Yao, Chenyang Zhao, Jie Zhou, Jie Cai, Zhongwu Zhai, Ning Ding, Chao Jia, Guoyang Zeng, Dahai Li, Zhiyuan Liu, and Maosong Sun. MiniCPM: Unveiling the potential of small language models with scalable training strategies. CoRR, abs/2404.06395, 2024.
1599 40 Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Keming Lu, et al. Qwen2.5-Coder technical report. CoRR, abs/2409.12186, 2024.
1600 41 Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, and Ion Stoica. LiveCodeBench: Holistic and contamination free evaluation of large language models for code. CoRR, abs/2403.07974, 2024.
1601 42 Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mistral 7B. CoRR, abs/2310.06825, 2023a.
1602 43 Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mixtral of experts. CoRR, abs/2401.04088, 2024a.
1603 44 Huiqiang Jiang, Yucheng Li, Chengruidong Zhang, Qianhui Wu, Xufang Luo, Surin Ahn, Zhenhua Han, Amir H Abdi, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, and Lili Qiu. MInference 1.0: Accelerating pre-filling for long-context LLMs via dynamic sparse attention. arXiv preprint arXiv:2407.02490, 2024b.
1604 45 Zixuan Jiang, Jiaqi Gu, Hanqing Zhu, and David Z. Pan. Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and efficient pre-LN Transformers. CoRR, abs/2305.14858, 2023b.
1605 46 Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models. CoRR, abs/2001.08361, 2020.
1606 47 Fajri Koto, Nurul Aisyah, Haonan Li, and Timothy Baldwin. Large language models only pass primary school exams in Indonesia: A comprehensive test on IndoMMLU. In EMNLP, pp. 12359–12374. Association for Computational Linguistics, 2023.
1607 48 Nathan Lambert, Valentina Pyatkin, Jacob Daniel Morrison, Lester James Validad Miranda, Bill Yuchen Lin, Khyathi Raghavi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, and Hanna Hajishirzi. RewardBench: Evaluating reward models for language modeling. CoRR, abs/2403.13787, 2024.
1608 49 Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. GShard: Scaling giant models with conditional computation and automatic sharding. CoRR, abs/2006.16668, 2020.
1609 50 Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. From crowdsourced data to high-quality benchmarks: Arena-Hard and BenchBuilder pipeline. CoRR, abs/2406.11939, 2024.
1610 51 Stephanie Lin, Jacob Hilton, and Owain Evans. TruthfulQA: Measuring how models mimic human falsehoods. In ACL (1), pp. 3214–3252. Association for Computational Linguistics, 2022a.
1611 52 Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O’Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona T. Diab, Veselin Stoyanov, and Xian Li. Few-shot learning with multilingual generative language models. In EMNLP, pp. 9019–9052. Association for Computational Linguistics, 2022b.
1612 53 Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. Is your code generated by ChatGPT really correct? Rigorous evaluation of large language models for code generation. In NeurIPS, 2023.
1613 54 Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, and Chang Zhou. Online merging optimizers for boosting rewards and mitigating tax in alignment. CoRR, abs/2405.17931, 2024a.
1614 55 Keming Lu, Bowen Yu, Chang Zhou, and Jingren Zhou. Large language models are superpositions of all characters: Attaining arbitrary role-play via self-alignment. CoRR, abs/2401.12474, 2024b.
1615 56 Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M. Saiful Bari, Sheng Shen, Zheng Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, and Colin Raffel. Crosslingual generalization through multitask finetuning. In ACL (1), pp. 15991–16111. Association for Computational Linguistics, 2023.
1616 57 Junho Myung, Nayeon Lee, Yi Zhou, Jiho Jin, Rifki Afina Putri, Dimosthenis Antypas, Hsuvas Borkakoty, Eunsu Kim, Carla Pérez-Almendros, Abinew Ali Ayele, Víctor Gutiérrez-Basulto, Yazmín Ibáñez-García, Hwaran Lee, Shamsuddeen Hassan Muhammad, Ki-Woong Park, Anar Sabuhi Rzayev, Nina White, Seid Muhie Yimam, Mohammad Taher Pilehvar, Nedjma Ousidhoum, José Camacho-Collados, and Alice Oh. BLEnD: A benchmark for LLMs on everyday knowledge in diverse cultures and languages. CoRR, abs/2406.09948, 2024.
1617 58 OpenAI. GPT-4 technical report. CoRR, abs/2303.08774, 2023.
1618 59 OpenAI. Hello GPT-4o, 2024a. URL https://openai.com/index/hello-gpt-4o/.
1619 60 OpenAI. Learning to reason with LLMs, 2024b. URL https://openai.com/index/learning-to-reason-with-llms/.
1620 61 Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul F. Christiano, Jan Leike, and Ryan Lowe. Training language models to follow instructions with human feedback. In NeurIPS, 2022.
1621 62 Bowen Peng, Jeffrey Quesnelle, Honglu Fan, and Enrico Shippole. YaRN: Efficient context window extension of large language models. CoRR, abs/2309.00071, 2023.
1622 63 Edoardo Maria Ponti, Goran Glavas, Olga Majewska, Qianchu Liu, Ivan Vulic, and Anna Korhonen. XCOPA: A multilingual dataset for causal commonsense reasoning. In EMNLP (1), pp. 2362–2376. Association for Computational Linguistics, 2020.
1623 64 Shanghaoran Quan, Tianyi Tang, Bowen Yu, An Yang, Dayiheng Liu, Bofei Gao, Jianhong Tu, Yichang Zhang, Jingren Zhou, and Junyang Lin. Language models can self-lengthen to generate long texts. CoRR, abs/2410.23933, 2024.
1624 65 Qwen Team. Code with CodeQwen1.5, 2024a. URL https://qwenlm.github.io/blog/codeqwen1.5/.
1625 66 Qwen Team. Introducing Qwen1.5, 2024b. URL https://qwenlm.github.io/blog/qwen1.5/.
1626 67 Qwen Team. Introducing Qwen2-Math, 2024c. URL https://qwenlm.github.io/blog/qwen2-math/.
1627 68 Qwen Team. QwQ: Reflect deeply on the boundaries of the unknown, 2024d. URL https://qwenlm.github.io/blog/qwq-32b-preview/.
1628 69 Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever, et al. Improving language understanding by generative pre-training. Technical report, OpenAI, 2018.
1629 70 Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. In NeurIPS, 2023.
1630 71 Samyam Rajbhandari, Conglong Li, Zhewei Yao, Minjia Zhang, Reza Yazdani Aminabadi, Ammar Ahmad Awan, Jeff Rasley, and Yuxiong He. DeepSpeed-MoE: Advancing mixture-of-experts inference and training to power next-generation AI scale. In ICML, volume 162 of Proceedings of Machine Learning Research, pp. 18332–18346. PMLR, 2022.
1631 72 David Rein, Betty Li Hou, Asa Cooper Stickland, Jackson Petty, Richard Yuanzhe Pang, Julien Dirani, Julian Michael, and Samuel R. Bowman. GPQA: A graduate-level Google-proof Q&A benchmark. CoRR, abs/2311.12022, 2023.
1632 73 Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, and Yejin Choi. WinoGrande: An adversarial winograd schema challenge at scale. Commun. ACM, 64(9):99–106, 2021.
1633 74 Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. In ACL (1). The Association for Computer Linguistics, 2016.
1634 75 Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Mingchuan Zhang, Y. K. Li, Y. Wu, and Daya Guo. DeepSeekMath: Pushing the limits of mathematical reasoning in open language models. CoRR, abs/2402.03300, 2024.
1635 76 Jianlin Su. The magical effect of the Bias term: RoPE + Bias = better length extrapolation, 2023. URL https://spaces.ac.cn/archives/9577.
1636 77 Jianlin Su, Murtadha H. M. Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. RoFormer: Enhanced Transformer with rotary position embedding. Neurocomputing, 568:127063, 2024.
1637 78 Mirac Suzgun, Nathan Scales, Nathanael Schärli, Sebastian Gehrmann, Yi Tay, Hyung Won Chung, Aakanksha Chowdhery, Quoc V. Le, Ed H. Chi, Denny Zhou, and Jason Wei. Challenging BIG-Bench tasks and whether chain-of-thought can solve them. In ACL (Findings), pp. 13003–13051. Association for Computational Linguistics, 2023.
1638 79 Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurélien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. LLaMA: Open and efficient foundation language models. CoRR, abs/2302.13971, 2023a.
1639 80 Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton-Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurélien Rodriguez, Robert Stojnic, Sergey Edunov, and Thomas Scialom. Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288, 2023b.
1640 81 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In NIPS, pp. 5998–6008, 2017.
1641 82 Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, et al. Secrets of RLHF in large language models part II: Reward modeling. CoRR, abs/2401.06080, 2024a.
1642 83 Changhan Wang, Kyunghyun Cho, and Jiatao Gu. Neural machine translation with byte-level subwords. In AAAI, pp. 9154–9160. AAAI Press, 2020.
1643 84 Yubo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, Tianle Li, Max Ku, Kai Wang, Alex Zhuang, Rongqi Fan, Xiang Yue, and Wenhu Chen. MMLU-Pro: A more robust and challenging multi-task language understanding benchmark. CoRR, abs/2406.01574, 2024b.
1644 85 Zhilin Wang, Alexander Bukharin, Olivier Delalleau, Daniel Egert, Gerald Shen, Jiaqi Zeng, Oleksii Kuchaiev, and Yi Dong. HelpSteer2-Preference: Complementing ratings with preferences. CoRR, abs/2410.01257, 2024c.
1645 86 Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Benjamin Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, Chinmay Hegde, Yann LeCun, Tom Goldstein, Willie Neiswanger, and Micah Goldblum. LiveBench: A challenging, contamination-free LLM benchmark. CoRR, abs/2406.19314, 2024.
1646 87 Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Le Sun, Jingren Zhou, and Junyang Lin. Aligning large language models via self-steering optimization. CoRR, abs/2410.17131, 2024.
1647 88 Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, and Hao Ma. Effective long-context scaling of foundation models. CoRR, abs/2309.16039, 2023.
1648 89 An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, Kai Dang, Keming Lu, Keqin Chen, Kexin Yang, Mei Li, Mingfeng Xue, Na Ni, Pei Zhang, Peng Wang, Ru Peng, Rui Men, Ruize Gao, Runji Lin, Shijie Wang, Shuai Bai, Sinan Tan, Tianhang Zhu, Tianhao Li, Tianyu Liu, Wenbin Ge, Xiaodong Deng, Xiaohuan Zhou, Xingzhang Ren, Xinyu Zhang, Xipin Wei, Xuancheng Ren, Xuejing Liu, Yang Fan, Yang Yao, Yichang Zhang, Yu Wan, Yunfei Chu, Yuqiong Liu, Zeyu Cui, Zhenru Zhang, Zhifang Guo, and Zhihao Fan. Qwen2 technical report. CoRR, abs/2407.10671, 2024a.
1649 90 An Yang, Beichen Zhang, Binyuan Hui, Bofei Gao, Bowen Yu, Chengpeng Li, Dayiheng Liu, Jianhong Tu, Jingren Zhou, Junyang Lin, et al. Qwen2.5-Math technical report: Toward mathematical expert model via self-improvement. CoRR, abs/2409.12122, 2024b.
1650 91 Jian Yang, Jiaxi Yang, Ke Jin, Yibo Miao, Lei Zhang, Liqun Yang, Zeyu Cui, Yichang Zhang, Binyuan Hui, and Junyang Lin. Evaluating and aligning codellms on human preference. CoRR, abs/2412.05210, 2024c.
1651 92 Yinfei Yang, Yuan Zhang, Chris Tar, and Jason Baldridge. PAWS-X: A cross-lingual adversarial dataset for paraphrase identification. In EMNLP/IJCNLP (1), pp. 3685–3690. Association for Computational Linguistics, 2019.
1652 93 Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu, Jianqun Chen, Jing Chang, Kaidong Yu, Peng Liu, Qiang Liu, Shawn Yue, Senbin Yang, Shiming Yang, Tao Yu, Wen Xie, Wenhao Huang, Xiaohui Hu, Xiaoyi Ren, Xinyao Niu, Pengcheng Nie, Yuchi Xu, Yudong Liu, Yue Wang, Yuxuan Cai, Zhenyu Gu, Zhiyuan Liu, and Zonghong Dai. Yi: Open foundation models by 01.AI. CoRR, abs/2403.04652, 2024.
1653 94 Tao Yuan, Xuefei Ning, Dong Zhou, Zhijie Yang, Shiyao Li, Minghui Zhuang, Zheyue Tan, Zhuyu Yao, Dahua Lin, Boxun Li, Guohao Dai, Shengen Yan, and Yu Wang. LV-Eval: A balanced long-context benchmark with 5 length levels up to 256K. CoRR, abs/2402.05136, 2024.
1654 95 Zheng Yuan, Hongyi Yuan, Chengpeng Li, Guanting Dong, Chuanqi Tan, and Chang Zhou. Scaling relationship on learning mathematical reasoning with large language models. CoRR, abs/2308.01825, 2023.
1655 96 Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, and Yejin Choi. HellaSwag: Can a machine really finish your sentence? In ACL (1), pp. 4791–4800. Association for Computational Linguistics, 2019.
1656 97 Yidan Zhang, Boyi Deng, Yu Wan, Baosong Yang, Haoran Wei, Fei Huang, Bowen Yu, Junyang Lin, and Jingren Zhou. P-MMEval: A parallel multilingual multitask benchmark for consistent evaluation of LLMs. CoRR, abs/2411.09116, 2024.
1657 98 Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, and Ion Stoica. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. In NeurIPS, 2023.
1658 99 Enyu Zhou, Guodong Zheng, Bing Wang, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, and Xuanjing Huang. RMB: Comprehensively benchmarking reward models in LLM alignment. CoRR, abs/2410.09893, 2024.
1659 100 Jeffrey Zhou, Tianjian Lu, Swaroop Mishra, Siddhartha Brahma, Sujoy Basu, Yi Luan, Denny Zhou, and Le Hou. Instruction-following evaluation for large language models. CoRR, abs/2311.07911, 2023.
1660 101 Barret Zoph, Irwan Bello, Sameer Kumar, Nan Du, Yanping Huang, Jeff Dean, Noam Shazeer, and William Fedus. ST-MoE: Designing stable and transferable sparse expert models. CoRR, abs/2202.08906, 2022.
1661 35 0 [35] Q. Zhang, K. Ding, T. Lyv, X. Wang, Q. Yin, Y. Zhang, J. Yu, Y. Wang, X. Li, Z. Xiang, K. Feng, X. Zhuang, Z. Wang, M. Qin, M. Zhang, J. Zhang, J. Cui, T. Huang, P. Yan, R. Xu, H. Chen, X. Li, X. Fan, H. Xing, and H. Chen. Scientific large language models: A survey on biological and chemical domains. arXiv preprint arXiv:2401.14656, 1 2024. URL https://arxiv.org/pdf/2401.14656v2. https://qiita.com/kaizen_nagoya/items/6505717d7c4769a4ff31
36 0 [36] Z. Zhou, Y. Ji, W. Li, P. Dutta, R. V. Davuluri, and H. Liu. DNABERT-2: Efficient foundation model and benchmark for multi-species genome. 12th International Conference on Learning Representations, ICLR 2024, 6 2023. URL https://arxiv.org/pdf/2306.15006. https://qiita.com/kaizen_nagoya/items/d711266990ec2bed35f2
1 Žiga Avsec, Vikram Agarwal, Daniel Visentin, Joseph R Ledsam, Agnieszka Grabska-Barwinska, Kyle R Taylor, Yannis Assael, John Jumper, Pushmeet Kohli, and David R Kelley. Effective gene expression prediction from sequence by integrating long-range interactions. Nature methods, 18(10):1196–1203, 2021.
2 Dennis A Benson, Mark Cavanaugh, Karen Clark, Ilene Karsch-Mizrachi, David J Lipman, James Ostell, and Eric W Sayers. GenBank. Nucleic acids research, 41(D1):D36–D42, 2012.
3 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, and Percy Liang. On the opportunities and risks of foundation models, 2022.
4 Marta Byrska-Bishop, Uday S. Evani, Xuefang Zhao, Anna O. Basile, Haley J. Abel, Allison A. Regier, André Corvelo, Wayne E. Clarke, Rajeeva Musunuri, Kshithija Nagulapalli, Susan Fairley, Alexi Runnels, Lara Winterkorn, Ernesto Lowy-Gallego, The Human Genome Structural Variation Consortium, Paul Flicek, Soren Germer, Harrison Brand, Ira M. Hall, Michael E. Talkowski, Giuseppe Narzisi, and Michael C. Zody. High coverage whole genome sequencing of the expanded 1000 genomes project cohort including 602 trios. bioRxiv, 2021. doi: 10.1101/2021.02.06.430068. URL https://www.biorxiv.org/content/early/2021/02/07/2021.02.06.430068.
5 Ken Chen, Huiying Zhao, and Yuedong Yang. Capturing large genomic contexts for accurately predicting enhancer-promoter interactions. Brief. Bioinform., 23(2), March 2022.
6 ENCODE Project Consortium et al. An integrated encyclopedia of DNA elements in the human genome. Nature, 489(7414):57, 2012.
7 Hugo Dalla-Torre, Liam Gonzalez, Javier Mendoza-Revilla, Nicolas Lopez Carranza, Adam Henryk Grzywaczewski, Francesco Oteri, Christian Dallago, Evan Trop, Hassan Sirelkhatim, Guillaume Richard, Marcin Skwark, Karim Beguir, Marie Lopez, and Thomas Pierrot. The nucleotide transformer: Building and evaluating robust foundation models for human genomics. bioRxiv, 2023. doi: 10.1101/2023.01.11.523679. URL https://www.biorxiv.org/content/early/2023/03/09/2023.01.11.523679.
8 Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, and Christopher Ré. FlashAttention: Fast and memory-efficient exact attention with IO-awareness, 2022.
9 Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. Language modeling with gated convolutional networks. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML'17, pp. 933–941. JMLR.org, 2017.
10 René Dreos, Giovanna Ambrosini, Rouayda Cavin Périer, and Philipp Bucher. EPD and EPDnew, high-quality promoter resources in the next-generation sequencing era. Nucleic acids research, 41(D1):D157–D164, 2013.
11 Katarína Grešová, Vlastimil Martinek, David Čechák, Petr Šimeček, and Panagiotis Alexiou. Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genomic Data, 24(1):25, 2023.
12 Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models, 2021.
13 Yanrong Ji, Zhihan Zhou, Han Liu, and Ramana V Davuluri. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics, 37(15):2112–2120, 2021.
14 Junru Jin, Yingying Yu, Ruheng Wang, Xin Zeng, Chao Pang, Yi Jiang, Zhongshen Li, Yutong Dai, Ran Su, Quan Zou, Kenta Nakai, and Leyi Wei. iDNA-ABF: multi-scale deep biological language learning model for the interpretable prediction of DNA methylations. Genome Biol., 23(1):219, October 2022.
15 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT, pp. 4171–4186, 2019.
16 Shruti Khare, Céline Gurry, Lucas Freitas, Mark B Schultz, Gunter Bach, Amadou Diallo, Nancy Akite, Joses Ho, Raphael TC Lee, Winston Yeo, et al. GISAID's role in pandemic response. China CDC weekly, 3(49):1049, 2021.
17 Taku Kudo and John Richardson. SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing, 2018.
18 Nguyen Quoc Khanh Le, Quang-Thai Ho, Van-Nui Nguyen, and Jung-Su Chang. BERT-Promoter: An improved sequence-based predictor of DNA promoter using BERT pre-trained model and SHAP feature selection. Computational Biology and Chemistry, 99:107732, 2022.
19 Dohoon Lee, Jeewon Yang, and Sun Kim. Learning the histone codes with large genomic windows and three-dimensional chromatin interactions using transformer. Nature Communications, 13(1):6678, 2022.
20 Zhongxiao Li, Elva Gao, Juexiao Zhou, Wenkai Han, Xiaopeng Xu, and Xin Gao. Applications of deep learning in understanding gene regulation. Cell Reports Methods, 3(1):100384, 2023. ISSN 2667-2375. doi: 10.1016/j.crmeth.2022.100384. URL https://www.sciencedirect.com/science/article/pii/S2667237522002892.
21 Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization, 2019.
22 Eric Nguyen, Michael Poli, Marjan Faizi, Armin Thomas, Michael Wornow, Callum Birch-Sykes, Stefano Massaroli, Aman Patel, Clayton Rabideau, Yoshua Bengio, et al. HyenaDNA: Long-range genomic sequence modeling at single nucleotide resolution. Advances in neural information processing systems, 36, 2024.
23 Yu Ni, Linqi Fan, Miao Wang, Ning Zhang, Yongchun Zuo, and Mingzhi Liao. EPI-Mind: Identifying enhancer-promoter interactions based on transformer mechanism. Interdiscip. Sci., 14(3):786–794, September 2022.
24 OpenAI. GPT-4 technical report, 2023.
25 Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, and Ryan Lowe. Training language models to follow instructions with human feedback, 2022.
26 Ofir Press, Noah A Smith, and Mike Lewis. Train short, test long: Attention with linear biases enables input length extrapolation. arXiv preprint arXiv:2108.12409, 2021.
27 Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 21(1):5485–5551, 2020.
28 Joel Rozowsky, Jiahao Gao, Beatrice Borsari, Yucheng T Yang, Timur Galeev, Gamze Gürsoy, Charles B Epstein, Kun Xiong, Jinrui Xu, Tianxiao Li, Jason Liu, Keyang Yu, Ana Berthel, Zhanlin Chen, Fabio Navarro, Maxwell S Sun, James Wright, Justin Chang, Christopher J F Cameron, Noam Shoresh, Elizabeth Gaskell, Jorg Drenkow, Jessika Adrian, Sergey Aganezov, François Aguet, Gabriela Balderrama-Gutierrez, Samridhi Banskota, Guillermo Barreto Corona, Sora Chee, Surya B Chhetri, Gabriel Conte Cortez Martins, Cassidy Danyko, Carrie A Davis, Daniel Farid, Nina P Farrell, Idan Gabdank, Yoel Gofin, David U Gorkin, Mengting Gu, Vivian Hecht, Benjamin C Hitz, Robbyn Issner, Yunzhe Jiang, Melanie Kirsche, Xiangmeng Kong, Bonita R Lam, Shantao Li, Bian Li, Xiqi Li, Khine Zin Lin, Ruibang Luo, Mark Mackiewicz, Ran Meng, Jill E Moore, Jonathan Mudge, Nicholas Nelson, Chad Nusbaum, Ioann Popov, Henry E Pratt, Yunjiang Qiu, Srividya Ramakrishnan, Joe Raymond, Leonidas Salichos, Alexandra Scavelli, Jacob M Schreiber, Fritz J Sedlazeck, Lei Hoon See, Rachel M Sherman, Xu Shi, Minyi Shi, Cricket Alicia Sloan, J Seth Strattan, Zhen Tan, Forrest Y Tanaka, Anna Vlasova, Jun Wang, Jonathan Werner, Brian Williams, Min Xu, Chengfei Yan, Lu Yu, Christopher Zaleski, Jing Zhang, Kristin Ardlie, J Michael Cherry, Eric M Mendenhall, William S Noble, Zhiping Weng, Morgan E Levine, Alexander Dobin, Barbara Wold, Ali Mortazavi, Bing Ren, Jesse Gillis, Richard M Myers, Michael P Snyder, Jyoti Choudhary, Aleksandar Milosavljevic, Michael C Schatz, Bradley E Bernstein, Roderic Guigó, Thomas R Gingeras, and Mark Gerstein. The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models. Cell, 186(7):1493–1511.e40, March 2023.
29 Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1715–1725, Berlin, Germany, August 2016. Association for Computational Linguistics. doi: 10.18653/v1/P16-1162. URL https://aclanthology.org/P16-1162.
30 Noam Shazeer. GLU variants improve transformer, 2020.
31 Gabrielle D Smith, Wan Hern Ching, Paola Cornejo-Páramo, and Emily S Wong. Decoding enhancer complexity with machine learning and high-throughput discovery. Genome Biol., 24(1):116, May 2023.
32 John A Stamatoyannopoulos, Michael Snyder, Ross Hardison, Bing Ren, Thomas Gingeras, David M Gilbert, Mark Groudine, Michael Bender, Rajinder Kaul, Theresa Canfield, et al. An encyclopedia of mouse DNA elements (mouse ENCODE). Genome biology, 13(8):1–5, 2012.
33 Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, and Yunfeng Liu. RoFormer: Enhanced transformer with rotary position embedding. arXiv preprint arXiv:2104.09864, 2021.
34 Chi Sun, Xipeng Qiu, Yige Xu, and Xuanjing Huang. How to fine-tune BERT for text classification?, 2020.
35 The MosaicML Team. Composer. https://github.com/mosaicml/composer/, 2021.
36 Christina V Theodoris, Ling Xiao, Anant Chopra, Mark D Chaffin, Zeina R Al Sayed, Matthew C Hill, Helene Mantineo, Elizabeth M Brydon, Zexian Zeng, X Shirley Liu, and Patrick T Ellinor. Transfer learning enables predictions in network biology. Nature, 618(7965):616–624, June 2023.
37 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.
38 Ruohan Wang, Zishuai Wang, Jianping Wang, and Shuaicheng Li. SpliceFinder: ab initio prediction of splice sites using convolutional neural network. BMC bioinformatics, 20:1–13, 2019.
39 Zixuan Wang, Yongqing Zhang, Yuhang Liu, Shuwen Xiong, Maocheng Wang, Jiliu Zhou, and Meiqin Gong. Towards a better understanding of TF-DNA binding prediction from genomic features. Computers in Biology and Medicine, pp. 105993, 2022.
40 Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander M. Rush. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45, Online, October 2020. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/2020.emnlp-demos.6.
41 Pengyu Zhang, Hongming Zhang, and Hao Wu. iPro-WAEL: a comprehensive and robust framework for identifying promoters in multiple species. Nucleic Acids Research, 50(18):10278–10289, 2022.