Li Q, Hu Z, Wang Y, Li L, Fan Y, King I, Jia G, Wang S, Song L, Li Y. Progress and opportunities of foundation models in bioinformatics. Brief Bioinform 2024;25:bbae548. 10.1093/bib/bbae548. https://dx.doi.org/10.1093/bib/bbae548
References
- Hughes JP, Rees S, Kalindjian SB. et al. Principles of early drug discovery. Br J Pharmacol 2011;162:1239–49. 10.1111/j.1476-5381.2010.01127.x.
- Bommasani R, Hudson DA, Adeli E. et al. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258, 2021.
- Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med 2019;25:44–56. 10.1038/s41591-018-0300-7.
- Park YS, Lek S. Artificial neural networks: multilayer perceptron for ecological modeling. In: Jørgensen SE, (ed.), Developments in Environmental Modelling. Netherlands: Elsevier, 2016;28:123–40. 10.1016/B978-0-444-63623-2.00007-4.
- Wang M, Tai CEW, Wei L. DeFine: deep convolutional neural networks accurately quantify intensities of transcription factor-DNA binding and facilitate evaluation of functional non-coding variants. Nucleic Acids Res 2018;46:e69. 10.1093/nar/gky215.
- Shen J, Liu F, Tu Y. et al. Finding gene network topologies for given biological function with recurrent neural network. Nat Commun 2021;12:3125. 10.1038/s41467-021-23420-5.
- Whalen S, Truty RM, Pollard KS. Enhancer–promoter interactions are encoded by complex genomic signatures on looping chromatin. Nat Genet 2016;48:488–96. 10.1038/ng.3539.
- Forster DT, Li SC, Yashiroda Y. et al. BIONIC: biological network integration using convolutions. Nat Methods 2022;19:1250–61. 10.1038/s41592-022-01616-x.
- Dong K, Zhang S. Deciphering spatial domains from spatially resolved transcriptomics with an adaptive graph attention auto-encoder. Nat Commun 2022;13:1739. 10.1038/s41467-022-29439-6.
- Mahmud M, Kaiser MS, Hussain A. et al. Applications of deep learning and reinforcement learning to biological data. IEEE Trans Neural Netw Learn Syst 2018;29:2063–79. 10.1109/TNNLS.2018.2790388.
- Wiggins WF, Tejani AS. On the opportunities and risks of foundation models for natural language processing in radiology. Radiol Artif Intell 2022;4:e220119. 10.1148/ryai.220119.
- Baker B, Akkaya I, Zhokhov P. et al. Video PreTraining (VPT): learning to act by watching unlabeled online videos. Adv Neural Inf Process Syst 2022;35:24639–54.
- Tack A, Piech C. The AI teacher test: measuring the pedagogical ability of Blender and GPT-3 in educational dialogues. arXiv preprint arXiv:2205.07540, 2022.
- Moor M, Banerjee O, Abad ZSH. et al. Foundation models for generalist medical artificial intelligence. Nature 2023;616:259–65. 10.1038/s41586-023-05881-4.
- Rao RM, Liu J, Verkuil R. et al. MSA Transformer. In: International Conference on Machine Learning. PMLR 2021;139:8844–56.
- Sapoval N, Aghazadeh A, Nute MG. et al. Current progress and open challenges for applying deep learning across the biosciences. Nat Commun 2022;13:1728. 10.1038/s41467-022-29268-7.
- Theodoris CV, Xiao L, Chopra A. et al. Transfer learning enables predictions in network biology. Nature 2023;618:616–24. 10.1038/s41586-023-06139-9.
- Zou J, Huss M, Abid A. et al. A primer on deep learning in genomics. Nat Genet 2019;51:12–8. 10.1038/s41588-018-0295-5.
- Uhlmann V, Donati L, Sage D. A practical guide to supervised deep learning for bioimage analysis: challenges and good practices. IEEE Signal Process Mag 2022;39:73–86. 10.1109/MSP.2021.3123589.
- Wasserman WW, Sandelin A. Applied bioinformatics for the identification of regulatory elements. Nat Rev Genet 2004;5:276–87. 10.1038/nrg1315.
- Howard J, Ruder S. Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146, 2018.
- Yuan L, Chen D, Chen YL. et al. Florence: a new foundation model for computer vision. arXiv preprint arXiv:2111.11432, 2021.
- Devlin J, Chang MW, Lee K. et al. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
- Lee J, Yoon W, Kim S. et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 2020;36:1234–40. 10.1093/bioinformatics/btz682.
- Gu Y, Tinn R, Cheng H. et al. Domain-specific language model pretraining for biomedical natural language processing. ACM Trans Comput Healthc 2021;3:1–23.
- Ji Y, Zhou Z, Liu H. et al. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics 2021;37:2112–20. 10.1093/bioinformatics/btab083.
- Brandes N, Ofer D, Peleg Y. et al. ProteinBERT: a universal deep-learning model of protein sequence and function. Bioinformatics 2022;38:2102–10. 10.1093/bioinformatics/btac020.
- Radford A, Wu J, Child R. et al. Language models are unsupervised multitask learners. OpenAI Blog 2019;1:9.
- Wu Y, Wang S, Yang H. et al. An early evaluation of GPT-4V(ision). arXiv preprint arXiv:2310.16534, 2023.
- Lin Z, Akin H, Rao R. et al. Language models of protein sequences at the scale of evolution enable accurate structure prediction. bioRxiv 500902, 2022.
- Hayes T, Rao R, Akin H. et al. Simulating 500 million years of evolution with a language model. bioRxiv 600583, 2024.
- Raffel C, Shazeer N, Roberts A. et al. Exploring the limits of transfer learning with a unified text-to-text transformer. J Mach Learn Res 2020;21:1–67.
- Song K, Tan X, Qin T. et al. MPNet: masked and permuted pre-training for language understanding. Adv Neural Inf Process Syst 2020;33:16857–67.
- Avsec Ž, Agarwal V, Visentin D. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat Methods 2021;18:1196–203. 10.1038/s41592-021-01252-x.
- Xu M, Yuan X, Miret S. et al. ProtST: multi-modality learning of protein sequences and biomedical texts. arXiv preprint arXiv:2301.12040, 2023.
- Ferruz N, Schmidt S, Höcker B. ProtGPT2 is a deep unsupervised language model for protein design. Nat Commun 2022;13:4348. 10.1038/s41467-022-32007-7.
- Chen B, Cheng X, Geng Y. et al. xTrimoPGLM: unified 100B-scale pre-trained transformer for deciphering the language of protein. arXiv preprint arXiv:2401.06199, 2024.
- Liu Y, Tian B. Protein-DNA binding sites prediction based on pre-trained protein language model and contrastive learning. arXiv preprint arXiv:2306.15912, 2023.
- Azher ZL, Suvarna A, Chen JQ. et al. Assessment of emerging pretraining strategies in interpretable multimodal deep learning for cancer prognostication. BioData Min 2023;16:23. 10.1186/s13040-023-00338-w.
- Liu Y, Tian B. Protein-DNA binding sites prediction based on pre-trained protein language model and contrastive learning. Brief Bioinform 2024;25:bbad488. 10.1093/bib/bbad488.
- Nguyen E, Poli M, Faizi M. et al. HyenaDNA: long-range genomic sequence modeling at single nucleotide resolution. Adv Neural Inf Process Syst 2024;36.
- Cui H, Wang C, Maan H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat Methods 2024;21:1470–80.
- Madani A, Krause B, Greene ER. et al. Large language models generate functional protein sequences across diverse families. Nat Biotechnol 2023;41:1099–106. 10.1038/s41587-022-01618-2.
- Senior AW, Evans R, Jumper J. et al. Improved protein structure prediction using potentials from deep learning. Nature 2020;577:706–10. 10.1038/s41586-019-1923-7.
- Walsh B, Mohamed SK, Nováček V. BioKG: a knowledge graph for relational learning on biological data. In: d'Aquin M, Dietze S, (eds.), Proceedings of the 29th ACM International Conference on Information & Knowledge Management. New York, NY, USA: ACM, 2020;3173–80.
- Bernstein NJ, Fong NL, Lam I. et al. Solo: doublet identification in single-cell RNA-seq via semi-supervised deep learning. Cell Syst 2020;11:95–101.e5. 10.1016/j.cels.2020.05.010.
- Brendel M, Su C, Bai Z. et al. Application of deep learning on single-cell RNA sequencing data analysis: a review. Genomics Proteomics Bioinformatics 2022;20:814–35. 10.1016/j.gpb.2022.11.011.
- Arisdakessian C, Poirion O, Yunits B. et al. DeepImpute: an accurate, fast, and scalable deep neural network method to impute single-cell RNA-seq data. Genome Biol 2019;20:211. 10.1186/s13059-019-1837-6.
- Tran HTN, Ang KS, Chevrier M. et al. A benchmark of batch-effect correction methods for single-cell RNA sequencing data. Genome Biol 2020;21:1–32. 10.1186/s13059-019-1850-9.
- Clement L. Statistical methods for quantitative MS-based proteomics: part I. Preprocessing.
- Mowoe MO, Garnett S, Lennard K. et al. Pro-MAP: a robust pipeline for the pre-processing of single channel protein microarray data. BMC Bioinformatics 2022;23:534. 10.1186/s12859-022-05095-x.
- Hong L, Sun S, Zheng L. et al. fastMSA: accelerating multiple sequence alignment with dense retrieval on protein language. bioRxiv 2021;2021–12.
- Steinegger M, Meier M, Mirdita M. et al. HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics 2019;20:1–15. 10.1186/s12859-019-3019-7.
- Stecher G, Tamura K, Kumar S. Molecular evolutionary genetics analysis (MEGA) for macOS. Mol Biol Evol 2020;37:1237–9. 10.1093/molbev/msz312.
- Chen K, Zhao H, Yang Y. Capturing large genomic contexts for accurately predicting enhancer-promoter interactions. Brief Bioinform 2022; 23:bbab577. 10.1093/bib/bbab577.
- Novakovsky G, Dexter N, Libbrecht MW. et al. Obtaining genetics insights from deep learning via explainable artificial intelligence. Nat Rev Genet 2023;24:125–37. 10.1038/s41576-022-00532-2.
- Dalla-Torre H, Gonzalez L, Mendoza J. et al. The nucleotide transformer: building and evaluating robust foundation models for human genomics. bioRxiv 2023;2023–01.
- Chen J, Hu Z, Sun S. et al. Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions. arXiv preprint arXiv:2204.00300, 2022.
- Alipanahi B, Delong A, Weirauch MT. et al. Predicting the sequence specificities of DNA-and RNA-binding proteins by deep learning. Nat Biotechnol 2015;33:831–8. 10.1038/nbt.3300.
- Liu P, Yuan W, Fu J. et al. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput Surv 2023;55:1–35. 10.1145/3560815.
- Rentzsch P, Schubach M, Shendure J. et al. CADD-splice—improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med 2021;13:1–12. 10.1186/s13073-021-00835-9.
- Mi H, Muruganujan A, Huang X. et al. Protocol update for large-scale genome and gene function analysis with the PANTHER classification system (v. 14.0). Nat Protoc 2019;14:703–21. 10.1038/s41596-019-0128-8.
- Ernst J, Kheradpour P, Mikkelsen TS. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 2011;473:43–9. 10.1038/nature09906.
- Tang Z, Li C, Kang B. et al. GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Res 2017;45:W98–102. 10.1093/nar/gkx247.
- Saelens W, Cannoodt R, Todorov H. et al. A comparison of single-cell trajectory inference methods. Nat Biotechnol 2019;37:547–54. 10.1038/s41587-019-0071-9.
- Kaddour J, Harris J, Mozes M. et al. Challenges and applications of large language models. arXiv preprint arXiv:2307.10169, 2023.
- Liu W, Zhou P, Zhao Z. et al. K-bert: enabling language representation with knowledge graph. In: Proceedings of the AAAI Conference on Artificial Intelligence 2020;34:2901–8. 10.1609/aaai.v34i03.5681.
- Brown TB, Mann B, Ryder N. et al. Language models are few-shot learners. Adv Neural Inf Process Syst 2020;33:1877–901.
- Yasunaga M, Bosselut A, Ren H. et al. Deep bidirectional language-knowledge graph pretraining. Adv Neural Inf Process Syst 2022;35:37309–23.
- Vrandečić D, Krötzsch M. Wikidata: a free collaborative knowledgebase. Commun ACM 2014;57:78–85. 10.1145/2629489.
- Zhu Y, Kiros R, Zemel R. et al. Aligning books and movies: towards story-like visual explanations by watching movies and reading books. arXiv preprint arXiv:1506.06724, 2015.
- Speer R, Chin J, Havasi C. ConceptNet 5.5: an open multilingual graph of general knowledge. In: Proceedings of the AAAI Conference on Artificial Intelligence 2017;31:4444–51. 10.1609/aaai.v31i1.11164.
- Jia G, Li Y, Zhong X. et al. The high-dimensional space of human diseases built from diagnosis records and mapped to genetic loci. Nat Comput Sci 2023;3:403–17. 10.1038/s43588-023-00453-y.
- Jia G, Li Y, Zhang H. et al. Estimating heritability and genetic correlations from large health datasets in the absence of genetic data. Nat Commun 2019;10:5508. 10.1038/s41467-019-13455-0.
- Singhal K, Azizi S, Tu T. et al. Large language models encode clinical knowledge. Nature 2023;620:172–80. 10.1038/s41586-023-06291-2.
- Kanakarajan RK, Kundumani B, Sankarasubbu M. BioELECTRA: Pretrained biomedical text encoder using discriminators. In: Demner-Fushman D, Cohen KB, Ananiadou S, Tsujii J, (eds.), Proceedings of the 20th Workshop on Biomedical Language Processing. Association for Computational Linguistics, Online, 2021;143–154.
- Babjac AN, Lu Z, Emrich SJ. CodonBERT: using BERT for sentiment analysis to better predict genes with low expression. In: Wang MD, Yoon B-J, (eds.), Proceedings of the 14th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. New York, NY, USA: Association for Computing Machinery, 2023;1–6.
- Yuan H, Yuan Z, Gan R. et al. BioBART: Pretraining and evaluation of a biomedical generative language model. arXiv preprint arXiv:2204.03905, 2022.
- Rajpurkar P, Zhang J, Lopyrev K. et al. SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250, 2016.
- Fiorini N, Leaman R, Lipman DJ. et al. How user intelligence is improving PubMed. Nat Biotechnol 2018;36:937–45. 10.1038/nbt.4267.
- Wu J, Fu R, Fang H. et al. Medical SAM Adapter: adapting Segment Anything Model for medical image segmentation. arXiv preprint arXiv:2304.12620, 2023.
- Pathak Y, Shukla PK, Tiwari A. et al. Deep transfer learning based classification model for COVID-19 disease. Ing Rech Biomed 2022;43:87–92. 10.1016/j.irbm.2020.05.003.
- Bolton E, Hall D, Yasunaga M. et al. Stanford CRFM introduces PubMedGPT 2.7B. 2022.
- Zhou Z, Ji Y, Li W. et al. DNABERT-2: efficient foundation model and benchmark for multi-species genome. arXiv preprint arXiv:2306.15006, 2023.
- Wang R, Wang Z, Wang J. et al. SpliceFinder: ab initio prediction of splice sites using convolutional neural network. BMC Bioinformatics 2019;20:1–13. 10.1186/s12859-019-3306-3.
- Repecka D, Jauniskis V, Karpus L. et al. Expanding functional protein sequence spaces using generative adversarial networks. Nat Mach Intell 2021;3:324–33. 10.1038/s42256-021-00310-5.
- Gresova K, Martinek V, Cechak D. et al. Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genomic Data 2023;24:25.
- Wu R, Ding F, Wang R. et al. High-resolution de novo structure prediction from primary sequence. bioRxiv 2022;2022–07.
- Lin Z, Akin H, Rao R. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 2023;379:1123–30. 10.1126/science.ade2574.
- Ruffolo JA, Chu LS, Mahajan SP. et al. Fast, accurate antibody structure prediction from deep learning on massive set of natural antibodies. Nat Commun 2023;14:2389. 10.1038/s41467-023-38063-x.
- Wang Y, Gong X, Li S. et al. xTrimoABFold: de novo antibody structure prediction without MSA. arXiv preprint arXiv:2212.00735, 2022.
- Skinnider M, Johnston C, Gunabalasingam M. et al. Comprehensive prediction of secondary metabolite structure and biological activity from microbial genome sequences. Nat Commun 2020;11:6058. 10.1038/s41467-020-19986-1.
- Jumper J, Evans R, Pritzel A. et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021;596:583–9. 10.1038/s41586-021-03819-2.
- Rives A, Meier J, Sercu T. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci 2021;118:e2016239118. 10.1073/pnas.2016239118.
- Klausen MS, Jespersen MC, Nielsen H. et al. NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning. Proteins 2019;87:520–7. 10.1002/prot.25674.
- Elnaggar A, Heinzinger M, Dallago C. et al. ProtTrans: toward understanding the language of life through self-supervised learning. IEEE Trans Pattern Anal Mach Intell 2021;44:7112–27. 10.1109/TPAMI.2021.3095381.
- Zhou G, Gao Z, Ding Q. et al. Uni-Mol: a universal 3D molecular representation learning framework. ChemRxiv 2023.
- Feynman R. The Character of Physical Law, with New Foreword. MIT Press, Cambridge, Massachusetts, USA, 2017, 10.7551/mitpress/11068.001.0001.
- Chowdhury R, Bouatta N, Biswas S. et al. Single-sequence protein structure prediction using a language model and deep learning. Nat Biotechnol 2022;40:1617–23. 10.1038/s41587-022-01432-w.
- Guo Y, Wu J, Ma H. et al. Self-supervised pre-training for protein embeddings using tertiary structures. Proceedings of the AAAI Conference on Artificial Intelligence 2022;36:6801–9. 10.1609/aaai.v36i6.20636.
- McDermott M, Yap B, Szolovits P. et al. Structure-inducing pre-training. Nat Mach Intell 2023;5:612–21. 10.1038/s42256-023-00647-z.
- Singh J, Hanson J, Paliwal K. et al. RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning. Nat Commun 2019;10:5407. 10.1038/s41467-019-13395-9.
- Fu L, Cao Y, Wu J. et al. UFold: fast and accurate RNA secondary structure prediction with deep learning. Nucleic Acids Res 2022;50:e14. 10.1093/nar/gkab1074.
- Zhu H, Hu J, Song XN. et al. DNAPred: accurate identification of DNA-binding sites from protein sequence by ensembled hyperplane-distance-based support vector machines. J Chem Inf Model 2019;59:3057–71. 10.1021/acs.jcim.8b00749.
- Zhang J, Chen Q, Liu B. NCBRPred: predicting nucleic acid binding residues in proteins based on multilabel learning. Brief Bioinform 2021;22:bbaa397. 10.1093/bib/bbaa397.
- Su H, Liu M, Sun S. et al. Improving the prediction of protein-nucleic acids binding residues via multiple sequence profiles and the consensus of complementary methods. Bioinformatics 2019;35:930–6. 10.1093/bioinformatics/bty756.
- Ashburner M, Ball C, Blake J. et al. Gene ontology: tool for the unification of biology. Nat Genet 2000;25:25–9. 10.1038/75556.
- Gligorijević V, Renfrew PD, Kosciolek T. et al. Structure-based protein function prediction using graph convolutional networks. Nat Commun 2021;12:3168. 10.1038/s41467-021-23303-9.
- Kulmanov M, Hoehndorf R. DeepGOZero: improving protein function prediction from sequence and zero-shot learning based on ontology axioms. Bioinformatics 2022;38:i238–45. 10.1093/bioinformatics/btac256.
- Yang F, Wang W, Wang F. et al. scBERT as a large-scale pre-trained deep language model for cell type annotation of single-cell RNA-seq data. Nat Mach Intell 2022;4:852–66. 10.1038/s42256-022-00534-z.
- Choromanski K, Likhosherstov V, Dohan D. et al. Rethinking attention with performers. arXiv preprint arXiv:2009.14794, 2020.
- Lu Y, Jiang X, Fang Y. et al. Learning to pre-train graph neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 2021;35:4276–84. 10.1609/aaai.v35i5.16552.
- Li C, Liu B, Kang B. et al. SciBet as a portable and fast single cell type identifier. Nat Commun 2020;11:1818. 10.1038/s41467-020-15523-2.
- Kiselev VY, Yiu A, Hemberg M. Scmap: projection of single-cell RNA-seq data across data sets. Nat Methods 2018;15:359–62. 10.1038/nmeth.4644.
- Yang X, Mann KK, Wu H. et al. scCross: a deep generative model for unifying single-cell multi-omics with seamless integration, cross-modal generation, and in silico exploration. Genome Biol 2024;25:198. 10.1186/s13059-024-03338-z.
- Hao M, Gong J, Zeng X. et al. Large-scale foundation model on single-cell transcriptomics. Nat Methods 2024;21:1481–91.
- Saharia C, Chan W, Saxena S. et al. Photorealistic text-to-image diffusion models with deep language understanding. Adv Neural Inf Process Syst 2022;35:36479–94.
- Cao ZJ, Gao G. Multi-omics single-cell data integration and regulatory inference with graph-linked embedding. Nat Biotechnol 2022;40:1458–66. 10.1038/s41587-022-01284-4.
- Ciciani M, Demozzi M, Pedrazzoli E. et al. Automated identification of sequence-tailored Cas9 proteins using massive metagenomic data. Nat Commun 2022;13:6474. 10.1038/s41467-022-34213-9.
- Ruiz C, Zitnik M, Leskovec J. Identification of disease treatment mechanisms through the multiscale interactome. Nat Commun 2021;12:1796. 10.1038/s41467-021-21770-8.
- Eraslan G, Avsec Ž, Gagneur J. et al. Deep learning: new computational modeling techniques for genomics. Nat Rev Genet 2019;20:389–403. 10.1038/s41576-019-0122-6.
- Poli M, Massaroli S, Nguyen E. et al. Hyena hierarchy: towards larger convolutional language models. In: International Conference on Machine Learning. PMLR, 2023;28043–28078.
- Jeliazkov JR, del Alamo D, Karpiak JD. ESMFold hallucinates native-like protein sequences. bioRxiv 2023;2023–05.
- Wang Z, Dai Z, Póczos B. et al. Characterizing and avoiding negative transfer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019; 11293–11302.
- Wang H, Kaddour J, Liu S. et al. Evaluating self-supervised learning for molecular graph embeddings. Adv Neural Inf Process Syst 2024;36.
- Zhou H, Zhang S, Peng J. et al. Informer: beyond efficient transformer for long sequence time-series forecasting. Proceedings of the AAAI Conference on Artificial Intelligence 2021;35:11106–15. 10.1609/aaai.v35i12.17325.
- Press O, Smith NA, Lewis M. Shortformer: better language modeling using shorter inputs. arXiv preprint arXiv:2012.15832, 2020.
- Li C, Zhang M, He Y. The stability-efficiency dilemma: investigating sequence length warmup for training GPT models. Adv Neural Inf Process Syst 2022;35:26736–50.
- Dao T, Fu D, Ermon S. et al. FlashAttention: fast and memory-efficient exact attention with IO-awareness. Adv Neural Inf Process Syst 2022;35:16344–59.
- Ainslie J, Lee-Thorp J, de Jong M. et al. GQA: training generalized multi-query transformer models from multi-head checkpoints. arXiv preprint arXiv:2305.13245, 2023.
- Hijma P, Heldens S, Sclocco A. et al. Optimization techniques for GPU programming. ACM Comput Surv 2023;55:1–81. 10.1145/3570638.
- Cui P, Athey S. Stable learning establishes some common ground between causal inference and machine learning. Nat Mach Intell 2022;4:110–5. 10.1038/s42256-022-00445-z.
- Jin Q, Yuan Z, Xiong G. et al. Biomedical question answering: a survey of approaches and challenges. ACM Comput Surv 2022;55:1–36. 10.1145/3490238.
- Danaee P, Rouches M, Wiley M. et al. bpRNA: large-scale automated annotation and analysis of RNA secondary structure. Nucleic Acids Res 2018;46:5381–94. 10.1093/nar/gky285.
- Moon I, LoPiccolo J, Baca SC. et al. Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary. Nat Med 2023;29:2057–67. 10.1038/s41591-023-02482-6.
- Wornow M, Xu Y, Thapa R. et al. The shaky foundations of large language models and foundation models for electronic health records. npj Digit Med 2023;6:135. 10.1038/s41746-023-00879-8.
Related documents on Qiita
Making a reference list from a bioRxiv PDF file
https://qiita.com/kaizen_nagoya/items/75f6f93ce9872a5d622d
Genome modeling and design across all domains of life with Evo 2
https://qiita.com/kaizen_nagoya/items/eecda74f758008633ee2
BioReason: incentivizing multimodal biological reasoning within a DNA-LLM model
https://qiita.com/kaizen_nagoya/items/0718b214043a614deee0
McKusick's Online Mendelian Inheritance in Man (OMIM®)
https://qiita.com/kaizen_nagoya/items/c599d867201d1ffb1f4d
Anthropic. Claude 3.7 Sonnet
https://qiita.com/kaizen_nagoya/items/4364d9c475114353cf2a
Genomic language models: Opportunities and challenges
https://qiita.com/kaizen_nagoya/items/f797330e64e0c7d05f39
A DNA language model based on multispecies alignment predicts the effects of genome-wide variants
https://qiita.com/kaizen_nagoya/items/6e8858c2395dcc98804a
A genomic mutational constraint map using variation in 76,156 human genomes
https://qiita.com/kaizen_nagoya/items/e799ad85ee98bb2a8cf6
Nucleotide transformer: building and evaluating robust foundation models for human genomics
https://qiita.com/kaizen_nagoya/items/1c147c2b095364f04ef7
DeepSeek-AI
https://qiita.com/kaizen_nagoya/items/bb5ee9f17c03e07659d8
CodonTransformer: a multispecies codon optimizer using context-aware neural networks
https://qiita.com/kaizen_nagoya/items/d4be1d4dd9eb307f09cc
MedRAX: medical reasoning agent for chest X-ray
https://qiita.com/kaizen_nagoya/items/94c7835b2f461452b2e7
Benchmarking DNA foundation models for genomic sequence classification (running title: DNA foundation models benchmarking)
https://qiita.com/kaizen_nagoya/items/01e3dde0d8274fee0fd8
LoRA: low-rank adaptation of large language models
https://qiita.com/kaizen_nagoya/items/877058f681d77808b44c
kegg_pull: a software package for the RESTful access and pulling from the Kyoto Encyclopedia of Gene and Genomes
https://qiita.com/kaizen_nagoya/items/05be40565793f2b4f7f3
GeneGPT: augmenting large language models with domain tools for improved access to biomedical information
https://qiita.com/kaizen_nagoya/items/8897792ff52fb5e68a46
KEGG: biological systems database as a model of the real world
https://qiita.com/kaizen_nagoya/items/f63573043eaf8f9c6a2c
Entrez Direct: E-utilities on the Unix command line
https://qiita.com/kaizen_nagoya/items/cc4bbde566e67abc93d9
ClinVar: public archive of relationships among sequence variation and human phenotype
https://qiita.com/kaizen_nagoya/items/8149b7a5a4f930490fad
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
https://qiita.com/kaizen_nagoya/items/63781eb6db1fc2ded80a
Progress and opportunities of foundation models in bioinformatics. Briefings in Bioinformatics
https://qiita.com/kaizen_nagoya/items/6ef20eaf796532fed6f8