Q. Jin, Y. Yang, Q. Chen, and Z. Lu. GeneGPT: augmenting large language models with domain tools for improved access to biomedical information. Bioinformatics, 40(2), 2024. ISSN 1367-4811. doi: 10.1093/bioinformatics/btae075. URL https://dx.doi.org/10.1093/bioinformatics/btae075.
References
Altschul SF, Gish W, Miller W et al. Basic local alignment search tool. J Mol Biol 1990;215:403–10.
Boratyn GM, Camacho C, Cooper PS et al. BLAST: a more efficient report with usability improvements. Nucleic Acids Res 2013;41:W29–W33.
Borgeaud S, Mensch A, Hoffmann J et al. Improving language models by retrieving from trillions of tokens. In: International conference on machine learning, Baltimore, Maryland, USA, p. 2206–40. PMLR, 2022.
Brown T, Mann B, Ryder N et al. Language models are few-shot learners. Adv Neural Inform Process Syst 2020;33:1877–901.
Chen M, Tworek J, Jun H et al. Evaluating large language models trained on code. arXiv, arXiv:2107.03374, 2021, preprint: not peer reviewed.
Chowdhery A, Narang S, Devlin J et al. PaLM: scaling language modeling with pathways. arXiv, arXiv:2204.02311, 2022, preprint: not peer reviewed.
Ely JW, Osheroff JA, Chambliss ML et al. Answering physicians’ clinical questions: obstacles and potential solutions. J Am Med Inform Assoc 2005;12:217–24.
Gao L, Madaan A, Zhou S et al. PAL: program-aided language models. arXiv, arXiv:2211.10435, 2022, preprint: not peer reviewed.
Guu K, Lee K, Tung Z et al. Retrieval augmented language model pre-training. In: International conference on machine learning, p. 3929–3938. PMLR, 2020.
Hou W, Ji Z. GeneTuring tests GPT models in genomics. bioRxiv 2023:2023–03.
Ji Z, Lee N, Frieske R et al. Survey of hallucination in natural language generation. ACM Comput Surv 2023;55:1–38.
Jin Q, Leaman R, Lu Z. Retrieve, summarize, and verify: how will ChatGPT impact information seeking from the medical literature? J Am Soc Nephrol 2023a;34:1302–4.
Jin Q, Wang Z, Floudas CS et al. Matching patients to clinical trials with large language models. arXiv, arXiv:2307.15051, 2023b, preprint: not peer reviewed.
Jin Q, Yuan Z, Xiong G et al. Biomedical question answering: a survey of approaches and challenges. ACM Comput Surv 2022;55:1–36.
Kaplan J, McCandlish S, Henighan T et al. Scaling laws for neural language models. arXiv, arXiv:2001.08361, 2020, preprint: not peer reviewed.
Lewis P, Perez E, Piktus A et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. Adv Neural Inform Process Syst 2020;33:9459–74.
Liévin V, Hother CE, Winther O. Can large language models reason about medical questions? arXiv, arXiv:2207.08143, 2022, preprint: not peer reviewed.
Luo R, Sun L, Xia Y et al. BioGPT: generative pre-trained transformer for biomedical text generation and mining. Brief Bioinform 2022;23. https://doi.org/10.1093/bib/bbac409.
Mialon G, Dessì R, Lomeli M et al. Augmented language models: a survey. arXiv, arXiv:2302.07842, 2023, preprint: not peer reviewed.
Nori H, King N, McKinney SM et al. Capabilities of GPT-4 on medical challenge problems. arXiv, arXiv:2303.13375, 2023, preprint: not peer reviewed.
OpenAI. GPT-4 technical report. CoRR abs/2303.08774, 2023.
Parisi A, Zhao Y, Fiedel N. TALM: tool augmented language models. arXiv, arXiv:2205.12255, 2022, preprint: not peer reviewed.
Qin Y, Hu S, Lin Y et al. Tool learning with foundation models. arXiv, arXiv:2304.08354, 2023, preprint: not peer reviewed. http://arxiv.org/pdf/2304.08354.pdf.
Radford A, Narasimhan K, Salimans T et al. Improving language understanding by generative pre-training, OpenAI Blog, 2018.
Radford A, Wu J, Child R et al. Language models are unsupervised multitask learners. OpenAI Blog 2019;1:9.
Sayers EW, Agarwala R, Bolton EE et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2019;47:D23–D28.
Schick T, Dwivedi-Yu J, Dessì R et al. Toolformer: language models can teach themselves to use tools. arXiv, arXiv:2302.04761, 2023, preprint: not peer reviewed.
Schuler G, Epstein J, Ohkawa H et al. Entrez: molecular biology database and retrieval system. Methods Enzymol 1996;266:141–62.
Singhal K, Azizi S, Tu T et al. Large language models encode clinical knowledge. arXiv, arXiv:2212.13138, 2022, preprint: not peer reviewed.
Tian S, Jin Q, Yeganova L et al. Opportunities and challenges for ChatGPT and large language models in biomedicine and health. Brief Bioinform 2024;25(1). https://doi.org/10.1093/bib/bbad493.
Wei J, Tay Y, Bommasani R et al. Emergent abilities of large language models. arXiv, arXiv:2206.07682, 2022a, preprint: not peer reviewed.
Wei J, Wang X, Schuurmans D et al. Chain of thought prompting elicits reasoning in large language models. arXiv, arXiv:2201.11903, 2022b, preprint: not peer reviewed.
Wong C, Zheng S, Gu Y et al. Scaling clinical trial matching using large language models: a case study in oncology. arXiv, arXiv:2308.02180, 2023, preprint: not peer reviewed.
Yao S, Zhao J, Yu D et al. ReAct: synergizing reasoning and acting in language models. arXiv, arXiv:2210.03629, 2022, preprint: not peer reviewed.
Yuan J, Tang R, Jiang X et al. LLM for patient-trial matching: privacy-aware data augmentation towards better performance and generalizability. arXiv, arXiv:2303.16756, 2023, preprint: not peer reviewed.
Related documents on Qiita
making reference list on biorxiv pdf file
https://qiita.com/kaizen_nagoya/items/75f6f93ce9872a5d622d
Genome modeling and design across all domains of life with evo 2
https://qiita.com/kaizen_nagoya/items/eecda74f758008633ee2
BioReason: Incentivizing multimodal biological reasoning within a DNA-LLM model
https://qiita.com/kaizen_nagoya/items/0718b214043a614deee0
Mckusick’s online mendelian inheritance in man (omim®)
https://qiita.com/kaizen_nagoya/items/c599d867201d1ffb1f4d
Anthropic. Claude 3.7 sonnet
https://qiita.com/kaizen_nagoya/items/4364d9c475114353cf2a
Genomic language models: Opportunities and challenges
https://qiita.com/kaizen_nagoya/items/f797330e64e0c7d05f39
A dna language model based on multispecies alignment predicts the effects of genome-wide variants
https://qiita.com/kaizen_nagoya/items/6e8858c2395dcc98804a
A genomic mutational constraint map using variation in 76,156 human genomes
https://qiita.com/kaizen_nagoya/items/e799ad85ee98bb2a8cf6
Nucleotide transformer: building and evaluating robust foundation models for human genomics
https://qiita.com/kaizen_nagoya/items/1c147c2b095364f04ef7
DeepSeek-AI
https://qiita.com/kaizen_nagoya/items/bb5ee9f17c03e07659d8
Codontransformer: A multispecies codon optimizer using context-aware neural networks.
https://qiita.com/kaizen_nagoya/items/d4be1d4dd9eb307f09cc
Medrax: Medical reasoning agent for chest x-ray
https://qiita.com/kaizen_nagoya/items/94c7835b2f461452b2e7
Benchmarking DNA foundation models for genomic sequence classification
https://qiita.com/kaizen_nagoya/items/01e3dde0d8274fee0fd8
LoRA: Low-rank adaptation of large language models
https://qiita.com/kaizen_nagoya/items/877058f681d77808b44c
kegg pull: a software package for the restful access and pulling from the kyoto encyclopedia of gene and genomes.
https://qiita.com/kaizen_nagoya/items/05be40565793f2b4f7f3
Genegpt: augmenting large language models with domain tools for improved access to biomedical information.
https://qiita.com/kaizen_nagoya/items/8897792ff52fb5e68a46