LLM(Large Language Model) Advent Calendar 2024

document list, AI(19)

Last updated at 2024-11-23, posted at 2024-11-18

Chen et al., MEDITRON-70B: Scaling Medical Pretraining for Large Language Models, arXiv, 2023. https://arxiv.org/pdf/2311.16079

Nori et al., Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine, arXiv, 2023.
Kasai et al., Evaluating GPT-4 and ChatGPT on Japanese Medical Licensing Examinations, arXiv, 2023. https://arxiv.org/pdf/2303.18027
Rein et al., GPQA: A Graduate-Level Google-Proof Q&A Benchmark, COLM, 2024.

Source: Li et al., CANCERLLM: A LARGE LANGUAGE MODEL IN CANCER DOMAIN, 2024.
Yu et al., ECG Semantic Integrator (ESI): A Foundation ECG Model Pretrained with LLM-Enhanced Cardiological Text, 2024.
Zhao et al., EpilepsyLLM: Domain-Specific Large Language Model Fine-tuned with Epilepsy Medical Knowledge, 2024.

Source: Adams et al., LongHealth: A QUESTION ANSWERING BENCHMARK WITH LONG CLINICAL DOCUMENTS, 2024.
JMED-LLM: Japanese Medical Evaluation Dataset for Large Language Models, https://github.com/sociocom/JMED-LLM

Ryo Kamoi, Sarkar Snigdha Sarathi Das, Renze Lou, Jihyun Janice Ahn, Yilun Zhao, Xiaoxin Lu, Nan Zhang, Yusen Zhang, Ranran Haoran Zhang, Sujeeth Reddy Vummanthala, Salika Dave, Shaobo Qin, Arman Cohan, Wenpeng Yin, Rui Zhang. (2024) "Evaluating LLMs at Detecting Errors in LLM Responses," COLM 2024.

Ryo Kamoi, Yusen Zhang, Nan Zhang, Jiawei Han, Rui Zhang. (2024) "When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs," TACL 2024 (to appear).

Zakka et al., Almanac: Retrieval-Augmented Language Models for Clinical Medicine, arXiv, 2023. https://arxiv.org/pdf/2303.01229

Fleming et al., MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records, https://arxiv.org/pdf/2308.14089

Xiong et al., Benchmarking Retrieval-Augmented Generation for Medicine, ACL, 2024.

Wu et al., MEDICAL GRAPH RAG: TOWARDS SAFE MEDICAL LARGE LANGUAGE MODEL VIA GRAPH RETRIEVAL-AUGMENTED GENERATION, arXiv, 2024.

O’Sullivan et al., Towards Democratization of Subspeciality Medical Expertise, arXiv, 2024.

Source: Li et al., Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents, 2024.

Source:
https://fortune.com/2023/07/10/google-ai-mayo-clinic-healthcare-med-palm-2-large-language-model/
https://www.tohoku.ac.jp/japanese/newimg/pressimg/tohokuuniv-press20231213_01web_llm.pdf
https://www.optim.co.jp/newsdetail/20240329-pressrelease-01
https://www.fixer.co.jp/news/2024/04/2024_0416_001_entrusted_with_the_design_and_development_work_of_the_standard_electronic_medical_record_system_alpha_version

Huang et al., TRUSTLLM: TRUSTWORTHINESS IN LARGE LANGUAGE MODELS – A PRINCIPLE AND BENCHMARK, arXiv, 2024.

Ning et al., Generative artificial intelligence and ethical considerations in health care: a scoping review and ethics checklist, The Lancet Digital Health, 2024.

Pal et al., Med-HALT: Medical Domain Hallucination Test for Large Language Models, CoNLL, 2023.
Vishwanath et al., Faithfulness Hallucination Detection in Healthcare AI, KDD-AIDSH, 2024. https://openreview.net/pdf?id=6eMIzKFOpJ

Reference
• Singhal et al., Towards Expert-Level Medical Question Answering with Large Language Models, arXiv, 2023. https://arxiv.org/pdf/2305.09617
• Chung et al., Scaling Instruction-Finetuned Language Models, JMLR, 2024. https://www.jmlr.org/papers/volume25/23-0870/23-0870.pdf
• Li et al., CANCERLLM: A LARGE LANGUAGE MODEL IN CANCER DOMAIN, 2024.
• Yu et al., ECG Semantic Integrator (ESI): A Foundation ECG Model Pretrained with LLM-Enhanced Cardiological Text, 2024.
• Zhao et al., EpilepsyLLM: Domain-Specific Large Language Model Fine-tuned with Epilepsy Medical Knowledge, 2024.
• Nori et al., Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine, arXiv, 2023.
• Kasai et al., Evaluating GPT-4 and ChatGPT on Japanese Medical Licensing Examinations, arXiv, 2023. https://arxiv.org/pdf/2303.18027
• Rein et al., GPQA: A Graduate-Level Google-Proof Q&A Benchmark, COLM, 2024.
• Adams et al., LongHealth: A QUESTION ANSWERING BENCHMARK WITH LONG CLINICAL DOCUMENTS, 2024.
• Dorfner et al., BIOMEDICAL LARGE LANGUAGES MODELS SEEM NOT TO BE SUPERIOR TO GENERALIST MODELS ON UNSEEN MEDICAL DATA, arXiv, 2024.
• Nest et al., MEDFUZZ: EXPLORING THE ROBUSTNESS OF LARGE LANGUAGE MODELS IN MEDICAL QUESTION ANSWERING, 2024.
• Nagar et al., LLMs are not Zero-Shot Reasoners for Biomedical Information Extraction, arXiv, 2024.
• Wei et al., Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, NeurIPS, 2022.
• Kojima et al., Large Language Models are Zero-Shot Reasoners, NeurIPS, 2022.
• Zakka et al., Almanac: Retrieval-Augmented Language Models for Clinical Medicine, arXiv, 2023. https://arxiv.org/pdf/2303.01229
• Tu et al., Towards Conversational Diagnostic AI, Google Research & Google DeepMind, 2024.
• Fleming et al., MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records, arXiv, 2023.
• Minaee et al., Large Language Models: A Survey, arXiv, 2024. https://arxiv.org/pdf/2402.06196
• Huang et al., TRUSTLLM: TRUSTWORTHINESS IN LARGE LANGUAGE MODELS – A PRINCIPLE AND BENCHMARK, arXiv, 2024.
• Ning et al., Generative artificial intelligence and ethical considerations in health care: a scoping review and ethics checklist, The Lancet Digital Health, 2024.
• Pal et al., Med-HALT: Medical Domain Hallucination Test for Large Language Models, CoNLL, 2023.
• Vishwanath et al., Faithfulness Hallucination Detection in Healthcare AI, KDD-AIDSH, 2024. https://openreview.net/pdf?id=6eMIzKFOpJ
• Li et al.(2024), Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents, arXiv:2405.02957
• Saab et al.(2024), Capabilities of Gemini Models in Medicine, arXiv:2404.18416
• Christophe et al.(2024), MED42-V2: A SUITE OF CLINICAL LLMS, arXiv:2408.06142
• Chen et al.(2023), MEDITRON-70B: Scaling Medical Pretraining for Large Language Models, arXiv:2311.16079
• Nori et al.(2023), Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine, arXiv:2311.16452
• Sukeda et al.(2023), JMedLoRA: Medical Domain Adaptation on Japanese Large Language Models using Instruction-tuning, NeurIPS Workshop Deep Generative Models for Health.
• Yang et al.(2024), Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-World Multi-Turn Dialogue, AAAI 2024.
• Pieri et al.(2024), BiMediX: Bilingual Medical Mixture of Experts LLM, arXiv:2402.13253
• Qiu et al.(2024), Towards Building Multilingual Language Model for Medicine, arXiv:2402.13963
• Li et al.(2023), LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day, NeurIPS 2023.
