AI / Machine Learning Papers Advent Calendar 2024

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI, AI(8)

Last updated at 2024-11-29. Posted at 2024-11-10.

AI / Machine Learning Papers Advent Calendar 2024
https://qiita.com/advent-calendar/2024/aiml
This article is scheduled for Day 8.

Originally, it was scheduled for Day 3 of
LLM(Large Language Model) Advent Calendar 2024
https://qiita.com/advent-calendar/2024/llm

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
Xiang Yue, Yuansheng Ni, Kai Zhang, Tianyu Zheng, Ruoqi Liu, Ge Zhang, Samuel Stevens, Dongfu Jiang, Weiming Ren, Yuxuan Sun, Cong Wei, Botao Yu, Ruibin Yuan, Renliang Sun, Ming Yin, Boyuan Zheng, Zhenzhu Yang, Yibo Liu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen
https://arxiv.org/abs/2311.16502v4

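Some reference numbers are missing; I will add them over time.
In the term list, some words could not be split correctly, and I am currently considering how to split them.
Please bear with me. If you know a good method, I would appreciate a note in the comments.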
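
As one possible approach to the splitting problem mentioned above, fused tokens such as thecorrectansweris can be segmented against a vocabulary of words that were already split correctly. The following is a minimal sketch, not the method used to build the term list: the function name split_fused and the small vocab set are hypothetical, chosen only for illustration.

```python
from functools import lru_cache


def split_fused(token, vocab):
    """Split `token` into the fewest words found in `vocab`.

    Returns a list of words, or None when no full segmentation exists.
    `vocab` is assumed to be a set of lowercase words.
    """

    @lru_cache(maxsize=None)
    def best(start):
        # Reached the end of the token: nothing left to split.
        if start == len(token):
            return ()
        candidates = []
        for end in range(start + 1, len(token) + 1):
            piece = token[start:end]
            if piece in vocab:
                rest = best(end)
                if rest is not None:
                    candidates.append((piece,) + rest)
        if not candidates:
            return None
        # Prefer the segmentation with the fewest pieces.
        return min(candidates, key=len)

    result = best(0)
    return list(result) if result is not None else None


# Hypothetical vocabulary; in practice it could be the cleanly split words
# that already appear in the term list.
vocab = {"the", "correct", "answer", "option", "is", "a", "an"}
print(split_fused("thecorrectansweris", vocab))
print(split_fused("norepinephrine", vocab))
```

A token that cannot be fully covered by the vocabulary returns None, so genuine single words such as norepinephrine are left untouched rather than being broken up.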

<This article is a work in progress. I will add words and/or sentences over time.>

References

[1] Blaise Agüera y Arcas and Peter Norvig. Artificial general intelligence is already here. Noema Magazine, 2023. 1
[2] Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Yana Hasson, Karel Lenc, Arthur Mensch, Katherine Millican, Malcolm Reynolds, et al. Flamingo: a visual language model for few-shot learning. In Advances in Neural Information Processing Systems, 2022. 3, 5
[3] Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, and Devi Parikh. VQA: Visual Question Answering. In International Conference on Computer Vision (ICCV), 2015. 2, 3
[4] Anas Awadalla, Irena Gao, Josh Gardner, Jack Hessel, Yusuf Hanafy, Wanrong Zhu, Kalyani Marathe, Yonatan Bitton, Samir Gadre, Shiori Sagawa, et al. Openflamingo: An open-source framework for training large autoregressive vision-language models. arXiv preprint arXiv:2308.01390, 2023. 3, 5, 6, 15, 16, 17, 18, 19, 20, 21
[5] Jinze Bai, Shuai Bai, Shusheng Yang, Shijie Wang, Sinan Tan, Peng Wang, Junyang Lin, Chang Zhou, and Jingren Zhou. Qwen-vl: A versatile vision-language model for understanding, localization, text reading, and beyond. arXiv preprint arXiv:2308.12966, 2023. 5, 6, 7, 15, 16, 17, 18, 19, 20, 21
[6] Rohan Bavishi, Erich Elsen, Curtis Hawthorne, Maxwell Nye, Augustus Odena, Arushi Somani, and Sağnak Taşırlar. Introducing our multimodal models, 2023. 3, 5, 6, 7, 15, 16, 17, 18, 19, 20, 21
[7] Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, et al. Sparks of artificial general intelligence: Early experiments with gpt-4. arXiv preprint arXiv:2303.12712, 2023. 1
[8] Bunny. Bunny-3b. https://github.com/cappuch/Bunny-Qwen, 2024. GitHub Repository. 15, 16, 17, 18, 19, 20, 21
[9] Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, et al. Pali-x: On scaling up a multilingual vision and language model. arXiv preprint arXiv:2305.18565, 2023. 2
[10] Yen-Chun Chen, Linjie Li, Licheng Yu, Ahmed El Kholy, Faisal Ahmed, Zhe Gan, Yu Cheng, and Jingjing Liu. Uniter: Universal image-text representation learning. In European Conference on Computer Vision, pages 104–120, 2020. 3
[11] Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Zhong Muyan, Qinglong Zhang, Xizhou Zhu, Lewei Lu, et al. Internvl: Scaling up vision foundation models and aligning for generic visual-linguistic tasks. arXiv preprint arXiv:2312.14238, 2023. 6, 15, 16, 17, 18, 19, 20, 21
[12] Wei-Lin Chiang, Zhuohan Li, Zi Lin, Ying Sheng, Zhanghao Wu, Hao Zhang, Lianmin Zheng, Siyuan Zhuang, Yonghao Zhuang, Joseph E. Gonzalez, Ion Stoica, and Eric P. Xing. Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality, 2023. 3, 5, 6, 15, 16, 17, 18, 19, 20, 21
[13] Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, et al. Palm: Scaling language modeling with pathways. arXiv preprint arXiv:2204.02311, 2022. 1
[14] Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, et al. Scaling instruction-finetuned language models. arXiv preprint arXiv:2210.11416, 2022. 3, 5, 6, 15, 16, 17, 18, 19, 20, 21
[15] Chenhang Cui, Yiyang Zhou, Xinyu Yang, Shirley Wu, Linjun Zhang, James Zou, and Huaxiu Yao. Holistic analysis of hallucination in gpt-4v (ision): Bias and interference challenges. arXiv preprint arXiv:2311.03287, 2023. 3, 8
[16] Wenliang Dai, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Junqi Zhao, Weisheng Wang, Boyang Li, Pascale Fung, and Steven Hoi. Instructblip: Towards general-purpose vision-language models with instruction tuning. arXiv preprint arXiv:2305.06500, 2023. 2, 3, 5, 6, 7, 15, 16, 17, 18, 19, 20, 21
[17] Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Bin Wang, Linke Ouyang, Xilin Wei, Songyang Zhang, Haodong Duan, Maosong Cao, et al. Internlm-xcomposer2: Mastering free-form text-image composition and comprehension in vision-language large model. arXiv preprint arXiv:2401.16420, 2024. 6, 15, 16, 17, 18, 19, 20, 21
[18] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations, 2021. 3
[19] Adept Fuyu Team. Adept fuyu-heavy: A new multimodal model. https://www.adept.ai/blog/adept-fuyu-heavy, 2024. 15, 16, 17, 18, 19, 20, 21
[20] Peng Gao, Jiaming Han, Renrui Zhang, Ziyi Lin, Shijie Geng, Aojun Zhou, Wei Zhang, Pan Lu, Conghui He, Xiangyu Yue, et al. Llama-adapter v2: Parameter-efficient visual instruction model. arXiv preprint arXiv:2304.15010, 2023. 3, 5
[21] Yingqiang Ge, Wenyue Hua, Jianchao Ji, Juntao Tan, Shuyuan Xu, and Yongfeng Zhang. Openagi: When llm meets domain experts. arXiv preprint arXiv:2304.04370, 2023. 1
[22] Google Gemini Team. Gemini: A family of highly capable multimodal models. https://storage.googleapis.com/deepmind-media/gemini/gemini_1_report.pdf, 2023. 15, 16, 17, 18, 19, 20, 21, 119
[23] Google Gemini Team. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf, 2024. 6, 15, 119
[24] Yash Goyal, Tejas Khot, Douglas Summers-Stay, Dhruv Batra, and Devi Parikh. Making the v in vqa matter: Elevating the role of image understanding in visual question answering. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 6904–6913, 2017. 2, 3
[25] Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt. Measuring massive multitask language understanding. In International Conference on Learning Representations, 2020. 2
[26] Yupan Huang, Zaiqiao Meng, Fangyu Liu, Yixuan Su, Collier Nigel, and Yutong Lu. Sparkles: Unlocking chats across multiple images for multimodal instruction-following models. arXiv preprint arXiv:2308.16463, 2023. 3
[27] Drew A Hudson and Christopher D Manning. Gqa: A new dataset for real-world visual reasoning and compositional question answering. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6700–6709, 2019. 3
[28] HyperGAI. Revolutionizing the future with hypergenerative ai. 2024. 15, 16, 17, 18, 19, 20, 21
[29] Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc Le, Yun-Hsuan Sung, Zhen Li, and Tom Duerig. Scaling up visual and vision-language representation learning with noisy text supervision. In International conference on machine learning, pages 4904–4916. PMLR, 2021. 3
[30] Sahar Kazemzadeh, Vicente Ordonez, Mark Matten, and Tamara Berg. Referitgame: Referring to objects in photographs of natural scenes. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pages 787–798, 2014. 2
[31] Kunlun. Agi and aigc business skywork. 2024. 15, 16, 17, 18, 19, 20, 21
[32] Ehsan Latif, Gengchen Mai, Matthew Nyaaba, Xuansheng Wu, Ninghao Liu, Guoyu Lu, Sheng Li, Tianming Liu, and Xiaoming Zhai. Artificial general intelligence (agi) for education. arXiv preprint arXiv:2304.12479, 2023. 1
[33] Bohao Li, Rui Wang, Guangzhi Wang, Yuying Ge, Yixiao Ge, and Ying Shan. Seed-bench: Benchmarking multimodal llms with generative comprehension. arXiv preprint arXiv:2307.16125, 2023. 2, 3
[34] Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Jingkang Yang, and Ziwei Liu. Otter: A multi-modal model with in-context instruction tuning. arXiv preprint arXiv:2305.03726, 2023. 3, 5, 15, 16, 17, 18, 19, 20, 21
[35] Junnan Li, Dongxu Li, Silvio Savarese, and Steven Hoi. Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. International Conference on Machine Learning, 2023. 2, 3, 5, 6, 7, 15, 16, 17, 18, 19, 20, 21
[36] Lei Li, Yuwei Yin, Shicheng Li, Liang Chen, Peiyi Wang, Shuhuai Ren, Mukai Li, Yazheng Yang, Jingjing Xu, Xu Sun, et al. M3it: A large-scale dataset towards multi-modal multilingual instruction tuning. arXiv preprint arXiv:2306.04387, 2023. 3
[37] Xiujun Li, Xi Yin, Chunyuan Li, Pengchuan Zhang, Xiaowei Hu, Lei Zhang, Lijuan Wang, Houdong Hu, Li Dong, Furu Wei, et al. Oscar: Object-semantics aligned pre-training for vision-language tasks. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXX 16, pages 121–137. Springer, 2020. 3
[38] Yifan Li, Yifan Du, Kun Zhou, Jinpeng Wang, Wayne Xin Zhao, and Ji-Rong Wen. Evaluating object hallucination in large vision-language models. arXiv preprint arXiv:2305.10355, 2023. 3
[39] Ji Lin, Hongxu Yin, Wei Ping, Yao Lu, Pavlo Molchanov, Andrew Tao, Huizi Mao, Jan Kautz, Mohammad Shoeybi, and Song Han. Vila: On pre-training for visual language models. arXiv preprint arXiv:2312.07533, 2023. 6, 15, 16, 17, 18, 19, 20, 21
[40] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13, pages 740–755. Springer, 2014. 2, 3
[41] Ziyi Lin, Chris Liu, Renrui Zhang, Peng Gao, Longtian Qiu, Han Xiao, Han Qiu, Chen Lin, Wenqi Shao, Keqin Chen, et al. Sphinx: The joint mixing of weights, tasks, and visual embeddings for multi-modal large language models. arXiv preprint arXiv:2311.07575, 2023. 15, 16, 17, 18, 19, 20, 21
[42] Fuxiao Liu, Tianrui Guan, Zongxia Li, Lichang Chen, Yaser Yacoob, Dinesh Manocha, and Tianyi Zhou. Hallusionbench: You see what you think? or you think what you see? an image-context reasoning benchmark challenging for gpt-4v (ision), llava-1.5, and other multi-modality models. arXiv preprint arXiv:2310.14566, 2023. 3
[43] Fuxiao Liu, Kevin Lin, Linjie Li, Jianfeng Wang, Yaser Yacoob, and Lijuan Wang. Aligning large multi-modal model with robust instruction tuning. arXiv preprint arXiv:2306.14565, 2023. 3
[44] Haotian Liu, Chunyuan Li, Yuheng Li, and Yong Jae Lee. Improved baselines with visual instruction tuning. arXiv preprint arXiv:2310.03744, 2023. 2, 3, 5, 7, 15, 16, 17, 18, 19, 20, 21
[45] Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. Visual instruction tuning. arXiv preprint arXiv:2304.08485, 2023. 3
[46] Haotian Liu, Chunyuan Li, Yuheng Li, Bo Li, Yuanhan Zhang, Sheng Shen, and Yong Jae Lee. Llava-next: Improved reasoning, ocr, and world knowledge. 2024. 6, 15, 16, 17, 18, 19, 20, 21
[47] Yuan Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, et al. Mmbench: Is your multi-modal model an all-around player? arXiv preprint arXiv:2307.06281, 2023. 2, 3
[48] Yuliang Liu, Zhang Li, Hongliang Li, Wenwen Yu, Mingxin Huang, Dezhi Peng, Mingyu Liu, Mingrui Chen, Chunyuan Li, Lianwen Jin, et al. On the hidden mystery of ocr in large multimodal models. arXiv preprint arXiv:2305.07895, 2023. 3
[49] Jiasen Lu, Dhruv Batra, Devi Parikh, and Stefan Lee. Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. Advances in neural information processing systems, 32, 2019. 3
[50] Pan Lu, Swaroop Mishra, Tanglin Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, and Ashwin Kalyan. Learn to explain: Multimodal reasoning via thought chains for science question answering. Advances in Neural Information Processing Systems, 35:2507–2521, 2022. 2
[51] Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, and Jianfeng Gao. Mathvista: Evaluating mathematical reasoning of foundation models in visual contexts. arXiv preprint arXiv:2310.02255, 2023. 3
[52] Kenneth Marino, Mohammad Rastegari, Ali Farhadi, and Roozbeh Mottaghi. Ok-vqa: A visual question answering benchmark requiring external knowledge. In Conference on Computer Vision and Pattern Recognition (CVPR), 2019. 3
[53] Grégoire Mialon, Clémentine Fourrier, Craig Swift, Thomas Wolf, Yann LeCun, and Thomas Scialom. Gaia: a benchmark for general ai assistants. arXiv preprint arXiv:2311.12983, 2023. 1, 3
[54] MiniCPM. Minicpm-v. https://github.com/OpenBMB/MiniCPM, 2024. GitHub Repository. 15, 16, 17, 18, 19, 20, 21
[55] MiniCPM. Minicpm-v-2, 2024. 15, 16, 17, 18, 19, 20, 21
[56] Masoud Monajatipoor, Liunian Harold Li, Mozhdeh Rouhsedaghat, Lin F Yang, and Kai-Wei Chang. Metavl: Transferring in-context learning ability from language models to vision-language models. arXiv preprint arXiv:2306.01311, 2023. 3
[57] Meredith Ringel Morris, Jascha Sohl-dickstein, Noah Fiedel, Tris Warkentin, Allan Dafoe, Aleksandra Faust, Clement Farabet, and Shane Legg. Levels of agi: Operationalizing progress on the path to agi. arXiv preprint arXiv:2311.02462, 2023. 1, 3, 8
[58] OminiLMM. Ominilmm-12b. https://github.com/OpenBMB/OmniLMM, 2024. GitHub Repository. 15, 16, 17, 18, 19, 20, 21
[59] OpenAI. Gpt-4 technical report. arXiv preprint arXiv:2303.08774, 2023. 1, 6, 15, 16, 17, 18, 19, 20, 21
[60] OpenAI. Gpt-4v(ision) system card, 2023. 2, 6, 7, 15, 16, 17, 18, 19, 20, 21
[61] OpenAI. Gpt-4o. 2024. 6, 15, 119
[62] Aitor Ormazabal, Che Zheng, Cyprien de Masson d’Autume, Dani Yogatama, Deyu Fu, Donovan Ong, et al. Reka core, flash, and edge: A series of powerful multimodal language models. https://publications.reka.ai/reka-core-tech-report.pdf, 2024. 15, 16, 17, 18, 19, 20, 21, 119
[63] Zhiliang Peng, Wenhui Wang, Li Dong, Yaru Hao, Shaohan Huang, Shuming Ma, and Furu Wei. Kosmos-2: Grounding multimodal large language models to the world. arXiv preprint arXiv:2306.14824, 2023. 5, 6, 15, 16, 17, 18, 19, 20, 21
[64] Qwen. Qwen-vl-plus. https://github.com/QwenLM/Qwen-VL?tab=readme-ov-file#qwen-vl-plus, 2023. GitHub Repository. 15, 16, 17, 18, 19, 20, 21
[65] Qwen. Qwen-vl-max. https://github.com/QwenLM/Qwen-VL?tab=readme-ov-file#qwen-vl-max, 2024. GitHub Repository. 6, 15, 16, 17, 18, 19, 20, 21
[66] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021. 3, 5
[67] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. Advances in neural information processing systems, 28, 2015. 3
[68] sensenova. Sensechat-vision, 2024. 6, 15, 16, 17, 18, 19, 20, 21
[69] Amanpreet Singh, Vivek Natarjan, Meet Shah, Yu Jiang, Xinlei Chen, Devi Parikh, and Marcus Rohrbach. Towards vqa models that can read. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8317–8326, 2019. 2
[70] Quan Sun, Yufeng Cui, Xiaosong Zhang, Fan Zhang, Qiying Yu, Zhengxiong Luo, Yueze Wang, Yongming Rao, Jingjing Liu, Tiejun Huang, et al. Generative multimodal models are in-context learners. arXiv preprint arXiv:2312.13286, 2023. 15, 16, 17, 18, 19, 20, 21
[71] Hao Tan and Mohit Bansal. Lxmert: Learning cross-modality encoder representations from transformers. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5100–5111, 2019. 3
[72] Claude Team. Introducing the next generation of claude. https://www.anthropic.com/news/claude-3-family, 2024. 6, 15, 119
[73] InfiMM Team. Infimm: Advancing multimodal understanding from flamingo’s legacy through diverse llm integration, 2024. 15, 16, 17, 18, 19, 20, 21
[74] Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023. 1, 5
[75] Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023. 6, 15, 16, 17, 18, 19, 20, 21
[76] Junyang Wang, Yiyang Zhou, Guohai Xu, Pengcheng Shi, Chenlin Zhao, Haiyang Xu, Qinghao Ye, Ming Yan, Ji Zhang, Jihua Zhu, et al. Evaluation and analysis of hallucination in large vision-language models. arXiv preprint arXiv:2308.15126, 2023. 3
[77] Weihan Wang, Qingsong Lv, Wenmeng Yu, Wenyi Hong, Ji Qi, Yan Wang, Junhui Ji, Zhuoyi Yang, Lei Zhao, Xixuan Song, et al. Cogvlm: Visual expert for pretrained language models. arXiv preprint arXiv:2311.03079, 2023. 2, 5, 6, 15, 16, 17, 18, 19, 20, 21
[78] Zirui Wang, Jiahui Yu, Adams Wei Yu, Zihang Dai, Yulia Tsvetkov, and Yuan Cao. Simvlm: Simple visual language model pretraining with weak supervision. In International Conference on Learning Representations, 2021. 3
[79] Peng Xu, Wenqi Shao, Kaipeng Zhang, Peng Gao, Shuo Liu, Meng Lei, Fanqing Meng, Siyuan Huang, Yu Qiao, and Ping Luo. Lvlm-ehub: A comprehensive evaluation benchmark for large vision-language models. arXiv preprint arXiv:2306.09265, 2023. 3
[80] Zhengyuan Yang, Linjie Li, Kevin Lin, Jianfeng Wang, Chung-Ching Lin, Zicheng Liu, and Lijuan Wang. The dawn of lmms: Preliminary explorations with gpt-4v (ision). arXiv preprint arXiv:2309.17421, 2023. 2
[81] Qinghao Ye, Haiyang Xu, Guohai Xu, Jiabo Ye, Ming Yan, Yiyang Zhou, Junyang Wang, Anwen Hu, Pengcheng Shi, Yaya Shi, et al. mplug-owl: Modularization empowers large language models with multimodality. arXiv preprint arXiv:2304.14178, 2023. 3
[82] Qinghao Ye, Haiyang Xu, Jiabo Ye, Ming Yan, Haowei Liu, Qi Qian, Ji Zhang, Fei Huang, and Jingren Zhou. mplug-owl2: Revolutionizing multi-modal large language model with modality collaboration. arXiv preprint arXiv:2311.04257, 2023. 3, 5, 15, 16, 17, 18, 19, 20, 21
[83] Zhenfei Yin, Jiong Wang, Jianjian Cao, Zhelun Shi, Dingning Liu, Mukai Li, Lu Sheng, Lei Bai, Xiaoshui Huang, Zhiyong Wang, et al. Lamm: Language-assisted multi-modal instruction-tuning dataset, framework, and benchmark. arXiv preprint arXiv:2306.06687, 2023. 2, 3
[84] Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu, Jianqun Chen, Jing Chang, et al. Yi: Open foundation models by 01.ai. arXiv preprint arXiv:2403.04652, 2024. 6, 15, 16, 17, 18, 19, 20, 21
[85] Jiahui Yu, Zirui Wang, Vijay Vasudevan, Legg Yeung, Mojtaba Seyedhosseini, and Yonghui Wu. Coca: Contrastive captioners are image-text foundation models. TMLR, 2022. 3
[86] Weihao Yu, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Kevin Lin, Zicheng Liu, Xinchao Wang, and Lijuan Wang. Mm-vet: Evaluating large multimodal models for integrated capabilities. arXiv preprint arXiv:2308.02490, 2023. 2, 3
[87] Pengchuan Zhang, Xiujun Li, Xiaowei Hu, Jianwei Yang, Lei Zhang, Lijuan Wang, Yejin Choi, and Jianfeng Gao. Vinvl: Revisiting visual representations in vision-language models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5579–5588, 2021. 3
[88] Renrui Zhang, Jiaming Han, Aojun Zhou, Xiangfei Hu, Shilin Yan, Pan Lu, Hongsheng Li, Peng Gao, and Yu Qiao. Llama-adapter: Efficient fine-tuning of language models with zero-init attention. arXiv preprint arXiv:2303.16199, 2023. 3, 6, 15, 16, 17, 18, 19, 20, 21
[89] Bo Zhao, Boya Wu, and Tiejun Huang. Svit: Scaling up visual instruction tuning. arXiv preprint arXiv:2307.04087, 2023. 3, 15, 16, 17, 18, 19, 20, 21

term list
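
A list like the one in this section could be produced roughly as follows. This is a minimal sketch under stated assumptions, not the script actually used for this article: the function name term_counts, the letters-only tokenization rule, and the sample sentence are all hypothetical. Sorting by descending count and then alphabetically matches the order of the table.

```python
import re
from collections import Counter


def term_counts(text):
    """Return (word, count) pairs sorted by count (descending), then word."""
    # Assumption: a "word" is any run of ASCII letters after lowercasing,
    # so digits and punctuation act as separators.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample text, only for illustration.
sample = "The image and the table. GPT-4V reads the image."
for no, (word, count) in enumerate(term_counts(sample), start=1):
    print(f"{no}\t{word}\t{count}")
```

Under this tokenization rule, a name such as GPT-4V is split into gpt and v, which may be one reason single letters show up as separate entries in a list of this kind.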

no	word	count
1	the	523
2	and	459
3	of	455
4	a	442
5	to	379
6	b	320
7	image	277
8	in	258
9	v	195
10	back	172
11	error	168
12	c	164
13	d	145
14	gpt	132
15	models	125
16	table	125
17	figure	124
18	is	119
19	arxiv	101
20	case	101
21	for	101
22	figures	96
23	on	95
24	sample	94
25	subfield	93
26	index	88
27	list	87
28	groundtruth	86
29	option	86
30	vl	82
31	t	81
32	as	80
33	we	80
34	with	76
35	reasoning	74
36	perceptual	66
37	s	64
38	category	62
39	mmmu	62
40	from	59
41	m	57
42	text	57
43	language	56
44	questions	56
45	expert	54
46	art	52
47	multimodal	52
48	correct	51
49	e	51
50	visual	51
51	knowledge	50
52	preprint	50
53	that	48
54	however	46
55	results	46
56	are	45
57	data	45
58	errorcategory	45
59	design	44
60	li	44
61	this	44
62	vision	43
63	by	42
64	llava	42
65	model	42
66	engineering	41
67	it	41
68	or	41
69	science	40
70	so	40
71	medicine	39
72	open	38
73	correctcase	37
74	benchmark	36
75	errorerrorreason	36
76	large	36
77	qwen	36
78	be	34
79	chat	34
80	history	34
81	wang	34
82	an	33
83	choice	33
84	different	33
85	lack	33
86	question	33
87	zhang	33
88	h	32
89	lmms	32
90	flan	31
91	xxl	31
92	chemistry	30
93	performance	30
94	types	30
95	r	29
96	these	29
97	best	28
98	gemini	28
99	images	27
100	liu	27
101	test	27
102	validation	27
103	perceptualerror	26
104	therefore	26
105	understanding	26
106	where	26
107	al	25
108	et	25
109	fuyu	25
110	ocr	25
111	shot	25
112	subjects	25
113	each	24
114	k	24
115	level	24
116	llama	24
117	music	24
118	only	24
119	blip	23
120	health	23
121	such	23
122	clinical	22
123	input	22
124	overall	22
125	reka	22
126	set	22
127	source	22
128	but	21
129	chen	21
130	our	21
131	vicuna	21
132	yi	21
133	agi	20
134	biology	20
135	both	20
136	computer	20
137	geography	20
138	has	20
139	instructblip	20
140	instruction	20
141	minicpm	20
142	mm	20
143	provided	20
144	subject	20
145	authors	19
146	benchmarks	19
147	can	19
148	conference	19
149	i	19
150	learning	19
151	llms	19
152	psychology	19
153	tasks	19
154	theory	19
155	answer	18
156	like	18
157	materials	18
158	multi	18
159	multiple	18
160	plus	18
161	should	18
162	which	18
163	appendix	17
164	caption	17
165	errors	17
166	internvl	17
167	math	17
168	th	17
169	thus	17
170	x	17
171	yu	17
172	adept	16
173	context	16
174	disciplines	16
175	economics	16
176	given	16
177	have	16
178	literature	16
179	more	16
180	paper	16
181	rad	16
182	reasoningerror	16
183	social	16
184	agriculture	15
185	all	15
186	analysis	15
187	college	15
188	dataset	15
189	human	15
190	its	15
191	let	15
192	lin	15
193	marco	15
194	openflamingo	15
195	other	15
196	physiology	15
197	specific	15
198	u	15
199	while	15
200	also	14
201	at	14
202	business	14
203	difficulty	14
204	domain	14
205	evaluation	14
206	foundation	14
207	information	14
208	ision	14
209	ln	14
210	not	14
211	otter	14
212	p	14
213	pro	14
214	their	14
215	tuning	14
216	xl	14
217	accounting	13
218	across	13
219	added	13
220	annotators	13
221	arrow	13
222	basic	13
223	between	13
224	challenges	13
225	electronics	13
226	github	13
227	hpt	13
228	https	13
229	management	13
230	marketing	13
231	nation	13
232	now	13
233	q	13
234	research	13
235	second	13
236	thecorrectansweris	13
237	there	13
238	towards	13
239	adapter	12
240	existing	12
241	few	12
242	finance	12
243	general	12
244	knowledgeerrorreason	12
245	lackofknowledge	12
246	lu	12
247	medium	12
248	norepinephrine	12
249	options	12
250	owl	12
251	pages	12
252	pathology	12
253	pharmacy	12
254	processing	12
255	random	12
256	shows	12
257	significant	12
258	sociology	12
259	st	12
260	tech	12
261	various	12
262	vqa	12
263	will	12
264	yang	12
265	annotation	11
266	based	11
267	been	11
268	bunny	11
269	challenging	11
270	cogvlm	11
271	crepe	11
272	diagnostics	11
273	diverse	11
274	frequent	11
275	heavy	11
276	humanities	11
277	kosmos	11
278	lens	11
279	max	11
280	mechanical	11
281	mechanics	11
282	medical	11
283	minigpt	11
284	mplug	11
285	no	11
286	report	11
287	subfields	11
288	total	11
289	training	11
290	type	11
291	ultra	11
292	wei	11
293	would	11
294	zhou	11
295	advanced	10
296	ai	10
297	animal	10
298	ca	10
299	cl	10
300	core	10
301	f	10
302	fine	10
303	flash	10
304	huang	10
305	infimm	10
306	internlm	10
307	laboratory	10
308	light	10
309	manage	10
310	modal	10
311	perception	10
312	performing	10
313	physics	10
314	pressure	10
315	problems	10
316	quality	10
317	representation	10
318	sensechat	10
319	sphinx	10
320	study	10
321	svit	10
322	team	10
323	textbooks	10
324	up	10
325	vila	10
326	xcomposer	10
327	abilities	9
328	annotated	9
329	architecture	9
330	author	9
331	claude	9
332	com	9
333	complex	9
334	control	9
335	example	9
336	experts	9
337	financial	9
338	first	9
339	if	9
340	interleaved	9
341	international	9
342	landsat	9
343	omnilmm	9
344	physical	9
345	point	9
346	preview	9
347	proceedings	9
348	process	9
349	project	9
350	public	9
351	recognition	9
352	see	9
353	selected	9
354	shares	9
355	sharesat	9
356	skywork	9
357	step	9
358	still	9
359	structure	9
360	sun	9
361	textual	9
362	through	9
363	tyramine	9
364	without	9
365	world	9
366	xu	9
367	you	9
368	yue	9
369	achieve	8
370	additionally	8
371	advantage	8
372	answers	8
373	any	8
374	bold	8
375	capabilities	8
376	cm	8
377	collection	8
378	comparative	8
379	contributed	8
380	cost	8
381	depth	8
382	detailed	8
383	diagrams	8
384	discipline	8
385	easy	8
386	edge	8
387	emu	8
388	evaluating	8
389	format	8
390	g	8
391	gao	8
392	historyquestion	8
393	including	8
394	intelligence	8
395	into	8
396	json	8
397	kai	8
398	kj	8
399	most	8
400	nh	8
401	one	8
402	peng	8
403	plant	8
404	playground	8
405	power	8
406	representations	8
407	significantly	8
408	systems	8
409	underlined	8
410	when	8
411	worst	8
412	wu	8
413	xiang	8
414	zephyr	8
415	zhao	8
416	accuracy	7
417	air	7
418	airbase	7
419	approximately	7
420	aqua	7
421	artificial	7
422	computerscience	7
423	distribution	7
424	drug	7
425	due	7
426	effect	7
427	energy	7
428	explanation	7
429	further	7
430	inorganic	7
431	ji	7
432	lei	7
433	less	7
434	market	7
435	mmhg	7
436	natural	7
437	next	7
438	pe	7
439	pre	7
440	production	7
441	response	7
442	se	7
443	sources	7
444	su	7
445	theansweris	7
446	transaction	7
447	understand	7
448	university	7
449	veh	7
450	very	7
451	y	7
452	year	7
453	zheng	7
454	among	6
455	bd	6
456	breadth	6
457	broad	6
458	captioning	6
459	cause	6
460	ch	6
461	chang	6
462	chemical	6
463	chunyuan	6
464	comprehensive	6
465	consider	6
466	direction	6
467	ended	6
468	european	6
469	evaluate	6
470	examples	6
471	file	6
472	fixed	6
473	flamingo	6
474	fluid	6
475	following	6
476	genetics	6
477	han	6
478	hard	6
479	hu	6
480	interpretation	6
481	j	6
482	jianfeng	6
483	key	6
484	leading	6
485	lee	6
486	massive	6
487	mathematical	6
488	mc	6
489	minimal	6
490	n	6
491	ni	6
492	oc	6
493	organic	6
494	over	6
495	painting	6
496	pattern	6
497	perform	6
498	relatively	6
499	scaling	6
500	terra	6
501	thescaleofthephotographsis	6
502	trade	6
503	two	6
504	use	6
505	weget	6
506	well	6
507	were	6
508	what	6
509	ye	6
510	yin	6
511	zhu	6
512	adults	5
513	advances	5
514	after	5
515	answering	5
516	available	5
517	baselines	5
518	because	5
519	besides	5
520	blood	5
521	calculus	5
522	carbon	5
523	charts	5
524	clinicalmedicine	5
525	collected	5
526	collecting	5
527	common	5
528	conducted	5
529	cover	5
530	cpu	5
531	critical	5
532	curation	5
533	current	5
534	designed	5
535	double	5
536	dynamics	5
537	efficient	5
538	epidemiology	5
539	experiments	5
540	fields	5
541	find	5
542	focus	5
543	forstate	5
544	fp	5
545	ge	5
546	graph	5
547	here	5
548	hox	5
549	improvement	5
550	instances	5
551	instead	5
552	levels	5
553	lijuan	5
554	ma	5
555	made	5
556	main	5
557	major	5
558	mi	5
559	microbiology	5
560	might	5
561	modality	5
562	nature	5
563	often	5
564	paintings	5
565	pan	5
566	part	5
567	per	5
568	pharmacology	5
569	phenelzine	5
570	photographs	5
571	principle	5
572	progress	5
573	protocol	5
574	range	5
575	reject	5
576	repository	5
577	require	5
578	requires	5
579	root	5
580	scale	5
581	secondary	5
582	section	5
583	sheng	5
584	signal	5
585	since	5
586	single	5
587	six	5
588	skilled	5
589	state	5
590	statistics	5
591	students	5
592	synthesis	5
593	system	5
594	tables	5
595	three	5
596	units	5
597	updated	5
598	within	5
599	yan	5
600	yes	5
601	ab	4
602	absolute	4
603	accurate	4
604	anatomy	4
605	ande	4
606	approach	4
607	arts	4
608	assess	4
609	atom	4
610	average	4
611	axis	4
612	bandc	4
613	barrel	4
614	batra	4
615	biochemistry	4
616	block	4
617	bo	4
618	bothtimeseriesaremeanstationary	4
619	breakdown	4
620	calculations	4
621	cao	4
622	cardiovascular	4
623	change	4
624	chemistryquestion	4
625	civil	4
626	co	4
627	cocaine	4
628	comparison	4
629	compiler	4
630	complexity	4
631	considerations	4
632	contamination	4
633	contemporary	4
634	contribution	4
635	converging	4
636	copyright	4
637	corporate	4
638	correctly	4
639	covering	4
640	dec	4
641	deliberate	4
642	derive	4
643	devi	4
644	diagnosticsandlabmedicine	4
645	directlabor	4
646	eliminate	4
647	embedding	4
648	end	4
649	eng	4
650	eric	4
651	even	4
652	exams	4
653	finishes	4
654	follow	4
655	foot	4
656	force	4
657	fpr	4
658	gap	4
659	generation	4
660	geometric	4
661	hallucination	4
662	handling	4
663	hao	4
664	heterogeneous	4
665	highly	4
666	huan	4
667	identify	4
668	ieee	4
669	indexvalueatt	4
670	introduces	4
671	joint	4
672	kevin	4
673	law	4
674	lead	4
675	liang	4
676	linjie	4
677	llm	4
678	long	4
679	machine	4
680	macroeconomics	4
681	mapping	4
682	mask	4
683	means	4
684	meng	4
685	ming	4
686	minor	4
687	mri	4
688	must	4
689	necessary	4
690	neural	4
691	neuropathology	4
692	new	4
693	non	4
694	number	4
695	o	4
696	objects	4
697	ofhand	4
698	openai	4
699	optical	4
700	outputs	4
701	parallel	4
702	parikh	4
703	path	4
704	perfect	4
705	personyears	4
706	peter	4
707	pharmaceutical	4
708	photo	4
709	physiologyquestion	4
710	present	4
711	presents	4
712	pretraining	4
713	probability	4
714	prompt	4
715	provide	4
716	qwenvl	4
717	radiology	4
718	regions	4
719	resources	4
720	role	4
721	samples	4
722	sci	4
723	selection	4
724	shi	4
725	show	4
726	shown	4
727	skills	4
728	solution	4
729	some	4
730	song	4
731	specialized	4
732	stage	4
733	starts	4
734	statement	4
735	states	4
736	strong	4
737	structures	4
738	surveying	4
739	tail	4
740	than	4
741	thecorrectoptionis	4
742	thedouble	4
743	thefatheriscomparedtoagypsy	4
744	thegeographicextentofthemonetizationofeurasianeconomies	4
745	theorderisfromlefttoright	4
746	thermodynamics	4
747	thespeedofaishalfthatofb	4
748	they	4
749	tianyu	4
750	tn	4
751	toptobottom	4
752	tuned	4
753	using	4
754	vaultedroofing	4
755	velocity	4
756	vet	4
757	wenhu	4
758	whichofthefollowingisthemostlikelydiagnosis	4
759	whorl	4
760	work	4
761	yuan	4
762	yuansheng	4
763	zero	4
764	ability	3
765	achieves	3
766	add	3
767	addition	3
768	adhere	3
769	adherence	3
770	administration	3
771	adopted	3
772	adult	3
773	aligning	3
774	andc	3
775	andstate	3
776	aob	3
777	aorta	3
778	architectureandengineering	3
779	area	3
780	areproductionzonesofprovenoilreserves	3
781	around	3
782	arttheory	3
783	aspect	3
784	atestoftwoindependentmeans	3
785	att	3
786	attention	3
787	au	3
788	augmented	3
789	bai	3
790	beam	3
791	being	3
792	believe	3
793	better	3
794	beyond	3
795	biological	3
796	bod	3
797	body	3
798	bothmovetostate	3
799	buttheydifferoninput	3
800	capability	3
801	capable	3
802	carefully	3
803	categories	3
804	cbt	3
805	cell	3
806	changes	3
807	chao	3
808	character	3
809	child	3
810	choosingthematchingterm	3
811	chung	3
812	classical	3
813	clef	3
814	clip	3
815	cod	3
816	comics	3
817	conclusion	3
818	consistent	3
819	contain	3
820	contains	3
821	contributions	3
822	crepes	3
823	criticism	3
824	ctr	3
825	cui	3
826	dandy	3
827	deep	3
828	demonstrate	3
829	details	3
830	determine	3
831	develop	3
832	development	3
833	dhruv	3
834	diagram	3
835	directmaterials	3
836	discussions	3
837	disease	3
838	do	3
839	dong	3
840	drama	3
841	drawing	3
842	duan	3
843	econometrics	3
844	effects	3
845	egoism	3
846	electrocardiography	3
847	elements	3
848	empirical	3
849	employed	3
850	encoder	3
851	engineeringquestion	3
852	ensure	3
853	epidemiologyquestion	3
854	essential	3
855	ethical	3
856	evident	3
857	evolution	3
858	exchange	3
859	expertise	3
860	expressions	3
861	fails	3
862	falsepositives	3
863	family	3
864	features	3
865	findings	3
866	formats	3
867	four	3
868	fromthetable	3
869	fundamental	3
870	future	3
871	generative	3
872	genes	3
873	geographyquestion	3
874	geometry	3
875	go	3
876	goal	3
877	gqa	3
878	graphic	3
879	grounding	3
880	haiyang	3
881	haotian	3
882	he	3
883	head	3
884	heart	3
885	height	3
886	higher	3
887	how	3
888	iii	3
889	illustrated	3
890	imperialist	3
891	importance	3
892	improved	3
893	incorrect	3
894	increase	3
895	indicating	3
896	industrial	3
897	inputs	3
898	insights	3
899	insteadof	3
900	instructions	3
901	introduce	3
902	investment	3
903	involve	3
904	isms	3
905	itislikelytobesn	3
906	jae	3
907	jingjing	3
908	jun	3
909	junyang	3
910	justinian	3
911	kpa	3
912	layer	3
913	led	3
914	length	3
915	licensing	3
916	limitations	3
917	line	3
918	linear	3
919	literaturequestion	3
920	lower	3
921	making	3
922	many	3
923	maoi	3
924	maturenewborn	3
925	may	3
926	measure	3
927	mechanicalengineering	3
928	meet	3
929	meticulously	3
930	mmbench	3
931	modalities	3
932	modern	3
933	months	3
934	musicquestion	3
935	mutant	3
936	name	3
937	naming	3
938	need	3
939	needs	3
940	netincome	3
941	notable	3
942	note	3
943	object	3
944	oninput	3
945	online	3
946	ophthalmic	3
947	opportunity	3
948	optics	3
949	opus	3
950	out	3
951	overhead	3
952	overheadrate	3
953	oxygen	3
954	pathologyquestion	3
955	pdf	3
956	phentolamine	3
957	photography	3
958	possible	3
959	posterior	3
960	prior	3
961	proprietary	3
962	publichealth	3
963	qinghao	3
964	qiu	3
965	queries	3
966	quizzes	3
967	rather	3
968	rays	3
969	redistribution	3
970	reduction	3
971	reference	3
972	region	3
973	regulations	3
974	relatedhand	3
975	ren	3
976	renrui	3
977	represents	3
978	respiratory	3
979	responses	3
980	room	3
981	rule	3
982	ruoqi	3
983	rupturedberryaneurysm	3
984	savior	3
985	scans	3
986	scienceqa	3
987	sciences	3
988	seed	3
989	seen	3
990	sense	3
991	separate	3
992	shapes	3
993	sheets	3
994	short	3
995	shortpastern	3
996	solvefor	3
997	space	3
998	standard	3
999	starting	3
1000	startsandfinisheswithoutanyinterleaving	3
1001	stateerror	3
1002	steven	3
1003	strategic	3
1004	structural	3
1005	substantial	3
1006	suggests	3
1007	sulfur	3
1008	supervision	3
1009	surgery	3
1010	symptom	3
1011	tallbacksofchairsandlampsatthecornersofdiningtables	3
1012	tan	3
1013	task	3
1014	theconfigurationatc	3
1015	them	3
1016	then	3
1017	thepainting	3
1018	thepontomedullaryjunction	3
1019	theyarenotequivalent	3
1020	third	3
1021	thomas	3
1022	those	3
1023	top	3
1024	totaldirectlabordollars	3
1025	totalfactoryoverhead	3
1026	typically	3
1027	typo	3
1028	united	3
1029	usingtheequation	3
1030	visionmayberestoredwithconcavelensandrefractivesurgery	3
1031	walkersyndrome	3
1032	way	3
1033	weighted	3
1034	wide	3
1035	writing	3
1036	xi	3
1037	xia	3
1038	xiao	3
1039	yiyang	3
1040	yong	3
1041	yuanhan	3
1042	_	2
1043	_report	2
1044	aandb	2
1045	abarbicanandbattlements	2
1046	able	2
1047	above	2
1048	abstract	2
1049	accepting	2
1050	accuracies	2
1051	accurately	2
1052	address	2
1053	adjacent	2
1054	adrenergic	2
1055	advancements	2
1056	adversarial	2
1057	advertisements	2
1058	advice	2
1059	agieval	2
1060	ahigherrooftomakeupfortheshortcolumns	2
1061	ahmed	2
1062	aim	2
1063	aims	2
1064	algebra	2
1065	algorithm	2
1066	align	2
1067	along	2
1068	alpha	2
1069	ambiguities	2
1070	amoatandcrenellations	2
1071	amothertellshersontostopwhining	2
1072	amount	2
1073	analog	2
1074	analyzed	2
1075	anda	2
1076	andcpu	2
1077	anddemocraticgovernments	2
1078	andm	2
1079	andmouthdiseaseintheplacebogroup	2
1080	andn	2
1081	andrequiresfurtherwork	2
1082	andsoon	2
1083	andthenthefirstonemightresume	2
1084	andthere	2
1085	andthisremainsunchanged	2
1086	aneurysm	2
1087	angle	2
1088	annotations	2
1089	another	2
1090	answererrorreason	2
1091	antagonist	2
1092	anterior	2
1093	anxietydisorder	2
1094	aojun	2
1095	apply	2
1096	areas	2
1097	artsquestion	2
1098	asaresult	2
1099	ascending	2
1100	assistedinsitukeratomileusis	2
1101	associated	2
1102	attachedgroupsandtheiratomicnumbers	2
1103	avoid	2
1104	avoided	2
1105	bansal	2
1106	baptiste	2
1107	bar	2
1108	basedmethod	2
1109	basedontheimageprovided	2
1110	basicmedicalscience	2
1111	beginningretainedearnings	2
1112	benchmarking	2
1113	beta	2
1114	betweensphere	2
1115	bias	2
1116	biodiversity	2
1117	biostatistics	2
1118	blocked	2
1119	blocks	2
1120	bone	2
1121	botany	2
1122	boyuan	2
1123	breast	2
1124	bridge	2
1125	brownstemrot	2
1126	butitfailedtocorrectlymaptheidstothecorrespondingillustrationsinthefigure	2
1127	cal	2
1128	calculatethemanufacturingcostperunitforproducta	2
1129	calculatethetotalmanufacturingcostforproducta	2
1130	calculatethetotaloverheadrate	2
1131	calculatethework	2
1132	calculation	2
1133	calculusquestion	2
1134	candd	2
1135	cartoon	2
1136	cartoons	2
1137	cases	2
1138	categorize	2
1139	cbtismoreeffectivethannotreatmentandmoreeffectivethanmeditation	2
1140	cbtisnotaseffectiveasmeditation	2
1141	cd	2
1142	chains	2
1143	challenge	2
1144	challenginghimtoconsiderthemultitudeofinterpretationsthepaintingrepresents	2
1145	chaotic	2
1146	cheng	2
1147	chi	2
1148	children	2
1149	chlorine	2
1150	chris	2
1151	chun	2
1152	cini	2
1153	circuit	2
1154	circulatory	2
1155	clark	2
1156	clc	2
1157	clearly	2
1158	close	2
1159	closed	2
1160	coauthors	2
1161	coca	2
1162	collaboration	2
1163	collect	2
1164	collectiveeffervescence	2
1165	commonsense	2
1166	completes	2
1167	comprehension	2
1168	comprising	2
1169	concave	2
1170	conceived	2
1171	conceptualization	2
1172	conghui	2
1173	considered	2
1174	consistency	2
1175	consistently	2
1176	consumerismandnationalidentities	2
1177	contrast	2
1178	converge	2
1179	copying	2
1180	corresponding	2
1181	correspondstoregionssuchasnortherncanadaandpartsofrussia	2
1182	cotton	2
1183	could	2
1184	covers	2
1185	creating	2
1186	cross	2
1187	crucial	2
1188	ct	2
1189	cu	2
1190	cvf	2
1191	dai	2
1192	daily	2
1193	datasets	2
1194	dawn	2
1195	decorativerhythmandrepetition	2
1196	decrease	2
1197	deeply	2
1198	deepmind	2
1199	default	2
1200	definition	2
1201	degradation	2
1202	degrees	2
1203	dehghani	2
1204	demonstrates	2
1205	dental	2
1206	depositionequilibrium	2
1207	designquestion	2
1208	designs	2
1209	despair	2
1210	detection	2
1211	determinethechangeininternalenergy	2
1212	deterministicfiniteautomaton	2
1213	dev	2
1214	developed	2
1215	diastolic	2
1216	difficult	2
1217	difficulties	2
1218	digital	2
1219	diminished	2
1220	direct	2
1221	disparity	2
1222	disred	2
1223	distributions	2
1224	dividends	2
1225	documents	2
1226	doesn	2
1227	doing	2
1228	domainis	2
1229	domains	2
1230	dongxu	2
1231	du	2
1232	duetothelackofspecificknowledgeabout	2
1233	dynamicsquestion	2
1234	eachofmass	2
1235	easyquestion	2
1236	eccv	2
1237	ecology	2
1238	effectively	2
1239	ehub	2
1240	eisblue	2
1241	electrical	2
1242	electromagnetism	2
1243	elementary	2
1244	embeddings	2
1245	energyandpower	2
1246	enhanced	2
1247	enhancements	2
1248	enhancing	2
1249	equilibrium	2
1250	ers	2
1251	etc	2
1252	evalai	2
1253	exhibit	2
1254	exists	2
1255	explanations	2
1256	explicitly	2
1257	expressed	2
1258	extensive	2
1259	external	2
1260	extraction	2
1261	faisal	2
1262	falls	2
1263	faster	2
1264	fe	2
1265	fiction	2
1266	finding	2
1267	firstly	2
1268	flawed	2
1269	focal	2
1270	focallength	2
1271	follows	2
1272	forc	2
1273	forchoice	2
1274	forexample	2
1275	foribssuffererswithoutananxietydisorder	2
1276	formula	2
1277	forsphere	2
1278	found	2
1279	framework	2
1280	free	2
1281	fromthegivenimage	2
1282	fromwhich	2
1283	frozen	2
1284	furu	2
1285	fuxiao	2
1286	gaia	2
1287	gas	2
1288	gene	2
1289	geneinamousewasreplacedwithahox	2
1290	geneticsquestion	2
1291	geotechnical	2
1292	give	2
1293	giventhis	2
1294	gives	2
1295	google	2
1296	googleapis	2
1297	goyal	2
1298	gpa	2
1299	groundheight	2
1300	group	2
1301	guardcells	2
1302	guo	2
1303	guohai	2
1304	haiku	2
1305	handpart	2
1306	haodong	2
1307	hardsubject	2
1308	heat	2
1309	help	2
1310	helping	2
1311	hence	2
1312	hg	2
1313	hidden	2
1314	hierarchical	2
1315	hierarchicalscale	2
1316	highlight	2
1317	highlights	2
1318	hoi	2
1319	holistic	2
1320	horizontal	2
1321	horror	2
1322	houdong	2
1323	hugo	2
1324	humanpapillomavirusinfection	2
1325	humans	2
1326	hydrogen	2
1327	hypertensive	2
1328	hyung	2
1329	ican	2
1330	ifahox	2
1331	immediacy	2
1332	immunology	2
1333	improve	2
1334	inaccuracies	2
1335	incidencedensity	2
1336	include	2
1337	included	2
1338	includes	2
1339	incorrectly	2
1340	increases	2
1341	indeed	2
1342	indicate	2
1343	indicates	2
1344	initial	2
1345	instance	2
1346	intercept	2
1347	interface	2
1348	interleavedprocessingoccurswhentwotransactionsareprocessedalternately	2
1349	interleaving	2
1350	internet	2
1351	interpret	2
1352	interval	2
1353	inthepoliticalcartoon	2
1354	inthepradomuseuminmadrid	2
1355	inthesecondimage	2
1356	inthestudyofkingphilipiv	2
1357	intricate	2
1358	introducing	2
1359	introduction	2
1360	involving	2
1361	io	2
1362	ipit	2
1363	ipitdisplaysastrongseasonality	2
1364	isminimal	2
1365	istheaccelerationduetogravity	2
1366	italy	2
1367	jack	2
1368	jacob	2
1369	james	2
1370	jiabo	2
1371	jiahui	2
1372	jiaming	2
1373	jiang	2
1374	jiasen	2
1375	jingren	2
1376	jointly	2
1377	junnan	2
1378	kgak	2
1379	kln	2
1380	labor	2
1381	lamm	2
1382	laser	2
1383	later	2
1384	lawrence	2
1385	le	2
1386	leaderboard	2
1387	leads	2
1388	leaving	2
1389	left	2
1390	leftventricle	2
1391	legg	2
1392	lewis	2
1393	limit	2
1394	limited	2
1395	linguistic	2
1396	llfollowthesesteps	2
1397	lmm	2
1398	log	2
1399	logic	2
1400	lone	2
1401	longer	2
1402	longpasternbone	2
1403	lookatthesituationinthe	2
1404	low	2
1405	luo	2
1406	lvlm	2
1407	lxmert	2
1408	macroeconomicsquestion	2
1409	magnitude	2
1410	managerial	2
1411	manuscript	2
1412	mao	2
1413	maps	2
1414	marks	2
1415	mathvista	2
1416	meaningthatonestarts	2
1417	meanwhile	2
1418	measuring	2
1419	mechanicsquestion	2
1420	meconiumaspirationsyndrome	2
1421	media	2
1422	medicinal	2
1423	mediumsubject	2
1424	meeting	2
1425	metavl	2
1426	methods	2
1427	mfroma	2
1428	mg	2
1429	middle	2
1430	mishra	2
1431	mitralregurgitation	2
1432	mmicl	2
1433	mmocr	2
1434	modernhistory	2
1435	module	2
1436	mohammad	2
1437	monochromatic	2
1438	moreover	2
1439	morris	2
1440	mostafa	2
1441	moving	2
1442	mukai	2
1443	multilingual	2
1444	multimodality	2
1445	nano	2
1446	narrative	2
1447	nationxhascomparativeadvantageinpaperproductionandshouldtradepapertonationyinexchangeforcrepes	2
1448	nationygivesupproducing	2
1449	network	2
1450	neurosciences	2
1451	nogpt	2
1452	noneoftheotheranswers	2
1453	normally	2
1454	notsure	2
1455	nuclear	2
1456	occur	2
1457	offered	2
1458	ohm	2
1459	ok	2
1460	ominilmm	2
1461	once	2
1462	ones	2
1463	openbmb	2
1464	operating	2
1465	opportunitycostof	2
1466	oppressor	2
1467	optic	2
1468	oscar	2
1469	others	2
1470	ov	2
1471	overview	2
1472	pairs	2
1473	panningblur	2
1474	parameter	2
1475	participated	2
1476	pathophysiology	2
1477	patientswithnon	2
1478	patterns	2
1479	pengcheng	2
1480	pengchuan	2
1481	percentile	2
1482	personality	2
1483	peutz	2
1484	phalanx	2
1485	photos	2
1486	photoscale	2
1487	phrases	2
1488	ping	2
1489	piotr	2
1490	pivotal	2
1491	played	2
1492	plots	2
1493	pmlr	2
1494	poetry	2
1495	pointe	2
1496	pointf	2
1497	posed	2
1498	poses	2
1499	potential	2
1500	presence	2
1501	price	2
1502	primarily	2
1503	primary	2
1504	principles	2
1505	priorityorder	2
1506	privacy	2
1507	processingquestion	2
1508	producing	2
1509	producta	2
1510	productasalesquantity	2
1511	productb	2
1512	productc	2
1513	prohibit	2
1514	projects	2
1515	pronounced	2
1516	propranolol	2
1517	prosperity	2
1518	providing	2
1519	psychologyquestion	2
1520	ptosisalready	2
1521	purpose	2
1522	puts	2
1523	pwave	2
1524	qi	2
1525	qiao	2
1526	qrscomplex	2
1527	qwenlm	2
1528	ran	2
1529	rapid	2
1530	rate	2
1531	rateofreturn	2
1532	ratio	2
1533	ray	2
1534	reach	2
1535	readme	2
1536	real	2
1537	reason	2
1538	recalling	2
1539	recent	2
1540	recently	2
1541	receptors	2
1542	referring	2
1543	refine	2
1544	refractive	2
1545	regular	2
1546	rejecttoanswer	2
1547	rekacore	2
1548	release	2
1549	relevant	2
1550	religion	2
1551	remedy	2
1552	reported	2
1553	represent	2
1554	represented	2
1555	representing	2
1556	repurposed	2
1557	requiring	2
1558	researchquestion	2
1559	respectively	2
1560	result	2
1561	retainedearningstobereported	2
1562	review	2
1563	revolutionizing	2
1564	right	2
1565	rightventricle	2
1566	rigorous	2
1567	robust	2
1568	robustness	2
1569	roomwithinaroom	2
1570	round	2
1571	sa	2
1572	sampled	2
1573	savingthemfrompovertyoroppressionandbringingthemtrade	2
1574	scene	2
1575	scenes	2
1576	scope	2
1577	sculpture	2
1578	sebastian	2
1579	select	2
1580	selective	2
1581	selects	2
1582	sequence	2
1583	several	2
1584	shade	2
1585	shall	2
1586	shao	2
1587	share	2
1588	sheet	2
1589	shen	2
1590	shijie	2
1591	shortcomings	2
1592	showingawillingnesstobecomparedtogreatspanishpaintersofthepast	2
1593	shuai	2
1594	simpedanceinthes	2
1595	simple	2
1596	simvlm	2
1597	sites	2
1598	siyuan	2
1599	size	2
1600	sketches	2
1601	sleg	2
1602	socialsci	2
1603	solid	2
1604	songyang	2
1605	sonnet	2
1606	sparkles	2
1607	speciesbdescendedfromspeciesa	2
1608	specifically	2
1609	springer	2
1610	square	2
1611	standardized	2
1612	standards	2
1613	startsafterafinishesandcompleteswithoutbeinginterleavedwithanyothertransaction	2
1614	statistical	2
1615	stem	2
1616	steps	2
1617	storage	2
1618	stored	2
1619	student	2
1620	subarachnoidspace	2
1621	success	2
1622	sufficient	2
1623	sustained	2
1624	synthesisquestion	2
1625	tab	2
1626	tackle	2
1627	tasked	2
1628	taxonomy	2
1629	tay	2
1630	technical	2
1631	technique	2
1632	testing	2
1633	textualunderstandingerror	2
1634	theartist	2
1635	thecabinisdepressurizedandtheoxygenmaskfallsfromtheceiling	2
1636	thecorrectcalculationshouldbe	2
1637	thedangeroflettinggoofadream	2
1638	thediffusionofculturaltraditionsalongeurasiantraderoutes	2
1639	theentranceoflightandairintothehall	2
1640	theextenttowhichgovernmenteconomicpoliciesineurasiaintheperiod	2
1641	theincidencedensity	2
1642	theinequitiesofsocieties	2
1643	thejamdensity	2
1644	themodel	2
1645	themostlikelydiagnosisis	2
1646	themousemaydevelopnoheadandtwotails	2
1647	themousemaydeveloptwoheadsandnotail	2
1648	thentheotherstartsbeforethefirstonefinishes	2
1649	theorderis	2
1650	theoryquestion	2
1651	thepatientisapost	2
1652	theperspectiveofthecartoonististhattheunitedstateshasbeenasaviortothenationsbroughtunderitscontrol	2
1653	theregionboundedbythegraphasshownabove	2
1654	thesearethecaseswherebothtestsarepositive	2
1655	thesewomenwanttheirchildrentobeeducated	2
1656	thespreadoftechnologicalinnovationsacrossregionsineurasia	2
1657	thetypeofalkylsubstituentbpresent	2
1658	thetypeofheterocyclicringcpresent	2
1659	thetypeofsubstituentaonthearomaticring	2
1660	theunitedstatesisseenasfulfillingwhichofthefollowingroles	2
1661	thevalueoftheindexis	2
1662	think	2
1663	thisconditionoftenoccursinelderlypeople	2
1664	thisisincorrect	2
1665	thisphenomenoncannotbefixedbylasik	2
1666	thisquestioncallsforknowledgerelatedtothestimulusmaterial	2
1667	thisstatementappearstobetrue	2
1668	thistumormayrepresentthemostcommontypeofintraocularneoplasm	2
1669	tiejun	2
1670	timelines	2
1671	tofind	2
1672	tofindthesteady	2
1673	took	2
1674	tool	2
1675	tot	2
1676	totalmanufacturingcostforproducta	2
1677	touvron	2
1678	train	2
1679	trainable	2
1680	trained	2
1681	transactionaoncpu	2
1682	transactionboncpu	2
1683	transformers	2
1684	trend	2
1685	truenegatives	2
1686	underscore	2
1687	underscores	2
1688	unit	2
1689	uniter	2
1690	universal	2
1691	unknown	2
1692	unlocking	2
1693	uponinspection	2
1694	uptodistinguishitfrommelanoma	2
1695	url	2
1696	used	2
1697	va	2
1698	variable	2
1699	vaultedroof	2
1700	vb	2
1701	vce	2
1702	vdoesn	2
1703	ventriculardepolarization	2
1704	version	2
1705	vfailstointerprettheimage	2
1706	vilbert	2
1707	vinvl	2
1708	visionmayberestoredwithconvexlensandrefractivesurgery	2
1709	visually	2
1710	vit	2
1711	vrecalledtherightknowledgeandmadetherightreasoning	2
1712	w	2
1713	weak	2
1714	web	2
1715	wecan	2
1716	wecandeduce	2
1717	weightedindexofthethreestocksforthefirstperiod	2
1718	weightedindexvalueat	2
1719	wenqi	2
1720	whatisthemostlikelydiagnosis	2
1721	whichisincorrect	2
1722	whichisnotexplicitlymarkedinthefigurebutisonlydescribedintext	2
1723	whichofthesepicturesshowsthereconciliationofegoismandother	2
1724	willbeactivatedandinhibittheseedlingtripleresponse	2
1725	withthehpointingtothebackground	2
1726	withtheswitchinposition	2
1727	won	2
1728	wouldn	2
1729	www	2
1730	xiaowei	2
1731	xing	2
1732	xiujun	2
1733	yacoob	2
1734	yao	2
1735	yaser	2
1736	years	2
1737	yifan	2
1738	ying	2
1739	youaretravelingonaplanewithasmallchild	2
1740	yuhang	2
1741	yuheng	2
1742	zhai	2
1743	zhe	2
1744	zhengyuan	2
1745	zhong	2
1746	zhuang	2
1747	zicheng	2
1748	zirui	2
1749	zitnick	2
1750	ziwei	2
1751	ziyi	2
1752	zou	2
Total	5,688	20,849

The totals include words that occur only once.
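As a rough illustration of how a term list like the one above and its two totals (distinct words and total occurrences) could be produced, here is a minimal sketch. It is not the procedure used for this article. The function name `term_list`, the sample sentence, and the tokenization rule (lowercase the text, then take runs of ASCII letters and underscores) are all assumptions made for the example. The sort order (descending count, then alphabetical) is chosen to match the order visible in the table.

```python
import re
from collections import Counter


def term_list(text):
    """Return (rows, distinct, occurrences) for a lowercase word-frequency list.

    rows are (rank, word, count) tuples sorted by descending count and then
    alphabetically. Words that occur only once are included in both totals.
    """
    # Assumption: a "word" is a maximal run of ASCII letters or underscores.
    words = re.findall(r"[a-z_]+", text.lower())
    counts = Counter(words)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    rows = [(rank, word, n) for rank, (word, n) in enumerate(ordered, start=1)]
    return rows, len(counts), sum(counts.values())


# Hypothetical sample text, not taken from the paper.
sample = "Model outputs and model inputs: the model answers."
rows, distinct, occurrences = term_list(sample)
for rank, word, n in rows:
    print(rank, word, n, sep="\t")
print("Total", distinct, occurrences, sep="\t")
```

With a rule like this, text whose spaces were lost during extraction stays fused into a single long token, which is one plausible source of entries such as `thecorrectoptionis` in the table.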

<This section is a work in progress. I will add to it over time.>

Qiita Calendar 2024

2024: List of Calendars I Joined or Hosted and Articles Posted, Qiita(248)
https://qiita.com/kaizen_nagoya/items/d80b8fbac2496df7827f

Analysis of the Calendars I Hosted in 2024, Qiita(254)
https://qiita.com/kaizen_nagoya/items/15807336d583076f70bc

Calendar Statistics
https://qiita.com/kaizen_nagoya/items/e315558dcea8ee3fe43e

LLM-Related Calendar 2024
https://qiita.com/kaizen_nagoya/items/c36033cf66862d5496fa

Large Language Model Related Calendar
https://qiita.com/kaizen_nagoya/items/3beb0bc3fb71e3ae6d66

Hosting the Doctoral Thesis Calendar 2024.
https://qiita.com/kaizen_nagoya/items/51601357efbcaf1057d0

Doctoral Thesis(0): List of Related Articles
https://qiita.com/kaizen_nagoya/items/8f223a760e607b705e78

List of My Own Articles

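Qiita no longer seems to display backlinks. They still appear occasionally when I view a page on a smartphone, so the feature does not seem to have been removed completely.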

Since April I have been steadily building link lists and collecting statistics in order to explain probability.
My target is the end of February 2025.

List of Lists (the directory of my directories), Qiita(100)
https://qiita.com/kaizen_nagoya/items/7eb0e006543886138f39

Hypothesis(0): List (target 100, currently 40)
https://qiita.com/kaizen_nagoya/items/f000506fe1837b3590df

Qiita(0): List of My Qiita-Related Articles
https://qiita.com/kaizen_nagoya/items/58db5fbf036b28e9dfa6

List of Errors, error(0)
https://qiita.com/kaizen_nagoya/items/48b6cbc8d68eae2c42b8

C++ Support(0)　
https://qiita.com/kaizen_nagoya/items/8720d26f762369a80514

Coding(0) Rules, C, Secure, MISRA and so on
https://qiita.com/kaizen_nagoya/items/400725644a8a0e90fbb0

List of Ethernet Articles, Ethernet(0)
https://qiita.com/kaizen_nagoya/items/88d35e99f74aefc98794

Wireshark List, wireshark(0), Ethernet(48)
https://qiita.com/kaizen_nagoya/items/fbed841f61875c4731d0

Wireless Networks (Wi-Fi) and Antennas (antenna)(0): List of Articles (118 of a target of 300)
https://qiita.com/kaizen_nagoya/items/5e5464ac2b24bd4cd001

Why Do Machine Learning with docker: List of Books and Sources, in Progress (target 100)
https://qiita.com/kaizen_nagoya/items/ddd12477544bf5ba85e2

Small Program Tweaks(0): List, 4 items
https://qiita.com/kaizen_nagoya/items/296d87ef4bfd516bc394

NLP 100 Exercises (言語処理100本ノック) with docker. Ideal for learning python. :10+12
https://qiita.com/kaizen_nagoya/items/7e7eb7c543e0c18438c4

Python(0): I Want to Organize My Articles.
https://qiita.com/kaizen_nagoya/items/088c57d70ab6904ebb53

Safety(0): Toward the Safety Engineering Symposium: 21
https://qiita.com/kaizen_nagoya/items/c5d78f3def8195cb2409

By Programmers, for Programmers: Statistics(0), Probability Programming, and What Comes After
https://qiita.com/kaizen_nagoya/items/6e9897eb641268766909

Career Change(0): List
https://qiita.com/kaizen_nagoya/items/f77520d378d33451d6fe

Professional Engineer (技術士)(0): List
https://qiita.com/kaizen_nagoya/items/ce4ccf4eb9c5600b89ea

Researchmap(0): List
https://qiita.com/kaizen_nagoya/items/506c79e562f406c4257e

Physics Articles: Top 100
https://qiita.com/kaizen_nagoya/items/66e90fe31fbe3facc6ff

Quantum(0): Computers, Quantum Mechanics
https://qiita.com/kaizen_nagoya/items/1cd954cb0eed92879fd4

Mathematics-Related Articles: 100
https://qiita.com/kaizen_nagoya/items/d8dadb49a6397e854c6d

coq(0): List
https://qiita.com/kaizen_nagoya/items/d22f9995cf2173bc3b13

Statistics(0): List
https://qiita.com/kaizen_nagoya/items/80d3b221807e53e88aba

Diagrams(0): state, sequence and timing. UML and Drawing
https://qiita.com/kaizen_nagoya/items/60440a882146aeee9e8f

Color(0): Angles for Writing 100 Articles
https://qiita.com/kaizen_nagoya/items/22331c0335ed34326b9b

Quality: List
https://qiita.com/kaizen_nagoya/items/2b99b8e9db6d94b2e971

Language and Literature Articles: 100
https://qiita.com/kaizen_nagoya/items/42d58d5ef7fb53c407d6

List of Articles on Medical-Engineering Collaboration
https://qiita.com/kaizen_nagoya/items/6ab51c12ba51bc260a82

Water Reference Collection(0): Policy and Results
https://qiita.com/kaizen_nagoya/items/f5dbb30087ea732b52aa

Automotive Articles: 100
https://qiita.com/kaizen_nagoya/items/f7f0b9ab36569ad409c5

Communications Articles: 100
https://qiita.com/kaizen_nagoya/items/1d67de5e1cd207b05ef7

Japanese Language(0): List
https://qiita.com/kaizen_nagoya/items/7498dcfa3a9ba7fd1e68

English(0): List
https://qiita.com/kaizen_nagoya/items/680e3f5cbf9430486c7d

Music: List(0)
https://qiita.com/kaizen_nagoya/items/b6e5f42bbfe3bbe40f5d

Review List for "@kazuo_reve: Useful Information I Often Share with Newcomers"
https://qiita.com/kaizen_nagoya/items/b9380888d1e5a042646b

Railways(0): Rail Fans (てっちゃん) Will Help with Railway System Analysis
https://qiita.com/kaizen_nagoya/items/faa4ea03d91d901a618a

Fundamentals of OSEK OS Design, OSEK(100)
https://qiita.com/kaizen_nagoya/items/7528a22a14242d2d58a3

coding (101): I Started Making a List. omake: Five Things Qiita No Longer Displays
https://qiita.com/kaizen_nagoya/items/20667f09f19598aedb68

System Issues at Government Offices, Schools, and Public Bodies (Including NPOs), Government(0)
https://qiita.com/kaizen_nagoya/items/04ee6eaf7ec13d3af4c3

The "Hajimete no" (First Steps) Series, Vector Japan
https://qiita.com/kaizen_nagoya/items/2e41634f6e21a3cf74eb

AUTOSAR(0): List of Qiita Articles, OSEK(75)
https://qiita.com/kaizen_nagoya/items/89c07961b59a8754c869

"Public Order and Morals" That Programmers Would Do Well to Know
https://qiita.com/kaizen_nagoya/items/9fe7c0dfac2fbd77a945

LaTeX(0): List
https://qiita.com/kaizen_nagoya/items/e3f7dafacab58c499792

Automatic Control and Control Engineering: List(0)
https://qiita.com/kaizen_nagoya/items/7767a4e19a6ae1479e6b

Rust(0): List
https://qiita.com/kaizen_nagoya/items/5e8bb080ba6ca0281927

Thank you very much for reading to the end.