[Ranking of arXiv Papers Published in April 2025] 2504.xxxxx

Last updated at 2025-08-01 | Posted at 2025-06-14

AI paper explainer YouTube channel: AI時代の羅針盤

This page ranks papers in the cs category published around April 2025 (ID: 2504.xxxxx) by citation count. The ranking is updated from time to time.
(Updated August 2, 2025)

Citations	Title	Video
187	InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models	not yet
151	Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
113	VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model	not yet
69	VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks	not yet
66	Kimi-VL Technical Report	not yet
63	Reinforcement Learning for Reasoning in Large Language Models with One Training Example
61	$\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization
57	Reasoning Models Can Be Effective Without Thinking
52	Inference-Time Scaling for Generalist Reward Modeling	not yet
50	DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments	not yet
49	ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning	not yet
47	Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning	not yet
46	VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning	not yet
46	Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems	not yet
44	TTRL: Test-Time Reinforcement Learning
40	RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning	not yet
40	ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
38	DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition
38	SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models	not yet
38	GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents	not yet
36	VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning	not yet
35	A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment	not yet
35	ToolRL: Reward is All Tool Learning Needs	not yet
34	Phi-4-reasoning Technical Report
33	The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search
31	Step1X-Edit: A Practical Framework for General Image Editing	not yet
30	A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce	not yet
28	Dynamic Early Exit in Reasoning Models	not yet
28	Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought	not yet
28	PaperBench: Evaluating AI's Ability to Replicate AI Research	not yet
27	A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility	not yet
27	Concise Reasoning via Reinforcement Learning	not yet
27	GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation	not yet
27	Command A: An Enterprise-Ready Large Language Model
26	Kimi-Audio Technical Report	not yet
26	BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents	not yet
25	DeepSeek-R1 Thoughtology: Let's think about LLM Reasoning	not yet
24	Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning	not yet
24	A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems	not yet
24	Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model	not yet
23	WebThinker: Empowering Large Reasoning Models with Deep Research Capability	not yet
23	AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset	not yet
23	Efficient Reasoning Models: A Survey	not yet
23	Transfer between Modalities with MetaQueries	not yet
21	DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning	not yet
21	SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement	not yet
21	Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining	not yet
21	Rethinking Reflection in Pre-Training	not yet
20	Learning to Reason under Off-Policy Guidance	not yet
20	SRPO: A Cross-Domain Implementation of Large-Scale Reinforcement Learning on LLM	not yet
20	SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL	not yet
20	Perception-R1: Pioneering Perception Policy with Reinforcement Learning	not yet
20	Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving	not yet
20	LLM Social Simulations Are a Promising Research Method	not yet
20	OpenCodeReasoning: Advancing Data Distillation for Competitive Coding	not yet
20	Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents	not yet
19	Acting Less is Reasoning More! Teaching Model to Act Efficiently
19	Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization	not yet
19	Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use	not yet
18	ReasonIR: Training Retrievers for Reasoning Tasks	not yet
18	Building A Secure Agentic AI Application Leveraging A2A Protocol
18	InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners	not yet
18	NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation	not yet
18	Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?	not yet
18	Reasoning Models Know When They're Right: Probing Hidden States for Self-Verification	not yet
18	SmartBugBert: BERT-Enhanced Vulnerability Detection for Smart Contract Bytecode	not yet
18	GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning	not yet
17	Safety in Large Reasoning Models: A Survey	not yet
17	PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models	not yet
17	Enterprise-Grade Security for the Model Context Protocol (MCP): Frameworks and Mitigation Strategies	not yet
17	SmolVLM: Redefining small and efficient multimodal models	not yet
17	Enhancing Smart Contract Vulnerability Detection in DApps Leveraging Fine-Tuned LLM	not yet
17	Less-to-More Generalization: Unlocking More Controllability by In-Context Generation	not yet
17	Z1: Efficient Test-time Scaling with Code	not yet
16	The Leaderboard Illusion
16	A Survey of AI Agent Protocols	not yet
16	Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning	not yet
16	Embodied-R: Collaborative Framework for Activating Embodied Spatial Reasoning in Foundation Models via Reinforcement Learning	not yet
16	d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning	not yet
16	Seedream 3.0 Technical Report	not yet
16	GRPO-LEAD: A Difficulty-Aware Reinforcement Learning Approach for Concise Mathematical Reasoning in Language Models	not yet
16	TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning	not yet
16	SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning	not yet
16	On The Landscape of Spoken Language Models: A Comprehensive Survey	not yet
16	SEAL: Steerable Reasoning Calibration of Large Language Models for Free	not yet
16	ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use	not yet
16	Efficient Reinforcement Finetuning via Adaptive Curriculum Learning	not yet
16	Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models	not yet
16	An Approach to Technical AGI Safety and Security	not yet
16	MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs	not yet
16	m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning with Large Language Models	not yet
15	One-Minute Video Generation with Test-Time Training
15	UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding	not yet
15	GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning	not yet
14	In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer	not yet
14	SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning	not yet
14	Optimized Path Planning for Logistics Robots Using Ant Colony Algorithm under Multiple Constraints	not yet
14	An Illusion of Progress? Assessing the Current State of Web Agents	not yet
14	WorldScore: A Unified Evaluation Benchmark for World Generation	not yet
14	JudgeLRM: Large Reasoning Models as a Judge	not yet
13	Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory	not yet
13	Fast-Slow Thinking for Large Vision-Language Model Reasoning	not yet
13	VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models	not yet
13	Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning	not yet
13	Packing Input Frame Context in Next-Frame Prediction Models for Video Generation	not yet
13	The Obvious Invisible Threat: LLM-Powered GUI Agents' Vulnerability to Fine-Print Injections	not yet
13	MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft	not yet
13	Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning	not yet
13	STAR-1: Safer Alignment of Reasoning LLMs with 1K Data	not yet
12	Ada-R1: Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization	not yet
12	From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review	not yet
12	Malicious Code Detection in Smart Contracts via Opcode Vectorization	not yet
12	WORLDMEM: Long-term Consistent World Simulation with Memory	not yet
12	MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning	not yet
12	RealSafe-R1: Safety-Aligned DeepSeek-R1 without Compromising Reasoning Capability	not yet
12	SafeMLRM: Demystifying Safety in Multi-modal Large Reasoning Models	not yet
12	TxGemma: Efficient and Agentic LLMs for Therapeutics	not yet
12	Understanding Aha Moments: from External Observations to Internal Mechanisms	not yet
12	Why do LLMs attend to the first token?	not yet
12	SkyReels-A2: Compose Anything in Video Diffusion Transformers	not yet
11	SWE-smith: Scaling Data for Software Engineering Agents
11	TesserAct: Learning 4D Embodied World Models	not yet
11	Think Deep, Think Fast: Investigating Efficiency of Verifier-free Inference-time-scaling Methods	not yet
11	A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future	not yet
11	TextArena	not yet
11	VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge	not yet
11	SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users	not yet
11	Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning	not yet
11	SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills	not yet
11	Leanabell-Prover: Posttraining Scaling in Formal Reasoning	not yet
11	Think When You Need: Self-Adaptive Chain-of-Thought Learning	not yet
11	Improved Visual-Spatial Reasoning via R1-Zero-Like Training	not yet
10	ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning	not yet
10	WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks	not yet
10	HalluLens: LLM Hallucination Benchmark	not yet
10	DreamO: A Unified Framework for Image Customization	not yet
10	Describe Anything: Detailed Localized Image and Video Captioning
10	SConU: Selective Conformal Uncertainty in Large Language Models	not yet
10	IMAGGarment-1: Fine-Grained Garment Generation for Controllable Fashion Design	not yet
10	SkyReels-V2: Infinite-length Film Generative Model	not yet
10	Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time	not yet
10	REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers	not yet
10	Psychological Health Knowledge-Enhanced LLM-based Social Network Crisis Intervention Text Transfer Recognition Method	not yet
10	Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models	not yet
10	Retro-Search: Exploring Untaken Paths for Deeper and Efficient Reasoning	not yet
10	APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay	not yet
10	Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme	not yet
10	Cognitive Memory in Large Language Models	not yet
10	Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead
9	Multidimensional precipitation index prediction based on CNN-LSTM hybrid framework	not yet
9	Securing GenAI Multi-Agent Systems Against Tool Squatting: A Zero Trust Registry-Based Approach	not yet
9	BrowseComp-ZH: Benchmarking Web Browsing Ability of Large Language Models in Chinese	not yet
9	Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks	not yet
9	Process Reward Models That Think	not yet
9	Tina: Tiny Reasoning Models via LoRA	not yet

* Citation counts are taken from NASA ADS data as of the update date.
https://ui.adsabs.harvard.edu/
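
For readers who want to re-sort or filter the table above, here is a minimal Python sketch. It assumes the table has been saved as tab-separated text in the form `citations<TAB>title[<TAB>video]`, as it appears on this page. The names `Entry`, `parse_rows`, `rank`, and `SAMPLE` are hypothetical and used only for illustration. The sketch does not query NASA ADS; it only orders rows that have already been collected.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    citations: int
    title: str
    video: str  # empty string when the row has no third column


def parse_rows(text: str) -> list[Entry]:
    """Parse tab-separated 'citations<TAB>title[<TAB>video]' lines."""
    entries = []
    for line in text.splitlines():
        parts = line.split("\t")
        # Skip blank lines and the header row (first field is not a number).
        if len(parts) < 2 or not parts[0].strip().isdigit():
            continue
        video = parts[2].strip() if len(parts) > 2 else ""
        entries.append(Entry(int(parts[0]), parts[1].strip(), video))
    return entries


def rank(entries: list[Entry]) -> list[Entry]:
    """Sort by citation count, highest first; ties keep their input order."""
    return sorted(entries, key=lambda e: e.citations, reverse=True)


# Three rows copied from the table above, deliberately out of order.
SAMPLE = (
    "Citations\tTitle\tVideo\n"
    "44\tTTRL: Test-Time Reinforcement Learning\n"
    "187\tInternVL3: Exploring Advanced Training and Test-Time Recipes "
    "for Open-Source Multimodal Models\tnot yet\n"
    "57\tReasoning Models Can Be Effective Without Thinking\n"
)

if __name__ == "__main__":
    for entry in rank(parse_rows(SAMPLE)):
        print(entry.citations, entry.title)
```

Because Python's sort is stable, papers with the same citation count stay in the order in which they were read, which matches how tied rows are listed in the table.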
