[arXiv Paper Ranking: Published February 2025] 2502.xxxxx

Posted at 2025-06-14

AI paper explainer YouTube channel: AI時代の羅針盤

This is a ranking of cs-category papers published around February 2025 (ID: 2502.xxxxx), ordered by citation count. The ranking is updated from time to time.
(Updated June 15, 2025)

Citations	Title	Video
720	Qwen2.5-VL Technical Report	not yet
159	LIMO: Less is More for Reasoning	not yet
119	From System 1 to System 2: A Survey of Reasoning Large Language Models	not yet
113	Demystifying Long Chain-of-Thought Reasoning in LLMs	not yet
107	Process Reinforcement through Implicit Rewards	not yet
97	Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning	not yet
80	Chain of Draft: Thinking Faster by Writing Less
69	SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features	not yet
67	TokenSkip: Controllable Chain-of-Thought Compression in LLMs	not yet
67	SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model	not yet
64	Training Language Models to Reason Efficiently	not yet
59	Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
57	Towards an AI co-scientist	not yet
52	LLM Post-Training: A Deep Dive into Reasoning Large Language Models
52	LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!
52	When More is Less: Understanding Chain-of-Thought Length in LLMs	not yet
50	Large Language Diffusion Models	not yet
47	SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
47	Competitive Programming with Large Reasoning Models
46	Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach	not yet
45	Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning	not yet
45	YOLOv12: Attention-Centric Real-Time Object Detectors	not yet
44	Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling	not yet
43	MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning	not yet
43	CoT-Valve: Length-Compressible Chain-of-Thought Tuning	not yet
42	The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks	not yet
40	Self-Training Elicits Concise Reasoning in Large Language Models	not yet
39	LIMR: Less is More for RL Scaling	not yet
35	Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model	not yet
34	MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency	not yet
33	MoBA: Mixture of Block Attention for Long-Context LLMs	not yet
33	SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities	not yet
31	Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success	not yet
31	LightThinker: Thinking Step-by-Step Compression	not yet
31	SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines	not yet
31	The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1	not yet
30	Small Models Struggle to Learn from Strong Reasoners	not yet
28	On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective	not yet
28	Multi-Agent Risks from Advanced AI	not yet
28	Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving	not yet
27	CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation	not yet
27	Preference Leakage: A Contamination Problem in LLM-as-a-judge	not yet
26	Muon is Scalable for LLM Training	not yet
26	Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity	not yet
25	Scaling Test-Time Compute Without Verification or RL is Suboptimal	not yet
25	Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research
24	H-CoT: Hijacking the Chain-of-Thought Safety Reasoning Mechanism to Jailbreak Large Reasoning Models, Including OpenAI o1/o3, DeepSeek-R1, and Gemini 2.0 Flash Thinking	not yet
24	DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control	not yet
23	Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs	not yet
23	Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities?	not yet
23	SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs	not yet
23	A-MEM: Agentic Memory for LLM Agents	not yet
23	EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents	not yet
23	AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society	not yet
23	SCALM: Detecting Bad Practices in Smart Contracts Through LLMs	not yet
23	ASAP: Aligning Simulation and Real-World Physics for Learning Agile Humanoid Whole-Body Skills	not yet
22	OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models	not yet
21	Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models	not yet
21	Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning	not yet
21	OverThink: Slowdown Attacks on Reasoning LLMs	not yet
21	ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning	not yet
20	MMTEB: Massive Multilingual Text Embedding Benchmark	not yet
20	ACECODER: Acing Coder RL via Automated Test-Case Synthesis	not yet
19	Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models	not yet
19	Harnessing Multiple Large Language Models: A Survey on LLM Ensemble	not yet
19	Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?	not yet
19	History-Guided Video Diffusion	not yet
19	Modeling and Beamforming Optimization for Pinching-Antenna Systems	not yet
19	Advancing Reasoning in Large Language Models: Promising Methods and Approaches	not yet
19	Layer by Layer: Uncovering Hidden Representations in Language Models	not yet
18	Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models	not yet
18	MLGym: A New Framework and Benchmark for Advancing AI Research Agents	not yet
18	S*: Test Time Scaling for Code Generation	not yet
18	Magma: A Foundation Model for Multimodal AI Agents
18	SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?
18	Atom of Thoughts for Markov LLM Test-Time Scaling	not yet
18	MM-RLHF: The Next Step Forward in Multimodal LLM Alignment	not yet
18	Long-Term TalkingFace Generation via Motion-Prior Conditional Diffusion Model	not yet
18	ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates	not yet
18	TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models	not yet
18	Multi-agent Architecture Search via Agentic Supernet	not yet
18	Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2	not yet
18	Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning	not yet
17	BIG-Bench Extra Hard	not yet
17	Reasoning with Latent Thoughts: On the Power of Looped Transformers	not yet
17	NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions	not yet
17	Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction	not yet
17	Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search	not yet
16	Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction	not yet
16	MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations	not yet
16	Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound	not yet
16	Goku: Flow Based Video Generative Foundation Models
16	DeepRAG: Thinking to Retrieve Step by Step for Large Language Models	not yet
16	STP: Self-play LLM Theorem Provers with Iterative Conjecturing and Proving	not yet
15	Self-rewarding correction for mathematical reasoning	not yet
15	The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer
15	ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model	not yet
15	S$^2$R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning	not yet
15	PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection	not yet
15	Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey	not yet
15	Force Matching with Relativistic Constraints: A Physics-Inspired Approach to Stable and Efficient Generative Modeling	not yet
15	Universal Approximation of Visual Autoregressive Transformers	not yet
15	Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance	not yet
15	Fast Video Generation with Sliding Tile Attention	not yet
15	WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs
14	RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete	not yet
14	From RAG to Memory: Non-Parametric Continual Learning for Large Language Models	not yet
14	Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models	not yet
14	Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks	not yet
14	AnyEdit: Edit Any Knowledge Encoded in Language Models	not yet
14	GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity?	not yet
14	Safety at Scale: A Comprehensive Survey of Large Model Safety	not yet
14	Ola: Pushing the Frontiers of Omni-Modal Language Model	not yet
14	Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis	not yet
14	VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models	not yet
14	High-Order Matching for One-Step Shortcut Diffusion Models	not yet
13	UniTok: A Unified Tokenizer for Visual Generation and Understanding	not yet
13	Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models	not yet
13	Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning	not yet
13	SpargeAttention: Accurate and Training-free Sparse Attention Accelerating Any Model Inference	not yet
13	Are Sparse Autoencoders Useful? A Case Study in Sparse Probing	not yet
13	Do Multilingual LLMs Think In English?	not yet
13	RAG-Gym: Systematic Optimization of Language Agents for Retrieval-Augmented Generation	not yet
13	HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit	not yet
13	Towards Reasoning Ability of Small Language Models	not yet
13	OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning	not yet
13	ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models	not yet
13	CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction	not yet
13	Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning	not yet
13	Teaching Language Models to Critique via Reinforcement Learning	not yet
13	Fully Autonomous AI Agents Should Not be Developed	not yet
13	Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification	not yet
12	Reward Shaping to Mitigate Reward Hacking in RLHF	not yet
12	Dynamic Parallel Tree Search for Efficient LLM Reasoning	not yet
12	Red-Teaming LLM Multi-Agent Systems via Communication Attacks	not yet
12	Which Attention Heads Matter for In-Context Learning?	not yet
12	AIDE: AI-Driven Exploration in the Space of Code	not yet
12	Baichuan-M1: Pushing the Medical Capability of Large Language Models	not yet
12	Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights	not yet
12	Evaluating o1-Like LLMs: Unlocking Reasoning for Translation through Comprehensive Analysis	not yet
12	Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More	not yet
12	PlanGenLLMs: A Modern Survey of LLM Planning Capabilities	not yet
12	HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation	not yet
12	Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs	not yet
12	Salamandra Technical Report	not yet
12	Recent Advances in Discrete Speech Tokens: A Review	not yet
12	TabICL: A Tabular Foundation Model for In-Context Learning on Large Data	not yet
12	BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving	not yet
12	Boosting Multimodal Reasoning with Automated Structured Thinking	not yet
12	Latent Thought Models with Variational Bayes Inference-Time Computation	not yet
12	MM-IQ: Benchmarking Human-Like Abstraction and Reasoning in Multimodal Models	not yet
11	MV-MATH: Evaluating Multimodal Math Reasoning in Multi-Visual Contexts	not yet
11	Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs	not yet
11	FlexTok: Resampling Images into 1D Token Sequences of Flexible Length	not yet
11	PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning	not yet
11	A Survey of Personalized Large Language Models: Progress and Future Directions	not yet
11	A Survey of LLM-based Agents in Medicine: How far are we from Baymax?	not yet
11	On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning	not yet
11	Organize the Web: Constructing Domains Enhances Pre-Training Data Curation	not yet
11	Process Reward Models for LLM Agents: Practical Framework and Directions	not yet
11	DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products	not yet
11	Logical Reasoning in Large Language Models: A Survey	not yet
11	EnigmaEval: A Benchmark of Long Multimodal Reasoning Challenges	not yet
11	If Multi-Agent Debate is the Answer, What is the Question?	not yet
11	Training Deep Learning Models with Norm-Constrained LMOs	not yet
11	Self-Supervised Prompt Optimization	not yet
11	Performance Analysis of Pinching-Antenna Systems	not yet
11	Confidence Improves Self-Consistency in LLMs	not yet
11	Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy	not yet
11	Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization	not yet
11	Do Large Language Model Benchmarks Test Reliability?	not yet
11	Masked Autoencoders Are Effective Tokenizers for Diffusion Models	not yet
11	Reinforcement Learning for Long-Horizon Interactive LLM Agents	not yet
10	On Benchmarking Human-Like Intelligence in Machines	not yet
10	Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids	not yet
10	Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?	not yet
10	OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment	not yet
10	MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis	not yet
10	VEM: Environment-Free Exploration for Training GUI Agent with Value Environment Model	not yet

Note: Citation counts are taken from NASA ADS data as of the update date.
https://ui.adsabs.harvard.edu/
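
The ordering used in the table above can be reproduced with a few lines of Python. This is only a minimal sketch, not the script used for this article: it assumes the rows are already available as tab-separated text with the column names `Citations`, `Title`, and `Video`, and the names `SAMPLE_TSV` and `rank_by_citations` are hypothetical. Fetching the counts from NASA ADS is left out.

```python
import csv
import io

# Three rows copied from the table above, deliberately out of order.
# The column names are an assumption made for this sketch.
SAMPLE_TSV = (
    "Citations\tTitle\tVideo\n"
    "119\tFrom System 1 to System 2: A Survey of Reasoning Large Language Models\tnot yet\n"
    "720\tQwen2.5-VL Technical Report\tnot yet\n"
    "159\tLIMO: Less is More for Reasoning\tnot yet\n"
)


def rank_by_citations(tsv_text):
    """Return (citations, title) pairs sorted by citation count, highest first."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = [(int(row["Citations"]), row["Title"]) for row in reader]
    # sorted() is stable, so papers with equal counts keep their input order.
    return sorted(rows, key=lambda row: row[0], reverse=True)


ranking = rank_by_citations(SAMPLE_TSV)
for citations, title in ranking:
    print(f"{citations}\t{title}")
```

Because the sort is stable, papers tied on citation count stay in whatever order they were supplied, which is one possible way to handle the many ties in the lower half of the table.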
