[Ranking of arXiv Papers Published in February 2025] 2502.xxxxx

AI paper explainer YouTube channel: AI時代の羅針盤

This ranking orders cs-category papers published around February 2025 (ID: 2502.xxxxx) by citation count. The ranking is updated from time to time.
(Updated June 15, 2025)

Citations   Title   Video
arxiv720 Qwen2.5-VL Technical Report not yet
arxiv159 LIMO: Less is More for Reasoning not yet
arxiv119 From System 1 to System 2: A Survey of Reasoning Large Language Models not yet
arxiv113 Demystifying Long Chain-of-Thought Reasoning in LLMs not yet
arxiv107 Process Reinforcement through Implicit Rewards not yet
arxiv97 Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning not yet
arxiv80 Chain of Draft: Thinking Faster by Writing Less
arxiv69 SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features not yet
arxiv67 TokenSkip: Controllable Chain-of-Thought Compression in LLMs not yet
arxiv67 SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model not yet
arxiv64 Training Language Models to Reason Efficiently not yet
arxiv59 Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
arxiv57 Towards an AI co-scientist not yet
arxiv52 LLM Post-Training: A Deep Dive into Reasoning Large Language Models
arxiv52 LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!
arxiv52 When More is Less: Understanding Chain-of-Thought Length in LLMs not yet
arxiv50 Large Language Diffusion Models not yet
arxiv47 SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
arxiv47 Competitive Programming with Large Reasoning Models
arxiv46 Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach not yet
arxiv45 Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning not yet
arxiv45 YOLOv12: Attention-Centric Real-Time Object Detectors not yet
arxiv44 Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling not yet
arxiv43 MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning not yet
arxiv43 CoT-Valve: Length-Compressible Chain-of-Thought Tuning not yet
arxiv42 The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks not yet
arxiv40 Self-Training Elicits Concise Reasoning in Large Language Models not yet
arxiv39 LIMR: Less is More for RL Scaling not yet
arxiv35 Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model not yet
arxiv34 MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency not yet
arxiv33 MoBA: Mixture of Block Attention for Long-Context LLMs not yet
arxiv33 SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities not yet
arxiv31 Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success not yet
arxiv31 LightThinker: Thinking Step-by-Step Compression not yet
arxiv31 SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines not yet
arxiv31 The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1 not yet
arxiv30 Small Models Struggle to Learn from Strong Reasoners not yet
arxiv28 On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective not yet
arxiv28 Multi-Agent Risks from Advanced AI not yet
arxiv28 Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving not yet
arxiv27 CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation not yet
arxiv27 Preference Leakage: A Contamination Problem in LLM-as-a-judge not yet
arxiv26 Muon is Scalable for LLM Training not yet
arxiv26 Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity not yet
arxiv25 Scaling Test-Time Compute Without Verification or RL is Suboptimal not yet
arxiv25 Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research
arxiv24 H-CoT: Hijacking the Chain-of-Thought Safety Reasoning Mechanism to Jailbreak Large Reasoning Models, Including OpenAI o1/o3, DeepSeek-R1, and Gemini 2.0 Flash Thinking not yet
arxiv24 DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control not yet
arxiv23 Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs not yet
arxiv23 Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities? not yet
arxiv23 SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs not yet
arxiv23 A-MEM: Agentic Memory for LLM Agents not yet
arxiv23 EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents not yet
arxiv23 AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society not yet
arxiv23 SCALM: Detecting Bad Practices in Smart Contracts Through LLMs not yet
arxiv23 ASAP: Aligning Simulation and Real-World Physics for Learning Agile Humanoid Whole-Body Skills not yet
arxiv22 OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models not yet
arxiv21 Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models not yet
arxiv21 Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning not yet
arxiv21 OverThink: Slowdown Attacks on Reasoning LLMs not yet
arxiv21 ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning not yet
arxiv20 MMTEB: Massive Multilingual Text Embedding Benchmark not yet
arxiv20 ACECODER: Acing Coder RL via Automated Test-Case Synthesis not yet
arxiv19 Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models not yet
arxiv19 Harnessing Multiple Large Language Models: A Survey on LLM Ensemble not yet
arxiv19 Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path? not yet
arxiv19 History-Guided Video Diffusion not yet
arxiv19 Modeling and Beamforming Optimization for Pinching-Antenna Systems not yet
arxiv19 Advancing Reasoning in Large Language Models: Promising Methods and Approaches not yet
arxiv19 Layer by Layer: Uncovering Hidden Representations in Language Models not yet
arxiv18 Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models not yet
arxiv18 MLGym: A New Framework and Benchmark for Advancing AI Research Agents not yet
arxiv18 S*: Test Time Scaling for Code Generation not yet
arxiv18 Magma: A Foundation Model for Multimodal AI Agents
arxiv18 SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?
arxiv18 Atom of Thoughts for Markov LLM Test-Time Scaling not yet
arxiv18 MM-RLHF: The Next Step Forward in Multimodal LLM Alignment not yet
arxiv18 Long-Term TalkingFace Generation via Motion-Prior Conditional Diffusion Model not yet
arxiv18 ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates not yet
arxiv18 TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models not yet
arxiv18 Multi-agent Architecture Search via Agentic Supernet not yet
arxiv18 Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 not yet
arxiv18 Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning not yet
arxiv17 BIG-Bench Extra Hard not yet
arxiv17 Reasoning with Latent Thoughts: On the Power of Looped Transformers not yet
arxiv17 NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions not yet
arxiv17 Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction not yet
arxiv17 Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search not yet
arxiv16 Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction not yet
arxiv16 MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations not yet
arxiv16 Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound not yet
arxiv16 Goku: Flow Based Video Generative Foundation Models
arxiv16 DeepRAG: Thinking to Retrieve Step by Step for Large Language Models not yet
arxiv16 STP: Self-play LLM Theorem Provers with Iterative Conjecturing and Proving not yet
arxiv15 Self-rewarding correction for mathematical reasoning not yet
arxiv15 The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer
arxiv15 ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model not yet
arxiv15 S$^2$R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning not yet
arxiv15 PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection not yet
arxiv15 Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey not yet
arxiv15 Force Matching with Relativistic Constraints: A Physics-Inspired Approach to Stable and Efficient Generative Modeling not yet
arxiv15 Universal Approximation of Visual Autoregressive Transformers not yet
arxiv15 Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance not yet
arxiv15 Fast Video Generation with Sliding Tile Attention not yet
arxiv15 WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs
arxiv14 RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete not yet
arxiv14 From RAG to Memory: Non-Parametric Continual Learning for Large Language Models not yet
arxiv14 Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models not yet
arxiv14 Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks not yet
arxiv14 AnyEdit: Edit Any Knowledge Encoded in Language Models not yet
arxiv14 GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity? not yet
arxiv14 Safety at Scale: A Comprehensive Survey of Large Model Safety not yet
arxiv14 Ola: Pushing the Frontiers of Omni-Modal Language Model not yet
arxiv14 Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis not yet
arxiv14 VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models not yet
arxiv14 High-Order Matching for One-Step Shortcut Diffusion Models not yet
arxiv13 UniTok: A Unified Tokenizer for Visual Generation and Understanding not yet
arxiv13 Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models not yet
arxiv13 Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning not yet
arxiv13 SpargeAttention: Accurate and Training-free Sparse Attention Accelerating Any Model Inference not yet
arxiv13 Are Sparse Autoencoders Useful? A Case Study in Sparse Probing not yet
arxiv13 Do Multilingual LLMs Think In English? not yet
arxiv13 RAG-Gym: Systematic Optimization of Language Agents for Retrieval-Augmented Generation not yet
arxiv13 HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit not yet
arxiv13 Towards Reasoning Ability of Small Language Models not yet
arxiv13 OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning not yet
arxiv13 ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models not yet
arxiv13 CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction not yet
arxiv13 Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning not yet
arxiv13 Teaching Language Models to Critique via Reinforcement Learning not yet
arxiv13 Fully Autonomous AI Agents Should Not be Developed not yet
arxiv13 Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification not yet
arxiv12 Reward Shaping to Mitigate Reward Hacking in RLHF not yet
arxiv12 Dynamic Parallel Tree Search for Efficient LLM Reasoning not yet
arxiv12 Red-Teaming LLM Multi-Agent Systems via Communication Attacks not yet
arxiv12 Which Attention Heads Matter for In-Context Learning? not yet
arxiv12 AIDE: AI-Driven Exploration in the Space of Code not yet
arxiv12 Baichuan-M1: Pushing the Medical Capability of Large Language Models not yet
arxiv12 Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights not yet
arxiv12 Evaluating o1-Like LLMs: Unlocking Reasoning for Translation through Comprehensive Analysis not yet
arxiv12 Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More not yet
arxiv12 PlanGenLLMs: A Modern Survey of LLM Planning Capabilities not yet
arxiv12 HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation not yet
arxiv12 Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs not yet
arxiv12 Salamandra Technical Report not yet
arxiv12 Recent Advances in Discrete Speech Tokens: A Review not yet
arxiv12 TabICL: A Tabular Foundation Model for In-Context Learning on Large Data not yet
arxiv12 BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving not yet
arxiv12 Boosting Multimodal Reasoning with Automated Structured Thinking not yet
arxiv12 Latent Thought Models with Variational Bayes Inference-Time Computation not yet
arxiv12 MM-IQ: Benchmarking Human-Like Abstraction and Reasoning in Multimodal Models not yet
arxiv11 MV-MATH: Evaluating Multimodal Math Reasoning in Multi-Visual Contexts not yet
arxiv11 Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs not yet
arxiv11 FlexTok: Resampling Images into 1D Token Sequences of Flexible Length not yet
arxiv11 PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning not yet
arxiv11 A Survey of Personalized Large Language Models: Progress and Future Directions not yet
arxiv11 A Survey of LLM-based Agents in Medicine: How far are we from Baymax? not yet
arxiv11 On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning not yet
arxiv11 Organize the Web: Constructing Domains Enhances Pre-Training Data Curation not yet
arxiv11 Process Reward Models for LLM Agents: Practical Framework and Directions not yet
arxiv11 DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products not yet
arxiv11 Logical Reasoning in Large Language Models: A Survey not yet
arxiv11 EnigmaEval: A Benchmark of Long Multimodal Reasoning Challenges not yet
arxiv11 If Multi-Agent Debate is the Answer, What is the Question? not yet
arxiv11 Training Deep Learning Models with Norm-Constrained LMOs not yet
arxiv11 Self-Supervised Prompt Optimization not yet
arxiv11 Performance Analysis of Pinching-Antenna Systems not yet
arxiv11 Confidence Improves Self-Consistency in LLMs not yet
arxiv11 Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy not yet
arxiv11 Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization not yet
arxiv11 Do Large Language Model Benchmarks Test Reliability? not yet
arxiv11 Masked Autoencoders Are Effective Tokenizers for Diffusion Models not yet
arxiv11 Reinforcement Learning for Long-Horizon Interactive LLM Agents not yet
arxiv10 On Benchmarking Human-Like Intelligence in Machines not yet
arxiv10 Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids not yet
arxiv10 Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? not yet
arxiv10 OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment not yet
arxiv10 MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis not yet
arxiv10 VEM: Environment-Free Exploration for Training GUI Agent with Value Environment Model not yet

Note: Citation counts are taken from NASA ADS data as of the update date.
https://ui.adsabs.harvard.edu/
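
As a rough illustration of how a list like the one above can be ordered by citation count, here is a minimal Python sketch. It is not the script used to build this ranking. The names `Entry`, `parse_row`, and `rank` are hypothetical, and the row layout (`arxiv` followed by the citation count, then the title, then an optional `not yet` marker in the video column) is an assumption read off the rows as they appear in this article.

```python
import re
from typing import Iterable, List, NamedTuple


class Entry(NamedTuple):
    citations: int       # citation count taken from the start of the row
    title: str           # paper title
    video_pending: bool  # True when the row ends with the "not yet" marker


# Assumed row layout: "arxiv<count> <title>[ not yet]"
ROW = re.compile(r"^arxiv(\d+)\s+(.*?)(\s+not yet)?$")


def parse_row(line: str) -> Entry:
    """Parse one ranking row into an Entry; raise ValueError if it does not fit."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized row: {line!r}")
    return Entry(
        citations=int(match.group(1)),
        title=match.group(2),
        video_pending=match.group(3) is not None,
    )


def rank(lines: Iterable[str]) -> List[Entry]:
    """Sort rows by citation count, highest first.

    sorted() is stable, so rows with equal counts keep their input order.
    """
    entries = [parse_row(line) for line in lines if line.strip()]
    return sorted(entries, key=lambda entry: entry.citations, reverse=True)


if __name__ == "__main__":
    rows = [
        "arxiv159 LIMO: Less is More for Reasoning not yet",
        "arxiv720 Qwen2.5-VL Technical Report not yet",
        "arxiv80 Chain of Draft: Thinking Faster by Writing Less",
    ]
    for entry in rank(rows):
        print(entry.citations, entry.title, entry.video_pending)
```

Because the sort is stable, papers with the same citation count stay in whatever order they were supplied, so any tie-breaking rule would have to be applied before calling `rank`.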
