73 | Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models | not yet |
59 | Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning | |
53 | The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity | |
44 | AlphaEvolve: A coding agent for scientific and algorithmic discovery | |
39 | FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space | not yet |
30 | Reasoning with Exploration: An Entropy Perspective on Reinforcement Learning for LLMs | not yet |
27 | MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention | not yet |
27 | Spurious Rewards: Rethinking Training Signals in RLVR | not yet |
27 | Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task | |
27 | OpenThoughts: Data Recipes for Reasoning Models | not yet |
25 | Small Language Models are the Future of Agentic AI | not yet |
23 | Architectural mechanisms of a universal fault-tolerant quantum computer | not yet |
23 | OmniGen2: Exploration to Advanced Multimodal Generation | not yet |
23 | V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning | |
22 | MiMo-VL Technical Report | not yet |
20 | UMA: A Family of Universal Models for Atoms | not yet |
19 | UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation | not yet |
18 | Show-o2: Improved Native Unified Multimodal Models | not yet |
18 | SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics | not yet |
17 | Mercury: Ultra-Fast Language Models Based on Diffusion | not yet |
17 | From Ground to Sky: Architectures, Applications, and Challenges Shaping Low-Altitude Wireless Networks | not yet |
17 | dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching | not yet |
16 | MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration | not yet |
16 | Deep Research Agents: A Systematic Examination And Roadmap | not yet |
15 | Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers | not yet |
15 | OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling | not yet |
15 | Magistral | not yet |
15 | Constructive interference at the edge of quantum ergodic dynamics | not yet |
15 | Seedance 1.0: Exploring the Boundaries of Video Generation Models | |
14 | CRISP-SAM2: SAM2 with Cross-Modal Interaction and Semantic Prompting for Multi-Organ Segmentation | not yet |
13 | MMSearch-R1: Incentivizing LMMs to Search | not yet |
13 | DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation | not yet |
13 | AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy | not yet |
13 | DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents | not yet |
13 | The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning | |
12 | Continuous operation of a coherent 3,000-qubit system | not yet |
12 | Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs | not yet |
12 | xbench: Tracking Agents Productivity Scaling with Profession-Aligned Real-World Evaluations | not yet |
12 | Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions | not yet |
12 | AgentCPM-GUI: Building Mobile-Use Agents with Reinforcement Fine-Tuning | not yet |
11 | Persona Features Control Emergent Misalignment | not yet |
11 | SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning | not yet |
11 | A Survey of LLM-Driven AI Agent Communication: Protocols, Security Risks, and Defense Countermeasures | not yet |
11 | ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation | not yet |
11 | Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective | not yet |
11 | Leveraging erasure errors in logical qubits with metastable $^{171}$Yb atoms | not yet |
11 | A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications | not yet |
11 | Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency | not yet |
11 | Follow-Your-Motion: Video Motion Transfer via Efficient Spatial-Temporal Decoupled Finetuning | not yet |
11 | Follow-Your-Creation: Empowering 4D Creation through Video Inpainting | not yet |
11 | GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents | not yet |
10 | AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving | not yet |
10 | Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S. Workforce | not yet |
10 | Deep Research Bench: Evaluating AI Web Research Agents | not yet |
10 | Seed-Coder: Let the Code Model Curate Data for Itself | not yet |
10 | SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning | not yet |
10 | MCP-Zero: Active Tool Discovery for Autonomous LLM Agents | not yet |
9 | MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents | not yet |
9 | ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies | not yet |
9 | VGR: Visual Grounded Reasoning | not yet |
9 | G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems | not yet |
9 | RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics | not yet |
9 | TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems | not yet |
9 | Accelerating Diffusion LLMs via Adaptive Parallel Decoding | not yet |
8 | Sequential Diagnosis with Language Models | not yet |
8 | WorldVLA: Towards Autoregressive Action World Model | |
8 | RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation | not yet |
8 | Towards AI Search Paradigm | not yet |
8 | Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details | not yet |
8 | Optimizing Length Compression in Large Reasoning Models | not yet |
8 | High-fidelity entanglement and coherent multi-qubit mapping in an atom array | not yet |
8 | Thought Crime: Backdoors and Emergent Misalignment in Reasoning Models | not yet |
8 | Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs | not yet |
8 | Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion | |
8 | Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning | not yet |
8 | Curriculum Reinforcement Learning from Easy to Hard Tasks Improves LLM Reasoning | not yet |
8 | Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library | not yet |
8 | When to use Graphs in RAG: A Comprehensive Analysis for Graph Retrieval-Augmented Generation | not yet |
8 | SeedEdit 3.0: Fast and High-Quality Generative Image Editing | not yet |
8 | MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark | not yet |
8 | The Cost of Dynamic Reasoning: Demystifying AI Agents and Test-Time Scaling from an AI Infrastructure Perspective | not yet |
8 | Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning | not yet |
8 | Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning | not yet |
8 | TIIF-Bench: How Does Your T2I Model Follow Your Instructions? | not yet |
8 | A Graph Neural Network for the Era of Large Atomistic Models | not yet |
8 | MMedAgent-RL: Optimizing Multi-Agent Collaboration for Multimodal Medical Reasoning | not yet |
7 | SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning | not yet |
7 | Pinching-Antenna Systems with In-Waveguide Attenuation: Performance Analysis and Algorithm Design | not yet |
7 | Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge | not yet |
7 | RLPR: Extrapolating RLVR to General Domains without Verifiers | not yet |
7 | VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning | not yet |
7 | OAgents: An Empirical Study of Building Effective Agents | not yet |
7 | OneRec Technical Report | not yet |
7 | We Should Identify and Mitigate Third-Party Safety Risks in MCP-Powered Agent Systems | not yet |
7 | Model Context Protocol (MCP) at First Glance: Studying the Security and Maintainability of MCP Servers | not yet |
7 | Continual Learning for Generative AI: From LLMs to MLLMs and Beyond | not yet |
7 | SoundMind: RL-Incentivized Logic Reasoning for Audio-Language Models | not yet |
7 | LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? | not yet |
7 | Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing | not yet |
7 | Comment on The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity | not yet |
7 | Diffuse and Disperse: Image Generation with Representation Regularization | not yet |
7 | Design Patterns for Securing LLM Agents against Prompt Injections | not yet |
7 | Reinforcement Pre-Training | not yet |
7 | $\tau^2$-Bench: Evaluating Conversational Agents in a Dual-Control Environment | not yet |
7 | WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning | not yet |
7 | Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models | not yet |
7 | Research on E-Commerce Long-Tail Product Recommendation Mechanism Based on Large-Scale Language Models | not yet |
7 | Research on Personalized Financial Product Recommendation by Integrating Large Language Models and Graph Neural Networks | not yet |
7 | Log-Linear Attention | not yet |
7 | OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models | not yet |
7 | V2X-UniPool: Unifying Multimodal Perception and Knowledge Reasoning for Autonomous Driving | not yet |
7 | EvolveNav: Self-Improving Embodied Reasoning for LLM-Based Vision-Language Navigation | not yet |
7 | ACE-Step: A Step Towards Music Generation Foundation Model | not yet |
6 | Hierarchical Reasoning Model | |
6 | Potemkin Understanding in Large Language Models | |
6 | FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language | not yet |
6 | Mobile-R1: Towards Interactive Reinforcement Learning for VLM-Based Mobile Agent via Task-Level Rewards | not yet |
6 | Unified Vision-Language-Action Model | not yet |
6 | ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs | not yet |
6 | OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation | not yet |
6 | No Free Lunch: Rethinking Internal Feedback for LLM Reasoning | not yet |
6 | TabArena: A Living Benchmark for Machine Learning on Tabular Data | not yet |
6 | Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material | not yet |
6 | GMT: General Motion Tracking for Humanoid Whole-Body Control | not yet |
6 | ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM | not yet |
6 | LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs | not yet |
6 | Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning | not yet |
6 | Direct Reasoning Optimization: LLMs Can Reward And Refine Their Own Reasoning for Open-Ended Tasks | not yet |
6 | Serving Large Language Models on Huawei CloudMatrix384 | not yet |
6 | $\xi$-Based adaptive phase field model for quasi-static anti-plane fracture | not yet |
6 | Model Organisms for Emergent Misalignment | not yet |
6 | Self-Adapting Language Models | |
6 | Fast on the Easy, Deep on the Hard: Efficient Reasoning via Powered Length Penalty | not yet |
6 | TaskCraft: Automated Generation of Agentic Tasks | not yet |
6 | Repeated ancilla reuse for logical computation on a neutral atom quantum computer | not yet |
6 | e3: Learning to Explore Enables Extrapolation of Test-Time Compute for LLMs | not yet |
6 | BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models | not yet |
6 | MiniCPM4: Ultra-Efficient LLMs on End Devices | not yet |