[arXiv Paper Ranking, Published December 2024] 2412.xxxxx

AI paper explainer YouTube channel: AI時代の羅針盤

This ranking covers papers in the cs category published around December 2024 (ID: 2412.xxxxx), ordered by citation count. The ranking is updated from time to time.
(Updated April 4, 2025)
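
As a rough illustration of the ordering used in the ranking below, here is a minimal Python sketch that sorts papers by citation count, highest first. The `Paper` type and `rank_by_citations` helper are hypothetical names introduced only for this example, and the sample entries are copied from this article's ranking; how the author actually builds the list is not stated.

```python
from typing import NamedTuple


class Paper(NamedTuple):
    """Hypothetical record type for one ranking entry."""
    title: str
    citations: int


def rank_by_citations(papers: list[Paper]) -> list[Paper]:
    """Return papers ordered by citation count, highest first.

    sorted() is stable, so papers with equal counts keep their input order.
    """
    return sorted(papers, key=lambda p: p.citations, reverse=True)


# Sample entries taken from this article's ranking (counts as of April 4, 2025).
sample = [
    Paper("Phi-4 Technical Report", 82),
    Paper("Qwen2.5 Technical Report", 595),
    Paper("OpenAI o1 System Card", 262),
    Paper("DeepSeek-V3 Technical Report", 456),
]

for paper in rank_by_citations(sample):
    print(f"{paper.citations:>4}  {paper.title}")
```

Because the sort is stable, entries with the same citation count stay in whatever order they were supplied, which is one simple way to handle the many ties further down the list.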

| Citations | Title | Video |
|---|---|---|
| 595 | Qwen2.5 Technical Report | not yet |
| 456 | DeepSeek-V3 Technical Report | not yet |
| 262 | OpenAI o1 System Card | not yet |
| 147 | Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling | not yet |
| 94 | HunyuanVideo: A Systematic Framework For Large Video Generative Models | |
| 82 | Phi-4 Technical Report | |
| 50 | DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding | not yet |
| 48 | Do NOT Think That Much for 2+3 | |
| 43 | Training Large Language Models to Reason in a Continuous Latent Space | |
| 39 | Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference | not yet |
| 38 | Open-Sora Plan: Open-Source Large Video Generation Model | |
| 37 | ProcessBench: Identifying Process Errors in Mathematical Reasoning | not yet |
| 36 | Open-Sora: Democratizing Efficient Video Production for All | not yet |
| 36 | Structured 3D Latents for Scalable and Versatile 3D Generation | not yet |
| 34 | Deliberative Alignment: Reasoning Enables Safer Language Models | not yet |
| 33 | Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems | not yet |
| 30 | LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods | not yet |
| 26 | Free Process Rewards without Process Labels | not yet |
| 25 | Alignment faking in large language models | not yet |
| 25 | Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis | |
| 23 | LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks | not yet |
| 23 | Flow Matching Guide and Code | not yet |
| 23 | Flexible-Antenna Systems: A Pinching-Antenna Perspective | not yet |
| 22 | Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search | not yet |
| 22 | NVILA: Efficient Frontier Visual Language Models | not yet |
| 22 | Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation | not yet |
| 20 | HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs | |
| 20 | Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces | not yet |
| 20 | Byte Latent Transformer: Patches Scale Better Than Tokens | |
| 20 | Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier | not yet |
| 20 | GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot | not yet |
| 19 | TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation | not yet |
| 18 | Large Concept Models: Language Modeling in a Sentence Representation Space | |
| 18 | VisionZip: Longer is Better but Not Necessary in Vision Language Models | |
| 18 | PaliGemma 2: A Family of Versatile VLMs for Transfer | |
| 17 | Fast Gradient Computation for RoPE Attention in Almost Linear Time | not yet |
| 17 | Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective | |
| 16 | Experimental Demonstration of Logical Magic State Distillation | not yet |
| 16 | MetaMorph: Multimodal Understanding and Generation via Instruction Tuning | not yet |
| 16 | ExBody2: Advanced Expressive Humanoid Whole-Body Control | not yet |
| 16 | Frontier Models are Capable of In-context Scheming | not yet |
| 16 | o1-Coder: an o1 Replication for Coding | not yet |
| 15 | Token-Budget-Aware LLM Reasoning | not yet |
| 15 | ARC Prize 2024: Technical Report | |
| 15 | Best-of-N Jailbreaking | not yet |
| 14 | Theoretical Constraints on the Expressive Power of $\mathsf{RoPE}$-based Tensor Attention Transformers | not yet |
| 14 | Motion Prompting: Controlling Video Generation with Motion Trajectories | not yet |
| 13 | CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models | not yet |
| 13 | Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning | not yet |
| 13 | Mobile-TeleVision: Predictive Motion Priors for Humanoid Whole-Body Control | not yet |
| 13 | From Slow Bidirectional to Fast Autoregressive Video Diffusion Models | |
| 13 | The Computational Limits of State-Space Models and Mamba via the Lens of Circuit Complexity | not yet |
| 12 | Formal Mathematical Reasoning: A New Frontier in AI | not yet |
| 12 | FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching | not yet |
| 12 | TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks | |
| 12 | RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation | not yet |
| 12 | Compressed Chain of Thought: Efficient Reasoning Through Dense Representations | not yet |
| 12 | Apollo: An Exploration of Video Understanding in Large Multimodal Models | |
| 12 | Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice | |
| 11 | An analytic theory of creativity in convolutional diffusion models | not yet |
| 11 | LMFusion: Adapting Pretrained Language Models for Multimodal Generation | not yet |
| 11 | Flex Attention: A Programming Model for Generating Optimized Attention Kernels | not yet |
| 11 | InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences | not yet |
| 10 | A Survey on Large Language Model Acceleration based on KV Cache Management | not yet |
| 10 | MEDEC: A Benchmark for Medical Error Detection and Correction in Clinical Notes | |
| 10 | Jasper and Stella: distillation of SOTA embedding models | not yet |
| 10 | DRT: Deep Reasoning Translation via Long Chain-of-Thought | not yet |
| 10 | Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis | not yet |
| 10 | Parallelized Autoregressive Visual Generation | not yet |
| 10 | AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling | not yet |
| 10 | Autoregressive Video Generation without Vector Quantization | not yet |
| 10 | Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models | not yet |
| 10 | LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers | not yet |
| 10 | InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions | not yet |
| 10 | The BrowserGym Ecosystem for Web Agent Research | not yet |
| 10 | Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction | not yet |
| 10 | Liquid: Language Models are Scalable and Unified Multi-modal Generators | not yet |
| 10 | Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey | not yet |
| 10 | [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster | not yet |
| 9 | OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis | not yet |
| 9 | Dora: Sampling and Benchmarking for 3D Shape Variational Auto-Encoders | not yet |
| 9 | LearnLM: Improving Gemini for Learning | |
| 9 | Offline Reinforcement Learning for LLM Multi-Step Reasoning | not yet |
| 9 | Score-based Generative Diffusion Models for Social Recommendations | not yet |
| 9 | Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models | not yet |
| 9 | Entropy-Regularized Process Reward Model | not yet |
| 9 | TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies | not yet |
| 9 | FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models | not yet |
| 9 | UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics | not yet |
| 9 | A Consolidated Volatility Prediction with Back Propagation Neural Network and Genetic Algorithm | not yet |
| 9 | On Evaluating the Durability of Safeguards for Open-Weight LLMs | not yet |
| 9 | Gated Delta Networks: Improving Mamba2 with Delta Rule | not yet |
| 9 | BatchTopK Sparse Autoencoders | not yet |
| 9 | Comprehensive Evaluation of Multimodal AI Models in Medical Imaging Diagnosis: From Data Augmentation to Preference-Based Comparison | not yet |
| 9 | MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale | not yet |
| 9 | Evaluating and Aligning CodeLLMs on Human Preference | not yet |
| 9 | RandAR: Decoder-only Autoregressive Visual Generation in Random Orders | not yet |
| 9 | Scaling New Frontiers: Insights into Large Recommendation Models | not yet |
| 8 | Training Software Engineering Agents and Verifiers with SWE-Gym | not yet |
| 8 | Aria-UI: Visual Grounding for GUI Instructions | not yet |
| 8 | Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback | not yet |
| 8 | Categorical Symmetries in Spin Models with Atom Arrays | not yet |
| 8 | GUI Agents: A Survey | not yet |
| 8 | RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement | not yet |
| 8 | Fault-Tolerant Operation and Materials Science with Neutral Atom Logical Qubits | not yet |
| 8 | Hierarchical Split Federated Learning: Convergence Analysis and System Optimization | not yet |
| 8 | On the Expressive Power of Modern Hopfield Networks | not yet |
| 8 | International Scientific Report on the Safety of Advanced AI (Interim Report) | not yet |
| 8 | Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression | not yet |
| 8 | U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs | not yet |
| 8 | Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models | not yet |
| 8 | An Automated Data Mining Framework Using Autoencoders for Feature Extraction and Dimensionality Reduction | not yet |
| 7 | TradingAgents: Multi-Agents LLM Financial Trading Framework | not yet |
| 7 | SegKAN: High-Resolution Medical Image Segmentation with Long-Distance Dependencies | not yet |
| 7 | DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers | not yet |
| 7 | KG4Diagnosis: A Hierarchical Multi-Agent LLM Framework with Knowledge Graph Enhancement for Medical Diagnosis | not yet |
| 7 | Progressive Multimodal Reasoning via Active Retrieval | not yet |
| 7 | MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval | not yet |
| 7 | Agent-SafetyBench: Evaluating the Safety of LLM Agents | not yet |
| 7 | Minimum Data Rate Maximization for Uplink Pinching-Antenna Systems | not yet |
| 7 | Large Language Model Enhanced Recommender Systems: A Survey | not yet |
| 7 | SafeAgentBench: A Benchmark for Safe Task Planning of Embodied LLM Agents | not yet |
| 7 | Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance | not yet |
| 7 | C3oT: Generating Shorter Chain-of-Thought without Compromising Effectiveness | not yet |
| 7 | Reinforcement Learning Enhanced LLMs: A Survey | not yet |
| 7 | SCBench: A KV Cache-Centric Analysis of Long-Context Methods | not yet |
| 7 | SPT: Sequence Prompt Transformer for Interactive Image Segmentation | not yet |
| 7 | Simple Guidance Mechanisms for Discrete Diffusion Models | not yet |
| 7 | A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions | not yet |
| 7 | APOLLO: SGD-like Memory, AdamW-level Performance | not yet |
| 7 | EXAONE 3.5: Series of Large Language Models for Real-world Use Cases | |
| 7 | NaVILA: Legged Robot Vision-Language-Action Model for Navigation | not yet |
| 7 | Advanced Risk Prediction and Stability Assessment of Banks Using Time Series Transformer Models | not yet |
| 7 | Navigation World Models | not yet |
| 7 | ChatTS: Aligning Time Series with LLMs via Synthetic Data for Enhanced Understanding and Reasoning | not yet |
| 7 | Enhancing Recommendation Systems with GNNs and Addressing Over-Smoothing | not yet |
| 7 | Prithvi-EO-2.0: A Versatile Multi-Temporal Foundation Model for Earth Observation Applications | not yet |
| 7 | HUGSIM: A Real-Time, Photo-Realistic and Closed-Loop Simulator for Autonomous Driving | not yet |
| 7 | Are We There Yet? Revealing the Risks of Utilizing Large Language Models in Scholarly Peer Review | not yet |
| 7 | FullStack Bench: Evaluating LLMs as Full Stack Coders | not yet |
| 7 | Task Singular Vectors: Reducing Task Interference in Model Merging | not yet |
| 6 | VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation | not yet |
| 6 | GME: Improving Universal Multimodal Retrieval by Multimodal LLMs | not yet |
| 6 | Universal Machine Learning Interatomic Potentials are Ready for Phonons | not yet |
| 6 | Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning | not yet |
| 6 | Multi-LLM Text Summarization | not yet |
| 6 | AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving | not yet |
| 6 | MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark | not yet |
| 6 | How to Synthesize Text Data without Model Collapse? | not yet |
| 6 | Numerical Pruning for Efficient Autoregressive Models | not yet |
| 6 | Wonderland: Navigating 3D Scenes from a Single Image | not yet |
| 6 | ExecRepoBench: Multi-level Executable Code Completion Evaluation | not yet |
| 6 | A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges | not yet |
| 6 | SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models | not yet |
| 6 | ChatTime: A Unified Multimodal Time Series Foundation Model Bridging Numerical and Textual Data | not yet |

Note: Citation counts are taken from NASA ADS data as of the update date.
https://ui.adsabs.harvard.edu/
