[Ranking of arXiv Papers Published in November 2024] 2411.xxxxx

Posted at 2025-01-15

AI paper explainer YouTube channel: AI時代の羅針盤

This is a ranking, by citation count, of papers in the cs category published around November 2024 (ID: 2411.xxxxx). The ranking is updated from time to time.
(Updated April 4, 2025)

| Citations | Title | Video |
|---|---|---|
| 83 | Tulu 3: Pushing Frontiers in Open Language Model Post-Training | not yet |
| 72 | A Survey on LLM-as-a-Judge | not yet |
| 58 | LLaVA-CoT: Let Vision Language Models Reason Step-by-Step | |
| 48 | From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge | not yet |
| 45 | Generative Agent Simulations of 1,000 People | |
| 42 | Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions | |
| 40 | FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI | |
| 30 | O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? | not yet |
| 30 | Measuring short-form factuality in large language models | not yet |
| 29 | Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization | not yet |
| 27 | OminiControl: Minimal and Universal Control for Diffusion Transformer | not yet |
| 27 | Logical computation demonstrated with a neutral atom quantum processor | not yet |
| 26 | Randomized Autoregressive Visual Generation | not yet |
| 25 | OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models | not yet |
| 25 | How Far is Video Generation from World Model: A Physical Law Perspective | |
| 24 | RedPajama: an Open Dataset for Training Large Language Models | not yet |
| 24 | Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM | not yet |
| 22 | Does Prompt Formatting Have Any Impact on LLM Performance? | not yet |
| 22 | Scaling Laws for Precision | |
| 21 | Enhancing LLM Reasoning with Reward-guided Tree Search | not yet |
| 21 | Circuit Complexity Bounds for RoPE-based Transformer Architecture | not yet |
| 20 | VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models | not yet |
| 20 | How to Build a Quantum Supercomputer: Scaling from Hundreds to Millions of Qubits | not yet |
| 19 | Magentic-One: A Generalist Multi-Agent System for Solving Complex Tasks | |
| 18 | Multimodal Whole Slide Foundation Model for Pathology | not yet |
| 18 | VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models | not yet |
| 18 | SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models | not yet |
| 18 | Taming Rectified Flow for Inversion and Editing | not yet |
| 17 | CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models | not yet |
| 17 | Identity-Preserving Text-to-Video Generation by Frequency Decomposition | not yet |
| 17 | Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision | not yet |
| 17 | Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models | not yet |
| 17 | WavChat: A Survey of Spoken Dialogue Models | not yet |
| 17 | Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models | not yet |
| 17 | HourVideo: 1-Hour Video-Language Understanding | not yet |
| 17 | Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent | not yet |
| 16 | Large Language Model-Brained GUI Agents: A Survey | not yet |
| 16 | On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality | not yet |
| 16 | DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion | not yet |
| 15 | Llama Guard 3 Vision: Safeguarding Human-AI Image Understanding Conversations | not yet |
| 15 | Emotion-Aware Interaction Design in Intelligent User Interface Using Multi-Modal Deep Learning | not yet |
| 15 | Towards evaluations-based safety cases for AI scheming | not yet |
| 15 | Personalization of Large Language Models: A Survey | not yet |
| 14 | Self-Generated Critiques Boost Reward Modeling for Language Models | not yet |
| 14 | Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency | not yet |
| 14 | Hymba: A Hybrid-head Architecture for Small Language Models | |
| 14 | The Surprising Effectiveness of Test-Time Training for Few-Shot Learning | not yet |
| 13 | RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts | not yet |
| 13 | Learning Humanoid Locomotion with Perceptive Internal Model | not yet |
| 13 | OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs | not yet |
| 13 | SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization | not yet |
| 13 | Metric Learning for Tag Recommendation: Tackling Data Sparsity and Cold Start Issues | not yet |
| 13 | A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness | not yet |
| 13 | Improving Steering Vectors by Targeting Sparse Autoencoder Features | not yet |
| 12 | Inference Scaling fLaws: The Limits of LLM Resampling with Imperfect Verifiers | not yet |
| 12 | OphCLIP: Hierarchical Retrieval-Augmented Learning for Ophthalmic Surgical Video-Language Pretraining | not yet |
| 12 | DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | |
| 12 | Self-Supervised Learning in Deep Networks: A Pathway to Robust Few-Shot Classification | not yet |
| 12 | A Preview of XiYan-SQL: A Multi-Generator Ensemble Framework for Text-to-SQL | not yet |
| 12 | Safety case template for frontier AI: A cyber inability argument | not yet |
| 12 | JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation | not yet |
| 12 | DiT4Edit: Diffusion Transformer for Image Editing | not yet |
| 12 | Hunyuan3D 1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation | not yet |
| 12 | Addressing Representation Collapse in Vector Quantized Models with One Linear Layer | not yet |
| 12 | Vision-Language Models Can Self-Improve Reasoning via Reflection | not yet |
| 12 | DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models | not yet |
| 12 | AutoGLM: Autonomous Foundation Agents for GUIs | not yet |
| 12 | PatternBoost: Constructions in Mathematics with a Little Help from AI | not yet |
| 11 | Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability | not yet |
| 11 | Scaling Speech-Text Pre-training with Synthetic Interleaved Data | not yet |
| 11 | Enhancing Few-Shot Learning with Integrated Data and GAN Model Approaches | not yet |
| 11 | All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages | not yet |
| 11 | Optimizing Gesture Recognition for Seamless UI Interaction Using Convolutional Neural Networks | not yet |
| 11 | Graph Neural Network-Based Entity Extraction and Relationship Reasoning in Complex Knowledge Graphs | not yet |
| 11 | OASIS: Open Agent Social Interaction Simulations with One Million Agents | not yet |
| 11 | Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model | not yet |
| 11 | ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning | not yet |
| 11 | MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs | not yet |
| 11 | MdEval: Massively Multilingual Code Debugging | not yet |
| 11 | Rule Based Rewards for Language Model Safety | not yet |
| 11 | Survey of Cultural Awareness in Language Models: Text and Beyond | not yet |
| 11 | Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations | not yet |
| 10 | VLSBench: Unveiling Visual Leakage in Multimodal Safety | not yet |
| 10 | INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge | not yet |
| 10 | CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation | not yet |
| 10 | Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models | not yet |
| 10 | Leveraging Semi-Supervised Learning to Enhance Data Mining for Image Classification under Limited Labeled Data | not yet |
| 10 | ShowUI: One Vision-Language-Action Model for GUI Visual Agent | not yet |
| 10 | Adaptive Cache Management for Complex Storage Systems Using CNN-LSTM-Based Spatiotemporal Prediction | not yet |
| 10 | SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory | |
| 10 | High-fidelity universal gates in the $^{171}$Yb ground state nuclear spin qubit | not yet |
| 10 | Zero-Shot Automatic Annotation and Instance Segmentation using LLM-Generated Datasets: Eliminating Field Imaging and Manual Annotation for Deep Learning Model Development | not yet |
| 10 | The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use | not yet |
| 10 | LoRA-LiteE: A Computationally Efficient Framework for Chatbot Preference-Tuning | not yet |
| 10 | Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows | not yet |
| 10 | Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives | not yet |
| 10 | From Medprompt to o1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond | not yet |
| 10 | Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level | not yet |
| 10 | TableGPT2: A Large Multimodal Model with Tabular Data Integration | not yet |
| 10 | Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis | not yet |
| 9 | GRAPE: Generalizing Robot Policy via Preference Alignment | not yet |
| 9 | AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers | not yet |
| 9 | TimeMarker: A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability | not yet |
| 9 | SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation | not yet |
| 9 | WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model | not yet |
| 9 | MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs | not yet |
| 9 | XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models | not yet |
| 9 | Towards Next-Generation Medical Agent: How o1 is Reshaping Decision-Making in Medical Scenarios | not yet |
| 9 | Multimodal Autoregressive Pre-training of Large Vision Encoders | not yet |
| 9 | BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games | not yet |
| 9 | Disentangling Memory and Reasoning Ability in Large Language Models | not yet |
| 9 | A Combined Encoder and Transformer Approach for Coherent and High-Quality Text Generation | not yet |
| 9 | BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices | not yet |
| 9 | Large Wireless Model (LWM): A Foundation Model for Wireless Channels | not yet |
| 9 | Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents | not yet |
| 9 | A Survey on Kolmogorov-Arnold Network | not yet |
| 9 | Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks | not yet |
| 9 | LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation | |
| 9 | GUI Agents with Foundation Models: A Comprehensive Survey | not yet |
| 9 | Advanced RAG Models with Graph Structures: Optimizing Complex Knowledge Reasoning and Text Generation | not yet |
| 9 | Distributionally Robust Optimization | not yet |
| 9 | Attacking Vision-Language Computer Agents via Pop-ups | not yet |
| 9 | WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning | not yet |
| 9 | On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback | not yet |
| 9 | DexHub and DART: Towards Internet Scale Robot Data Collection | not yet |
| 9 | GameGen-X: Interactive Open-world Game Video Generation | |
| 9 | A Public Dataset Tracking Social Media Discourse about the 2024 U.S. Presidential Election on Twitter/X | not yet |
| 9 | RSL-SQL: Robust Schema Linking in Text-to-SQL Generation | not yet |
| 8 | Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS | not yet |
| 8 | Filter, Correlate, Compress: Training-Free Token Reduction for MLLM Acceleration | not yet |
| 8 | Transformers are Deep Optimizers: Provable In-Context Learning for Deep Model Training | not yet |
| 8 | LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training | not yet |
| 8 | FocusLLaVA: A Coarse-to-Fine Approach for Efficient and Effective Visual Token Compression | not yet |
| 8 | Evaluating the Robustness of Analogical Reasoning in Large Language Models | not yet |
| 8 | When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training | not yet |
| 8 | Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension | not yet |
| 8 | Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models | |
| 8 | Understanding Chain-of-Thought in LLMs through Information Theory | not yet |
| 8 | ACE2: Accurately learning subseasonal to decadal atmospheric variability and forced responses | not yet |
| 8 | AnimateAnything: Consistent and Controllable Animation for Video Generation | not yet |
| 8 | Seeing Clearly by Layer Two: Enhancing Attention Heads to Alleviate Hallucination in LVLMs | not yet |
| 8 | Golden Noise for Diffusion Models: A Learning Framework | |
| 8 | Game-theoretic LLM: Agent Workflow for Negotiation Games | not yet |
| 8 | Autoregressive Models in Vision: A Survey | not yet |
| 8 | LLMs as Research Tools: A Large Scale Survey of Researchers' Usage and Perceptions | not yet |
| 8 | Quantum speedups in solving near-symmetric optimization problems by low-depth QAOA | not yet |
| 8 | Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding | not yet |
| 8 | Science and Project Planning for the Forward Physics Facility in Preparation for the 2024-2026 European Particle Physics Strategy Update | not yet |
| 8 | Evaluation data contamination in LLMs: how do we measure it and (when) does it matter? | not yet |
| 8 | Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback | not yet |
| 8 | What do sin$(x)$ and arcsinh$(x)$ have in Common? | not yet |
| 8 | Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement | not yet |
| 8 | A Lorentz-Equivariant Transformer for All of the LHC | not yet |
| 8 | Project Sid: Many-agent simulations toward AI civilization | not yet |
| 7 | Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation | not yet |
| 7 | $H^3$Fusion: Helpful, Harmless, Honest Fusion of Aligned LLMs | not yet |
| 7 | I2VControl: Disentangled and Unified Video Motion Synthesis Control | not yet |

* Citation counts are taken from NASA ADS data as of the update date.
https://ui.adsabs.harvard.edu/
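
The article does not say how the ranking is actually produced, so the following is only a minimal sketch of ordering papers by citation count once the counts are in hand. The `Paper` type and the `rank_by_citations` function are hypothetical names introduced for illustration; the sample counts and titles are copied from the ranking above, and fetching the data from NASA ADS is not shown.

```python
from typing import NamedTuple


class Paper(NamedTuple):
    """Hypothetical record: one paper and its citation count."""
    citations: int
    title: str


def rank_by_citations(papers: list[Paper]) -> list[Paper]:
    """Return papers sorted by citation count, highest first.

    sorted() is stable, so papers with equal counts keep their input order.
    """
    return sorted(papers, key=lambda p: p.citations, reverse=True)


# Sample rows taken from the ranking (counts as of the April 4, 2025 update).
papers = [
    Paper(30, "Measuring short-form factuality in large language models"),
    Paper(83, "Tulu 3: Pushing Frontiers in Open Language Model Post-Training"),
    Paper(22, "Scaling Laws for Precision"),
    Paper(72, "A Survey on LLM-as-a-Judge"),
    Paper(22, "Does Prompt Formatting Have Any Impact on LLM Performance?"),
]

ranked = rank_by_citations(papers)
for paper in ranked:
    print(f"{paper.citations:>4}  {paper.title}")
```

Because citation counts change over time, rerunning a sort like this on fresh counts would give a different order, which is consistent with the note that the ranking is updated from time to time.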
