[arXiv Paper Ranking: Published October 2024] 2410.xxxxx

Posted at 2024-12-11

AI paper commentary YouTube channel: AI時代の羅針盤

This ranking orders cs-category papers published around October 2024 (ID: 2410.xxxxx) by citation count. The ranking is updated from time to time.
(Updated April 4, 2025)

| Citations | Title | Video |
|---|---|---|
| 604 | GPT-4o System Card | not yet |
| 155 | Movie Gen: A Cast of Media Foundation Models | |
| 134 | GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models | |
| 93 | $\pi_0$: A Vision-Language-Action Flow Model for General Robot Control | not yet |
| 91 | Video Instruction Tuning With Synthetic Data | not yet |
| 82 | Pixtral 12B | not yet |
| 56 | Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation | not yet |
| 53 | O1 Replication Journey: A Strategic Progress Report -- Part 1 | not yet |
| 53 | RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation | not yet |
| 52 | Depth Pro: Sharp Monocular Metric Depth in Less Than a Second | |
| 52 | Moshi: a speech-text foundation model for real-time dialogue | not yet |
| 46 | Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs | not yet |
| 46 | Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models | not yet |
| 46 | MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion | not yet |
| 45 | YOLOv11: An Overview of the Key Architectural Enhancements | not yet |
| 45 | Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning | not yet |
| 43 | GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation | not yet |
| 43 | Pyramidal Flow Matching for Efficient Video Generative Modeling | |
| 41 | LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning | not yet |
| 40 | LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding | |
| 39 | Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think | not yet |
| 38 | Aria: An Open Multimodal Native Mixture-of-Experts Model | not yet |
| 37 | OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models | not yet |
| 36 | SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference | not yet |
| 36 | OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data | not yet |
| 35 | Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge | not yet |
| 35 | How to Train Long-Context Language Models (Effectively) | not yet |
| 34 | Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens | |
| 33 | MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering | |
| 32 | Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models | not yet |
| 30 | LLaVA-Critic: Learning to Evaluate Multimodal Models | not yet |
| 29 | ALOHA Unleashed: A Simple Recipe for Robot Dexterity | not yet |
| 29 | DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads | |
| 29 | SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers | not yet |
| 28 | HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | not yet |
| 28 | Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models | not yet |
| 27 | CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos | not yet |
| 27 | Mini-Omni2: Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities | not yet |
| 27 | AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents | not yet |
| 27 | Differential Transformer | |
| 26 | Orb: A Fast, Scalable Neural Network Potential | not yet |
| 26 | Data Scaling Laws in Imitation Learning for Robotic Manipulation | not yet |
| 26 | PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction | not yet |
| 26 | F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching | |
| 25 | RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning | |
| 25 | HelpSteer2-Preference: Complementing Ratings with Preferences | not yet |
| 24 | Rethinking Visual Dependency in Long-Context Reasoning for Large Vision-Language Models | not yet |
| 23 | Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models | |
| 23 | Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations | not yet |
| 23 | Loong: Generating Minute-level Long Videos with Autoregressive Language Models | not yet |
| 23 | HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly | not yet |
| 22 | Generalizable Humanoid Manipulation with 3D Diffusion Policies | not yet |
| 22 | HSR-Enhanced Sparse Attention Acceleration | not yet |
| 22 | Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents | not yet |
| 22 | A Survey on Diffusion Models for Inverse Problems | not yet |
| 21 | Liger Kernel: Efficient Triton Kernels for LLM Training | not yet |
| 21 | Agent-as-a-Judge: Evaluate Agents with Agents | |
| 21 | AFlow: Automating Agentic Workflow Generation | not yet |
| 21 | Looped ReLU MLPs May Be All You Need as Practical Programmable Computers | not yet |
| 21 | Baichuan-Omni Technical Report | not yet |
| 20 | DepthSplat: Connecting Gaussian Splatting and Depth | not yet |
| 20 | A Comparative Study on Reasoning Patterns of OpenAI's o1 Model | not yet |
| 20 | LightRAG: Simple and Fast Retrieval-Augmented Generation | not yet |
| 20 | ImageFolder: Autoregressive Image Generation with Folded Tokens | not yet |
| 19 | Improve Vision Language Model Chain-of-thought Reasoning | not yet |
| 19 | Allegro: Open the Black Box of Commercial-Level Video Generation Model | |
| 19 | Generative Reward Models | not yet |
| 19 | JudgeBench: A Benchmark for Evaluating LLM-based Judges | not yet |
| 19 | VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents | not yet |
| 19 | Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training | not yet |
| 19 | VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks | not yet |
| 19 | Strong Model Collapse | |
| 19 | CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL | not yet |
| 18 | No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images | not yet |
| 18 | ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference | not yet |
| 18 | MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark | not yet |
| 18 | Performance of the CMS high-level trigger during LHC Run 2 | not yet |
| 18 | A Survey on Data Synthesis and Augmentation for Large Language Models | not yet |
| 18 | TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models | not yet |
| 18 | Fine-grained Attention I/O Complexity: Comprehensive Analysis for Backward Passes | not yet |
| 18 | DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation | not yet |
| 18 | Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models | |
| 17 | EMMA: End-to-End Multimodal Model for Autonomous Driving | not yet |
| 17 | Automatically Interpreting Millions of Features in Large Language Models | not yet |
| 17 | Latent Action Pretraining from Videos | |
| 17 | Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix | not yet |
| 17 | When Attention Sink Emerges in Language Models: An Empirical View | not yet |
| 17 | Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis | not yet |
| 17 | ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery | not yet |
| 17 | SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains? | not yet |
| 17 | AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark | not yet |
| 17 | ManiSkill3: GPU Parallelized Robotics Simulation and Rendering for Generalizable Embodied AI | not yet |
| 16 | VoiceBench: Benchmarking LLM-Based Voice Assistants | not yet |
| 16 | TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling | not yet |
| 16 | DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation | not yet |
| 16 | MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models | not yet |
| 16 | How to Construct Random Unitaries | not yet |
| 16 | Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation | not yet |
| 16 | Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making | not yet |
| 16 | Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG | not yet |
| 16 | Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise | not yet |
| 15 | Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics | |
| 15 | Jailbreaking and Mitigation of Vulnerabilities in Large Language Models | not yet |
| 15 | Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent | not yet |
| 15 | Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues | not yet |
| 15 | Impurities and polarons in bosonic quantum gases: a review on recent progress | not yet |
| 15 | Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow | not yet |
| 15 | T2V-Turbo-v2: Enhancing Video Generation Model Post-Training through Data, Reward, and Conditional Guidance Design | not yet |
| 15 | Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification | not yet |
| 15 | AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs | not yet |
| 15 | ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection | not yet |
| 15 | Inference Scaling for Long-Context Retrieval Augmented Generation | |
| 15 | LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations | |
| 15 | AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models | not yet |
| 15 | VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment | not yet |
| 15 | Uncertainty-aware Reward Model: Teaching Reward Models to Know What is Unknown | not yet |
| 14 | In-Context LoRA for Diffusion Transformers | not yet |
| 14 | Social Science Meets LLMs: How Reliable Are Large Language Models in Social Simulations? | not yet |
| 14 | OS-ATLAS: A Foundation Action Model for Generalist GUI Agents | not yet |
| 14 | HOVER: Versatile Neural Whole-Body Controller for Humanoid Robots | not yet |
| 14 | Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages | not yet |
| 14 | Jailbreaking LLM-Controlled Robots | not yet |
| 14 | SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs | not yet |
| 14 | The Ingredients for Robotic Diffusion Transformers | not yet |
| 14 | ARCap: Collecting High-quality Human Demonstrations for Robot Learning with Augmented Reality Feedback | not yet |
| 14 | Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation | not yet |
| 14 | Falcon Mamba: The First Competitive Attention-free 7B Language Model | not yet |
| 14 | TextHawk2: A Large Vision-Language Model Excels in Bilingual OCR and Grounding with 16x Fewer Tokens | not yet |
| 14 | CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring the (Lack of) Cultural Knowledge of LLMs | not yet |
| 14 | Interpretable Contrastive Monte Carlo Tree Search Reasoning | not yet |
| 13 | SelfCodeAlign: Self-Alignment for Code Generation | |
| 13 | MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision | not yet |
| 13 | Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms | not yet |
| 13 | WorldSimBench: Towards Video Generation Models as World Simulators | not yet |
| 13 | The XLZD Design Book: Towards the Next-Generation Liquid Xenon Observatory for Dark Matter and Neutrino Physics | not yet |
| 13 | Thinking LLMs: General Instruction Following with Thought Generation | |
| 13 | Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System | not yet |
| 13 | Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents | not yet |
| 13 | IC3M: In-Car Multimodal Multi-object Monitoring for Abnormal Status of Both Driver and Passengers | not yet |
| 12 | On Memorization of Large Language Models in Logical Reasoning | not yet |
| 12 | CaloChallenge 2022: A Community Challenge for Fast Calorimeter Simulation | not yet |
| 12 | Safety cases for frontier AI | not yet |
| 12 | MarDini: Masked Autoregressive Diffusion for Video Generation at Scale | not yet |
| 12 | SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models | not yet |
| 12 | Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data | not yet |
| 12 | Scaling Diffusion Language Models via Adaptation from Autoregressive Models | not yet |
| 12 | Self-Supervised Graph Neural Networks for Enhanced Feature Extraction in Heterogeneous Information Networks | not yet |
| 12 | Efficient and Aesthetic UI Design with a Deep Learning-Based Interface Generation Tree Algorithm | not yet |
| 12 | RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style | not yet |
| 12 | REEF: Representation Encoding Fingerprints for Large Language Models | not yet |
| 12 | DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control | not yet |
| 12 | Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats | not yet |
| 12 | SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation | not yet |
| 12 | G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks | not yet |
| 12 | LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory | not yet |
| 12 | IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation | not yet |
| 12 | Sparse Autoencoders Reveal Universal Feature Spaces Across Large Language Models | not yet |
| 12 | Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning | not yet |
| 12 | Learning How Hard to Think: Input-Adaptive Allocation of LM Computation | not yet |
| 12 | Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models | not yet |
| 12 | FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models | not yet |
| 12 | Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems | not yet |
| 12 | Were RNNs All We Needed? | |
| 11 | One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation | not yet |
| 11 | AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions | |
| 11 | Fast Best-of-N Decoding via Speculative Rejection | not yet |
| 11 | Pay Attention and Move Better: Harnessing Attention for Interactive Motion Generation and Training-free Editing | not yet |
| 11 | Why Does the Effective Context Length of LLMs Fall Short? | not yet |
| 11 | One-Step Diffusion Distillation through Score Implicit Matching | not yet |
| 11 | Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance | not yet |
| 11 | CamI2V: Camera-Controlled Image-to-Video Diffusion Model | not yet |
| 11 | A Recommendation Model Utilizing Separation Embedding and Self-Attention for Feature Mining | not yet |
| 11 | From PINNs to PIKANs: Recent Advances in Physics-Informed Machine Learning | not yet |
| 11 | Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents | not yet |
| 11 | Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws | not yet |
| 11 | Mechanistic? | not yet |
| 11 | Losing dimensions: Geometric memorization in generative diffusion | not yet |

※ Citation counts are taken from NASA ADS data as of the update date.
https://ui.adsabs.harvard.edu/
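
As a rough illustration of the ranking step described above, the sketch below sorts a few rows from the table by citation count. The `Paper` type and the `rank_by_citations` and `format_row` helpers are hypothetical names introduced here, not the author's actual script, and fetching the counts from NASA ADS is deliberately left out.

```python
from typing import NamedTuple


class Paper(NamedTuple):
    """Hypothetical record type: one row of the ranking."""
    title: str
    citations: int  # citation count, e.g. as reported by NASA ADS


def rank_by_citations(papers: list[Paper]) -> list[Paper]:
    """Return papers sorted by citation count, highest first.

    sorted() is stable, so papers with equal counts keep their input order.
    """
    return sorted(papers, key=lambda p: p.citations, reverse=True)


def format_row(paper: Paper) -> str:
    """Render one ranking row as 'count<TAB>title'."""
    return f"{paper.citations}\t{paper.title}"


# Sample rows copied from the table, deliberately out of order.
sample = [
    Paper("Pixtral 12B", 82),
    Paper("GPT-4o System Card", 604),
    Paper("Depth Pro: Sharp Monocular Metric Depth in Less Than a Second", 52),
    Paper("Movie Gen: A Cast of Media Foundation Models", 155),
    Paper("Moshi: a speech-text foundation model for real-time dialogue", 52),
]

ranking = rank_by_citations(sample)
for paper in ranking:
    print(format_row(paper))
```

Tied entries (here the two papers with 52 citations) keep their input order. That is just one possible tie-breaking choice; the article does not say how ties are ordered.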
