[arXiv Paper Ranking: Published August 2024] 2408.xxxxx

Posted at 2024-10-01

AI paper explainer YouTube channel: AI時代の羅針盤

This page ranks cs-category papers published around August 2024 (ID: 2408.xxxxx) by citation count. The ranking is updated from time to time.
(Updated December 9, 2024)

| Citations | Title | Video |
|---|---|---|
| 270 | Gemma 2: Improving Open Language Models at a Practical Size | |
| 200 | SAM 2: Segment Anything in Images and Videos | |
| 129 | LLaVA-OneVision: Easy Visual Task Transfer | not yet |
| 107 | MiniCPM-V: A GPT-4V Level MLLM on Your Phone | |
| 101 | CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer | not yet |
| 84 | Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters | |
| 57 | The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | |
| 50 | Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model | not yet |
| 48 | Show-o: One Single Transformer to Unify Multimodal Understanding and Generation | not yet |
| 32 | CogVLM2: Visual Language Models for Image and Video Understanding | not yet |
| 31 | Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 | not yet |
| 28 | KAN 2.0: Kolmogorov-Arnold Networks Meet Science | not yet |
| 28 | VITA: Towards Open-Source Interactive Omni Multimodal LLM | |
| 26 | Generative Verifiers: Reward Modeling as Next-Token Prediction | not yet |
| 26 | xGen-MM (BLIP-3): A Family of Open Large Multimodal Models | not yet |
| 25 | Self-Taught Evaluators | not yet |
| 24 | Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities | |
| 24 | mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models | not yet |
| 23 | Medical SAM 2: Segment medical images as video via Segment Anything Model 2 | |
| 22 | ControlNeXt: Powerful and Efficient Control for Image and Video Generation | not yet |
| 21 | Towards Resilient and Efficient LLMs: A Comparative Study of Efficiency, Performance, and Adversarial Robustness | not yet |
| 20 | Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents | |
| 20 | Tamper-Resistant Safeguards for Open-Weight LLMs | not yet |
| 19 | Identification of Prognostic Biomarkers for Stage III Non-Small Cell Lung Carcinoma in Female Nonsmokers Using Machine Learning | not yet |
| 18 | Building and better understanding vision-language models: insights and future directions | not yet |
| 18 | Imagen 3 | |
| 18 | Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining | not yet |
| 17 | Adaptive Friction in Deep Learning: Enhancing Optimizers with Sigmoid and Tanh Function | not yet |
| 17 | Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models | not yet |
| 16 | A Sharp Convergence Theory for The Probability Flow ODEs of Diffusion Models | not yet |
| 15 | Automated Design of Agentic Systems | not yet |
| 15 | Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers | |
| 15 | Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models | not yet |
| 14 | Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling | |
| 14 | Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming | not yet |
| 14 | Optimizing Automated Picking Systems in Warehouse Robots Using Machine Learning | not yet |
| 14 | Diffusion Models Are Real-Time Game Engines | |
| 14 | LongVILA: Scaling Long-Context Visual Language Models for Long Videos | not yet |
| 14 | DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search | not yet |
| 14 | From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future | not yet |
| 13 | LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet | |
| 13 | Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time | not yet |
| 13 | Real-Time Video Generation with Pyramid Attention Broadcast | not yet |
| 13 | A universal neutral-atom quantum computer with individual optical addressing and non-destructive readout | not yet |
| 13 | Evaluating Modern Approaches in 3D Scene Reconstruction: NeRF vs Gaussian-Based Methods | not yet |
| 13 | Attention Mechanism and Context Modeling System for Text Mining Machine Translation | not yet |
| 13 | Advanced User Credit Risk Prediction Model using LightGBM, XGBoost and Tabnet with SMOTEENN | not yet |
| 12 | Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders | |
| 12 | A Survey on Benchmarks of Multimodal Large Language Models | not yet |
| 12 | LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs | not yet |
| 12 | Harnessing Earnings Reports for Stock Predictions: A QLoRA-Enhanced LLM Approach | |
| 12 | Capsule Vision 2024 Challenge: Multi-Class Abnormality Classification for Video Capsule Endoscopy | not yet |
| 12 | Enhancing Healthcare through Large Language Models: A Study on Medical Question Answering | not yet |
| 12 | GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI | not yet |
| 12 | Segment Anything in Medical Images and Videos: Benchmark and Deployment | not yet |
| 12 | MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine | not yet |
| 11 | OmniRe: Omni Urban Scene Reconstruction | not yet |
| 11 | WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling | not yet |
| 11 | Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input | not yet |
| 11 | Jamba-1.5: Hybrid Transformer-Mamba Models at Scale | not yet |
| 11 | A Tighter Complexity Analysis of SparseGPT | not yet |
| 11 | Critique-out-Loud Reward Models | not yet |
| 11 | Graph Retrieval-Augmented Generation: A Survey | not yet |
| 11 | Dynamic Hypergraph-Enhanced Prediction of Sequential Medical Visits | not yet |
| 11 | VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents | not yet |
| 11 | Deep Reinforcement Learning for Robotics: A Survey of Real-World Successes | not yet |
| 11 | Language Model Can Listen While Speaking | not yet |
| 10 | Review: Quantum Metrology and Sensing with Many-Body Systems | not yet |
| 10 | MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? | not yet |
| 10 | Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models | not yet |
| 10 | Algorithm Research of ELMo Word Embedding and Deep Learning Multimodal Transformer in Image Description | not yet |
| 10 | ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities | not yet |
| 10 | A comparative study of generative adversarial networks for image recognition algorithms based on deep learning and traditional methods | not yet |
| 10 | Research on Autonomous Driving Decision-making Strategies based Deep Reinforcement Learning | not yet |
| 10 | A Survey of Mamba | not yet |
| 10 | DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency | not yet |
| 9 | Text classification optimization algorithm based on graph neural network | not yet |
| 9 | MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents | not yet |
| 9 | ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM | not yet |
| 9 | Rhyme-aware Chinese lyric generator based on GPT | not yet |
| 9 | Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning | not yet |
| 9 | A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning | not yet |
| 9 | ECG-FM: An Open Electrocardiogram Foundation Model | not yet |
| 9 | Applying Conditional Generative Adversarial Networks for Imaging Diagnosis | not yet |
| 9 | DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving | not yet |
| 8 | ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model | not yet |
| 8 | Quantum Convolutional Neural Networks are (Effectively) Classically Simulable | not yet |
| 8 | Convolutional Neural Networks for Predictive Modeling of Lung Disease | not yet |
| 8 | Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation | not yet |
| 8 | LLM Pruning and Distillation in Practice: The Minitron Approach | not yet |
| 8 | Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models | not yet |
| 8 | RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation | not yet |
| 8 | The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models | not yet |
| 8 | Robust Domain Generalization for Multi-modal Object Recognition | not yet |
| 8 | A Survey of NL2SQL with Large Language Models: Where are we, and where are we going? | not yet |
| 8 | Data Poisoning in LLMs: Jailbreak-Tuning and Scaling Laws | not yet |
| 8 | Segment anything model 2: an application to 2D and 3D medical images | not yet |
| 8 | SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement | not yet |
| 7 | Machine Learning-Based Research on the Adaptability of Adolescents to Online Education | not yet |
| 7 | Foundation Models for Music: A Survey | not yet |
| 7 | Systematic Evaluation of LLM-as-a-Judge in LLM Alignment Tasks: Explainable Metrics and Diverse Prompt Templates | not yet |
| 7 | Research on Improved U-net Based Remote Sensing Image Segmentation Algorithm | not yet |
| 7 | Cross-border Commodity Pricing Strategy Optimization via Mixed Neural Network for Time Series Analysis | not yet |
| 7 | ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation | not yet |
| 7 | MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models | not yet |
| 7 | Asymptotically Good Quantum Codes with Transversal Non-Clifford Gates | not yet |
| 7 | Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents | not yet |
| 7 | HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction | not yet |
| 7 | Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation | not yet |
| 7 | Biomedical SAM 2: Segment Anything in Biomedical Images and Videos | not yet |
| 7 | Interactive 3D Medical Image Segmentation with SAM 2 | not yet |
| 7 | CYBERSECEVAL 3: Advancing the Evaluation of Cybersecurity Risks and Capabilities in Large Language Models | not yet |
| 7 | CFBench: A Comprehensive Constraints-Following Benchmark for LLMs | not yet |
| 7 | Alleviating Hallucination in Large Vision-Language Models with Active Retrieval Augmentation | not yet |
| 7 | Contrastive Graph Representation Learning with Adversarial Cross-view Reconstruction and Information Bottleneck | not yet |
| 6 | Self-Improving Diffusion Models with Synthetic Data | not yet |
| 6 | WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration | not yet |
| 6 | A Survey on Evaluation of Multimodal Large Language Models | |
| 6 | The Mamba in the Llama: Distilling and Accelerating Hybrid Models | not yet |
| 6 | Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs | not yet |
| 6 | The Ultimate Guide to Fine-Tuning LLMs from Basics to Breakthroughs: An Exhaustive Review of Technologies, Research, Best Practices, Applied Research Challenges and Opportunities | not yet |
| 6 | BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models | not yet |
| 6 | LLM-PBE: Assessing Data Privacy in Large Language Models | not yet |
| 6 | Scalable Autoregressive Image Generation with Mamba | not yet |
| 6 | Transformers are Minimax Optimal Nonparametric In-Context Learners | not yet |
| 6 | To Code, or Not To Code? Exploring Impact of Code in Pre-training | |
| 6 | LoopSplat: Loop Closure by Registering 3D Gaussian Splats | not yet |
| 6 | Classifier-Free Guidance is a Predictor-Corrector | not yet |
| 6 | Evaluating Research Quality with Large Language Models: An Analysis of ChatGPT's Effectiveness with Different Settings and Inputs | not yet |
| 6 | Physics-Informed Kolmogorov-Arnold Networks for Power System Dynamics | not yet |
| 6 | Fast John Ellipsoid Computation with Differential Privacy Optimization | not yet |
| 6 | Polynomial-time tolerant testing stabilizer states | not yet |
| 6 | UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization | not yet |
| 6 | SMILES-Mamba: Chemical Mamba Foundation Models for Drug ADMET Prediction | not yet |
| 6 | SWIFT:A Scalable lightWeight Infrastructure for Fine-Tuning | not yet |
| 6 | UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling | not yet |
| 6 | Understanding the Performance and Estimating the Cost of LLM Fine-Tuning | not yet |
| 6 | SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation | not yet |
| 6 | Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations | not yet |
| 6 | Floquet engineering of interactions and entanglement in periodically driven Rydberg chains | not yet |
| 6 | Evaluating and Enhancing LLMs Agent based on Theory of Mind in Guandan: A Multi-Player Cooperative Game under Imperfect Information | not yet |
| 6 | MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization | not yet |
| 6 | Mini-Monkey: Alleviating the Semantic Sawtooth Effect for Lightweight MLLMs via Complementary Image Pyramid | not yet |
| 6 | Transformers are Universal In-context Learners | not yet |
| 6 | Deep Learning in Medical Image Classification from MRI-based Brain Tumor Images | not yet |
| 6 | OmniParser for Pure Vision Based GUI Agent | not yet |
| 6 | Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models | not yet |
| 5 | Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning | not yet |
| 5 | Can We Leave Deepfake Data Behind in Training Deepfake Detector? | not yet |
| 5 | Safety Layers in Aligned Large Language Models: The Key to LLM Security | not yet |
| 5 | SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners | not yet |
| 5 | A Survey on Evaluating Large Language Models in Code Generation Tasks | not yet |
| 5 | FA-YOLO: Research On Efficient Feature Selection YOLO Improved Algorithm Based On FMDS and AGMF Modules | not yet |
| 5 | Beyond Uncertainty: Evidential Deep Learning for Robust Video Temporal Grounding | not yet |
| 5 | LeMON: Learning to Learn Multi-Operator Networks | not yet |
| 5 | In-Context Imitation Learning via Next-Token Prediction | not yet |
| 5 | LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation | not yet |
| 5 | TourSynbio: A Multi-Modal Large Model and Agent Framework to Bridge Text and Protein Sequences for Protein Engineering | not yet |
| 5 | Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods | not yet |
| 5 | Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning | not yet |
| 5 | Video-CCAM: Enhancing Video-Language Understanding with Causal Cross-Attention Masks for Short and Long Videos | not yet |
| 5 | DynaSurfGS: Dynamic Surface Reconstruction with Planar-based Gaussian Splatting | not yet |
| 5 | PIE: Parkour with Implicit-Explicit Learning Framework for Legged Robots | not yet |
| 5 | Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic | not yet |
| 5 | Selective Preference Optimization via Token-Level Reward Function Estimation | not yet |
| 5 | CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities | not yet |
| 5 | The AI Risk Repository: A Comprehensive Meta-Review, Database, and Taxonomy of Risks From Artificial Intelligence | not yet |
| 5 | Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications | not yet |
| 5 | Nothing in Excess: Mitigating the Exaggerated Safety for LLMs via Safety-Conscious Activation Steering | not yet |
| 5 | TrackGo: A Flexible and Efficient Method for Controllable Video Generation | not yet |
| 5 | MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding | not yet |
| 5 | HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models in Resource-Constrained Environments | not yet |
| 5 | Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations | not yet |
| 5 | HMoE: Heterogeneous Mixture of Experts for Language Modeling | not yet |
| 5 | AI-Driven Review Systems: Evaluating LLMs in Scalable and Bias-Aware Academic Reviews | not yet |
| 5 | MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model | not yet |
| 5 | Transferring Backdoors between Large Language Models by Knowledge Distillation | not yet |
| 5 | TableBench: A Comprehensive and Complex Benchmark for Table Question Answering | not yet |
| 5 | Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges | not yet |
| 5 | A Hassle-free Algorithm for Private Learning in Practice: Don't Use Tree Aggregation, Use BLTs | not yet |
| 5 | FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering | not yet |
| 5 | Language Models as Models of Language | not yet |
| 5 | Can LLMs Replace Manual Annotation of Software Engineering Artifacts? | not yet |
| 5 | Affective Computing in the Era of Large Language Models: A Survey from the NLP Perspective | not yet |
| 5 | MMREC: LLM Based Multi-Modal Recommender System | not yet |
| 5 | CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases | not yet |
| 5 | Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks | not yet |
| 5 | Attacks and Defenses for Generative Diffusion Models: A Comprehensive Survey | not yet |
| 5 | Synthesizing Text-to-SQL Data from Weak and Strong LLMs | |
| 5 | Huge Ensembles Part I: Design of Ensemble Weather Forecasts using Spherical Fourier Neural Operators | not yet |
| 5 | Compromising Embodied Agents with Contextual Backdoor Attacks | not yet |
| 5 | VidGen-1M: A Large-Scale Dataset for Text-to-video Generation | not yet |
| 5 | Designing Multi-layered Runtime Guardrails for Foundation Model Based Agents: Swiss Cheese Model for AI Safety by Design | not yet |
| 5 | Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2 | not yet |
| 5 | The Quest for the Right Mediator: A History, Survey, and Theoretical Grounding of Causal Interpretability | not yet |
| 5 | A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks | not yet |
| 5 | RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework | not yet |
| 5 | BioRAG: A RAG-LLM Framework for Biological Question Reasoning | not yet |
| 5 | MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities | not yet |
| 5 | Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model | not yet |
| 5 | Virchow2: Scaling Self-Supervised Mixed Magnification Models in Pathology | not yet |
| 5 | Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions | not yet |
| 5 | Inductive or Deductive? Rethinking the Fundamental Reasoning Abilities of LLMs | not yet |
| 4 | VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers | not yet |
| 4 | Dynamic Self-Consistency: Leveraging Reasoning Paths for Efficient LLM Sampling | not yet |
| 4 | Beyond Preferences in AI Alignment | not yet |
| 4 | Efficient LLM Scheduling by Learning to Rank | not yet |
| 4 | Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation | not yet |
| 4 | SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models | not yet |
| 4 | OctFusion: Octree-based Diffusion Models for 3D Shape Generation | not yet |
| 4 | Agentic Retrieval-Augmented Generation for Time Series Analysis | not yet |
| 4 | One-layer transformers fail to solve the induction heads task | not yet |
| 4 | Biomedical Large Languages Models Seem not to be Superior to Generalist Models on Unseen Medical Data | not yet |
| 4 | TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers | not yet |
| 4 | Segment Any Mesh: Zero-shot Mesh Part Segmentation via Lifting Segment Anything 2 to 3D | not yet |
| 4 | Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation | not yet |
| 4 | Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler | not yet |
| 4 | Evidential Deep Partial Multi-View Classification With Discount Fusion | not yet |
| 4 | Convergence of Unadjusted Langevin in High Dimensions: Delocalization of Bias | not yet |
| 4 | Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey | not yet |
| 4 | NanoFlow: Towards Optimal Large Language Model Serving Throughput | not yet |
| 4 | SQL-GEN: Bridging the Dialect Gap for Text-to-SQL Via Synthetic Data And Model Merging | not yet |
| 4 | DUNE Phase II: Scientific Opportunities, Detector Concepts, Technological Solutions | not yet |
| 4 | Non-Homophilic Graph Pre-Training and Prompt Learning | not yet |
| 4 | MEDCO: Medical Education Copilots Based on A Multi-Agent Framework | not yet |
| 4 | Better Debugging: Combining Static Analysis and LLMs for Explainable Crashing Fault Localization | not yet |
| 4 | A Deconfounding Approach to Climate Model Bias Correction | not yet |
| 4 | Let Community Rules Be Reflected in Online Content Moderation | not yet |
| 4 | How Susceptible are LLMs to Influence in Prompts? | not yet |
| 4 | Hermes 3 Technical Report | not yet |
| 4 | AppAgent v2: Advanced Agent for Flexible Mobile Interactions | not yet |
| 4 | DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework | not yet |
| 4 | Iterative Object Count Optimization for Text-to-image Diffusion Models | not yet |
| 4 | Bidirectional Gated Mamba for Sequential Recommendation | not yet |
| 4 | Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation | not yet |
| 4 | HITS: High-coverage LLM-based Unit Test Generation via Method Slicing | not yet |
| 4 | KAN4TSF: Are KAN and KAN-based models Effective for Time Series Forecasting? | not yet |
| 4 | GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting | not yet |
| 4 | Kilometer-Scale Convection Allowing Model Emulation using Generative Diffusion Modeling | not yet |
| 4 | AnyGraph: Graph Foundation Model in the Wild | not yet |
| 4 | Privacy-preserving Universal Adversarial Defense for Black-box Models | not yet |
| 4 | OpenCity: Open Spatio-Temporal Foundation Models for Traffic Prediction | not yet |
| 4 | SoK: Runtime Integrity | not yet |
| 4 | Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models | not yet |
| 4 | NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction | not yet |
| 4 | FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant | not yet |
| 4 | BLADE: Benchmarking Language Model Agents for Data-Driven Science | not yet |
| 4 | Out-of-distribution generalization via composition: a lens through induction heads in Transformers | not yet |
| 4 | V2X-VLM: End-to-End V2X Cooperative Autonomous Driving Through Large Vision-Language Models | not yet |
| 4 | LLMJudge: LLMs for Relevance Judgments | not yet |
| 4 | The Fellowship of the LLMs: Multi-Agent Workflows for Synthetic Preference Optimization Dataset Generation | not yet |
| 4 | A Mechanistic Interpretation of Syllogistic Reasoning in Auto-Regressive Language Models | not yet |
| 4 | Study of MRI-compatible Notched Plastic Ultrasonic Stator with FEM Simulation and Holography Validation | not yet |
| 4 | Activation Space Selectable Kolmogorov-Arnold Networks | not yet |
| 4 | TurboEdit: Instant text-based image editing | not yet |
| 4 | Benchmarking the Capabilities of Large Language Models in Transportation System Engineering: Accuracy, Consistency, and Reasoning Behaviors | not yet |
| 4 | Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding | not yet |
| 4 | Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning | not yet |
| 4 | Squeezed states of light after high-harmonic generation in excited atomic systems | not yet |
| 4 | Kolmogorov-Arnold Networks (KAN) for Time Series Classification and Robust Analysis | not yet |
| 4 | Enhancing Visual Question Answering through Ranking-Based Hybrid Training and Multimodal Fusion | not yet |
| 4 | The Design of Autonomous UAV Prototypes for Inspecting Tunnel Construction Environment | not yet |
| 4 | ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area | not yet |
| 4 | Prompt-Based Segmentation at Multiple Resolutions and Lighting Conditions using Segment Anything Model 2 | not yet |
| 4 | Stabilizer bootstrapping: A recipe for efficient agnostic tomography and magic estimation | not yet |
| 4 | Med42-v2: A Suite of Clinical LLMs | not yet |
| 4 | LaWa: Using Latent Space for In-Generation Image Watermarking | not yet |
| 4 | A Decoding Acceleration Framework for Industrial Deployable LLM-based Recommender Systems | not yet |
| 4 | Research on Heterogeneous Computation Resource Allocation based on Data-driven Method | not yet |
| 4 | Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network | not yet |
| 4 | VACoDe: Visual Augmented Contrastive Decoding | not yet |
| 4 | Revisiting Multi-Modal LLM Evaluation | not yet |
| 4 | Performance Analysis of FAS-Aided NOMA-ISAC: A Backscattering Scenario | not yet |
| 4 | Multi-Turn Context Jailbreak Attack on Large Language Models From First Principles | not yet |
| 4 | Perceive, Reflect, and Plan: Designing LLM Agent for Goal-Directed City Navigation without Instructions | not yet |
| 4 | Building Machines that Learn and Think with People | |
| 4 | Decoding Biases: Automated Methods and LLM Judges for Gender Bias Detection in Language Models | not yet |
| 4 | Achieving Human Level Competitive Robot Table Tennis | not yet |
| 4 | From Data to Story: Towards Automatic Animated Data Video Creation with LLM-based Multi-Agent Systems | not yet |
| 4 | Improving LLM-based Unit test generation via Template-based Repair | not yet |
| 4 | 500xCompressor: Generalized Prompt Compression for Large Language Models | not yet |
| 4 | Data-Driven Stochastic Closure Modeling via Conditional Diffusion Model and Neural Operator | not yet |
| 4 | Quantum simulation of dynamical gauge theories in periodically driven Rydberg atom arrays | not yet |
| 4 | Operationalizing Contextual Integrity in Privacy-Conscious Assistants | not yet |
| 4 | First search for dark photon dark matter with a MADMAX prototype | not yet |
| 4 | Potential Hessian Ascent: The Sherrington-Kirkpatrick Model | not yet |
| 4 | SpecRover: Code Intent Extraction via LLMs | not yet |
| 4 | Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models | not yet |
| 4 | Reinforcement Learning for an Efficient and Effective Malware Investigation during Cyber Incident Response | not yet |
| 4 | Differentiable MadNIS-Lite | not yet |
| 4 | MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models | not yet |
| 4 | Wave-Mamba: Wavelet State Space Model for Ultra-High-Definition Low-Light Image Enhancement | not yet |
| 4 | VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling | not yet |
| 4 | IG-SLAM: Instant Gaussian SLAM | not yet |
| 4 | On the Resilience of Multi-Agent Systems with Malicious Agents | not yet |
| 4 | Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention | not yet |
| 4 | Matched Guiding and Controlled Injection in Dark-Current-Free, 10-GeV-Class, Channel-Guided Laser Plasma Accelerators | not yet |
| 4 | AutoM3L: An Automated Multimodal Machine Learning Framework with Large Language Models | not yet |
| 4 | End-to-End Protocol for High-Quality QAOA Parameters with Few Shots | not yet |
| 4 | SegStitch: Multidimensional Transformer for Robust and Efficient Medical Imaging Segmentation | not yet |
| 4 | 3D U-KAN Implementation for Multi-modal MRI Brain Tumor Segmentation | not yet |
| 4 | Generative Learning of the Solution of Parametric Partial Differential Equations Using Guided Diffusion Models and Virtual Observations | not yet |
| 3 | Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model | not yet |
| 3 | Deep Feature Embedding for Tabular Data | not yet |
| 3 | Generalizing Deepfake Video Detection with Plug-and-Play: Video-Level Blending and Spatiotemporal Adapter Tuning | not yet |
| 3 | Maven: A Multimodal Foundation Model for Supernova Science | not yet |
| 3 | Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems | |
| 3 | Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation | not yet |
| 3 | SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge | not yet |
| 3 | Explicit Folded Reed-Solomon and Multiplicity Codes Achieve Relaxed Generalized Singleton Bounds | not yet |
| 3 | Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty | not yet |
| 3 | Robo-GS: A Physics Consistent Spatial-Temporal Model for Robotic Arm with Hybrid Representation | not yet |
| 3 | GINN-KAN: Interpretability pipelining with applications in Physics Informed Neural Networks | not yet |
| 3 | Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning | not yet |
| 3 | Snap and Diagnose: An Advanced Multimodal Retrieval System for Identifying Plant Diseases in the Wild | not yet |

Note: Citation counts are taken from NASA ADS data as of the update date.
https://ui.adsabs.harvard.edu/
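
The article says only that the papers are ranked by citation count; it does not show how the ranking is produced. As a minimal sketch of that step, assuming the counts have already been collected into memory, the ordering could look like the following. The `Paper` class and `rank_by_citations` function are hypothetical names introduced here for illustration and are not part of the original article. The sample rows are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    # Hypothetical record type: one row of the ranking table.
    title: str
    citations: int


def rank_by_citations(papers):
    """Return papers sorted by citation count, highest first.

    sorted() is stable, so papers with equal counts keep their input order.
    """
    return sorted(papers, key=lambda p: p.citations, reverse=True)


# Sample rows copied from the table (counts as of the 2024-12-09 update).
papers = [
    Paper("KAN 2.0: Kolmogorov-Arnold Networks Meet Science", 28),
    Paper("SAM 2: Segment Anything in Images and Videos", 200),
    Paper("VITA: Towards Open-Source Interactive Omni Multimodal LLM", 28),
    Paper("Gemma 2: Improving Open Language Models at a Practical Size", 270),
]

ranking = rank_by_citations(papers)
for paper in ranking:
    print(f"{paper.citations:>4}  {paper.title}")
```

Because the sort is stable, tied entries such as the two papers with 28 citations stay in whatever order the input supplied. How the article itself orders ties is not stated.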
