[arXiv Paper Ranking: Published August 2024] 2408.xxxxx

Last updated at 2024-12-09. Posted at 2024-10-01.

AI paper explainer YouTube channel: AI時代の羅針盤

Papers in the cs category published around August 2024 (ID: 2408.xxxxx) are ranked here by citation count. The ranking is updated from time to time.
(Updated December 9, 2024)

Citations	Title	Video
270	Gemma 2: Improving Open Language Models at a Practical Size
200	SAM 2: Segment Anything in Images and Videos
129	LLaVA-OneVision: Easy Visual Task Transfer	not yet
107	MiniCPM-V: A GPT-4V Level MLLM on Your Phone
101	CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer	not yet
84	Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
57	The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
50	Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model	not yet
48	Show-o: One Single Transformer to Unify Multimodal Understanding and Generation	not yet
32	CogVLM2: Visual Language Models for Image and Video Understanding	not yet
31	Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2	not yet
28	KAN 2.0: Kolmogorov-Arnold Networks Meet Science	not yet
28	VITA: Towards Open-Source Interactive Omni Multimodal LLM
26	Generative Verifiers: Reward Modeling as Next-Token Prediction	not yet
26	xGen-MM (BLIP-3): A Family of Open Large Multimodal Models	not yet
25	Self-Taught Evaluators	not yet
24	Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities
24	mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models	not yet
23	Medical SAM 2: Segment medical images as video via Segment Anything Model 2
22	ControlNeXt: Powerful and Efficient Control for Image and Video Generation	not yet
21	Towards Resilient and Efficient LLMs: A Comparative Study of Efficiency, Performance, and Adversarial Robustness	not yet
20	Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
20	Tamper-Resistant Safeguards for Open-Weight LLMs	not yet
19	Identification of Prognostic Biomarkers for Stage III Non-Small Cell Lung Carcinoma in Female Nonsmokers Using Machine Learning	not yet
18	Building and better understanding vision-language models: insights and future directions	not yet
18	Imagen 3
18	Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining	not yet
17	Adaptive Friction in Deep Learning: Enhancing Optimizers with Sigmoid and Tanh Function	not yet
17	Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models	not yet
16	A Sharp Convergence Theory for The Probability Flow ODEs of Diffusion Models	not yet
15	Automated Design of Agentic Systems	not yet
15	Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
15	Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models	not yet
14	Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
14	Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming	not yet
14	Optimizing Automated Picking Systems in Warehouse Robots Using Machine Learning	not yet
14	Diffusion Models Are Real-Time Game Engines
14	LongVILA: Scaling Long-Context Visual Language Models for Long Videos	not yet
14	DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search	not yet
14	From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future	not yet
13	LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet
13	Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time	not yet
13	Real-Time Video Generation with Pyramid Attention Broadcast	not yet
13	A universal neutral-atom quantum computer with individual optical addressing and non-destructive readout	not yet
13	Evaluating Modern Approaches in 3D Scene Reconstruction: NeRF vs Gaussian-Based Methods	not yet
13	Attention Mechanism and Context Modeling System for Text Mining Machine Translation	not yet
13	Advanced User Credit Risk Prediction Model using LightGBM, XGBoost and Tabnet with SMOTEENN	not yet
12	Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
12	A Survey on Benchmarks of Multimodal Large Language Models	not yet
12	LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs	not yet
12	Harnessing Earnings Reports for Stock Predictions: A QLoRA-Enhanced LLM Approach
12	Capsule Vision 2024 Challenge: Multi-Class Abnormality Classification for Video Capsule Endoscopy	not yet
12	Enhancing Healthcare through Large Language Models: A Study on Medical Question Answering	not yet
12	GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI	not yet
12	Segment Anything in Medical Images and Videos: Benchmark and Deployment	not yet
12	MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine	not yet
11	OmniRe: Omni Urban Scene Reconstruction	not yet
11	WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling	not yet
11	Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input	not yet
11	Jamba-1.5: Hybrid Transformer-Mamba Models at Scale	not yet
11	A Tighter Complexity Analysis of SparseGPT	not yet
11	Critique-out-Loud Reward Models	not yet
11	Graph Retrieval-Augmented Generation: A Survey	not yet
11	Dynamic Hypergraph-Enhanced Prediction of Sequential Medical Visits	not yet
11	VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents	not yet
11	Deep Reinforcement Learning for Robotics: A Survey of Real-World Successes	not yet
11	Language Model Can Listen While Speaking	not yet
10	Review: Quantum Metrology and Sensing with Many-Body Systems	not yet
10	MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?	not yet
10	Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models	not yet
10	Algorithm Research of ELMo Word Embedding and Deep Learning Multimodal Transformer in Image Description	not yet
10	ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities	not yet
10	A comparative study of generative adversarial networks for image recognition algorithms based on deep learning and traditional methods	not yet
10	Research on Autonomous Driving Decision-making Strategies based Deep Reinforcement Learning	not yet
10	A Survey of Mamba	not yet
10	DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency	not yet
9	Text classification optimization algorithm based on graph neural network	not yet
9	MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents	not yet
9	ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM	not yet
9	Rhyme-aware Chinese lyric generator based on GPT	not yet
9	Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning	not yet
9	A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning	not yet
9	ECG-FM: An Open Electrocardiogram Foundation Model	not yet
9	Applying Conditional Generative Adversarial Networks for Imaging Diagnosis	not yet
9	DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving	not yet
8	ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model	not yet
8	Quantum Convolutional Neural Networks are (Effectively) Classically Simulable	not yet
8	Convolutional Neural Networks for Predictive Modeling of Lung Disease	not yet
8	Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation	not yet
8	LLM Pruning and Distillation in Practice: The Minitron Approach	not yet
8	Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models	not yet
8	RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation	not yet
8	The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models	not yet
8	Robust Domain Generalization for Multi-modal Object Recognition	not yet
8	A Survey of NL2SQL with Large Language Models: Where are we, and where are we going?	not yet
8	Data Poisoning in LLMs: Jailbreak-Tuning and Scaling Laws	not yet
8	Segment anything model 2: an application to 2D and 3D medical images	not yet
8	SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement	not yet
7	Machine Learning-Based Research on the Adaptability of Adolescents to Online Education	not yet
7	Foundation Models for Music: A Survey	not yet
7	Systematic Evaluation of LLM-as-a-Judge in LLM Alignment Tasks: Explainable Metrics and Diverse Prompt Templates	not yet
7	Research on Improved U-net Based Remote Sensing Image Segmentation Algorithm	not yet
7	Cross-border Commodity Pricing Strategy Optimization via Mixed Neural Network for Time Series Analysis	not yet
7	ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation	not yet
7	MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models	not yet
7	Asymptotically Good Quantum Codes with Transversal Non-Clifford Gates	not yet
7	Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents	not yet
7	HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction	not yet
7	Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation	not yet
7	Biomedical SAM 2: Segment Anything in Biomedical Images and Videos	not yet
7	Interactive 3D Medical Image Segmentation with SAM 2	not yet
7	CYBERSECEVAL 3: Advancing the Evaluation of Cybersecurity Risks and Capabilities in Large Language Models	not yet
7	CFBench: A Comprehensive Constraints-Following Benchmark for LLMs	not yet
7	Alleviating Hallucination in Large Vision-Language Models with Active Retrieval Augmentation	not yet
7	Contrastive Graph Representation Learning with Adversarial Cross-view Reconstruction and Information Bottleneck	not yet
6	Self-Improving Diffusion Models with Synthetic Data	not yet
6	WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration	not yet
6	A Survey on Evaluation of Multimodal Large Language Models
6	The Mamba in the Llama: Distilling and Accelerating Hybrid Models	not yet
6	Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs	not yet
6	The Ultimate Guide to Fine-Tuning LLMs from Basics to Breakthroughs: An Exhaustive Review of Technologies, Research, Best Practices, Applied Research Challenges and Opportunities	not yet
6	BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models	not yet
6	LLM-PBE: Assessing Data Privacy in Large Language Models	not yet
6	Scalable Autoregressive Image Generation with Mamba	not yet
6	Transformers are Minimax Optimal Nonparametric In-Context Learners	not yet
6	To Code, or Not To Code? Exploring Impact of Code in Pre-training
6	LoopSplat: Loop Closure by Registering 3D Gaussian Splats	not yet
6	Classifier-Free Guidance is a Predictor-Corrector	not yet
6	Evaluating Research Quality with Large Language Models: An Analysis of ChatGPT's Effectiveness with Different Settings and Inputs	not yet
6	Physics-Informed Kolmogorov-Arnold Networks for Power System Dynamics	not yet
6	Fast John Ellipsoid Computation with Differential Privacy Optimization	not yet
6	Polynomial-time tolerant testing stabilizer states	not yet
6	UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization	not yet
6	SMILES-Mamba: Chemical Mamba Foundation Models for Drug ADMET Prediction	not yet
6	SWIFT:A Scalable lightWeight Infrastructure for Fine-Tuning	not yet
6	UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling	not yet
6	Understanding the Performance and Estimating the Cost of LLM Fine-Tuning	not yet
6	SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation	not yet
6	Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations	not yet
6	Floquet engineering of interactions and entanglement in periodically driven Rydberg chains	not yet
6	Evaluating and Enhancing LLMs Agent based on Theory of Mind in Guandan: A Multi-Player Cooperative Game under Imperfect Information	not yet
6	MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization	not yet
6	Mini-Monkey: Alleviating the Semantic Sawtooth Effect for Lightweight MLLMs via Complementary Image Pyramid	not yet
6	Transformers are Universal In-context Learners	not yet
6	Deep Learning in Medical Image Classification from MRI-based Brain Tumor Images	not yet
6	OmniParser for Pure Vision Based GUI Agent	not yet
6	Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models	not yet
5	Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning	not yet
5	Can We Leave Deepfake Data Behind in Training Deepfake Detector?	not yet
5	Safety Layers in Aligned Large Language Models: The Key to LLM Security	not yet
5	SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners	not yet
5	A Survey on Evaluating Large Language Models in Code Generation Tasks	not yet
5	FA-YOLO: Research On Efficient Feature Selection YOLO Improved Algorithm Based On FMDS and AGMF Modules	not yet
5	Beyond Uncertainty: Evidential Deep Learning for Robust Video Temporal Grounding	not yet
5	LeMON: Learning to Learn Multi-Operator Networks	not yet
5	In-Context Imitation Learning via Next-Token Prediction	not yet
5	LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation	not yet
5	TourSynbio: A Multi-Modal Large Model and Agent Framework to Bridge Text and Protein Sequences for Protein Engineering	not yet
5	Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods	not yet
5	Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning	not yet
5	Video-CCAM: Enhancing Video-Language Understanding with Causal Cross-Attention Masks for Short and Long Videos	not yet
5	DynaSurfGS: Dynamic Surface Reconstruction with Planar-based Gaussian Splatting	not yet
5	PIE: Parkour with Implicit-Explicit Learning Framework for Legged Robots	not yet
5	Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic	not yet
5	Selective Preference Optimization via Token-Level Reward Function Estimation	not yet
5	CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities	not yet
5	The AI Risk Repository: A Comprehensive Meta-Review, Database, and Taxonomy of Risks From Artificial Intelligence	not yet
5	Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications	not yet
5	Nothing in Excess: Mitigating the Exaggerated Safety for LLMs via Safety-Conscious Activation Steering	not yet
5	TrackGo: A Flexible and Efficient Method for Controllable Video Generation	not yet
5	MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding	not yet
5	HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models in Resource-Constrained Environments	not yet
5	Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations	not yet
5	HMoE: Heterogeneous Mixture of Experts for Language Modeling	not yet
5	AI-Driven Review Systems: Evaluating LLMs in Scalable and Bias-Aware Academic Reviews	not yet
5	MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model	not yet
5	Transferring Backdoors between Large Language Models by Knowledge Distillation	not yet
5	TableBench: A Comprehensive and Complex Benchmark for Table Question Answering	not yet
5	Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges	not yet
5	A Hassle-free Algorithm for Private Learning in Practice: Don't Use Tree Aggregation, Use BLTs	not yet
5	FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering	not yet
5	Language Models as Models of Language	not yet
5	Can LLMs Replace Manual Annotation of Software Engineering Artifacts?	not yet
5	Affective Computing in the Era of Large Language Models: A Survey from the NLP Perspective	not yet
5	MMREC: LLM Based Multi-Modal Recommender System	not yet
5	CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases	not yet
5	Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks	not yet
5	Attacks and Defenses for Generative Diffusion Models: A Comprehensive Survey	not yet
5	Synthesizing Text-to-SQL Data from Weak and Strong LLMs
5	Huge Ensembles Part I: Design of Ensemble Weather Forecasts using Spherical Fourier Neural Operators	not yet
5	Compromising Embodied Agents with Contextual Backdoor Attacks	not yet
5	VidGen-1M: A Large-Scale Dataset for Text-to-video Generation	not yet
5	Designing Multi-layered Runtime Guardrails for Foundation Model Based Agents: Swiss Cheese Model for AI Safety by Design	not yet
5	Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2	not yet
5	The Quest for the Right Mediator: A History, Survey, and Theoretical Grounding of Causal Interpretability	not yet
5	A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks	not yet
5	RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework	not yet
5	BioRAG: A RAG-LLM Framework for Biological Question Reasoning	not yet
5	MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities	not yet
5	Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model	not yet
5	Virchow2: Scaling Self-Supervised Mixed Magnification Models in Pathology	not yet
5	Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions	not yet
5	Inductive or Deductive? Rethinking the Fundamental Reasoning Abilities of LLMs	not yet
4	VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers	not yet
4	Dynamic Self-Consistency: Leveraging Reasoning Paths for Efficient LLM Sampling	not yet
4	Beyond Preferences in AI Alignment	not yet
4	Efficient LLM Scheduling by Learning to Rank	not yet
4	Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation	not yet
4	SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models	not yet
4	OctFusion: Octree-based Diffusion Models for 3D Shape Generation	not yet
4	Agentic Retrieval-Augmented Generation for Time Series Analysis	not yet
4	One-layer transformers fail to solve the induction heads task	not yet
4	Biomedical Large Languages Models Seem not to be Superior to Generalist Models on Unseen Medical Data	not yet
4	TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers	not yet
4	Segment Any Mesh: Zero-shot Mesh Part Segmentation via Lifting Segment Anything 2 to 3D	not yet
4	Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation	not yet
4	Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler	not yet
4	Evidential Deep Partial Multi-View Classification With Discount Fusion	not yet
4	Convergence of Unadjusted Langevin in High Dimensions: Delocalization of Bias	not yet
4	Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey	not yet
4	NanoFlow: Towards Optimal Large Language Model Serving Throughput	not yet
4	SQL-GEN: Bridging the Dialect Gap for Text-to-SQL Via Synthetic Data And Model Merging	not yet
4	DUNE Phase II: Scientific Opportunities, Detector Concepts, Technological Solutions	not yet
4	Non-Homophilic Graph Pre-Training and Prompt Learning	not yet
4	MEDCO: Medical Education Copilots Based on A Multi-Agent Framework	not yet
4	Better Debugging: Combining Static Analysis and LLMs for Explainable Crashing Fault Localization	not yet
4	A Deconfounding Approach to Climate Model Bias Correction	not yet
4	Let Community Rules Be Reflected in Online Content Moderation	not yet
4	How Susceptible are LLMs to Influence in Prompts?	not yet
4	Hermes 3 Technical Report	not yet
4	AppAgent v2: Advanced Agent for Flexible Mobile Interactions	not yet
4	DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework	not yet
4	Iterative Object Count Optimization for Text-to-image Diffusion Models	not yet
4	Bidirectional Gated Mamba for Sequential Recommendation	not yet
4	Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation	not yet
4	HITS: High-coverage LLM-based Unit Test Generation via Method Slicing	not yet
4	KAN4TSF: Are KAN and KAN-based models Effective for Time Series Forecasting?	not yet
4	GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting	not yet
4	Kilometer-Scale Convection Allowing Model Emulation using Generative Diffusion Modeling	not yet
4	AnyGraph: Graph Foundation Model in the Wild	not yet
4	Privacy-preserving Universal Adversarial Defense for Black-box Models	not yet
4	OpenCity: Open Spatio-Temporal Foundation Models for Traffic Prediction	not yet
4	SoK: Runtime Integrity	not yet
4	Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models	not yet
4	NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction	not yet
4	FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant	not yet
4	BLADE: Benchmarking Language Model Agents for Data-Driven Science	not yet
4	Out-of-distribution generalization via composition: a lens through induction heads in Transformers	not yet
4	V2X-VLM: End-to-End V2X Cooperative Autonomous Driving Through Large Vision-Language Models	not yet
4	LLMJudge: LLMs for Relevance Judgments	not yet
4	The Fellowship of the LLMs: Multi-Agent Workflows for Synthetic Preference Optimization Dataset Generation	not yet
4	A Mechanistic Interpretation of Syllogistic Reasoning in Auto-Regressive Language Models	not yet
4	Study of MRI-compatible Notched Plastic Ultrasonic Stator with FEM Simulation and Holography Validation	not yet
4	Activation Space Selectable Kolmogorov-Arnold Networks	not yet
4	TurboEdit: Instant text-based image editing	not yet
4	Benchmarking the Capabilities of Large Language Models in Transportation System Engineering: Accuracy, Consistency, and Reasoning Behaviors	not yet
4	Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding	not yet
4	Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning	not yet
4	Squeezed states of light after high-harmonic generation in excited atomic systems	not yet
4	Kolmogorov-Arnold Networks (KAN) for Time Series Classification and Robust Analysis	not yet
4	Enhancing Visual Question Answering through Ranking-Based Hybrid Training and Multimodal Fusion	not yet
4	The Design of Autonomous UAV Prototypes for Inspecting Tunnel Construction Environment	not yet
4	ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area	not yet
4	Prompt-Based Segmentation at Multiple Resolutions and Lighting Conditions using Segment Anything Model 2	not yet
4	Stabilizer bootstrapping: A recipe for efficient agnostic tomography and magic estimation	not yet
4	Med42-v2: A Suite of Clinical LLMs	not yet
4	LaWa: Using Latent Space for In-Generation Image Watermarking	not yet
4	A Decoding Acceleration Framework for Industrial Deployable LLM-based Recommender Systems	not yet
4	Research on Heterogeneous Computation Resource Allocation based on Data-driven Method	not yet
4	Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network	not yet
4	VACoDe: Visual Augmented Contrastive Decoding	not yet
4	Revisiting Multi-Modal LLM Evaluation	not yet
4	Performance Analysis of FAS-Aided NOMA-ISAC: A Backscattering Scenario	not yet
4	Multi-Turn Context Jailbreak Attack on Large Language Models From First Principles	not yet
4	Perceive, Reflect, and Plan: Designing LLM Agent for Goal-Directed City Navigation without Instructions	not yet
4	Building Machines that Learn and Think with People
4	Decoding Biases: Automated Methods and LLM Judges for Gender Bias Detection in Language Models	not yet
4	Achieving Human Level Competitive Robot Table Tennis	not yet
4	From Data to Story: Towards Automatic Animated Data Video Creation with LLM-based Multi-Agent Systems	not yet
4	Improving LLM-based Unit test generation via Template-based Repair	not yet
4	500xCompressor: Generalized Prompt Compression for Large Language Models	not yet
4	Data-Driven Stochastic Closure Modeling via Conditional Diffusion Model and Neural Operator	not yet
4	Quantum simulation of dynamical gauge theories in periodically driven Rydberg atom arrays	not yet
4	Operationalizing Contextual Integrity in Privacy-Conscious Assistants	not yet
4	First search for dark photon dark matter with a MADMAX prototype	not yet
4	Potential Hessian Ascent: The Sherrington-Kirkpatrick Model	not yet
4	SpecRover: Code Intent Extraction via LLMs	not yet
4	Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models	not yet
4	Reinforcement Learning for an Efficient and Effective Malware Investigation during Cyber Incident Response	not yet
4	Differentiable MadNIS-Lite	not yet
4	MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models	not yet
4	Wave-Mamba: Wavelet State Space Model for Ultra-High-Definition Low-Light Image Enhancement	not yet
4	VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling	not yet
4	IG-SLAM: Instant Gaussian SLAM	not yet
4	On the Resilience of Multi-Agent Systems with Malicious Agents	not yet
4	Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention	not yet
4	Matched Guiding and Controlled Injection in Dark-Current-Free, 10-GeV-Class, Channel-Guided Laser Plasma Accelerators	not yet
4	AutoM3L: An Automated Multimodal Machine Learning Framework with Large Language Models	not yet
4	End-to-End Protocol for High-Quality QAOA Parameters with Few Shots	not yet
4	SegStitch: Multidimensional Transformer for Robust and Efficient Medical Imaging Segmentation	not yet
4	3D U-KAN Implementation for Multi-modal MRI Brain Tumor Segmentation	not yet
4	Generative Learning of the Solution of Parametric Partial Differential Equations Using Guided Diffusion Models and Virtual Observations	not yet
3	Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model	not yet
3	Deep Feature Embedding for Tabular Data	not yet
3	Generalizing Deepfake Video Detection with Plug-and-Play: Video-Level Blending and Spatiotemporal Adapter Tuning	not yet
3	Maven: A Multimodal Foundation Model for Supernova Science	not yet
3	Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
3	Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation	not yet
3	SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge	not yet
3	Explicit Folded Reed-Solomon and Multiplicity Codes Achieve Relaxed Generalized Singleton Bounds	not yet
3	Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty	not yet
3	Robo-GS: A Physics Consistent Spatial-Temporal Model for Robotic Arm with Hybrid Representation	not yet
3	GINN-KAN: Interpretability pipelining with applications in Physics Informed Neural Networks	not yet
3	Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning	not yet
3	Snap and Diagnose: An Advanced Multimodal Retrieval System for Identifying Plant Diseases in the Wild	not yet

Note: Citation counts are based on NASA ADS data as of the update date.
https://ui.adsabs.harvard.edu/
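
As a rough illustration of how a ranking like the one in this article could be rebuilt from tab-separated rows, here is a minimal Python sketch. It assumes each row has the form `citations<TAB>title[<TAB>video]`, as in the table in this article. The names `Entry`, `parse_rows`, and `rank` are hypothetical helpers introduced only for this example, and the sample rows are copied from the table. It does not query NASA ADS; it only sorts counts that have already been collected.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    citations: int
    title: str
    video: str  # "" when the column is empty, otherwise e.g. "not yet"


def parse_rows(text: str) -> list[Entry]:
    """Parse tab-separated 'citations<TAB>title[<TAB>video]' lines."""
    entries = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2 or not parts[0].strip().isdigit():
            continue  # skip the header row and blank lines
        video = parts[2].strip() if len(parts) > 2 else ""
        entries.append(Entry(int(parts[0]), parts[1].strip(), video))
    return entries


def rank(entries: list[Entry]) -> list[Entry]:
    """Sort by citation count, highest first; ties keep their input order."""
    return sorted(entries, key=lambda e: e.citations, reverse=True)


# Sample rows taken from the table, deliberately out of order.
sample = (
    "Citations\tTitle\tVideo\n"
    "129\tLLaVA-OneVision: Easy Visual Task Transfer\tnot yet\n"
    "270\tGemma 2: Improving Open Language Models at a Practical Size\n"
    "200\tSAM 2: Segment Anything in Images and Videos\n"
)
ranked = rank(parse_rows(sample))
for position, entry in enumerate(ranked, start=1):
    print(position, entry.citations, entry.title)
```

Because Python's sort is stable, papers with the same citation count stay in the order they were read, which matches how tied rows appear in the table.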
