[arXiv Paper Ranking: Published June 2024] 2406.xxxxx

Last updated at 2024-12-09. Posted at 2024-10-01.

AI paper explainer YouTube channel: AI時代の羅針盤 ("Compass for the AI Era")

This page ranks cs-category papers published around June 2024 (ID: 2406.xxxxx) by citation count. The ranking is updated from time to time.
(Updated December 9, 2024)

Citations	Title	Video
146	ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools	not yet
106	OpenVLA: An Open-Source Vision-Language-Action Model	not yet
93	Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs	not yet
87	DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence	not yet
79	Depth Anything V2	not yet
77	The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale	not yet
76	Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation	not yet
74	MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark	not yet
71	VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs	not yet
61	A Survey on Large Language Models for Code Generation
59	Nemotron-4 340B Technical Report	not yet
58	From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline	not yet
57	Scaling and evaluating sparse autoencoders	not yet
52	ShareGPT4Video: Improving Video Understanding and Generation with Better Captions	not yet
51	Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts	not yet
49	U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation	not yet
47	Convolutional Kolmogorov-Arnold Networks	not yet
46	Autoregressive Image Generation without Vector Quantization	not yet
45	The Prompt Report: A Systematic Survey of Prompting Techniques
44	Refusal in Language Models Is Mediated by a Single Direction	not yet
41	Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing	not yet
40	DataComp-LM: In search of the next generation of training sets for language models	not yet
39	Long Context Transfer from Language to Vision	not yet
39	BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions	not yet
38	CodeGemma: Open Code Models Based on Gemma	not yet
38	Improving Alignment and Robustness with Circuit Breakers	not yet
36	Scaling Synthetic Data Creation with 1,000,000,000 Personas	not yet
36	HelpSteer2: Open-source dataset for training top-performing reward models	not yet
34	Improve Mathematical Reasoning in Language Models by Automated Process Supervision	not yet
33	An Empirical Study of Mamba-based Language Models	not yet
33	Kolmogorov-Arnold Networks for Time Series: Bridging Predictive Power and Interpretability	not yet
33	AIFS -- ECMWF's data-driven forecasting system	not yet
32	Time Series Modeling for Heart Rate Prediction: From ARIMA to Transformers	not yet
31	Mixture-of-Agents Enhances Large Language Model Capabilities
30	fKAN: Fractional Kolmogorov-Arnold Networks with trainable Jacobi basis functions	not yet
30	Seed-TTS: A Family of High-Quality Versatile Speech Generation Models	not yet
29	GKAN: Graph Kolmogorov-Arnold Networks	not yet
29	A Temporal Kolmogorov-Arnold Transformer for Time Series Forecasting	not yet
29	FourierKAN-GCF: Fourier Kolmogorov-Arnold Network -- An Effective and Efficient Feature Transformation for Graph Collaborative Filtering	not yet
28	On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey	not yet
27	LiveBench: A Challenging, Contamination-Free LLM Benchmark	not yet
27	Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation	not yet
26	A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges	not yet
26	An Image is Worth 32 Tokens for Reconstruction and Generation	not yet
26	PGSR: Planar-based Gaussian Splatting for Efficient and High-Fidelity Surface Reconstruction	not yet
26	Safety Alignment Should Be Made More Than Just a Few Tokens Deep	not yet
26	VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers	not yet
26	WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild	not yet
26	MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding	not yet
26	ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search	not yet
26	Kolmogorov-Arnold Network for Satellite Image Classification in Remote Sensing	not yet
25	CU-Net: a U-Net architecture for efficient brain-tumor segmentation on BraTS 2019 dataset	not yet
25	Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling	not yet
24	APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking	not yet
24	Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models
23	Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs	not yet
23	KAGNNs: Kolmogorov-Arnold Networks meet Graph Learning	not yet
23	LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training	not yet
23	MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding	not yet
23	Streamlining and standardizing software citations with The Software Citation Station	not yet
23	CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation	not yet
22	BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack	not yet
22	Simple and Effective Masked Diffusion Language Models	not yet
22	CRAG -- Comprehensive RAG Benchmark	not yet
21	MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance	not yet
21	LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks	not yet
21	SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors	not yet
21	Suitability of KANs for Computer Vision: A preliminary investigation	not yet
21	TextGrad: Automatic "Differentiation" via Text	not yet
21	XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model	not yet
21	Simplified and Generalized Masked Diffusion for Discrete Data	not yet
21	Credit Card Fraud Detection Using Advanced Transformer Model	not yet
21	Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms	not yet
21	To Believe or Not to Believe Your LLM
21	PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling	not yet
21	Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization	not yet
20	Simulating Classroom Education with LLM-Empowered Agents	not yet
20	WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs	not yet
20	Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?	not yet
20	NATURAL PLAN: Benchmarking LLMs on Natural Language Planning	not yet
20	Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation	not yet
20	CodeR: Issue Resolving with Multi-Agent and Task Graphs	not yet
20	BadRAG: Identifying Vulnerabilities in Retrieval Augmented Generation of Large Language Models	not yet
19	Demonstrating the Efficacy of Kolmogorov-Arnold Networks in Vision Tasks	not yet
19	RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold	not yet
19	HumanPlus: Humanoid Shadowing and Imitation from Humans	not yet
19	Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs	not yet
19	Grounding Image Matching in 3D with MASt3R	not yet
19	MAIRA-2: Grounded Radiology Report Generation	not yet
19	Are We Done with MMLU?	not yet
19	V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation	not yet
18	RouteLLM: Learning to Route LLMs with Preference Data
18	LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs	not yet
18	rKAN: Rational Kolmogorov-Arnold Networks	not yet
18	BSRBF-KAN: A combination of B-splines and Radial Basis Functions in Kolmogorov-Arnold Networks	not yet
18	Incompressibility and spectral gaps of random circuits	not yet
18	Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B	not yet
17	Finite basis Kolmogorov-Arnold networks: domain decomposition for data-driven and physics-informed problems	not yet
17	Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT	not yet
17	Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models	not yet
17	EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees	not yet
17	Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning	not yet
17	Towards Infinite-Long Prefix in Transformer	not yet
17	Vision-LSTM: xLSTM as Generic Vision Backbone	not yet
17	SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models	not yet
17	RaDe-GS: Rasterizing Depth in Gaussian Splatting	not yet
16	Symbolic Learning Enables Self-Evolving Agents	not yet
16	CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs	not yet
16	Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA	not yet
16	GraphKAN: Enhancing Feature Extraction with Graph Kolmogorov Arnold Networks	not yet
16	Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models	not yet
16	Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges	not yet
16	DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning	not yet
16	Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models	not yet
16	Scaling Large-Language-Model-based Multi-Agent Collaboration	not yet
16	MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures	not yet
16	Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models	not yet
16	PowerInfer-2: Fast Large Language Model Inference on a Smartphone	not yet
16	MotionClone: Training-Free Motion Cloning for Controllable Video Generation	not yet
16	Benchmark Data Contamination of Large Language Models: A Survey	not yet
16	Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit	not yet
16	Spectroscopy and modeling of $^{171}$Yb Rydberg states for high-fidelity two-qubit gates	not yet
16	How to Understand Whole Software Repository?	not yet
16	D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models	not yet
15	HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale	not yet
15	The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm	not yet
15	Are Language Models Actually Useful for Time Series Forecasting?	not yet
15	One Thousand and One Pairs: A "novel" challenge for long-context language models	not yet
15	Blind Baselines Beat Membership Inference Attacks for Foundation Models	not yet
15	PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference	not yet
15	Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging	not yet
15	Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models	not yet
15	Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference	not yet
15	Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL	not yet
15	VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks	not yet
15	Unveiling the Power of Wavelets: A Wavelet-based Kolmogorov-Arnold Network for Hyperspectral Image Classification	not yet
15	CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark	not yet
15	Advanced Payment Security System:XGBoost, LightGBM and SMOTE Integrated	not yet
15	Towards Scalable Automated Alignment of LLMs: A Survey	not yet
14	APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets
14	WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models	not yet
14	MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding	not yet
14	RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models	not yet
14	GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents	not yet
14	Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback	not yet
14	Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions	not yet
14	MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance	not yet
14	CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models	not yet
14	On the Effects of Data Scale on UI Control Agents	not yet
14	Guiding a Diffusion Model with a Bad Version of Itself	not yet
14	ReLU-KAN: New Kolmogorov-Arnold Networks that Only Need Matrix Addition, Dot Multiplication, and ReLU	not yet
14	Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration	not yet
14	BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling	not yet
13	Changing Answer Order Can Decrease MMLU Accuracy	not yet
13	Kolmogorov-Arnold Graph Neural Networks	not yet
13	CodeRAG-Bench: Can Retrieval Augment Code Generation?	not yet
13	Instruction Pre-Training: Language Models are Supervised Multitask Learners	not yet
13	WebCanvas: Benchmarking Web Agents in Online Environments	not yet
13	$\tau$-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains	not yet
13	GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities	not yet
13	STAR: Scale-wise Text-to-image generation via Auto-Regressive representations	not yet
13	MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers	not yet
13	SkySenseGPT: A Fine-Grained Instruction Tuning Dataset and Model for Remote Sensing Vision-Language Understanding	not yet
13	OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning	not yet
13	What If We Recaption Billions of Web Images with LLaMA-3?	not yet
13	Large Language Model Unlearning via Embedding-Corrupted Prompts	not yet
13	The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models	not yet
13	Towards Semantic Equivalence of Tokenization in Multimodal LLM	not yet
13	Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image	not yet
13	RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation	not yet
13	Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models	not yet
13	Transformers need glasses! Information over-squashing in language tasks	not yet
13	Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt	not yet
13	Scalable MatMul-free Language Modeling	not yet
13	iKAN: Global Incremental Learning with KAN for Human Activity Recognition Across Heterogeneous Datasets	not yet
13	Unlocking Guidance for Discrete State-Space Diffusion and Flow Models	not yet
12	LLaRA: Supercharging Robot Learning Data for Vision-Language Policy	not yet
12	Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation	not yet
12	Understanding and Mitigating Language Confusion in LLMs	not yet
12	ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation	not yet
12	From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models	not yet
12	Modular Pluralism: Pluralistic Alignment via Multi-LLM Collaboration	not yet
12	NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking	not yet
12	Consistency Models Made Easy	not yet
12	A Benchmarking Study of Kolmogorov-Arnold Networks on Tabular Data	not yet
12	Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data	not yet
12	Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%	not yet
12	GUICourse: From General Vision Language Models to Versatile GUI Agents	not yet
12	From Pixels to Prose: A Large Dataset of Dense Image Captions	not yet
12	Training-free Camera Control for Video Generation	not yet
12	Pandora: Towards General World Model with Natural Language Actions and Video States	not yet
12	Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models	not yet
12	GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices	not yet
12	One-Step Effective Diffusion Network for Real-World Image Super-Resolution	not yet
12	Delving into ChatGPT usage in academic writing through excess vocabulary	not yet
12	Parallelizing Linear Transformers with the Delta Rule over Sequence Length	not yet
12	Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters	not yet
12	How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States	not yet
12	LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model	not yet
12	Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models	not yet
12	Enhance Image-to-Image Generation with LLaVA-generated Prompts	not yet
12	Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching	not yet
12	When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs	not yet
12	Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses	not yet
12	Dimba: Transformer-Mamba Diffusion Models	not yet
12	$\Delta$-DiT: A Training-Free Acceleration Method Tailored for Diffusion Transformers	not yet
11	The Remarkable Robustness of LLMs: Stages of Inference?	not yet
11	Resolving Discrepancies in Compute-Optimal Scaling of Language Models	not yet
11	Manipulate-Anything: Automating Real-World Robots using Vision-Language Models	not yet
11	Application of Multimodal Fusion Deep Learning Model in Disease Recognition	not yet
11	LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference	not yet
11	Image anomaly detection and prediction scheme based on SSA optimized ResNet50-BiGRU model	not yet
11	SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model	not yet
11	Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces	not yet
11	MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens	not yet
11	Vul-RAG: Enhancing LLM-based Vulnerability Detection via Knowledge-level RAG	not yet
11	Diffusion Models in Low-Level Vision: A Survey	not yet
11	WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences	not yet
11	A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners	not yet
11	LVBench: An Extreme Long Video Understanding Benchmark	not yet
11	McEval: Massively Multilingual Code Evaluation	not yet
11	MultiTrust: A Comprehensive Benchmark Towards Trustworthy Multimodal Large Language Models	not yet
11	LLM Dataset Inference: Did you train on my dataset?	not yet
11	Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization	not yet
11	RATT: A Thought Structure for Coherent and Correct LLM Reasoning	not yet
11	Safeguarding Large Language Models: A Survey	not yet
11	Teams of LLM Agents can Exploit Zero-Day Vulnerabilities	not yet
11	The Dawn of Natural Language to SQL: Are We Fully Ready?	not yet
10	OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding	not yet
10	Revisiting Backdoor Attacks against Large Vision-Language Models	not yet
10	Navigating LLM Ethics: Advancements, Challenges, and Future Directions	not yet
10	Localized statistics decoding: A parallel decoding algorithm for quantum low-density parity-check codes	not yet
10	Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers	not yet
10	Preference Tuning For Toxicity Mitigation Generalizes Across Languages	not yet
10	AudioBench: A Universal Benchmark for Audio Large Language Models	not yet
10	Adversarial Attacks on Multimodal Agents	not yet
10	MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs	not yet
10	DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer	not yet
10	Avoiding Copyright Infringement via Large Language Model Unlearning	not yet
10	L4GM: Large 4D Gaussian Reconstruction Model	not yet
10	ControlVAR: Exploring Controllable Visual Autoregressive Modeling	not yet
10	VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding	not yet
10	Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?	not yet
10	RGFN: Synthesizable Molecular Generation Using GFlowNets	not yet
10	Scaling Laws in Linear Regression: Compute, Parameters, and Data	not yet
10	CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models	not yet
10	Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena	not yet
10	BAKU: An Efficient Transformer for Multi-Task Policy Learning	not yet
10	Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification	not yet
10	4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models	not yet
10	MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models	not yet
10	Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion	not yet
10	Multistep Distillation of Diffusion Models via Moment Matching	not yet
10	Does your data spark joy? Performance gains from domain upsampling at the end of training	not yet
10	The Challenges of Evaluating LLM Applications: An Analysis of Automated, Human, and LLM-Based Approaches	not yet
10	Demystifying the Compression of Mixture-of-Experts Through a Unified Framework	not yet
10	DrEureka: Language Model Guided Sim-To-Real Transfer	not yet
10	An Enhanced Encoder-Decoder Network Architecture for Reducing Information Loss in Image Semantic Segmentation	not yet
10	The Geometry of Categorical and Hierarchical Concepts in Large Language Models	not yet
10	Learning Temporally Consistent Video Depth from Video Diffusion Priors	not yet
10	UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation	not yet
10	Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback	not yet
9	Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs?	not yet
9	VERISCORE: Evaluating the factuality of verifiable claims in long-form text generation	not yet
9	Exploration of Multi-Scale Image Fusion Systems in Intelligent Medical Image Analysis	not yet
9	E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS	not yet
9	Following Length Constraints in Instructions	not yet
9	BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models	not yet
9	Steering Without Side Effects: Improving Post-Deployment Control of Language Models	not yet
9	Application of Computer Deep Learning Model in Diagnosis of Pulmonary Nodules	not yet
9	Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level	not yet
9	How Do Large Language Models Acquire Factual Knowledge During Pretraining?	not yet
9	Task Me Anything	not yet
9	Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs	not yet
9	MASAI: Modular Architecture for Software-engineering AI Agents	not yet
9	DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling	not yet
9	Step-level Value Preference Optimization for Mathematical Reasoning	not yet
9	GenQA: Generating Millions of Instructions from a Handful of Prompts	not yet
9	Quantifying Variance in Evaluation Benchmarks	not yet
9	BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages	not yet
9	Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models	not yet
9	OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation	not yet
9	Coupled Ocean-Atmosphere Dynamics in a Machine Learning Earth System Model	not yet
9	Judging the Judges: A Systematic Investigation of Position Bias in Pairwise Comparative Assessments by LLMs	not yet
9	Zero-shot Image Editing with Reference Imitation	not yet
9	AI Sandbagging: Language Models can Strategically Underperform on Evaluations	not yet
9	Needle In A Multimodal Haystack	not yet
9	A Survey of Backdoor Attacks and Defenses on Large Language Models: Implications for Security Measures	not yet
9	WoCoCo: Learning Whole-Body Humanoid Control with Sequential Contacts	not yet
9	Hello Again! LLM-powered Personalized Agent for Long-term Dialogue	not yet
9	Mamba YOLO: SSMs-Based YOLO For Object Detection	not yet
9	Deep Learning Powered Estimate of The Extrinsic Parameters on Unmanned Surface Vehicles	not yet
9	Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?	not yet
9	AgentGym: Evolving Large Language Model-based Agents across Diverse Environments	not yet
9	Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving	not yet
9	Event3DGS: Event-Based 3D Gaussian Splatting for High-Speed Robot Egomotion	not yet
9	Exploring the Potential of Polynomial Basis Functions in Kolmogorov-Arnold Networks: A Comparative Study of Different Groups of Polynomials	not yet
9	Cross-Modal Safety Alignment: Is textual unlearning all you need?	not yet
9	Long and Short Guidance in Score identity Distillation for One-Step Text-to-Image Generation	not yet
9	DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors	not yet
9	Are you still on track!? Catching LLM Task Drift with Activations	not yet
9	Automatic Instruction Evolving for Large Language Models	not yet
9	Towards Rationality in Language and Multimodal Agents: A Survey	not yet
9	Exploration of Attention Mechanism-Enhanced Deep Learning Models in the Mining of Medical Textual Data	not yet
8	Decoding-Time Language Model Alignment with Multiple Objectives	not yet
8	AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies	not yet
8	MotionBooth: Motion-Aware Customized Text-to-Video Generation	not yet
8	FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models	not yet
8	VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models	not yet
8	Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration	not yet
8	VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation	not yet
8	Continuous Aperture Array (CAPA)-Based Wireless Communications: Capacity Characterization	not yet
8	Risk thresholds for frontier AI	not yet
8	Fantastic Copyrighted Beasts and How (Not) to Generate Them	not yet
8	Transferable Boltzmann Generators	not yet
8	DASB - Discrete Audio and Speech Benchmark	not yet
8	CityGPT: Empowering Urban Spatial Cognition of Large Language Models	not yet
8	SpatialBot: Precise Spatial Understanding with Vision Language Models	not yet
8	DF40: Toward Next-Generation Deepfake Detection	not yet
8	VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding	not yet
8	Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models	not yet
8	WPO: Enhancing RLHF with Weighted Preference Optimization	not yet
8	Transcendence: Generative Models Can Outperform The Experts That Train Them	not yet
8	Can LLM be a Personalized Judge?	not yet
8	DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models	not yet
8	A Survey on Human Preference Learning for Large Language Models	not yet
8	Predict Click-Through Rates with Deep Interest Network Model in E-commerce Advertising	not yet
8	Detecting and Evaluating Medical Hallucinations in Large Vision Language Models	not yet
8	STALL+: Boosting LLM-based Repository-level Code Completion with Static Analysis	not yet
8	LRM-Zero: Training Large Reconstruction Models with Synthesized Data	not yet
8	Understanding Hallucinations in Diffusion Models through Mode Interpolation	not yet
8	Understanding Jailbreak Success: A Study of Latent Space Dynamics in Large Language Models	not yet
8	Multi-Agent Software Development through Cross-Team Collaboration	not yet
8	COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing	not yet
8	MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs	not yet
8	RVT-2: Learning Precise Manipulation from Few Demonstrations	not yet
8	Discovering Preference Optimization Algorithms with and for Large Language Models	not yet
8	Large Language Models Must Be Taught to Know What They Don't Know	not yet
8	Designing a Dashboard for Transparency and Control of Conversational AI	not yet
8	Trim 3D Gaussian Splatting for Accurate Geometry Representation	not yet
8	Effectively Compress KV Heads for LLM	not yet
8	Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models	not yet
8	Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation	not yet
8	RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection	not yet
8	6DMA Enhanced Wireless Network with Flexible Antenna Position and Rotation: Opportunities and Challenges	not yet
8	Machine Against the RAG: Jamming Retrieval-Augmented Generation with Blocker Documents	not yet
8	Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas: A Survey	not yet
8	UltraMedical: Building Specialized Generalists in Biomedicine	not yet
8	Lean Workbook: A large-scale Lean problem set formalized from natural language math problems	not yet
8	Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data	not yet
8	Understanding the Impact of Negative Prompts: When and How Do They Take Effect?	not yet
8	HYDRA: Model Factorization Framework for Black-Box LLM Personalization	not yet
8	Leveraging KANs For Enhanced Deep Koopman Operator Discovery	not yet
8	RKLD: Reverse KL-Divergence-based Knowledge Distillation for Unlearning Personal Information in Large Language Models	not yet
8	Process-Driven Autoformalization in Lean 4	not yet
8	DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs	not yet
8	Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation	not yet
8	Show, Don't Tell: Aligning Language Models with Demonstrated Feedback	not yet
8	LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models	not yet
7	SpotlessSplats: Ignoring Distractors in 3D Gaussian Splatting	not yet
7	The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models	not yet
7	Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model	not yet
7	UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models	not yet
7	Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation	not yet
7	Evaluating Copyright Takedown Methods for Language Models	not yet
7	Online Learning of Multiple Tasks and Their Relationships : Testing on Spam Email Data and EEG Signals Recorded in Construction Fields	not yet
7	On the Evaluation of Large Language Models in Unit Test Generation	not yet
7	Point-SAM: Promptable 3D Segmentation Model for Point Clouds	not yet
7	Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models	not yet
7	MemServe: Context Caching for Disaggregated LLM Serving with Elastic Memory Pool	not yet
7	A Complete Survey on LLM-based AI Chatbots	not yet
7	Dreamitate: Real-World Visuomotor Policy Learning via Video Generation	not yet
7	DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation	not yet
7	Adam-mini: Use Fewer Learning Rates To Gain More	not yet
7	WARP: On the Benefits of Weight Averaged Rewarded Policies	not yet
7	Trace is the Next AutoDiff: Generative Optimization with Rich Feedback, Execution Traces, and LLMs	not yet
7	LGS: A Light-weight 4D Gaussian Splatting for Efficient Surgical Scene Reconstruction	not yet
7	Can LLM Graph Reasoning Generalize beyond Pattern Memorization?	not yet
7	Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs	not yet
7	SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention	not yet
7	Image Conductor: Precision Control for Interactive Video Synthesis	not yet
7	GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation	not yet
7	MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression	not yet
7	A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models	not yet
7	Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data	not yet
7	Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs	not yet
7	Timo: Towards Better Temporal Reasoning for Language Models	not yet
7	CityBench: Evaluating the Capabilities of Large Language Model as World Model	not yet
7	CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets	not yet
7	GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation	not yet
7	Nicer Than Humans: How do Large Language Models Behave in the Prisoner's Dilemma?	not yet
7	Coding Speech through Vocal Tract Kinematics	not yet
7	SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents	not yet
7	OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI	not yet
7	TSI-Bench: Benchmarking Time Series Imputation	not yet
7	AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention	not yet
7	AgentReview: Exploring Peer Review Dynamics with LLM Agents	not yet
7	VoCo-LLaMA: Towards Vision Compression with Large Language Models	not yet

Note: Citation counts are taken from NASA ADS data as of the update date.
https://ui.adsabs.harvard.edu/
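
As a rough illustration of how a ranking like the one above could be rebuilt from citation counts, here is a minimal Python sketch. It assumes the rows are already available as tab-separated text in the same column order as the table (citations, title, video). The names `Paper`, `parse_rows`, and `rank` are hypothetical helpers made up for this example, and the sketch does not show how the counts were actually fetched from NASA ADS.

```python
from typing import NamedTuple


class Paper(NamedTuple):
    citations: int
    title: str
    video: str  # empty string when the source row has no video cell


def parse_rows(tsv: str) -> list[Paper]:
    """Parse 'citations<TAB>title<TAB>video' rows; non-numeric rows are skipped."""
    papers = []
    for line in tsv.splitlines():
        cells = line.split("\t")
        if len(cells) < 2 or not cells[0].strip().isdigit():
            continue  # header row, blank line, or malformed row
        video = cells[2].strip() if len(cells) > 2 else ""
        papers.append(Paper(int(cells[0]), cells[1].strip(), video))
    return papers


def rank(papers: list[Paper]) -> list[Paper]:
    """Sort by citation count, highest first; ties keep their input order."""
    return sorted(papers, key=lambda p: p.citations, reverse=True)


# Three rows copied from the table above, deliberately out of order.
sample = (
    "Citations\tTitle\tVideo\n"
    "79\tDepth Anything V2\tnot yet\n"
    "146\tChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools\tnot yet\n"
    "61\tA Survey on Large Language Models for Code Generation\n"
)
ranked = rank(parse_rows(sample))
for p in ranked:
    print(p.citations, p.title)
```

Because Python's sort is stable, papers with equal citation counts stay in the order they were read, which is one simple way to handle the many ties in the lower part of the table.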
