162 |
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution |
not yet |
45 |
Qwen2.5-Coder Technical Report |
not yet |
33 |
Emu3: Next-Token Prediction is All You Need |
|
24 |
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models |
|
24 |
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers |
|
23 |
Transforming Multidimensional Time Series into Interpretable Event Sequences for Advanced Data Mining |
not yet |
23 |
gsplat: An Open-Source Library for Gaussian Splatting |
not yet |
22 |
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning |
|
21 |
Training Language Models to Self-Correct via Reinforcement Learning |
|
20 |
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement |
not yet |
20 |
OLMoE: Open Mixture-of-Experts Language Models |
not yet |
20 |
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis |
not yet |
19 |
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction |
|
19 |
LLaMA-Omni: Seamless Speech Interaction with Large Language Models |
not yet |
19 |
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos |
not yet |
17 |
Evaluation of OpenAI o1: Opportunities and Challenges of AGI |
not yet |
17 |
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation |
not yet |
17 |
Large Language Model-Based Agents for Software Engineering: A Survey |
not yet |
16 |
ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation |
not yet |
14 |
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark |
not yet |
13 |
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction |
not yet |
13 |
Deep Reinforcement Learning-based Obstacle Avoidance for Robot Movement in Warehouse Environments |
not yet |
12 |
LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench |
|
12 |
OmniGen: Unified Image Generation |
|
12 |
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation |
not yet |
11 |
An Adversarial Perspective on Machine Unlearning for AI Safety |
not yet |
11 |
Playground v3: Improving Text-to-Image Alignment with Deep-Fusion Large Language Models |
not yet |
11 |
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture |
not yet |
11 |
Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering |
not yet |
11 |
Large Language Models and Cognitive Science: A Comprehensive Review of Similarities, Differences, and Challenges |
not yet |
10 |
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning |
|
10 |
Unveiling the Potential of Graph Neural Networks in SME Credit Risk Assessment |
not yet |
10 |
Reducing Bias in Deep Learning Optimization: The RSGDM Approach |
not yet |
10 |
Contrastive Learning for Knowledge-Based Question Generation in Large Language Models |
not yet |
10 |
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation |
not yet |
10 |
NVLM: Open Frontier-Class Multimodal LLMs |
|
10 |
LLMs Will Always Hallucinate, and We Need to Live With This |
|
10 |
Building Math Agents with Multi-Turn Iterative Preference Learning |
not yet |
9 |
Wasserstein Distance-Weighted Adversarial Network for Cross-Domain Credit Risk Assessment |
not yet |
9 |
Optimizing News Text Classification with Bi-LSTM and Attention Mechanism for Efficient Data Processing |
not yet |
9 |
Dynamic Fraud Detection: Integrating Reinforcement Learning into Graph Neural Networks |
not yet |
9 |
Graphical Structural Learning of rs-fMRI data in Heavy Smokers |
not yet |
9 |
Agent Workflow Memory |
|
9 |
DRAL: Deep Reinforcement Adaptive Learning for Multi-UAVs Navigation in Unknown Indoor Environment |
not yet |
9 |
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency |
not yet |
9 |
In Defense of RAG in the Era of Long-Context Language Models |
not yet |
8 |
A Lightweight GAN-Based Image Fusion Algorithm for Visible and Infrared Images |
not yet |
8 |
Direct Judgement Preference Optimization |
not yet |
8 |
Graph Neural Network Framework for Sentiment Analysis Using Syntactic Feature |
not yet |
8 |
Imagine yourself: Tuning-Free Personalized Image Generation |
|
8 |
On the Diagram of Thought |
not yet |
8 |
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale |
not yet |
8 |
Securing Large Language Models: Addressing Bias, Misinformation, and Prompt Attacks |
not yet |
8 |
Enhancing Convolutional Neural Networks with Higher-Order Numerical Difference Methods |
not yet |
8 |
Attention Heads of Large Language Models: A Survey |
not yet |
8 |
Planning In Natural Language Improves LLM Search For Code Generation |
not yet |
8 |
Classically estimating observables of noiseless quantum circuits |
not yet |
8 |
Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation |
not yet |
8 |
MarsCode Agent: AI-native Automated Bug Fixing |
not yet |
7 |
Deep Learning-Based Channel Squeeze U-Structure for Lung Nodule Detection and Segmentation |
not yet |
7 |
NEVLP: Noise-Robust Framework for Efficient Vision-Language Pre-training |
not yet |
7 |
Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments |
not yet |
7 |
Improving Pretraining Data Using Perplexity Correlations |
not yet |
7 |
Enhancing Deep Learning with Optimized Gradient Descent: Bridging Numerical Methods and Neural Network Training |
not yet |
7 |
xLAM: A Family of Large Action Models to Empower AI Agent Systems |
not yet |
7 |
Configurable Foundation Models: Building LLMs from a Modular Perspective |
not yet |
7 |
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model |
not yet |
7 |
OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model |
not yet |
7 |
TinyAgent: Function Calling at the Edge |
not yet |
7 |
On-Device Language Models: A Comprehensive Review |
not yet |
6 |
EventHallusion: Diagnosing Event Hallucinations in Video LLMs |
not yet |
6 |
MaskBit: Embedding-free Image Generation via Bit Tokens |
not yet |
6 |
Uncovering Coordinated Cross-Platform Information Operations Threatening the Integrity of the 2024 U.S. Presidential Election Online Discussion |
not yet |
6 |
Axial Attention Transformer Networks: A New Frontier in Breast Cancer Detection |
not yet |
6 |
Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey |
not yet |
6 |
Alignment of Diffusion Models: Fundamentals, Challenges, and Future |
not yet |
6 |
FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations |
not yet |
6 |
SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning |
not yet |
6 |
Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling |
not yet |
6 |
ToolACE: Winning the Points of LLM Function Calling |
not yet |
6 |
Diffusion Policy Policy Optimization |
not yet |
5 |
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos |
not yet |
5 |
$O(d/T)$ Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions |
not yet |
5 |
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions |
not yet |
5 |
MonoFormer: One Transformer for Both Diffusion and Autoregression |
not yet |
5 |
EuroLLM: Multilingual Language Models for Europe |
not yet |
5 |
Looped Transformers for Length Generalization |
not yet |
5 |
Visual Prompting in Multimodal Large Language Models: A Survey |
not yet |
5 |
Archon: An Architecture Search Framework for Inference-Time Techniques |
not yet |
5 |
Scaling Diffusion Policy in Transformer to 1 Billion Parameters for Robotic Manipulation |
not yet |
5 |
Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI |
not yet |
5 |
Prithvi WxC: Foundation Model for Weather and Climate |
not yet |
5 |
SatFed: A Resource-Efficient LEO Satellite-Assisted Heterogeneous Federated Learning Framework |
not yet |
5 |
Comprehensive Overview of Artificial Intelligence Applications in Modern Industries |
not yet |
5 |
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution |
not yet |
5 |
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines |
not yet |
5 |
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion |
not yet |
5 |
3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt |
not yet |
5 |
Scaling FP8 training to trillion-token LLMs |
not yet |
5 |
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think |
not yet |
5 |
Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers |
not yet |
5 |
Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models |
not yet |
5 |
StruEdit: Structured Outputs Enable the Fast and Accurate Knowledge Editing for Large Language Models |
not yet |
5 |
Schrödinger Bridge Flow for Unpaired Data Translation |
not yet |
5 |
Efficient Fine-Tuning of Large Language Models for Automated Medical Documentation |
not yet |
5 |
Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control |
not yet |
5 |
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? |
not yet |
5 |
NVRC: Neural Video Representation Compression |
not yet |
5 |
From optimal score matching to optimal sampling |
not yet |
5 |
Denoising: A Powerful Building-Block for Imaging, Inverse Problems, and Machine Learning |
not yet |
5 |
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery |
not yet |
5 |
mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding |
not yet |
5 |
Exploring Low-Dimensional Subspaces in Diffusion Models for Controllable Image Editing |
not yet |
5 |
FMRFT: Fusion Mamba and DETR for Query Time Sequence Intersection Fish Tracking |
not yet |
5 |
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models |
not yet |
5 |
GenAI-powered Multi-Agent Paradigm for Smart Urban Mobility: Opportunities and Challenges for Integrating Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) with Intelligent Transportation Systems |
not yet |
5 |
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action |
not yet |
4 |
Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers |
not yet |
4 |
Nonlinear Non-Hermitian Skin Effect and Skin Solitons in Temporal Photonic Feedback Lattices |
not yet |
4 |
Code Generation and Algorithmic Problem Solving Using Llama 3.1 405B |
not yet |
4 |
Surveying the MLLM Landscape: A Meta-Review of Current Surveys |
not yet |
4 |
Exploring Token Pruning in Vision State Space Models |
not yet |
4 |
Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions |
not yet |
4 |
A Survey on the Honesty of Large Language Models |
not yet |
4 |
Robustness of AI-based weather forecasts in a changing climate |
not yet |
4 |
Neural Collaborative Filtering to Detect Anomalies in Human Semantic Trajectories |
not yet |
4 |
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey |
not yet |
4 |
A Survey on Multimodal Benchmarks: In the Era of Large AI Models |
not yet |
4 |
How Feature Learning Can Improve Neural Scaling Laws |
not yet |
4 |
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale |
not yet |
4 |
In which fields can ChatGPT detect journal article quality? An evaluation of REF2021 results |
not yet |
4 |
Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation |
not yet |
4 |
EnIGMA: Enhanced Interactive Generative Model Agent for CTF Challenges |
not yet |
4 |
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling |
|
4 |
Small Language Models: Survey, Measurements, and Insights |
not yet |
4 |
TFG: Unified Training-Free Guidance for Diffusion Models |
not yet |
4 |
Full-Order Sampling-Based MPC for Torque-Level Locomotion Control via Diffusion-Style Annealing |
not yet |
4 |
Founding Quantum Cryptography on Quantum Advantage, or, Towards Cryptography from #P-Hardness |
not yet |
4 |
Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity |
not yet |
4 |
MobileVLM: A Vision-Language Model for Better Intra- and Inter-UI Understanding |
not yet |
4 |
ReMEmbR: Building and Reasoning Over Long-Horizon Spatio-Temporal Memory for Robot Navigation |
not yet |
4 |
Mitigating Unsafe Feedback with Learning Constraints |
not yet |
4 |
Language Models Learn to Mislead Humans via RLHF |
not yet |
4 |
Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries |
|
4 |
Harnessing LLMs for API Interactions: A Framework for Classification and Synthetic Data Generation |
not yet |
4 |
Monomial Matrix Group Equivariant Neural Functional Networks |
not yet |
4 |
LlamaF: An Efficient Llama2 Architecture Accelerator on Embedded FPGAs |
not yet |
4 |
Towards Time Series Reasoning with LLMs |
not yet |
4 |
EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage |
not yet |
4 |
Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs |
not yet |
4 |
Kolmogorov-Arnold Transformer |
not yet |
4 |
Eureka: Evaluating and Understanding Large Foundation Models |
not yet |
4 |
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval |
not yet |
4 |
jina-embeddings-v3: Multilingual Embeddings With Task LoRA |
not yet |
4 |
From a Single Trajectory to Safety Controller Synthesis of Discrete-Time Nonlinear Polynomial Systems |
not yet |
4 |
Deep Graph Anomaly Detection: A Survey and New Perspectives |
not yet |
4 |
PROSE-FD: A Multimodal PDE Foundation Model for Learning Multiple Operators for Forecasting Fluid Dynamics |
not yet |
4 |
Models Are Codes: Towards Measuring Malicious Code Poisoning Attacks on Pre-trained Model Hubs |
not yet |
4 |
Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark |
not yet |
4 |
XENONnT Analysis: Signal Reconstruction, Calibration and Event Selection |
not yet |
4 |
Synthetic continued pretraining |
not yet |
4 |
What is the Role of Small Models in the LLM Era: A Survey |
not yet |
4 |
DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks |
not yet |
4 |
GigaGS: Scaling up Planar-Based 3D Gaussians for Large Scene Surface Reconstruction |
not yet |
4 |
Aligning Machine and Human Visual Representations across Abstraction Levels |
not yet |
4 |
High-fidelity heralded quantum state preparation and measurement |
not yet |
4 |
POINTS: Improving Your Vision-language Model with Affordable Strategies |
not yet |
4 |
Theory, Analysis, and Best Practices for Sigmoid Self-Attention |
not yet |
4 |
CoxKAN: Kolmogorov-Arnold Networks for Interpretable, High-Performance Survival Analysis |
not yet |
4 |
CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation |
not yet |
4 |
A Deceptively Simple Quadratic Recurrence |
not yet |
4 |
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA |
not yet |
4 |
Towards a Unified View of Preference Learning for Large Language Models: A Survey |
not yet |
4 |
FC-KAN: Function Combinations in Kolmogorov-Arnold Networks |
not yet |
4 |
Time-Varying Soft-Maximum Barrier Functions for Safety in Unmapped and Dynamic Environments |
not yet |
4 |
TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval |
not yet |
4 |
Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation |
not yet |
4 |
Emerging Vulnerabilities in Frontier Models: Multi-Turn Jailbreak Attacks |
not yet |
3 |
The Perfect Blend: Redefining RLHF with Mixture of Judges |
|
3 |
Balancing Cost and Effectiveness of Synthetic Data Generation Strategies for LLMs |
not yet |
3 |
Unveil Benign Overfitting for Transformer in Vision: Training Dynamics, Convergence, and Generalization |
not yet |
3 |
MinerU: An Open-Source Solution for Precise Document Content Extraction |
not yet |
3 |
LLM With Tools: A Survey |
not yet |
3 |
EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation |
not yet |
3 |
Weak-to-Strong Backdoor Attack for Large Language Models |
not yet |
3 |
Why Companies "Democratise" Artificial Intelligence: The Case of Open Source Software Donations |
not yet |
3 |
MIO: A Foundation Model on Multimodal Tokens |
not yet |
3 |
RED QUEEN: Safeguarding Large Language Models against Concealed Multi-Turn Jailbreaking |
not yet |
3 |
Observation of spin squeezing with contact interactions in one- and three-dimensional easy-plane magnets |
not yet |
3 |
Random ensembles of symplectic and unitary states are indistinguishable |
not yet |
3 |
HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions |
not yet |
3 |
Articulated Object Manipulation using Online Axis Estimation with SAM2-Based Tracking |
not yet |
3 |
CDChat: A Large Multimodal Model for Remote Sensing Change Description |
not yet |
3 |
A Deep Learning Earth System Model for Stable and Efficient Simulation of the Current Climate |
not yet |
3 |
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models |
not yet |
3 |
Zero-shot forecasting of chaotic systems |
|
3 |
Federated Large Language Models: Current Progress and Future Directions |
not yet |
3 |
Don't Use LLMs to Make Relevance Judgments |
not yet |
3 |
Evaluating Synthetic Activations composed of SAE Latents in GPT-2 |
not yet |
3 |
Inference-Friendly Models With MixAttention |
not yet |
3 |
AmpAgent: An LLM-based Multi-Agent System for Multi-stage Amplifier Schematic Design from Literature for Process and Performance Porting |
not yet |
3 |
Phantom of Latent for Large Language and Vision Models |
not yet |
3 |
RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning |
not yet |
3 |
AR Overlay: Training Image Pose Estimation on Curved Surface in a Synthetic Way |
not yet |
3 |
Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection |
not yet |
3 |
Higher-order-ReLU-KANs (HRKANs) for solving physics-informed neural networks (PINNs) more accurately, robustly and faster |
not yet |
3 |
SplatLoc: 3D Gaussian Splatting-based Visual Localization for Augmented Reality |
not yet |
3 |
Language agents achieve superhuman synthesis of scientific knowledge |
not yet |
3 |
MathGLM-Vision: Solving Mathematical Problems with Multi-Modal Large Language Model |
not yet |
3 |
Field theories and quantum methods for stochastic reaction-diffusion systems |
not yet |
3 |
What does guidance do? A fine-grained analysis in a simple setting |
not yet |
3 |
Interpolating Video-LLMs: Toward Longer-sequence LMMs in a Training-free Manner |
not yet |
3 |
Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation |
not yet |
3 |
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization |
not yet |
3 |
The Central Role of the Loss Function in Reinforcement Learning |
not yet |
3 |
Enhancing Perception of Key Changes in Remote Sensing Image Change Captioning |
not yet |
3 |
PersonaFlow: Boosting Research Ideation with LLM-Simulated Expert Personas |
not yet |
3 |
Mastering Chess with a Transformer Model |
not yet |
3 |
From Lists to Emojis: How Format Bias Affects Model Alignment |
not yet |
3 |
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion |
not yet |
3 |
OSV: One Step is Enough for High-Quality Image to Video Generation |
not yet |
3 |
SOAP: Improving and Stabilizing Shampoo using Adam |
not yet |
3 |
LLM-as-a-Judge & Reward Model: What They Can and Cannot Do |
not yet |
3 |
On the limits of agency in agent-based models |
not yet |
3 |
SplatSim: Zero-Shot Sim2Real Transfer of RGB Manipulation Policies Using Gaussian Splatting |
not yet |
3 |
Smart Resetting: An Energy-Efficient Strategy for Stochastic Search Processes |
not yet |
3 |
Compositional Design of Safety Controllers for Large-scale Stochastic Hybrid Systems |
not yet |
3 |
Comprehensive Study on Sentiment Analysis: From Rule-based to modern LLM based system |
not yet |
3 |
SFR-RAG: Towards Contextually Faithful LLMs |
not yet |
3 |
GP-GPT: Large Language Model for Gene-Phenotype Mapping |
not yet |
3 |
Causal Inference with Large Language Model: A Survey |
not yet |
3 |
Reasoning Paths with Reference Objects Elicit Quantitative Spatial Reasoning in Large Vision-Language Models |
not yet |
3 |
Computing Arrangements of Hypersurfaces |
not yet |
3 |
Synthetic4Health: Generating Annotated Synthetic Clinical Letters |
not yet |
3 |
Phikon-v2, A large and public feature extractor for biomarker prediction |
not yet |
3 |
Sequential infinite-dimensional Bayesian optimal experimental design with derivative-informed latent attention neural operator |
not yet |
3 |
Agents in Software Engineering: Survey, Landscape, and Vision |
not yet |
3 |
Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation |
not yet |
3 |
A Grading Rubric for AI Safety Frameworks |
not yet |
3 |
Improved Unet model for brain tumor image segmentation based on ASPP-coordinate attention mechanism |
not yet |
3 |
ChangeChat: An Interactive Model for Remote Sensing Change Analysis via Multimodal Instruction Tuning |
not yet |
3 |
Integration of Mamba and Transformer -- MAT for Long-Short Range Time Series Forecasting with Application to Weather Dynamics |
not yet |
3 |
Touch2Touch: Cross-Modal Tactile Generation for Object Manipulation |
not yet |
3 |
Model Ensemble for Brain Tumor Segmentation in Magnetic Resonance Imaging |
not yet |
3 |
Deep Height Decoupling for Precise Vision-based 3D Occupancy Prediction |
not yet |
3 |
Towards Timetronics with Photonic Systems |
not yet |
3 |
Descriptors-free Collective Variables From Geometric Graph Neural Networks |
not yet |
3 |
Stratospheric aerosol source inversion: Noise, variability, and uncertainty quantification |
not yet |
3 |
Translating Step-by-Step: Decomposing the Translation Process for Improved Translation Quality of Long-Form Texts |
not yet |
3 |
KANtrol: A Physics-Informed Kolmogorov-Arnold Network Framework for Solving Multi-Dimensional and Fractional Optimal Control Problems |
not yet |
3 |
Randomized low-rank Runge-Kutta methods |
not yet |
3 |
G3PT: Unleash the power of Autoregressive Modeling in 3D Generation via Cross-scale Querying Transformer |
not yet |
3 |
Can Large Language Models Unlock Novel Scientific Research Ideas? |
not yet |
3 |
MLLM-LLaVA-FL: Multimodal Large Language Model Assisted Federated Learning |
not yet |
3 |
Programming Refusal with Conditional Activation Steering |
not yet |
3 |
Neural MP: A Generalist Neural Motion Planner |
not yet |
3 |
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct |
not yet |
3 |
Are Heterophily-Specific GNNs and Homophily Metrics Really Effective? Evaluation Pitfalls and New Benchmarks |
not yet |
3 |
Unlearning or Concealment? A Critical Analysis and Evaluation Metrics for Unlearning in Diffusion Models |
not yet |
3 |
Rapid, strongly magnetized accretion in the zero-net-vertical-flux shearing box |
not yet |
3 |
TextToucher: Fine-Grained Text-to-Touch Generation |
not yet |
3 |
Bypassing the Noisy Parity Barrier: Learning Higher-Order Markov Random Fields from Dynamics |
not yet |
3 |
Magneto-optical trapping of a heavy polyatomic molecule for precision measurement |
not yet |
3 |
Neurosymbolic Methods for Dynamic Knowledge Graphs |
not yet |
3 |
Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small |
not yet |
3 |
Enhancing Skin Lesion Diagnosis with Ensemble Learning |
not yet |
3 |
Self-Harmonized Chain of Thought |
not yet |
3 |
Design and Characterization of MRI-compatible Plastic Ultrasonic Motor |
not yet |
3 |
Harnessing LLMs for Cross-City OD Flow Prediction |
not yet |
3 |
Reprogrammable sequencing for physically intelligent under-actuated robots |
not yet |
3 |
View-Invariant Policy Learning via Zero-Shot Novel View Synthesis |
not yet |
3 |
On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization |
not yet |