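A list of papers that caught my attention during the CVPR 2024 Day 4 PM session.
These are notes to myself so that I don't forget them later.
For the ones I want to understand in more detail, I plan to read the papers at a later date.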
multimodal
Text2Loc: 3D Point Cloud Localization from Natural Language
https://cvpr.thecvf.com/virtual/2024/poster/29628
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
https://cvpr.thecvf.com/virtual/2024/poster/30022
Improved Zero-Shot Classification by Adapting VLMs with Text Descriptions
https://cvpr.thecvf.com/virtual/2024/poster/31839
Harnessing Large Language Models for Training-free Video Anomaly Detection
https://cvpr.thecvf.com/virtual/2024/poster/29246
Streaming Dense Video Captioning
https://cvpr.thecvf.com/virtual/2024/poster/31433
Multimodal Industrial Anomaly Detection by Crossmodal Feature Mapping
https://cvpr.thecvf.com/virtual/2024/poster/30320
LLMs are Good Sign Language Translators
https://cvpr.thecvf.com/virtual/2024/poster/30247
VideoLLM-online: Online Video Large Language Model for Streaming Video
https://cvpr.thecvf.com/virtual/2024/poster/29835
training
inference
RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation
https://cvpr.thecvf.com/virtual/2024/poster/29730
task definition (semantic audio-visual navigation)
architecture
SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection
https://cvpr.thecvf.com/virtual/2024/poster/31092
Others
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
https://cvpr.thecvf.com/virtual/2024/poster/30131
PEM: Prototype-based Efficient MaskFormer for Image Segmentation
https://cvpr.thecvf.com/virtual/2024/poster/30870
Extreme Point Supervised Instance Segmentation
https://cvpr.thecvf.com/virtual/2024/poster/30370
Looking 3D: Anomaly Detection with 2D-3D Alignment
https://cvpr.thecvf.com/virtual/2024/poster/31190
VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection
https://cvpr.thecvf.com/virtual/2024/poster/31171
Supervised Anomaly Detection for Complex Industrial Images
https://cvpr.thecvf.com/virtual/2024/poster/30567
PELA: Learning Parameter-Efficient Models with Low-Rank Approximation
https://cvpr.thecvf.com/virtual/2024/poster/30829
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts
https://cvpr.thecvf.com/virtual/2024/poster/31830
Multi-Task Dense Prediction via Mixture of Low-Rank Experts
https://cvpr.thecvf.com/virtual/2024/poster/29418
Matching Anything by Segmenting Anything
https://cvpr.thecvf.com/virtual/2024/poster/29590