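A list of papers that caught my attention on Day 3 of CVPR 2024.
This is mainly a memo so that I don't forget them later.
I plan to read the papers I want to know more about at a later date.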
3D
Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion
https://cvpr.thecvf.com/virtual/2024/poster/31239
Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences
https://cvpr.thecvf.com/virtual/2024/poster/31117
Diffusion Model
HOIAnimator: Generating Text-prompt Human-object Animations using Novel Perceptive Diffusion Models
https://cvpr.thecvf.com/virtual/2024/poster/31450
It's All About Your Sketch: Democratising Sketch Control in Diffusion Models
https://cvpr.thecvf.com/virtual/2024/poster/30738
Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D
https://cvpr.thecvf.com/virtual/2024/poster/31189
PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor
https://cvpr.thecvf.com/virtual/2024/poster/31044
Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models
https://cvpr.thecvf.com/virtual/2024/poster/29456
Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion
https://cvpr.thecvf.com/virtual/2024/poster/30363
Customization Assistant for Text-to-Image Generation
https://cvpr.thecvf.com/virtual/2024/poster/31014
AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error
https://cvpr.thecvf.com/virtual/2024/poster/29888
Other
Steerers: A Framework for Rotation Equivariant Keypoint Descriptors
https://cvpr.thecvf.com/virtual/2024/poster/30064
URHand: Universal Relightable Hands
https://cvpr.thecvf.com/virtual/2024/poster/29561
CLiC: Concept Learning in Context
https://cvpr.thecvf.com/virtual/2024/poster/31706
Self-Supervised Multi-Object Tracking with Path Consistency
https://cvpr.thecvf.com/virtual/2024/poster/30783
Grounded Text-to-Image Synthesis with Attention Refocusing
https://cvpr.thecvf.com/virtual/2024/poster/30020
TextCraftor: Your Text Encoder Can be Image Quality Controller
https://cvpr.thecvf.com/virtual/2024/poster/31410
SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models
https://cvpr.thecvf.com/virtual/2024/poster/30021
HIVE: Harnessing Human Feedback for Instructional Visual Editing
https://cvpr.thecvf.com/virtual/2024/poster/29474
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image
https://cvpr.thecvf.com/virtual/2024/poster/29270