
HopNet: Harmonizing Object Placement Network for Realistic Image Generation via Object Composition

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Realistic image generation is an increasingly desired but deceptively complicated computer vision task, especially when a specific object is required. Whether generating product advertisements or building novel datasets, object composition for realistic image generation depends on realistic object placement as well as believable object harmonization. To address this task, we introduce HopNet, the first network designed for end-to-end realistic image generation via object composition. HopNet excels in two pivotal tasks, object placement and harmonization, setting state-of-the-art performance in both. Unlike conventional methods that employ separate models for each task, HopNet integrates object placement and harmonization so that it can learn information correlated across the two tasks. It leverages a transformer-based framework to encode both foreground objects and background scenes, and concurrently learns the attention mechanisms crucial for both object placement and harmonization. We introduce a modified sparse contrastive loss that allows our model to learn from multiple good and bad placements while also learning object harmonization in a self-supervised manner. HopNet generalizes well to challenging scenes while removing the compounding errors associated with using separate models for each subtask.
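The abstract does not give the form of the modified sparse contrastive loss, so the following is only a generic illustration of the underlying idea: a contrastive objective that uses several good and several bad placements at once and skips candidates that have no annotation. The function name, the score and label representation, and the temperature value are all assumptions made for this sketch; it is not the loss used in HopNet.

```python
import math


def multi_positive_contrastive_loss(scores, labels, temperature=0.1):
    """Generic multi-positive contrastive loss over candidate placements.

    Hypothetical helper for illustration only (not HopNet's loss).
    scores: one compatibility score per candidate placement.
    labels: 1 for a good placement, 0 for a bad one, None if unannotated.
    Unannotated candidates are skipped, which is how sparse labels are handled here.
    """
    pos = [s / temperature for s, y in zip(scores, labels) if y == 1]
    neg = [s / temperature for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one good and one bad placement")
    # Subtract the largest logit before exponentiating for numerical stability.
    m = max(pos + neg)
    pos_mass = sum(math.exp(z - m) for z in pos)
    all_mass = pos_mass + sum(math.exp(z - m) for z in neg)
    # Negative log of the softmax mass assigned to the good placements.
    return -math.log(pos_mass / all_mass)


# Four candidate placements: one good, two bad, one unannotated (ignored).
example_loss = multi_positive_contrastive_loss(
    scores=[0.9, 0.2, 0.7, 0.1],
    labels=[1, 0, None, 0],
)
```

Under these assumptions the loss shrinks as good placements score higher than bad ones, and it equals log 2 when one good and one bad placement receive the same score.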

Original language: English (US)
Title of host publication: Proceedings - 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2025
Publisher: IEEE Computer Society
Pages: 6334-6344
Number of pages: 11
ISBN (Electronic): 9798331599942
DOIs
State: Published - 2025
Event: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2025 - Nashville, United States
Duration: Jun 11 2025 to Jun 12 2025

Publication series

Name: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
ISSN (Print): 2160-7508
ISSN (Electronic): 2160-7516

Conference

Conference: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2025
Country/Territory: United States
City: Nashville
Period: 6/11/25 to 6/12/25

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
