Given a single image containing multiple concepts, each annotated by a loose segmentation mask, the method must learn a distinct token for each concept and use natural language guidance to re-synthesize the individual concepts, or combinations of them, in various contexts.
(Image credit: Papersgraph)
The task of textual scene decomposition is introduced: given a single image of a scene that may contain several concepts, the goal is to extract a distinct text token for each concept, enabling fine-grained control over the generated scenes.
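To make the input and output of the task concrete, here is a minimal sketch of the prompt-composition side only. It assumes one placeholder token per masked concept and enumerates prompts for every combination of concepts in a new context. The token format (`<asset0>`, `<asset1>`, ...), the prompt template, and the function names are hypothetical choices for illustration, not taken from any specific method, and no token learning or image synthesis is performed here.

```python
from itertools import combinations


def assign_tokens(num_concepts, prefix="asset"):
    """Return one hypothetical placeholder token per masked concept."""
    return [f"<{prefix}{i}>" for i in range(num_concepts)]


def build_prompt(tokens, context):
    """Compose a text prompt that places the given concept tokens in a new context."""
    if not tokens:
        raise ValueError("at least one concept token is required")
    subject = " and ".join(tokens)
    return f"a photo of {subject} {context}"


def all_combinations(tokens):
    """Enumerate every non-empty subset of concept tokens, smallest subsets first."""
    return [
        list(subset)
        for size in range(1, len(tokens) + 1)
        for subset in combinations(tokens, size)
    ]


# Assumed example: an image annotated with three loose masks, so three concepts.
tokens = assign_tokens(3)
prompts = [build_prompt(subset, "on a beach") for subset in all_combinations(tokens)]
```

In a real system, each token would be tied to a learned embedding and the prompts would be passed to a text-to-image model; this sketch covers only how individual concepts and their combinations could be named in text.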