3260 papers • 126 benchmarks • 313 datasets
Expression-based referring image matting takes an image and a flowery expression as input and predicts the alpha matte of the object that the expression refers to.
(Image credit: Papersgraph)
This work proposes a system that generates image segmentations from arbitrary prompts at test time. It builds on the CLIP model as a backbone and extends it with a transformer-based decoder that enables dense prediction.
This work establishes RefMatte, a large-scale, challenging dataset, by designing a comprehensive image composition and expression generation engine that automatically produces high-quality images along with diverse text attributes based on public datasets. It also presents CLIPMat, a novel baseline method for referring image matting (RIM), which includes a context-embedded prompt, a text-driven semantic pop-up, and a multi-level details extractor.
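The abstract does not describe how the composition engine works internally. As a rough illustration only, the sketch below assumes the standard alpha compositing equation, I = alpha * F + (1 - alpha) * B, which matting datasets commonly use to paste a foreground onto a new background. The function names and the toy pixel values are hypothetical and are not taken from the dataset or its code.

```python
def composite_pixel(alpha, fg, bg):
    """Blend one RGB pixel with the assumed rule I = alpha * F + (1 - alpha) * B."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def composite(alpha_matte, foreground, background):
    """Composite a foreground onto a background, pixel by pixel.

    All three arguments are lists of rows with identical shapes:
    alpha values are floats in [0, 1], pixels are RGB tuples.
    """
    return [
        [composite_pixel(a, f, b) for a, f, b in zip(a_row, f_row, b_row)]
        for a_row, f_row, b_row in zip(alpha_matte, foreground, background)
    ]


# Toy 1x3 image (hypothetical values): the foreground is fully opaque,
# half transparent, and fully transparent from left to right.
alpha = [[1.0, 0.5, 0.0]]
fg = [[(1.0, 0.0, 0.0)] * 3]  # red foreground
bg = [[(0.0, 0.0, 1.0)] * 3]  # blue background
image = composite(alpha, fg, bg)
```

In this toy case the left pixel keeps the foreground colour, the right pixel keeps the background colour, and the middle pixel is an even mix of the two. A matting model is trained to recover the alpha values from the composited image alone.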