3260 papers • 126 benchmarks • 313 datasets
The task of detecting images or image parts that have been tampered with or manipulated (sometimes also referred to as doctored). This typically encompasses image splicing, copy-move forgery, and image inpainting.
(Image credit: Papersgraph)
These leaderboards are used to track progress in Image Manipulation Detection.
Use these libraries to find Image Manipulation Detection models and implementations.
To fight real-life image forgery, which commonly involves different types and combinations of manipulations, we propose a unified deep neural architecture called ManTra-Net. Unlike many existing solutions, ManTra-Net is an end-to-end network that performs both detection and localization without extra preprocessing and postprocessing. ManTra-Net is a fully convolutional network and handles images of arbitrary sizes and many known forgery types such as splicing, copy-move, removal, enhancement, and even unknown types. This paper has three salient contributions. We design a simple yet effective self-supervised learning task to learn robust image manipulation traces from classifying 385 image manipulation types. Further, we formulate the forgery localization problem as a local anomaly detection problem, design a Z-score feature to capture local anomalies, and propose a novel long short-term memory solution to assess them. Finally, we carefully conduct ablation experiments to systematically optimize the proposed network design. Our extensive experimental results demonstrate the generalizability, robustness, and superiority of ManTra-Net, not only on single types of manipulations/forgeries but also on complicated combinations of them.
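The local-anomaly idea behind this kind of localization can be sketched as follows. The per-pixel statistics and function name are illustrative assumptions, not ManTra-Net's exact formulation: each pixel's trace features are scored against the image-wide mean and standard deviation, so a region whose statistics differ from the rest of the image stands out.

```python
import numpy as np

def zscore_anomaly_map(features):
    """Score each pixel's feature vector against image-wide statistics.

    features: (H, W, C) array of per-pixel manipulation-trace features.
    Returns an (H, W) map where large values mark locally anomalous
    pixels, i.e. candidate tampered regions.
    """
    mu = features.mean(axis=(0, 1), keepdims=True)          # global mean per channel
    sigma = features.std(axis=(0, 1), keepdims=True) + 1e-8
    z = (features - mu) / sigma                             # channel-wise Z-scores
    return np.linalg.norm(z, axis=-1)                       # anomaly strength per pixel

# A pristine region yields low scores; a pasted patch whose feature
# statistics differ from the rest of the image stands out.
rng = np.random.default_rng(0)
feat = rng.normal(0.0, 1.0, size=(64, 64, 8))
feat[20:30, 20:30] += 5.0                                   # simulate a spliced region
amap = zscore_anomaly_map(feat)
```

In the full method the scoring is learned rather than this fixed rule, but the intuition is the same: forgery localization becomes detecting features that deviate from the image's dominant statistics.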
A two-stream Faster R-CNN network is proposed and trained end-to-end to detect the tampered regions of a manipulated image, fusing features from the two streams through a bilinear pooling layer to further incorporate the spatial co-occurrence of the two modalities.
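The bilinear pooling step can be sketched at a toy scale. The shapes and normalization below are assumptions for illustration; the idea is that the channel-wise outer product of two feature maps, summed over spatial positions, captures how the two modalities (e.g. an RGB stream and a noise stream) co-occur.

```python
import numpy as np

def bilinear_pool(rgb_feat, noise_feat):
    """Fuse two feature maps via their channel-wise outer product.

    rgb_feat: (H, W, C1) feature map from the RGB stream.
    noise_feat: (H, W, C2) feature map from the noise stream.
    Returns a (C1 * C2,) descriptor capturing the spatial
    co-occurrence of the two modalities.
    """
    h, w, c1 = rgb_feat.shape
    c2 = noise_feat.shape[-1]
    x = rgb_feat.reshape(h * w, c1)
    y = noise_feat.reshape(h * w, c2)
    outer = x.T @ y / (h * w)                    # sum-pooled outer products
    feat = outer.reshape(-1)
    return feat / (np.linalg.norm(feat) + 1e-8)  # L2-normalise, as is common
```

Because the outer product pairs every RGB channel with every noise channel, the fused descriptor can represent interactions neither stream encodes alone.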
It is shown that the model outperforms humans at the task of recognizing manipulated images, can predict the specific location of edits, and in some cases can be used to "undo" a manipulation to reconstruct the original, unedited image.
This paper studies the ensembling of different trained Convolutional Neural Network (CNN) models and shows that combining these networks leads to promising face manipulation detection results on two publicly available datasets with more than 119,000 videos.
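A common way to combine such trained networks, shown here as a generic sketch rather than the paper's exact scheme, is to average their per-class probabilities and take the argmax:

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average the class probabilities of several trained CNNs.

    prob_list: list of (N, num_classes) probability arrays, one per model.
    Returns the per-sample predicted class of the ensemble.
    """
    avg = np.mean(prob_list, axis=0)   # simple unweighted average
    return avg.argmax(axis=1)

# Two models disagreeing on sample 0: the more confident vote dominates.
p1 = np.array([[0.9, 0.1], [0.2, 0.8]])
p2 = np.array([[0.4, 0.6], [0.1, 0.9]])
```

Weighted averaging or majority voting are drop-in alternatives; the averaging variant is the simplest baseline.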
This paper addresses both aspects of image manipulation detection through multi-view feature learning and multi-scale supervision, exploiting the noise distribution and the boundary artifacts surrounding tampered regions to learn semantic-agnostic and thus more generalizable features.
This work proposes multi-view feature learning that jointly exploits tampering boundary artifacts and the noise view of the input image, learning features that generalize to manipulations in novel data while remaining specific enough to avoid false alarms on authentic images.
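The "noise view" in such methods is typically obtained with high-pass residual filters that suppress image content and keep sensor or compression noise. The specific 3x3 kernel below is an illustrative assumption (an SRM-style second-order filter), not the exact filter bank used in these papers:

```python
import numpy as np

def noise_residual(gray):
    """Extract a high-pass noise residual from a grayscale image.

    gray: (H, W) float array. A second-order high-pass kernel removes
    smooth semantic content and keeps the fine-grained residual where
    tampering traces (inconsistent noise statistics) often live.
    """
    k = np.array([[-1.0,  2.0, -1.0],
                  [ 2.0, -4.0,  2.0],
                  [-1.0,  2.0, -1.0]]) / 4.0
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):          # naive valid-mode convolution
        for j in range(w - 2):
            out[i, j] = np.sum(gray[i:i + 3, j:j + 3] * k)
    return out
```

Since the kernel's coefficients sum to zero, any constant (content-free) region maps to a zero residual, which is what makes the view semantic-agnostic.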
This paper takes inspiration from the widely-used pre-training and then prompt tuning protocols in NLP and proposes a new visual prompting model, named Explicit Visual Prompting (EVP), which freezes a pre-trained model and then learns task-specific knowledge using a few extra parameters.
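The freeze-the-backbone, learn-a-few-parameters recipe can be illustrated at a toy scale. The class below is a hypothetical sketch, not EVP's actual architecture: a frozen linear "backbone" stands in for the pre-trained model, and only a small prompt vector plus a light head are trainable.

```python
import numpy as np

class PromptedModel:
    """Freeze a pretrained 'backbone' and learn only a small prompt.

    The backbone weights W are never updated; the prompt (one extra
    vector added to the input) and a light task head are the only
    trainable parameters, mirroring prompt tuning at a toy scale.
    """

    def __init__(self, W):
        self.W = W                           # frozen pretrained weights
        self.prompt = np.zeros(W.shape[1])   # learnable extra parameters
        self.head = np.zeros(W.shape[0])     # learnable task head

    def forward(self, x):
        # Only self.prompt and self.head would receive gradients.
        return self.head @ (self.W @ (x + self.prompt))

    def trainable_parameters(self):
        return self.prompt.size + self.head.size
```

The appeal is the parameter count: for a frozen 4x8 backbone (32 weights), only 8 prompt values and 4 head values are tuned per task, so many tasks can share one pre-trained model.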
A manipulated image generation process that creates true positives using currently available datasets is introduced, and a novel generator for creating examples that force the algorithm to focus on boundary artifacts during training is proposed.
It is demonstrated that neural imaging pipelines can be trained to replace the internals of digital cameras, and jointly optimized for high-fidelity photo development and reliable provenance analysis at the end of the distribution channel.