3260 papers • 126 benchmarks • 313 datasets
Image cropping is a common photo manipulation process that improves overall composition by removing unwanted regions. It is widely used in the photography, film processing, graphic design, and printing industries. Source: Listwise View Ranking for Image Cropping
Kornia consists of a set of modules whose operators can be inserted into neural networks to train models that perform image transformations, camera calibration, epipolar geometry, and low-level image processing such as filtering and edge detection, all operating directly on high-dimensional tensor representations.
Comprehensive experiments on common fine-grained visual classification datasets show that the proposed Weakly Supervised Data Augmentation Network (WS-DAN) surpasses state-of-the-art methods.
Image cropping aims to improve the aesthetic quality of images by adjusting their composition. Most weakly supervised cropping methods (without bounding box supervision) rely on a sliding window mechanism. This mechanism requires fixed aspect ratios and therefore cannot produce cropping regions of arbitrary size. Moreover, it usually generates tens of thousands of windows on the input image, which is very time-consuming. Motivated by these challenges, we first formulate aesthetic image cropping as a sequential decision-making process and propose a weakly supervised Aesthetics Aware Reinforcement Learning (A2-RL) framework to address this problem. In particular, the proposed method develops an aesthetics-aware reward function tailored to image cropping. Similar to human decision making, we use a comprehensive state representation that includes both the current observation and the historical experience. We train the agent using the actor-critic architecture in an end-to-end manner. The agent is evaluated on several popular unseen cropping datasets. Experimental results show that our method achieves state-of-the-art performance with far fewer candidate windows and much less time than previous weakly supervised methods.
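To make the contrast in the abstract above concrete, the following is a minimal sketch and not the A2-RL method itself. It counts the candidates an exhaustive sliding-window search would have to score, then refines a single crop window step by step. The four edge-shrinking actions, the greedy rule standing in for the learned actor-critic policy, and the toy `toy_score` function standing in for an aesthetics model are all assumptions made for illustration.

```python
from itertools import product

def count_sliding_windows(width, height, scales, aspect_ratios, stride):
    """Count the crops an exhaustive sliding-window search would have to score."""
    total = 0
    for scale, ratio in product(scales, aspect_ratios):
        win_w = round(width * scale)   # window width as a fraction of image width
        win_h = round(win_w / ratio)   # ratio is width / height
        if win_h < 1 or win_w > width or win_h > height:
            continue
        total += ((width - win_w) // stride + 1) * ((height - win_h) // stride + 1)
    return total

# Hypothetical action set: each action moves one edge of the window inward.
# Deltas are (left, top, right, bottom) in units of `step` pixels.
ACTIONS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, -1, 0), (0, 0, 0, -1)]

def refine_crop(score_fn, width, height, step=16, max_steps=50, min_size=32):
    """Greedy stand-in for a learned policy: take the best-scoring action each step."""
    box = (0, 0, width, height)          # start from the full image
    best_score = score_fn(box)
    evaluated = 1
    for _ in range(max_steps):
        best_box = None
        for delta in ACTIONS:
            cand = tuple(b + d * step for b, d in zip(box, delta))
            if cand[2] - cand[0] < min_size or cand[3] - cand[1] < min_size:
                continue                 # skip windows that became too small
            evaluated += 1
            score = score_fn(cand)
            if score > best_score:
                best_box, best_score = cand, score
        if best_box is None:             # no action improves the score: stop
            break
        box = best_box
    return box, evaluated

# Toy score (assumption): closeness to a known "good" crop.
TARGET = (100, 50, 500, 400)

def toy_score(box):
    return -sum(abs(b - t) for b, t in zip(box, TARGET))

n_windows = count_sliding_windows(
    640, 480, [0.5, 0.6, 0.7, 0.8, 0.9], [1.0, 4 / 3, 16 / 9], stride=8
)
final_box, n_evaluated = refine_crop(toy_score, 640, 480)
```

In this toy setup the sequential search scores far fewer windows than the exhaustive count, which is the efficiency argument the abstract makes. The actual method learns which action to take and when to stop, rather than greedily testing every action at each step.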
A deep-learning-based framework is proposed to learn object composition from photos of high aesthetic quality, where an anchor region is detected by a convolutional neural network (CNN) with a Gaussian kernel to preserve the integrity of the objects of interest.
An extensive analysis using formalized group fairness metrics finds systematic disparities in cropping and identifies contributing factors, including the fact that cropping based on the single most salient point can amplify the disparities because of an effect the authors term argmax bias.
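One way to read the argmax effect described above is that a small gap in saliency becomes an all-or-nothing gap in outcome. The sketch below illustrates this on a toy saliency map with two nearly equal peaks. The map, the two "subjects", and the proportional-sampling alternative are assumptions chosen for illustration and are not taken from the paper.

```python
import random

def argmax_point(saliency):
    """Return the (row, col) of the single most salient cell."""
    cells = [(r, c) for r in range(len(saliency)) for c in range(len(saliency[0]))]
    return max(cells, key=lambda rc: saliency[rc[0]][rc[1]])

def sampled_point(saliency, rng):
    """Return a cell drawn with probability proportional to its saliency."""
    cells = [(r, c) for r in range(len(saliency)) for c in range(len(saliency[0]))]
    weights = [saliency[r][c] for r, c in cells]
    return rng.choices(cells, weights=weights, k=1)[0]

# Toy saliency map (assumption): two subjects with nearly equal peak saliency.
saliency = [
    [0.0, 0.00, 0.0, 0.0, 0.00, 0.0],
    [0.0, 0.51, 0.0, 0.0, 0.49, 0.0],
    [0.0, 0.00, 0.0, 0.0, 0.00, 0.0],
]
subject_a = (1, 1)

rng = random.Random(0)
trials = 2000

# Share of crops centred on subject A under each selection rule.
argmax_share_a = sum(argmax_point(saliency) == subject_a for _ in range(trials)) / trials
sampled_share_a = sum(sampled_point(saliency, rng) == subject_a for _ in range(trials)) / trials
```

Under the argmax rule subject A is selected every time even though its peak is only marginally higher, whereas proportional sampling selects it roughly in line with its share of the saliency mass.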
This work proposes the Interval Dense Connection Strategy, which connects different blocks according to a newly designed algorithm to improve feature reuse, and presents a new model named SwinOIR (Object Image Restoration Using Swin Transformer).
The Flexible Vision Transformer (FiT), a transformer architecture specifically designed for generating images with unrestricted resolutions and aspect ratios, exhibits remarkable flexibility when extrapolating to resolutions outside its training range.
This work conducts an extensive study of traditional approaches as well as ranking-based croppers trained on various image features, and presents a new dataset of high-quality cropping and pairwise ranking annotations for evaluating the performance of various baselines.
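As a rough illustration of how the two kinds of annotation mentioned above could be used for evaluation, the sketch below scores a predicted crop against an annotated crop with intersection over union, and scores a ranker by the fraction of annotated pairs it orders correctly. Both function names, the box format `(x0, y0, x1, y1)`, and the pair format `(better, worse)` are assumptions, not the dataset's actual schema or the paper's exact protocol.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    inter_w = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def pairwise_agreement(scores, preferred_pairs):
    """Fraction of annotated (better, worse) pairs that the scores order correctly."""
    if not preferred_pairs:
        return 0.0
    correct = sum(scores[better] > scores[worse] for better, worse in preferred_pairs)
    return correct / len(preferred_pairs)

# Hypothetical example data.
predicted_crop = (0, 0, 10, 10)
annotated_crop = (5, 0, 15, 10)
crop_scores = {"a": 0.9, "b": 0.4, "c": 0.6}
annotated_pairs = [("a", "b"), ("b", "c"), ("a", "c")]

overlap = iou(predicted_crop, annotated_crop)
agreement = pairwise_agreement(crop_scores, annotated_pairs)
```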
This work formulates the photo composition problem as a view finding process that successively examines pairs of views and determines their aesthetic preferences. It exploits the abundance of professional photographs on the web to mine unlimited high-quality ranking samples, and demonstrates that an aesthetics-aware deep ranking network can be trained without explicitly modeling any photographic rules.
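Training on pairs of views is commonly done with a margin-based ranking loss. The sketch below shows that general idea under stated assumptions: each pair holds the score of the view expected to be preferred followed by the score of the other view, and the hinge form and the `margin` value are illustrative choices rather than the paper's confirmed formulation.

```python
def pairwise_hinge_loss(score_preferred, score_other, margin=1.0):
    """Zero when the preferred view outscores the other by at least `margin`."""
    return max(0.0, margin - (score_preferred - score_other))

def mean_ranking_loss(score_pairs, margin=1.0):
    """Average hinge loss over (score_preferred, score_other) pairs."""
    losses = [pairwise_hinge_loss(p, o, margin) for p, o in score_pairs]
    return sum(losses) / len(losses)

# Hypothetical network outputs for three mined pairs.
score_pairs = [(2.0, 0.5), (0.2, 0.5), (1.0, 0.4)]
batch_loss = mean_ranking_loss(score_pairs)
```

A pair contributes nothing once it is ranked correctly with enough margin, so training effort concentrates on pairs the network still orders wrongly or too narrowly.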