Automatic assessment of aesthetic-related subjective ratings.
These leaderboards are used to track progress in Aesthetics Quality Assessment.
Use these libraries to find Aesthetics Quality Assessment models and implementations.
This work proposes to learn a deep convolutional neural network that ranks photo aesthetics, with the relative ranking of photo aesthetics modeled directly in the loss function.
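Ranking-based training of this kind typically uses a pairwise margin loss over image pairs with a known preference. The sketch below is illustrative only (the function and toy scores are not from the paper): it penalizes pairs where the preferred photo does not outscore the other by at least a margin.

```python
import numpy as np

def pairwise_ranking_loss(score_high, score_low, margin=1.0):
    """Margin ranking loss: zero when the preferred photo outscores
    the other by at least `margin`, linear penalty otherwise."""
    return np.maximum(0.0, margin - (score_high - score_low))

# Toy batch: predicted scores for (preferred, less-preferred) photo pairs.
preferred = np.array([3.2, 1.0, 4.5])
other = np.array([1.1, 1.5, 4.4])
losses = pairwise_ranking_loss(preferred, other)
print(losses.mean())
```

Only the second and third pairs incur a loss: the second is mis-ordered, and the third is correctly ordered but by less than the margin.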
The proposed approach builds on the success (and retraining) of proven, state-of-the-art deep object recognition networks, and can be used not only to score images reliably, with high correlation to human perception, but also to assist with the adaptation and optimization of photo editing/enhancement algorithms in a photographic pipeline.
A novel multi-patch aggregation method for image aesthetic assessment that uses an attention-based mechanism to adaptively adjust the weight of each patch during training, improving learning efficiency and outperforming existing methods by a large margin.
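The core aggregation idea can be sketched with a minimal dot-product attention over patch features; the scorer, shapes, and random features below are stand-ins for illustration, not the paper's architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_aggregate(patch_features, attn_params):
    """Score each patch with a simple dot-product scorer, normalize the
    scores into attention weights, and return the weighted sum."""
    scores = patch_features @ attn_params   # one scalar score per patch
    weights = softmax(scores)               # weights sum to 1 over patches
    return weights @ patch_features         # weighted aggregate feature

rng = np.random.default_rng(0)
patches = rng.normal(size=(5, 8))   # 5 patches, 8-dim features each
attn = rng.normal(size=8)           # illustrative attention parameters
agg = attentive_aggregate(patches, attn)
print(agg.shape)
```

In the full method, the attention parameters are learned jointly with the rest of the network, so the patch weighting adapts during training.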
Typical image aesthetics assessment (IAA) is modeled for the generic aesthetics perceived by an “average” user. However, such generic aesthetics models neglect the fact that users’ aesthetic preferences vary significantly depending on their unique tastes. Therefore, it is essential to tackle this issue with personalized IAA (PIAA). Since PIAA is a typical small sample learning (SSL) problem, existing PIAA models are usually built by fine-tuning well-established generic IAA (GIAA) models, which are regarded as prior knowledge. Nevertheless, this kind of prior knowledge, based on “average aesthetics”, fails to capture the aesthetic diversity of different people. In order to learn the prior knowledge shared across different people's aesthetic judgments, that is, how people judge image aesthetics, we propose a PIAA method based on meta-learning with bilevel gradient optimization (BLG-PIAA), which is trained directly on individual aesthetic data and generalizes quickly to unknown users. The proposed approach consists of two phases: 1) meta-training and 2) meta-testing. In meta-training, the aesthetics assessment of each user is regarded as a task, and the training set of each task is divided into two sets: 1) a support set and 2) a query set. Unlike traditional methods that train a GIAA model based on average aesthetics, we train an aesthetic meta-learner model by bilevel gradient updating from the support set to the query set using many users’ PIAA tasks. In meta-testing, the aesthetic meta-learner model is fine-tuned on a small amount of aesthetic data from a target user to obtain the PIAA model. The experimental results show that the proposed method outperforms state-of-the-art PIAA methods, and the learned prior model of BLG-PIAA can be quickly adapted to unseen PIAA tasks.
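The bilevel gradient update described above can be illustrated on a deliberately tiny problem: each "user" task is a 1-D quadratic loss with a user-specific target, the inner step adapts on the support target, and the outer step differentiates the query loss through the adapted weights. This is a toy sketch of the MAML-style mechanics, not the paper's actual training procedure.

```python
import numpy as np

def grad(w, t):
    """Gradient of the toy per-user loss (w - t)^2."""
    return 2.0 * (w - t)

def meta_step(w_meta, tasks, inner_lr=0.1, outer_lr=0.05):
    """One bilevel update: inner adaptation on each support target,
    outer gradient of the query loss taken through the inner step."""
    outer_grads = []
    for t_support, t_query in tasks:
        w_adapted = w_meta - inner_lr * grad(w_meta, t_support)  # inner update
        # For this quadratic, d(query loss)/d(w_meta)
        # = (1 - 2 * inner_lr) * grad(w_adapted, t_query).
        outer_grads.append((1 - 2 * inner_lr) * grad(w_adapted, t_query))
    return w_meta - outer_lr * np.mean(outer_grads)

# (support, query) targets for three illustrative "users".
tasks = [(1.0, 1.2), (3.0, 2.8), (2.0, 2.1)]
w = 0.0
for _ in range(100):
    w = meta_step(w, tasks)
print(round(w, 2))
```

The meta-parameter converges to an initialization from which one inner gradient step does well on each user's query target, which is the role the aesthetic meta-learner plays before per-user fine-tuning.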
Uncertainty is the only certainty there is. Modeling data uncertainty is essential for regression, especially in unconstrained settings. Traditionally, the direct regression formulation is considered and uncertainty is modeled by modifying the output space to a certain family of probabilistic distributions. On the other hand, classification-based regression and ranking-based solutions are more popular in practice, while direct regression methods suffer from limited performance. How to model uncertainty within these present-day regression techniques remains an open issue. In this paper, we propose to learn probabilistic ordinal embeddings, which represent each data point as a multivariate Gaussian distribution rather than a deterministic point in the latent space. An ordinal distribution constraint is proposed to exploit the ordinal nature of regression. Our probabilistic ordinal embeddings can be integrated into popular regression approaches and empower them with the ability to estimate uncertainty. Experimental results show that our approach achieves competitive performance. Code is available at https://github.com/Li-Wanhua/POEs.
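A minimal sketch of the embedding idea, under simplifying assumptions of my own (1-D Gaussians, a closed-form 2-Wasserstein distance, hand-picked parameters): each sample becomes a distribution N(mu, sigma^2), and an ordinal constraint asks that distances between embeddings respect the order of the labels.

```python
import numpy as np

def w2_gaussian(mu1, s1, mu2, s2):
    """2-Wasserstein distance between two 1-D Gaussians (closed form)."""
    return np.sqrt((mu1 - mu2) ** 2 + (s1 - s2) ** 2)

# Illustrative (mu, sigma) embeddings for samples with ordinal labels 1 < 2 < 3.
emb = {1: (0.0, 0.5), 2: (1.0, 0.6), 3: (2.2, 0.8)}

d12 = w2_gaussian(*emb[1], *emb[2])
d13 = w2_gaussian(*emb[1], *emb[3])
d23 = w2_gaussian(*emb[2], *emb[3])
# Ordinal constraint: a larger label gap should give a larger distance.
ordinal_ok = d13 > max(d12, d23)
print(ordinal_ok)
```

The sigma component is what carries the uncertainty estimate: ambiguous inputs can be mapped to wider Gaussians without moving their mean.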
This work proposes the first method that efficiently supports full-resolution images as input, can be trained on variable input sizes, and significantly improves upon the state of the art on ground-truth mean opinion scores.
A composition assessment network, SAMP-Net, with a novel Saliency-Augmented Multi-pattern Pooling (SAMP) module that analyzes visual layout from the perspective of multiple composition patterns and performs more favorably than previous aesthetic assessment approaches.
This paper reformulates ordinal regression as an image-language matching problem with a contrastive objective, regarding labels as text and obtaining a language prototype from a text encoder for each rank, and proposes OrdinalCLIP, a differentiable prompting method for adapting CLIP to ordinal regression.
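The matching step can be sketched as follows: compare an image embedding against one text prototype per rank (as a text encoder like CLIP's would produce), softmax the cosine similarities, and read off a probability-weighted rank. The random embeddings, temperature, and function name here are illustrative stand-ins, not OrdinalCLIP's implementation.

```python
import numpy as np

def predict_rank(img_emb, rank_prototypes, temperature=0.07):
    """Soft rank prediction from cosine similarity to per-rank prototypes."""
    img = img_emb / np.linalg.norm(img_emb)
    protos = rank_prototypes / np.linalg.norm(rank_prototypes, axis=1, keepdims=True)
    logits = protos @ img / temperature          # scaled cosine similarity per rank
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    ranks = np.arange(1, len(probs) + 1)
    return float(probs @ ranks)                  # expected (soft) rank

rng = np.random.default_rng(1)
prototypes = rng.normal(size=(5, 16))               # 5 ranks, 16-dim "text" prototypes
image = prototypes[3] + 0.1 * rng.normal(size=16)   # image near the rank-4 prototype
print(round(predict_rank(image, prototypes), 1))
```

Because the output is an expectation over ranks rather than an argmax, the prediction is differentiable in the prototypes, which is what makes prompt learning over the rank texts possible.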
The Reddit Photo Critique Dataset (RPCD), containing tuples of images and photo critiques, is proposed; the sentiment polarity of the critiques is exploited as an indicator of aesthetic judgment, demonstrating that sentiment polarity correlates positively with the aesthetic judgments available for two aesthetic assessment benchmarks.
The proposed Q-Align achieves state-of-the-art performance on image quality assessment (IQA), image aesthetic assessment (IAA), and video quality assessment (VQA) tasks under the original LMM structure, and unifies the three tasks into one model, termed OneAlign.