Understanding Aesthetics with Language: A Photo Critique Dataset for Aesthetic Assessment (2022-06-17)

TL;DR

This paper proposes the Reddit Photo Critique Dataset (RPCD), which contains tuples of images and photo critiques. It uses the sentiment polarity of the critiques as an indicator of aesthetic judgment and shows that this polarity correlates positively with the aesthetic judgments available for two aesthetic assessment benchmarks.

Abstract

Computational inference of aesthetics is an ill-defined task due to its subjective nature. Many datasets have been proposed to tackle the problem by providing pairs of images and aesthetic scores based on human ratings. However, humans are better at expressing their opinion, taste, and emotions by means of language rather than summarizing them in a single number. In fact, photo critiques provide much richer information as they reveal how and why users rate the aesthetics of visual stimuli. In this regard, we propose the Reddit Photo Critique Dataset (RPCD), which contains tuples of image and photo critiques. RPCD consists of 74K images and 220K comments and is collected from a Reddit community used by hobbyists and professional photographers to improve their photography skills by leveraging constructive community feedback. The proposed dataset differs from previous aesthetics datasets mainly in three aspects, namely (i) the large scale of the dataset and the extension of the comments criticizing different aspects of the image, (ii) it contains mostly UltraHD images, and (iii) it can easily be extended to new data as it is collected through an automatic pipeline. To the best of our knowledge, in this work, we propose the first attempt to estimate the aesthetic quality of visual stimuli from the critiques. To this end, we exploit the polarity of the sentiment of criticism as an indicator of aesthetic judgment. We demonstrate how sentiment polarity correlates positively with the aesthetic judgment available for two aesthetic assessment benchmarks. Finally, we experiment with several models by using the sentiment scores as a target for ranking images. Dataset and baselines are available (https://github.com/mediatechnologycenter/aestheval).
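
The idea of checking whether critique sentiment tracks human aesthetic ratings can be illustrated with a small sketch. Everything below is an assumption for illustration and not the authors' implementation: the per-image score (mean of positive minus negative probability over an image's critiques), the helper names `image_sentiment_score`, `average_ranks`, and `spearman`, and the toy data are all hypothetical. The sketch assumes each critique has already been mapped to (negative, neutral, positive) probabilities by some sentiment classifier, and it measures agreement with Spearman rank correlation.

```python
from statistics import mean


def image_sentiment_score(comment_probs):
    """Aggregate per-comment (p_neg, p_neu, p_pos) tuples into one score in [-1, 1].

    Assumed aggregation: mean of (p_pos - p_neg) over all critiques of an image.
    """
    return mean(p_pos - p_neg for p_neg, _p_neu, p_pos in comment_probs)


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up toy data: sentiment probabilities per critique, and reference ratings.
critiques = {
    "img_a": [(0.1, 0.2, 0.7), (0.2, 0.3, 0.5)],
    "img_b": [(0.6, 0.3, 0.1)],
    "img_c": [(0.3, 0.4, 0.3), (0.2, 0.5, 0.3)],
}
human_scores = {"img_a": 7.5, "img_b": 3.0, "img_c": 5.2}

ids = sorted(critiques)
sentiment = [image_sentiment_score(critiques[i]) for i in ids]
reference = [human_scores[i] for i in ids]
rho = spearman(sentiment, reference)
print(f"Spearman correlation on toy data: {rho:.2f}")
```

On this toy data the two rankings agree exactly, so the correlation is 1. The same per-image scores could also serve as regression or ranking targets for an image model, which is the role the abstract describes for the sentiment scores.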

Maintenance

• Who is supporting/hosting/maintaining the dataset? RPCD is supported and maintained by ETH MTC and University of Milano-Bicocca

• Do any export controls or other regulatory restrictions apply to the dataset or to individual instances? No

If the dataset is a sample from a larger set, what was the sampling strategy (e.g., deterministic, probabilistic with specific sampling probabilities

• Were the individuals in question notified about the data collection? No

• Does the dataset relate to people? Yes, but not exclusively

• How many instances are there in total (of each type, if appropriate)? RPCD consists of 73,965 data instances. Specifically, there are 73,965 images and 219,790 photo critiques

(a) Did you state the full set of assumptions of all theoretical results?

Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope? [Yes] The main claims are listed at the end of Section 1

• Did the individuals in question consent to the collection and use of their data? According to Reddit's Privacy Policy, which is accepted by every user upon registration

(c) Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)? [No] Training is very compute-intensive. We can only run the training once per experiment using a random seed

Collection process

• How was the data associated with each instance acquired? The data was directly observable (posts in Reddit stored in Pushshift's and Reddit's servers)

(c) Did you include the estimated hourly wage paid to participants and the total amount spent on participant compensation?

Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes]

Did you describe any potential participant risks, with links to Institutional Review Board (IRB) approvals, if applicable?

• Is there an erratum? All changes to the dataset will be announced on our Zenodo repository

Did you discuss any potential negative societal impacts of your work? [Yes] The potential negative societal impacts are described in Section 6