A novel framework is proposed that leverages Structured Triplet Representations (STR) to achieve a unified, label-efficient approach to chart perception and reasoning, applicable to a range of downstream tasks beyond the question answering studied in peer works.
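To make the triplet idea concrete, here is a minimal sketch of a triplet-style chart representation; the `Triplet` fields and the `chart_to_triplets` helper are illustrative assumptions, not the paper's actual STR schema.

```python
from dataclasses import dataclass

@dataclass
class Triplet:
    """One (subject, relation, object) fact extracted from a chart.

    Hypothetical schema for illustration; the real STR format may differ.
    """
    subject: str
    relation: str
    obj: str

def chart_to_triplets(series_name, categories, values):
    """Flatten one data series of a bar chart into structured triplets."""
    return [Triplet(series_name, cat, str(val))
            for cat, val in zip(categories, values)]

# A bar chart "Revenue by quarter" becomes a list of structured facts
# that downstream tasks (QA, summarization, re-plotting) can all consume.
facts = chart_to_triplets("Revenue", ["Q1", "Q2", "Q3"], [1.2, 1.5, 1.8])
for t in facts:
    print(t)
```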
A large-scale MultiModal ChartInstruction (MMC-Instruction) dataset of 600k instances covering diverse tasks and chart types is introduced, together with an instruction-tuning methodology and a benchmark to advance multimodal chart understanding.
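For illustration, a single instruction-tuning instance might look like the following; the field names here are hypothetical and need not match MMC-Instruction's actual schema.

```python
import json

# Hypothetical shape of one instruction-tuning instance; the actual
# MMC-Instruction fields may differ.
instance = {
    "image": "charts/line_0001.png",   # rendered chart image
    "task": "chart_summarization",     # one of the supported task types
    "instruction": "Summarize the main trend shown in this chart.",
    "response": "Sales rise steadily from January to June, peaking in May.",
}
print(json.dumps(instance, indent=2))
```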
ChartGalaxy, a million-scale dataset designed to advance the understanding and generation of infographic charts, is introduced, providing a useful resource for enhancing multimodal reasoning and generation in LVLMs.
Understanding infographic charts with design-driven visual elements (e.g., pictograms, icons) requires both visual recognition and reasoning, posing challenges for multimodal large language models (MLLMs). However, existing visual question answering benchmarks fall short in evaluating these capabilities of MLLMs due to the lack of paired plain charts and visual-element-based questions. To bridge this gap, we introduce InfoChartQA, a benchmark for evaluating MLLMs on infographic chart understanding. It includes 5,642 pairs of infographic and plain charts, each sharing the same underlying data but differing in visual presentation. We further design visual-element-based questions to capture their unique visual designs and communicative intent. Evaluation of 20 MLLMs reveals a substantial performance decline on infographic charts, particularly for visual-element-based questions related to metaphors. The paired infographic and plain charts enable fine-grained error analysis and ablation studies, which highlight new opportunities for advancing MLLMs in infographic chart understanding. We release InfoChartQA at https://github.com/CoolDawnAnt/InfoChartQA.
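One way to exploit the paired design is to score the same model on both chart versions and compare; in this sketch, `model` and the `pairs` iterator are placeholders, not the benchmark's released API.

```python
def paired_accuracy(model, pairs):
    """Compare a model's accuracy on infographic vs. paired plain charts.

    `pairs` yields (infographic_image, plain_image, question, answer);
    `model` is any callable (image, question) -> answer string. Both are
    placeholders standing in for the benchmark's actual interface.
    """
    hits = {"infographic": 0, "plain": 0}
    total = 0
    for info_img, plain_img, question, answer in pairs:
        total += 1
        if model(info_img, question).strip() == answer:
            hits["infographic"] += 1
        if model(plain_img, question).strip() == answer:
            hits["plain"] += 1
    # Because each pair shares the same underlying data, the gap between
    # the two rates isolates the cost of infographic styling.
    return {k: v / total for k, v in hits.items()}
```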
DVQA, a dataset that tests many aspects of bar chart understanding in a question answering framework, is presented, and two strong baselines are proposed that perform considerably better than current VQA algorithms.
The proposed framework can significantly reduce the manual effort involved in chart analysis, providing a step towards a universal chart understanding model, and it offers plug-and-play integration with mainstream LLMs such as T5 and TaPas, extending their capability to chart comprehension tasks.
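As a rough sketch of such plug-and-play integration, a chart-to-table perception stage could hand its linearized table to an off-the-shelf T5 via Hugging Face transformers; the `answer_from_extracted_table` helper and the prompt format are assumptions, not the framework's actual interface.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

def answer_from_extracted_table(table_text: str, question: str) -> str:
    """Feed a linearized chart table plus a question to a vanilla T5.

    `table_text` is assumed to come from a chart-to-table extraction step
    (the perception stage); here it is just a string such as
    "Quarter | Revenue \n Q1 | 1.2 \n Q2 | 1.5".
    """
    tokenizer = T5Tokenizer.from_pretrained("t5-base")
    model = T5ForConditionalGeneration.from_pretrained("t5-base")
    prompt = f"question: {question} table: {table_text}"
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=32)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```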
It is found that pretraining the model on a large corpus with chart-specific low- and high-level tasks, followed by finetuning, results in state-of-the-art performance on three downstream tasks.
Compared to the popular BLIP-2, MiniGPT4, and LLaVA, Vary maintains its vanilla capabilities while offering finer-grained perception and understanding, and it is competent at new document parsing features such as OCR and markdown conversion.
A novel Patch-and-Text Prediction (PTP) objective is proposed, which masks and recovers both image patches of screenshots and the text within them, and can significantly reduce perplexity by utilizing the screenshot context.
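A minimal PyTorch sketch of a joint masked-patch and masked-text loss in the spirit of PTP follows; the tensor shapes, the `ptp_loss` signature, and the simple sum of the two terms are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def ptp_loss(patch_pred, patch_target, patch_mask,
             text_logits, text_target, text_mask):
    """Joint masked-prediction loss in the spirit of PTP (illustrative only).

    patch_pred / patch_target: (B, N, D) pixel features for N image patches.
    patch_mask: (B, N) bool, True where a patch was masked out.
    text_logits: (B, T, V) token logits; text_target: (B, T) token ids.
    text_mask: (B, T) bool, True where a text token was masked out.
    """
    # Reconstruct only the masked screenshot patches (regression term).
    patch_loss = F.mse_loss(patch_pred[patch_mask], patch_target[patch_mask])
    # Predict only the masked text tokens (classification term).
    text_loss = F.cross_entropy(text_logits[text_mask], text_target[text_mask])
    # Equal weighting is an assumption; the paper may balance terms differently.
    return patch_loss + text_loss
```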