A two-stage method for text line detection in historical documents (2018-02-10)

TL;DR

The developed method is capable of handling complex layouts as well as curved and arbitrarily oriented text lines and substantially outperforms current state-of-the-art approaches.

Abstract

This work presents a two-stage text line detection method for historical documents. Each detected text line is represented by its baseline. In the first stage, a deep neural network called ARU-Net assigns each pixel to one of three classes: baseline, separator, or other. The separator class marks the beginning and end of each text line. The ARU-Net is trainable from scratch with manageably few manually annotated example images (fewer than 50), which is achieved by utilizing data augmentation strategies. The network predictions serve as input for the second stage, which performs a bottom-up clustering to build baselines. The developed method is capable of handling complex layouts as well as curved and arbitrarily oriented text lines. It substantially outperforms current state-of-the-art approaches. For example, on the complex track of the cBAD: ICDAR2017 Competition on Baseline Detection, the F value is increased from 0.859 to 0.922. The framework to train and run the ARU-Net is open source.
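To make the two-stage idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the first stage has already produced, for every pixel, a probability triple (baseline, separator, other). For the second stage it swaps in a plain 8-connected component grouping as a simplified stand-in for the bottom-up clustering described in the abstract. The function name `extract_baselines`, the `threshold` parameter, and the nested-list input layout are hypothetical choices made for illustration.

```python
from collections import deque


def extract_baselines(probs, threshold=0.5):
    """Group baseline pixels into lines, cutting them at separator pixels.

    probs: nested list indexed as probs[y][x] = (p_baseline, p_separator, p_other).
    Returns a list of lines; each line is a sorted list of (x, y) pixel coordinates.
    Assumption: connected components stand in for the paper's clustering stage.
    """
    height = len(probs)
    width = len(probs[0]) if height else 0

    # A pixel belongs to a baseline if the baseline class is confident
    # and the separator class is not.
    mask = [[p[0] > threshold and p[1] <= threshold for p in row] for row in probs]
    seen = [[False] * width for _ in range(height)]
    lines = []

    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first search over the 8-neighbourhood collects one component.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cx, cy))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            lines.append(sorted(pixels))
    return lines
```

In this sketch, a single row of confident baseline pixels interrupted by one separator pixel yields two separate lines, which mirrors the role the abstract gives to the separator class. Curved or broken baselines would need the more elaborate clustering that the paper describes.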

Authors

R. Labahn

Tobias Grüning

Gundram Leifert

Figure: Results for an image of the Bozen test set. Results for RU-Nets trained on 5, 30 and 350 training samples (left to right) with different data augmentation strategies B, S+A and S+A+E.
