3260 papers • 126 benchmarks • 313 datasets
This paper proposes an effective end-to-end framework, document enhancement generative adversarial networks (DE-GAN), that uses conditional GANs (cGANs) to restore severely degraded document images.
DocDiff is the first diffusion-based framework specifically designed for diverse, challenging document enhancement problems, including deblurring, denoising, and the removal of watermarks and seals. It achieves state-of-the-art (SOTA) performance on multiple benchmark datasets and can significantly improve the readability and recognizability of degraded document images.
This paper proposes NAF-DPM, a novel generative framework based on a diffusion probabilistic model (DPM) and designed to restore the original quality of degraded documents. It demonstrates a notable reduction in the character errors made by OCR systems when transcribing real-world document images enhanced by the framework.
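The OCR character error mentioned in the NAF-DPM summary is commonly measured as a character error rate (CER): the Levenshtein edit distance between the OCR output and the ground-truth text, divided by the ground-truth length. The following is a minimal sketch under that assumption. The summary does not give the paper's exact evaluation protocol, and the function names and sample strings are purely illustrative.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1 each."""
    prev = list(range(len(hyp) + 1))  # distance from empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from ref
                            curr[j - 1] + 1,      # insert h into ref
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of characters in the reference text."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical OCR transcriptions of the same line before and after enhancement.
truth = "invoice total"
ocr_on_degraded = "inv0ice t0tal"
ocr_on_enhanced = "invoice t0tal"

cer_before = character_error_rate(truth, ocr_on_degraded)
cer_after = character_error_rate(truth, ocr_on_enhanced)
print(f"CER before: {cer_before:.3f}, after: {cer_after:.3f}")
```

A lower CER on the enhanced image than on the degraded one is the kind of evidence the summary refers to.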
This paper revisits classical problems in document enhancement. Rather than proposing a new algorithm for a specific problem, we introduce a novel general approach. The key idea is to modify any state-of-the-art algorithm by providing it with new information (input) that improves its results. Interestingly, this information is based on a solution to a seemingly unrelated problem: visibility detection in R^3. We show that a simple representation of an image as a 3D point cloud gives visibility detection on this cloud a new interpretation. What does it mean for a point to be visible? Although this question has been widely studied within computer vision, it has always been assumed that the point set is a sampling of a real scene. We show that the answer to this question in our context reveals unique and useful information about the image. We demonstrate the benefit of this idea for document binarization and for unshadowing.
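The representation of an image as a 3D point cloud can be sketched as below, assuming the simplest possible mapping: a pixel at column x and row y with intensity I becomes the point (x, y, I). The summary does not state the paper's actual mapping or scaling, so the function name and the `intensity_scale` parameter are hypothetical, and the visibility detection step itself is not shown.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def image_to_point_cloud(image: Sequence[Sequence[float]],
                         intensity_scale: float = 1.0) -> List[Point3D]:
    """Map each pixel (x, y) with intensity I to the 3D point (x, y, scale * I).

    `intensity_scale` is an assumed knob for balancing the intensity axis
    against the two spatial axes; it is not taken from the paper.
    """
    points: List[Point3D] = []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            points.append((float(x), float(y), intensity_scale * float(value)))
    return points


# A tiny hypothetical grayscale patch: bright paper with two dark ink pixels.
patch = [
    [255, 250, 30],
    [252, 40, 248],
]
cloud = image_to_point_cloud(patch)
print(len(cloud), "points; first point:", cloud[0])
```

In this layout, dark ink pixels sit far from bright background pixels along the third axis, which is what allows a geometric operator on the cloud to say something about the image content.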
This work proposes a lightweight encoder-decoder convolutional neural network architecture for removing noisy elements from document images. Its loss function incorporates a perceptual loss for knowledge transfer from a pre-trained deep CNN.
This work proposes the Laplacian Pyramid with Input/Output Attention Network (LP-IOANet), a novel pipeline with a lightweight architecture and an upsampling module. It outperforms the state of the art by a 35% relative improvement in mean absolute error (MAE) while running in real time at four times the resolution on a mobile device.
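To make the 35% figure concrete: a relative improvement of 35% in MAE means the new error is 65% of the baseline error. The sketch below assumes MAE is the mean absolute per-pixel difference between an output and its ground truth. All pixel values are invented for illustration and are not results from the paper.

```python
from typing import Sequence


def mean_absolute_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean of |pred - target| over all pixels (flattened to 1D here)."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Fraction by which the error shrinks relative to the baseline."""
    if baseline_error == 0:
        raise ValueError("baseline error must be non-zero")
    return (baseline_error - new_error) / baseline_error


# Hypothetical flattened pixel values, chosen only to reproduce a 35% gap.
ground_truth = [200.0, 200.0, 200.0, 200.0]
baseline_output = [180.0, 220.0, 190.0, 210.0]
new_output = [190.0, 210.0, 190.5, 209.5]

baseline_mae = mean_absolute_error(baseline_output, ground_truth)
new_mae = mean_absolute_error(new_output, ground_truth)
gain = relative_improvement(baseline_mae, new_mae)
print(f"baseline MAE={baseline_mae}, new MAE={new_mae}, gain={gain:.0%}")
```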
This paper constructs StainDoc, the first large-scale, high-resolution dataset specifically designed for document stain removal, and proposes StainRestorer, a Transformer-based document stain removal approach. StainRestorer demonstrates superior performance over state-of-the-art methods on the StainDoc dataset and its variants StainDoc.Mark and StainDoc.Seal.