Table Rows and Columns Detection in Documents

This repository contains code and resources for detecting tables, and the rows and columns within them, in various types of documents using machine learning and computer vision techniques.

Article link: https://www.analyticsvidhya.com/blog/2023/08/detecting-table-rows-and-columns-in-images-using-transformers/

Introduction

Detecting tables in documents is a common problem in information extraction and document analysis. This project aims to provide tools and solutions to automate the process of identifying and extracting tables from different types of documents, such as PDFs, images, and scanned documents.

The PubTables-1M Dataset

PubTables-1M is a dataset of tables from scientific articles, created to advance table extraction research. It supports multiple input formats, provides detailed header annotations, and addresses over-segmentation to make its annotations more accurate.

DEtection TRansformer (DETR)

DETR (DEtection TRansformer) combines a ResNet-based convolutional backbone with an encoder-decoder Transformer, enabling object detection without hand-designed components such as region proposals. It is trained end to end with a bipartite matching loss. Experimental results on PubTables-1M underscore the role of canonical data in boosting performance.
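
To make the bipartite matching idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the code from this repository: the helper names (`box_l1`, `match_cost`, `bipartite_match`), the toy predictions, and the label ids are all hypothetical. The cost uses only the negative class probability plus an L1 box distance, whereas DETR also adds a generalized IoU term, and the search is brute force over permutations rather than the Hungarian algorithm that DETR uses.

```python
from itertools import permutations


def box_l1(a, b):
    """L1 distance between two boxes given as (x_min, y_min, x_max, y_max)."""
    return sum(abs(p - q) for p, q in zip(a, b))


def match_cost(pred, target, box_weight=1.0):
    """Cost of assigning one prediction to one ground-truth object.

    Lower is better: a high predicted probability for the target's class
    and a small box distance both reduce the cost.
    """
    class_cost = -pred["probs"][target["label"]]
    return class_cost + box_weight * box_l1(pred["box"], target["box"])


def bipartite_match(preds, targets):
    """Return (assignment, total_cost) for the cheapest one-to-one matching.

    assignment[j] is the index of the prediction matched to targets[j].
    Brute force over permutations, so only suitable for tiny inputs.
    """
    best_assignment, best_cost = None, float("inf")
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[i], targets[j]) for j, i in enumerate(perm))
        if cost < best_cost:
            best_assignment, best_cost = perm, cost
    return best_assignment, best_cost


# Hypothetical toy data: label 0 = "table row", label 1 = "table column".
preds = [
    {"probs": [0.1, 0.9], "box": (0.0, 0.0, 0.3, 1.0)},  # looks like a column
    {"probs": [0.8, 0.2], "box": (0.0, 0.0, 1.0, 0.2)},  # looks like a row
    {"probs": [0.5, 0.5], "box": (0.4, 0.4, 0.6, 0.6)},  # spare query
]
targets = [
    {"label": 0, "box": (0.0, 0.0, 1.0, 0.25)},  # ground-truth row
    {"label": 1, "box": (0.0, 0.0, 0.3, 1.0)},   # ground-truth column
]

assignment, total_cost = bipartite_match(preds, targets)
print(assignment, round(total_cost, 3))
```

In this toy case the row target is matched to prediction 1 and the column target to prediction 0, while the spare query stays unmatched. In DETR, unmatched predictions are trained to output the "no object" class, which is what lets the model skip region proposals and duplicate-removal steps.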

Results

License

This project is licensed under the MIT License.
