6 skills matching "document-analysis"
Best blend of quality, stars, freshness, and agent usage
Read and extract text and other content from PDFs in C# (port of PDFBox)
$ npx skills add UglyToad/PdfPig
OpenOCR: An open-source toolkit for general OCR research and applications. It integrates a unified training and evaluation benchmark, commercial-grade OCR and document parsing systems, and faithful reproductions of the core implementations from a wide range of academic papers.
$ npx skills add Topdu/OpenOCR
An on-premises, OCR-free toolkit for unstructured data extraction, markdown conversion, and benchmarking. (https://idp-leaderboard.org/)
$ npx skills add NanoNets/docext
Dedoc is a library (service) for automating document parsing and converting documents to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser)
$ npx skills add ispras/dedoc
A curated list of resources on the topic of Document Understanding (DU)
$ npx skills add tstanislawek/awesome-document-understanding
Open-source platform for extracting structured data from documents using AI.
$ npx skills add DocumindHQ/documind