OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
$ npx skills add ocrmypdf/OCRmyPDF
Alternatives
Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.
Current skill
A set of tools for extracting tables from PDF files, helping with data mining on (OCR-processed) scanned documents.
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
$ npx skills add ocrmypdf/OCRmyPDF
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.
$ npx skills add JaidedAI/EasyOCR
PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books.
$ npx skills add oomol-lab/pdf-craft
带带弟弟 (ddddocr): general-purpose CAPTCHA recognition OCR, PyPI version.
$ npx skills add sml2h3/ddddocr
OCR model that handles complex tables, forms, and handwriting with full layout.
$ npx skills add datalab-to/chandra
GLM-OCR: Accurate × Fast × Comprehensive
$ npx skills add zai-org/GLM-OCR
OCR-powered screen-capture tool to capture information instead of images
$ npx skills add dynobo/normcap
Scan, index, and archive all of your paper documents (acquired by Mayan EDMS)
$ npx skills add zhoubear/open-paperless
YomiToku is a Python package that provides an AI-powered document image analysis engine designed specifically for the Japanese language.
$ npx skills add kotaro-kinoshita/yomitoku
Enhances Tesseract OCR output using LLMs (local or API) for error correction, smart chunking, and markdown formatting of scanned PDFs
$ npx skills add Dicklesworthstone/llm_aided_ocr
Handwritten Text Recognition (HTR) system implemented with TensorFlow.
$ npx skills add githubharald/SimpleHTR
A Python wrapper for the tesseract-ocr API
$ npx skills add sirfz/tesserocr
[CAPTCHA recognition, training] This project uses CNN/ResNet/DenseNet+GRU/LSTM+CTC/CrossEntropy to perform verification code identification. This project is only for training the model.
$ npx skills add kerlomz/captcha_trainer
Plumb a PDF for detailed information about each char, rectangle, line, et cetera, and easily extract text and tables.
$ npx skills add jsvine/pdfplumber
Free, open-source, offline OCR software. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Includes built-in multilingual language libraries.
$ npx skills add hiroi-sora/Umi-OCR
Small python-gtk application that helps the user merge or split PDF documents and rotate, crop, and rearrange their pages using an interactive and intuitive graphical interface.
$ npx skills add pdfarranger/pdfarranger
How to choose
Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Pdftabextract if it already passes your workflow test and repository review.
Next step
Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.