Alternatives

Pix2Text alternatives for AI agents.

Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.

Current skill

Pix2Text

An open-source Python 3 tool with small models for recognizing layouts, tables, math formulas (LaTeX), and text in images and converting them into Markdown. A free alternative to Mathpix for turning visual content into text. Supports 80+ languages.

Quality 96 | Trust 92 | Stars 3.2K
#1

MinerU

Similarity 122 | Trust 94 | Excellent 100

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

68K stars | Jun 17, 2026 push | document-processing | Python | PDF
$ npx skills add opendatalab/MinerU
#2

Yomitoku

Similarity 121 | Trust 91 | Excellent 98

Yomitoku is a Python package that provides an AI-powered document image analysis engine designed specifically for the Japanese language.

1.5K stars | Jun 8, 2026 push | document-processing | Python | OCR
$ npx skills add kotaro-kinoshita/yomitoku
#3

Kraken

Similarity 121 | Trust 92 | Excellent 100

OCR engine for all the languages

1.0K stars | Jun 5, 2026 push | document-processing | Python | OCR
$ npx skills add mittagessen/kraken
#4

LaTeX OCR PRO

Similarity 121 | Trust 84 | Strong 71

Math Formula OCR Pro: an enhanced math formula recognizer that supports handwritten and printed formulas mixing Chinese and English, plus simple symbolic derivation (data structures based on the LaTeX abstract syntax tree).

1.3K stars | Jun 11, 2024 push | document-processing | Jupyter Notebook | OCR
$ npx skills add LinXueyuanStdio/LaTeX_OCR_PRO
#5

PaddleOCR

Similarity 120 | Trust 98 | Excellent 100

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

83K stars | Jun 16, 2026 push | document-processing | Python | OCR
$ npx skills add PaddlePaddle/PaddleOCR
#6

ShareX

Similarity 120 | Trust 98 | Excellent 100

ShareX is a free and open-source application that enables users to capture or record any area of their screen with a single keystroke. It also supports uploading images, text, and various file types to a wide range of destinations.

38K stars | Jun 18, 2026 push | document-processing | C# | OCR
$ npx skills add ShareX/ShareX
#7

Tesseract

Similarity 120 | Trust 96 | Excellent 100

Tesseract Open Source OCR Engine (main repository)

75K stars | Jun 13, 2026 push | document-processing | C++ | OCR
$ npx skills add tesseract-ocr/tesseract
#8

Tesseract.js

Similarity 120 | Trust 95 | Excellent 100

Pure JavaScript OCR for more than 100 languages 📖🎉🖥

38K stars | May 17, 2026 push | document-processing | JavaScript | OCR
$ npx skills add naptha/tesseract.js
#9

AI Hands On

Similarity 120 | Trust 93 | Excellent 100

A group of notebooks and other files which can help you learn AI from scratch.

1.1K stars | Jun 2, 2026 push | document-processing | Jupyter Notebook | OCR
$ npx skills add Ramakm/ai-hands-on
#10

Umi OCR

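Similarity 120 | Trust 93 | Excellent 100

Open-source, free, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and QR code scanning and generation. Includes built-in multilingual libraries.

45K stars | Nov 20, 2025 push | document-processing | Python | OCR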
$ npx skills add hiroi-sora/Umi-OCR
#11

EasyOCR

Similarity 119 | Trust 93 | Excellent 100

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.

30K stars | Dec 5, 2025 push | document-processing | Python | OCR
$ npx skills add JaidedAI/EasyOCR
#12

Pot Desktop

Similarity 119 | Trust 97 | Excellent 100

🌈 A cross-platform application for selected-text translation and OCR.

19K stars | Jun 16, 2026 push | document-processing | JavaScript | OCR
$ npx skills add pot-app/pot-desktop
#13

Dolphin

Similarity 119 | Trust 91 | Excellent 100

The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.

9.0K stars | Mar 25, 2026 push | document-processing | Python | PDF
$ npx skills add bytedance/Dolphin
#14

Layout Parser

Similarity 119 | Trust 86 | Strong 78

A Unified Toolkit for Deep Learning Based Document Image Analysis

5.7K stars | Aug 15, 2024 push | document-processing | Python | OCR
$ npx skills add Layout-Parser/layout-parser
#15

Easydict

Similarity 119 | Trust 97 | Excellent 100

A concise and elegant dictionary and translator macOS app for looking up words and translating text. Works out of the box, supports offline OCR, and supports Youdao Dictionary, 🍎 Apple System Dictionary, 🍎 Apple System Translation, OpenAI, Gemini, DeepL, Google, Bing, Tencent, Baidu, Alibaba, Niutrans, Caiyun, and Volcano Translation.

14K stars | Jun 18, 2026 push | document-processing | Swift | OCR
$ npx skills add tisfeng/Easydict
#16

Bisheng

Similarity 119 | Trust 98 | Excellent 100

BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation, SFT, Dataset Management, Enterprise-level System Management, Observability and more.

11K stars | Jun 18, 2026 push | document-processing | TypeScript | OCR
$ npx skills add dataelement/bisheng

How to choose

When should you switch?

Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Pix2Text if it already passes your workflow test and repository review.
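
The trust and maintenance criteria above can be sketched as a small filter. This is only an illustration, not how the page computes anything: the `Candidate` class, the `shortlist` function, the one-year freshness window, and the reference date of Jun 18, 2026 are all assumptions made for the example. The trust scores and push dates are copied from the listing above, and 92 is the Pix2Text trust score shown under "Current skill".

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Candidate:
    name: str
    trust: int        # trust score as shown in the listing
    last_push: date   # last push date as shown in the listing


def shortlist(candidates, current_trust, today, max_age_days=365):
    """Keep candidates whose trust score beats the current skill and whose
    last push is within max_age_days (an assumed freshness window),
    sorted by trust score, highest first."""
    fresh = [
        c for c in candidates
        if c.trust > current_trust
        and (today - c.last_push).days <= max_age_days
    ]
    return sorted(fresh, key=lambda c: (-c.trust, c.name))


# A few entries copied from the listing above.
candidates = [
    Candidate("MinerU", 94, date(2026, 6, 17)),
    Candidate("PaddleOCR", 98, date(2026, 6, 16)),
    Candidate("LaTeX OCR PRO", 84, date(2024, 6, 11)),
    Candidate("Layout Parser", 86, date(2024, 8, 15)),
]

# today is an assumed reference date (the latest push date on this page).
picked = shortlist(candidates, current_trust=92, today=date(2026, 6, 18))
print([c.name for c in picked])  # ['PaddleOCR', 'MinerU']
```

A filter like this only narrows the list. Install path and platform fit still need a manual check against each repository.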

Next step

Compare top candidates side by side

Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.