Alternatives

Phoenix alternatives for AI agents.

Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.

Current skill

Phoenix

AI Observability & Evaluation

Quality 100 | Trust 100 | Stars 10K
#1

MLflow

Similarity 134 | Trust 100 | Excellent 100

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.

26K stars | Jun 5, 2026 push | development | Python | LLMOps
$ npx skills add mlflow/mlflow
#2

Lmnr

Similarity 133 | Trust 100 | Excellent 100

Laminar - open-source observability platform purpose-built for AI agents. YC S24.

3.0K stars | Jun 6, 2026 push | development | TypeScript | LLMOps
$ npx skills add lmnr-ai/lmnr
#3

Opik

Similarity 126 | Trust 100 | Excellent 100

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

19K stars | Jun 5, 2026 push | development | Python | LLMOps
$ npx skills add comet-ml/opik
#4

RagaAI Catalyst

Similarity 126 | Trust 100 | Excellent 100

Python SDK for agent AI observability, monitoring, and evaluation. Features include agent, LLM, and tool tracing, debugging of multi-agent systems, a self-hosted dashboard, and advanced analytics with timeline and execution graph views.

16K stars | Feb 11, 2026 push | development | Python | LLMOps
$ npx skills add raga-ai-hub/RagaAI-Catalyst
#5

Metaflow

Similarity 125 | Trust 100 | Excellent 100

Build, Manage and Deploy AI/ML Systems

10K stars | Jun 5, 2026 push | development | Python | LLMOps
$ npx skills add Netflix/metaflow
#6

ZenML

Similarity 124 | Trust 100 | Excellent 100

ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.

5.4K stars | Jun 7, 2026 push | development | Python | LLMOps
$ npx skills add zenml-io/zenml
#7

Observal

Similarity 122 | Trust 100 | Excellent 100

Observal is a unified platform for agent distribution, observability and insights.

2.1K stars | Jun 9, 2026 push | development | Python | LLMOps
$ npx skills add BlazeUp-AI/Observal
#8

Helicone

Similarity 118 | Trust 100 | Excellent 100

🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓

5.8K stars | May 18, 2026 push | development | TypeScript | LLMOps
$ npx skills add Helicone/helicone
#9

Agenta

Similarity 117 | Trust 100 | Excellent 100

The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.

4.2K stars | Jun 9, 2026 push | development | TypeScript | LLMOps
$ npx skills add Agenta-AI/agenta
#10

LangWatch

Similarity 117 | Trust 100 | Excellent 100

The platform for LLM evaluations and AI agent testing

3.3K stars | Jun 6, 2026 push | development | TypeScript | LLMOps
$ npx skills add langwatch/langwatch
#11

Giskard OSS

Similarity 116 | Trust 100 | Excellent 100

🐢 Open-Source Evaluation & Testing library for LLM Agents

5.4K stars | Jun 5, 2026 push | development | Python | LLMOps
$ npx skills add Giskard-AI/giskard-oss
#12

LLM Twin Course

Similarity 115 | Trust 100 | Excellent 100

🤖 Learn for free how to build an end-to-end production-ready LLM & RAG system using LLMOps best practices: source code + 12 hands-on lessons

4.4K stars | Apr 20, 2026 push | development | Python | LLMOps
$ npx skills add decodingai-magazine/llm-twin-course
#13

AGiXT

Similarity 115 | Trust 100 | Excellent 100

AGiXT is a dynamic AI Agent Automation Platform that seamlessly orchestrates instruction management and complex task execution across diverse AI providers. Combining adaptive memory, smart features, and a versatile plugin system, AGiXT delivers efficient and comprehensive AI solutions.

3.2K stars | Jun 2, 2026 push | development | Python | LLMOps
$ npx skills add Josh-XT/AGiXT
#14

Burr

Similarity 114 | Trust 100 | Excellent 100

Build applications that make decisions (chatbots, agents, simulations, etc.). Monitor, trace, persist, and execute on your own infrastructure.

2.0K stars | Jun 6, 2026 push | development | Python | LLMOps
$ npx skills add apache/burr
#15

Promptfoo

Similarity 112 | Trust 100 | Excellent 100

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative configs with command line and CI/CD integration. Used by OpenAI and Anthropic.

22K stars | Jun 9, 2026 push | development | TypeScript | LLMOps
$ npx skills add promptfoo/promptfoo
#16

Bisheng

Similarity 111 | Trust 100 | Excellent 100

BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation, SFT, Dataset Management, Enterprise-level System Management, Observability and more.

11K stars | Jun 9, 2026 push | development | TypeScript | LLMOps
$ npx skills add dataelement/bisheng

How to choose

When should you switch?

Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Phoenix if it already passes your workflow test and repository review.
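
The trust and maintenance criteria above can be checked mechanically before a manual review. The following is only a minimal sketch: the `Skill` class, the `worth_reviewing` helper, the 90-day freshness window, and the review date are all hypothetical choices made for illustration, not part of this page or any skills tooling. Because every listed alternative already matches Phoenix's trust score of 100, the check uses "at least equal" rather than "higher". Install path and platform fit still need a human look.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Skill:
    name: str
    trust: int
    last_push: date


def worth_reviewing(skill, baseline_trust, today, max_age_days=90):
    """True if trust is at least the baseline and the last push is recent.

    max_age_days is an assumed freshness window, not a value from the listing.
    """
    age_days = (today - skill.last_push).days
    return skill.trust >= baseline_trust and age_days <= max_age_days


# Trust scores and push dates copied from three entries in the listing.
candidates = [
    Skill("MLflow", 100, date(2026, 6, 5)),
    Skill("RagaAI Catalyst", 100, date(2026, 2, 11)),
    Skill("Helicone", 100, date(2026, 5, 18)),
]

PHOENIX_TRUST = 100        # from the "Current skill" card
TODAY = date(2026, 6, 10)  # assumed review date, not from the listing

shortlist = [s.name for s in candidates if worth_reviewing(s, PHOENIX_TRUST, TODAY)]
print(shortlist)
```

Under these assumptions, RagaAI Catalyst drops out because its last push falls outside the 90-day window, while MLflow and Helicone remain candidates for a side-by-side comparison.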

Next step

Compare top candidates side by side

Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.