Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
$ npx skills add comet-ml/opik
Alternatives
Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.
Current skill
The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
$ npx skills add comet-ml/opik
ZenML: One AI Platform from Pipelines to Agents. https://zenml.io.
$ npx skills add zenml-io/zenml
Next-generation AI Agent Optimization Platform: Cozeloop addresses challenges in AI agent development by providing full-lifecycle management capabilities from development, debugging, and evaluation to monitoring.
$ npx skills add coze-dev/coze-loop
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.
$ npx skills add Agenta-AI/agenta
Build, Manage and Deploy AI/ML Systems
$ npx skills add Netflix/metaflow
AI Observability & Evaluation
$ npx skills add Arize-ai/phoenix
Laminar - open-source observability platform purpose-built for AI agents. YC S24.
$ npx skills add lmnr-ai/lmnr
Observal is a unified platform for agent distribution, observability and insights.
$ npx skills add BlazeUp-AI/Observal
Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative configs with command line and CI/CD integration. Used by OpenAI and Anthropic.
$ npx skills add promptfoo/promptfoo
Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23
$ npx skills add Helicone/helicone
Python SDK for Agent AI Observability, Monitoring and Evaluation Framework. Includes features like agent, LLM, and tool tracing, debugging of multi-agent systems, a self-hosted dashboard, and advanced analytics with timeline and execution graph views.
$ npx skills add raga-ai-hub/RagaAI-Catalyst
The platform for LLM evaluations and AI agent testing
$ npx skills add langwatch/langwatch
Open-Source Evaluation & Testing library for LLM Agents
$ npx skills add Giskard-AI/giskard-oss
AGiXT is a dynamic AI Agent Automation Platform that seamlessly orchestrates instruction management and complex task execution across diverse AI providers. Combining adaptive memory, smart features, and a versatile plugin system, AGiXT delivers efficient and comprehensive AI solutions.
$ npx skills add Josh-XT/AGiXT
Build applications that make decisions (chatbots, agents, simulations, etc.). Monitor, trace, persist, and execute on your own infrastructure.
$ npx skills add apache/burr
BISHENG is an open LLM DevOps platform for next-generation enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation, SFT, Dataset Management, Enterprise-level System Management, Observability and more.
$ npx skills add dataelement/bisheng
How to choose
Use an alternative when it has a clearer install path, a higher trust score, more recent maintenance, or a better platform fit for your current agent stack. Keep MLflow if it already passes your workflow test and repository review.
Next step
Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.