6 skills matching "benchmarking"
Best blend of quality, stars, freshness, and agent usage
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
$ npx skills add Marker-Inc-Korea/AutoRAG
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.
$ npx skills add modelscope/evalscope
Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks including CrewAI, Agno, OpenAI Agents SDK, Langchain, Autogen, AG2, and CamelAI.
$ npx skills add AgentOps-AI/agentops
🔥 A list of tools, frameworks, and resources for building AI web agents.
$ npx skills add steel-dev/awesome-web-agents
An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)
$ npx skills add NanoNets/docext
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
$ npx skills add microsoft/WindowsAgentArena