An open-source framework for building and benchmarking environments tailored for large language model (LLM) agents across multiple platforms.
CRAB (Cross-environment Agent Benchmark) is an open-source framework developed by CAMEL-AI for building and evaluating environments for large language model (LLM) agents. It supports cross-platform environments that can be deployed in memory, in Docker-hosted environments, on virtual machines, or across distributed physical machines. CRAB also introduces a graph-based, fine-grained evaluation method and an efficient mechanism for constructing tasks and evaluators, which together allow agent performance to be assessed comprehensively across diverse settings.
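To make the idea of graph-based, fine-grained evaluation more concrete, here is a minimal sketch in plain Python. It is not CRAB's actual API: the `GraphEvaluator` class, its methods, and the sub-goal names are all hypothetical and chosen only for illustration. The sketch assumes a task is broken into sub-goal checks arranged as a directed acyclic graph, where a check can pass only after its predecessors have passed, and the score is the fraction of completed sub-goals rather than a single pass/fail result.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Set

State = Dict[str, object]  # hypothetical snapshot of the environment


@dataclass
class GraphEvaluator:
    """Illustrative DAG of sub-goal checks (an assumption, not CRAB's API)."""

    checks: Dict[str, Callable[[State], bool]] = field(default_factory=dict)
    parents: Dict[str, List[str]] = field(default_factory=dict)
    done: Set[str] = field(default_factory=set)

    def add_node(self, name: str, check: Callable[[State], bool],
                 after: Sequence[str] = ()) -> None:
        # Register a sub-goal and the sub-goals that must complete before it.
        self.checks[name] = check
        self.parents[name] = list(after)

    def update(self, state: State) -> None:
        # Repeatedly mark nodes complete until no further progress is made.
        # A node completes only if all predecessors are done and its check passes.
        progressed = True
        while progressed:
            progressed = False
            for name, check in self.checks.items():
                if name in self.done:
                    continue
                if all(p in self.done for p in self.parents[name]) and check(state):
                    self.done.add(name)
                    progressed = True

    def completion_ratio(self) -> float:
        # Fine-grained score: fraction of sub-goals completed so far.
        return len(self.done) / len(self.checks) if self.checks else 0.0


# Hypothetical three-step task spanning two environments.
evaluator = GraphEvaluator()
evaluator.add_node("file_downloaded", lambda s: bool(s.get("file_exists")))
evaluator.add_node("file_opened", lambda s: bool(s.get("editor_open")),
                   after=["file_downloaded"])
evaluator.add_node("text_saved", lambda s: bool(s.get("saved")),
                   after=["file_opened"])

evaluator.update({"file_exists": True})
partial_score = evaluator.completion_ratio()

evaluator.update({"file_exists": True, "editor_open": True, "saved": True})
final_score = evaluator.completion_ratio()
```

In this sketch, an agent that completes only the first step still receives partial credit, which is the kind of intermediate signal a fine-grained metric offers over a single success flag.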