An open-source framework for building and benchmarking environments tailored for large language model (LLM) agents across multiple platforms.
CRAB (Cross-environment Agent Benchmark) is an open-source framework developed by CAMEL-AI for building environments for large language model (LLM) agents and benchmarking agents in them. Environments are cross-platform: they can be deployed in memory, in Docker containers, in virtual machines, or on distributed physical machines. CRAB also introduces a graph-based, fine-grained evaluation method and an efficient mechanism for constructing tasks and evaluators, so agent performance can be assessed across diverse settings.
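To make the idea of graph-based, fine-grained evaluation more concrete, here is a minimal sketch in plain Python. It does not use CRAB's actual API: the names `Checkpoint`, `GraphEvaluator`, `update`, and `completion_ratio` are hypothetical, and the scoring rule (fraction of completed checkpoints) is an assumption chosen for illustration. The sketch models a task as a dependency graph of sub-goals, where a checkpoint only counts once all of its prerequisites are complete, so an agent earns partial credit for partial progress.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for an observed environment state.
State = Dict[str, object]


@dataclass
class Checkpoint:
    """One sub-goal in the task graph (illustrative, not a CRAB class)."""
    name: str
    check: Callable[[State], bool]          # returns True when the sub-goal holds
    requires: List[str] = field(default_factory=list)  # prerequisite checkpoint names
    done: bool = False


class GraphEvaluator:
    """Tracks which checkpoints of a dependency graph have been completed."""

    def __init__(self, checkpoints: List[Checkpoint]) -> None:
        self.nodes = {c.name: c for c in checkpoints}

    def update(self, state: State) -> None:
        """Mark checkpoints done if their prerequisites are done and their check passes."""
        progressed = True
        while progressed:  # repeat so newly unlocked checkpoints are also tested
            progressed = False
            for node in self.nodes.values():
                if node.done:
                    continue
                ready = all(self.nodes[r].done for r in node.requires)
                if ready and node.check(state):
                    node.done = True
                    progressed = True

    def completion_ratio(self) -> float:
        """Assumed metric: completed checkpoints divided by total checkpoints."""
        if not self.nodes:
            return 0.0
        return sum(n.done for n in self.nodes.values()) / len(self.nodes)


def make_demo_evaluator() -> GraphEvaluator:
    """Build a three-step chain: download a file, open it, send a summary."""
    return GraphEvaluator([
        Checkpoint("file_downloaded", lambda s: bool(s.get("downloaded"))),
        Checkpoint("file_opened", lambda s: bool(s.get("opened")),
                   requires=["file_downloaded"]),
        Checkpoint("summary_sent", lambda s: bool(s.get("sent")),
                   requires=["file_opened"]),
    ])
```

In this sketch, calling `update` after each agent step with the latest observed state lets the evaluator report progress continuously rather than only a final pass or fail. A checkpoint whose condition holds but whose prerequisites are incomplete is not credited, which is one simple way to respect the ordering that a task graph encodes.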