An open-source Python framework for fine-tuning large language model (LLM) agents using online reinforcement learning.
LlamaGym is an open-source Python framework designed to simplify the fine-tuning of large language model (LLM) agents through online reinforcement learning. By providing a standardized environment similar to OpenAI's Gym, LlamaGym allows developers to efficiently train LLM-based agents by managing conversation context, episode batching, reward assignment, and proximal policy optimization (PPO) setup. This framework enables rapid experimentation with agent prompting and hyperparameters across various Gym environments, facilitating the development of more capable and responsive AI agents.
We use cookies to enhance your experience. By continuing to use this site, you agree to our use of cookies. Learn more