LLM-based safeguard model ensuring safe human-AI conversations.
Llama Guard is a Large Language Model (LLM)-based safeguard developed to keep human-AI interactions safe and appropriate. It classifies both user prompts and AI-generated responses against a taxonomy of safety risk categories, flagging unsafe or inappropriate content. The model is instruction-tuned on these safety categories and can be customized to align with a specific use case's content policy. Llama Guard performs multi-class classification over the risk taxonomy and generates binary safe/unsafe decisions, allowing it to moderate AI conversations on both the input and the output side.
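The sketch below shows how this input/output moderation typically looks in practice, using the Hugging Face `transformers` library. The model ID, generation settings, and gated-checkpoint access are assumptions based on the public Llama Guard model card; adapt them to the release you are using.

```python
# Minimal sketch: moderating a conversation turn with Llama Guard via
# Hugging Face transformers. Assumes access to the gated checkpoint
# "meta-llama/LlamaGuard-7b" (hypothetical for newer releases).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/LlamaGuard-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def moderate(chat):
    """Classify a user/assistant exchange. The model returns 'safe',
    or 'unsafe' followed by the violated category code(s)."""
    # apply_chat_template wraps the turns in Llama Guard's prompt format,
    # including the safety-category taxonomy it was instruction-tuned on.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    output = model.generate(input_ids=input_ids, max_new_tokens=32, pad_token_id=0)
    prompt_len = input_ids.shape[-1]
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)

# Moderate a user prompt before it reaches the main assistant model.
verdict = moderate([
    {"role": "user", "content": "How do I pick a lock?"},
])
print(verdict)  # e.g. "safe" or "unsafe\nO3"
```

The same `moderate` call can be applied after generation by appending the assistant's reply to the `chat` list, which is how the model covers both sides of the conversation.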