An open-source framework for building real-time, multimodal AI applications that can see, hear, and speak.
LiveKit Agents is an open-source framework for building server-side AI programs that interact in real time over voice, video, and data channels. With it, developers can create programmable, multimodal agents that process and generate audio, video, and text, and that work with large language models (LLMs) and other AI models. Integrations are provided as plugins for popular LLMs and for speech-to-text (STT), text-to-speech (TTS), and voice activity detection (VAD) services. LiveKit Agents also includes built-in task scheduling, load balancing, and real-time media transport over WebRTC, which makes it suitable for applications such as AI voice assistants, call centers, transcription services, and real-time translation.
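To make the plugin idea concrete, here is a minimal sketch of how VAD, STT, LLM, and TTS stages might be chained to handle a single conversational turn. It is written in plain Python and does not use the LiveKit Agents API: `VoicePipeline`, `handle_turn`, and the `toy_*` functions are hypothetical stand-ins invented for illustration, and real plugins would call external services and stream audio rather than work on whole lists of frames.

```python
from dataclasses import dataclass
from typing import Callable, List

# NOTE: every name below is a hypothetical stand-in for illustration only;
# none of it is the LiveKit Agents API.


@dataclass
class VoicePipeline:
    vad: Callable[[bytes], bool]        # True if a frame contains speech
    stt: Callable[[List[bytes]], str]   # speech frames -> transcript
    llm: Callable[[str], str]           # transcript -> reply text
    tts: Callable[[str], bytes]         # reply text -> synthesized audio

    def handle_turn(self, frames: List[bytes]) -> bytes:
        """Run one user turn through VAD -> STT -> LLM -> TTS."""
        # Keep only the frames the VAD stage classifies as speech.
        speech = [frame for frame in frames if self.vad(frame)]
        if not speech:
            return b""  # nothing was said, so there is nothing to answer
        transcript = self.stt(speech)
        reply = self.llm(transcript)
        return self.tts(reply)


# Toy stages so the sketch runs without any external service.
def toy_vad(frame: bytes) -> bool:
    return frame != b""  # treat empty frames as silence


def toy_stt(frames: List[bytes]) -> str:
    return b" ".join(frames).decode("utf-8")


def toy_llm(text: str) -> str:
    return f"You said: {text}"


def toy_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(vad=toy_vad, stt=toy_stt, llm=toy_llm, tts=toy_tts)
audio_out = pipeline.handle_turn([b"hello", b"", b"world"])
```

Because each stage is just a callable with a fixed shape, any one of them can be swapped for a different provider without touching the others, which is the property a plugin-based design aims for.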