An LMM-powered web agent that completes user instructions end to end by interacting with real-world websites.
WebVoyager is a web agent powered by a Large Multimodal Model (LMM) that carries out web tasks online from start to finish without human intervention. It combines textual and visual information to navigate and interact with real-world websites, handling complex tasks such as locating specific information, making selections, and completing transactions. On a benchmark of tasks drawn from 15 popular websites, WebVoyager achieved a 59.1% task success rate, significantly higher than the other models it was compared against. The project is open source, with code and data available for further development and research.