bytedance/UI-TARS-desktop
Original summary
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

Introduction

TARS is a multimodal AI Agent stack, currently shipping two projects: Agent TARS and UI-TARS-desktop.

Agent TARS is a general multimodal AI Agent stack that brings the power of GUI Agent and Vision into your terminal, computer, browser, and product. It primarily ships with a CLI and a Web UI. It aims to provide a workflow closer to human-like task completion through cutting-edge multimodal LLMs and seamless integration with various real-world MCP tools.

UI-TARS Desktop is a desktop application that provides a native GUI Agent based on the UI-TARS model. It primarily ships local and remote computer operators as well as browser operators.

Table of Contents: News; Agent TARS (Showcase, Core Features, Quick Start, Documentation); UI-TARS Desktop (Showcase, Features, Quick Start); Contributing; License; Citation

News

[2025-11-05] 🎉 We're excited to announce the release of Agent TARS CLI v0.3.0! This version brings streaming support for multiple tools (shell commands, multi-file structured display), runtime settings with timing statistics for too…