Moonshot AI Unveils Kimi K2.5: Open-Source Multimodal Models Enter the Agent Swarm Era
Beijing, January 27, 2026 — Moonshot AI, a leading artificial intelligence company in China, today announced the official release of Kimi K2.5, its most powerful open-source model to date. This release marks a significant breakthrough in open-source multimodal AI technology, particularly in the realm of agent collaboration.
Technical Architecture and Scale
Kimi K2.5 builds upon Kimi K2 through continued pre-training on approximately 15 trillion mixed visual and text tokens, achieving native multimodal capabilities. The model delivers state-of-the-art performance in coding and vision tasks while introducing a novel self-directed agent swarm paradigm.
Kimi K2.5 is now available through Kimi.com, the Kimi application, API access, and the newly launched coding product Kimi Code. The new version supports four operational modes: K2.5 Instant, K2.5 Thinking, K2.5 Agent, and K2.5 Agent Swarm (Beta). The Agent Swarm mode is currently live on Kimi.com, with free credits available for high-tier paid users.
Agent Swarm: A New Paradigm for AI Collaboration
The most notable innovation of Kimi K2.5 lies in its Agent Swarm functionality. For complex tasks, the model can autonomously coordinate up to 100 sub-agents working in parallel, completing over 1,500 tool calls during execution. Compared to traditional single-agent architectures, the Agent Swarm mode delivers a 2.2x to 4.5x speedup, cutting execution time to roughly 45% of the single-agent baseline at the low end and to about 22% (between one-fifth and one-quarter) at the high end.
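As a quick check on how a speedup factor maps to remaining run time, the conversion is simply the reciprocal of the speedup. The short sketch below is illustrative only; the function name `remaining_time_fraction` is hypothetical and not part of any Moonshot AI tooling.

```python
def remaining_time_fraction(speedup: float) -> float:
    """Fraction of the single-agent run time left after a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup

# The two speedup figures quoted in the announcement.
for s in (2.2, 4.5):
    print(f"{s}x speedup -> {remaining_time_fraction(s):.0%} of single-agent time")
```

A 4.5x speedup therefore leaves a bit more than one-fifth of the original run time, while a 2.2x speedup leaves a little under half.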
This breakthrough stems from Moonshot AI's Parallel-Agent Reinforcement Learning (PARL) technology. The technique employs a trainable orchestrator agent that decomposes tasks into parallelizable subtasks, with each subtask executed by dynamically instantiated frozen sub-agents. By running these subtasks concurrently, the system significantly reduces end-to-end latency.
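The orchestrator-and-sub-agent pattern described here can be illustrated with a minimal sketch. This is not Moonshot AI's implementation: `decompose`, `run_subagent`, and `orchestrate` are hypothetical names, the decomposition is hard-coded rather than learned, and the sub-agent is a stand-in that only simulates latency.

```python
import asyncio

def decompose(task: str, n: int) -> list[str]:
    """Hypothetical decomposition; in PARL the trained orchestrator decides this."""
    return [f"{task} / part {i + 1}" for i in range(n)]

async def run_subagent(subtask: str) -> str:
    """Stand-in for a frozen sub-agent; a real one would call a model and tools."""
    await asyncio.sleep(0.01)  # simulate tool-call latency
    return f"result for {subtask}"

async def orchestrate(task: str, n_subagents: int = 4) -> list[str]:
    subtasks = decompose(task, n_subagents)
    # gather() runs the sub-agents concurrently and returns results in input order.
    return await asyncio.gather(*(run_subagent(s) for s in subtasks))

results = asyncio.run(orchestrate("survey GPU vendors", 3))
print(results)
```

Because the sub-agents run concurrently, the wall-clock time of this sketch is close to that of the slowest sub-agent rather than the sum of all of them.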
However, training a reliable parallel orchestrator presents severe challenges. Due to delayed, sparse, and non-stationary feedback from independently running sub-agents, a common failure mode is "serial collapse"—where the orchestrator defaults to single-agent execution despite having parallel capacity. To address this, PARL employs staged reward shaping that incentivizes parallelism early in training, gradually shifting focus toward task success.
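One way to picture staged reward shaping is as a weighted mix of a parallelism bonus and a task-success term, with the weight on parallelism annealed to zero over training. The sketch below assumes a linear schedule and a convex combination; the announcement does not specify the actual schedule or reward form, so every name and constant here is hypothetical.

```python
def parallelism_weight(step: int, anneal_steps: int, start: float = 0.5) -> float:
    """Linearly decay the parallelism weight from `start` to 0 (assumed schedule)."""
    frac = min(max(step / anneal_steps, 0.0), 1.0)
    return start * (1.0 - frac)

def shaped_reward(task_success: float, parallel_fraction: float,
                  step: int, anneal_steps: int = 1000) -> float:
    """Blend task success with a parallelism bonus that fades during training."""
    w = parallelism_weight(step, anneal_steps)
    return (1.0 - w) * task_success + w * parallel_fraction

# Early in training, a parallel rollout with slightly lower success is preferred...
early_parallel = shaped_reward(0.5, 1.0, step=0)
early_serial = shaped_reward(0.6, 0.0, step=0)
# ...while late in training only task success matters.
late_parallel = shaped_reward(0.5, 1.0, step=1000)
late_serial = shaped_reward(0.6, 0.0, step=1000)
```

Under these assumptions the early reward discourages serial collapse, and the late reward reduces to plain task success.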
To further ensure parallel strategies genuinely emerge, Moonshot AI introduced a computational bottleneck that makes sequential execution impractical. Instead of counting total steps, performance is evaluated using Critical Steps, a latency-oriented metric inspired by the critical path in parallel computation. This metric accounts for orchestration overhead while reflecting the slowest sub-agent at each stage, meaning spawning more subtasks only helps if it genuinely shortens the critical path.
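The idea behind Critical Steps can be sketched as follows: for each stage, count the orchestrator's own steps plus the steps of the slowest sub-agent, then sum over stages. This is an illustrative reading of the description above, not the official metric definition; the `Stage` structure and its field names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Stage:
    orchestrator_steps: int    # orchestration overhead in this stage
    subagent_steps: list[int]  # steps taken by each sub-agent running in parallel

def critical_steps(stages: list[Stage]) -> int:
    """Per stage: overhead plus the slowest sub-agent; summed across stages."""
    return sum(s.orchestrator_steps + max(s.subagent_steps, default=0) for s in stages)

def total_steps(stages: list[Stage]) -> int:
    """Plain step count, ignoring parallelism."""
    return sum(s.orchestrator_steps + sum(s.subagent_steps) for s in stages)

baseline = [Stage(1, [4, 7, 5]), Stage(2, [3, 3])]
# Adding a fourth sub-agent without shortening the slowest one changes nothing.
wider = [Stage(1, [4, 7, 5, 2]), Stage(2, [3, 3])]
# Splitting the slowest sub-agent's work shortens the critical path.
rebalanced = [Stage(1, [4, 4, 3, 5]), Stage(2, [3, 3])]

print(critical_steps(baseline), total_steps(baseline))
print(critical_steps(wider), critical_steps(rebalanced))
```

In this toy example the extra sub-agent in `wider` adds total work but leaves the critical path unchanged, whereas `rebalanced` lowers it, which matches the claim that more subtasks only help when they shorten the critical path.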
Breakthroughs in Visual Coding
Kimi K2.5 demonstrates exceptional capabilities in the coding domain, particularly excelling in front-end development. The model can transform simple conversations into complete front-end interfaces, implementing interactive layouts and rich animations, including scroll-triggered effects.
Unlike traditional text-driven programming paradigms, Kimi K2.5 achieves a leapfrog advancement through its visual reasoning capabilities. The model can analyze images and video content, improving the conversion from visual inputs to code outputs while supporting visual debugging, significantly lowering barriers for users expressing intent through visual means.
On Moonshot AI's internal Kimi Code Bench, which assesses end-to-end real-world software engineering capabilities across multiple programming languages, Kimi K2.5 showed consistent improvements over K2 across task types, including building, debugging, refactoring, testing, and scripting.
Office Intelligence and Professional Knowledge Processing
Kimi K2.5 brings agentic intelligence into practical knowledge work scenarios. Its Agent mode can handle high-density, large-scale information inputs, coordinate multi-step tool usage, and deliver expert-level outputs directly through conversation—including documents, spreadsheets, PDF files, and slide decks.
To assess performance in real office scenarios, Moonshot AI constructed two internal professional productivity benchmarks: the AI Office Benchmark for evaluating end-to-end office output quality, and the General Agent Benchmark measuring multi-step production-grade workflows against human expert performance. Results showed Kimi K2.5 achieved 59.3% and 24.3% improvements over K2 Thinking on these benchmarks respectively, demonstrating stronger end-to-end task processing capabilities.
The model supports advanced office functions such as annotations in Word documents, financial model construction with Pivot Tables, and LaTeX equation writing in PDFs, while scaling to long-form outputs like 10,000-word papers or 100-page documents.
Performance Benchmarks and Industry Comparisons
In industry-standard evaluations, Kimi K2.5 was comprehensively benchmarked against top-tier models including GPT-5.2, Claude Opus 4.5, Gemini 3 Pro, DeepSeek V3.2, and Qwen3-VL-235B-A22B.
In coding tasks, Kimi K2.5 achieved 76.8% on SWE-Bench Verified, 73.0% on SWE-Bench Multilingual, 50.8% on Terminal-Bench 2.0, and an outstanding 85.0% on LiveCodeBench (v6).
In visual understanding, the model scored 78.5% on MMMU-Pro, 84.2% on MathVision, 90.1% on MathVista (mini), 88.8% on OmniDocBench 1.5, and as high as 92.6% on InfoVQA (test).
In agentic search tasks, Kimi K2.5 achieved 60.6% on BrowseComp (improving to 74.9% with context management and 78.4% with Agent Swarm), 72.7% on WideSearch (improving to 79.0% with Agent Swarm), and 77.1% on DeepSearchQA.
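To make the Agent Swarm gains easier to compare, the snippet below computes the percentage-point difference between the base score and the Agent Swarm score for the two benchmarks where both are reported above. The dictionary layout and the helper name `swarm_gain` are purely illustrative.

```python
# Scores (in percent) as reported in the announcement.
scores = {
    "BrowseComp": {"base": 60.6, "agent_swarm": 78.4},
    "WideSearch": {"base": 72.7, "agent_swarm": 79.0},
}

def swarm_gain(benchmark: str) -> float:
    """Percentage-point gain of Agent Swarm over the base score."""
    entry = scores[benchmark]
    return round(entry["agent_swarm"] - entry["base"], 1)

for name in scores:
    print(f"{name}: +{swarm_gain(name)} points with Agent Swarm")
```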
Notably, in agentic search tasks, Kimi K2.5 achieved comparable or superior performance to closed-source top models at a significantly lower computational cost, opening new pathways for commercial applications of open-source artificial intelligence.
Agent Intelligence for the Future
The release of Kimi K2.5 marks an important milestone for open-source multimodal models in the realm of agentic intelligence. By integrating visual coding, agent swarm collaboration, and office productivity capabilities into a unified platform, the model demonstrates AI's potential to undertake increasingly complex tasks in knowledge work.
Moonshot AI stated that it will continue pushing the frontiers of agentic intelligence and redefining the boundaries of AI in knowledge work. With the open-source release of models like Kimi K2.5, this technology is rapidly moving from laboratory research into practical use by developers and enterprises.
About Moonshot AI
Moonshot AI is a leading artificial intelligence company in China, dedicated to developing general artificial intelligence technologies. The company's Kimi intelligent assistant products already serve millions of users, and the release of Kimi K2.5 further solidifies its leading position in the open-source multimodal model domain.