MiniMax Launches Desktop AI Agent, Positioning Itself as an "AI Intern" for Everyday Work

Chinese AI Startup Introduces Desktop Application That Moves Beyond Chatbot Interactions to Actual Desktop Productivity Tools

  BEIJING — MiniMax, a leading Chinese artificial intelligence startup, has released a desktop version of its AI Agent, marking a significant step toward making AI assistants true digital coworkers capable of performing real tasks on users' computers rather than simply responding to queries within a chat interface.

  The new Desktop App, available for both Mac and Windows operating systems, represents what MiniMax describes as an "AI-native Workspace." Unlike traditional chatbot experiences confined to a conversation window, this 2.0 version can directly manipulate files on a user's computer and automate browser-based tasks after receiving explicit permission.

  According to company statements, MiniMax refers internally to this way of working as the "Agent Intern," and the vast majority of its employees have reportedly adopted it. The concept has now been productized and made available to everyone.

  A Shift from Conversation to Action

  The timing of this release is noteworthy. Just last week, Anthropic introduced Claude Cowork, which similarly aims to transform Claude from a "chatbox assistant" into a "desktop employee" capable of operating on local files and automatically executing tasks. These developments may not yet make AI Agents fully practical, but they do push Agent capabilities from concept demonstrations closer to genuine productivity workflows.

  MiniMax Agent's approach centers on three core components: Desktop App capabilities, Expert Agents functionality, and personalized expert configurations. The Desktop App enables the Agent to operate directly on a user's local file system and take control of web browsers to execute automated tasks. Expert Agents allow users to upload private knowledge bases and configure specialized instructions, creating domain-specific "expert avatars." These personalized experts can be trained on proprietary knowledge and standard operating procedures to replicate specialized reasoning.

  Testing Real-World Utility

  Independent testing of the core features yielded mixed but promising results. One evaluation focused on file organization. The user's desktop contained a chaotic accumulation of screenshots, documents, videos, and compressed files of unknown origin, and the MiniMax Agent was instructed to categorize everything. After the user confirmed the permissions for the files it intended to modify, the Agent analyzed and organized the files within seconds. The system separated images, documents, videos, and compressed files into appropriate folders, even distinguishing between "movies" and "other videos" and isolating a large, unnamed folder of web scraping data.
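
  The sorting step described above can be illustrated with a minimal sketch. It assumes files are grouped purely by extension; the mapping, the folder names, and the functions `categorize` and `plan_moves` are hypothetical and are not taken from MiniMax's product, whose actual rules are not documented in the evaluation.

```python
from pathlib import Path

# Hypothetical extension-to-folder mapping (an assumption for illustration;
# the Agent's real categorization rules are not described in the article).
CATEGORIES = {
    "Images": {".png", ".jpg", ".jpeg", ".gif", ".webp"},
    "Documents": {".pdf", ".docx", ".txt", ".md", ".xlsx"},
    "Videos": {".mp4", ".mov", ".mkv", ".avi"},
    "Archives": {".zip", ".rar", ".7z", ".tar", ".gz"},
}


def categorize(filename: str) -> str:
    """Return the destination folder name for a file, based on its extension."""
    suffix = Path(filename).suffix.lower()
    for folder, extensions in CATEGORIES.items():
        if suffix in extensions:
            return folder
    return "Other"


def plan_moves(filenames: list[str]) -> dict[str, str]:
    """Build a filename -> folder plan without touching the file system."""
    return {name: categorize(name) for name in filenames}
```

  Producing a plan first, rather than moving files immediately, mirrors the confirmation step the evaluation describes: the plan can be shown to the user before anything on disk changes. Distinguishing "movies" from "other videos" would need more than extensions, such as duration or file size, which this sketch does not attempt.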

  A minor technical issue arose when a filename containing spaces caused a move operation to fail, but the Agent detected the problem, repaired it, and continued execution without user intervention.
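
  The evaluation does not say why the move failed. One common cause of this class of bug, assumed here purely for illustration, is building a shell command without quoting the filename. The sketch below contrasts that with two safer options from the Python standard library; the helper names are hypothetical.

```python
import shlex
import shutil
from pathlib import Path


def naive_command(src: str, dst: str) -> list[str]:
    """Split an unquoted command on whitespace; a space in a name splits it in two."""
    return f"mv {src} {dst}".split()


def quoted_command(src: str, dst: str) -> str:
    """Quote each argument so a shell treats it as a single word."""
    return f"mv {shlex.quote(src)} {shlex.quote(dst)}"


def safe_move(src: Path, dst_dir: Path) -> Path:
    """Move a file without going through a shell at all."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(src), str(dst_dir / src.name)))
```

  Avoiding the shell entirely, as `safe_move` does, is usually the simpler choice because no quoting rules apply.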

  Automating Information Gathering

  A second test examined web automation capabilities with a practical daily task: monitoring Hacker News for artificial intelligence-related content. The Agent was instructed to browse the platform, identify popular posts related to AI, and save the results as a Markdown document locally, including titles, links, content summaries, vote counts, and comment numbers.

  The Agent operated within an embedded browser interface rather than taking over the user's existing Chrome session, allowing the user to continue other work while the task executed. Within approximately thirty seconds, the system produced a Markdown file listing AI-related trending posts from Hacker News with accurate information and functional links.

  The testing was extended with additional instructions to sort the collected data by popularity, present it in table format, and provide a summary of the main discussion themes in the AI sector. The Agent updated the Markdown file to include a table and analytical summary with what it described as "core insights." A subsequent request to generate an HTML page displaying posts in a card format was completed quickly, producing a clean visualization with all functional links.
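
  The sort-and-tabulate step can be sketched as follows. This is an illustration under stated assumptions, not the Agent's actual output format: the `Post` fields and the `to_markdown_table` function are hypothetical, and the posts are assumed to have been collected already.

```python
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical record for one collected post (fields chosen for illustration).
    title: str
    url: str
    points: int
    comments: int


def to_markdown_table(posts: list[Post]) -> str:
    """Sort posts by points, highest first, and render them as a Markdown table."""
    ranked = sorted(posts, key=lambda p: p.points, reverse=True)
    lines = ["| Title | Points | Comments |", "| --- | --- | --- |"]
    for post in ranked:
        # Escape pipes so a title cannot break the table layout.
        title = post.title.replace("|", "\\|")
        lines.append(f"| [{title}]({post.url}) | {post.points} | {post.comments} |")
    return "\n".join(lines)
```

  The resulting string can be written to a local `.md` file. The summary of discussion themes that the Agent added would require a language model and is outside the scope of this sketch.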

  "This complete chain from information retrieval to organization to visualization ran smoothly," the evaluation noted. "If scheduled to run periodically or set as a regular task, it could eliminate considerable repetitive labor."

  The evaluation acknowledged a limitation: without specifying the quantity of posts to retrieve, only content from the front page was captured, suggesting that more extensive data collection would require more specific instructions.

  Building Domain-Specific Experts

  The Expert Agents functionality available through the web interface allows users to create specialized AI experts by uploading knowledge bases and setting custom instructions. The company describes this as enabling senior specialists to encapsulate their exclusive methodologies, allowing newcomers to quickly become productive.

  One test involved creating an "Economist-style writing expert" using a copy of The Economist's style guide, which distills decades of writing standards into principles for explaining complex problems clearly, concisely, and with analytical judgment. The process of uploading the entire book and having the Agent construct the Expert took approximately ten minutes, with a brief pause during the process that required a prompt to continue. The evaluation noted that adding progress indicators or status messages would improve the user experience.

  After the Expert was created, the system was tested by requesting an article on a trending topic. The Agent retrieved relevant information and produced a commentary article that was described as rational, data-driven, and featuring class analysis perspectives reminiscent of The Economist's writing style. However, the evaluation noted that careful reading revealed traces of AI generation: certain expressions appeared somewhat formulaic and lacked the nuanced judgment and rhythm of human authors. Whether this can be improved by further refining the instructions or is inherent to the underlying model's writing capabilities remains unclear.

  This Expert Agents functionality is best suited for scenarios requiring repeated application of specific knowledge systems, such as content review, style consistency enforcement, and standard operating procedure execution.

  Competitive Landscape: MiniMax vs. Claude Cowork

  Any look at desktop AI Agent offerings inevitably invites comparison with Anthropic's Claude Cowork, which entered the market around the same time with similar ambitions. The two products are positioned quite differently.

  In terms of platform support, MiniMax offers both Mac and Windows compatibility while Claude Cowork currently supports only macOS. Pricing presents a significant difference: MiniMax is currently offered free of charge during a limited-time promotional period, whereas Claude Cowork requires a subscription. For users in China, MiniMax is natively accessible while Claude Cowork requires specialized network infrastructure.

  Claude Cowork's advantages include stronger underlying model capabilities, with Opus 4.5's reasoning abilities representing current industry benchmarks, along with integration with the Claude Code ecosystem providing developer-friendly functionality. MiniMax's advantages lie in lower barriers to entry and localization, making it more accessible for ordinary users.

  Both products are developing core capabilities around local file operation, web automation, and expert or skill systems with remarkably similar approaches. This convergence suggests the industry is developing a shared understanding of what an "AI desktop employee" should look like.

  Assessment and Industry Context

  After comprehensive testing, the assessment concluded that MiniMax Agent is functional but requires appropriate expectations. File organization, information collection, web scraping, and data visualization are repetitive tasks the system can handle effectively. The complete workflow from information retrieval to organization to visualization operates successfully and can eliminate considerable mechanical labor. The Expert Agents concept is also well-conceived, encapsulating experience into reusable systems suitable for team collaboration scenarios.

  However, the evaluation identified several current limitations. Complex writing tasks still exhibit detectable AI characteristics requiring human editing. The Expert creation process lacks progress feedback, diminishing the user experience. The depth of web automation capabilities requires further verification.

  These limitations reflect the broader state of the industry. As Dylan Field, CEO of Figma, has observed, intuition suggests we are currently in the "MS-DOS era" of artificial intelligence. Andrej Karpathy has offered a similar assessment, characterizing large language models as a new "operating system" while noting that the industry remains in the "1960s OS design phase," possessing computational capability but lacking the mouse, windows, and desktop metaphors that would make AI truly accessible.

  Products like Claude Cowork and MiniMax Agent are, in a sense, attempting to construct "mice and windows" for AI, enabling artificial intelligence to enter not just conversation but file systems, browsers, and actual workflows.

  This evolution is just beginning. Today's products represent only the first step, but the direction is established. The ultimate potential will likely require several years of industry development to fully realize.

  For users seeking to experience this direction firsthand, the current timing may be opportune, particularly given MiniMax Agent's limited-time free availability.
