Founded in 2023 and headquartered in San Carlos, Simular AI is an AI startup that develops 'computer-using agents', which simulate human interaction with graphical user interfaces (GUIs). Founders Ang Li and Jiachen Yang bring experience from organizations such as DeepMind, Google, and Baidu. The company's core mission is to create AI agents that can use computers as humans do, automating tedious digital tasks and freeing up human potential.
The core technology is the Agent S framework and its upgraded version, Agent S2, an open, modular, and extensible agent framework. It pairs general models for high-level planning with specialized models for low-level execution and interface grounding, achieving leading performance on multiple benchmarks. Agent S2 introduces innovations such as Proactive Hierarchical Planning (PHP) and Mixture-of-Grounding (MoG), enabling precise GUI manipulation using only screenshots. The company embraces open source, and the Agent S/S2 frameworks are available on GitHub.
The product portfolio includes Simular for macOS/Browser (a local Mac browser agent) and Simular Desktop (a cross-platform desktop assistant) for individuals, and Simular for Business (autonomous digital employees) for enterprises. The products emphasize the security and performance of local execution and focus on human-computer collaboration. The company uses a freemium pricing model (currently in beta) and offers custom solutions for businesses.
In 2024, the company completed a $5 million early-stage funding round with investors including Basis Set Ventures, Flying Fish Partners, Samsung NEXT Ventures, and South Park Commons.
Core capability: interacts with graphical interfaces by simulating human operations, without relying on APIs.
Advanced modular agent framework combining general model planning with specialized model execution/grounding.
Proactively predicts and dynamically adjusts plans to adapt to real-time environmental changes, improving task success rates.
Utilizes multiple grounding experts to precisely locate UI elements using only screenshot input.
Agent S/S2 frameworks are open source, promoting community participation and technological transparency.
Personal products emphasize running on the user's device, enhancing data security and privacy protection.
Records user digital operations and can automatically replay them, simplifying the creation of automated repetitive tasks.
Agents can attempt different methods for self-correction when errors occur during execution, improving robustness.
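The record-and-replay and self-correction capabilities described above can be illustrated with a small sketch. It is not Simular's implementation: the `Action` schema, the `Recorder` class, and the two strategy functions are hypothetical names invented for this example, and the strategies only simulate success or failure instead of driving a real GUI.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Action:
    """One recorded user operation (hypothetical schema)."""
    kind: str       # e.g. "click" or "type"
    target: str     # label of the UI element acted on
    text: str = ""  # text to enter, for "type" actions


# A strategy attempts to perform an action and reports whether it succeeded.
Strategy = Callable[[Action], bool]


class Recorder:
    """Collects actions in the order the user performed them."""

    def __init__(self) -> None:
        self._actions: List[Action] = []

    def record(self, action: Action) -> None:
        self._actions.append(action)

    def recording(self) -> List[Action]:
        return list(self._actions)


def replay(actions: Sequence[Action], strategies: Sequence[Strategy]) -> List[int]:
    """Replay actions, trying each strategy in order until one succeeds.

    Returns, per action, the index of the strategy that worked.
    Raises RuntimeError if every strategy fails for an action.
    """
    used: List[int] = []
    for action in actions:
        for index, strategy in enumerate(strategies):
            if strategy(action):
                used.append(index)
                break
        else:
            raise RuntimeError(f"all strategies failed for {action}")
    return used


def by_cached_position(action: Action) -> bool:
    """Simulated strategy: fails for 'Submit', as if the button had moved."""
    return action.target != "Submit"


def by_visual_search(action: Action) -> bool:
    """Simulated fallback: always finds the element."""
    return True


recorder = Recorder()
recorder.record(Action("click", "Name field"))
recorder.record(Action("type", "Name field", "Ada"))
recorder.record(Action("click", "Submit"))

result = replay(recorder.recording(), [by_cached_position, by_visual_search])
print(result)  # [0, 0, 1]
```

The last action falls back to the second strategy, which is the self-correction step: the replay continues with an alternative method instead of aborting on the first failure.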
Traditional automation methods (RPA, API integration) have limitations. Simular AI is dedicated to building intelligent agents that can directly **perceive, reason, and operate** GUIs across various platforms. Its '**computer-using agents**' understand the current state by observing the screen, precisely simulate human keyboard and mouse operations, and integrate the cognitive patterns of **fast thinking** (intuitive reaction) and **slow thinking** (deep reasoning).
The core technology is embodied in the **open, modular, and extensible** Agent S/S2 frameworks. Both adhere to the design principle of using **general models** for high-level planning and **specialized models** for low-level execution and interface '**grounding**'.
Tech Dimension | Agent S | Agent S2 (Innovations)
---|---|---
Planning Capability | Experience-Enhanced Hierarchical Planning | **Proactive Hierarchical Planning (PHP)**: Predicts future states and dynamically adjusts plans
Human-Computer Interface | Basic Agent-Computer Interface (ACI) | **Enhanced ACI**: Intelligently assigns tasks to expert modules
GUI Element Localization | Relies on multimodal input, limited by accessibility APIs | **Mixture-of-Grounding (MoG)**: Precisely locates interface elements using only screenshots
Learning & Adaptation | Basic experience memory mechanism | **Advanced Memory System & Self-Correction**: Continuous learning and strategy adjustment
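The two Agent S2 ideas in the table, re-planning after every step and routing each element lookup to a specialized grounding expert, can be sketched in a toy form. This is an illustration under stated assumptions, not the Agent S2 code: every name here (`Subgoal`, `EXPERTS`, `ToyDesktop`, `plan`, `run`) is hypothetical, the "screenshot" is a plain dictionary standing in for pixels, and the rule-based planner stands in for a general model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, int]
# Toy stand-in for a screenshot: element label -> (kind, centre point).
Screenshot = Dict[str, Tuple[str, Point]]


@dataclass(frozen=True)
class Subgoal:
    label: str  # element to act on
    kind: str   # which grounding expert should locate it


def text_expert(shot: Screenshot, label: str) -> Point:
    """Hypothetical expert for text-labelled elements (a lookup in this toy)."""
    return shot[label][1]


def icon_expert(shot: Screenshot, label: str) -> Point:
    """Hypothetical expert for icon-like elements (a lookup in this toy)."""
    return shot[label][1]


# Mixture-of-grounding idea: dispatch each lookup to the matching expert.
EXPERTS: Dict[str, Callable[[Screenshot, str], Point]] = {
    "text": text_expert,
    "icon": icon_expert,
}


def plan(shot: Screenshot, done: List[Subgoal]) -> List[Subgoal]:
    """Toy planner for 'save the document', re-run after every step."""
    steps = [Subgoal("File", "text"), Subgoal("Save", "text")]
    remaining = [s for s in steps if s not in done]
    # Plan adjustment: an unexpected dialog has to be dismissed first.
    if "Close dialog" in shot:
        remaining.insert(0, Subgoal("Close dialog", "icon"))
    return remaining


class ToyDesktop:
    """Simulated GUI whose state changes in response to clicks."""

    def __init__(self) -> None:
        self.shot: Screenshot = {"File": ("text", (10, 5))}
        self.clicks: List[Point] = []

    def screenshot(self) -> Screenshot:
        return dict(self.shot)

    def click(self, point: Point) -> None:
        self.clicks.append(point)
        if point == (10, 5):      # menu opens, and a dialog pops up as well
            self.shot = {"Close dialog": ("icon", (200, 80)),
                         "Save": ("text", (10, 30))}
        elif point == (200, 80):  # dialog dismissed
            del self.shot["Close dialog"]
        elif point == (10, 30):   # document saved
            self.shot = {}


def run(desktop: ToyDesktop, max_steps: int = 10) -> List[Subgoal]:
    done: List[Subgoal] = []
    for _ in range(max_steps):
        shot = desktop.screenshot()
        remaining = plan(shot, done)  # re-plan from the latest observation
        if not remaining:
            break
        goal = remaining[0]
        desktop.click(EXPERTS[goal.kind](shot, goal.label))
        done.append(goal)
    return done


desktop = ToyDesktop()
completed = [goal.label for goal in run(desktop)]
print(completed)  # ['File', 'Close dialog', 'Save']
```

Because the plan is recomputed from a fresh observation on every iteration, the unexpected dialog is handled as an extra subgoal without the original two-step plan having anticipated it.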
Open source is a core differentiator for Simular AI. The Agent S/S2 frameworks are fully open-sourced on GitHub. The company maintains several active repositories (Agent-S, OpenACI, pysimular, etc.) and has established a Discord community to foster developer exchange. Using the framework requires configuring Python environments and Docker, and depends on external LLM services and specialized grounding models.
Product design revolves around AI agents collaborating with users, emphasizing **human-computer collaboration** and user control. The products focus on **local execution (on-device)** to enhance security, responsiveness, and user experience, and they provide features for recording, sharing, and replaying digital actions.
Simular offers a range of plans covering different user needs:
Plan Name | Price | Key Features | Target User | Available Add-ons |
---|---|---|---|---|
Free Plan | $0/month | Basic workspace tools; public community actions; no private actions | Individual starter users | None |
Premium Plan | $19.99/device/month | Includes Free features; private/team channel actions; local execution | Individuals/teams needing privacy/collaboration | Server, Concierge |
Simular for Business | Contact Sales | Autonomous digital employees; enterprise-grade features & services | Enterprise users | Custom services |
Premium Add-on Services | | | | |
Server | +$39.99/device/month | Simular-hosted server; includes 200 agent hours; additional hours at $0.10/hour | Users needing cloud computing power | -
Concierge | Contact Sales | Request Simular experts for custom results without creating actions yourself | Users needing expert services | - |
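As a worked example of the pricing above, the following sketch estimates a monthly bill for the Premium Plan with the optional Server add-on. It rests on assumptions the table does not confirm: that the add-on is billed on top of the Premium price (suggested by the "+" prefix), that the 200 included agent hours apply per device, and that overage is billed in whole hours. The function name `monthly_cost` is made up for this illustration.

```python
from decimal import Decimal

PREMIUM_PER_DEVICE = Decimal("19.99")  # Premium Plan, per device per month
SERVER_PER_DEVICE = Decimal("39.99")   # Server add-on, per device per month
INCLUDED_AGENT_HOURS = 200             # assumed to be counted per device
OVERAGE_PER_HOUR = Decimal("0.10")     # price of each additional agent hour


def monthly_cost(devices: int, agent_hours_per_device: int,
                 server: bool = True) -> Decimal:
    """Estimate the monthly bill under the assumptions stated above."""
    per_device = PREMIUM_PER_DEVICE
    if server:
        extra_hours = max(0, agent_hours_per_device - INCLUDED_AGENT_HOURS)
        per_device += SERVER_PER_DEVICE + OVERAGE_PER_HOUR * extra_hours
    return per_device * devices


# One device using 250 agent hours: 19.99 + 39.99 + 50 * 0.10
print(monthly_cost(1, 250))               # 64.98
# Three devices staying within the included hours: 3 * (19.99 + 39.99)
print(monthly_cost(3, 150))               # 179.94
# Two devices on Premium only, without the Server add-on
print(monthly_cost(2, 0, server=False))   # 39.98
```

`Decimal` is used instead of floating point so that currency amounts such as 19.99 are represented exactly.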
The AI agent segment is developing rapidly and attracting significant attention and investment. Core objectives include workflow automation, task execution, code generation, data analysis, and software interaction (GUI/API).
Technological approaches are diverse: direct GUI interaction, API orchestration, code generation, conversational AI, and no-code/low-code platforms.
Simular AI faces multi-dimensional competition:
**Direct GUI Automation Competitors:** OpenAI Operator/CUA, Manus AI, Genspark Superagent, Ace, Proxy AI.
**Broader AI Agent Frameworks/Platforms:** LangChain, AutoGen, CrewAI, No-code/Low-code platforms (Gumloop, n8n, Google, Microsoft, UiPath, etc.), other open-source agents (Rasa, Haystack, etc.).
**Existing Productivity Suites:** Microsoft 365 Copilot, Google Workspace AI.
Competitor | Focus | Technology/Method | Open Source | Use Case | Differentiation |
---|---|---|---|---|---|
Simular AI | GUI Automation | Modular (MoG, PHP), Human-like interaction, Screenshot analysis | Yes (Core) | Personal/Enterprise Automation | Open source, Local execution, Human-computer collaboration, State-of-the-art benchmark performance
OpenAI Operator | GUI Automation | GPT-4o, Task decomposition | No (Model) | Forms/E-commerce | OpenAI Ecosystem, Strong base model
Manus AI | General AI Agent (GUI) | Multi-agent collaboration, or uses Claude 3.x | No | Complex task automation | High attention/funding, Reliability concerns
Genspark Superagent | API Orchestration/Tool Calling | Hybrid agent (9+ models), 80+ tools, API integration | No | Broad computer tasks | Hybrid agent, Rich toolset, API focus
Ace | GUI Automation | Direct local keyboard/mouse control, Observational learning | No | Quick desktop tasks | Local direct control, Claimed speed advantage
Proxy AI | Web Browsing Automation | Parallel processing (multi-agent), Natural language commands | No | Web research/Data collection/Form filling | Web focus, Parallel processing speedup
LangChain | LLM Application Framework | Prompt chaining, Data integration, Agent modules | Yes | Building various LLM apps | Broad ecosystem, Flexible, Not GUI-focused
AutoGen | Multi-Agent Conversation Framework | Multi-agent coordination, Code generation, Self-correction | Yes | Complex workflows, Programming tasks | Microsoft support, Strong in code/multi-agent interaction
CrewAI | Multi-Agent Orchestration Framework | Role-playing agent collaboration, Task delegation | Yes | Collaborative task automation | High usability, Focus on agent team coordination
UiPath Agent Builder | Low-Code Platform | Visual design, Integrates UiPath ecosystem | No | Enterprise RPA/Automation | Enterprise-focused, Deep UiPath integration |