Simular AI Agent

Founded in 2023 and headquartered in San Carlos, Simular AI is a startup developing 'computer-using agents' that simulate human interaction with graphical user interfaces (GUIs). Founders Ang Li and Jiachen Yang have backgrounds at organizations including DeepMind, Google, and Baidu. The company's mission is to create AI agents that use computers the way humans do, automating tedious digital tasks and freeing up human potential.

The core technology is the Agent S framework and its successor, Agent S2, an open, modular, and extensible agent framework. It pairs general models for high-level planning with specialized models for low-level execution and interface grounding, and reports leading performance on multiple benchmarks. Agent S2 introduces Proactive Hierarchical Planning (PHP) and Mixture-of-Grounding (MoG), enabling precise GUI manipulation from screenshots alone. The company embraces open source, and the Agent S/S2 frameworks are available on GitHub.

The product portfolio includes Simular for macOS/Browser (a local Mac browser agent) and Simular Desktop (a cross-platform desktop assistant) for individuals, and Simular for Business (autonomous digital employees) for enterprises. The products emphasize the security and performance of local execution and focus on human-computer collaboration. The company uses a freemium pricing model (currently in beta) and offers custom solutions for businesses.

In 2024, the company completed a $5 million early-stage funding round with investors including Basis Set Ventures, Flying Fish Partners, Samsung NEXT Ventures, and South Park Commons.

Core Features

Human-like GUI Interaction

The core capability: agents interact with graphical interfaces by simulating human operations, without relying on APIs.

Agent S2 Framework

Advanced modular agent framework combining general model planning with specialized model execution/grounding.

Proactive Hierarchical Planning (PHP)

Proactively predicts and dynamically adjusts plans to adapt to real-time environmental changes, improving task success rates.
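
The replanning idea can be sketched as a loop that rebuilds the remaining plan after every completed subgoal instead of following the initial plan to the end. This is a minimal illustration under stated assumptions: `run_with_replanning`, the `Planner` signature, and the toy environment are hypothetical names, not part of the Agent S2 codebase, and the planner here is a stub standing in for a model call.

```python
from typing import Callable, List

# Hypothetical planner signature: (goal, observation, completed subgoals) -> remaining subgoals.
Planner = Callable[[str, str, List[str]], List[str]]


def run_with_replanning(goal: str, observe: Callable[[], str],
                        execute: Callable[[str], None], plan: Planner,
                        max_steps: int = 20) -> List[str]:
    """Execute one subgoal at a time and replan after each one."""
    done: List[str] = []
    remaining = plan(goal, observe(), done)
    while remaining and len(done) < max_steps:
        subgoal = remaining[0]
        execute(subgoal)
        done.append(subgoal)
        # Proactive step: rebuild the rest of the plan from the new screen state.
        remaining = plan(goal, observe(), done)
    return done


# Toy environment: a dialog appears after the first action, so the plan must change.
state = {"screen": "editor"}


def observe() -> str:
    return state["screen"]


def execute(subgoal: str) -> None:
    if subgoal == "click Save":
        state["screen"] = "overwrite dialog"
    elif subgoal == "confirm overwrite":
        state["screen"] = "saved"


def toy_plan(goal: str, observation: str, done: List[str]) -> List[str]:
    # Stub planner: decides the next subgoal purely from the current screen.
    if observation == "saved":
        return []
    if observation == "overwrite dialog":
        return ["confirm overwrite"]
    return ["click Save"]


steps = run_with_replanning("save the file", observe, execute, toy_plan)
print(steps)
```

The unexpected dialog is handled because the plan is regenerated from the latest observation rather than fixed up front.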

Mixture-of-Grounding (MoG)

Utilizes multiple grounding experts to precisely locate UI elements using only screenshot input.
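
One way to picture a mixture of grounding experts is a router that hands each localization request to the expert suited to it and gets screen coordinates back. The sketch below is an assumption-laden illustration: the expert labels ("visual", "textual"), the `GroundingRequest` type, and the `ground` function are hypothetical, and the experts are stubs with fixed outputs rather than real models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Point = Tuple[int, int]


@dataclass
class GroundingRequest:
    kind: str          # which expert to use, e.g. "visual" or "textual" (illustrative labels)
    description: str   # natural-language description of the target element
    screenshot: bytes  # raw screenshot; the only observation the experts receive


# Each expert maps (description, screenshot) -> screen coordinates.
Expert = Callable[[str, bytes], Point]


def ground(request: GroundingRequest, experts: Dict[str, Expert]) -> Point:
    """Route the request to the matching expert and return its coordinates."""
    if request.kind not in experts:
        raise ValueError(f"no grounding expert registered for {request.kind!r}")
    return experts[request.kind](request.description, request.screenshot)


# Stub experts with fixed outputs, standing in for real models.
experts: Dict[str, Expert] = {
    "visual": lambda desc, img: (640, 360),   # e.g. a vision model locating an icon
    "textual": lambda desc, img: (120, 48),   # e.g. OCR locating a word
}

point = ground(GroundingRequest("textual", "the word 'Submit'", b""), experts)
print(point)
```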

Open Source Core

Agent S/S2 frameworks are open source, promoting community participation and technological transparency.

Local Execution Priority

Personal products emphasize running on the user's device, enhancing data security and privacy protection.

Operation Recording & Playback

Records user digital operations and can automatically replay them, simplifying the creation of automated repetitive tasks.
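
Recording and replay can be thought of as serializing a list of actions and later dispatching each one to a handler. The following is a minimal sketch, not Simular's implementation: the `Action` and `Recorder` classes, the JSON format, and the handler names are all assumptions made for illustration, and the handlers only log text instead of driving a real mouse or keyboard.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List


@dataclass
class Action:
    kind: str      # e.g. "click" or "type"
    payload: dict  # action parameters such as coordinates or text


class Recorder:
    """Collects actions in order and serializes them to JSON."""

    def __init__(self) -> None:
        self.actions: List[Action] = []

    def record(self, kind: str, **payload) -> None:
        self.actions.append(Action(kind, payload))

    def dumps(self) -> str:
        return json.dumps([asdict(a) for a in self.actions])


def replay(serialized: str, handlers: Dict[str, Callable[..., None]]) -> int:
    """Run each recorded action through its handler; return how many were replayed."""
    count = 0
    for item in json.loads(serialized):
        handlers[item["kind"]](**item["payload"])
        count += 1
    return count


# Handlers that only log what they would do.
log: List[str] = []
handlers = {
    "click": lambda x, y: log.append(f"click at ({x}, {y})"),
    "type": lambda text: log.append(f"type {text!r}"),
}

rec = Recorder()
rec.record("click", x=10, y=20)
rec.record("type", text="hello")
replayed = replay(rec.dumps(), handlers)
print(log)
```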

Self-Correction Capability

Agents can attempt different methods for self-correction when errors occur during execution, improving robustness.
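
A simple form of this behavior is an ordered list of alternative strategies, tried one after another until one succeeds. The sketch below assumes each strategy is a callable returning `True` on success; `attempt_with_fallbacks` and the two example strategies are hypothetical and only illustrate the control flow.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def attempt_with_fallbacks(
    strategies: Sequence[Tuple[str, Callable[[], bool]]],
) -> Tuple[Optional[str], List[str]]:
    """Try each strategy in order; return the first that succeeds and the ones that failed."""
    failed: List[str] = []
    for name, run in strategies:
        try:
            if run():
                return name, failed
        except Exception:
            pass  # an exception counts as a failed attempt
        failed.append(name)
    return None, failed


def click_menu() -> bool:
    # Simulated failure: the target element could not be found.
    raise RuntimeError("menu item not found")


def use_shortcut() -> bool:
    # Simulated success via a different route to the same result.
    return True


winner, failures = attempt_with_fallbacks([
    ("click the menu item", click_menu),
    ("use the keyboard shortcut", use_shortcut),
])
print(winner, failures)
```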

Technology Deep Dive: Agent S Framework & Computer-Using Agents

Core Concept: AI Computer-Using Agents

Traditional automation methods (RPA, API integration) have limitations. Simular AI is dedicated to building intelligent agents that can directly **perceive, reason, and operate** GUIs across various platforms. The company's '**computer-using agents**' understand state by observing the screen, precisely simulate human keyboard and mouse operations, and integrate two cognitive patterns: **fast thinking** (intuitive reaction) and **slow thinking** (deep reasoning).
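
The perceive, reason, and operate cycle with a fast and a slow path can be sketched as follows. This is a toy model under explicit assumptions: the fast path is a lookup table of known reactions, the slow path is a stub function standing in for a reasoning model, and every name (`choose_action`, `run_agent`, `reflexes`, `deliberate`) is hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple


def choose_action(observation: str,
                  reflexes: Dict[str, str],
                  deliberate: Callable[[str], Optional[str]]) -> Optional[str]:
    """Fast path: reuse a known reaction. Slow path: call the deliberate reasoner."""
    if observation in reflexes:
        return reflexes[observation]
    return deliberate(observation)


def run_agent(perceive: Callable[[], str], act: Callable[[str], None],
              reflexes: Dict[str, str],
              deliberate: Callable[[str], Optional[str]],
              max_steps: int = 10) -> List[Tuple[str, str]]:
    """Loop over perceive -> reason -> operate until the reasoner returns None."""
    trace: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        observation = perceive()
        action = choose_action(observation, reflexes, deliberate)
        if action is None:  # the reasoner signals that the task is finished
            break
        act(action)
        trace.append((observation, action))
    return trace


# Toy environment: each action advances to the next screen.
screens = iter(["cookie banner", "login form", "dashboard"])
current = {"screen": next(screens)}


def perceive() -> str:
    return current["screen"]


def act(action: str) -> None:
    current["screen"] = next(screens, "dashboard")


def deliberate(observation: str) -> Optional[str]:
    # Stub for slow reasoning; a real system would query a model here.
    if observation == "dashboard":
        return None
    return f"work out how to handle {observation}"


reflexes = {"cookie banner": "click Accept"}
trace = run_agent(perceive, act, reflexes, deliberate)
print(trace)
```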

Agent S vs. S2 Framework Comparison

The core technology is embodied in the **open, modular, and extensible** Agent S/S2 frameworks. Both adhere to the design principle of using **general models** for high-level planning and **specialized models** for low-level execution and interface '**grounding**'.

| Tech Dimension | Agent S | Agent S2 (Innovations) |
|---|---|---|
| Planning Capability | Experience-Enhanced Hierarchical Planning | **Proactive Hierarchical Planning (PHP)**: Predicts future states and dynamically adjusts plans |
| Human-Computer Interface | Basic Agent-Computer Interface (ACI) | **Enhanced ACI**: Intelligently assigns tasks to expert modules |
| GUI Element Localization | Relies on multimodal input, limited by accessibility APIs | **Mixture-of-Grounding (MoG)**: Precisely locates interface elements using only screenshots |
| Learning & Adaptation | Basic experience memory mechanism | **Advanced Memory System & Self-Correction**: Continuous learning and strategy adjustment |

Performance & Benchmarks

Agent S

  • OSWorld: 83.6% relative improvement in success rate over the baseline
  • WindowsAgentArena: Demonstrated excellent cross-platform generalization

Agent S2 (SOTA Performance)

  • OSWorld: 34.5% accuracy at 50 steps, surpassing OpenAI CUA
  • WindowsAgentArena: 52.8% relative improvement over the previous best result
  • AndroidWorld: 50% accuracy, surpassing UI-TARS
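
When reading these figures, it helps to separate a relative improvement (the gain divided by the baseline score) from an absolute difference in percentage points. The sketch below assumes the improvement percentages above are relative gains; the scores used are made-up illustrative numbers, not benchmark results.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Return the relative gain of `new` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline * 100.0


# Illustrative scores only (not taken from any benchmark report).
baseline_success, agent_success = 10.0, 15.0
gain = relative_improvement(baseline_success, agent_success)
print(f"{gain:.1f}% relative improvement, "
      f"{agent_success - baseline_success:.1f} percentage points absolute")
```

A small absolute gain on a low baseline can therefore appear as a large relative improvement, so the two kinds of figures should not be compared directly.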

Open Source Ecosystem & Community

Open source is a core differentiator for Simular AI. The Agent S/S2 frameworks are fully open-sourced on GitHub. The company maintains several active repositories (Agent-S, OpenACI, pysimular, etc.) and has established a Discord community to foster developer exchange. Using the framework requires configuring Python environments and Docker, and depends on external LLM services and specialized grounding models.

Product Portfolio & Services

Core Product Philosophy

Product design centers on AI agents that work alongside users, emphasizing **human-computer collaboration** and user control. The products favor **local execution (on-device)** to improve security, responsiveness, and user experience, and they provide features for recording, sharing, and replaying digital actions.

Specific Product Lines

Offers a range of products covering different user needs:

  • Simular for macOS / Simular Browser: Native macOS agent, runs locally, embedded WebKit engine. Emphasizes autonomy, shared control, security, and familiar experience. Simplifies daily digital life. Free download.
  • Simular Desktop: Desktop AI assistant for executing digital actions and automating tasks. Core feature is recording operations as instructions and replaying them. Aims to save time and improve productivity. Potentially cross-platform. Offers Free and Premium plans.
  • Simular for Business: Positioned as **autonomous digital employees** to enhance organizational efficiency. Targets enterprise scenarios (finance, customer service, HR, etc.). Focuses on automation, productivity, scalability, workflow streamlining, RPA, data analysis, etc. Contact for demo.
  • Agent S / S2 Framework: Underlying open-source framework for developers and researchers.

Pricing Structure

Simular AI Pricing Plans

| Plan Name | Price | Key Features | Target User | Available Add-ons |
|---|---|---|---|---|
| Free Plan | $0/month | Basic workspace tools; public community actions; no private actions | Individual starter users | None |
| Premium Plan | $19.99/device/month | Includes Free features; private/team channel actions; local execution | Individuals/teams needing privacy/collaboration | Server, Concierge |
| Simular for Business | Contact Sales | Autonomous digital employees; enterprise-grade features & services | Enterprise users | Custom services |

Premium Add-on Services

| Plan Name | Price | Key Features | Target User | Available Add-ons |
|---|---|---|---|---|
| Server | +$39.99/device/month | Simular hosted server; includes 200 agent hours; extra $0.10/hour | Users needing cloud computing power | - |
| Concierge | Contact Sales | Request Simular experts for custom results without creating actions yourself | Users needing expert services | - |
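
The listed prices can be combined into a rough monthly estimate for a Premium subscription with the Server add-on. The sketch below uses only the figures from the table; how overage is actually billed is not stated, so it assumes the 200 included agent hours apply per device, are pooled across devices, and that extra usage is charged per whole hour. The function name `monthly_cost_cents` is hypothetical.

```python
PREMIUM_CENTS = 1999        # $19.99 per device per month
SERVER_CENTS = 3999         # +$39.99 per device per month for the Server add-on
INCLUDED_AGENT_HOURS = 200  # agent hours included with Server
EXTRA_HOUR_CENTS = 10       # $0.10 per additional agent hour


def monthly_cost_cents(devices: int, agent_hours: int = 0, server: bool = False) -> int:
    """Estimated monthly Premium bill in cents.

    Assumption: included hours are pooled across devices and
    overage is billed per whole hour.
    """
    total = devices * PREMIUM_CENTS
    if server:
        included = devices * INCLUDED_AGENT_HOURS
        overage = max(0, agent_hours - included)
        total += devices * SERVER_CENTS + overage * EXTRA_HOUR_CENTS
    return total


# One device with the Server add-on and 250 agent hours:
# 19.99 + 39.99 + 50 extra hours * 0.10 = 64.98
cost = monthly_cost_cents(devices=1, agent_hours=250, server=True)
print(f"${cost / 100:.2f}")
```

Working in integer cents avoids floating-point rounding when summing prices.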

Competitive Landscape Analysis

AI Agent Market Overview (Focus on Computer Usage)

This segment is rapidly developing, attracting significant attention and investment. Core objectives include workflow automation, task execution, code generation, data analysis, and software interaction (GUI/API).

Technological paths are diverse: direct GUI interaction, API orchestration, code generation, conversational AI, and no-code/low-code platforms.

Main Competitors

Simular AI faces multi-dimensional competition:

**Direct GUI Automation Competitors:** OpenAI Operator/CUA, Manus AI, Genspark Superagent, Ace, Proxy AI.

**Broader AI Agent Frameworks/Platforms:** LangChain, AutoGen, CrewAI, No-code/Low-code platforms (Gumloop, n8n, Google, Microsoft, UiPath, etc.), other open-source agents (Rasa, Haystack, etc.).

**Existing Productivity Suites:** Microsoft 365 Copilot, Google Workspace AI.

Competitor Feature Comparison

| Competitor | Focus | Technology/Method | Open Source | Use Case | Differentiation |
|---|---|---|---|---|---|
| Simular AI | GUI Automation | Modular (MoG, PHP), Human-like interaction, Screenshot analysis | Yes (Core) | Personal/Enterprise Automation | Open Source, Local Exec, Human-Collab, SOTA |
| OpenAI Operator | GUI Automation | GPT-4o, Task decomposition | No (Model) | Forms/E-commerce | OpenAI Ecosystem, Strong base model |
| Manus AI | General AI Agent (GUI) | Multi-agent collab, or uses Claude 3.x | No | Complex task automation | High attention/funding, Reliability concerns |
| Genspark Superagent | API Orchestration/Tool Calling | Hybrid agent (9+ models), 80+ tools, API integration | No | Broad computer tasks | Hybrid agent, Rich toolset, API focus |
| Ace | GUI Automation | Direct local K/M control, Observational learning | No | Quick desktop tasks | Local direct control, Claims speed |
| Proxy AI | Web Browsing Automation | Parallel processing (multi-agent), Natural language commands | No | Web research/Data collection/Form filling | Web focus, Parallel processing speedup |
| LangChain | LLM Application Framework | Prompt chaining, Data integration, Agent modules | Yes | Building various LLM apps | Broad ecosystem, Flexible, Not GUI-focused |
| AutoGen | Multi-Agent Conversation Framework | Multi-agent coordination, Code gen, Self-correction | Yes | Complex workflows, Programming tasks | Microsoft support, Strong in code/multi-agent interaction |
| CrewAI | Multi-Agent Orchestration Framework | Role-playing agent collab, Task delegation | Yes | Collaborative task automation | High usability, Focus on agent team coordination |
| UiPath Agent Builder | Low-Code Platform | Visual design, Integrates UiPath ecosystem | No | Enterprise RPA/Automation | Enterprise-focused, Deep UiPath integration |

Strategic Analysis (SWOT)

Strengths

  • Top-tier technical expertise and research capabilities.
  • Innovative core technologies (Agent S/S2, MoG, PHP).
  • Open-source strategy.
  • Early-stage funding validation.
  • Pragmatic vision positioning (human-computer collaboration).

Weaknesses

  • Early stage of the company.
  • Lack of mature real-world case studies.
  • Potential commercialization challenges.
  • Dependency on external components.
  • Lack of clarity in product lines.

Opportunities

  • Huge market demand (AI automation).
  • Enterprise market potential.
  • Platform expansion (cross-OS, mobile).
  • Community ecosystem building.
  • Strategic partnerships.

Threats

  • Intense market competition.
  • Rapid technological changes.
  • Reliability and scalability challenges.
  • Business model sustainability.
  • Data privacy and security risks.

Frequently Asked Questions (FAQ)