Simular AI Agent

Founded in 2023 and headquartered in San Carlos, Simular AI is an AI startup focused on developing 'computer-using agents' that simulate human interaction with GUIs. Founders Ang Li and Jiachen Yang have backgrounds from top institutions like DeepMind, Google, and Baidu. The core mission is to create AI agents that can use computers like humans, automating tedious digital tasks and freeing up human potential.

The core technology is the Agent S framework and its upgraded version, Agent S2, an open, modular, and extensible agent framework. It uses general-purpose models for high-level planning and specialized models for low-level execution and interface grounding, and it reports leading performance on multiple benchmarks. Agent S2 introduces innovations such as Proactive Hierarchical Planning (PHP) and Mixture-of-Grounding (MoG), enabling precise GUI manipulation using only screenshots. The company embraces open source, and the Agent S/S2 frameworks are available on GitHub.

The product portfolio includes Simular for macOS/Browser (a local browser agent for Mac) and Simular Desktop (a desktop assistant, potentially cross-platform) for individuals, and Simular for Business (autonomous digital employees) for enterprises. The products emphasize the security and performance of local execution and focus on human-computer collaboration. The company uses a freemium pricing model (currently in beta) and offers custom solutions for businesses.

In 2024, the company completed a $5 million early-stage funding round with investors including Basis Set Ventures, Flying Fish Partners, Samsung NEXT Ventures, and South Park Commons.

Core Features

Human-like GUI Interaction

The core capability: agents interact with graphical interfaces by simulating human operations, without relying on APIs.

Agent S2 Framework

Advanced modular agent framework combining general model planning with specialized model execution/grounding.

Proactive Hierarchical Planning (PHP)

Proactively predicts and dynamically adjusts plans to adapt to real-time environmental changes, improving task success rates.
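
The idea of adjusting the plan as the environment changes can be sketched as a loop that refreshes the remaining subgoals before every step, rather than only after a failure. This is a minimal illustration under stated assumptions, not the Agent S2 implementation: `run_proactive`, `toy_plan`, and `toy_execute` are hypothetical names, and the toy planner stands in for a real model.

```python
from typing import Callable, List

# Hypothetical signatures for illustration; none of these names come from Agent S2.
Planner = Callable[[str, str, List[str]], List[str]]  # (task, observation, completed) -> remaining subgoals
Executor = Callable[[str], str]                       # subgoal -> new observation


def run_proactive(task: str, observation: str, plan: Planner,
                  execute: Executor, max_steps: int = 20) -> List[str]:
    """Execute one subgoal at a time, refreshing the plan before every step."""
    completed: List[str] = []
    for _ in range(max_steps):
        remaining = plan(task, observation, completed)  # re-plan from the latest observation
        if not remaining:
            break
        subgoal = remaining[0]
        observation = execute(subgoal)
        completed.append(subgoal)
    return completed


def toy_plan(task: str, observation: str, completed: List[str]) -> List[str]:
    """Toy planner: inserts a recovery subgoal when an unexpected dialog appears."""
    steps = [s for s in ["open menu", "click save"] if s not in completed]
    if "dialog" in observation and "dismiss dialog" not in completed:
        return ["dismiss dialog"] + steps
    return steps


def toy_execute(subgoal: str) -> str:
    """Toy environment: opening the menu triggers an unexpected dialog."""
    return "dialog shown" if subgoal == "open menu" else "clean screen"


history = run_proactive("save file", "clean screen", toy_plan, toy_execute)
```

In this toy run the dialog appears after the first step, so the refreshed plan inserts "dismiss dialog" before "click save".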

Mixture-of-Grounding (MoG)

Utilizes multiple grounding experts to precisely locate UI elements using only screenshot input.
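
One way to picture several grounding experts sharing the work is a router that hands each request to the expert best suited to it. The sketch below is an assumption-laden illustration, not the MoG implementation: the expert categories ("visual", "text"), the `GroundingRequest` type, and the stub coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Point = Tuple[int, int]


@dataclass
class GroundingRequest:
    kind: str          # hypothetical routing label, e.g. "visual" or "text"
    description: str   # natural-language description of the target element
    screenshot: bytes  # raw image bytes; the stub experts below ignore it


def visual_expert(req: GroundingRequest) -> Point:
    """Stub for a vision model that locates icons and buttons."""
    return (120, 48)


def text_expert(req: GroundingRequest) -> Point:
    """Stub for an OCR-style model that locates words on screen."""
    return (300, 210)


EXPERTS: Dict[str, Callable[[GroundingRequest], Point]] = {
    "visual": visual_expert,
    "text": text_expert,
}


def ground(req: GroundingRequest) -> Point:
    """Route the request to the matching expert, defaulting to the visual one."""
    expert = EXPERTS.get(req.kind, visual_expert)
    return expert(req)


point = ground(GroundingRequest("text", "the word 'Total'", b""))
```

The dictionary lookup keeps routing separate from the experts themselves, so adding a new expert only requires a new entry.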

Open Source Core

Agent S/S2 frameworks are open source, promoting community participation and technological transparency.

Local Execution Priority

Personal products emphasize running on the user's device, enhancing data security and privacy protection.

Operation Recording & Playback

Records user digital operations and can automatically replay them, simplifying the creation of automated repetitive tasks.
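
Recording and replaying can be illustrated with a small data structure that stores actions, serializes them for sharing, and feeds them back to a handler. This is a hedged sketch only: the `Action` fields, the `Recording` class, and the JSON format are assumptions made for illustration and do not describe Simular's actual recording format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str        # hypothetical action types, e.g. "click" or "type"
    target: str      # description of the UI element acted on
    value: str = ""  # text to enter, if any


class Recording:
    """Ordered list of user actions that can be saved and replayed."""

    def __init__(self) -> None:
        self.actions: List[Action] = []

    def record(self, action: Action) -> None:
        self.actions.append(action)

    def to_json(self) -> str:
        return json.dumps([asdict(a) for a in self.actions])

    @classmethod
    def from_json(cls, text: str) -> "Recording":
        rec = cls()
        rec.actions = [Action(**item) for item in json.loads(text)]
        return rec

    def replay(self, perform: Callable[[Action], None]) -> None:
        """Feed each recorded action, in order, to the supplied handler."""
        for action in self.actions:
            perform(action)


rec = Recording()
rec.record(Action("click", "search box"))
rec.record(Action("type", "search box", "quarterly report"))

# Round-trip through JSON, then replay into a log instead of a real GUI.
restored = Recording.from_json(rec.to_json())
log: List[str] = []
restored.replay(lambda a: log.append(f"{a.kind}:{a.target}:{a.value}"))
```

Here the handler only appends to a list; a real replay would pass each action to whatever component drives the mouse and keyboard.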

Self-Correction Capability

Agents can attempt different methods for self-correction when errors occur during execution, improving robustness.
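
Trying different methods after an error can be sketched as an ordered list of fallback strategies, where the agent moves on whenever one fails or raises. The function and strategy names below are hypothetical and serve only to illustrate the pattern; they are not taken from the Agent S codebase.

```python
from typing import Callable, List, Optional, Tuple

Strategy = Tuple[str, Callable[[], bool]]  # (name, action returning True on success)


def attempt_with_fallbacks(strategies: List[Strategy]) -> Tuple[Optional[str], List[str]]:
    """Try each strategy in order; return the first that succeeds and those that failed."""
    failed: List[str] = []
    for name, action in strategies:
        try:
            if action():
                return name, failed
        except Exception:
            pass  # an exception counts as a failed attempt
        failed.append(name)
    return None, failed


def broken_shortcut() -> bool:
    raise RuntimeError("window not focused")


winner, failures = attempt_with_fallbacks([
    ("keyboard shortcut", broken_shortcut),
    ("menu click", lambda: True),
    ("context menu", lambda: True),
])
```

In this run the keyboard shortcut raises, so the agent falls back to the menu click and never needs the third option.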

Technology Deep Dive: Agent S Framework & Computer-Using Agents

Core Concept: AI Computer-Using Agents

Traditional automation methods (RPA, API integration) have limitations. Simular AI is dedicated to building intelligent agents that can directly **perceive, reason, and operate** GUIs across various platforms. The company's '**computer-using agents**' understand state by observing the screen, precisely simulate human keyboard and mouse operations, and integrate cognitive patterns of **fast thinking** (intuitive reaction) and **slow thinking** (deep reasoning).

Agent S vs. S2 Framework Comparison

The core technology is embodied in the **open, modular, and extensible** Agent S/S2 frameworks. Both adhere to the design principle of using **general models** for high-level planning and **specialized models** for low-level execution and interface '**grounding**'.

Tech Dimension	Agent S	Agent S2 (Innovations)
Planning Capability	Experience-Enhanced Hierarchical Planning	Proactive Hierarchical Planning (PHP): Predicts future states and dynamically adjusts plans
Human-Computer Interface	Basic Agent-Computer Interface (ACI)	Enhanced ACI: Intelligently assigns tasks to expert modules
GUI Element Localization	Relies on multimodal input, limited by accessibility APIs	Mixture-of-Grounding (MoG): Precisely locates interface elements using only screenshots
Learning & Adaptation	Basic experience memory mechanism	Advanced Memory System & Self-Correction: Continuous learning and strategy adjustment
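
The shared design principle, a general model that plans and a specialized model that grounds, can be sketched as two small interfaces composed by an agent. This is a minimal illustration rather than the Agent S API: `Planner`, `Grounder`, `Agent`, and the stub classes are hypothetical names, and the stubs return canned values instead of calling real models.

```python
from typing import Dict, Protocol, Tuple, Union


class Planner(Protocol):
    """Hypothetical interface for a general model that decides what to do next."""

    def next_step(self, task: str, screenshot: bytes) -> str: ...


class Grounder(Protocol):
    """Hypothetical interface for a specialized model that finds where to act."""

    def locate(self, description: str, screenshot: bytes) -> Tuple[int, int]: ...


class Agent:
    def __init__(self, planner: Planner, grounder: Grounder) -> None:
        self.planner = planner
        self.grounder = grounder

    def step(self, task: str, screenshot: bytes) -> Dict[str, Union[str, int]]:
        """Plan in natural language, then ground the plan to screen coordinates."""
        description = self.planner.next_step(task, screenshot)
        x, y = self.grounder.locate(description, screenshot)
        return {"action": "click", "target": description, "x": x, "y": y}


class StubPlanner:
    def next_step(self, task: str, screenshot: bytes) -> str:
        return "the Save button"  # canned answer standing in for a general model


class StubGrounder:
    def locate(self, description: str, screenshot: bytes) -> Tuple[int, int]:
        return (640, 32)  # canned coordinates standing in for a grounding model


agent = Agent(StubPlanner(), StubGrounder())
action = agent.step("save the document", b"")
```

Because the agent depends only on the two interfaces, either model could be swapped without touching the other, which is the kind of modularity the comparison above describes.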

Performance & Benchmarks

Agent S

OSWorld: 83.6% relative improvement in success rate over the baseline
WindowsAgentArena: Demonstrated excellent cross-platform generalization

Agent S2 (SOTA Performance)

OSWorld: 34.5% accuracy at 50 steps, surpassing OpenAI CUA
WindowsAgentArena: 52.8% relative improvement over the previous best result
AndroidWorld: 50% accuracy, surpassing UI-TARS

Open Source Ecosystem & Community

Open source is a core differentiator for Simular AI. The Agent S/S2 frameworks are fully open-sourced on GitHub. The company maintains several active repositories (Agent-S, OpenACI, pysimular, etc.) and has established a Discord community to foster developer exchange. Using the framework requires configuring Python environments and Docker, and depends on external LLM services and specialized grounding models.

Product Portfolio & Services

Core Product Philosophy

Product design revolves around AI agents collaborating with users, emphasizing **human-computer collaboration** and user control. The products focus on **local execution (on-device)** to enhance security, responsiveness, and the user experience. They also provide features for recording, sharing, and replaying digital actions.

Specific Product Lines

Offers a range of products covering different user needs:

Simular for macOS / Simular Browser: Native macOS agent, runs locally, embedded WebKit engine. Emphasizes autonomy, shared control, security, and familiar experience. Simplifies daily digital life. Free download.
Simular Desktop: Desktop AI assistant for executing digital actions and automating tasks. Core feature is recording operations as instructions and replaying them. Aims to save time and improve productivity. Potentially cross-platform. Offers Free and Premium plans.
Simular for Business: Positioned as **autonomous digital employees** to enhance organizational efficiency. Targets enterprise scenarios (finance, customer service, HR, etc.). Focuses on automation, productivity, scalability, workflow streamlining, RPA, data analysis, etc. Contact for demo.
Agent S / S2 Framework: Underlying open-source framework for developers and researchers.

Pricing Structure

Simular AI Pricing Plans
Plan Name	Price	Key Features	Target User	Available Add-ons
Free Plan	$0/month	Basic workspace tools; public community actions; no private actions	Individual starter users	None
Premium Plan	$19.99/device/month	Includes Free features; private/team channel actions; local execution	Individuals/teams needing privacy/collaboration	Server, Concierge
Simular for Business	Contact Sales	Autonomous digital employees; enterprise-grade features & services	Enterprise users	Custom services
Premium Add-on Services
Server	+$39.99/device/month	Simular hosted server; includes 200 agent hours; extra $0.10/hour	Users needing cloud computing power	-
Concierge	Contact Sales	Request Simular experts for custom results without creating actions yourself	Users needing expert services	-
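
As a worked example of the listed prices, the sketch below computes a monthly bill for the Premium plan with the optional Server add-on. The prices come from the table above; the function name and two billing details are assumptions: that the 200 included agent hours apply per device and are pooled, and that overage is billed in whole hours. Amounts are kept in integer cents to avoid floating-point rounding.

```python
PREMIUM_CENTS = 1999          # $19.99 per device per month
SERVER_CENTS = 3999           # +$39.99 per device per month
INCLUDED_HOURS = 200          # agent hours included with Server (assumed per device)
OVERAGE_CENTS_PER_HOUR = 10   # $0.10 per extra agent hour


def monthly_cost_cents(devices: int, agent_hours: int = 0, with_server: bool = False) -> int:
    """Return the monthly bill in cents; agent hours only matter with the Server add-on."""
    if devices < 1 or agent_hours < 0:
        raise ValueError("devices must be >= 1 and agent_hours must be >= 0")
    total = devices * PREMIUM_CENTS
    if with_server:
        total += devices * SERVER_CENTS
        extra_hours = max(0, agent_hours - devices * INCLUDED_HOURS)
        total += extra_hours * OVERAGE_CENTS_PER_HOUR
    return total


def format_usd(cents: int) -> str:
    return f"${cents // 100}.{cents % 100:02d}"


# One device with Server and 250 agent hours: 19.99 + 39.99 + 50 * 0.10
bill = format_usd(monthly_cost_cents(devices=1, agent_hours=250, with_server=True))
```

Under these assumptions, one device using 250 agent hours costs $64.98 per month, of which $5.00 is overage.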

Competitive Landscape Analysis

AI Agent Market Overview (Focus on Computer Usage)

This segment is rapidly developing, attracting significant attention and investment. Core objectives include workflow automation, task execution, code generation, data analysis, and software interaction (GUI/API).

Vendors in this segment follow diverse technological paths: direct GUI interaction, API orchestration, code generation, conversational AI, and no-code/low-code platforms.

Main Competitors

Simular AI faces multi-dimensional competition:

**Direct GUI Automation Competitors:** OpenAI Operator/CUA, Manus AI, Genspark Superagent, Ace, Proxy AI.

**Broader AI Agent Frameworks/Platforms:** LangChain, AutoGen, CrewAI, No-code/Low-code platforms (Gumloop, n8n, Google, Microsoft, UiPath, etc.), other open-source agents (Rasa, Haystack, etc.).

**Existing Productivity Suites:** Microsoft 365 Copilot, Google Workspace AI.

Competitor Feature Comparison

Competitor	Focus	Technology/Method	Open Source	Use Case	Differentiation
Simular AI	GUI Automation	Modular (MoG, PHP), Human-like interaction, Screenshot analysis	Yes (Core)	Personal/Enterprise Automation	Open Source, Local Exec, Human-Collab, SOTA
OpenAI Operator	GUI Automation	GPT-4o, Task decomposition	No (Model)	Forms/E-commerce	OpenAI Ecosystem, Strong base model
Manus AI	General AI Agent (GUI)	Multi-agent collab, or uses Claude 3.x	No	Complex task automation	High attention/funding, Reliability concerns
Genspark Superagent	API Orchestration/Tool Calling	Hybrid agent (9+ models), 80+ tools, API integration	No	Broad computer tasks	Hybrid agent, Rich toolset, API focus
Ace	GUI Automation	Direct local K/M control, Observational learning	No	Quick desktop tasks	Local direct control, Claims speed
Proxy AI	Web Browsing Automation	Parallel processing (multi-agent), Natural language commands	No	Web research/Data collection/Form filling	Web focus, Parallel processing speedup
LangChain	LLM Application Framework	Prompt chaining, Data integration, Agent modules	Yes	Building various LLM apps	Broad ecosystem, Flexible, Not GUI-focused
AutoGen	Multi-Agent Conversation Framework	Multi-agent coordination, Code gen, Self-correction	Yes	Complex workflows, Programming tasks	Microsoft support, Strong in code/multi-agent interaction
CrewAI	Multi-Agent Orchestration Framework	Role-playing agent collab, Task delegation	Yes	Collaborative task automation	High usability, Focus on agent team coordination
UiPath Agent Builder	Low-Code Platform	Visual design, Integrates UiPath ecosystem	No	Enterprise RPA/Automation	Enterprise-focused, Deep UiPath integration

Strategic Analysis (SWOT)

Strengths

Top-tier technical expertise and research capabilities.
Innovative core technologies (Agent S/S2, MoG, PHP).
Open-source strategy.
Early-stage funding validation.
Pragmatic positioning centered on human-computer collaboration.

Weaknesses

Early stage of the company.
Lack of mature real-world case studies.
Potential commercialization challenges.
Dependency on external components.
Lack of clarity in product lines.

Opportunities

Huge market demand (AI automation).
Enterprise market potential.
Platform expansion (cross-OS, mobile).
Community ecosystem building.
Strategic partnerships.

Threats

Intense market competition.
Rapid technological changes.
Reliability and scalability challenges.
Business model sustainability.
Data privacy and security risks.