Mirobody is a layered, plugin-friendly engine. Each layer can be replaced or extended independently — no monoliths, no hidden coupling.
[Figure: Mirobody architecture overview]

AI & Agent Engine

| Module | Path | Description |
| --- | --- | --- |
| Chat Service | mirobody/chat/ | Session management, conversation history, HTTP + WebSocket streaming adapters, memory integration, session sharing |
| Agent Implementations | mirobody/pub/agents/ | DeepAgent (LangChain-based), MixAgent (two-phase fusion), BaseAgent (direct LLM) |
| LLM Clients | mirobody/utils/llm/ | Provider-agnostic adapter for OpenAI, Gemini, Anthropic, Azure OpenAI, Volcengine, Dashscope. HIPAA-compliant routing for clinical workloads |
| MCP Server | mirobody/mcp/ | JSON-RPC 2.0, local + HTTP remote endpoint, auto-discovers tools and resources |
| Tools | mirobody/pub/tools/, tools/ | Built-in tools (file ops, charts, execute) + your drop-in Python files |
| Skills | skills/ | Claude Agent Skills (SKILL.md + metadata.json) auto-discovered at startup |
| Embeddings | mirobody/utils/embedding.py | 1024-dim provider-agnostic embeddings (Gemini / Qwen), pgvector semantic search |
| Prompt Templates | prompts/ | Jinja2 system prompts with dynamic context (user timezone, available tools, health profile) |

Three agents, three jobs

| Agent | Phases | Use case |
| --- | --- | --- |
| DeepAgent | Single model handles tools + response | Complex multi-step research, file ops, planning |
| MixAgent | Orchestrator model (e.g. Claude Sonnet) → Responder model (e.g. Gemini Flash) | High-volume workloads where reasoning matters more than narration |
| BaseAgent | Direct LLM chat, no tools | Simple Q&A, low-latency, testing |
DeepAgent is the default. See Tools Overview for switching and configuration.
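
The two-phase split used by MixAgent can be pictured with a small sketch. This is not Mirobody's implementation: `mix_agent_reply`, the `Model` and `Tool` callables, and the comma-separated plan format are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: a "model" or "tool" is any callable from text to text.
Model = Callable[[str], str]
Tool = Callable[[str], str]


def mix_agent_reply(question: str, orchestrator: Model, responder: Model,
                    tools: Dict[str, Tool]) -> str:
    """Phase 1: the orchestrator picks tools. Phase 2: the responder narrates."""
    # Phase 1: ask the orchestrator which tools to run (assumed comma-separated names).
    plan = orchestrator(f"Pick tools for: {question}")
    candidates = (part.strip() for part in plan.split(","))
    chosen: List[str] = [name for name in candidates if name in tools]

    # Run the chosen tools and collect their outputs as plain-text evidence.
    evidence = [f"{name}: {tools[name](question)}" for name in chosen]

    # Phase 2: the responder turns the evidence into the final answer.
    return responder(f"Question: {question}\nEvidence:\n" + "\n".join(evidence))
```

The point of the split is that the model doing the planning and the model writing the reply can be chosen independently.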

Tools and Skills

Tools are plain Python functions in tools/ — type hints + docstrings are the only “schema”. Mirobody parses them once at startup and exposes them via MCP.
# tools/my_tool.py
def get_my_metric(date: str, user_info: dict) -> dict:
    """
    Fetch a custom metric for the given date.

    Args:
        date: ISO 8601 date string.

    Returns:
        Dict containing the metric value and unit.
    """
    user_id = user_info["user_id"]    # injected by Mirobody from JWT
    return {"value": 42, "unit": "steps"}
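
To make the "type hints + docstrings are the schema" idea concrete, here is a minimal sketch of how a schema could be derived from such a function with the standard library. `describe_tool`, `JSON_TYPES`, and the rule that `user_info` is hidden from the schema are assumptions for illustration, not Mirobody's actual parser.

```python
import inspect
from typing import get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number",
              bool: "boolean", dict: "object", list: "array"}

# Assumption: server-injected parameters are not shown to the model.
INJECTED = {"user_info"}


def describe_tool(func) -> dict:
    """Build a JSON-Schema-like description from a signature and docstring."""
    hints = get_type_hints(func)
    params = {name: p for name, p in inspect.signature(func).parameters.items()
              if name not in INJECTED}
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        # The first docstring paragraph serves as the tool description.
        "description": doc.split("\n\n")[0].strip(),
        "parameters": {
            "type": "object",
            "properties": {name: {"type": JSON_TYPES.get(hints.get(name), "string")}
                           for name in params},
            # Parameters without a default value are treated as required.
            "required": [name for name, p in params.items()
                         if p.default is inspect.Parameter.empty],
        },
    }


def get_my_metric(date: str, user_info: dict) -> dict:
    """Fetch a custom metric for the given date.

    Args:
        date: ISO 8601 date string.
    """
    return {"value": 42, "unit": "steps"}
```

Calling `describe_tool(get_my_metric)` yields a description with a single required string parameter, `date`.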
Skills are Claude Agent Skills — a SKILL.md (the playbook the agent reads when the skill activates) plus a metadata.json (Mirobody-specific, used for discovery and IDE integration).
skills/glucose-coach/
├── metadata.json   # name, summary, when_to_use, when_not_to_use, tags
└── SKILL.md        # YAML frontmatter (name, description, license) + body
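
Startup discovery of a layout like this can be sketched with `pathlib`. The function name `discover_skills` and the rule that a folder needs both files to count are assumptions; the real loader may validate more, such as the YAML frontmatter.

```python
import json
from pathlib import Path
from typing import Dict


def discover_skills(root: Path) -> Dict[str, dict]:
    """Return {directory name: parsed metadata.json} for every complete skill folder."""
    skills: Dict[str, dict] = {}
    if not root.is_dir():
        return skills
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        skill_md = folder / "SKILL.md"
        metadata = folder / "metadata.json"
        # Assumption: a folder is a skill only when both files are present.
        if skill_md.is_file() and metadata.is_file():
            skills[folder.name] = json.loads(metadata.read_text(encoding="utf-8"))
    return skills
```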
See Adding Tools and MCP Integration for details.

FHIR & Health Standards

| Module | Path | Description |
| --- | --- | --- |
| FHIR Mapping | mirobody/pulse/core/fhir_mapping.py | In-memory cache indicator → FHIR code, with optional auto-registration of new codes |
| Indicator Registry | mirobody/pulse/core/indicators_info.py | 400+ StandardIndicator enum, multi-source (Vital, Apple Health, Garmin, Whoop, Renpho, …) |
| Unit Conversion | mirobody/pulse/core/units.py | Bidirectional: kg ↔ lb, °C ↔ °F, mg·dL⁻¹ ↔ mmol·L⁻¹, mmHg ↔ kPa, etc. |
| Indicator Search | mirobody/indicator/ | Embedding-based free-text → indicator code + concept-graph expansion across LOINC / SNOMED CT / RxNorm / CVX / DCM |
| Concept Graph | mirobody/indicator/concept_graph.py + fhir_concept_graph.bin | Cross-vocabulary bridges and same-system siblings; pulled via Git LFS |
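
As an illustration of the bidirectional conversions listed for Unit Conversion, the sketch below stores one forward and one inverse function per unit pair. The `convert` function and the unit spellings are hypothetical and are not the API of `units.py`. Note that mg/dL ↔ mmol/L depends on the analyte's molar mass; the factor shown assumes glucose.

```python
from typing import Callable, Dict, Tuple

Pair = Tuple[Callable[[float], float], Callable[[float], float]]

# (source, target) -> (forward, inverse). Unit spellings are illustrative.
_PAIRS: Dict[Tuple[str, str], Pair] = {
    ("kg", "lb"): (lambda v: v / 0.45359237, lambda v: v * 0.45359237),
    ("degC", "degF"): (lambda v: v * 9 / 5 + 32, lambda v: (v - 32) * 5 / 9),
    ("mmHg", "kPa"): (lambda v: v * 0.133322, lambda v: v / 0.133322),
    # Glucose only: molar mass is about 180.16 g/mol, so 1 mmol/L is about 18.016 mg/dL.
    ("mg/dL", "mmol/L"): (lambda v: v / 18.016, lambda v: v * 18.016),
}


def convert(value: float, src: str, dst: str) -> float:
    """Convert in either direction using the single table above."""
    if src == dst:
        return value
    if (src, dst) in _PAIRS:
        return _PAIRS[(src, dst)][0](value)
    if (dst, src) in _PAIRS:
        return _PAIRS[(dst, src)][1](value)
    raise ValueError(f"no conversion from {src} to {dst}")
```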

Two retrieval modes

| Method | Scope | Input | Returns |
| --- | --- | --- | --- |
| adapter.search(user_id, embeddings, top_k) | Per-user | Pre-computed query embeddings | What does this user have that matches? |
| adapter.resolve(term, top_k) | Global | Free text | What canonical codes does this term map to? |
| adapter.resolve_many(terms, top_k) | Global, batched | List of free text | ~20–30× faster than looping resolve() |
search joins th_series_data to scope to one user; resolve runs cosine over the full vocabulary and returns the global top-k per code system (LOINC, SNOMED CT, RXNORM, CVX, DCM, THETA).
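
The difference between the two scopes can be sketched in pure Python. The functions below are illustrative only: the real adapter runs against pgvector, and `top_k_per_system`, `search_user`, and the placeholder codes in the usage note are assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple

Code = Tuple[str, str]  # (code system, code)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_per_system(query: Sequence[float], vocabulary: Dict[Code, Sequence[float]],
                     top_k: int) -> Dict[str, List[Tuple[str, float]]]:
    """Global mode: rank the whole vocabulary, keeping the best top_k per code system."""
    by_system: Dict[str, List[Tuple[str, float]]] = defaultdict(list)
    for (system, code), vector in vocabulary.items():
        by_system[system].append((code, cosine(query, vector)))
    return {system: sorted(hits, key=lambda h: h[1], reverse=True)[:top_k]
            for system, hits in by_system.items()}


def search_user(query: Sequence[float], vocabulary: Dict[Code, Sequence[float]],
                user_codes: Set[Code], top_k: int) -> List[Tuple[Code, float]]:
    """Per-user mode: rank only the codes this user actually has data for."""
    scored = [(key, cosine(query, vector))
              for key, vector in vocabulary.items() if key in user_codes]
    return sorted(scored, key=lambda h: h[1], reverse=True)[:top_k]
```

For example, with a toy vocabulary `{("LOINC", "a"): [1.0, 0.0], ("LOINC", "b"): [0.0, 1.0]}` (placeholder codes), the global ranking for the query `[1.0, 0.0]` puts `a` first, while a user who only has data for `b` gets `b` back from the per-user search.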

Unit normalization

Free-text “value + unit” strings from any device or chart, in any of the supported languages (en, zh, ja, ko, ru, de, fr, es), get normalized to a canonical UCUM unit plus a LOINC PROPERTY family:
"90次每分钟"    → ParsedQuantity(value=90.0,  unit="/min",   family="NRat")
"<5.6 mg/dL"   → ParsedQuantity(value=5.6,   unit="mg/dL",  family="MCnc", comparator="<")
"Millimol pro Liter" → "mmol/L"
"600步"        → ParsedQuantity(value=600.0, unit="{steps}", family="Num")
Pure local computation — no DB, no embedding API.
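
A heavily simplified sketch of this kind of parsing follows. It assumes a regex for the number and a three-entry alias table; the `ParsedQuantity` fields mirror the examples above, but `parse_quantity`, `_ALIASES`, and `_PATTERN` are hypothetical, and the real normalizer covers far more units and languages.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedQuantity:
    value: float
    unit: str
    family: str
    comparator: Optional[str] = None


# Tiny illustrative alias table: surface form -> (UCUM unit, LOINC PROPERTY family).
_ALIASES = {
    "mg/dl": ("mg/dL", "MCnc"),
    "次每分钟": ("/min", "NRat"),
    "步": ("{steps}", "Num"),
}

# Optional comparator, a decimal number, then the remaining text as the unit.
_PATTERN = re.compile(r"^\s*(<=|>=|<|>)?\s*(\d+(?:\.\d+)?)\s*(.+?)\s*$")


def parse_quantity(text: str) -> Optional[ParsedQuantity]:
    match = _PATTERN.match(text)
    if match is None:
        return None
    comparator, number, raw_unit = match.groups()
    alias = _ALIASES.get(raw_unit.lower())
    if alias is None:
        return None  # unknown unit: this sketch gives up instead of guessing
    unit, family = alias
    return ParsedQuantity(float(number), unit, family, comparator)
```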

Health Data Pipeline (Pulse)

| Module | Path | Description |
| --- | --- | --- |
| Platform Manager | mirobody/pulse/ | Platform–Provider plugin architecture, normalization to StandardPulseData |
| Theta Platform | mirobody/pulse/theta/ | Direct device integrations: Garmin, Whoop, Oura, Renpho, PostgreSQL, more |
| Apple Health | mirobody/pulse/apple/ | Apple Health import, CDA (Clinical Document Architecture) processing |
| Data Upload | mirobody/pulse/data_upload/ | StandardPulseData → th_series_data write pipeline |
| File Parser | mirobody/pulse/file_parser/ | Multi-format: PDF, CSV, Excel, audio, image, genetic; LLM-powered indicator extraction |
| Aggregation | mirobody/pulse/core/aggregate_indicator/ | Series → daily summaries; derived metrics; sleep window 18:00–18:00 |
| Health Insights | mirobody/pulse/core/insight/ | AI-powered trend detection, anomaly analysis, pattern recipes (multi-signal, recovery, glucose) |
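
The 18:00–18:00 sleep window mentioned under Aggregation implies that a night's sleep is not split at midnight. The sketch below shows one way to bucket timestamps under that idea. The direction of the shift (evening samples count toward the following day) is an assumption, and `sleep_day` is a hypothetical helper, not a Mirobody function.

```python
from datetime import date, datetime, timedelta

WINDOW_START_HOUR = 18  # the window boundary named in the table


def sleep_day(ts: datetime) -> date:
    """Assumed convention: samples at or after 18:00 count toward the next calendar day."""
    if ts.hour >= WINDOW_START_HOUR:
        return ts.date() + timedelta(days=1)
    return ts.date()
```

Under this assumption, a sample at 23:30 and one at 06:00 the next morning land in the same daily summary.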

Provider lifecycle

Discovery → OAuth link → periodic pull → save raw (th_raw_data) → normalize to StandardPulseData → write (th_series_data) → aggregate to daily summaries → insights + indicator search. Every provider, whether built-in (mirobody/pulse/theta/mirobody_garmin/) or custom (providers/mirobody_mydevice/), implements the same BaseThetaProvider contract — see Provider Integration.
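
The middle of that lifecycle (pull, save raw, normalize, write) can be sketched as follows. This page does not show the methods of BaseThetaProvider, so `DemoProvider`, `pull`, `normalize`, `SeriesPoint`, and `run_cycle` are hypothetical names, and plain lists stand in for the th_raw_data and th_series_data tables.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SeriesPoint:
    """Hypothetical stand-in for one normalized StandardPulseData record."""
    user_id: str
    indicator: str
    value: float
    unit: str


class DemoProvider:
    """Hypothetical provider; the real contract is BaseThetaProvider."""

    def pull(self, user_id: str) -> List[dict]:
        # A real provider would call the vendor API with the user's OAuth token.
        return [{"type": "steps", "count": 600}]

    def normalize(self, user_id: str, raw: List[dict]) -> List[SeriesPoint]:
        return [SeriesPoint(user_id, item["type"], float(item["count"]), "{steps}")
                for item in raw]


def run_cycle(provider: DemoProvider, user_id: str,
              raw_store: List[dict], series_store: List[SeriesPoint]) -> int:
    raw = provider.pull(user_id)
    raw_store.extend(raw)  # raw payloads are saved before normalization
    points = provider.normalize(user_id, raw)
    series_store.extend(points)  # normalized points are written afterwards
    return len(points)
```

Keeping the raw payload before normalizing means a later change to the mapping can be replayed without pulling from the vendor again.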

Infrastructure

| Module | Path | Description |
| --- | --- | --- |
| Configuration | mirobody/utils/config/ | YAML + env var layered config, Fernet encryption for _KEY/_SECRET/_TOKEN/etc. |
| Storage Backend | mirobody/utils/config/storage/ | Pluggable: Local filesystem, AWS S3, Aliyun OSS |
| Auth & User | mirobody/user/ | JWT, OAuth 2.0 (Google / Apple), WebAuthn / FIDO2, email verification |
| Server | mirobody/server/ | Starlette ASGI, JWT middleware, rate limiting |
| Database | mirobody/utils/db.py | Async PostgreSQL (psycopg) + pgvector, Redis cache/session store |
| Sandbox | E2B (external) | execute tool runs shell + Python in isolated cloud sandboxes |

Configuration layering

config.yaml            ← base, do not edit
  └── config.{env}.yaml  ← your overrides; ENV=localdb → config.localdb.yaml
       └── env vars       ← highest precedence; injected by .env or shell
Sensitive keys (anything whose name contains _KEY, _PASSWORD, _PASS, _PWD, _SECRET, _SK, _TOKEN) are auto-encrypted on first load using your CONFIG_ENCRYPTION_KEY. See Configuration.
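
The precedence order and the sensitive-name rule can be sketched in a few lines. This is an illustration, not the loader itself: `layered_config` and `is_sensitive` are hypothetical names, the example config keys in the usage note are made up, and restricting env vars to already-known keys is an assumption. Encryption is left out.

```python
from typing import Dict, Mapping

# Substrings that mark a config name as sensitive, as listed above.
SENSITIVE_MARKERS = ("_KEY", "_PASSWORD", "_PASS", "_PWD", "_SECRET", "_SK", "_TOKEN")


def layered_config(base: Mapping[str, str], overrides: Mapping[str, str],
                   env: Mapping[str, str]) -> Dict[str, str]:
    """Later layers win: base < per-environment overrides < env vars."""
    merged = dict(base)
    merged.update(overrides)
    # Assumption: only env vars matching a known key are applied,
    # so unrelated variables such as PATH are ignored.
    merged.update({k: v for k, v in env.items() if k in merged})
    return merged


def is_sensitive(name: str) -> bool:
    return any(marker in name.upper() for marker in SENSITIVE_MARKERS)
```

For instance, with a base of `{"HTTP_PORT": "8080"}`, an override of `{"HTTP_PORT": "9000"}`, and no matching env var, the merged value is `"9000"`.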

Extension Points

Drop a file in and Mirobody picks it up at next startup:
| Directory | Discovered as | Naming convention |
| --- | --- | --- |
| tools/ | MCP tools | Files ending in .py; functions or *Service classes; _*.py files are ignored |
| skills/ | Claude Agent Skills | One sub-directory per skill, each with SKILL.md + metadata.json |
| agents/ | Conversational agents | *Agent classes; require an async generate_response |
| providers/ | Data providers | mirobody_<slug>/provider_<slug>.py extending BaseThetaProvider |
| prompts/ | Prompt templates | Jinja2 .jinja files referenced via PROMPTS_<AGENT> |
| resources/ | MCP resources | HTML, JSON, or other static files exposed over MCP |
Custom directories take precedence over built-ins of the same name.
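
For the tools/ row, the file-selection rules can be sketched with `pathlib`. The function `discover_tool_files` is hypothetical, and it reads "same name" as "same file name", which is an assumption about how precedence is resolved.

```python
from pathlib import Path
from typing import Dict, List


def discover_tool_files(custom: Path, builtin: Path) -> List[Path]:
    """Pick tool files from both directories; custom wins on a name clash."""
    chosen: Dict[str, Path] = {}
    # The custom directory is processed last so its files replace built-ins.
    for root in (builtin, custom):
        if not root.is_dir():
            continue
        for path in sorted(root.glob("*.py")):
            if path.name.startswith("_"):
                continue  # _*.py files are ignored
            chosen[path.name] = path
    return [chosen[name] for name in sorted(chosen)]
```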

End-to-end request flow

Here’s what happens when a user sends “Show me my knee pain trend” in the web UI:
1. Browser POSTs to /api/chat (SSE).
2. JWT middleware → user_id.
3. Chat service creates/uses a session, persists user message to th_messages.
4. DeepAgent (default) is loaded with the user's PROVIDERS_DEEP config.
5. Agent calls tools/list via MCP, plans, then calls tools:
   - get_user_profile  → user_id, timezone, available providers
   - indicator search  → "knee pain" → SNOMED CT / LOINC candidates
   - get_health_data   → time series joined by user_id + indicator
   - chart_service     → renders PNG, uploads to S3, returns presigned URL
6. Agent streams thinking / reply chunks via SSE; response is persisted.
7. Aggregator + insights run in background for next session.
For the session-sharing read-path (no auth), see Session Sharing.
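
Step 5 of the flow above relies on JSON-RPC 2.0 messages. The sketch below builds the two request shapes involved, `tools/list` and `tools/call`, following the MCP convention of passing `name` and `arguments` in `params`. The helper `jsonrpc_request` is hypothetical, and the empty arguments for get_user_profile are an assumption.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_ids = count(1)  # each request gets a fresh id so replies can be matched


def jsonrpc_request(method: str, params: Optional[Dict[str, Any]] = None) -> str:
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Discover the available tools, then invoke one of them by name.
list_tools = jsonrpc_request("tools/list")
call_tool = jsonrpc_request("tools/call", {"name": "get_user_profile", "arguments": {}})
```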

Where to go next

- Providers: How devices and EHRs hook in
- Data Flow: From raw vendor payloads to normalized indicators
- File Processing: Multi-format file ingestion and parsing
- Tools & MCP: Build and expose tools across the MCP ecosystem