MCP
xmemory exposes a Model Context Protocol server over Streamable HTTP. Any MCP-compatible client — Claude Desktop, Cursor, Windsurf, pydantic-ai, LangChain, Mastra, or a plain SDK call — can connect and get access to xmemory’s read and write tools with no custom code.
The MCP server supports two authentication paths:
- OAuth2 flow for interactive browser-based connectors.
- Direct fixed API-key flow for headless agents, CI, evals, and scripts.
Video tutorial coming soon — we’re putting together a step-by-step video walkthrough showing how to connect xmemory via MCP to products like Claude Code, ChatGPT, and Codex using their native MCP connectors. Stay tuned!
Authentication
API key: To use xmemory APIs or integrations (including MCP), you need an API key. Please register your interest at https://xmemory.ai and we will reach out to give you access. Copy and securely store the key. Never share your API key publicly.
Option 1: OAuth2 (interactive)
Point your client to https://mcp.xmemory.ai/ and complete the OAuth2 login flow. The client receives an opaque MCP token (format: xmem_mcp_...) and uses it as a Bearer token:
Authorization: Bearer <xmemory_mcp_token>
For OAuth-issued MCP sessions, binding metadata (instance/admin/status) is stored server-side.
Option 2: Direct fixed API key (headless)
Headless clients can skip OAuth and send an account API key directly as the Bearer token on MCP shortcut paths:
Authorization: Bearer <your xmemory API key>
In this mode, the session type is determined by the URL path, and the API key is revalidated on every request.
For example, to add an xmemory instance to Claude Code using the direct API key method:
claude mcp add --transport http xmemory https://mcp.xmemory.ai/instance/8c94edfb703645b587fd7d8c2f41b74d --header "Authorization: Bearer <xmemory API key>"
Connection
| Setting | Value |
|---|---|
| URL | https://mcp.xmemory.ai/ |
| Transport | Streamable HTTP |
| Auth | Bearer token in the Authorization header |
OAuth2 connection pattern (interactive)
Use the root MCP URL and OAuth discovery/login:
{
  "mcpServers": {
    "xmemory": {
      "url": "https://mcp.xmemory.ai/",
      "headers": {
        "Authorization": "Bearer <xmemory_mcp_token>"
      }
    }
  }
}
Direct API-key connection pattern (headless)
Use an MCP shortcut URL and pass the fixed API key:
{
  "mcpServers": {
    "xmemory": {
      "url": "https://mcp.xmemory.ai/instance/<instance_id>",
      "headers": {
        "Authorization": "Bearer <xmemory_api_key>"
      }
    }
  }
}
Supported shortcut paths:
- /instance/<instance_id>: instance tools bound to that instance.
- /admin: admin tools, initially disconnected.
- /admin/<instance_id>: admin tools, pre-connected to that instance.
- /status: status-only tool.
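The shortcut paths can be assembled programmatically when a script needs to generate client configuration. The following is a minimal sketch, not an official helper: the function names `shortcut_url` and `client_config` and the `kind` labels are assumptions made for illustration; only the base URL, the path shapes, and the config layout come from this page.

```python
import json
from typing import Optional

BASE_URL = "https://mcp.xmemory.ai"  # root MCP URL from the Connection table


def shortcut_url(kind: str, instance_id: Optional[str] = None) -> str:
    """Build an MCP shortcut URL for a direct API-key session.

    kind is "instance", "admin", or "status" (hypothetical labels that map
    onto the documented shortcut paths).
    """
    if kind == "instance":
        if not instance_id:
            raise ValueError("instance sessions need an instance_id")
        return f"{BASE_URL}/instance/{instance_id}"
    if kind == "admin":
        # /admin starts disconnected; /admin/<instance_id> is pre-connected.
        return f"{BASE_URL}/admin/{instance_id}" if instance_id else f"{BASE_URL}/admin"
    if kind == "status":
        return f"{BASE_URL}/status"
    raise ValueError(f"unknown session kind: {kind!r}")


def client_config(url: str, api_key: str) -> dict:
    """Return an mcpServers entry in the shape used by the JSON examples."""
    return {
        "mcpServers": {
            "xmemory": {
                "url": url,
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


if __name__ == "__main__":
    config = client_config(shortcut_url("instance", "<instance_id>"), "<xmemory_api_key>")
    print(json.dumps(config, indent=2))
```

Keeping the URL construction in one place makes it harder to pair an admin path with code that expects instance tools, since the session type follows from the path.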
For framework-specific setup, see the integration guides: Pydantic, LangChain, Mastra AI.
The xmemory MCP server exposes 14 tools to instance connections: 9 bound tools that operate on the instance bound at login, and 5 explicit-instance tools that take an instance_id parameter for multi-instance access.
For direct API-key usage, /instance/<instance_id> creates the same instance-scoped tool surface, while /admin and /admin/<instance_id> switch to admin tools.
Tool descriptions are dynamic — on each list_tools() call, the server fetches your instance’s schema and appends a summary of its object types and relations to each tool description. This means the LLM sees tool descriptions tailored to your specific instance, making it more likely to use the tools correctly.
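To inspect the schema-tailored descriptions yourself, you can list the tools from a plain SDK client. The sketch below assumes the official `mcp` Python SDK is installed and that its `streamablehttp_client` and `ClientSession` helpers are available in your version; the function names `auth_headers` and `print_tool_descriptions` are hypothetical, and the URL and key are placeholders.

```python
import asyncio


def auth_headers(token: str) -> dict:
    """Bearer header used by both authentication paths."""
    return {"Authorization": f"Bearer {token}"}


async def print_tool_descriptions(url: str, token: str) -> None:
    """Connect over Streamable HTTP and print each tool's description."""
    # Imported lazily so auth_headers works without the SDK installed.
    # Assumption: the official `mcp` Python SDK (pip install mcp).
    from mcp import ClientSession
    from mcp.client.streamable_http import streamablehttp_client

    async with streamablehttp_client(url, headers=auth_headers(token)) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            for tool in result.tools:
                print(f"{tool.name}: {tool.description}")


# Usage (needs network access and a valid key):
# asyncio.run(print_tool_descriptions(
#     "https://mcp.xmemory.ai/instance/<instance_id>", "<xmemory_api_key>"))
```

Because the descriptions are rebuilt on each list_tools() call, running this again after a schema change should show the updated summary.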
get_instance_id
Returns the instance ID bound to the current session (e.g. "inst_abc123").
Parameters: none.
Useful for display, logging, or confirming which instance the agent is operating on.
get_instance_schema
Returns the full instance schema as a JSON string: object types with their fields, relations, deduplication keys, and descriptions.
Parameters: none.
The LLM can call this to understand what kinds of data the instance stores, which helps it formulate better write and read calls.
write
Extracts structured entities from free-form text and persists them. Synchronous: blocks until the data is fully committed.
| Parameter | Type | Description |
|---|---|---|
| text | string | Free-form text containing facts to extract and remember |
| session_id | string or null | Optional session ID for tracing (e.g. claude-qwxhjkmrtz) |
Returns {"status": "ok", "write_id": "<uuid>"} on success.
Internally, the server runs a two-phase pipeline: an LLM extracts structured objects according to your instance’s schema, then a diff engine compares them against existing data and applies inserts, updates, and deletes.
Because write blocks until committed, you can call read immediately after and get consistent results.
write_async
Same as write, but enqueues the operation and returns immediately with a write_id.
| Parameter | Type | Description |
|---|---|---|
| text | string | Free-form text containing facts to extract and remember |
| session_id | string or null | Optional session ID for tracing |
Returns {"status": "ok", "write_id": "<uuid>"}.
Important: do not call read immediately after write_async — the data may not be committed yet. Use write_status to poll, or use write (synchronous) when you need to read right after.
write_status
Checks the status of an async write previously submitted via write_async.
| Parameter | Type | Description |
|---|---|---|
| write_id | string | The write ID returned by write_async |
Returns:
{
  "status": "ok",
  "write_id": "<uuid>",
  "write_status": "queued | processing | completed | failed | not_found",
  "error_detail": "<string or null>",
  "completed_at": "<ISO timestamp or null>"
}
| write_status | Meaning |
|---|---|
| queued | Waiting to be picked up |
| processing | Currently being extracted and applied |
| completed | Successfully committed, safe to read |
| failed | Extraction or persistence failed; see error_detail |
| not_found | No write with this ID exists |
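The status values above suggest a simple polling loop: keep calling write_status until the write is completed, and stop early on failed or not_found. This is a sketch under stated assumptions, not the xmemory client: `call_tool` is a hypothetical adapter that invokes an MCP tool by name with an arguments dict and returns the tool's JSON string, and the timeout and interval defaults are arbitrary.

```python
import json
import time
from typing import Callable


def wait_for_write(
    call_tool: Callable[[str, dict], str],
    write_id: str,
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
) -> dict:
    """Poll write_status until the write is committed or has failed.

    call_tool is a hypothetical adapter around your MCP client; it returns
    the tool's JSON string result.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = json.loads(call_tool("write_status", {"write_id": write_id}))
        if "error" in status:
            raise RuntimeError(status["error"])
        state = status.get("write_status")
        if state == "completed":
            return status  # committed, safe to read
        if state in ("failed", "not_found"):
            detail = status.get("error_detail")
            raise RuntimeError(f"write {write_id} {state}: {detail}")
        # queued or processing: wait and try again unless we ran out of time
        if time.monotonic() >= deadline:
            raise TimeoutError(f"write {write_id} still {state} after {timeout_s}s")
        time.sleep(interval_s)
```

A caller would submit text with write_async, pass the returned write_id to `wait_for_write`, and only then issue a read.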
read
Queries the instance and returns a natural-language answer.
| Parameter | Type | Description |
|---|---|---|
| query | string | A natural-language question about the stored data |
| session_id | string or null | Optional session ID for tracing |
Returns a JSON string with an answer field — a human-readable response synthesized from the structured data.
Internally, the server translates the question into SQL against the instance’s knowledge graph, executes it with automatic retry and empty-result verification, and formats the result into a plain-text answer.
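Since read returns a JSON string with an answer field, and tools on this server report failures as {"error": "<message>"} instead of raising, a client usually wants one small decoding step. The sketch below illustrates that step; `call_tool` is again a hypothetical adapter that returns the tool's JSON string, and `XMemoryToolError`, `parse_tool_result`, and `ask` are names invented for this example.

```python
import json
from typing import Callable, Optional


class XMemoryToolError(Exception):
    """Raised when a tool result carries an "error" field (hypothetical name)."""


def parse_tool_result(raw: str) -> dict:
    """Decode a tool's JSON string result, raising on {"error": ...}."""
    payload = json.loads(raw)
    if isinstance(payload, dict) and "error" in payload:
        raise XMemoryToolError(payload["error"])
    return payload


def ask(
    call_tool: Callable[[str, dict], str],
    query: str,
    session_id: Optional[str] = None,
) -> str:
    """Call the read tool and return the natural-language answer."""
    args = {"query": query}
    if session_id is not None:
        args["session_id"] = session_id  # optional, used for tracing
    return parse_tool_result(call_tool("read", args))["answer"]
```

Turning the error payload into an exception at the boundary keeps the rest of the agent code free of per-call error checks.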
Schema evolution (suggestion engine)
xmemory learns from reads that couldn’t be fully answered and, on demand, surfaces a single rolling proposal of schema improvements for the bound instance. The flow uses three tools (review → decide → apply), and a change is only applied when you call apply_pending_decisions. These tools require API-key authentication (not a plain instance token).
| Tool | Parameters | Description |
|---|---|---|
| review_suggestions | optional session_id | Return the consolidated proposal and a proposal_version token. May report evolution_in_progress if a migration is already running. |
| decide_suggestions | proposal_version, decisions, optional session_id | Record an accept / reject / defer per item, in bulk. Returns a next_proposal_version. |
| apply_pending_decisions | proposal_version, optional session_id | Commit accepted decisions as one migration. |
Always confirm with the user before deciding or applying. Rejecting an item suppresses that exact suggestion in future proposals.
Explicit-instance tools
These take instance_id as a parameter so a single connection can operate across multiple instances.
| Tool | Parameters | Description |
|---|---|---|
| extract | instance_id, text | Extract structured data from text. Extraction only; does not persist. |
| write_to | instance_id, text | Synchronous write to a specific instance. |
| write_to_async | instance_id, text | Async write to a specific instance; returns write_id. |
| write_to_status | instance_id, write_id | Poll the status of an async write. |
| read_from | instance_id, query | Query a specific instance. |
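As an illustration of multi-instance access over one connection, the sketch below sends the same text to several instances with write_to. It assumes a hypothetical `call_tool` adapter that invokes an MCP tool by name and returns its JSON string; `write_to_many` is a name invented for this example.

```python
import json
from typing import Callable, Dict, Iterable


def write_to_many(
    call_tool: Callable[[str, dict], str],
    instance_ids: Iterable[str],
    text: str,
) -> Dict[str, dict]:
    """Write the same text to several instances, one synchronous call each."""
    results = {}
    for instance_id in instance_ids:
        raw = call_tool("write_to", {"instance_id": instance_id, "text": text})
        results[instance_id] = json.loads(raw)  # decoded per-instance result
    return results
```

The returned mapping lets the caller check each instance's result separately, since one write can fail while the others succeed.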
Admin tools
Selecting the admin connection type at login switches the tool surface to schema generation and instance lifecycle management. All admin tools are prefixed admin_. The admin connection is stateful: it tracks whether you are currently connected to a specific instance, and several tools require that connection.
Important: never hand-write YAML schemas. Always use admin_generate_schema or admin_enhance_schema to produce a valid schema, then pass the result to admin_create_instance or admin_update_instance_schema.
Schema management
| Tool | Parameters | Description |
|---|---|---|
| admin_generate_schema | schema_description | Generate a YAML schema from a free-form description. |
| admin_enhance_schema | schema_description, schema_to_improve | Improve an existing YAML schema. Returns the new YAML and a structured migration plan for applying non-additive changes safely. |
Stateful instance lifecycle (operate on the connected instance, or change connection state)
| Tool | Parameters | Description |
|---|---|---|
| admin_create_instance | schema_yaml, optional cluster_id, name, description, instance_config | Create a new instance. Requires being disconnected. |
| admin_connect_instance | instance_id | Connect to an existing instance. Fails if already connected. |
| admin_disconnect_instance | none | Disconnect from the current instance. |
| admin_get_instance_id | none | Return the connected instance ID (empty string if disconnected). |
| admin_get_instance_schema | none | Return the YAML schema of the connected instance. |
| admin_update_instance_schema | schema_yaml, optional migration_plan, confirm_destructive | Update the connected instance’s schema. Pass migration_plan (from admin_enhance_schema) for non-additive changes; confirm_destructive authorises ops that drop data. |
| admin_update_instance_config | instance_config | Update per-instance model config overrides. |
Schema evolution (operate on the connected instance)
| Tool | Parameters | Description |
|---|---|---|
| admin_dry_run_migration | schema_yaml, optional migration_plan, confirm_destructive | Preview the DDL a migration would run, without applying it. |
| admin_list_migrations | optional limit (1–200, default 50), before_id, include_yaml | List applied migrations, newest first. |
| admin_get_migration | migration_id, optional include_yaml | Get a single migration record. |
Cluster and instance management (require API key)
| Tool | Parameters | Description |
|---|---|---|
| admin_list_clusters | none | List clusters accessible to the API key. |
| admin_get_cluster | cluster_id | Get a single cluster by ID. |
| admin_list_instances | cluster_id, optional verbose | List instances in a cluster. |
| admin_list_own_instances | optional verbose | List all instances across linked clusters. |
| admin_get_instance_by_id | instance_id | Get instance metadata. |
| admin_delete_instance_by_id | instance_id | Delete an instance. |
| admin_get_instance_schema_by_id | instance_id | Get schema by instance ID. |
| admin_update_instance_schema_by_id | instance_id, schema_yml or schema_json | Update schema by instance ID. |
| admin_update_instance_metadata_by_id | instance_id, name, optional description | Replace instance name/description. |
| admin_patch_instance_metadata_by_id | instance_id, optional name, description | Partially update instance metadata. |
Sync vs async writes
Use write when you need to read the data back immediately: it blocks until committed, guaranteeing consistency.
Use write_async + write_status when throughput matters more than immediate consistency: the client isn’t blocked, and you can poll for completion later.
Error handling
All tools return {"error": "<message>"} as a JSON string on failure rather than raising exceptions, so the MCP client always gets a parseable response. Common errors:
| Error | Cause |
|---|---|
| "no instance bound to this session" | Token is invalid or not linked to an instance |
| "text size (N bytes) exceeds maximum (M bytes)" | Write payload too large (limit: 1 MB) |
| "write queue not ready" | Background processor hasn’t started |
| "write failed: <detail>" | Extraction or persistence failure |