Python

The xmemory-ai package gives your Python code persistent, structured memory. Write free-form text, have it automatically extracted into typed objects, and query it back in natural language.

For MCP-based integration (no SDK needed), see the MCP guide.

API key: xmemory APIs and integrations require an API key. Register your interest at https://xmemory.ai and we will contact you with access. Store the key securely and never share it publicly.

Installation

pip install xmemory-ai

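Requires Python 3.10+ and pydantic>=2.0. The examples below also import yaml, which is provided by the PyYAML package; install it separately if it is not already present in your environment.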

Quick start

This is the full flow — from schema to stored knowledge to answers — in a single script:

import yaml
from xmemory import XmemoryClient, SchemaType

# Connect (reads XMEM_AUTH_TOKEN from env if token is not passed)
client = XmemoryClient(token="your-token")

# List available clusters
clusters = client.admin.list_clusters()
cluster_id = clusters[0].id

# Describe what you want to remember
schema = client.admin.generate_schema(
    cluster_id,
    "Track contacts with name, email, company, and notes.",
)

# Create a memory instance from that schema
inst = client.admin.create_instance(
    cluster_id=cluster_id,
    name="contacts",
    schema_text=yaml.dump(schema.data_schema),
    schema_type=SchemaType.YML,
)
# inst is an InstanceAPI handle — all subsequent calls go through it.

# Write some information
inst.write("Alice Johnson works at Acme Corp. Her email is alice@acme.com.")
inst.write("Bob Lee is a designer at Globex. He joined last Monday.")

# Read it back
result = inst.read("What is Alice's email?")
print(result.reader_result)  # {"answer": "alice@acme.com"}

result = inst.read("Who joined recently?")
print(result.reader_result)  # {"answer": "Bob Lee joined last Monday."}

client.close()

Save the instance ID (inst.id) after the first run. On subsequent runs, skip the schema step and reconnect by ID:

client = XmemoryClient(token="your-token")
inst = client.instance("your-saved-instance-id")
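One way to persist the instance ID between runs is a small file-based helper. This is a minimal sketch, not part of the SDK: the function name, the id_file path, and the create callable are hypothetical. It relies only on client.instance() and the id property listed in the reference.

```python
from pathlib import Path


def load_or_create_instance(client, id_file, create):
    """Return an instance handle, creating the instance only on the first run.

    `id_file` and `create` are hypothetical names for this sketch: `create`
    is any callable that takes the client and returns a new instance handle
    (for example, a function wrapping generate_schema and create_instance).
    """
    path = Path(id_file)
    if path.exists():
        saved_id = path.read_text(encoding="utf-8").strip()
        if saved_id:
            # Reuse the saved instance; no schema generation needed.
            return client.instance(saved_id)

    inst = create(client)
    # `id` is the documented property holding the instance ID.
    path.write_text(inst.id, encoding="utf-8")
    return inst
```

On the first run the helper calls create and writes the new ID to disk; later runs read the file and go straight to client.instance().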

Configuration

Parameter	Env var	Default	Description
`token`	`XMEM_AUTH_TOKEN`	`None`	Bearer token for authentication
`url`	`XMEM_API_URL`	`https://api.xmemory.ai`	API base URL
`timeout`	—	`60`	Default request timeout in seconds
`http_client`	—	`None`	External `httpx.Client` (you manage its lifecycle)

All parameters are keyword-only. Token and URL fall back to their environment variables when not passed.
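The fallback order described above (explicit argument, then environment variable, then default) can be illustrated with a standalone function. This is only a sketch of the documented behavior, not the SDK's actual code; resolve_settings is a hypothetical name, and how the SDK treats an empty string is not specified here.

```python
import os

DEFAULT_URL = "https://api.xmemory.ai"


def resolve_settings(*, token=None, url=None, timeout=60):
    """Mirror the documented lookup order: argument, then env var, then default."""
    return {
        # An explicit token wins; otherwise fall back to XMEM_AUTH_TOKEN (may be None).
        "token": token if token is not None else os.environ.get("XMEM_AUTH_TOKEN"),
        # An explicit URL wins; otherwise XMEM_API_URL, then the documented default.
        "url": url if url is not None else os.environ.get("XMEM_API_URL", DEFAULT_URL),
        # timeout has no environment variable, only a default.
        "timeout": timeout,
    }
```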

Context manager

with XmemoryClient(token="your-token") as client:
    inst = client.instance("your-instance-id")
    result = inst.read("What is Alice's email?")

Async client

The async client mirrors the sync API. Use it in asyncio code:

import asyncio

from xmemory import AsyncXmemoryClient

async def main():
    async with AsyncXmemoryClient(token="your-token") as client:
        inst = client.instance("your-instance-id")
        result = await inst.read("What is Alice's email?")
        print(result.reader_result)

asyncio.run(main())
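A common reason to use the async client is to issue several reads at once. The sketch below assumes the async instance handle tolerates concurrent read calls, which this page does not state explicitly; read_many is a hypothetical helper name.

```python
import asyncio


async def read_many(inst, questions):
    """Run several reads concurrently and return results in question order.

    `inst` is an async instance handle whose read() is awaitable.
    """
    # gather preserves the order of the awaitables it is given.
    results = await asyncio.gather(*(inst.read(q) for q in questions))
    return [r.reader_result for r in results]
```

Inside an async with block, this would be called as answers = await read_many(inst, ["What is Alice's email?", "Who joined recently?"]).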

Writing

Send free-form text — xmemory extracts structured objects according to your schema and merges them into the knowledge graph.

inst = client.instance("your-instance-id")

resp = inst.write("Carol is a senior engineer at Initech. Her email is carol@initech.com.")
print(resp.cleaned_objects)  # the objects that were stored
print(resp.diff_plan)        # what was inserted, updated, or deleted
print(resp.trace_id)         # request trace id when available

WriteResult, ReadResult, and ExtractResult include trace_id, which is useful when you want to correlate SDK calls with API logs.

Extraction logic

Control the speed/accuracy tradeoff:

from xmemory import ExtractionLogic

resp = inst.write("...", extraction_logic=ExtractionLogic.FAST)

Value	When to use
`DEEP`	Important or complex information (default)
`REGULAR`	Balanced speed and accuracy
`FAST`	High-volume, low-stakes writes

Reading

Ask questions in natural language. xmemory translates them into SQL against the knowledge graph and returns a formatted answer.

resp = inst.read("Who works at Acme Corp?")
print(resp.reader_result)
print(resp.trace_id)

Read modes

from xmemory import ReadMode

# Plain-text answer (default)
resp = inst.read("What is Alice's email?", read_mode=ReadMode.SINGLE_ANSWER)
# → {"answer": "alice@acme.com"}

# Structured objects and relations
resp = inst.read("Show all contacts", read_mode=ReadMode.XRESPONSE)
# → {"objects": [...], "relations": [...]}

# Raw SQL result sets
resp = inst.read("List all contacts", read_mode=ReadMode.RAW_TABLES)
# → {"tables": [...]}
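Code that accepts results from more than one read mode can branch on the keys shown in the comments above. This sketch assumes reader_result is a dict with exactly those top-level keys; summarize_reader_result is a hypothetical helper, not an SDK function.

```python
def summarize_reader_result(reader_result):
    """Return a one-line summary for any of the three documented result shapes."""
    if "answer" in reader_result:
        # SINGLE_ANSWER shape
        return f"answer: {reader_result['answer']}"
    if "objects" in reader_result:
        # XRESPONSE shape
        n_obj = len(reader_result["objects"])
        n_rel = len(reader_result.get("relations", []))
        return f"{n_obj} object(s), {n_rel} relation(s)"
    if "tables" in reader_result:
        # RAW_TABLES shape
        return f"{len(reader_result['tables'])} table(s)"
    raise ValueError(f"unrecognized reader_result keys: {sorted(reader_result)}")
```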

Async writes

For latency-sensitive code, enqueue a write and return immediately:

resp = inst.write_async("Dave manages the London office.")
print(resp.write_id)  # use this to poll for completion

Then check the status:

status = inst.write_status(resp.write_id)
print(status.write_status)
# → WriteQueueStatus.QUEUED | PROCESSING | COMPLETED | FAILED | NOT_FOUND

Do not call read immediately after write_async — the data may not be committed yet. Poll with write_status until COMPLETED, or use write (synchronous) when you need to read right after.
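The polling advice above can be wrapped in a small helper. This is a minimal sketch: wait_for_write, poll_interval, and max_wait are hypothetical names, and treating NOT_FOUND as a terminal failure is an assumption. Statuses are compared by enum member name so the block has no SDK imports.

```python
import time


def wait_for_write(inst, write_id, *, poll_interval=0.5, max_wait=30.0):
    """Poll write_status until the write reaches a terminal state.

    Returns True on COMPLETED. Raises RuntimeError on FAILED or NOT_FOUND
    and TimeoutError if `max_wait` seconds pass first.
    """
    deadline = time.monotonic() + max_wait
    while True:
        status = inst.write_status(write_id).write_status
        # Use the enum member name (e.g. "COMPLETED") for the comparison.
        name = getattr(status, "name", str(status))
        if name == "COMPLETED":
            return True
        if name in ("FAILED", "NOT_FOUND"):
            raise RuntimeError(f"write {write_id} ended with status {name}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"write {write_id} still {name} after {max_wait}s")
        time.sleep(poll_interval)
```

After resp = inst.write_async(...), calling wait_for_write(inst, resp.write_id) before inst.read(...) avoids reading data that has not been committed yet.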

Extracting (without writing)

Preview what xmemory would extract from a piece of text, without storing anything:

resp = inst.extract("Dave manages the London office.")
print(resp.objects_extracted)
print(resp.trace_id)

Accepts the same extraction_logic parameter as write.
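One use for extract is to gate writes: preview first, and store the text only when something was found. The sketch below assumes objects_extracted is empty (falsy) when nothing matches the schema, which this page does not confirm; write_if_extracted is a hypothetical helper.

```python
def write_if_extracted(inst, text, **write_kwargs):
    """Write `text` only when a dry-run extraction finds at least one object.

    Returns the write result, or None when nothing was extracted.
    """
    preview = inst.extract(text)
    if not preview.objects_extracted:
        # Nothing matched the schema, so skip the write entirely.
        return None
    return inst.write(text, **write_kwargs)
```

This runs extraction twice for text that is stored (once in extract, once in write), so it trades extra latency for fewer empty writes.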

Describing (agent tool discovery)

The describe() method returns agent-facing tool descriptions enriched with the instance’s actual schema. Use it to tell an LLM what tools are available and how to call them.

desc = inst.describe()

# Plain text — inject into a system prompt
print(desc.as_text())

# Anthropic tool-use format
tools = desc.as_anthropic_tools()

# OpenAI function-calling format
tools = desc.as_openai_tools()

Results are cached locally for 5 minutes. To force a refresh (e.g. after updating the schema):

inst.clear_describe_cache()
desc = inst.describe()

as_text() shows tools as method signatures by default. Pass include_http=True to also show HTTP method and path for raw REST callers.
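To inject the plain-text description into a system prompt, a short helper is enough. This is a sketch using only describe() and as_text(); build_system_prompt and the "Available memory tools" heading are illustrative choices, not SDK conventions.

```python
def build_system_prompt(inst, preamble):
    """Append the instance's tool descriptions to a base system prompt."""
    tools_text = inst.describe().as_text()
    return f"{preamble.rstrip()}\n\nAvailable memory tools:\n{tools_text}"
```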

Cluster and instance management

Listing clusters

clusters = client.admin.list_clusters()
for c in clusters:
    print(f"{c.id}: {c.name}")
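Picking clusters[0] works for a quick start, but selecting by name is more robust when several clusters exist. A minimal sketch using only the id and name attributes shown above; find_cluster_id is a hypothetical helper.

```python
def find_cluster_id(clusters, name):
    """Return the ID of the first cluster whose name matches exactly."""
    for c in clusters:
        if c.name == name:
            return c.id
    available = ", ".join(c.name for c in clusters) or "none"
    raise LookupError(f"no cluster named {name!r} (available: {available})")
```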

Generating a schema

Describe what you want to track in plain language:

schema = client.admin.generate_schema(
    cluster_id,
    "Track user preferences, open tasks with priorities, and conversation history.",
)
print(schema.data_schema)

Object names use CamelCase (UserPreferences, OpenTask). The generation endpoint handles naming conventions automatically.

Creating an instance

import yaml
from xmemory import SchemaType

inst = client.admin.create_instance(
    cluster_id=cluster_id,
    name="my-memory",
    schema_text=yaml.dump(schema.data_schema),
    schema_type=SchemaType.YML,
    description="Optional description",
)
# inst is a bound InstanceAPI — use inst.write(), inst.read(), etc.

Updating a schema

When your needs change, pass the existing schema so changes are incremental:

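import yaml
from xmemory import SchemaType

# current_schema holds the instance's existing YAML schema;
# instance_id is the ID of the instance being updated.
new_schema = client.admin.generate_schema(
    cluster_id,
    "Add an assignee field to tasks.",
    current_yml_schema=current_schema,
)
client.admin.update_instance_schema(
    instance_id, yaml.dump(new_schema.data_schema), SchemaType.YML
)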

Existing objects and fields are preserved — only the described changes are applied.

Listing and deleting instances

instances = client.admin.list_instances()
info = client.admin.get_instance(instance_id)

client.admin.delete_instance(instance_id)

Error handling

All errors raise XmemoryAPIError or a subclass of it. The exception's .status attribute holds the HTTP status code, or None when no status code is available.

from xmemory import XmemoryAPIError, XmemoryHealthCheckError

# Check connectivity
try:
    client.check_health()
except XmemoryHealthCheckError as e:
    print(f"API unreachable: {e}")

# Handle operation errors
try:
    inst.write("...")
except XmemoryAPIError as e:
    print(f"Error (HTTP {e.status}): {e}")

XmemoryHealthCheckError is a subclass of XmemoryAPIError, so catching XmemoryAPIError covers both.
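Because the exception exposes .status, transient failures can be retried selectively. The sketch below is one possible policy, not SDK behavior: it retries when .status is None or a 5xx code and re-raises everything else at once. call_with_retry and its parameters are hypothetical; pass retry_on=(XmemoryAPIError,) when using it with the SDK.

```python
import time


def call_with_retry(fn, *, retry_on, attempts=3, base_delay=0.5):
    """Call `fn()` and retry transient failures with exponential backoff.

    An error is treated as transient when its `.status` is None (no HTTP
    status available) or a 5xx code. Anything else is re-raised immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on as exc:
            status = getattr(exc, "status", None)
            transient = status is None or 500 <= status < 600
            # Give up on client errors and on the final attempt.
            if not transient or attempt == attempts - 1:
                raise
            # Delay doubles each time: base_delay, 2x, 4x, ...
            time.sleep(base_delay * (2 ** attempt))
```

A write would then be wrapped as call_with_retry(lambda: inst.write("..."), retry_on=(XmemoryAPIError,)).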

Reference

Client

Method	Returns	Description
`XmemoryClient(token, url, timeout, http_client)`	—	Create a sync client
`AsyncXmemoryClient(token, url, timeout, http_client)`	—	Create an async client
`client.instance(instance_id)`	`InstanceAPI`	Get a handle for data operations on an instance
`client.check_health()`	—	Raises `XmemoryHealthCheckError` if the API is unreachable
`client.close()`	—	Close the underlying HTTP client

Admin methods (`client.admin`)

Method	Returns	Description
`list_clusters(ids?, timeout?)`	`list[ClusterInfo]`	List clusters
`get_cluster(cluster_id, timeout?)`	`ClusterInfo`	Get a cluster by ID
`generate_schema(cluster_id, description, *, current_yml_schema?, timeout?)`	`GenerateSchemaResult`	Generate a schema from a plain-language description
`create_instance(cluster_id, name, schema_text, schema_type, *, description?, schema_description?, timeout?)`	`InstanceAPI`	Create a new instance; returns a bound handle
`list_instances(ids?, timeout?)`	`list[InstanceInfo]`	List instances
`get_instance(instance_id, timeout?)`	`InstanceInfo`	Get an instance by ID
`delete_instance(instance_id, timeout?)`	`list[str]`	Delete an instance
`get_instance_schema(instance_id, timeout?)`	`InstanceSchemaInfo`	Get an instance’s schema
`update_instance_schema(instance_id, schema_text, schema_type, timeout?)`	`InstanceInfo`	Update an instance’s schema
`update_instance_metadata(instance_id, name, description?, timeout?)`	`InstanceInfo`	Update instance name and description

Instance methods (`client.instance(id)`)

Method	Returns	Description
`write(text, *, extraction_logic?, diff_engine?, timeout?)`	`WriteResult`	Extract and persist objects from text
`write_async(text, *, extraction_logic?, diff_engine?, timeout?)`	`AsyncWriteResult`	Enqueue a write; returns `write_id` immediately
`write_status(write_id, timeout?)`	`WriteStatusResult`	Poll the status of an async write
`read(query, *, read_mode?, read_id?, timeout?)`	`ReadResult`	Query the instance in natural language
`extract(text, *, extraction_logic?, timeout?)`	`ExtractResult`	Extract objects without writing them
`get_schema(timeout?)`	`InstanceSchemaInfo`	Get this instance’s schema
`describe(timeout?)`	`DescribeResult`	Get agent-facing tool descriptions (cached 5 min)
`clear_describe_cache()`	`None`	Force next `describe()` to fetch fresh data
`id`	`str`	The instance ID (property)

Enums

Enum	Values
`SchemaType`	`YML`, `JSON`
`ExtractionLogic`	`FAST`, `REGULAR`, `DEEP`
`ReadMode`	`SINGLE_ANSWER`, `RAW_TABLES`, `XRESPONSE`
`WriteQueueStatus`	`QUEUED`, `PROCESSING`, `COMPLETED`, `FAILED`, `NOT_FOUND`

Exceptions

Exception	Parent	`.status`
`XmemoryAPIError`	`Exception`	HTTP status code or `None`
`XmemoryHealthCheckError`	`XmemoryAPIError`	HTTP status code or `None`