Tencent Open Sources TencentDB Agent Memory: 4-Layer Local Memory Pipeline for AI Agents


Tencent has released TencentDB Agent Memory, an open-source memory system for AI agents, shipped under the MIT License. It targets two problems familiar to anyone who ships long-running agents: context bloat and recall failure.

It combines symbolic short-term memory with layered long-term memory. It integrates with OpenClaw as a plugin and with Hermes Agent through a gateway adapter. The default backend is local SQLite with the sqlite-vec extension, so no external API is required.

Why is agent memory hard?

Most current memory stacks shred data into pieces and dump them into flat vector stores. Recall then becomes a blind similarity search across truncated fragments, with no macro-level guidance. TencentDB Agent Memory's architecture instead rests on two pillars: memory layering and symbolic memory.

4-layer semantic pyramid

For long-term personalization, TencentDB Agent Memory builds a four-level pyramid instead of a flat log. The layers are L0 Conversation, L1 Atom, L2 Scenario, and L3 Persona. These correspond to raw dialogs, atomic facts, scene blocks, and user profiles.

The Persona layer holds day-to-day user preferences and is queried first. The system drills down to the Atom layer or the raw conversation only when deeper detail is needed. Lower layers preserve evidence; upper layers preserve structure.

Storage is heterogeneous. Facts, logs, and traces are stored in a database for full-text search. Personas, scenes, and canvases are saved as human-readable Markdown files. Layered memory artifacts exist in ~/.openclaw/memory-tdai/.
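
The top-down lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's implementation: the layer order follows the article, but the `Memory` record, the in-memory `store`, and the substring match standing in for real retrieval are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass

# Layer order from most abstract to most detailed, as named in the article.
LAYERS = ["L3_persona", "L2_scenario", "L1_atom", "L0_conversation"]


@dataclass
class Memory:
    layer: str  # hypothetical label, one of LAYERS
    text: str


def drill_down(store: list[Memory], query: str) -> Memory | None:
    """Return the first match, checking upper layers before lower ones."""
    needle = query.lower()
    for layer in LAYERS:
        for item in store:
            if item.layer == layer and needle in item.text.lower():
                return item
    return None


store = [
    Memory("L0_conversation", "user: please use pnpm, not npm, in this repo"),
    Memory("L1_atom", "Prefers pnpm over npm"),
    Memory("L3_persona", "Backend developer who favors TypeScript"),
]

print(drill_down(store, "typescript").layer)  # answered by the persona layer
print(drill_down(store, "pnpm").layer)        # falls through to an atom
```

The second query finds nothing in the persona, so it descends until the atomic fact matches, and the raw conversation is never touched.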

Symbolic short-term memory via Mermaid

Long-running agent tasks burn tokens on detailed tool logs, search results, code, and error traces. TencentDB Agent Memory addresses this by combining context offloading with symbolic memory.

Complete tool logs are offloaded to external files under refs/*.md. State transitions are encoded in Mermaid syntax within a lightweight task canvas. The agent reasons over this symbol graph in its context window.

When it needs the raw text, it greps for a node_id and retrieves the corresponding file. The Tencent development team describes this as a deterministic drill-down from symbols at the top layer, to indexes at the middle layer, and then to the raw text at the bottom layer.
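
A rough Python sketch of that offload-and-drill-down loop follows. The refs/*.md location and the use of Mermaid come from the article; the one-file-per-node naming, the `node_id:` header line, and the helper names `offload`, `canvas`, and `grep_node` are assumptions for illustration only.

```python
import tempfile
from pathlib import Path


def offload(refs_dir: Path, node_id: str, raw_log: str) -> None:
    """Write a full tool log to refs/<node_id>.md, tagged with its node_id."""
    refs_dir.mkdir(parents=True, exist_ok=True)
    (refs_dir / f"{node_id}.md").write_text(f"node_id: {node_id}\n\n{raw_log}\n")


def canvas(edges: list[tuple[str, str, str]]) -> str:
    """Render state transitions as a compact Mermaid flowchart."""
    lines = ["graph TD"]
    lines += [f"    {src} -->|{label}| {dst}" for src, label, dst in edges]
    return "\n".join(lines)


def grep_node(refs_dir: Path, node_id: str) -> list[Path]:
    """Return every offloaded file whose header names the given node_id."""
    tag = f"node_id: {node_id}\n"
    return sorted(p for p in refs_dir.glob("*.md") if tag in p.read_text())


with tempfile.TemporaryDirectory() as tmp:
    refs = Path(tmp) / "refs"
    offload(refs, "n1", "$ pytest\n1 failed: test_parse")
    offload(refs, "n2", "$ pytest\nall passed")

    # Only this short graph needs to stay in the context window.
    task_canvas = canvas([("n1", "fix parser", "n2")])
    print(task_canvas)

    # Drill down from a symbol to the raw log when detail is needed.
    hit = grep_node(refs, "n1")[0]
    print(hit.name)
```

The canvas stays a couple of lines long however large the logs grow, which is the source of the token savings the article describes.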

Benchmark numbers

Results are measured over continuous long-running sessions rather than isolated turns. For example, the SWE-bench setup runs 50 consecutive tasks per session to simulate the pressure of context accumulation.

On WideSearch, integrating the plugin with OpenClaw raised the pass rate from 33% to 50%, a relative improvement of 51.52%. Token usage decreased from 221.31 million to 85.64 million, a reduction of 61.38%.

On SWE-bench, the success rate increased from 58.4% to 64.2%, while tokens decreased from 3,474.1 million to 2,375.4 million, a decrease of 33.09%. On AA-LCR, the success rate rose from 44.0% to 47.5%, and tokens decreased from 112 million to 77.3 million, a decrease of 30.98%.
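
The 51.52% figure is a relative improvement over the baseline, not a difference in percentage points. A short Python check of that arithmetic, using only the pass rates quoted above (the function name `relative_gain` is ours):

```python
def relative_gain(baseline: float, with_plugin: float) -> float:
    """Relative improvement over the baseline, in percent."""
    return (with_plugin - baseline) / baseline * 100


for name, before, after in [
    ("WideSearch", 33.0, 50.0),
    ("SWE-bench", 58.4, 64.2),
    ("AA-LCR", 44.0, 47.5),
]:
    print(f"{name}: +{relative_gain(before, after):.2f}%")
```

Run as written, this prints +51.52%, +9.93%, and +7.95% for the three benchmarks.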

Regarding long-term memory, PersonaMem’s accuracy increases from 48% to 76%. Note: These numbers are based on Tencent’s own evaluation.

Recall and retrieval

Retrieval defaults to a hybrid strategy. The system combines BM25 keyword search with vector embeddings and fuses the two result lists using Reciprocal Rank Fusion (RRF). Developers can switch to pure keyword or pure embedding mode through the recall.strategy setting. The BM25 tokenizer supports both Chinese (via jieba) and English.
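
For readers unfamiliar with RRF, here is a minimal Python sketch of the fusion step. Each document scores the sum of 1 / (k + rank) across the ranked lists it appears in. The constant k = 60 is the value commonly used in the RRF literature; whether the plugin uses it is an assumption, and the memory ids below are made up.

```python
from collections import defaultdict


def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each doc scores sum(1 / (k + rank)) over the lists."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score (descending); break ties by id so the output is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["m3", "m1", "m7"]    # hypothetical keyword ranking
vector_hits = ["m1", "m9", "m3"]  # hypothetical embedding ranking

print(rrf([bm25_hits, vector_hits])[:5])  # ['m1', 'm3', 'm9', 'm7']
```

Items that rank well in both lists (m1, m3) rise above items found by only one retriever. This is why RRF needs no score normalization between BM25 and cosine similarity.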

The default configuration triggers L1 memory extraction every 5 turns. A user persona is generated for every 50 new memories. Recall returns 5 items with a default timeout of 5 seconds. When timed out, the system skips the injection rather than blocking the conversation.
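
The skip-on-timeout behavior can be sketched with asyncio in Python. This is an illustration of the pattern only: `recall` is a stand-in that sleeps instead of querying a store, and `build_prompt` is a hypothetical name, not part of the plugin.

```python
import asyncio


async def recall(query: str, delay_s: float) -> list[str]:
    """Stand-in for a memory lookup that takes delay_s seconds."""
    await asyncio.sleep(delay_s)
    return [f"memory about {query}"]


async def build_prompt(user_msg: str, delay_s: float, timeout_s: float) -> str:
    """Prepend recalled memories, or skip injection if recall is too slow."""
    try:
        memories = await asyncio.wait_for(recall(user_msg, delay_s), timeout_s)
    except asyncio.TimeoutError:
        memories = []  # skip injection; the conversation is not blocked
    return "\n".join(memories + [user_msg])


fast = asyncio.run(build_prompt("pnpm", delay_s=0.0, timeout_s=0.5))
slow = asyncio.run(build_prompt("pnpm", delay_s=1.0, timeout_s=0.05))
print(repr(fast))  # 'memory about pnpm\npnpm'
print(repr(slow))  # 'pnpm'
```

The slow call still yields a usable prompt. It simply goes out without recalled memories, which matches the trade-off the defaults make.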

Installation and developer surfaces

The OpenClaw integration ships as a single npm package, @tencentdb-agent-memory/memory-tencentdb. The project requires Node.js 22.16 or later. Enabling it takes a single configuration flag. The plugin then handles conversation capture, memory extraction, scene aggregation, persona generation, and recall.

For Hermes, the Docker image bundles the agent, the plugin, and the TDAI Memory Gateway. The default model is DeepSeek-V3.2 served through Tencent Cloud. Any OpenAI-compatible endpoint works when MODEL_PROVIDER=custom is set.

Two tools are exposed to the agent during a session: tdai_memory_search and tdai_conversation_search. Both return references with node_id and result_ref fields for traceback. A Tencent Cloud Vector Database (TCVDB) backend is also available as an alternative to local SQLite.

Marktechpost’s visual explainer



TencentDB Agent Memory — Preview

01 / Overview

What is TencentDB Agent Memory?

An MIT-licensed memory system for AI agents that combines symbolic short-term memory with a four-layer long-term memory pipeline. It runs entirely locally with no external API dependencies.

Short-term memory

Offloads detailed tool logs to files and keeps a compact Mermaid task canvas in context.

Long-term memory

Distills conversations into a four-layer semantic pyramid (L0 → L1 → L2 → L3).

Local backend

The default is SQLite + sqlite-vec. Tencent Cloud Vector Database (TCVDB) is optional.

Integration

Delivered as an OpenClaw plugin and a Hermes Agent Docker image.

02 / Architecture

4-layer semantic pyramid

Long-term memory is not flat, but layered. The upper layer is responsible for the structure. Lower layers preserve evidence.

  • L3 · Persona: user profile (persona.md)
  • L2 · Scenario: scene blocks (Markdown)
  • L1 · Atom: atomic facts (JSONL)
  • L0 · Conversation: raw dialogue

Drill-down path: Persona → Scenario → Atom → Conversation. References carry node_id and result_ref for deterministic traceback.

03 / Symbolic short-term memory

Mermaid task canvas + context offload

Detailed intermediate logs are the largest token consumer in long tasks. The plugin offloads them to disk and keeps the dense symbol graph in context.

How it works

  • Complete tool logs are offloaded to refs/*.md under the data directory.
  • State transitions are encoded in Mermaid syntax within the lightweight task canvas.
  • The agent reasons over the symbol graph and greps for a node_id to fetch the raw text.

Storage path on disk: ~/.openclaw/memory-tdai/. All artifacts are human-readable for white-box debugging.

04 / Installation

Install the OpenClaw plugin

Requires Node.js 22.16 or later and OpenClaw installed.


openclaw plugins install @tencentdb-agent-memory/memory-tencentdb
openclaw gateway restart

Zero-config enablement

Add the following to ~/.openclaw/openclaw.json to enable the plugin with the default SQLite + sqlite-vec backend.

{
  "memory-tencentdb": {
    "enabled": true
  }
}

05 / Configuration

Common tuning parameters

All fields have reasonable defaults. The most common knobs are listed below.

Field                          Default   Description
storeBackend                   sqlite    Storage backend
recall.strategy                hybrid    keyword / embedding / hybrid (RRF)
recall.maxResults              5         Items returned per recall
recall.timeoutMs               5000      Skip injection on timeout
pipeline.everyNConversations   5         L1 extraction every N turns
persona.triggerEveryN          50        Generate a persona every N memories
offload.enabled                false     Toggles short-term compression

06 / Short-term compression

Enabling Mermaid Offload (v0.3.4 and later)

Three steps to enable context offloading for long-running tasks.

Step 1 · Enable offloading in plugin settings

{
  "memory-tencentdb": {
    "config": {
      "offload": { "enabled": true }
    }
  }
}

Step 2 · Register the slot so that OpenClaw can route offload requests

{
  "plugins": {
    "slots": {
      "contextEngine": "openclaw-context-offload"
    }
  }
}

Step 3 · Apply runtime patches (once per OpenClaw installation)

bash scripts/openclaw-after-tool-call-messages.patch.sh
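
The JSON fragments in Steps 1 and 2 have to be merged into an existing configuration, not pasted over it. The Python sketch below shows one way to do that with a recursive merge. It assumes all fragments belong in the same JSON document as the original enable flag, which the source does not state explicitly, and `deep_merge` is our own helper, not an OpenClaw utility.

```python
import json
from copy import deepcopy


def deep_merge(base: dict, patch: dict) -> dict:
    """Return base with patch merged in; nested dicts merge key by key."""
    merged = deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Fragments copied from the article; their sharing one file is an assumption.
existing = {"memory-tencentdb": {"enabled": True}}
step1 = {"memory-tencentdb": {"config": {"offload": {"enabled": True}}}}
step2 = {"plugins": {"slots": {"contextEngine": "openclaw-context-offload"}}}

config = deep_merge(deep_merge(existing, step1), step2)
print(json.dumps(config, indent=2))
```

A plain `dict.update` would replace the whole "memory-tencentdb" object and silently drop "enabled": true, which is why the merge recurses.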

07 / Hermes Docker

Run memory-enabled Hermes in one container

A single Docker image bundles the Hermes Agent, memory_tencentdb plugin, and TDAI Memory Gateway.


docker build -f Dockerfile.hermes -t hermes-memory .


docker run -d \
  --name hermes-memory \
  --restart unless-stopped \
  -p 8420:8420 \
  -e MODEL_API_KEY="your-api-key" \
  -e MODEL_BASE_URL="https://api.lkeap.cloud.tencent.com/v1" \
  -e MODEL_NAME="deepseek-v3.2" \
  -e MODEL_PROVIDER="custom" \
  -v hermes_data:/opt/data \
  hermes-memory


curl http://localhost:8420/health

Any OpenAI-compatible endpoint works with MODEL_PROVIDER=custom. Memory data persists in the hermes_data volume.

08 / Agent tools and recall

What agents see

Two tools are exposed to the agent during the session. Recall uses BM25 + Vector + RRF fusion by default.

tdai_memory_search

Search across L1 atoms, L2 scenarios, and L3 personas.

tdai_conversation_search

Search raw L0 conversation history.

Retrieval defaults

  • Hybrid strategy: BM25 keywords + vector embeddings, fused via Reciprocal Rank Fusion.
  • BM25 tokenizer supports Chinese (jieba) and English.
  • Returns 5 items per recall with a 5000 ms timeout. Injection is skipped if the timeout is hit.
  • References include node_id and result_ref for traceback.

09 / Benchmark

Reported gains with OpenClaw

Results are measured over continuous long-running sessions rather than isolated turns. SWE-bench runs 50 consecutive tasks per session.

Benchmark    Baseline   With plugin   Δ pass (relative)   Δ tokens
WideSearch   33%        50%           +51.52%             −61.38%
SWE-bench    58.4%      64.2%         +9.93%              −33.09%
AA-LCR       44.0%      47.5%         +7.95%              −30.98%
PersonaMem   48%        76%           +59%

Numbers are based on Tencent’s own evaluation and reflect integration with OpenClaw.

10 / Resources

Where to go next

Documentation, source code, and community channels.

Source code

github.com/Tencent/TencentDB-Agent-Memory

npm package

@tencentdb-agent-memory/memory-tencentdb

Roadmap

Portable memory, automatic skill generation, and visual debugging dashboard.

Curated by Marktechpost · AI Research, Designed for Builders

Key takeaways

  • TencentDB Agent Memory is Tencent’s open-source (MIT) memory system for AI agents. It pairs symbolic short-term memory with a layered long-term memory pipeline and has no external API dependencies.
  • Long-term memory is structured as a four-layer semantic pyramid (L0 Conversations → L1 Atoms → L2 Scenarios → L3 Personas) and is navigated by drilling down through node_id and result_ref references instead of flat vector recall.
  • Short-term memory offloads verbose tool logs to refs/*.md and keeps only the compact Mermaid task canvas in context, reducing token usage while maintaining full traceability.
  • Improvements are reported when integrated with OpenClaw: WideSearch pass rate went from 33% → 50% (token reduction 61.38%), SWE-bench went from 58.4% → 64.2%, AA-LCR went from 44.0% → 47.5%, and PersonaMem accuracy went from 48% → 76%.
  • It ships as a single npm plugin for OpenClaw and a Docker image for Hermes, with default local SQLite + sqlite-vec, hybrid BM25 + vector + RRF retrieval, and an optional Tencent Cloud Vector Database (TCVDB) backend.


Michal Sutter is a data science expert with a master’s degree in data science from the University of Padova. With a strong foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.


