Graph Model
Context as typed nodes with explicit relationships instead of a flat message tape.
Applied Research
Open-source context management platform for LLM agents: graph-based representation with tiered compression to prevent pre-rot degradation.
The foundation: a typed graph data structure where context nodes have relationships, types, and metadata. Everything else builds on this representation.
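A minimal sketch of that representation (the node and relation names here are illustrative, not the actual context-core schema):

```python
from dataclasses import dataclass, field

# Illustrative node/edge shapes; the real schema lives in context-core.
@dataclass
class ContextNode:
    node_id: str
    node_type: str      # e.g. "message", "tool_call", "artifact", "summary"
    content: str
    metadata: dict = field(default_factory=dict)

class ContextGraph:
    def __init__(self):
        self.nodes = {}
        self.edges = []                      # (src, relation, dst) triples

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def relate(self, src, relation, dst):
        self.edges.append((src, relation, dst))

    def neighbors(self, node_id, relation=None):
        return [dst for src, rel, dst in self.edges
                if src == node_id and (relation is None or rel == relation)]

g = ContextGraph()
g.add_node(ContextNode("m1", "message", "Refactor the parser module"))
g.add_node(ContextNode("t1", "tool_call", "read_file('parser.py')"))
g.relate("m1", "triggered", "t1")
```

Because relationships are explicit, retrieval and compression can operate on subgraphs instead of raw text spans.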
Three-tier compression pipeline that prioritizes reversible operations. Lossless first, lossy only when necessary. Maintains information density metrics throughout.
Parsers for code, docs, conversations, and tool outputs. Converts raw text into typed graph nodes with automatic relationship detection.
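A toy version of the routing step (the cues and type names are hypothetical; the real parsers do far more, including relationship detection):

```python
# Hypothetical routing rules mapping raw text to a node type.
def parse_chunk(text):
    stripped = text.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    speaker = first_line.split(":", 1)[0].lower()
    if stripped.startswith(("def ", "class ", "import ", "```")):
        node_type = "code"
    elif stripped.startswith(("$ ", "{", "[")):
        node_type = "tool_output"
    elif speaker in ("user", "assistant", "system"):
        node_type = "conversation"
    else:
        node_type = "doc"
    return {"type": node_type, "content": stripped}
```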
Semantic search over the context graph. Combines embedding similarity with graph-aware ranking to surface the most relevant context for a given query.
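One way to blend the two signals, as a sketch: the 80/20 weighting and the degree-based centrality proxy below are illustrative assumptions, not the shipped ranking function.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Blend embedding similarity with a graph signal; node degree stands in
# for a real graph-aware centrality measure here.
def rank(query_vec, node_vecs, edges, alpha=0.8):
    degree = {}
    for src, dst in edges:
        degree[src] = degree.get(src, 0) + 1
        degree[dst] = degree.get(dst, 0) + 1
    max_deg = max(degree.values(), default=1)
    scored = []
    for node_id, vec in node_vecs.items():
        sim = cosine(query_vec, vec)
        centrality = degree.get(node_id, 0) / max_deg
        scored.append((alpha * sim + (1 - alpha) * centrality, node_id))
    return [node_id for _, node_id in sorted(scored, reverse=True)]

ranking = rank([1.0, 0.0], {"a": [1.0, 0.0], "b": [0.0, 1.0]}, [("a", "b")])
```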
Quality metrics for context windows: measures information density, relevance coverage, and "pre-rot" degradation score. Catches context decay before it impacts outputs.
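Two toy scores in this spirit; the formulas below are illustrative stand-ins, not the shipped metrics:

```python
# Illustrative metrics only; the real scoring is richer than this.
def information_density(text):
    tokens = text.split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def pre_rot_score(tokens_used, window_size, onset=0.6):
    """0.0 below the onset threshold, rising linearly to 1.0 at a full window."""
    usage = tokens_used / window_size
    if usage <= onset:
        return 0.0
    return min((usage - onset) / (1.0 - onset), 1.0)
```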
Whitespace normalization, exact deduplication, redundant preamble removal. Fully reversible; no information loss. Always runs first.
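A stripped-down version of such a pass (hypothetical helper; the provenance record is what keeps the step auditable and undoable):

```python
import re

# Sketch of a tier-1 pass: normalize whitespace, drop exact duplicates,
# and record what was removed so the step can be audited or undone.
def lossless_pass(chunks):
    seen = set()
    kept, removed = [], []
    for i, chunk in enumerate(chunks):
        normalized = re.sub(r"[ \t]+", " ", chunk.strip())
        if normalized in seen:
            removed.append((i, chunk))    # provenance: index + original text
            continue
        seen.add(normalized)
        kept.append(normalized)
    return kept, removed
```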
Merges adjacent related nodes, removes stale/superseded context, collapses repeated patterns. Minimally lossy with tracked provenance.
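For example, a run of same-type nodes can collapse into one node that keeps its sources (a sketch with a hypothetical node shape):

```python
# Tier-2 sketch: collapse runs of adjacent same-type nodes into one node,
# keeping provenance of which originals were merged.
def merge_adjacent(nodes):
    merged = []
    for node in nodes:
        if merged and merged[-1]["type"] == node["type"]:
            merged[-1]["content"] += "\n" + node["content"]
            merged[-1]["sources"].append(node["id"])
        else:
            merged.append({"type": node["type"],
                           "content": node["content"],
                           "sources": [node["id"]]})
    return merged

merged = merge_adjacent([
    {"id": "1", "type": "log", "content": "a"},
    {"id": "2", "type": "log", "content": "b"},
    {"id": "3", "type": "message", "content": "c"},
])
```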
LLM-powered distillation for when aggressive compression is needed. Preserves key facts and relationships while drastically reducing token count.
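Sketched with the summarizer injected as a callable (in the real pipeline that call goes to an LLM; the names and the word-count token proxy are illustrative):

```python
# Distillation sketch: only fires when the node exceeds its budget, and
# records how large the original was. Word count stands in for tokens.
def distill(node, summarize_fn, max_tokens=200):
    token_count = len(node["content"].split())
    if token_count <= max_tokens:
        return node
    return {**node,
            "content": summarize_fn(node["content"], max_tokens),
            "distilled_from_tokens": token_count}

# Trivial stand-in summarizer: truncate to the budget.
def truncate(text, budget):
    return " ".join(text.split()[:budget])

result = distill({"id": "n1", "content": "word " * 300}, truncate, max_tokens=5)
```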
Reversible strategies first; lossy summarization only when lower tiers are insufficient.
Agent queries retriever, retriever fans across hot/warm/cold memory, returns context, then archives new artifacts.
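The flow in miniature (the tier names come from the docs; the substring match stands in for the real semantic retriever):

```python
# Fan out across tiers in order of access cost, collect hits up to a
# limit, and archive new artifacts afterwards.
class TieredRetriever:
    def __init__(self):
        self.tiers = {"hot": {}, "warm": {}, "cold": {}}

    def query(self, keyword, limit=3):
        hits = []
        for tier in ("hot", "warm", "cold"):     # cheapest tier first
            for value in self.tiers[tier].values():
                if keyword in value and len(hits) < limit:
                    hits.append(value)
        return hits

    def archive(self, key, value, tier="hot"):
        self.tiers[tier][key] = value

r = TieredRetriever()
r.archive("k1", "parser notes", tier="hot")
r.archive("k2", "old parser design", tier="cold")
hits = r.query("parser")
```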
Long-running agents fail before they hit hard context limits.
In production, agents often need 50+ tool calls to complete real workflows. By that point, quality starts degrading even when there is still room left in the context window. In ContextEngine, this failure mode is treated as a first-class systems problem called pre-rot: quality decay around 60-70% context usage, not at 100%.
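In practice that means triggering compression well before the hard limit; a minimal guard under that assumption:

```python
# Compress at ~65% usage instead of waiting for the window to fill,
# since degradation sets in around the 60-70% mark.
def should_compress(tokens_used, window_size, threshold=0.65):
    return tokens_used / window_size >= threshold
```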
The default pattern in most stacks is still to append every message and tool result to a flat transcript, then summarize the whole thing once the window fills. That approach loses structure, burns tokens, and makes recovery difficult.
ContextEngine models context as a typed graph, not a flat transcript.
Messages, tool calls, artifacts, entities, and summaries become explicit nodes with relationships. That structure enables selective retrieval, targeted compression, and reversible reductions instead of blanket summarization.
Compression is applied in a strict hierarchy:

1. Lossless (always first): whitespace normalization, exact deduplication, redundant preamble removal. Fully reversible.
2. Minimally lossy: merge adjacent related nodes, remove stale or superseded context, collapse repeated patterns, with tracked provenance.
3. LLM distillation: applied only when the lower tiers are insufficient.

This keeps high-value context available longer while reducing token pressure.
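The ordering can be expressed as a simple escalation loop (tier names and compression ratios below are illustrative placeholders):

```python
# Run the cheapest, most reversible tier first; escalate only while the
# context is still over budget. Ratios here are placeholders.
def compress(token_count, budget, tiers):
    applied = []
    for name, ratio in tiers:
        if token_count <= budget:
            break
        token_count //= ratio
        applied.append(name)
    return token_count, applied

TIERS = [("lossless", 3), ("structural", 3), ("distill", 7)]
```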
ContextEngine is split into modular packages so teams can adopt only what they need.
| Package | Purpose | Status |
|---|---|---|
| `context-core` | Graph model, entity tracking, semantic index, token budget | Complete |
| `context-compression` | Pipeline orchestration + 9 compression strategies | Complete |
| `context-memory` | Backends, tiered storage, retrieval, artifact versioning | Complete |
| `context-tools` | Tool caching, pattern detection, result compression, prefetch | Complete |
| `context-observe` | OpenTelemetry traces, Prometheus metrics, event stream | Complete |
| `context-multiagent` | Broker/handoff/sync for multi-agent coordination | Planned |
Current test footprint across completed packages: 1,259 tests.
- Lossless tier: typically 2-5x savings with zero information loss.
- Minimally lossy tier: an additional 2-4x.
- LLM distillation: 5-10x on the residual context.
- Combined end-to-end compression: 10-20x depending on workload.
ContextEngine separates working memory from persistent memory: active context stays in a hot tier, while warm and cold tiers hold older material that can be recalled when needed. This keeps active context lightweight without discarding detail.
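A toy manager showing demotion under capacity pressure and promotion on recall (the LRU policy is an illustrative stand-in for the real tiering rules):

```python
# Demote least-recently-used items out of working memory when it is full;
# recall promotes them back.
class MemoryManager:
    def __init__(self, hot_capacity=2):
        self.hot, self.persistent = {}, {}
        self.hot_capacity = hot_capacity
        self._clock = 0

    def put(self, key, value):
        self._clock += 1
        self.hot[key] = (self._clock, value)
        if len(self.hot) > self.hot_capacity:
            oldest = min(self.hot, key=lambda k: self.hot[k][0])
            self.persistent[oldest] = self.hot.pop(oldest)[1]

    def get(self, key):
        if key in self.hot:
            self._clock += 1
            self.hot[key] = (self._clock, self.hot[key][1])
            return self.hot[key][1]
        if key in self.persistent:           # recall: promote back to hot
            value = self.persistent.pop(key)
            self.put(key, value)
            return value
        return None

mm = MemoryManager(hot_capacity=2)
mm.put("a", "A"); mm.put("b", "B"); mm.put("c", "C")   # "a" gets demoted
recalled = mm.get("a")                                  # promoted back
```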
Agent quality is tightly coupled to tool-call behavior, so ContextEngine includes dedicated tool intelligence: tool-result caching, call-pattern detection, result compression, and prefetching. The goal is lower latency, lower token usage, and less repeated work.
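The caching piece in miniature (keying on a hash of the call signature; the class and method names are hypothetical):

```python
import hashlib
import json

# Serve repeated identical tool calls from cache instead of re-invoking.
class ToolCache:
    def __init__(self):
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def call(self, tool_fn, tool_name, **kwargs):
        payload = json.dumps([tool_name, kwargs], sort_keys=True)
        key = hashlib.sha256(payload.encode()).hexdigest()
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        result = tool_fn(**kwargs)
        self._cache[key] = result
        return result

def double(x):
    return x * 2

cache = ToolCache()
first = cache.call(double, "double", x=3)
second = cache.call(double, "double", x=3)   # served from cache
```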
ContextEngine is instrumented as a systems component, not a black box: OpenTelemetry traces, Prometheus metrics, and a structured event stream. This makes context quality measurable and debuggable in production.
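The event-stream idea, reduced to plain counters (the real system emits OpenTelemetry traces and Prometheus metrics; this sketch only shows the event shape):

```python
import time
from collections import Counter

# Minimal stand-in for the observability layer: structured events plus
# per-event-type counters.
class ContextObserver:
    def __init__(self):
        self.events = []
        self.counters = Counter()

    def emit(self, event_type, **fields):
        self.events.append({"type": event_type, "ts": time.time(), **fields})
        self.counters[event_type] += 1

obs = ContextObserver()
obs.emit("compression.tier1", tokens_before=12_000, tokens_after=4_000)
obs.emit("retrieval.hit", tier="warm")
```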
Short-term focus is integration hardening and framework adapters; long-term focus is robust multi-agent context coordination.
Python 3.12, Pydantic, NetworkX, sentence-transformers, spaCy, tiktoken, ChromaDB, OpenTelemetry, uv