Dynamic Context Discovery: The Agent That Knows Where to Look
In my last post, I explored graph-informed routing — using the density of your personal context graph to decide which model should think. But that assumed you have a graph worth querying.
After six months of daily use, you have a problem. Thousands of decisions. Hundreds of unresolved threads. You can’t dump all of that into context. Token limits. Cost. Noise.
The common answer is compression. Summarize older conversations. Abstract patterns. Accept some loss.
But compression is lossy. And the 5% you lose might be the decision that matters most in a given moment.
There’s a third option.
The Missing Piece in Hybrid Approaches
Most AI memory systems already use multiple storage layers. Mem0’s architecture combines graph memory with vector retrieval. That’s table stakes.
But most hybrids still statically include context — deciding upfront what goes in the prompt. The insight isn’t “use multiple storage paradigms.” It’s “let the agent decide which layer to query, and when.”
Dynamic Context Discovery
The insight comes from how agents handle tool discovery. Cursor’s research on MCP tools found that static tool inclusion bloats context without improving performance. Dynamic discovery — seed with minimal context, let the agent query for more when needed — achieved 46.9% token reduction with better relevance.
The same principle applies to memory.
Static seed: Always include minimal, high-signal context:
- Current bucket/project
- Last 2-3 recent traces
- Top entities by frequency
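As a sketch, the static seed can be assembled from whatever metadata the store already tracks. The field names and trace shape here are hypothetical; the point is only that the seed is small and cheap to build:

```python
from collections import Counter

def build_seed(bucket, traces, entity_counts, n_traces=3, n_entities=5):
    """Assemble the minimal, always-included context.

    bucket        -- name of the current project/bucket (hypothetical field)
    traces        -- list of decision traces, oldest first
    entity_counts -- Counter mapping entity -> mention frequency
    """
    return {
        "bucket": bucket,
        "recent_traces": traces[-n_traces:],  # last 2-3 traces
        "top_entities": [e for e, _ in entity_counts.most_common(n_entities)],
    }

seed = build_seed(
    bucket="side-project",
    traces=["decided to use SQLite", "renamed the CLI", "shipped v0.2"],
    entity_counts=Counter({"CLI": 9, "SQLite": 4, "v0.2": 1}),
)
```

Everything else stays out of the prompt until a trigger fires.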
Dynamic retrieval: Let the agent pull more when it detects it needs it:
- User mentions an entity not in the seed
- Time reference (“last month,” “in March”)
- Cross-context query (“what about the work side of this?”)
- Explicit precedent request (“what did I decide about X?”)
The agent doesn’t compress your history. It queries your history — when relevant, with precision.
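A minimal sketch of the trigger side, assuming the seed exposes its entity set and that time references can be caught with a few patterns. Both are assumptions; a real version would likely let the model itself, or an NER pass, decide when to query:

```python
import re

# Hypothetical patterns for the trigger types listed above.
TIME_REF = re.compile(
    r"\b(last (week|month|year)|in (january|february|march|april|may|june|"
    r"july|august|september|october|november|december)|\d{4})\b", re.I)
PRECEDENT = re.compile(r"\bwhat did i (decide|say|think) about\b", re.I)

def retrieval_triggers(message, seed_entities, known_entities):
    """Return which dynamic-retrieval triggers fire for a user message."""
    triggers = []
    mentioned = {e for e in known_entities if e.lower() in message.lower()}
    if mentioned - set(seed_entities):
        triggers.append("unseeded_entity")    # entity not in the seed
    if TIME_REF.search(message):
        triggers.append("time_reference")     # "last month", "in March"
    if PRECEDENT.search(message):
        triggers.append("precedent_request")  # "what did I decide about X?"
    return triggers
```

When any trigger fires, the agent issues a query against the deeper layers instead of relying on the seed alone.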
Why This Matters
Compression assumes you know upfront every detail that will matter later. You don’t.
The moment that matters often doesn’t feel like a decision at the time. A fleeting thought, a tangent you went down once that later becomes highly relevant, a pattern you didn’t know was a pattern.
Documentation captures what you knew to write down. Dynamic discovery captures what you didn’t, and surfaces it when the connection emerges in real time.
The Hybrid Architecture
This requires a specific architecture. You can’t just use a vector database — but you can’t just use a file system or relational database either. You need all three.
Document layer (file system): Store every decision trace as a document. Atomic. Fast. Complete. Nothing lost. No schema constraints.
Triple layer (graph database): Emit relationships as a secondary index. Entity A blocks Entity B. Decision X is linked to Entity Y. This enables graph traversal without replacing documents.
Vector layer (semantic search): Embed traces for similarity. When dynamic retrieval triggers, search here.
The insight: store like a file system (lossless), index like a database (queryable), embed for fuzzy retrieval. Then let the agent decide which layer to query at inference time.
The flow:
- User speaks → Decision trace saved as document (lossless, file-like)
- Background job emits triples (graph relationships)
- Query triggers → Static seed + dynamic retrieval from graph/vectors
- Full fidelity when needed, minimal tokens when not
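The flow above can be sketched end to end. Nothing here is Mem0’s API; the backends (a directory of JSON files, an in-memory triple index, and a toy bag-of-words embedding) are stand-ins for real ones:

```python
import json, math, re, tempfile
from collections import defaultdict
from pathlib import Path

class HybridStore:
    def __init__(self, root):
        self.root = Path(root); self.root.mkdir(parents=True, exist_ok=True)
        self.triples = defaultdict(list)  # graph layer: subject -> [(rel, obj)]
        self.vectors = {}                 # vector layer: trace_id -> bag of words

    def save_trace(self, trace_id, text, triples=()):
        # 1. Document layer: lossless, schema-free, file-like.
        (self.root / f"{trace_id}.json").write_text(json.dumps({"text": text}))
        # 2. Triple layer: secondary index; does not replace the document.
        for subj, rel, obj in triples:
            self.triples[subj].append((rel, obj))
        # 3. Vector layer: toy word-count embedding for fuzzy retrieval.
        self.vectors[trace_id] = defaultdict(int)
        for w in re.findall(r"\w+", text.lower()):
            self.vectors[trace_id][w] += 1

    def search(self, query, k=1):
        # Cosine similarity against the toy embeddings.
        q = defaultdict(int)
        for w in re.findall(r"\w+", query.lower()):
            q[w] += 1
        def cos(a, b):
            dot = sum(a[w] * b[w] for w in a)
            na = math.sqrt(sum(v * v for v in a.values()))
            nb = math.sqrt(sum(v * v for v in b.values()))
            return dot / (na * nb) if na and nb else 0.0
        return sorted(self.vectors, key=lambda t: cos(q, self.vectors[t]),
                      reverse=True)[:k]

store = HybridStore(tempfile.mkdtemp())
store.save_trace("t1", "decided to move billing to SQLite",
                 triples=[("billing", "blocked_by", "SQLite migration")])
store.save_trace("t2", "renamed the CLI flags")
```

Each layer stays independent: the document is the source of truth, the triples and vectors are disposable indexes that can be rebuilt from it.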
Frequency Scoring (Not Just Recency)
One insight from building this: recency isn’t the best signal for what to surface.
An idea mentioned consistently over six months matters more than something mentioned three times last week then dropped. Excitement fades. Consistency indicates importance.
The scoring function I’ve been experimenting with uses non-linear decay:
- Fast decay in the first week (filter out excitement spikes)
- Slow decay for older items (reward consistency)
A topic you’ve mentioned monthly for six months scores higher than a topic you mentioned daily for a week then forgot.
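A sketch of one such scoring function, with made-up half-lives (3 days fast, 90 days slow) and a piecewise-exponential decay that stays continuous at the one-week boundary:

```python
import math

FAST_HALF_LIFE = 3.0   # days; aggressive decay inside the first week
SLOW_HALF_LIFE = 90.0  # days; gentle decay after that
BREAK = 7.0            # the one-week boundary

def weight(age_days):
    """Non-linear decay: fast for the first week, slow afterwards."""
    k_fast = math.log(2) / FAST_HALF_LIFE
    k_slow = math.log(2) / SLOW_HALF_LIFE
    if age_days <= BREAK:
        return math.exp(-k_fast * age_days)
    # Continue from the day-7 value so the curve has no jump.
    return math.exp(-k_fast * BREAK) * math.exp(-k_slow * (age_days - BREAK))

def frequency_score(mention_ages_days):
    """Score a topic by summing the decayed weight of every mention."""
    return sum(weight(a) for a in mention_ages_days)

# Mentioned roughly monthly for six months vs. daily for a week, a month ago.
steady = frequency_score([0, 30, 60, 90, 120, 150])
burst = frequency_score([30, 31, 32, 33, 34, 35, 36])
```

With these (assumed) constants, the steady monthly topic outscores the week-long burst once the burst is a month old, which is exactly the behavior the paragraph above describes.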
The Research I Drew From
I spent time this week with Mem0’s memory architecture (arXiv:2504.19413). Their research showed:
- 91% lower latency vs. full-context approaches
- 90% token reduction through smart retrieval
Their approach separates retrieval from state management — a key insight. Cursor’s work on MCP tool discovery showed the same principle applies to context: dynamic > static.
I’ve been extending both to include the dynamic discovery pattern, where the agent decides when to retrieve, not just what.
If you’re building memory-based AI:
- Mem0: github.com/mem0ai/mem0 (Apache 2.0)
- Their paper: arXiv:2504.19413
Props to both teams for the research that informed this thinking.
The Takeaway
You don’t have to choose between file system and database. You don’t have to choose between dumping everything and compressing.
Dynamic context discovery is the third option: store losslessly (file system), index relationally (graph), embed semantically (vectors), and retrieve on-demand. The agent queries your full history using whichever layer fits the query.
Over time, as you build up context, the AI that serves you best isn’t the one that chose a storage paradigm upfront. It’s the one that learned when to ask for more, which layer to ask, and how to surface connections you might not have caught yourself.
It’s like having an Executive Assistant that knows you better than yourself. Or like Wendy to Axelrod. Or Donna to Harvey.