Step 6: The Memory Manager — Give Your Agent Long-Term Memory

The four operations

The drip frames long-term memory as four operations: write (is this durable enough to keep?), select (which stored facts matter now?), compress (fold a session into a summary), and invalidate (does a new fact retire an old one?). We already built all four into memory.py:

select → recall() (valid facts only)
write + invalidate → write_fact() (upsert closes the old value)
compress → summarize_session()

The manager just orchestrates them around one agent turn.
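
To make "upsert closes the old value" and "valid facts only" concrete, here is a minimal, self-contained sketch of the idea. It is not the memory.py from the earlier steps: the table layout, the valid_to column, and the *_demo function names are assumptions chosen for illustration.

```python
import sqlite3

def connect_demo() -> sqlite3.Connection:
    # In-memory database with a hypothetical facts table.
    # valid_to is NULL while a fact is current, and holds the
    # session number that retired it once it has been replaced.
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE facts ("
        " predicate TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " session INTEGER NOT NULL,"
        " valid_to INTEGER)"
    )
    return con

def write_fact_demo(con, predicate: str, value: str, session: int) -> None:
    # WRITE + INVALIDATE: skip unchanged values, otherwise close the
    # current row for this predicate and insert the new one.
    current = con.execute(
        "SELECT value FROM facts WHERE predicate = ? AND valid_to IS NULL",
        (predicate,),
    ).fetchone()
    if current and current[0] == value:
        return
    con.execute(
        "UPDATE facts SET valid_to = ? WHERE predicate = ? AND valid_to IS NULL",
        (session, predicate),
    )
    con.execute(
        "INSERT INTO facts (predicate, value, session, valid_to)"
        " VALUES (?, ?, ?, NULL)",
        (predicate, value, session),
    )
    con.commit()

def recall_demo(con) -> list[tuple[str, str]]:
    # SELECT: only rows that have not been closed are returned.
    return con.execute(
        "SELECT predicate, value FROM facts"
        " WHERE valid_to IS NULL ORDER BY predicate"
    ).fetchall()

con = connect_demo()
write_fact_demo(con, "city", "Toronto", session=1)
write_fact_demo(con, "city", "Montreal", session=2)  # retires "Toronto"
current_facts = recall_demo(con)
history_rows = con.execute("SELECT COUNT(*) FROM facts").fetchone()[0]
```

After the second write, current_facts holds only the Montreal row, while the table still keeps both rows, so the old value is retired rather than deleted.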

agent.py

# agent.py
import ollama
from memory import connect, recall, extract_facts, write_fact, recent_summaries
 
def agent_turn(con, session: int, user_msg: str) -> str:
    # 1. SELECT — pull the currently-valid facts relevant to this message
    facts = recall(con, user_msg, k=6)
    memory_block = "\n".join(f"- {p}: {v}" for (_s, p, v) in facts) or "(nothing yet)"
    summary = recent_summaries(con, limit=2)
 
    system = (
        "You are a helpful assistant with memory of this user.\n"
        f"What you know about them (current facts):\n{memory_block}\n"
        + (f"\nRecent context:\n{summary}\n" if summary else "")
        + "\nUse this memory when relevant. Don't invent facts you don't have."
    )
 
    # 2. ANSWER
    reply = ollama.chat(
        model="llama3.2",
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user_msg},
        ],
    )["message"]["content"]
 
    # 3. WRITE + INVALIDATE — extract durable facts and upsert them
    for f in extract_facts(user_msg):
        if f.get("predicate") and f.get("value"):
            write_fact(con, f["predicate"], f["value"], session=session)
 
    return reply
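
The prompt assembly in step 1 can be checked without calling a model. The sketch below pulls it into a pure function; build_system_prompt is a hypothetical helper name, and it assumes recall() yields (subject, predicate, value) tuples, as the unpacking in agent_turn suggests, and that the summary is a plain string.

```python
def build_system_prompt(facts, summary: str = "") -> str:
    # One bullet per fact; fall back to a placeholder when memory is empty.
    memory_block = (
        "\n".join(f"- {p}: {v}" for (_s, p, v) in facts) or "(nothing yet)"
    )
    prompt = (
        "You are a helpful assistant with memory of this user.\n"
        f"What you know about them (current facts):\n{memory_block}\n"
    )
    # The recent-context section is only added when a summary exists.
    if summary:
        prompt += f"\nRecent context:\n{summary}\n"
    prompt += "\nUse this memory when relevant. Don't invent facts you don't have."
    return prompt

empty_prompt = build_system_prompt([])
maya_prompt = build_system_prompt(
    [("user", "name", "Maya"), ("user", "city", "Toronto")]
)
```

Keeping this part free of I/O means the memory block can be unit-tested on its own, separately from the ollama call and the database.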

A real conversation

if __name__ == "__main__":
    con = connect()
    print(agent_turn(con, 1, "Hi! I'm Maya, a backend engineer in Toronto."))
    print(agent_turn(con, 1, "What do you know about me so far?"))

$ python agent.py
Hi Maya! Good to meet a fellow backend engineer.
You're Maya, a backend engineer, and you're based in Toronto.

The second turn answers from facts the first turn wrote. Within one session that's unremarkable. What matters is that it holds across sessions, because the facts live in SQLite, not in the context window. Close the process, reopen it next week, and recall() still returns Maya's facts.
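
The cross-session claim rests on ordinary SQLite file persistence, which can be seen in isolation. This is a minimal sketch using a throwaway database file and a simplified two-column table; the file name and schema are assumptions for the demo, not the tutorial's store.

```python
import os
import sqlite3
import tempfile

# Throwaway file so the demo does not touch the real memory database.
db_path = os.path.join(tempfile.mkdtemp(), "memory_demo.db")

# "Session 1": write a fact, then close the connection as if the
# process had exited.
con = sqlite3.connect(db_path)
con.execute("CREATE TABLE facts (predicate TEXT, value TEXT)")
con.execute("INSERT INTO facts VALUES (?, ?)", ("name", "Maya"))
con.commit()
con.close()

# "Next week": a brand-new connection reads the same row back from disk.
con = sqlite3.connect(db_path)
restored = con.execute("SELECT predicate, value FROM facts").fetchall()
con.close()
```

Nothing from the first connection survives in Python memory; the row comes back only because it was committed to the file.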

Compress at session end

Call summarize_session() when a conversation wraps (a timeout, an explicit "bye", or a turn budget):

from memory import summarize_session

def end_session(con, session: int, turns: list[str]) -> None:
    # Fold the finished session's turns into a stored summary.
    summarize_session(con, session, turns)

That's the whole manager. It is simple by design: the intelligence lives in the store's temporal writes and the recall filter, not in a sprawling agent. Step 7 runs the test that proves it.

Reference: ollama-python · Agent Long-Term Memory — §04 Learning what to remember · Agentic Context Engineering (drip)