Engineering
Written by Bobur Umurzokov
How to monitor AI agent memory with AgentOps
Learn how AgentOps can be used to track and monitor Memori operations. Get full visibility into memory usage, context injection, performance and reliability of AI agents.
In this post, you will learn how AgentOps can be used to track and monitor Memori operations. Together, they give developers full visibility into memory usage: when conversations are saved, how context is injected, and what effect this has on performance and reliability.
Key Takeaways:
- Why memory observability matters for AI agents.
- How AgentOps helps monitor memory effectiveness, token use, and errors.
- Practical steps for developers to debug and optimize memory in their own agents.
Adding a memory layer makes AI agents more useful and personal, can reduce token usage by up to 90%, and speeds up responses. But it also adds new challenges. You might ask: Is memory being recorded correctly? Is the right context added to the conversation? How does memory impact cost, latency, and accuracy?
Why Memory Monitoring Matters
With Memori, you give your AI agent a notebook that it writes in automatically. Every conversation gets saved, and when needed, the agent looks back at its notes to give better answers. While using Memori, we also want to track and monitor:
- Whether memory is saved correctly
- Whether the right information is used in conversations
- The extra latency and cost added by memory
- Errors that happen behind the scenes
Why Track Memori with AgentOps?
AgentOps serves as an observability layer for AI agents, providing developers with the insights they need to build more reliable and cost-effective AI applications. With AgentOps, you can see exactly how Memori handles memory in your AI agents. It records when conversations are automatically captured and stored, and it shows how context is injected into each LLM interaction. You get visibility into the full conversation flow across sessions, making it easier to understand how dialogue evolves over time. AgentOps also helps analyze how historical context improves response quality, while tracking latency and token usage to measure performance impact. Finally, it flags issues with memory recording or context retrieval so you can quickly identify and fix errors.
Setting Up Memory Monitoring
Let's walk through the setup step by step.
Step 1: Install What You Need
First, install the required packages:
pip install agentops memorisdk openai python-dotenv
Step 2: Set Up Your API Keys
Create a .env file with your keys:
AGENTOPS_API_KEY="your_agentops_api_key_here"
OPENAI_API_KEY="your_openai_api_key_here"
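Before wiring everything together, it can save debugging time to fail fast when a key is missing from the environment. The `require_env` helper below is a small convenience sketch of our own, not part of AgentOps or Memori:

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail fast with a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# After load_dotenv() has populated the environment from .env, check both keys:
# agentops_key = require_env("AGENTOPS_API_KEY")
# openai_key = require_env("OPENAI_API_KEY")
```

A missing or empty key then surfaces as one clear error at startup instead of an opaque authentication failure mid-run.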
Step 3: Create Your Memory-Enabled Agent
Here's a simple example where we'll record user preferences and then ask questions:
import agentops
from memori import Memori
from openai import OpenAI
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Start tracking everything with AgentOps
agentops.start_trace("preference_memory_demo", tags=["user_preferences", "memory_test"])

try:
    # Set up OpenAI
    openai_client = OpenAI()

    # Set up Memori - this gives our agent automatic memory
    memori = Memori(
        database_connect="sqlite:///user_preferences.db",
        conscious_ingest=True,  # This makes it smart about what to remember
        auto_ingest=True,  # This makes it automatic
    )

    # Turn on the memory system
    memori.enable()

    print("=== Recording User Preferences ===")

    # First preference: Favorite cuisine
    response1 = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "I love Italian food, especially pasta with marinara sauce"
        }]
    )
    print("Assistant:", response1.choices[0].message.content)

    # Second preference: Dietary restrictions
    response2 = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "I'm vegetarian and don't eat any meat"
        }]
    )
    print("Assistant:", response2.choices[0].message.content)

    # Third preference: Cooking style
    response3 = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "I prefer homemade meals over restaurant food"
        }]
    )
    print("Assistant:", response3.choices[0].message.content)

    print("\n=== Testing Memory with Questions ===")

    # Now let's test if the agent remembers our preferences
    response4 = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Can you suggest a recipe I might like for dinner tonight?"
        }]
    )
    print("Assistant:", response4.choices[0].message.content)
    print("Notice: The agent should suggest a vegetarian Italian recipe for home cooking!")

    # Test with a follow-up question
    response5 = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "What ingredients should I avoid when shopping?"
        }]
    )
    print("Assistant:", response5.choices[0].message.content)
    print("Notice: The agent should remember you're vegetarian!")

    # Mark our tracking as successful
    agentops.end_trace(end_state="success")

except Exception as e:
    print(f"Something went wrong: {e}")
    agentops.end_trace(end_state="error")
What You'll See in Your AgentOps Dashboard
After running this code, your AgentOps dashboard will show you:
1. Memory Timeline
You'll see exactly when each preference was recorded and how it was used later.

2. Context Injection
Watch how Memori automatically added your food preferences to the recipe question, even though you didn't mention them again.
3. Performance Impact
See how much extra time and tokens the memory system used. Usually, it's worth the small cost for much better responses.
4. Error Tracking
If something goes wrong with memory (like the database is full), you'll see exactly what happened.
5. Token Usage
Track how many extra tokens were added to the prompt to give the agent context. This helps you understand costs.
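One practical way to quantify this yourself is to compare `response.usage.prompt_tokens` for the same question asked with and without memory enabled (the OpenAI SDK reports these counts on every response). The helper below is a small sketch of our own, not an AgentOps feature, that turns two such measurements into an overhead percentage:

```python
def memory_overhead_pct(baseline_prompt_tokens: int, with_memory_prompt_tokens: int) -> float:
    """Percentage of extra prompt tokens that context injection added."""
    if baseline_prompt_tokens <= 0:
        raise ValueError("baseline_prompt_tokens must be positive")
    extra = with_memory_prompt_tokens - baseline_prompt_tokens
    return 100.0 * extra / baseline_prompt_tokens

# Example: a 40-token prompt grows to 220 tokens once memory context is injected.
print(f"{memory_overhead_pct(40, 220):.0f}% token overhead")  # 450% token overhead
```

The absolute overhead matters less than whether it pays off: a larger prompt that lets the agent answer correctly in one turn is often cheaper than several memory-free retries.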
Summary
AI agents are only as good as their memory, but memory needs to be monitored. Memori takes care of recording and managing conversations automatically, while AgentOps shows you exactly what's remembered, how it's used, and where things go wrong. Together, they make it easy to debug, improve performance, and build agents that feel truly helpful to users.