How to Fix CrewAI Memory in Production with Mem0

Posted In

Engineering

Posted On

March 4, 2026


CrewAI ships with a built-in memory system, and it is genuinely convenient: set memory=True on your Crew, and agents gain short-term, long-term, entity, and contextual memory, backed by local ChromaDB and SQLite storage. But none of those memory types offers per-user isolation, so a multi-user deployment will fail fast in production.

Mem0 plugs into CrewAI as an external memory provider that fixes these problems. It provides persistent storage across sessions, native multi-user scoping, and smarter memory extraction that goes beyond raw retrieval. In this article, I’ll walk you through what CrewAI's memory system does, where it breaks down, and how to configure Mem0 as a replacement.

TL;DR

  • CrewAI has four memory types: short-term (ChromaDB + RAG), long-term (SQLite), entity (RAG-based), and contextual (an orchestration layer that combines the others). All default to local storage.

  • The defaults are machine-bound, have no multi-user isolation, and cause ChromaDB locking errors under concurrent access.

  • Mem0 replaces short-term and entity memory as an external provider, adding user-scoped memory, cross-session persistence, and intelligent extraction that filters signal from noise.

  • Two integration paths: Mem0 Cloud (managed, API key) and Mem0 OSS (self-hosted with your own vector store, LLM, and embedder).

CrewAI default vs. CrewAI + Mem0

| Aspect | CrewAI default memory | CrewAI + Mem0 |
| --- | --- | --- |
| Storage | Local ChromaDB + SQLite | Mem0 Cloud or self-hosted vector DB |
| Persistence | Machine-bound, session-scoped | Cross-session, cross-deployment |
| Multi-user | Not natively supported | Built-in via user_id scoping |
| Memory intelligence | Raw RAG retrieval | Inferred memory extraction + categorization |
| Setup complexity | memory=True (one line) | External memory config (moderate) |

What does CrewAI's memory system actually do?

CrewAI provides a single Memory class that handles storage and retrieval through one API. It uses an LLM to analyze content on save, automatically inferring scope, categories, and importance, and it blends semantic similarity, recency, and importance during recall. Retrieval supports two depths: shallow for routine agent context and deep for complex queries.

If you have read older tutorials or community posts about CrewAI memory, you have likely seen it described as four separate components: short-term, long-term, entity, and contextual. I find it helpful to understand what each one stores and when it fires, because that determines which components I need to replace. 

Short-term memory and the current session

Short-term memory is the working memory agents use during a single crew execution. I think of CrewAI’s short-term memory as the scratchpad that keeps agents coherent within a single run.

Short-term memory covers recent interactions and task outputs scoped to a single kickoff() call. When an agent finishes a task and the next agent in the sequence needs that output as context, this layer handles the handoff. 

Embeddings are persisted to local storage under ./.crewai/memory by default (configurable via the CREWAI_STORAGE_DIR environment variable or a storage parameter). The data survives between runs on the same machine, but sharing across processes or environments requires manual intervention. In my experience, this is the first memory type that breaks when you move from "works on my laptop" to "runs in production."
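If you stay on the defaults for local work, the storage override mentioned above is the one knob worth knowing. A minimal sketch (the directory path is a hypothetical example, not a CrewAI default):

```python
import os

# Pin CrewAI's local memory to a path you control instead of the
# platform-specific default (~/.local/share/CrewAI/ on Linux).
# CrewAI reads this when the Crew is constructed, so set it before
# building the Crew. The path below is a hypothetical example.
STORAGE_DIR = "/var/lib/myapp/crewai-memory"
os.environ["CREWAI_STORAGE_DIR"] = STORAGE_DIR

# Sanity check: confirm the override is visible to the process that
# will construct the Crew (easy to get wrong with forked workers).
assert os.environ["CREWAI_STORAGE_DIR"] == STORAGE_DIR
```

This keeps memory in a known location you can mount or back up, but it does nothing for multi-user isolation, which is the bigger problem below.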

Long-term memory and cross-session learning

Long-term memory stores the outcomes of past task executions so a crew can adjust its approach over time. The key distinction from short-term memory is the type of question it answers: not "what did this user tell me" but "what approach worked the last time I ran this task." It is structured, outcome-oriented storage designed for operational learning across runs rather than conversational recall.

Entity memory and knowledge about the world

Entity memory tracks people, places, organizations, and concepts that agents encounter during tasks. 

This memory type uses RAG-based storage similar to short-term memory, scoped to entities rather than task outputs. If your agent processes a document that mentions "Acme Corp" and its CEO, entity memory is where those facts land.

In multi-agent crews, this is how agents build a shared understanding of the world they are operating in. A research agent surfaces an entity, and a downstream outreach agent can recall it without the fact being explicitly passed through the task chain.

Contextual memory as the orchestrator

Contextual memory combines short- and long-term memory with external memory to build a unified view of context for each agent. 

It is automatic, so you can’t configure it directly. When an agent receives a task, contextual memory assembles the relevant pieces from the other memory types and injects them into the agent's prompt.

That is the system as designed. In local development, it works well enough, for the most part. The problems start when you try to run it in a real environment. For a broader look at why memory layers matter for AI agents, see the AI memory layer guide.

Where does CrewAI's default memory break down?

CrewAI is one of several agentic frameworks that treat memory as a first-class concept. That is great on paper, but the default implementation creates real friction in production. I ran into problems within the first week of moving a crew to production, and the community forums suggest I am not alone.

  • Local-only storage. If you deploy to a container, spin up a new instance, or move to a different machine, all your accumulated memory is gone. CrewAI stores memory files in platform-specific directories (~/Library/Application Support/CrewAI/ on macOS, ~/.local/share/CrewAI/ on Linux), and there is no built-in mechanism to sync or export them. I lost three days of accumulated entity memory when I redeployed to a new Cloud Run instance. You will not realize this is a problem until you have already lost data.

  • No multi-user support. CrewAI does not automatically isolate memory per end-user in a server setting. If you run one API serving many users, you must implement scoping yourself: separate storage per user, per-user scope paths, or external memory keyed by user_id. Skip that, and context bleeds between users: if User A tells the crew they are looking for a house in France and User B says they want an apartment in Spain, each request can surface the other user's context. I saw this firsthand when two test users hit my API within seconds of each other and one received recommendations clearly meant for the other, and the CrewAI community forums have multiple threads from developers hitting the same problem.

  • No memory intelligence. You might expect the memory system to prioritize important information over throwaway context, but it does not. The default RAG retrieval does basic semantic matching against everything stored, giving equal weight to a user's stated dietary restriction and an offhand comment about the weather. I noticed my agents burning tokens on irrelevant retrieved context well before I understood why.

  • Concurrent access issues. On older CrewAI setups backed by ChromaDB, running multiple crews in parallel against shared storage produces "database is locked" errors. Current versions use LanceDB with a retry mechanism, which helps, but LanceDB's own documentation notes that too many concurrent writers can still exhaust the retry limit and produce failed writes. Real-world bug reports also show commit conflicts crashing entire instances under concurrent deletes. I tried the suggested workarounds and found them fragile under real concurrent load.

  • Configuration confusion. The relationship between memory_config, external_memory, embedder, and individual memory type overrides trips up a lot of developers. A GitHub issue documented a bug in which local_mem0_config was not applied because the wrong variable was passed to Memory.from_config(). I hit a variant of this myself. I set the Mem0 config on memory_config but also passed a separate embedder config, and the two conflicted silently.
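To make the isolation gap concrete, here is a minimal sketch of the DIY scoping you sign up for if you stay on the defaults: every read and write has to be keyed by user_id yourself. The store and helpers below are illustrative, not CrewAI APIs.

```python
# A naive shared store blends every user's context. The minimum viable
# fix on CrewAI's defaults is to key all reads and writes by user_id.
# These helpers are illustrative only, not CrewAI APIs.
scoped_store: dict[str, list[str]] = {}

def remember(user_id: str, fact: str) -> None:
    """Write a fact into the bucket belonging to one user."""
    scoped_store.setdefault(user_id, []).append(fact)

def recall(user_id: str) -> list[str]:
    """Read back only that user's facts; unknown users get nothing."""
    return scoped_store.get(user_id, [])

remember("user_a", "looking for a house in France")
remember("user_b", "wants an apartment in Spain")

# Each user sees only their own context.
assert recall("user_a") == ["looking for a house in France"]
assert recall("user_b") == ["wants an apartment in Spain"]
```

You then have to carry this scoping through persistence, retrieval, and concurrency yourself, which is exactly the work Mem0's user_id parameter does for you.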

After working through all of this, I ended up replacing CrewAI's short-term and entity memory with Mem0 while leaving CrewAI's long-term memory and contextual memory in place. 

Long-term memory stores structured task outcomes in SQLite, which serves a different purpose than what Mem0 handles. It tracks whether a task succeeded or failed across runs so the crew can adjust its approach, not user-specific facts or conversational context. Contextual memory is automatic and will pull from whatever memory sources are active, so it picks up Mem0 without any extra configuration. 

The net result is that Mem0 handles the user-facing memory (what did this person tell me, what do I know about them) while CrewAI's built-in components handle the operational memory (what worked last time I ran this task). 

How does Mem0 plug into CrewAI's memory architecture?

Mem0 integrates with CrewAI's short-term and entity memory as an external provider. There are two paths, depending on whether you want managed infrastructure or full control.

I tested both integration paths, Mem0 Cloud and the self-hosted OSS version, and ended up using Cloud for my production setup while keeping OSS for a client project with strict data residency requirements. Here is how each one works.

Using Mem0 Cloud (managed)

This is the path I recommend if you want to get up and running fast. You grab an API key from app.mem0.ai/get-api-key, set it as an environment variable, and configure the ExternalMemory object. The first time I got this working end-to-end took about 15 minutes.

import os
from crewai import Crew, Process
from crewai.memory.external.external_memory import ExternalMemory

os.environ["MEM0_API_KEY"] = "api-key"  # Get one at https://app.mem0.ai/get-api-key

external_memory = ExternalMemory(
    embedder_config={
        "provider": "mem0",
        "config": {
            "user_id": "john",
            "org_id": "my_org_id",
            "project_id": "my_project_id",
            "api_key": os.getenv("MEM0_API_KEY"),
            "run_id": "session_123",
            "includes": "preferences",
            "excludes": "small_talk",
            "infer": True,
            "custom_categories": [
                {"dietary_preferences": "Tracks food allergies, restrictions, and preferences"},
                {"travel_preferences": "Tracks travel style, preferred destinations, and booking habits"}
            ]
        },
    }
)

crew = Crew(
    agents=[...],
    tasks=[...],
    external_memory=external_memory,
    process=Process.sequential,
    verbose=True
)

I want to walk through the config parameters because a few of them are not obvious from the names alone, and getting them right saved me a lot of debugging later.

  • user_id is the only required parameter. It scopes all memory to a specific user, which is the single change that solved my multi-user isolation problem. Every memory operation, both reads and writes, gets filtered through this ID.

  • org_id and project_id are optional but I found them valuable quickly. I use project_id to separate staging and production memory for the same crew. Without it, I was accidentally polluting production memory with test data during development.

  • run_id ties memory to a specific session or execution run. I use this for short-term conversational context that should not persist after the user closes the chat. If you skip it, all memory persists indefinitely under the user_id.

  • includes and excludes filter what types of information get stored or retrieved. Setting excludes to "small_talk" was one of the most impactful changes I made. Before that, my agents were retrieving greetings and filler alongside actual user preferences.

  • infer defaults to True, and I strongly recommend leaving it on. This is the parameter that tells Mem0 to figure out what is worth remembering rather than storing everything raw. When I briefly turned it off for testing, retrieval quality dropped noticeably because the agent was pulling back full conversation turns instead of extracted facts.

  • custom_categories lets you define your own memory buckets with descriptions. I set up categories like "dietary_preferences" and "travel_preferences" for a concierge agent, and Mem0 automatically sorted extracted memories into the right bucket. If you skip this, Mem0 still categorizes memories, just with its own default taxonomy.
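In my API handler, I build this config per request rather than hard-coding a user. The helper below is a hypothetical convenience of mine, not a CrewAI or Mem0 API; the dict shape mirrors the ExternalMemory example above.

```python
import os
from typing import Optional

def mem0_config(user_id: str, session_id: Optional[str] = None) -> dict:
    """Build the per-request config dict for ExternalMemory.

    user_id is the one required Mem0 field; everything else is optional.
    This helper is illustrative, mirroring the example config above.
    """
    config = {
        "user_id": user_id,                    # isolates memory per end-user
        "api_key": os.getenv("MEM0_API_KEY"),
        "excludes": "small_talk",              # keep filler out of storage
        "infer": True,                         # store extracted facts, not raw turns
    }
    if session_id is not None:
        config["run_id"] = session_id          # scope memory to this session only
    return {"provider": "mem0", "config": config}

cfg = mem0_config("user_42", session_id="session_123")
```

Passing the authenticated user's ID here on every request is what guarantees the isolation; omitting session_id makes the memory persist across sessions for that user.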

The Cloud path works well if you are comfortable sending data to a third-party API. If that is not an option, Mem0 also supports a fully self-hosted setup.

Using Mem0 OSS (self-hosted)

If you cannot send data to a third-party API, or if you want full control over your vector store and embedding pipeline, the self-hosted path is what you want. I used this for a client project where all user data had to stay within their AWS VPC. 

The setup takes longer, but the config structure is similar.

from crewai import Crew, Process
from crewai.memory.external.external_memory import ExternalMemory

external_memory = ExternalMemory(
    embedder_config={
        "provider": "mem0",
        "config": {
            "user_id": "john",
            "local_mem0_config": {
                "vector_store": {
                    "provider": "qdrant",
                    "config": {"host": "localhost", "port": 6333}
                },
                "llm": {
                    "provider": "openai",
                    "config": {
                        "api_key": "your-openai-key",
                        "model": "gpt-4o-mini"
                    }
                },
                "embedder": {
                    "provider": "openai",
                    "config": {
                        "api_key": "your-openai-key",
                        "model": "text-embedding-3-small"
                    }
                }
            },
            "infer": True
        },
    }
)

crew = Crew(
    agents=[...],
    tasks=[...],
    external_memory=external_memory,
    process=Process.sequential,
    verbose=True
)

The key difference here is the local_mem0_config block, where you specify your own vector store, LLM, and embedder. I used Qdrant in the example above because that was what I had running, but you can replace it with Weaviate, pgvector, or any other supported provider. 

The tradeoff here is that you own the uptime, backups, and scaling of every component in that stack. When my Qdrant instance ran out of memory during a load test, that was my problem to solve, not Mem0's.

One thing I want to flag. If you hit the bug I mentioned earlier, where local_mem0_config is not being applied, make sure you are on a recent version of CrewAI. The fix was merged, but if you are pinned to an older version, you will run into the same silent failure I did.

How does Mem0 address the multi-user and production gaps?

I want to be specific about what changed after I switched, because the improvements mapped directly to the problems I described earlier.

  • After adding user_id to the Mem0 config, I reran the same France-vs-Spain two-user test that had originally exposed the context bleed. This time, each user got back only their own context. I confirmed on the Mem0 dashboard that the memories were cleanly isolated. One config parameter fixed what I had been hacking around for weeks.

  • Redeployments stopped wiping memory too. I tore down my Cloud Run instance, spun up a fresh one, and the agent recalled that the user preferred window seats and had asked about flights to Tokyo two sessions ago. On the client project using the OSS path, the Qdrant instance persisted independently of the application containers, so I got the same durability with self-hosted infrastructure.

  • Splitting staging and production into separate project_id values eliminated the environment cross-contamination, where test entities like "Fake User" had been surfacing in real user responses.

  • The retrieval quality improvement surprised me the most. With infer enabled, a user saying "I am allergic to nuts, by the way, it is raining here today" resulted in one stored memory about the nut allergy. The weather comment was correctly discarded. My average retrieved context shrank while the relevance of that context went up.

I did not run formal benchmarks on my own CrewAI setup, so I cannot provide exact agent-specific performance numbers. But the pattern was consistent: my agents retrieved more relevant context, used fewer tokens per retrieval, and produced better responses after the switch.

Conclusion

CrewAI's memory system gives you the right building blocks for local prototyping. Mem0 turns those building blocks into a production-ready memory layer that persists across sessions, isolates context per user, and filters out noise before it reaches your agents. 

The integration took me about 15 minutes for the Cloud path, and after the switch, my agents stopped forgetting and started compounding their knowledge.

FAQs

Does CrewAI support multi-user memory isolation?

Not natively. CrewAI does not scope memory per user in server environments, which means context can bleed between users sharing the same API. Mem0 fixes this with a user_id parameter that filters all reads and writes to a specific user.

What happens to CrewAI memory when you redeploy?

It gets wiped. CrewAI stores memory in local, machine-bound directories. Any redeployment, new container, or fresh cloud instance starts with a blank slate. Mem0 persists memory externally so agents retain knowledge across sessions and deployments.

Which CrewAI memory types does Mem0 replace?

Mem0 replaces short-term and entity memory as an external provider. Long-term memory (SQLite-backed task outcomes) and contextual memory (the orchestration layer) stay handled by CrewAI natively.

Can I self-host Mem0 instead of using Mem0 Cloud?

Yes. Mem0 OSS lets you bring your own vector store (Qdrant, Weaviate, pgvector), LLM, and embedder. This is the right path if you have data residency requirements or cannot send data to a third-party API.

What does the infer parameter do in Mem0's CrewAI config?

When set to true (the default), Mem0 extracts meaningful facts rather than storing raw conversation text. A user saying "I'm allergic to nuts, by the way it's raining here today" results in one stored memory about the allergy. The filler gets discarded. Turning this off noticeably degrades retrieval quality.

How do you prevent test data from polluting production memory?

Use separate project_id values for staging and production in your Mem0 config. Without this separation, test entities and dummy users can surface inside real user responses.


© 2026 Mem0. All rights reserved.
