Features
Knowledge Graph
How Memcity automatically builds a knowledge graph from your documents and uses it to find connections humans would miss.
What is a Knowledge Graph?
Imagine you have hundreds of documents — employee handbooks, product specs, meeting notes, support tickets. A knowledge graph is like a mind map your application builds automatically by reading all those documents and figuring out what "things" exist and how they're connected.
For example, after ingesting a few documents, Memcity might build a graph like this:
[John Smith] --works_at--> [Acme Corp]
[Acme Corp] --headquartered_in--> [San Francisco]
[Acme Corp] --manufactures--> [Widget Pro]
[Widget Pro] --uses_technology--> [React Native]
[Jane Doe] --manages--> [John Smith]
[Jane Doe] --works_at--> [Acme Corp]

Each bracketed name is an entity (a person, company, product, location, concept, or technology). Each arrow is a relationship (how two entities are connected).
Why Do You Need One Alongside Vector Search?
Vector search is great at finding text that sounds similar to your query. But it misses logical connections between pieces of information that are in different documents.
Example: You have two documents:
- "John Smith is the CEO of Acme Corp" (in the team page)
- "Acme Corp reported $50M in quarterly revenue" (in the finance report)
If a user asks "What is the revenue of John Smith's company?", pure vector search might not connect these — the text in document 1 doesn't mention revenue, and document 2 doesn't mention John Smith.
But the knowledge graph connects them:
[John Smith] --is_ceo_of--> [Acme Corp] --has_revenue--> [$50M]

Memcity traverses this graph at search time, finding document 2 through the entity "Acme Corp" that connects it to John Smith.
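The join can be pictured as a two-hop lookup over stored triples. A toy TypeScript sketch (the triple shape and the `twoHop` helper are illustrative, not Memcity's API):

```typescript
// Two facts extracted from two different documents, stored as triples.
const triples = [
  { subject: "John Smith", predicate: "is_ceo_of", object: "Acme Corp" },
  { subject: "Acme Corp", predicate: "has_revenue", object: "$50M" },
];

// Hypothetical two-hop lookup: start entity -> connected entity -> fact.
function twoHop(start: string, firstPred: string, secondPred: string): string | undefined {
  const hop1 = triples.find((t) => t.subject === start && t.predicate === firstPred);
  if (!hop1) return undefined;
  const hop2 = triples.find((t) => t.subject === hop1.object && t.predicate === secondPred);
  return hop2?.object;
}

// "What is the revenue of John Smith's company?" resolves even though
// neither triple alone mentions both John Smith and revenue.
const revenue = twoHop("John Smith", "is_ceo_of", "has_revenue"); // "$50M"
```

Neither document contains both "John Smith" and "revenue"; the shared entity "Acme Corp" is what bridges them.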
How Memcity Builds the Graph
Entity Extraction
When you ingest a document, Memcity's LLM reads each chunk and identifies entities — the "nouns" of your data. Entities have a name, a type, and optional metadata:
| Type | Examples |
|---|---|
| Person | "John Smith", "Dr. Sarah Chen", "the CTO" |
| Organization | "Acme Corp", "Engineering Team", "Board of Directors" |
| Product | "Widget Pro", "API Gateway", "Mobile App v3" |
| Technology | "React", "PostgreSQL", "Kubernetes", "OAuth 2.0" |
| Concept | "microservices architecture", "agile methodology", "refund policy" |
| Location | "San Francisco", "Building 3", "us-east-1" |
Entity extraction happens automatically during ingestion — you don't need to tag anything manually.
Relationship Tracking
After extracting entities, the LLM identifies relationships between them. Relationships are stored as triples: subject → predicate → object.
Subject        Predicate        Object
─────────      ─────────        ──────
John Smith     is_ceo_of        Acme Corp
Acme Corp      uses_technology  React Native
Widget Pro     depends_on       PostgreSQL
Jane Doe       reports_to       John Smith
Refund Policy  applies_to       Digital Products

Relationships are bidirectional — if "John Smith is_ceo_of Acme Corp" exists, searching for either entity finds the other.
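One way to picture bidirectional lookup is indexing every triple under both its subject and its object, so a search starting from either end reaches the same fact. A minimal sketch (an illustration of the idea, not Memcity's storage layer):

```typescript
type Triple = { subject: string; predicate: string; object: string };

const facts: Triple[] = [
  { subject: "John Smith", predicate: "is_ceo_of", object: "Acme Corp" },
  { subject: "Jane Doe", predicate: "reports_to", object: "John Smith" },
];

// Index each triple under both its subject and its object, so a search
// starting from either entity can find the relationship.
const index = new Map<string, Triple[]>();
for (const t of facts) {
  for (const key of [t.subject, t.object]) {
    const bucket = index.get(key) ?? [];
    bucket.push(t);
    index.set(key, bucket);
  }
}

// Looking up "Acme Corp" finds the is_ceo_of triple even though
// Acme Corp is only the object of that relationship.
const acmeFacts = index.get("Acme Corp") ?? [];
```

With this layout, "John Smith" resolves to both triples (one where he is the subject, one where he is the object), which is what makes graph search direction-agnostic.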
Concrete Example
Let's say you ingest these three documents:
Document 1: Team Directory
Sarah Chen is the VP of Engineering at TechCo. She leads a team
of 50 engineers across three offices. Sarah previously worked
at Google on the Search team.

Document 2: Product Spec
Project Atlas is TechCo's next-generation search platform.
It uses Elasticsearch for indexing and React for the frontend.
The project is scheduled for Q3 2025 launch.

Document 3: Meeting Notes
In today's standup, Sarah approved the use of Kubernetes for
Project Atlas deployment. The team will migrate from bare metal
to GKE clusters by end of month.

After ingestion, the knowledge graph contains:
[Sarah Chen] --role:vp_engineering--> [TechCo]
[Sarah Chen] --leads--> [Engineering Team]
[Sarah Chen] --previously_at--> [Google]
[Sarah Chen] --approved--> [Kubernetes for Atlas]
[Project Atlas] --owned_by--> [TechCo]
[Project Atlas] --uses--> [Elasticsearch]
[Project Atlas] --uses--> [React]
[Project Atlas] --uses--> [Kubernetes]
[Project Atlas] --launches--> [Q3 2025]
[Engineering Team] --size--> [50 engineers]

Now if someone asks "What technologies does Sarah's project use?", the graph connects:
Sarah Chen → leads → Engineering at TechCo → owns → Project Atlas → uses → [Elasticsearch, React, Kubernetes]
Vector search alone would struggle with this because "Sarah" and "Elasticsearch" never appear in the same document.
GraphRAG Traversal Strategies
When you search, Memcity extracts entities from your query, finds them in the graph, and traverses outward to find related information. The traversal strategy determines how it explores:
Breadth-First (breadth_first)
Explores all neighbors at each depth level before going deeper. Like exploring a building floor by floor — check every room on floor 1, then every room on floor 2, etc.
Depth 0: [Sarah Chen]
Depth 1: [TechCo], [Engineering Team], [Google], [Kubernetes]
Depth 2: [Project Atlas], [50 engineers], [Search team], ...
Depth 3: [Elasticsearch], [React], [Q3 2025], ...

Best for: Broad exploration when you want to discover all connections at each level.
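The level-by-level expansion above can be sketched as a plain BFS over an adjacency list. The graph and function names here are made up for illustration; Memcity's internal traversal will differ:

```typescript
type Graph = Map<string, string[]>; // adjacency list: entity -> neighbors

// Breadth-first traversal up to maxDepth, visiting one level at a time.
function breadthFirst(graph: Graph, start: string, maxDepth: number): string[][] {
  const visited = new Set<string>([start]);
  const levels: string[][] = [[start]];
  for (let depth = 0; depth < maxDepth; depth++) {
    const next: string[] = [];
    for (const node of levels[depth]) {
      for (const neighbor of graph.get(node) ?? []) {
        if (!visited.has(neighbor)) {
          visited.add(neighbor);
          next.push(neighbor);
        }
      }
    }
    if (next.length === 0) break; // nothing new discovered at this depth
    levels.push(next);
  }
  return levels;
}

// Toy graph mirroring the example above.
const graph: Graph = new Map([
  ["Sarah Chen", ["TechCo", "Engineering Team", "Google", "Kubernetes"]],
  ["Engineering Team", ["Project Atlas"]],
  ["Project Atlas", ["Elasticsearch", "React"]],
]);
```

Calling `breadthFirst(graph, "Sarah Chen", 3)` yields one array per depth level, matching the Depth 0 through Depth 3 listing above.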
Best-First (best_first)
Always follows the highest-scoring connection first, regardless of depth. Like a detective who always chases the hottest lead.
If the query is about "technology", the traversal might go:
[Sarah Chen] --approved--> [Kubernetes] (high relevance to "technology")
[Kubernetes] --used_by--> [Project Atlas] (high relevance)
[Project Atlas] --uses--> [Elasticsearch] (high relevance)

It dives deep along the most relevant path rather than exploring broadly.
Best for: When you know roughly what you're looking for and want to find the most relevant path quickly.
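Best-first can be sketched as a frontier of scored edges where the highest-scoring edge is always expanded next. The scores below are made-up relevance values and the function name is illustrative:

```typescript
// Scored edge: how relevant this connection is to the query.
type Edge = { to: string; score: number };
type ScoredGraph = Map<string, Edge[]>;

// Best-first traversal: always expand the highest-scoring frontier edge,
// regardless of how deep it sits in the graph.
function bestFirst(graph: ScoredGraph, start: string, maxNodes: number): string[] {
  const visited = new Set<string>([start]);
  const frontier: Edge[] = [...(graph.get(start) ?? [])];
  const order: string[] = [];
  while (frontier.length > 0 && order.length < maxNodes) {
    frontier.sort((a, b) => b.score - a.score); // a real impl would use a heap
    const best = frontier.shift()!;
    if (visited.has(best.to)) continue;
    visited.add(best.to);
    order.push(best.to);
    frontier.push(...(graph.get(best.to) ?? []));
  }
  return order;
}

// Toy graph with hypothetical relevance scores for a "technology" query.
const graph: ScoredGraph = new Map([
  ["Sarah Chen", [{ to: "Kubernetes", score: 0.9 }, { to: "Google", score: 0.3 }]],
  ["Kubernetes", [{ to: "Project Atlas", score: 0.8 }]],
  ["Project Atlas", [{ to: "Elasticsearch", score: 0.7 }]],
]);
```

Starting from "Sarah Chen", the traversal chases Kubernetes, then Project Atlas, then Elasticsearch before it ever looks at the low-scoring Google edge, which is the "hottest lead" behavior described above.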
Hybrid (hybrid) — Recommended
BFS for the first hop (discover all immediate connections), then best-first for deeper exploration (follow the most promising leads). This gets the best of both strategies.
Hop 1 (BFS): [TechCo], [Engineering Team], [Google], [Kubernetes]
Hop 2+ (Best): Follow most relevant → [Project Atlas] → [Elasticsearch]

Best for: Most use cases. This is the default.
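The two-phase idea can be sketched as: take every immediate neighbor on hop 1, then from each of those follow only the single best-scoring edge on deeper hops. Again, the graph and scores are hypothetical:

```typescript
type Edge = { to: string; score: number };
type Graph = Map<string, Edge[]>;

// Hybrid traversal sketch: BFS for hop 1, then greedily follow the single
// most promising edge from each hop-1 node for hops 2 through maxDepth.
function hybrid(graph: Graph, start: string, maxDepth: number): string[] {
  const visited = new Set<string>([start]);
  const found: string[] = [];
  // Hop 1: take every immediate neighbor (breadth-first).
  const hop1 = (graph.get(start) ?? []).map((e) => e.to);
  for (const n of hop1) { visited.add(n); found.push(n); }
  // Hops 2+: from each hop-1 node, follow only the best unvisited edge.
  for (const n of hop1) {
    let current = n;
    for (let depth = 2; depth <= maxDepth; depth++) {
      const edges = (graph.get(current) ?? []).filter((e) => !visited.has(e.to));
      if (edges.length === 0) break;
      const best = edges.reduce((a, b) => (b.score > a.score ? b : a));
      visited.add(best.to);
      found.push(best.to);
      current = best.to;
    }
  }
  return found;
}

// Toy graph with made-up relevance scores.
const graph: Graph = new Map([
  ["Sarah Chen", [{ to: "TechCo", score: 0.4 }, { to: "Engineering Team", score: 0.6 }]],
  ["Engineering Team", [{ to: "Project Atlas", score: 0.9 }]],
  ["Project Atlas", [{ to: "Elasticsearch", score: 0.8 }, { to: "React", score: 0.7 }]],
]);
```

Hop 1 collects both TechCo and Engineering Team; the deeper hops then follow only the strongest edges through Project Atlas to Elasticsearch, combining BFS coverage with best-first focus.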
Configuration
graph: {
enabled: true, // Enable knowledge graph
traversalStrategy: "hybrid", // "breadth_first" | "best_first" | "hybrid"
maxDepth: 3, // How many hops to traverse
maxNodes: 50, // Max nodes to visit per search
}

| Option | Default | Description |
|---|---|---|
| enabled | true | Set to false to disable graph entirely |
| traversalStrategy | "hybrid" | How to explore the graph |
| maxDepth | 3 | Maximum relationship hops. Higher = broader but slower |
| maxNodes | 50 | Safety limit on nodes visited. Prevents runaway traversals |
Tuning tips:
- Increase maxDepth (to 4-5) if your documents have long chains of relationships (e.g., org charts, dependency trees)
- Decrease maxDepth (to 1-2) if you only care about direct connections
- Increase maxNodes (to 100) if you have a densely connected graph and want thorough exploration
- Decrease maxNodes (to 20) if you want faster searches at the cost of coverage
Code Examples
Ingesting Documents That Build the Graph
The graph builds automatically during ingestion — no special code needed:
// These three documents will create graph connections
await memory.ingestText(ctx, {
orgId,
knowledgeBaseId: kbId,
text: "Sarah Chen is the VP of Engineering at TechCo.",
source: "team-directory.md",
});
await memory.ingestText(ctx, {
orgId,
knowledgeBaseId: kbId,
text: "Project Atlas is TechCo's search platform using Elasticsearch.",
source: "product-spec.md",
});
await memory.ingestText(ctx, {
orgId,
knowledgeBaseId: kbId,
text: "Sarah approved Kubernetes for the Atlas deployment.",
source: "meeting-notes.md",
});

Searching with Graph-Enhanced Results
Graph results appear alongside regular vector search results:
const results = await memory.getContext(ctx, {
orgId,
knowledgeBaseId: kbId,
query: "What technologies does Sarah's project use?",
});
// Results include both:
// 1. Direct vector matches (documents mentioning technologies)
// 2. Graph-traversed matches (documents connected through entities)
for (const result of results.results) {
console.log(result.text);
console.log("Score:", result.score);
console.log("Source:", result.citations?.source);
}

How It Integrates with the Search Pipeline
The knowledge graph is Step 11 in the 16-step pipeline. It runs after RRF fusion and deduplication, adding graph-discovered results to the candidate set. These graph results then go through reranking (Step 12) alongside the vector search results, so they're scored on equal footing.
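The merge step can be pictured as deduplicating graph-discovered candidates against the existing set before a single combined rerank. The shapes and function below are hypothetical, not Memcity's pipeline code:

```typescript
type Candidate = { id: string; score: number; source: "vector" | "graph" };

// Hypothetical merge step: graph-discovered chunks join the deduplicated
// candidate set, then everything is ranked together on equal footing
// (a simple score sort stands in for the real reranker here).
function mergeCandidates(vector: Candidate[], graph: Candidate[]): Candidate[] {
  const seen = new Set(vector.map((c) => c.id));
  const merged = [...vector, ...graph.filter((c) => !seen.has(c.id))];
  return merged.sort((a, b) => b.score - a.score);
}
```

The important property is that graph results are not appended as a separate, second-class list: after the merge, one ranking pass scores vector and graph candidates by the same criteria.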
Limitations and Best Practices
Works best with:
- Structured, factual content (team pages, product docs, policies)
- Documents that reference shared entities (people, products, companies)
- Content where relationships between concepts matter
Works less well with:
- Highly abstract or creative content
- Very short documents with no clear entities
- Content in languages the LLM handles poorly
Best practices:
- Use descriptive source names when ingesting — these help with citation generation
- Ingest related documents into the same knowledge base so the graph can connect them
- The quality of entity extraction depends on your AI model — if you need better extraction, consider upgrading to a more capable model
Availability
The knowledge graph is available on Pro and Team tiers. Community tier uses vector search and BM25 only.