memcity
v0.7.0 — STABLE

AI Memory for
Convex Apps

Vector search, knowledge graph, episodic memory, and a production-grade 16-step RAG pipeline. One component. One line to install.

Get Started
npx shadcn add @memcity/memory-search
convex/memory.ts
import { Memory } from "memcity";
import { components } from "./_generated/api";

const memory = new Memory(components.memcity, {
  tier: "pro",
  ai: {
    gateway: "openrouter",
    model: "google/gemini-2.0-flash-001",
  },
});

// Ingest a document
await memory.ingest({
  text: documentContent,
  knowledgeBaseId: "kb_engineering",
  source: "design-doc.md",
});

// Search with the full 16-step pipeline
const results = await memory.search({
  query: "How does authentication work?",
  knowledgeBaseId: "kb_engineering",
});
16
Pipeline Steps
Query to result, fully configurable
25+
File Types
PDF, audio, video, images, code
<700ms
Avg Latency
Full pipeline, cold cache
$0.003
Per Query
OpenRouter + Jina combined

How it works

Three Steps to AI Memory

From raw documents to intelligent search results. See what happens under the hood.

01 INGEST

Feed your documents

PDF, Markdown, images, audio, video — 25+ file types. Text is chunked with configurable overlap and heading preservation.

[Diagram: source document split into chunks 01, 02, and 03, with an overlap region between adjacent chunks]
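
The chunking step above can be sketched roughly as follows. This is a minimal illustration, not memcity's actual implementation: `chunkText`, `headingAt`, `chunkSize`, and `overlap` are hypothetical names, sizes are counted in characters rather than tokens, and heading preservation is approximated by tagging each chunk with the nearest Markdown heading at or before its start.

```typescript
interface Chunk {
  index: number;
  heading: string | null; // nearest Markdown heading at or before the chunk start
  text: string;
}

// Returns the text of the last Markdown heading that begins at or before `position`.
function headingAt(source: string, position: number): string | null {
  const headingPattern = /^#{1,6}[ \t]+(.+)$/gm;
  let current: string | null = null;
  let match: RegExpExecArray | null;
  while ((match = headingPattern.exec(source)) !== null) {
    if (match.index > position) break;
    current = match[1].trim();
  }
  return current;
}

// Hypothetical helper: fixed-size character windows that share `overlap` characters.
function chunkText(source: string, chunkSize = 200, overlap = 40): Chunk[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("require 0 <= overlap < chunkSize");
  }
  const chunks: Chunk[] = [];
  const step = chunkSize - overlap; // how far each window advances
  for (let start = 0; start < source.length; start += step) {
    chunks.push({
      index: chunks.length,
      heading: headingAt(source, start),
      text: source.slice(start, start + chunkSize),
    });
    if (start + chunkSize >= source.length) break; // last window reached the end
  }
  return chunks;
}
```

Larger overlap values repeat more context across chunk boundaries at the cost of more chunks to embed and store.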
02 PROCESS

Automatic intelligence

Each chunk gets embedded (Jina v4, 1024d), indexed for BM25, and optionally analyzed for entities and relationships.

processing — chunk_042
$ memcity process chunk_042
[embed] Jina v4 encoding... 1024 dimensions  28ms
[embed] [0.234, -0.891, 0.456, ..., 0.123]
[index] BM25 tokenize... 487 tokens indexed  12ms
[index] vector stored in knowledge_base_01
[ner]   entities:
        AuthModule User Session JWT OAuth2
[graph] relationships:
        AuthModule --manages--> User
        User --creates--> Session
        AuthModule --issues--> JWT
done    chunk_042 processed in 94ms
$
03 SEARCH

16-step retrieval

Query routing, HyDE, hybrid search, RRF fusion, reranking, chunk expansion, citations, and caching. All configurable.

search — pipeline
$ memory.search("How does authentication work?")
[01] quota_check     2ms
[02] cache_lookup    1ms
[03] query_routing   15ms
[04] decomposition   45ms
[05] query_expand    32ms
[06] hyde_generate   120ms
[07] embedding       28ms
[08] hybrid_search   85ms
[09] rrf_fusion      3ms
[10] acl_filter      5ms
[11] graphrag        95ms
[12] reranking       110ms
[13] chunk_expand    45ms
[14] citations       12ms
[15] raptor          65ms
[16] format+cache    4ms
8 results in 667ms — $0.003
0.94  auth.md     p.12 L34
0.91  login.ts    L56-78
0.87  session.md  p.8 L12
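
To make the `rrf_fusion` step concrete, here is a minimal sketch of weighted Reciprocal Rank Fusion over a semantic ranking and a BM25 ranking. It illustrates the general technique only. The function name `weightedRrf`, the `RankedList` shape, and the smoothing constant `k = 60` (a common default in the RRF literature) are assumptions, not memcity's documented API or settings.

```typescript
// One ranked result list (ids ordered best-first) plus its fusion weight.
interface RankedList {
  weight: number;
  ids: string[];
}

// Weighted RRF: score(d) = sum over lists of weight / (k + rank of d in that list).
function weightedRrf(lists: RankedList[], k = 60): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const { weight, ids } of lists) {
    ids.forEach((id, position) => {
      const rank = position + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Usage: fuse a semantic ranking and a BM25 ranking with equal weights.
const fused = weightedRrf([
  { weight: 1, ids: ["auth.md", "login.ts", "session.md"] },  // semantic
  { weight: 1, ids: ["session.md", "auth.md", "tokens.md"] }, // BM25
]);
```

Because RRF uses only rank positions, it can merge lists whose raw scores are not comparable, such as cosine similarity and BM25.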

Architecture

Four Subsystems, One Component

Vector search, knowledge graph, episodic memory, and RAG pipeline — all connected through Convex's real-time database.

  • Convex DB: Real-time Sync
  • Vector Search: Semantic + BM25, RRF Fusion
  • Knowledge Graph: Entity Extraction, GraphRAG Traversal
  • Episodic Memory: Per-User Recall, Decay & Consolidation
  • RAG Pipeline: 16 Steps, Production-Grade
01

Semantic + BM25

Hybrid vector search with weighted RRF fusion for best retrieval quality.

02

Entity Extraction

Automatic NER with BFS, best-first, and hybrid graph traversal strategies.

03

Per-User Memory

Episodic memory with temporal decay and consolidation over conversations.

04

Configurable Pipeline

Every step is a toggle. Disable what you don't need, tune what you do.

Capabilities

Everything You Need

From basic vector search to enterprise-grade knowledge management, all in a single Convex component.

pro

16-Step RAG Pipeline

From query routing to cached results, every search passes through a production-grade retrieval pipeline you can configure per-query.

  • Query routing classifies complexity automatically
  • HyDE generates hypothetical answer documents
  • Jina Reranker v3 for precision ranking
  • Chunk expansion fetches surrounding context
  • Citations with page/line/heading breadcrumbs
Query Input -> Route + Decompose -> HyDE + Embed -> Hybrid Search (Semantic + BM25) -> RRF Fusion + ACL Filter -> Rerank + Chunk Expansion -> Citations + RAPTOR + Cache (667ms | 8 results)
pro

Knowledge Graph

Automatic entity extraction with relationship tracking. Your documents become a navigable graph of connected concepts.

  • BFS, best-first, and hybrid traversal strategies
  • Entity deduplication and merging
  • Relationship strength scoring
  • GraphRAG traversal augments search results
[Graph diagram: Auth, User, Session, Role, Permission, Token, Group, and OAuth nodes. BFS + Best-First + Hybrid Traversal]
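
As an illustration of the BFS strategy listed above, the sketch below walks a small entity graph outward from a seed entity, one hop at a time, up to a depth limit. It is a generic breadth-first traversal under stated assumptions, not memcity's implementation: the `Edge` shape and the `bfsEntities` function are hypothetical, edges are followed in their stated direction only, and the sample edges reuse the relationships shown in the processing log earlier on this page.

```typescript
interface Edge {
  from: string;
  relation: string;
  to: string;
}

// Breadth-first traversal: returns entities in the order they are discovered.
function bfsEntities(edges: Edge[], seed: string, maxDepth: number): string[] {
  // Build an adjacency list keyed by source entity.
  const adjacency = new Map<string, string[]>();
  for (const { from, to } of edges) {
    const neighbors = adjacency.get(from) ?? [];
    neighbors.push(to);
    adjacency.set(from, neighbors);
  }

  const visited = new Set<string>([seed]);
  const order: string[] = [seed];
  let frontier: string[] = [seed];

  // Expand one full level per iteration until the depth limit is reached.
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbor of adjacency.get(node) ?? []) {
        if (!visited.has(neighbor)) {
          visited.add(neighbor);
          order.push(neighbor);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return order;
}

const sampleEdges: Edge[] = [
  { from: "AuthModule", relation: "manages", to: "User" },
  { from: "User", relation: "creates", to: "Session" },
  { from: "AuthModule", relation: "issues", to: "JWT" },
];
const related = bfsEntities(sampleEdges, "AuthModule", 2);
```

A best-first variant would replace the level-by-level frontier with a priority queue ordered by a score such as relationship strength.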
pro

25+ File Types

PDF, DOCX, images (OCR), audio transcription, video, spreadsheets, and more. All processed automatically.

  • PDF via Jina Reader with gateway fallback
  • Images: OCR + description via vision model
  • Audio/video: automatic transcription
  • CSV, XLSX, PPTX: structured extraction
  • Documents: PDF, DOCX, DOC
  • Text: TXT, MD, HTML, JSON
  • Media: PNG, JPG, MP3, MP4
  • Data: CSV, XLSX, PPTX
  • +13 more (25+ file types supported)

Episodic Memory

Per-user memory with decay and consolidation. Your AI remembers conversations over time.

pro
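
One common way to model temporal decay in episodic memory is exponential decay with a half-life, sketched below. This page does not state which decay function memcity uses, so treat this purely as an assumption: `MemoryItem`, `decayedScore`, `recall`, and the half-life parameter are all hypothetical, and consolidation is not covered.

```typescript
interface MemoryItem {
  text: string;
  relevance: number; // base relevance before decay, e.g. a similarity score
  createdAt: number; // Unix timestamp in milliseconds
}

// Exponential decay: the score halves every `halfLifeMs` milliseconds.
function decayedScore(item: MemoryItem, now: number, halfLifeMs: number): number {
  const age = Math.max(0, now - item.createdAt);
  return item.relevance * Math.pow(0.5, age / halfLifeMs);
}

// Returns the `limit` memories with the highest decayed score.
function recall(items: MemoryItem[], now: number, halfLifeMs: number, limit: number): MemoryItem[] {
  return [...items]
    .sort((a, b) => decayedScore(b, now, halfLifeMs) - decayedScore(a, now, halfLifeMs))
    .slice(0, limit);
}

// Usage: with a 7-day half-life, a fresh but weaker memory can outrank a stale strong one.
const DAY_MS = 24 * 60 * 60 * 1000;
const nowMs = 1_700_000_000_000;
const memories: MemoryItem[] = [
  { text: "Prefers OAuth2 login", relevance: 0.9, createdAt: nowMs - 14 * DAY_MS },
  { text: "Asked about session expiry", relevance: 0.5, createdAt: nowMs },
];
const top = recall(memories, nowMs, 7 * DAY_MS, 1);
```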

RAPTOR Summaries

Recursive abstractive processing for tree-organized retrieval. High-level answers from large corpora.

pro

Vector + BM25 Search

Semantic embeddings plus BM25 keyword search with weighted RRF fusion.

Enterprise ACL

Fine-grained document-level permissions with principal hierarchies for users, roles, and groups.

team
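
A document-level ACL check with principal hierarchies can be sketched as follows. This is a simplified illustration under stated assumptions, not memcity's API: the `Principal` string format (`user:`, `role:`, `group:` prefixes), the `memberOf` map, and the `expandPrincipals` and `filterPermitted` functions are hypothetical. The idea is to expand a user into every role and group it inherits, then keep only documents that allow at least one of those principals.

```typescript
type Principal = string; // e.g. "user:alice", "role:engineer", "group:platform"

interface DocAcl {
  docId: string;
  allowed: Principal[];
}

// Collects the starting principal plus everything it transitively belongs to.
function expandPrincipals(start: Principal, memberOf: Map<Principal, Principal[]>): Set<Principal> {
  const seen = new Set<Principal>([start]);
  const stack: Principal[] = [start];
  while (stack.length > 0) {
    const current = stack.pop()!;
    for (const parent of memberOf.get(current) ?? []) {
      if (!seen.has(parent)) {
        seen.add(parent);
        stack.push(parent);
      }
    }
  }
  return seen;
}

// Keeps the ids of documents whose ACL names at least one inherited principal.
function filterPermitted(docs: DocAcl[], user: Principal, memberOf: Map<Principal, Principal[]>): string[] {
  const principals = expandPrincipals(user, memberOf);
  return docs
    .filter((doc) => doc.allowed.some((p) => principals.has(p)))
    .map((doc) => doc.docId);
}

// Usage: alice is an engineer, and engineers belong to the platform group.
const membership = new Map<Principal, Principal[]>([
  ["user:alice", ["role:engineer"]],
  ["role:engineer", ["group:platform"]],
]);
const aclDocs: DocAcl[] = [
  { docId: "design-doc.md", allowed: ["group:platform"] },
  { docId: "payroll.xlsx", allowed: ["user:bob"] },
  { docId: "runbook.md", allowed: ["role:engineer"] },
];
const visible = filterPermitted(aclDocs, "user:alice", membership);
```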

Audit Logging

Immutable audit trail for every search, ingest, and deletion. Full compliance support.

team

Usage Quotas

Per-organization document limits, storage caps, and API rate limiting with automatic enforcement.

team

Under the hood

16-Step Search Pipeline

Every query passes through a production-grade retrieval pipeline. Here's what happens when you call memory.search():

memory.search(query)

  • 01 Quota Check: Usage limits (2ms)
  • 02 Cache: Repeated queries (1ms)
  • 03 Route: Classify complexity (15ms)
  • 04 Decompose: Break sub-queries (45ms)
  • 05 Expand: Query variations (32ms)
  • 06 HyDE: Hypothetical docs (120ms)
  • 07 Embed: 1024d vectors (28ms)
  • 08 Search: Semantic + BM25 (85ms)
  • 09 Fuse: RRF merging (3ms)
  • 10 ACL: Permissions filter (5ms)
  • 11 GraphRAG: Entity traversal (95ms)
  • 12 Rerank: Jina Reranker v3 (110ms)
  • 13 Expand: Chunk context (45ms)
  • 14 Cite: Page/line refs (12ms)
  • 15 RAPTOR: Summaries (65ms)
  • 16 Format: Cache + return (4ms)

total: 667ms | tokens: 3,420 | cost: $0.0028

$ memcity pipeline --verbose

[01] quota_check      -- 2ms    PASS
[02] cache_lookup     -- 1ms    MISS
[03] query_routing    -- 15ms   complexity=high
[04] decomposition    -- 45ms   sub_queries=3
[05] query_expansion  -- 32ms   variations=4
[06] hyde_generation  -- 120ms  hypothetical_docs=2
[07] embedding        -- 28ms   dimensions=1024
[08] hybrid_search    -- 85ms   semantic=50 bm25=50
[09] rrf_fusion       -- 3ms    candidates=42
[10] acl_filter       -- 5ms    permitted=38
[11] dedup+graphrag   -- 95ms   entities=12 edges=28
[12] reranking        -- 110ms  jina_v3
[13] chunk_expansion  -- 45ms   expanded=8
[14] citations        -- 12ms   breadcrumbs=8
[15] raptor_summary   -- 65ms   summaries=3
[16] format+cache     -- 4ms    cached=true

total: 667ms | tokens: 3,420 | cost: $0.0028

Pricing

Simple, One-Time Pricing

One-time payment. Lifetime updates. No subscriptions.

Community

Perfect for prototyping and small projects.

  • Hybrid vector search (semantic + BM25)
  • RRF fusion with configurable weights
  • Basic text ingestion & chunking
  • 1 knowledge base
  • Caching & analytics
  • Apache 2.0 license
Get Started Free
Most Popular

Pro

$49.00 Promo
$2.00 one-time

Full pipeline for production applications.

  • Everything in Community
  • 16-step RAG pipeline
  • Knowledge graph (GraphRAG)
  • Episodic memory (per-user)
  • 25+ file type processing
  • RAPTOR summaries
  • Jina Reranker v3
  • Cascading deletion
  • Unlimited knowledge bases
  • Commercial license
Buy Pro License

Team

$99.00 Promo
$49.00 one-time

Enterprise controls for multi-tenant apps.

  • Everything in Pro
  • Per-document ACLs
  • Immutable audit logging
  • Usage quotas & rate limiting
  • Multi-org support
  • Priority support
  • Team license (up to 10 devs)
Buy Team License

Feedback themes

What Early Adopters Value

The one-command install plus sane defaults got us to production faster than wiring a custom RAG stack from scratch.

Integration Team

Common feedback theme @ Early adopter interviews

Hybrid search plus reranking materially improved answer relevance on noisy internal docs.

Platform Team

Common feedback theme @ Early adopter interviews

Document-level ACLs and audit trails were mandatory for enterprise evaluations.

Security Team

Common feedback theme @ Early adopter interviews

Large file support (PDF, images, audio, and video) reduced ingestion edge cases and support burden.

Data Ops Team

Common feedback theme @ Early adopter interviews

Usage quotas and tier controls gave us predictable cost boundaries before broad rollout.

Product Team

Common feedback theme @ Early adopter interviews

Graph-based retrieval helped surface relationships keyword search kept missing.

Applied AI Team

Common feedback theme @ Early adopter interviews

Ready to Add AI Memory?

Start with the free community tier. Upgrade when you need knowledge graphs, file processing, or enterprise controls.