Memcity is an enterprise-grade AI memory component for Convex applications. It provides vector search, knowledge graph, episodic memory, and a 16-step RAG pipeline in a single installable package.

How do I install Memcity?

Install Memcity with a single command: npx shadcn add @memcity/memory-search. Then register it in your Convex config and create a Memory instance with your API keys.

What file types does Memcity support?

Memcity supports 25+ file types including PDF, DOCX, DOC, TXT, MD, HTML, JSON, CSV, XLSX, PPTX, PNG, JPG, MP3, WAV, MP4, MOV, and more.

How much does Memcity cost?

Community tier is free. Pro and Team are one-time licenses. See the pricing cards for the current price and offer.

What is the 16-step RAG pipeline?

The pipeline includes: quota check, cache, query routing, decomposition, query expansion, HyDE, embedding, hybrid search (semantic + BM25), RRF fusion, ACL filtering, GraphRAG traversal, reranking, chunk expansion, citations, RAPTOR summaries, and format+cache.

v0.7.0 — STABLE

AI Memory for Convex Apps

Vector search, knowledge graph, episodic memory, and a production-grade 16-step RAG pipeline. One component. One line to install.

Get Started

npx shadcn add @memcity/memory-search

16 Pipeline Steps

25+ File Types

1024d Embeddings

convex/memory.ts

import { Memory } from "memcity";
import { components } from "./_generated/api";

const memory = new Memory(components.memcity, {
  tier: "pro",
  ai: {
    gateway: "openrouter",
    model: "google/gemini-2.0-flash-001",
  },
});

// Ingest a document
await memory.ingest({
  text: documentContent,
  knowledgeBaseId: "kb_engineering",
  source: "design-doc.md",
});

// Search with the full 16-step pipeline
const results = await memory.search({
  query: "How does authentication work?",
  knowledgeBaseId: "kb_engineering",
});

How it works

Three Steps to AI Memory

From raw documents to intelligent search results. See what happens under the hood.

01 INGEST

Feed your documents

PDF, Markdown, images, audio, video — 25+ file types. Text is chunked with configurable overlap and heading preservation.
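As a rough illustration of chunking with overlap and heading preservation, the sketch below splits Markdown-style text into fixed-size character windows and tags each chunk with its nearest heading. The `chunkText` function, its `size` and `overlap` parameters, and the character-based sizing are assumptions made for this example, not Memcity's actual chunker.

```typescript
// Hypothetical sketch: not Memcity's actual chunker or API.
interface Chunk {
  heading: string; // nearest preceding Markdown heading, "" if none
  text: string;
}

function chunkText(doc: string, size = 200, overlap = 40): Chunk[] {
  if (overlap >= size) throw new Error("overlap must be smaller than size");
  const chunks: Chunk[] = [];
  let heading = "";
  let buffer = "";

  // Emit the buffered section as windows of `size` chars sharing `overlap` chars.
  const flush = (): void => {
    const body = buffer.trim();
    buffer = "";
    if (body.length === 0) return;
    const step = size - overlap;
    for (let start = 0; start < body.length; start += step) {
      chunks.push({ heading, text: body.slice(start, start + size) });
      if (start + size >= body.length) break; // last window reached the end
    }
  };

  for (const line of doc.split("\n")) {
    if (/^#{1,6}\s/.test(line)) {
      flush(); // close the previous section before switching headings
      heading = line.replace(/^#{1,6}\s+/, "").trim();
    } else {
      buffer += line + "\n";
    }
  }
  flush();
  return chunks;
}
```

Keeping the heading on every chunk means a retrieved fragment can still be traced back to the section it came from, even when the heading text itself falls in a different window.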

02 PROCESS

Automatic intelligence

Each chunk gets embedded (Jina v4, 1024d), indexed for BM25, and optionally analyzed for entities and relationships.

processing — chunk_042

$ memcity process chunk_042
[embed] Jina v4 encoding... 1024 dimensions    28ms
[embed] [0.234, -0.891, 0.456, ..., 0.123]
[index] BM25 tokenize... 487 tokens indexed    12ms
[index] vector stored in knowledge_base_01
[ner]   entities: AuthModule, User, Session, JWT, OAuth2
[graph] relationships:
        AuthModule --manages--> User
        User --creates--> Session
        AuthModule --issues--> JWT
[done]  chunk_042 processed in 94ms
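To make the BM25 indexing step more concrete, here is a minimal sketch of Okapi BM25 scoring over a handful of in-memory documents. The `tokenize` and `bm25Scores` helpers and the defaults `k1 = 1.2` and `b = 0.75` are assumptions chosen for illustration; the page does not say how Memcity tokenizes or which parameters it uses.

```typescript
// Hypothetical sketch of Okapi BM25 scoring; not Memcity's indexer.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

function bm25Scores(query: string, docs: string[], k1 = 1.2, b = 0.75): number[] {
  const tokenized = docs.map(tokenize);
  const totalLen = tokenized.reduce((sum, d) => sum + d.length, 0);
  const avgLen = totalLen / Math.max(1, docs.length);
  const terms = tokenize(query);

  return tokenized.map((doc) => {
    let score = 0;
    for (const term of terms) {
      // Term frequency in this document.
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) continue;
      // Document frequency across the corpus.
      const df = tokenized.filter((d) => d.indexOf(term) !== -1).length;
      const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
      // Length normalisation: longer-than-average documents are penalised.
      const norm = tf + k1 * (1 - b + b * (doc.length / avgLen));
      score += (idf * tf * (k1 + 1)) / norm;
    }
    return score;
  });
}
```

A keyword score like this complements the embedding score: it rewards exact term matches (identifiers, error codes) that a dense vector may blur.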

03 SEARCH

16-step retrieval

Query routing, HyDE, hybrid search, RRF fusion, reranking, chunk expansion, citations, and caching. All configurable.

search — pipeline

$ memory.search("How does authentication work?")

[01] quota_check    -- 2ms
[02] cache_lookup   -- 1ms
[03] query_routing  -- 15ms
[04] decomposition  -- 45ms
[05] query_expand   -- 32ms
[06] hyde_generate  -- 120ms
[07] embedding      -- 28ms
[08] hybrid_search  -- 85ms
[09] rrf_fusion     -- 3ms
[10] acl_filter     -- 5ms
[11] graphrag       -- 95ms
[12] reranking      -- 110ms
[13] chunk_expand   -- 45ms
[14] citations      -- 12ms
[15] raptor         -- 65ms
[16] format+cache   -- 4ms

8 results in 667ms — $0.003

0.94  auth.md     p.12 L34
0.91  login.ts    L56-78
0.87  session.md  p.8 L12

Architecture

Four Subsystems, One Component

Vector search, knowledge graph, episodic memory, and RAG pipeline — all connected through Convex's real-time database.

Semantic + BM25

Hybrid vector search with weighted RRF fusion for best retrieval quality.
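Weighted RRF (reciprocal rank fusion) can be sketched in a few lines: each ranked list contributes `weight / (k + rank)` to a document's fused score. The `rrfFuse` function, the `weight` field, and the constant `k = 60` (a commonly used default) are assumptions for this example and are not taken from Memcity's source.

```typescript
// Hypothetical sketch of weighted reciprocal rank fusion; not Memcity's API.
interface RankedList {
  ids: string[];  // document ids, best match first
  weight: number; // relative trust in this retriever
}

interface FusedResult {
  id: string;
  score: number;
}

function rrfFuse(rankings: RankedList[], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const { ids, weight } of rankings) {
    ids.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

Because only ranks are used, the semantic and BM25 scores never need to be put on a common scale; the weights alone decide how much each retriever counts.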

Entity Extraction

Automatic NER with BFS, best-first, and hybrid graph traversal strategies.

Per-User Memory

Episodic memory with temporal decay and consolidation over conversations.
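The page does not specify the decay function, so the following is only one plausible reading of temporal decay: an exponential half-life applied to each memory's importance. The `EpisodicItem` shape, the `decayedScore` function, and the half-life parameter are all hypothetical.

```typescript
// Hypothetical sketch: exponential half-life decay is an assumption,
// not a documented Memcity behaviour.
interface EpisodicItem {
  text: string;
  importance: number; // base weight assigned at write time
  createdAt: number;  // epoch milliseconds
}

function decayedScore(item: EpisodicItem, now: number, halfLifeMs: number): number {
  const age = Math.max(0, now - item.createdAt);
  // The score halves every `halfLifeMs` milliseconds.
  return item.importance * Math.pow(0.5, age / halfLifeMs);
}

function rankMemories(items: EpisodicItem[], now: number, halfLifeMs: number): EpisodicItem[] {
  // Copy before sorting so the caller's array is left untouched.
  return items
    .slice()
    .sort((a, b) => decayedScore(b, now, halfLifeMs) - decayedScore(a, now, halfLifeMs));
}
```

Under this scheme an old but important memory can still outrank a recent trivial one, which is the usual motivation for combining importance with age.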

Configurable Pipeline

Every step is a toggle. Disable what you don't need, tune what you do.
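One way to picture "every step is a toggle" is a pipeline that runs an ordered list of steps and skips any that are disabled. The `PipelineStep` interface, the `runPipeline` function, and the toy step bodies below are invented for illustration; they do not reflect Memcity's configuration format.

```typescript
// Hypothetical sketch of toggleable pipeline steps; not Memcity's config API.
interface PipelineStep {
  name: string;
  run: (queries: string[]) => string[];
}

interface PipelineResult {
  queries: string[];
  trace: string[]; // names of the steps that actually ran
}

function runPipeline(steps: PipelineStep[], disabled: Set<string>, query: string): PipelineResult {
  let queries = [query];
  const trace: string[] = [];
  for (const step of steps) {
    if (disabled.has(step.name)) continue; // toggled off: skip entirely
    queries = step.run(queries);
    trace.push(step.name);
  }
  return { queries, trace };
}

// Toy stand-ins for three of the steps named on this page.
const demoSteps: PipelineStep[] = [
  { name: "query_expansion", run: (qs) => qs.concat(qs.map((q) => q + " (rephrased)")) },
  { name: "hyde", run: (qs) => qs.concat(["hypothetical answer for: " + qs[0]]) },
  { name: "format", run: (qs) => qs.map((q) => q.trim()) },
];
```

Calling `runPipeline(demoSteps, new Set(["hyde"]), "How does authentication work?")` runs only the expansion and format steps, so the HyDE latency and cost are avoided for that query.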

Capabilities

Everything You Need

From basic vector search to enterprise-grade knowledge management, all in a single Convex component.

pro

16-Step RAG Pipeline

From query routing to cached results, every search passes through a production-grade retrieval pipeline you can configure per-query.

Query routing classifies complexity automatically
HyDE generates hypothetical answer documents
Jina Reranker v3 for precision ranking
Chunk expansion fetches surrounding context
Citations with page/line/heading breadcrumbs
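Chunk expansion, as listed above, can be sketched as widening each hit to include its neighbours in the source document. The `expandChunks` function and its `window` parameter are hypothetical names; the page does not state how many surrounding chunks Memcity fetches.

```typescript
// Hypothetical sketch of chunk expansion; not Memcity's implementation.
function expandChunks(hitIndexes: number[], totalChunks: number, window = 1): number[] {
  const selected = new Set<number>();
  for (const hit of hitIndexes) {
    // Clamp the neighbourhood to the document's bounds.
    const first = Math.max(0, hit - window);
    const last = Math.min(totalChunks - 1, hit + window);
    for (let i = first; i <= last; i++) selected.add(i);
  }
  // Return unique chunk positions in reading order.
  return Array.from(selected).sort((a, b) => a - b);
}
```

Using a set means overlapping neighbourhoods from adjacent hits are merged instead of being returned twice.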

pro

Knowledge Graph

Automatic entity extraction with relationship tracking. Your documents become a navigable graph of connected concepts.

BFS, best-first, and hybrid traversal strategies
Entity deduplication and merging
Relationship strength scoring
GraphRAG traversal augments search results
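The BFS strategy mentioned above can be illustrated with a depth-limited breadth-first walk over entity relationships. The `Edge` type and `bfsNeighbors` function are assumptions for this sketch, and treating edges as undirected is a simplification; Memcity's actual graph schema is not shown on this page.

```typescript
// Hypothetical sketch of depth-limited BFS over an entity graph.
interface Edge {
  from: string;
  relation: string;
  to: string;
}

// Returns every entity reachable within `maxDepth` hops, with its hop count.
function bfsNeighbors(edges: Edge[], start: string, maxDepth: number): Map<string, number> {
  // Build an undirected adjacency list (a simplifying assumption).
  const adjacency = new Map<string, string[]>();
  for (const { from, to } of edges) {
    adjacency.set(from, (adjacency.get(from) ?? []).concat([to]));
    adjacency.set(to, (adjacency.get(to) ?? []).concat([from]));
  }

  const depth = new Map<string, number>([[start, 0]]);
  const queue: string[] = [start];
  while (queue.length > 0) {
    const node = queue.shift() as string;
    const d = depth.get(node) as number;
    if (d === maxDepth) continue; // do not expand past the hop limit
    for (const next of adjacency.get(node) ?? []) {
      if (!depth.has(next)) {
        depth.set(next, d + 1);
        queue.push(next);
      }
    }
  }
  return depth;
}
```

A best-first variant would replace the FIFO queue with a priority queue ordered by a relevance or relationship-strength score.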

pro

25+ File Types

PDF, DOCX, images (OCR), audio transcription, video, spreadsheets, and more. All processed automatically.

PDF via Jina Reader with gateway fallback
Images: OCR + description via vision model
Audio/video: automatic transcription
CSV, XLSX, PPTX: structured extraction

Episodic Memory

Per-user memory with decay and consolidation. Your AI remembers conversations over time.

pro

RAPTOR Summaries

Recursive abstractive processing for tree-organized retrieval. High-level answers from large corpora.

pro

Vector + BM25 Search

Semantic embeddings plus BM25 keyword search with weighted RRF fusion.

Enterprise ACL

Fine-grained document-level permissions with principal hierarchies for users, roles, and groups.

team
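Document-level permissions with principal hierarchies can be sketched as two steps: expand a user into every principal they inherit, then keep only documents that allow at least one of those principals. The `"user:"`, `"group:"`, and `"role:"` prefixes, the `memberOf` map, and both function names are hypothetical conventions for this example.

```typescript
// Hypothetical sketch of ACL filtering; the principal naming scheme is assumed.
interface AclDoc {
  id: string;
  allow: string[]; // principals permitted to read this document
}

// `memberOf` maps a principal to the principals it directly belongs to.
function expandPrincipals(principal: string, memberOf: Record<string, string[]>): Set<string> {
  const seen = new Set<string>([principal]);
  const stack: string[] = [principal];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    for (const parent of memberOf[current] ?? []) {
      if (!seen.has(parent)) {
        seen.add(parent); // inherited membership, possibly several levels up
        stack.push(parent);
      }
    }
  }
  return seen;
}

function filterByAcl(docs: AclDoc[], principals: Set<string>): AclDoc[] {
  return docs.filter((doc) => doc.allow.some((p) => principals.has(p)));
}
```

Applying a filter like this before reranking keeps documents the caller cannot read out of every later, more expensive step.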

Audit Logging

Immutable audit trail for every search, ingest, and deletion. Full compliance support.

team

Usage Quotas

Per-organization document limits, storage caps, and API rate limiting with automatic enforcement.

team
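Per-organization rate limiting could be enforced in many ways; the sketch below uses a simple fixed-window counter keyed by organization id. The `FixedWindowLimiter` class, its constructor arguments, and the choice of a fixed window are assumptions, since the page does not describe Memcity's enforcement mechanism.

```typescript
// Hypothetical sketch of per-organization rate limiting (fixed window).
interface WindowState {
  start: number; // window start, epoch milliseconds
  count: number; // requests admitted in this window
}

class FixedWindowLimiter {
  private windows = new Map<string, WindowState>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // `now` is passed in so the limiter stays deterministic and easy to test.
  allow(orgId: string, now: number): boolean {
    let state = this.windows.get(orgId);
    if (state === undefined || now - state.start >= this.windowMs) {
      state = { start: now, count: 0 }; // open a fresh window
      this.windows.set(orgId, state);
    }
    if (state.count >= this.limit) return false;
    state.count += 1;
    return true;
  }
}
```

Each organization gets its own counter, so one tenant exhausting its budget does not affect another.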

Under the hood

16-Step Search Pipeline

Every query passes through a production-grade retrieval pipeline. Here's what happens when you call memory.search():

$ memcity pipeline --verbose

[01] quota_check      -- 2ms    PASS
[02] cache_lookup     -- 1ms    MISS
[03] query_routing    -- 15ms   complexity=high
[04] decomposition    -- 45ms   sub_queries=3
[05] query_expansion  -- 32ms   variations=4
[06] hyde_generation  -- 120ms  hypothetical_docs=2
[07] embedding        -- 28ms   dimensions=1024
[08] hybrid_search    -- 85ms   semantic=50 bm25=50
[09] rrf_fusion       -- 3ms    candidates=42
[10] acl_filter       -- 5ms    permitted=38
[11] dedup+graphrag   -- 95ms   entities=12 edges=28
[12] reranking        -- 110ms  jina_v3
[13] chunk_expansion  -- 45ms   expanded=8
[14] citations        -- 12ms   breadcrumbs=8
[15] raptor_summary   -- 65ms   summaries=3
[16] format+cache     -- 4ms    cached=true

total: 667ms | tokens: 3,420 | cost: $0.0028

Pricing

Simple, One-Time Pricing

One-time payment. Lifetime updates. No subscriptions.

Community

Perfect for prototyping and small projects.

Hybrid vector search (semantic + BM25)
RRF fusion with configurable weights
Basic text ingestion & chunking
1 knowledge base
Caching & analytics
Apache 2.0 license


Pro

$49.00 Promo

$2.00 one-time

Full pipeline for production applications.

Everything in Community
16-step RAG pipeline
Knowledge graph (GraphRAG)
Episodic memory (per-user)
25+ file type processing
RAPTOR summaries
Jina Reranker v3
Cascading deletion
Unlimited knowledge bases
Commercial license


Team

$99.00 Promo

$49.00 one-time

Enterprise controls for multi-tenant apps.

Everything in Pro
Per-document ACLs
Immutable audit logging
Usage quotas & rate limiting
Multi-org support
Priority support
Team license (up to 10 devs)


Feedback themes

What Early Adopters Value

“The one-command install plus sane defaults got us to production faster than wiring a custom RAG stack from scratch.”

Integration Team