Features
File Ingestion
Process 25+ file types including PDFs, images, audio, and video with automatic text extraction and AI analysis.
Overview
Memcity can process almost any file you throw at it. Upload a PDF, a PowerPoint presentation, a photo of a whiteboard, or even a video recording — Memcity extracts the text, chunks it, embeds it, and makes it searchable through the same RAG pipeline used for plain text.
The processing is automatic. You upload a file, tell Memcity to process it, and minutes later it's searchable via natural language queries.
Supported File Types
Documents
| Format | Extensions | How It's Processed |
|---|---|---|
| PDF | .pdf | Text extracted via Jina Reader. If text extraction fails (scanned PDFs), falls back to the AI gateway model for OCR. |
| Word | .docx, .doc | Processed through the AI gateway model which reads the document structure, headings, tables, and content. |
Plain Text
| Format | Extensions | How It's Processed |
|---|---|---|
| Text | .txt | Direct text extraction — no AI needed. |
| Markdown | .md | Direct extraction. Heading structure is preserved for citations. |
| HTML | .html | HTML tags are stripped, text content is extracted. |
| JSON | .json | Stringified and chunked. Useful for API responses, config files. |
| CSV | .csv | Parsed row by row. Each row or group of rows becomes a chunk. |
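The row-grouping idea for CSV can be sketched in a few lines. This is an illustration of the technique, not Memcity's internal code, and the default of 20 rows per chunk is an arbitrary assumption:

```typescript
// Illustrative sketch: split CSV text into chunks of N data rows,
// repeating the header row in every chunk so each chunk stays
// self-describing when embedded and retrieved independently.
function chunkCsv(csv: string, rowsPerChunk: number = 20): string[] {
  const lines = csv.trim().split("\n");
  if (lines.length === 0) return [];
  const [header, ...rows] = lines;
  const chunks: string[] = [];
  for (let i = 0; i < rows.length; i += rowsPerChunk) {
    chunks.push([header, ...rows.slice(i, i + rowsPerChunk)].join("\n"));
  }
  return chunks;
}
```

Repeating the header in every chunk matters because each chunk is embedded and searched on its own; a chunk of bare cell values without column names is much harder to retrieve by meaning.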
Spreadsheets
| Format | Extensions | How It's Processed |
|---|---|---|
| Excel | .xlsx, .xls | The AI gateway model reads each sheet, extracting tabular data with headers and cell values. Formulas are evaluated to their results. |
Presentations
| Format | Extensions | How It's Processed |
|---|---|---|
| PowerPoint | .pptx | Each slide is processed individually. The AI model reads slide titles, bullet points, and any text in shapes. Speaker notes are included. |
Images
| Format | Extensions | How It's Processed |
|---|---|---|
| Images | .png, .jpg, .webp, .gif, .heic, .heif | The AI gateway model's vision capability performs OCR (text extraction from images) AND generates a description of the visual content. A whiteboard photo gets both the text on the board and a description of the diagrams. |
Audio
| Format | Extensions | How It's Processed |
|---|---|---|
| Audio | .mp3, .wav, .m4a, .ogg, .flac, .aac, .webm | The AI gateway model transcribes the audio to text. Speaker diarization (who said what) is included when the model supports it. |
Video
| Format | Extensions | How It's Processed |
|---|---|---|
| Video | .mp4, .webm, .mov, .avi, .mkv | The AI gateway model extracts both the audio transcript and visual content descriptions. For a presentation recording, you get the spoken words plus the slide content. |
File Upload Flow
Processing a file takes three steps: generate an upload URL, upload the file, then trigger processing. Here's the complete flow:
Step 1: Generate an Upload URL
Convex uses presigned URLs for file uploads. This gives you a temporary URL that your frontend can upload directly to — no need to proxy through your server.
// convex/files.ts
import { action } from "./_generated/server";
import { Memory } from "memcity";
import { components } from "./_generated/api";

const memory = new Memory(components.memcity, {
  tier: "pro",
  ai: { gateway: "openrouter", model: "google/gemini-2.0-flash-001" },
});

export const getUploadUrl = action({
  args: {},
  handler: async (ctx) => {
    // Returns a temporary URL valid for ~1 hour
    const uploadUrl = await memory.getUploadUrl(ctx);
    return uploadUrl;
  },
});
Step 2: Upload the File from Your Frontend
// In your React component
// (assumes getUploadUrl is the Convex action bound via useAction)
async function handleFileUpload(file: File) {
  // Get the presigned upload URL from Convex
  const uploadUrl = await getUploadUrl();

  // Upload directly to Convex storage
  const response = await fetch(uploadUrl, {
    method: "POST",
    headers: { "Content-Type": file.type },
    body: file,
  });
  const { storageId } = await response.json();
  return storageId;
}
Step 3: Process the Uploaded File
// convex/files.ts
import { v } from "convex/values";

export const processFile = action({
  args: {
    storageId: v.id("_storage"),
    fileName: v.string(),
    orgId: v.string(),
    knowledgeBaseId: v.string(),
  },
  handler: async (ctx, args) => {
    // This triggers the full processing pipeline:
    // 1. Detect file type from extension/MIME type
    // 2. Extract text using the appropriate processor
    // 3. Chunk the extracted text
    // 4. Generate embeddings for each chunk
    // 5. Index for search
    // 6. Extract entities and relationships
    const result = await memory.processUploadedFile(ctx, {
      orgId: args.orgId,
      knowledgeBaseId: args.knowledgeBaseId,
      storageId: args.storageId,
      fileName: args.fileName,
    });
    return result;
    // { success: true, chunkCount: 47, documentId: "..." }
  },
});
Complete Frontend Example
Putting it all together with a drag-and-drop upload:
import { useState } from "react";
import { useAction } from "convex/react";
import { api } from "../convex/_generated/api";

function FileUploader({ orgId, kbId }: { orgId: string; kbId: string }) {
  const getUploadUrl = useAction(api.files.getUploadUrl);
  const processFile = useAction(api.files.processFile);
  const [uploading, setUploading] = useState(false);

  async function onDrop(files: File[]) {
    setUploading(true);
    for (const file of files) {
      // 1. Get upload URL
      const uploadUrl = await getUploadUrl();

      // 2. Upload file
      const res = await fetch(uploadUrl, {
        method: "POST",
        headers: { "Content-Type": file.type },
        body: file,
      });
      const { storageId } = await res.json();

      // 3. Process file
      await processFile({
        storageId,
        fileName: file.name,
        orgId,
        knowledgeBaseId: kbId,
      });
      console.log(`Processed ${file.name}`);
    }
    setUploading(false);
  }

  return (
    <div
      // dragover must be cancelled, or the browser never fires drop
      onDragOver={(e) => e.preventDefault()}
      onDrop={(e) => {
        e.preventDefault();
        onDrop(Array.from(e.dataTransfer.files));
      }}
    >
      {uploading ? "Processing..." : "Drop files here"}
    </div>
  );
}
Batch Ingestion
For multiple text documents, use batchIngest to process them all in one call:
await memory.batchIngest(ctx, {
  orgId,
  knowledgeBaseId: kbId,
  documents: [
    { text: "First document content...", source: "doc1.md" },
    { text: "Second document content...", source: "doc2.md" },
    { text: "Third document content...", source: "doc3.md" },
  ],
});
This is more efficient than calling ingestText in a loop because it batches the embedding API calls.
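The efficiency gain comes from grouping texts into fewer embedding requests. A minimal sketch of that batching pattern, where embedMany is a stand-in for whatever embedding API is configured (not a Memcity function) and the batch size of 32 is an arbitrary assumption:

```typescript
// Stand-in embedding call: one network round trip per invocation,
// regardless of how many texts it carries.
type EmbedFn = (texts: string[]) => Promise<number[][]>;

// Group texts into batches so N documents cost ceil(N / batchSize)
// API calls instead of N calls in a loop.
async function embedInBatches(
  texts: string[],
  embedMany: EmbedFn,
  batchSize: number = 32
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);
    vectors.push(...(await embedMany(batch)));
  }
  return vectors;
}
```

With 300 documents and a batch size of 32, that is 10 embedding calls instead of 300, which is where most of the wall-clock savings over an ingestText loop comes from.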
URL Ingestion
Ingest web pages directly from a URL:
await memory.ingestUrl(ctx, {
  orgId,
  knowledgeBaseId: kbId,
  url: "https://docs.example.com/getting-started",
});
SSRF Protection: Memcity validates URLs before fetching to prevent Server-Side Request Forgery attacks. Internal IPs (127.0.0.1, 10.x.x.x, 192.168.x.x) and private hostnames are blocked.
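The kind of check this implies can be sketched as follows. This is a simplified illustration of the idea, not Memcity's actual validator; a production guard must also resolve DNS and re-validate after redirects to defeat rebinding tricks:

```typescript
// Simplified SSRF guard: reject non-HTTP(S) schemes, localhost-style
// hostnames, and private/loopback/link-local IPv4 ranges.
function isSafeUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".local")) return false;
  const privatePatterns = [
    /^127\./, // loopback
    /^10\./, // RFC 1918
    /^192\.168\./, // RFC 1918
    /^172\.(1[6-9]|2\d|3[01])\./, // RFC 1918: 172.16.0.0/12
    /^169\.254\./, // link-local
    /^0\./,
  ];
  return !privatePatterns.some((p) => p.test(host));
}
```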
What Happens Inside the Processing Pipeline
When you process a PDF, here's what happens step by step:
- File detection — Memcity reads the file extension and MIME type to determine the processor.
- Text extraction — For PDF, Jina Reader extracts the text content. If it fails (e.g., the PDF is a scanned image), the AI gateway model performs OCR.
- Text cleaning — Extra whitespace, headers/footers, and formatting artifacts are removed.
- Chunking — The cleaned text is split into chunks using the configured strategy (recursive by default, ~512 tokens each).
- Embedding — Each chunk is sent to Jina v4 to generate a 1,024-dimensional vector embedding.
- Indexing — Vectors are stored in Convex's vector index. Raw text is indexed for BM25 keyword search.
- Entity extraction (Pro+) — The LLM identifies entities and relationships in each chunk, adding them to the knowledge graph.
- Metadata — Source file name, page numbers, and heading structure are stored for citation generation.
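The recursive strategy in step 4 can be sketched like this. It is an illustration of the general technique rather than Memcity's actual splitter, and it approximates one token as four characters:

```typescript
// Recursive character splitter: try the coarsest separator first
// (paragraph breaks) and fall back to finer ones (lines, sentences,
// words) only for pieces that are still over budget.
const SEPARATORS = ["\n\n", "\n", ". ", " "];
const MAX_TOKENS = 512;
const approxTokens = (text: string) => Math.ceil(text.length / 4);

// Merge adjacent small pieces back up to the token budget, so that
// splitting by words does not produce one chunk per word.
function mergePieces(pieces: string[], sep: string): string[] {
  const merged: string[] = [];
  let current = "";
  for (const piece of pieces) {
    const candidate = current ? current + sep + piece : piece;
    if (approxTokens(candidate) <= MAX_TOKENS) {
      current = candidate;
    } else {
      if (current) merged.push(current);
      current = piece;
    }
  }
  if (current) merged.push(current);
  return merged;
}

function recursiveChunk(text: string, sepIndex: number = 0): string[] {
  if (approxTokens(text) <= MAX_TOKENS) {
    return text.trim() ? [text.trim()] : [];
  }
  if (sepIndex >= SEPARATORS.length) {
    // No separator left: hard-split by the character budget.
    const size = MAX_TOKENS * 4;
    const parts: string[] = [];
    for (let i = 0; i < text.length; i += size) {
      parts.push(text.slice(i, i + size));
    }
    return parts;
  }
  const sep = SEPARATORS[sepIndex];
  const pieces = text.split(sep).flatMap((p) => recursiveChunk(p, sepIndex + 1));
  return mergePieces(pieces, sep);
}
```

The point of the recursion is that chunk boundaries prefer to fall on paragraph breaks, then line breaks, then sentence ends, so chunks stay semantically coherent instead of cutting mid-thought.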
File Size Limits
- Maximum file size: 100MB per file
- Maximum files per batch: No hard limit, but processing is sequential within a batch
- Recommended: For very large files (over 50MB), consider splitting them into smaller pieces before upload
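These limits are easy to enforce client-side before requesting an upload URL. A small sketch mirroring the numbers above (the helper and its return shape are illustrative, not part of the Memcity API):

```typescript
// Guard against the documented 100MB per-file limit before uploading.
// The 50MB "warn" threshold mirrors the splitting recommendation above.
const MAX_FILE_BYTES = 100 * 1024 * 1024;
const WARN_FILE_BYTES = 50 * 1024 * 1024;

type SizeCheck = { ok: boolean; warning?: string; error?: string };

function checkFileSize(name: string, sizeBytes: number): SizeCheck {
  if (sizeBytes > MAX_FILE_BYTES) {
    return { ok: false, error: `${name} exceeds the 100MB limit` };
  }
  if (sizeBytes > WARN_FILE_BYTES) {
    return { ok: true, warning: `${name} is over 50MB; consider splitting it` };
  }
  return { ok: true };
}
```

Rejecting oversized files in the browser saves the round trip of uploading 100MB+ only to have processing fail server-side.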
Error Handling
File processing can fail for various reasons. Always handle errors:
try {
  const result = await memory.processUploadedFile(ctx, {
    orgId,
    knowledgeBaseId: kbId,
    storageId,
    fileName: "report.pdf",
  });
  console.log(`Processed: ${result.chunkCount} chunks created`);
} catch (error) {
  // In TypeScript a caught error is `unknown`; narrow it before use
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("Unsupported file type")) {
    // File format not supported
  } else if (message.includes("File too large")) {
    // Over 100MB limit
  } else if (message.includes("Text extraction failed")) {
    // Could not extract text — corrupted file?
  } else {
    // Unexpected error
    throw error;
  }
}
Availability
| Feature | Community | Pro | Team |
|---|---|---|---|
| Text ingestion (ingestText) | Yes | Yes | Yes |
| URL ingestion (ingestUrl) | - | Yes | Yes |
| File upload + processing | - | Yes | Yes |
| Batch ingestion | - | Yes | Yes |
| All 25+ file types | - | Yes | Yes |