Knowledge Base
SailFish lets you import documents to build a private knowledge base that the AI can search and reference. Combined with its multi-level memory system, the AI remembers what you teach it across sessions.
Importing Documents
Supported Formats
| Format | Extension | Notes |
|---|---|---|
| Markdown | .md | Best format — preserves structure |
| Plain Text | .txt | Good for logs, notes, etc. |
.pdf | Text extracted automatically | |
| Word | .docx | Text and tables extracted |
How to Import
- Click the Knowledge Base icon in the left sidebar
- Click “Import Document”
- Select the file(s) you want to import (multi-select supported)
- Wait for processing to complete
The processing pipeline includes:
- Chunking: Splits long documents into semantically coherent segments
- Vector embedding: Generates vector representations for each chunk (using a local embedding model)
- Indexing: Builds a vector index plus a BM25 keyword index, enabling both semantic and keyword search
Fully Local Processing
All document processing runs entirely on your machine using the built-in embedding model (Transformers.js). Nothing is uploaded to any external service. Your data never leaves your computer.
Import Tips
- Choose high-value documents: Project docs, runbooks, team guidelines — things you reference frequently
- Keep documents up to date: If content changes, delete the old version and re-import
- Markdown works best: Well-structured Markdown with clear headings produces the highest quality chunks
- Avoid very large files: Keep individual files under 10 MB for best performance
Using the Knowledge Base
Once documents are imported, the AI automatically searches the knowledge base when it needs information. You can also ask it to look things up directly:
According to our knowledge base, what is the deployment process?
Is there anything in the knowledge base about database backups?
Look up the API docs for the user authentication section.
How Retrieval Works
When the AI searches for information, it uses two retrieval methods simultaneously:
- Semantic search: Understands the meaning of your question and finds conceptually related content — even if the exact words differ
- Keyword search (BM25): Matches exact terms, ideal for specific names, configuration keys, and technical identifiers
Results from both methods are merged and ranked. The most relevant content is injected into the AI’s context.
Three-Level Memory System
SailFish’s AI has three memory tiers that mimic how human memory works:
L1 — Task Memory (Working Memory)
Complete conversation records from the most recent tasks, kept directly in the context window. Like something you just did — you remember every detail.
- Managed automatically; no manual action required
- More recent tasks retain more detail; older ones are progressively compressed
- 5-level progressive compression: from full conversation down to a one-sentence summary, maximizing information within the token budget
L2 — Knowledge Documents (Long-Term Factual Memory)
Information you explicitly teach the AI, or important facts the AI summarizes on its own. Loaded into context automatically at the start of every conversation.
For example, you tell the AI:
Remember: our production database is at 10.0.1.50:3306, username app_user
From that point on, the AI knows this in every conversation — no need to repeat it.
How it’s organized:
- Per-context: each server gets its own “host profile” recording OS, installed software, network config, etc.
- Without a terminal, it’s organized per user — storing personal preferences and work habits
- Only stores high-value, frequently reused facts; every entry is worth the tokens to auto-inject
What belongs in L2:
| Good fit | Not a good fit |
|---|---|
| Server IPs, ports, credentials | One-off troubleshooting sessions |
| Project deployment processes | Temporary test data |
| Team coding standards | Full log contents |
| Common command aliases | Specific details of a single operation |
L3 — Conversation Records (Episodic Memory)
Complete historical conversations are permanently saved. The AI can search them proactively using its tools. Like experiences you need to “think back” to recall.
You: How did we fix the nginx 502 error last time?
AI: [searches conversation history, finds relevant records, and summarizes]
L2 vs L3:
- L2 provides facts (“nginx listens on port 8080”), loaded automatically every time
- L3 provides experience (“how we debugged it last time”), loaded on demand via search
- With L3 as a safety net, L2 doesn’t need to store everything — it can stay focused on core facts
Managing the Knowledge Base
In the knowledge base panel you can:
- View imported documents — see the list and processing status
- Search the knowledge base — test retrieval quality
- Delete documents you no longer need (frees storage)
CLI Operations
You can also manage the knowledge base from the command line:
npm run sft -- knowledge:list # list imported documents
npm run sft -- knowledge:search "query" # search the knowledge base