Build a Knowledge Base with 3 Commands: Your Notes Can Finally Be Searched by Meaning

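I built a knowledge base that runs on my own machine. It scans files automatically, supports semantic search, updates on a schedule, and costs nothing in API fees.

I used to search my notes by keyword: a search for "AI怎么记住对话" ("how does AI remember a conversation") returned 0 results. With semantic search, the same query hits immediately. Below is the complete setup, and every step can be copied and run as is.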

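Why you need a knowledge base

As notes pile up, Ctrl+F stops finding things. Keyword matching only sees the literal characters, not the meaning.

A search for "AI怎么记住之前的对话" ("how does AI remember earlier conversations") will not find a note titled "记忆系统架构" ("memory system architecture"), even though the two are about the same thing. That is the fatal flaw of keyword search.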
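To make the difference concrete, here is a minimal sketch. The three-dimensional vectors are made-up stand-ins for real embeddings (an assumption for illustration only); a real model produces hundreds or thousands of dimensions.

```python
import math

def keyword_match(query: str, doc: str) -> bool:
    """Literal substring match, which is what Ctrl+F does."""
    return query in doc

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; values near 1.0 mean similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query = "AI怎么记住之前的对话"
doc = "记忆系统架构"

# Hypothetical toy embeddings, hand-picked so that related texts point the same way.
query_vec = [0.9, 0.1, 0.3]
doc_vec = [0.8, 0.2, 0.4]
unrelated_vec = [0.0, 1.0, 0.0]

print(keyword_match(query, doc))  # False: no shared literal string
print(cosine_similarity(query_vec, doc_vec)
      > cosine_similarity(query_vec, unrelated_vec))  # True: closer in meaning
```

Semantic search ranks documents by this kind of vector closeness instead of by shared characters.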

Step 1: Install MemPalace (5 minutes)

MemPalace is a local-first AI memory system that stores its vector data in ChromaDB.

# One-line install
pip install mempalace

# Verify
mempalace status

Output like the following means the install worked:

Palace Status:
  Drawers filed: 0
  Total memories: 0

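Installation creates the ~/.mempalace/ directory automatically. The default embedding model is all-MiniLM-L6-v2 (an English model, 384 dimensions); a later step replaces it with a Chinese-capable model.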

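Step 2: Install Ollama and a Chinese embedding model (10 minutes)

The default English model hits only about 50% of the time on Chinese text. With a Chinese-capable model, it hit 100% in my tests.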

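# 1. Install Ollama (local inference engine)
curl -fsSL https://ollama.com/install.sh | sh

# 2. Start the service (skip this if the installer already started it as a system service)
ollama serve &

# 3. Pull the Chinese embedding model (about 4.7 GB)
ollama pull qwen3-embedding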

Check that Ollama is running:

curl http://localhost:11434/api/embeddings \
  -d '{"model":"qwen3-embedding","prompt":"测试中文嵌入"}'

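The response should contain an array of 4096 floats.

Pitfall: the /api/embeddings endpoint used here reads the text from the prompt field, not input (Ollama's newer /api/embed endpoint is the one that takes input). ChromaDB's bundled Ollama adapter uses the wrong field; a later step shows how to work around it.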
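A small helper can catch this pitfall early. The sketch below builds the request body and checks the response; the function names are hypothetical, and the expected size of 4096 is the figure quoted above. It runs offline against a fake response, so it only illustrates the check, not the HTTP call.

```python
import json

EXPECTED_DIM = 4096  # dimension quoted above for qwen3-embedding

def build_payload(model: str, text: str) -> str:
    """Request body for /api/embeddings, which reads the text from `prompt`."""
    return json.dumps({"model": model, "prompt": text}, ensure_ascii=False)

def extract_embedding(response: dict, expected_dim: int = EXPECTED_DIM) -> list:
    """Return the vector, or raise if it is missing, empty, or the wrong size."""
    vec = response.get("embedding")
    if not vec:
        raise ValueError("no embedding returned; check the model name and the `prompt` field")
    if len(vec) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(vec)}")
    return vec

# Offline demonstration with a fake response of the right size.
fake_response = {"embedding": [0.0] * EXPECTED_DIM}
print(len(extract_embedding(fake_response)))
print(build_payload("qwen3-embedding", "测试中文嵌入"))
```

Failing with a clear message here is easier to debug than an IndexError raised later inside ChromaDB.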

Step 3: Initialize the knowledge base (3 minutes)

Edit ~/.mempalace/config.json:

{
  "palace_path": "/root/.mempalace/palace",
  "collection_name": "mempalace_drawers",
  "topic_wings": [
    "project",
    "technical",
    "personal",
    "creative",
    "work"
  ]
}

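topic_wings defines the partitions of the knowledge base, much like category labels on a bookshelf. Change them as needed. The palace_path above assumes you are running as root; point it at your own home directory otherwise.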
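If you tag your own records with a wing, it helps to validate the label against this list. The sketch below is only an illustration: choose_wing is a hypothetical helper, and how MemPalace itself consumes topic_wings is not covered here.

```python
import json

# Same structure as the config.json above, inlined so the sketch runs on its own.
CONFIG_TEXT = """
{
  "palace_path": "/root/.mempalace/palace",
  "collection_name": "mempalace_drawers",
  "topic_wings": ["project", "technical", "personal", "creative", "work"]
}
"""

def choose_wing(config: dict, requested: str, fallback: str = "personal") -> str:
    """Return `requested` if it is a configured wing, else `fallback` (hypothetical helper)."""
    wings = config.get("topic_wings", [])
    return requested if requested in wings else fallback

config = json.loads(CONFIG_TEXT)
print(choose_wing(config, "technical"))  # technical
print(choose_wing(config, "recipes"))    # personal
```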

Write the initialization script, init_palace.py:

#!/usr/bin/env python3
"""Initialize the MemPalace vector collection with the Qwen3 embedding model."""
import chromadb
import requests

class Qwen3Embedding:
    """Call Ollama directly to work around the prompt/input bug in ChromaDB's Ollama adapter."""
    def __init__(self, url="http://localhost:11434", model="qwen3-embedding"):
        self.url = url
        self.model = model

    # Recent ChromaDB versions require this parameter to be named `input`
    def __call__(self, input):
        embeddings = []
        for t in input:
            r = requests.post(
                f"{self.url}/api/embeddings",
                json={"model": self.model, "prompt": t},
                timeout=60
            )
            r.raise_for_status()  # fail loudly on HTTP errors
            embeddings.append(r.json()["embedding"])
        return embeddings

client = chromadb.PersistentClient(path="/root/.mempalace/palace")
ef = Qwen3Embedding()

collection = client.get_or_create_collection(
    name="mempalace_drawers",
    embedding_function=ef,
    metadata={"hnsw:space": "cosine"}
)

print(f"Collection ready, {collection.count()} records so far")

Run it:

python3 init_palace.py

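Important: if you previously created the collection with the default model (384 dimensions), you must delete it and rebuild. Qwen3 vectors have 4096 dimensions, and ChromaDB raises an error when the dimensions differ.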

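# Rebuild (this wipes all existing data)
python3 -c "
import chromadb
client = chromadb.PersistentClient(path='/root/.mempalace/palace')
for c in client.list_collections():
    name = c if isinstance(c, str) else c.name  # some ChromaDB versions return plain names
    client.delete_collection(name)
    print(f'Deleted: {name}')
"
# Then re-run init_palace.py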

Step 4: Store your first piece of knowledge (2 minutes)

#!/usr/bin/env python3
"""Store one piece of knowledge and verify it."""
import chromadb
import hashlib
import time
import requests

client = chromadb.PersistentClient(path="/root/.mempalace/palace")
col = client.get_collection("mempalace_drawers")

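# Sample note (kept in Chinese on purpose). It reads: "An AI assistant's memory system has
# four layers: hot memory (injected every turn) -> user profile -> knowledge base retrieved
# on demand -> conversation history search"
text = "AI 助手的记忆系统分四层：热记忆（每轮注入）→ 用户画像 → 知识库按需检索 → 会话历史搜索"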
wing = "technical"
room = "memory-architecture"

resp = requests.post("http://localhost:11434/api/embeddings",
    json={"model": "qwen3-embedding", "prompt": text}, timeout=60)
embedding = resp.json()["embedding"]

doc_id = hashlib.md5(text.encode()).hexdigest()[:16]
col.add(
    ids=[doc_id],
    embeddings=[embedding],
    documents=[text],
    metadatas=[{
        "wing": wing,
        "room": room,
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S")
    }]
)

print(f"Stored: {text[:50]}...")
print(f"Total records: {col.count()}")
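The script above derives the record ID from an MD5 hash of the text, which makes IDs deterministic: the same text always maps to the same ID, so a later scan can tell what is already stored. A minimal sketch of that property (the helper name make_doc_id is made up for illustration):

```python
import hashlib

def make_doc_id(text: str) -> str:
    """First 16 hex characters (64 bits) of the MD5 digest of the UTF-8 text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()[:16]

first = make_doc_id("同一段文字")
again = make_doc_id("同一段文字")
other = make_doc_id("另一段文字")

print(len(first))      # 16
print(first == again)  # True: identical text, identical ID
print(first == other)  # False: different text, different ID
```

The trade-off is that editing a single character produces a new ID, so an edited note is stored as a new record rather than replacing the old one.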

Step 5: Verify semantic search (2 minutes)

#!/usr/bin/env python3
"""Semantic search test."""
import chromadb
import requests

client = chromadb.PersistentClient(path="/root/.mempalace/palace")
col = client.get_collection("mempalace_drawers")

query = "AI 怎么记住之前的对话？"  # "How does AI remember earlier conversations?"

resp = requests.post("http://localhost:11434/api/embeddings",
    json={"model": "qwen3-embedding", "prompt": query}, timeout=60)
query_embedding = resp.json()["embedding"]

results = col.query(query_embeddings=[query_embedding], n_results=3)

for i, doc in enumerate(results["documents"][0]):
    dist = results["distances"][0][i]
    wing = results["metadatas"][0][i].get("wing", "?")
    print(f"[{dist:.3f}] ({wing}) {doc[:80]}...")

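Measured comparison:

Query	Keyword match	Semantic search
"AI怎么记住之前的对话" (how does AI remember earlier conversations)	0 results	Hits the "four-layer memory architecture" note
"怎么防止模型忘东西" (how to keep a model from forgetting things)	0 results	Hits the memory management strategy note
"中文向量模型哪个好" (which Chinese embedding model is best)	0 results	Hits the embedding model comparison note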
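The number printed in brackets by the search script is a cosine distance, because the collection was created with "hnsw:space": "cosine". Cosine distance is 1 minus cosine similarity, so smaller means closer. Here is a sketch of turning distances into similarities and dropping weak hits; the 0.5 cutoff is an arbitrary assumption to tune on your own data, and the result dict is a fake in the same nested-list shape as the query output used above.

```python
def to_similarity(cosine_distance: float) -> float:
    """Cosine distance is 1 - cosine similarity, so invert it."""
    return 1.0 - cosine_distance

def keep_close_hits(results: dict, max_distance: float = 0.5) -> list:
    """Keep (document, similarity) pairs whose distance is at most max_distance."""
    docs = results["documents"][0]
    dists = results["distances"][0]
    return [(doc, to_similarity(d)) for doc, d in zip(docs, dists) if d <= max_distance]

# Fake result for a single query: one relevant note and one unrelated note.
fake_results = {
    "documents": [["四层记忆架构", "购物清单"]],
    "distances": [[0.25, 0.75]],
}

for doc, sim in keep_close_hits(fake_results):
    print(f"{sim:.2f} {doc}")
```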

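Step 6: Scan and index automatically (the core step)

Write a script that scans your notes or documents directory and automatically chunks, embeds, and stores the contents.

Save it as auto_index.sh: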

#!/bin/bash
set -e

# Export so the Python fallback below can read it from the environment
export DOCS_DIR="${1:-$HOME/notes}"
PALACE_PATH="$HOME/.mempalace/palace"
LOG_FILE="$HOME/.mempalace/logs/index_$(date +%Y%m%d).log"

mkdir -p "$(dirname "$LOG_FILE")"

log() { echo "[$(date '+%H:%M:%S')] $1" | tee -a "$LOG_FILE"; }

log "Scanning: $DOCS_DIR"

if command -v mempalace &>/dev/null; then
    mempalace mine "$DOCS_DIR" --limit 500 >> "$LOG_FILE" 2>&1
    log "MemPalace mine finished"
else
    python3 << 'PYEOF'
import os, hashlib, time, requests, chromadb

DOCS_DIR = os.environ.get("DOCS_DIR", os.path.expanduser("~/notes"))
client = chromadb.PersistentClient(path=os.path.expanduser("~/.mempalace/palace"))
col = client.get_collection("mempalace_drawers")
existing_ids = set(col.get()["ids"]) if col.count() > 0 else set()

def embed(text):
    r = requests.post("http://localhost:11434/api/embeddings",
        json={"model": "qwen3-embedding", "prompt": text[:500]}, timeout=120)
    return r.json()["embedding"]

added = 0
for root, dirs, files in os.walk(DOCS_DIR):
    dirs[:] = [d for d in dirs if not d.startswith(".")]
    for f in files:
        if not f.endswith((".md", ".txt", ".py", ".yaml", ".json")):
            continue
        fpath = os.path.join(root, f)
        try:
            with open(fpath, "r", errors="ignore") as fh:
                content = fh.read()
        except OSError:
            continue
        if len(content) < 50:
            continue

        chunks = [content[i:i+500] for i in range(0, len(content), 500)]
        for chunk in chunks:
            doc_id = hashlib.md5(chunk.encode()).hexdigest()[:16]
            if doc_id in existing_ids:
                continue

            try:
                vec = embed(chunk)
                col.add(ids=[doc_id], embeddings=[vec], documents=[chunk],
                    metadatas=[{"source": fpath, "wing": "docs",
                                "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S")}])
                existing_ids.add(doc_id)
                added += 1
            except Exception as e:
                print(f"  Error {fpath}: {e}")

print(f"Added {added} records, {col.count()} total")
PYEOF
    log "Python scan finished"
fi

if command -v mempalace &>/dev/null; then
    mempalace status 2>&1 | head -10 | tee -a "$LOG_FILE"
fi

chmod +x auto_index.sh
./auto_index.sh ~/notes

On CPU, embedding takes about 20 seconds per record (chunk), so 100 files take roughly 30 minutes. A GPU is much faster.
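To estimate your own indexing time before starting a run, multiply the number of chunks by the per-chunk time. The sketch below assumes the 500-character chunks and 50-character minimum used in the script, plus the 20-second figure quoted above; file sizes are in characters, and the sample sizes are made up.

```python
import math

def count_chunks(char_count: int, chunk_size: int = 500, min_chars: int = 50) -> int:
    """Chunks produced by one file; files shorter than min_chars are skipped."""
    if char_count < min_chars:
        return 0
    return math.ceil(char_count / chunk_size)

def estimate_minutes(file_sizes, seconds_per_chunk: float = 20.0) -> float:
    """Total embedding time in minutes for a list of file sizes in characters."""
    total_chunks = sum(count_chunks(n) for n in file_sizes)
    return total_chunks * seconds_per_chunk / 60

short_notes = [400] * 100   # 100 files, one chunk each
long_notes = [5000] * 100   # 100 files, ten chunks each

print(round(estimate_minutes(short_notes), 1))  # 33.3
print(round(estimate_minutes(long_notes), 1))   # 333.3
```

The 30-minute figure therefore holds for short notes of about one chunk each; longer documents scale with their length.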

Step 7: Schedule automatic updates (2 minutes)

Have the knowledge base scan for new files every day:

# Edit the crontab
crontab -e

# Add this line: update automatically at 3 a.m. every day
0 3 * * * ~/auto_index.sh ~/notes >> ~/.mempalace/logs/cron.log 2>&1

Or use MemPalace's built-in command:

0 3 * * * mempalace mine ~/notes --limit 500 >> ~/.mempalace/logs/cron.log 2>&1
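One thing to watch with a scheduled job: if you start a manual scan while the cron run is still going, two processes write to the same palace directory at once. A simple guard is an advisory lock file. The sketch below is one possible approach, not part of MemPalace; SingleInstanceLock and the lock path are hypothetical, and fcntl is available on Unix-like systems only.

```python
import fcntl
import os
import tempfile

class SingleInstanceLock:
    """Advisory file lock so only one indexing run writes at a time (Unix only)."""

    def __init__(self, path: str):
        self.path = path
        self._fh = None

    def acquire(self) -> bool:
        """Try to take the lock without waiting; return False if it is already held."""
        fh = open(self.path, "w")
        try:
            fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            fh.close()
            return False
        self._fh = fh
        return True

    def release(self) -> None:
        """Drop the lock if this object holds it."""
        if self._fh is not None:
            fcntl.flock(self._fh, fcntl.LOCK_UN)
            self._fh.close()
            self._fh = None

# Demonstration with a temporary lock file (the path is a placeholder).
lock_path = os.path.join(tempfile.mkdtemp(), "index.lock")
first = SingleInstanceLock(lock_path)
second = SingleInstanceLock(lock_path)

print(first.acquire())   # True: the first run gets the lock
print(second.acquire())  # False: the second run should exit instead of writing
first.release()
print(second.acquire())  # True: free again once the first run finishes
second.release()
```

An indexing script would call acquire() at startup and exit immediately if it returns False.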

Step 8: Connect an AI assistant (optional)

If you use an AI assistant framework, MemPalace can serve as its memory backend:

# Config file of the AI assistant framework
memory:
  memory_enabled: true
  provider: mempalace
  memory_char_limit: 2200
  flush_min_turns: 6

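With this in place, key information from conversations is stored in the knowledge base automatically and can be found again through semantic search in later conversations.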

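Pitfall checklist (must read)

Pitfall	Symptom	Fix
Wrong field in the embedding API call	Empty vector returned; ChromaDB raises IndexError	Use `prompt`, not `input`
Dimension conflict after switching embedding models	"dimension mismatch" error	Delete the old collection and rebuild
ChromaDB corrupted by concurrent writes	Database reports "file is not a database"	Give each process its own directory
CPU inference too slow	A few hundred records take hours	Run in batches, e.g. batch_size=5
Low hit rate for Chinese queries	all-MiniLM-L6-v2 hits only about 50%	Switch to Qwen3-Embedding
Ollama service not running	Embedding requests time out	Run `ollama serve` or enable autostart
Chunks too coarse	Search hits, but the result is imprecise	Chunk at roughly 500 characters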
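On the last row: the indexing script cuts every 500 characters regardless of where a sentence ends. A possible refinement, sketched below as an assumption rather than anything MemPalace does, is to pack whole sentences into chunks of at most 500 characters. It splits on Chinese and Western sentence-ending punctuation and on newlines, and drops pieces that are whitespace only.

```python
import re

def split_sentences(text: str) -> list:
    """Split right after sentence-ending punctuation or a newline."""
    parts = re.split(r"(?<=[。！？!?\n])", text)
    return [p for p in parts if p.strip()]

def chunk_text(text: str, max_chars: int = 500) -> list:
    """Pack whole sentences into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for sentence in split_sentences(text):
        # A single sentence longer than max_chars is hard-split.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Start a new chunk when this sentence would overflow the current one.
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks

sample = "第一句。" * 200  # 800 characters made of 4-character sentences
print([len(c) for c in chunk_text(sample)])  # [500, 300]
```

Chunks that end on sentence boundaries tend to embed more cleanly than ones cut mid-sentence, at the cost of slightly uneven chunk sizes.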

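Cost breakdown

Item	Figure
Disk usage	Ollama model about 4.7 GB, plus ChromaDB at about 10 MB per 1,000 records
Memory usage	Ollama resident at about 150 MB (CPU mode)
API fees	0 (everything runs locally)
Initial setup time	About 30 minutes
Routine maintenance	None (the scheduled job handles it)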