Weaviate: An All-in-One Vector Database with Built-in RAG

GitHub: weaviate/weaviate
Stars: 16,300+ | Language: Go (96.0%) | License: BSD 3-Clause
Website: weaviate.io | Latest release: v1.38.0 “HFresh” (2026-06-05)

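Project Overview

Weaviate was open-sourced in 2019 by the Dutch company SeMI Technologies. It is one of the few open-source vector databases that build vector search, keyword search, generative AI, and reranking into a single database. As of June 2026, the project has more than 16,300 stars on GitHub and has shipped 545 releases, which reflects a fast release cadence. The latest release, v1.38.0 (code name “HFresh”), introduces Namespaces and filtering on nested objects, among other features.

Weaviate's main differentiator is its “AI-native” design: it not only stores vectors but also ships integrations with model providers such as OpenAI, Cohere, HuggingFace, Google, and Anthropic. You can name the vectorizer model when creating a Collection, and Weaviate will call the remote API (or a local model) to generate embeddings. On write you pass only the raw text; Weaviate handles vectorization, indexing, and storage.

More importantly, Weaviate natively supports Generative Search. You don't need to build a separate LangChain or LlamaIndex pipeline: specify the generation parameters directly in a Weaviate query, and it uses the retrieved documents as context for an LLM call that produces the answer. This “database as RAG” design simplifies the system architecture considerably.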

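Feature Overview

Automatic Vectorization: Model Provider Integrations

You specify the vectorizer model when creating a Collection; after that, vectorization is transparent on every write: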

import weaviate
from weaviate.classes.config import Configure, DataType, Property

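# Assumes a local Weaviate with the text2vec-openai module enabled and an OpenAI API key
# available to it (for example via the X-OpenAI-Api-Key request header)
client = weaviate.connect_to_local()

# Specify the OpenAI text embedding model when creating the Collection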
articles = client.collections.create(
    name="Article",
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
    ],
    vector_config=Configure.Vectors.text2vec_openai(
        model="text-embedding-3-small",
        dimensions=1536
    )
)

# On insert, pass only the raw text; Weaviate calls OpenAI to generate the vectors
articles.data.insert_many([
    {"title": "Intro to Vector DBs", "content": "Vector databases enable semantic search..."},
    {"title": "RAG Systems", "content": "RAG combines retrieval with generation..."},
    {"title": "Machine Learning", "content": "ML models create embeddings from raw data..."},
])

Supported vectorizer modules include text2vec-openai, text2vec-cohere, text2vec-huggingface, text2vec-ollama (local models), multi2vec-clip (multimodal), and text2vec-transformers. You can also supply your own vectors with Configure.Vectors.self_provided.

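Hybrid Search: Combining Semantic and Keyword Retrieval

Weaviate's Hybrid Search combines vector-based semantic retrieval with BM25F keyword retrieval and uses a fusion algorithm to produce the final ranking: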

from weaviate.classes.query import HybridFusion, BM25Operator

articles = client.collections.get("Article")

# Basic hybrid search
response = articles.query.hybrid(
    query="semantic search techniques",
    alpha=0.5,                    # 0 = keyword only, 1 = vector only, 0.5 = balanced
    limit=5
)

# RELATIVE_SCORE fusion plus per-property keyword weights
response = articles.query.hybrid(
    query="vector database performance",
    query_properties=["title^2", "content"],  # double the weight of title
    fusion_type=HybridFusion.RELATIVE_SCORE,
    alpha=0.75,
    limit=5
)

# BM25 operator: OR mode, at least 2 query terms must match
response = articles.query.hybrid(
    query="fast accurate search indexing",
    bm25_operator=BM25Operator.or_(minimum_match=2),
    limit=5
)

for obj in response.objects:
    print(obj.properties)

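Two fusion strategies are supported: HybridFusion.RANKED, which combines results by their rank in each result list (a reciprocal-rank-style fusion), and HybridFusion.RELATIVE_SCORE, which normalizes the vector and keyword scores before combining them. The alpha parameter controls the balance between the vector and keyword results.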
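
To make the role of alpha and the difference between rank-based and score-based fusion concrete, here is a minimal pure-Python sketch. It is not Weaviate's implementation: the function names, the constant k=60, and the min-max normalization are assumptions chosen for illustration.

```python
from typing import Dict, List


def ranked_fusion(vector_ids: List[str], keyword_ids: List[str],
                  alpha: float = 0.5, k: int = 60) -> Dict[str, float]:
    """Rank-based fusion: each list contributes weight / (k + rank) per object."""
    scores: Dict[str, float] = {}
    for weight, ids in ((alpha, vector_ids), (1.0 - alpha, keyword_ids)):
        for rank, doc_id in enumerate(ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return scores


def _min_max(raw: Dict[str, float]) -> Dict[str, float]:
    """Scale one result set's scores into [0, 1]."""
    if not raw:
        return {}
    lo, hi = min(raw.values()), max(raw.values())
    span = hi - lo
    return {doc_id: ((s - lo) / span if span else 1.0) for doc_id, s in raw.items()}


def relative_score_fusion(vector_scores: Dict[str, float],
                          keyword_scores: Dict[str, float],
                          alpha: float = 0.5) -> Dict[str, float]:
    """Score-based fusion: normalize each score set, then blend with alpha."""
    fused: Dict[str, float] = {}
    for weight, raw in ((alpha, vector_scores), (1.0 - alpha, keyword_scores)):
        for doc_id, s in _min_max(raw).items():
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * s
    return fused


def top(scores: Dict[str, float]) -> List[str]:
    """Object ids ordered by fused score, best first."""
    return sorted(scores, key=scores.get, reverse=True)


# Toy inputs: higher score = better match in both result sets
by_rank = ranked_fusion(["a", "b", "c"], ["c", "a", "d"], alpha=0.5)
by_score = relative_score_fusion({"a": 0.9, "b": 0.5, "c": 0.1},
                                 {"c": 4.0, "a": 2.0, "d": 1.0}, alpha=0.5)
print(top(by_rank), top(by_score))
```

Rank-based fusion ignores how far apart the raw scores are, while score-based fusion keeps that information; which one ranks better depends on the data, so it is worth trying both on your own queries.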

Generative Search: RAG Built into the Database

Generative Search is Weaviate's most distinctive feature. It integrates LLM generation directly at the database query layer:

from weaviate.classes.generate import GenerativeConfig

articles = client.collections.get("Article")

# Single prompt: generate separately for each retrieved object
response = articles.generate.near_text(
    query="vector database use cases",
    limit=3,
    single_prompt="Summarize the key point of this article in one sentence: {content}",
    generative_provider=GenerativeConfig.openai(model="gpt-4o")
)

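for obj in response.objects:
    print(f"Source title: {obj.properties['title']}")
    print(f"AI summary: {obj.generative.text}\n")

# Grouped task: generate once over all retrieved objects
response = articles.generate.near_text(
    query="RAG system architecture",
    limit=5,
    grouped_task="Using all of the retrieved articles, write a short overview of RAG system architecture",
    grouped_properties=["title", "content"],
    # The Article collection above defines no default generative module, so name the provider here
    generative_provider=GenerativeConfig.openai(model="gpt-4o")
)

print(f"Combined answer: {response.generative.text}")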

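With this approach Weaviate orchestrates the whole flow: retrieve -> assemble context -> send to the LLM -> return the generated result. The model call itself still goes to the configured provider, but developers do not have to wire the pipeline together by hand.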
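
The difference between the two generation modes is easiest to see in plain code. The sketch below mimics the retrieve, assemble, generate flow with ordinary Python functions; it is an illustration under assumptions, not Weaviate's internal logic. The function names, the prompt layout, and the stand-in `fake_llm` are all hypothetical.

```python
from typing import Callable, Dict, List

Obj = Dict[str, str]
LLM = Callable[[str], str]


def single_prompt_rag(objects: List[Obj], template: str, llm: LLM) -> List[str]:
    """One LLM call per retrieved object; {property} placeholders are filled per object."""
    return [llm(template.format(**obj)) for obj in objects]


def grouped_task_rag(objects: List[Obj], task: str,
                     properties: List[str], llm: LLM) -> str:
    """One LLM call for the whole result set: the task plus the selected properties."""
    context = "\n\n".join(
        "\n".join(f"{prop}: {obj[prop]}" for prop in properties)
        for obj in objects
    )
    return llm(f"{task}\n\n{context}")


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call; it only reports how much context it received
    return f"<answer based on {len(prompt)} characters of prompt>"


retrieved = [
    {"title": "Intro to Vector DBs", "content": "Vector databases enable semantic search..."},
    {"title": "RAG Systems", "content": "RAG combines retrieval with generation..."},
]
per_object = single_prompt_rag(retrieved, "Summarize in one sentence: {content}", fake_llm)
combined = grouped_task_rag(retrieved, "Write a short overview", ["title", "content"], fake_llm)
print(len(per_object), combined)
```

A single prompt costs one model call per result but keeps each answer tied to one source object; a grouped task costs one call in total but mixes all sources into a single answer.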

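Multi-Tenancy and Security

Weaviate v1.38 introduces Namespaces, which give multi-tenant deployments a finer isolation granularity. On the security side, it supports API key authentication, OIDC (OpenID Connect) integration, and role-based access control (RBAC). Collection-level tenants and a TTL (time-to-live) mechanism cover data isolation and automatic expiry in SaaS multi-tenant scenarios.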

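Vector Compression and Performance Tuning

Weaviate offers three vector compression schemes: Product Quantization (PQ), Binary Quantization (BQ), and Scalar Quantization (SQ), configured through vector_index_config when the Collection is created. The HNSW index also accepts a vector_cache_max_objects parameter that caps the size of the vector cache, letting you trade memory against latency.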
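
To give a feel for what quantization trades away, here is a small pure-Python sketch of the two simplest schemes. It is illustrative only and makes no claim about how Weaviate implements them: the function names and the fixed value range for scalar quantization are assumptions, and PQ is left out because it needs a trained codebook. For a 1536-dimension float32 vector (6,144 bytes), one byte per dimension gives 1,536 bytes and one bit per dimension gives 192 bytes.

```python
from typing import List


def binary_quantize(vector: List[float]) -> int:
    """Keep only the sign of each dimension, packed into an int (1 bit per dimension)."""
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; a cheap stand-in for distance between binary codes."""
    return bin(a ^ b).count("1")


def scalar_quantize(vector: List[float], lo: float, hi: float) -> bytes:
    """Clamp each value to [lo, hi] and map it linearly to one byte (0..255)."""
    scale = 255.0 / (hi - lo)
    return bytes(round((min(max(v, lo), hi) - lo) * scale) for v in vector)


query = binary_quantize([0.3, -1.2, 0.0, 2.5])
stored = binary_quantize([0.8, 0.4, -0.1, -0.9])
print(hamming_distance(query, stored))
print(list(scalar_quantize([-1.0, 1.0, 5.0], lo=-1.0, hi=1.0)))
```

Because compressed codes lose precision, engines that use them typically re-score a short candidate list against the original vectors before returning results.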

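REST + GraphQL + gRPC: Three Protocols

Weaviate exposes GraphQL, REST, and gRPC interfaces, a combination that is uncommon among vector databases. The GraphQL interface makes queries highly flexible: in a single request the client specifies exactly which properties and nested fields it needs, which cuts unnecessary data transfer.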
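
As a rough illustration of selecting fields in GraphQL, the sketch below builds a Get query string and the JSON body that would be POSTed to the /v1/graphql endpoint. The helper name `build_get_query` is hypothetical, the Article collection is borrowed from the earlier examples, and nothing is sent over the network here.

```python
import json
from typing import List


def build_get_query(collection: str, properties: List[str], limit: int,
                    with_id: bool = False) -> str:
    """Build a GraphQL Get query that returns only the requested fields."""
    fields = list(properties)
    if with_id:
        # _additional carries object metadata such as the id
        fields.append("_additional { id }")
    return f"{{ Get {{ {collection}(limit: {limit}) {{ {' '.join(fields)} }} }} }}"


query = build_get_query("Article", ["title"], limit=2)
payload = json.dumps({"query": query})  # request body for POST /v1/graphql
print(payload)
```

Only the title property is requested, so the response would carry titles and nothing else; adding content or metadata is a matter of extending the field list.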

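Use Cases

Rapidly Building GenAI Applications

Weaviate's “database as RAG” design makes it a strong choice for quickly building intelligent applications. Deploy Weaviate, configure an LLM API key, and simple queries give you semantic search plus AI generation. For startup teams that need to validate a product quickly, this saves much of the engineering time otherwise spent building a RAG pipeline.

Multimodal Content Management

The multi2vec-clip module vectorizes images and text into a shared space. In e-commerce product search, user-uploaded images and product descriptions are mapped into the same vector space, which enables search across images and text in both directions. Digital asset management (DAM) systems can use this to retrieve assets that are visually similar and semantically related at the same time.

Multi-Tenant Knowledge Management for SaaS

The combination of Collection tenants, Namespaces, and RBAC fits the architecture of SaaS platforms well. Each tenant's data is isolated by Collection or Tenant, the TTL mechanism supports automatic cleanup of trial-period content, and the OIDC integration connects to an enterprise's unified identity system.

Compliance Auditing and Traceability

Generative search returns the original retrieved objects (response.objects) alongside the AI-generated content, which supports compliance requirements that AI output be traceable. Auditors can check a generated summary against the retrieved source objects to verify its accuracy.

Local AI and Data Sovereignty

With local vectorizer modules such as text2vec-ollama or text2vec-transformers, Weaviate can generate embeddings and serve searches without calling any external API. For organizations with strict data-sovereignty requirements (such as government and finance), this enables fully offline AI search.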

Quick Start

Requirements

Docker and Docker Compose
Python 3.9+

Installation

With Docker Compose (recommended):

# Download the official compose file
curl -o docker-compose.yml "https://weaviate.io/download/stable/docker-compose.yml"

# Start (Weaviate plus optional modules)
docker compose up -d

Python client:

pip install -U weaviate-client

Minimal Example: Semantic Search

import weaviate
from weaviate.classes.config import Configure, DataType, Property

# 1. Connect to the local Weaviate instance
client = weaviate.connect_to_local()

# 2. Create a Collection (with a vectorizer module)
collection = client.collections.create(
    name="QuickStart",
    properties=[
        Property(name="text", data_type=DataType.TEXT),
        Property(name="category", data_type=DataType.TEXT),
    ],
    vector_config=Configure.Vectors.text2vec_ollama(
        api_endpoint="http://host.docker.internal:11434",
        model="nomic-embed-text"
    )
)

# 3. Insert data
collection.data.insert_many([
    {"text": "Vector databases store embeddings and enable similarity search",
     "category": "technical"},
    {"text": "Machine learning requires large amounts of training data",
     "category": "technical"},
    {"text": "The cuisines of Italy are known for pasta and wine",
     "category": "culinary"},
])

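# 4. Semantic search (distance is only populated when requested via return_metadata)
from weaviate.classes.query import MetadataQuery

response = collection.query.near_text(
    query="how to search by meaning rather than keywords",
    limit=2,
    return_metadata=MetadataQuery(distance=True)
)

for obj in response.objects:
    print(f"  [{obj.properties['category']}] {obj.properties['text']}")
    print(f"  distance: {obj.metadata.distance:.4f}\n")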

client.close()

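Filtered Search

import weaviate.classes as wvc

# Assumes an open client connection (the previous example closes its client at the end)
collection = client.collections.get("QuickStart")

# Search only objects whose category is "technical"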
response = collection.query.near_text(
    query="data storage methods",
    filters=wvc.query.Filter.by_property("category").equal("technical"),
    limit=5
)
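
Conceptually, a filtered vector search restricts the candidate set first and ranks what remains by distance. The brute-force sketch below shows only those semantics; a real engine relies on indexes rather than a linear scan, and the function names and toy two-dimensional vectors are made up for illustration.

```python
import math
from typing import Dict, List


def cosine_distance(a: List[float], b: List[float]) -> float:
    """1 - cosine similarity; 0 means the vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def filtered_search(objects: List[Dict], query_vector: List[float],
                    category: str, limit: int) -> List[Dict]:
    """Keep objects matching the filter, then return the nearest `limit` of them."""
    candidates = [obj for obj in objects if obj["category"] == category]
    candidates.sort(key=lambda obj: cosine_distance(obj["vector"], query_vector))
    return candidates[:limit]


toy_objects = [
    {"text": "x", "category": "technical", "vector": [1.0, 0.0]},
    {"text": "y", "category": "culinary", "vector": [1.0, 0.1]},
    {"text": "z", "category": "technical", "vector": [0.0, 1.0]},
]
hits = filtered_search(toy_objects, [1.0, 0.0], category="technical", limit=5)
print([obj["text"] for obj in hits])
```

Note that the culinary object is excluded even though its vector is close to the query: the filter is a hard constraint, not a ranking signal.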

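Bulk-Importing Large Datasets

# Dynamic batching manages the batch size automatically
# (large_dataset is a placeholder for your own iterable of dicts)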
with collection.batch.dynamic() as batch:
    for item in large_dataset:
        batch.add_object(properties={
            "text": item["content"],
            "category": item["label"]
        })
        if batch.number_errors > 10:
            print("Errors exceeded threshold, stopping batch import")
            break

# Inspect the objects that failed to import
failed = collection.batch.failed_objects
if failed:
    print(f"Failed to import {len(failed)} objects")

Source Code Architecture

Weaviate is organized as a Go monorepo with a compact top-level layout:

weaviate/
├── cmd/weaviate-server/        # Server entry point
│   └── main.go                 #   Startup flow, module registration
├── adapters/                   # Adapters to external systems
│   ├── clients/                #   LLM/embedding API clients
│   ├── repos/                  #   Data persistence repositories
│   └── handlers/               #   REST/GraphQL/gRPC handlers
├── usecases/                   # Core business logic
│   ├── schema/                 #   Schema management (Collection/Property)
│   ├── objects/                #   Object CRUD operations
│   ├── vector/                 #   Vector indexing and retrieval
│   ├── classification/         #   Automatic classification (kNN/zero-shot)
│   └── search/                 #   Search orchestration (semantic/keyword/hybrid/generative)
├── entities/                   # Domain entity definitions
│   ├── models/                 #   Data models
│   ├── schema/                 #   Schema definitions
│   └── search/                 #   Search result types
├── modules/                    # Pluggable module system
│   ├── text2vec-openai/        #   OpenAI embedding module
│   ├── text2vec-cohere/        #   Cohere embedding module
│   ├── text2vec-huggingface/   #   HuggingFace embedding module
│   ├── text2vec-ollama/        #   Ollama local embedding module
│   ├── text2vec-transformers/  #   Local Transformers embedding module
│   ├── generative-openai/      #   OpenAI generative module
│   ├── generative-cohere/      #   Cohere generative module
│   ├── generative-anthropic/   #   Anthropic generative module
│   ├── multi2vec-clip/         #   CLIP multimodal module
│   ├── reranker-transformers/  #   Reranker module
│   └── qna-transformers/       #   Local question-answering module
├── client/                     # Generated Go API client (the Python/Java/JS clients live in separate repositories)
├── cluster/                    # Cluster management
│   ├── schema/                 #   Distributed schema synchronization
│   └── tx_manager.go           #   Transaction manager
├── grpc/                       # Generated gRPC protocol code
├── openapi-specs/              # OpenAPI specification files
├── tools/                      # Development and operations tooling
├── test/                       # Integration and acceptance tests
├── docker-compose/             # Compose files for development environments
└── go.mod / Dockerfile         # Go and container build configuration

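Weaviate's architecture follows the Clean Architecture pattern and divides into four parts:

**usecases/**: the core business-logic layer. All search types (semantic, keyword, hybrid, generative) are orchestrated here. Schema management handles the definition of and changes to Collections, Properties, and indexes.
**adapters/**: the adapter layer for external dependencies. The concrete implementations of the vector index (HNSW), persistent storage (LSM-Tree / BoltDB), and external API calls live here.
**modules/**: Weaviate's most distinctive piece, the modular architecture. Each vectorizer or generative module is a separate Go package that follows a common interface. Modules are registered through configuration at startup and invoked on demand at runtime. This decouples the core database logic from model-provider logic: adding a model provider means adding a module package.
**entities/**: domain entities and data model definitions, shared across layers.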
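
The module pattern described above (a common interface, registration at startup, lookup at runtime) can be sketched in a few lines. The sketch is in Python for consistency with the other examples in this article and does not mirror Weaviate's actual Go interfaces; `Vectorizer`, `ModuleRegistry`, and the toy provider are hypothetical names.

```python
from typing import Dict, List, Protocol


class Vectorizer(Protocol):
    """The common interface every vectorizer module is expected to satisfy."""
    name: str

    def vectorize(self, text: str) -> List[float]: ...


class ModuleRegistry:
    """Holds the modules enabled at startup and hands them out by name."""

    def __init__(self) -> None:
        self._modules: Dict[str, Vectorizer] = {}

    def register(self, module: Vectorizer) -> None:
        if module.name in self._modules:
            raise ValueError(f"module {module.name!r} is already registered")
        self._modules[module.name] = module

    def get(self, name: str) -> Vectorizer:
        try:
            return self._modules[name]
        except KeyError:
            raise LookupError(f"module {name!r} is not enabled") from None


class ToyVectorizer:
    """Toy provider: 'embeds' text as [length, vowel count]."""
    name = "text2vec-toy"

    def vectorize(self, text: str) -> List[float]:
        vowels = sum(ch in "aeiou" for ch in text.lower())
        return [float(len(text)), float(vowels)]


registry = ModuleRegistry()
registry.register(ToyVectorizer())  # "startup": enable the configured modules
print(registry.get("text2vec-toy").vectorize("Weaviate"))
```

The core code only ever talks to the registry and the interface, so supporting another provider means writing one more class and registering it, with no change to the calling code.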

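Hands-On Demo

The following builds a complete document question-answering system that shows Weaviate's end-to-end RAG capability:

"""
Document Q&A system demo
Flow: ingest text → semantic search → generate answers (RAG built into the database)
"""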
import os

import weaviate
from weaviate.classes.config import Configure, DataType, Property
from weaviate.classes.generate import GenerativeConfig
import weaviate.classes as wvc

# 1. Connect to Weaviate
print("[1/5] Connecting to Weaviate...")
# Forward the OpenAI key (needed by the generative step) as a request header
client = weaviate.connect_to_local(
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]}
)
assert client.is_ready()
print("  Connected!")

# 2. Create the document Collection
print("[2/5] Creating Collection...")

# Delete the Collection if it already exists (so the demo can be re-run)
if client.collections.exists("DocQA"):
    client.collections.delete("DocQA")

collection = client.collections.create(
    name="DocQA",
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
        Property(name="category", data_type=DataType.TEXT),
        Property(name="year", data_type=DataType.INT),
    ],
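    vector_config=Configure.Vectors.text2vec_ollama(
        api_endpoint="http://host.docker.internal:11434",
        model="nomic-embed-text",
        # HNSW index settings are passed with the vector they apply to
        vector_index_config=Configure.VectorIndex.hnsw(
            distance_metric=wvc.config.VectorDistances.COSINE,
            vector_cache_max_objects=10000,
            # PQ is only trained once enough objects have been imported, so this
            # five-document demo will still hold uncompressed vectors
            quantizer=Configure.VectorIndex.Quantizer.pq()
        )
    )
)
print("  Collection created (PQ quantization configured)")

# 3. Bulk-import documents
print("[3/5] Importing documents...")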
documents = [
    {"title": "Introduction to Vector Databases",
     "content": "Vector databases are specialized database systems designed to store, index, "
                "and query high-dimensional vector embeddings. Unlike traditional databases "
                "that excel at exact matches, vector databases perform approximate nearest "
                "neighbor (ANN) search to find semantically similar items.",
     "category": "database", "year": 2024},
    {"title": "Understanding RAG Architecture",
     "content": "Retrieval-Augmented Generation (RAG) combines information retrieval with "
                "large language models. The system first retrieves relevant documents from "
                "a knowledge base, then uses these documents as context for the LLM to "
                "generate accurate, grounded responses. This reduces hallucinations.",
     "category": "architecture", "year": 2024},
    {"title": "HNSW Index Algorithm",
     "content": "Hierarchical Navigable Small World (HNSW) is a graph-based algorithm for "
                "approximate nearest neighbor search. It builds a multi-layer graph where "
                "upper layers enable long-range jumps and lower layers refine the search. "
                "HNSW offers excellent query speed and recall trade-offs.",
     "category": "algorithm", "year": 2023},
    {"title": "Hybrid Search in Modern Databases",
     "content": "Hybrid search combines vector semantic search with traditional keyword-based "
                "search (BM25). The results from both methods are fused using algorithms like "
                "Reciprocal Rank Fusion (RRF) or Distribution-Based Score Fusion (DBSF). "
                "This approach captures both semantic meaning and exact keyword matches.",
     "category": "search", "year": 2025},
    {"title": "Embedding Models Overview",
     "content": "Embedding models convert text, images, or other data into dense vector "
                "representations. Popular models include OpenAI's text-embedding-3, "
                "Cohere's Embed v3, and open-source options like BGE and all-MiniLM. "
                "The quality of embeddings directly impacts search accuracy.",
     "category": "models", "year": 2025},
]

with collection.batch.dynamic() as batch:
    for doc in documents:
        batch.add_object(properties=doc)

if len(collection.batch.failed_objects) > 0:
    print(f"  Warning: {len(collection.batch.failed_objects)} objects failed to import")
else:
    print(f"  Imported {len(documents)} documents")

# 4. Run several search modes
print("[4/5] Running search tests...\n")

# 4a. Pure semantic search
print("=== Semantic search: 'How to find similar items in a database?' ===")
response = collection.query.near_text(
    query="How to find similar items in a database?",
    limit=2,
    return_metadata=wvc.query.MetadataQuery(distance=True)
)
for obj in response.objects:
    print(f"  [{obj.properties['category']}] {obj.properties['title']}")
    print(f"   distance={obj.metadata.distance:.4f}\n")

# 4b. Hybrid search
print("=== Hybrid search: 'hybrid search fusion' ===")
response = collection.query.hybrid(
    query="hybrid search fusion",
    alpha=0.5,
    limit=2
)
for obj in response.objects:
    print(f"  [{obj.properties['category']}] {obj.properties['title']}\n")

# 4c. Semantic search with a filter
print("=== Semantic + filter: year >= 2025 ===")
response = collection.query.near_text(
    query="modern search techniques",
    filters=wvc.query.Filter.by_property("year").greater_or_equal(2025),
    limit=3
)
for obj in response.objects:
    print(f"  [{obj.properties['year']}] {obj.properties['title']}\n")

# 5. Generative search: RAG built into the database
print("[5/5] Generative search (RAG)...\n")

print("=== Single prompt: summarize each article ===")
response = collection.generate.near_text(
    query="vector database indexing",
    limit=2,
    single_prompt="Summarize the key point of this article in one sentence: {content}",
    generative_provider=GenerativeConfig.openai(model="gpt-4o")
)
for obj in response.objects:
    print(f"  Source: {obj.properties['title']}")
    print(f"  AI summary: {obj.generative.text}\n")

print("=== Grouped task: combined analysis ===")
response = collection.generate.near_text(
    query="RAG system and vector search",
    limit=3,
    grouped_task="Using all of the articles below, explain how vector databases and RAG systems relate",
    grouped_properties=["title", "content"],
    generative_provider=GenerativeConfig.openai(model="gpt-4o")
)
print(f"  Combined result:\n{response.generative.text}")

# Clean up
client.collections.delete("DocQA")
client.close()
print("\nDemo complete!")

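Prerequisites:

# Start Weaviate (with the Ollama vectorizer module enabled)
docker compose up -d

# Required for the OpenAI generation step (and for OpenAI vectorization, if used)
export OPENAI_API_KEY="sk-..."

# Install the Python client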
pip install -U weaviate-client

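Feature	Weaviate	Chroma	Qdrant	Milvus
Stars	16,300+	28,400+	32,300+	44,800+
Core language	Go	Rust	Rust	Go + C++
Automatic vectorization	Built in (OpenAI/Cohere/HF/Ollama)	Built in (all-MiniLM-L6-v2)	None (compute vectors yourself)	None (compute vectors yourself)
Built-in RAG (generative search)	Native (single prompt / grouped task)	None	None	None
Hybrid search	BM25F + vector, ranked / relative-score fusion	Supported in Chroma Cloud	RRF/DBSF	BM25/SPLADE + vector
API protocols	GraphQL + REST + gRPC	Python/JS SDK	REST + gRPC	gRPC + REST
Modular architecture	Pluggable modules (vectorizer/generative/reranker)	Custom embedding functions	Embedded/distributed	Reuses client SDK
Multi-tenancy	Collection Tenant + Namespace + RBAC	Collection	Collection/Payload isolation	Database/Collection/Partition/Key
Automatic TTL	Supported (object-level TTL)	None	None	Supported (Collection TTL)
Vector quantization	PQ/BQ/SQ	None	SQ/PQ/BQ	PQ/SQ
Deployment complexity	Medium	Very low	Low to medium	Medium to high
Learning curve	Medium	Very low	Low to medium	Medium to high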

Resources

GitHub repository: https://github.com/weaviate/weaviate
Official documentation: https://docs.weaviate.io
Python client documentation: https://docs.weaviate.io/weaviate/client-libraries/python
Search API reference: https://docs.weaviate.io/weaviate/search
Generative search guide: https://docs.weaviate.io/weaviate/search/generative
Weaviate Cloud: https://console.weaviate.cloud
Community forum: https://forum.weaviate.io
