feat: Add Valkey as a RAG vector store backend #2062

@daric93

Summary

Add Valkey as a RAG vector store backend in MetaGPT's RAGIndexFactory and RAGRetrieverFactory, using the Valkey Search module for vector similarity (KNN) search. This follows the existing ConfigBasedFactory pattern used by the FAISS, Chroma, BM25, and Elasticsearch backends, and uses the valkey-glide client library.

Motivation

MetaGPT's RAG system currently supports FAISS, Chroma, BM25, and Elasticsearch backends. Redis is already a dependency (redis~=5.0.0), but it is used only for BrainMemory conversation-state persistence — never for vector search.

As a result, users who already run Valkey infrastructure must deploy a separate vector store (Chroma, Elasticsearch, or local FAISS) purely for MetaGPT's RAG functionality. This adds operational complexity and infrastructure cost. Valkey (the open-source, Linux Foundation–backed fork of Redis) ships a native Search module with KNN vector similarity search, so teams already running Valkey could consolidate vector indexing and retrieval onto it instead of standing up another service.

Once integrated, RAG-dependent systems such as RoleZeroLongTermMemory, Experience Pool, MemoryStorage, and IndexRepo could all be configured to use Valkey via config2.yaml.

Proposed Implementation

  • ValkeyVectorStore implementing llama-index's BasePydanticVectorStore interface, backed by valkey-glide
  • ValkeyIndexConfig and ValkeyRetrieverConfig (Pydantic) in metagpt/rag/schema.py with configurable: host, port, password, use_tls, request_timeout, index_name, prefix, vector_dimensions, distance_metric (COSINE/L2/IP), vector_algorithm (HNSW/FLAT), client_name
  • _create_valkey factory methods registered in RAGIndexFactory and RAGRetrieverFactory (lazy imports so valkey-glide stays optional)
  • Uses FT.CREATE with a VECTOR field (HNSW/FLAT, FLOAT32) for indexing and FT.SEARCH with KNN for retrieval
  • Stores documents as JSON keys with a configurable prefix; metadata round-trips through retrieval
  • Configuration example added to config2.example.yaml
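
As a rough illustration of the FT.CREATE and FT.SEARCH bullets above, the sketch below builds the two command argument lists using only the standard library. It is not the proposed implementation: `ValkeyIndexSketch`, the helper names, the default values, and the `embedding` field name are all assumptions made for this example, and the exact arguments accepted should be checked against the Valkey Search documentation for the server version in use.

```python
import struct
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass
class ValkeyIndexSketch:
    """Hypothetical stand-in for the proposed ValkeyIndexConfig (Pydantic in the proposal)."""

    index_name: str = "metagpt_rag"   # assumed default, not from the proposal
    prefix: str = "doc:"              # assumed default key prefix
    vector_dimensions: int = 4
    distance_metric: str = "COSINE"   # COSINE, L2, or IP
    vector_algorithm: str = "HNSW"    # HNSW or FLAT


def build_ft_create(cfg: ValkeyIndexSketch) -> List[str]:
    """Build FT.CREATE arguments for a JSON index with one FLOAT32 vector field."""
    vector_attrs = [
        "TYPE", "FLOAT32",
        "DIM", str(cfg.vector_dimensions),
        "DISTANCE_METRIC", cfg.distance_metric,
    ]
    return [
        "FT.CREATE", cfg.index_name,
        "ON", "JSON",
        "PREFIX", "1", cfg.prefix,
        "SCHEMA", "$.embedding", "AS", "embedding",  # field name is an assumption
        "VECTOR", cfg.vector_algorithm, str(len(vector_attrs)), *vector_attrs,
    ]


def pack_float32(vector: Sequence[float]) -> bytes:
    """Encode a query vector as little-endian FLOAT32 bytes."""
    return struct.pack(f"<{len(vector)}f", *vector)


def build_knn_search(
    cfg: ValkeyIndexSketch, query_vector: Sequence[float], top_k: int
) -> List[Union[str, bytes]]:
    """Build FT.SEARCH arguments for a top-k KNN query against the vector field."""
    if len(query_vector) != cfg.vector_dimensions:
        raise ValueError("query vector length must equal vector_dimensions")
    return [
        "FT.SEARCH", cfg.index_name,
        f"*=>[KNN {top_k} @embedding $vec]",
        "PARAMS", "2", "vec", pack_float32(query_vector),
        "DIALECT", "2",
    ]
```

Keeping command construction in pure functions like these would let the unit tests check the arguments without a server; the resulting lists could then be handed to whichever client call the implementation uses to send commands.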

Dependencies

  • valkey-glide >= 2.1.0, < 3.0.0 (added as an optional dependency under the rag extra)
  • Valkey server with the Search module loaded (e.g., the valkey/valkey-bundle image)
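
One way to keep valkey-glide optional is to import it only inside the creator that needs it. The following is a minimal sketch under assumptions: `load_optional`, `ValkeyConfigSketch`, `create_valkey`, and `CREATORS` are hypothetical names for this example, not MetaGPT's actual factory code, and `glide` is assumed to be the import name of the valkey-glide package.

```python
import importlib
from types import ModuleType


def load_optional(module_name: str, extra: str) -> ModuleType:
    """Import module_name on first use; point at the pip extra if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this backend. "
            f"Install it with: pip install 'metagpt[{extra}]'"
        ) from exc


class ValkeyConfigSketch:
    """Hypothetical placeholder for the proposed ValkeyIndexConfig."""


def create_valkey(config: ValkeyConfigSketch) -> ModuleType:
    # Imported here rather than at module top level, so the other backends
    # keep working when valkey-glide is not installed.
    glide = load_optional("glide", extra="rag")
    return glide  # a real creator would build and return the vector store


# Type-keyed registry: choosing a backend is a lookup on the config's class.
CREATORS = {ValkeyConfigSketch: create_valkey}
```

With this shape, users who never pass a Valkey config never trigger the import, and those who do get an actionable error message instead of a bare `ModuleNotFoundError`.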

Testing Plan

  • Unit tests (mocked client): connection (TLS/password/timeout), index creation, batch ingestion + partial-failure propagation, KNN query, delete, drop-index cleanup, SCAN safety limits, client_name
  • Integration tests: run against a live Valkey instance with the Search module, covering index creation, single + batch document ingestion, KNN retrieval and ordering, top-k limits, delete, drop-index key cleanup, and metadata preservation
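
To show what a mocked-client unit test for partial-failure propagation might look like, here is a small sketch using `unittest.mock`. Everything in it is hypothetical: `ingest_batch`, `PartialIngestError`, and the `json_set` method on the mock are invented for this example and are not taken from the implementation or from valkey-glide's API.

```python
from typing import Dict, List
from unittest.mock import MagicMock


class PartialIngestError(RuntimeError):
    """Raised when some documents in a batch could not be written."""

    def __init__(self, failed_keys: List[str]):
        super().__init__(f"{len(failed_keys)} document(s) failed to ingest")
        self.failed_keys = failed_keys


def ingest_batch(client, prefix: str, docs: Dict[str, dict]) -> int:
    """Hypothetical helper: write every doc, then report all failures at once."""
    failed = []
    for doc_id, payload in docs.items():
        key = f"{prefix}{doc_id}"
        try:
            client.json_set(key, "$", payload)  # assumed method on the mock
        except Exception:
            failed.append(key)
    if failed:
        raise PartialIngestError(failed)
    return len(docs)


def test_partial_failure_is_propagated():
    client = MagicMock()
    # The second write fails; the first and third succeed.
    client.json_set.side_effect = [None, ConnectionError("boom"), None]
    docs = {"a": {"text": "one"}, "b": {"text": "two"}, "c": {"text": "three"}}
    try:
        ingest_batch(client, "doc:", docs)
    except PartialIngestError as exc:
        assert exc.failed_keys == ["doc:b"]
    else:
        raise AssertionError("expected PartialIngestError")
    # The batch keeps going after a failure instead of stopping early.
    assert client.json_set.call_count == 3
```

The same pattern extends to the other mocked cases in the plan: configure the mock's return values or side effects, call the code under test, and assert on both the result and the calls the client received.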

Additional context

I have a working implementation with passing unit and live integration tests and am happy to open a PR.
