Embedding Stores

An Embedding Store is the specialized vector database that persists and queries document embeddings — the numerical vectors generated by the Embedding Model. It enables similarity search, finding the documents most semantically related to a user's query.

Turing ES supports three backends via Spring AI. The active backend is set globally in Administration → Settings and can be overridden per Semantic Navigation Site in its Generative AI tab.

Supported Backends

ChromaDB

A lightweight, open-source vector database ideal for development and small to medium deployments.

Self-hosted, connects via its HTTP API
Minimal infrastructure overhead: runs as a single container, as in the quickstart below
No special schema setup required — Turing ES manages the collections automatically
Multi-tenant and multi-database support

Docker Compose quickstart:

services:
  chroma:
    image: chromadb/chroma:latest
    ports:
      - "8000:8000"

Configuration	Default	Description
Base URL	`http://localhost:8000`	Chroma HTTP API endpoint
Collection Name	`turing`	Target collection name
Tenant Name	`default_tenant`	Chroma tenant identifier
Database Name	`default_database`	Chroma database name
Key Token	—	Bearer token for authentication
Basic Username / Password	—	Basic auth credentials

Authentication: ChromaDB supports two methods — Bearer token and Basic auth. Configure either via the credential field or the provider options. When both are present, the credential field takes precedence.
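The precedence rule above can be pictured with a small sketch. The function and parameter names are hypothetical, and it assumes the credential field carries a Bearer token and that a token provider option is preferred over Basic auth; the actual resolution logic in Turing ES may differ.

```python
import base64
from typing import Optional


def chroma_auth_header(
    credential: Optional[str] = None,
    key_token: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> dict:
    """Build the Authorization header for a Chroma HTTP request.

    Hypothetical helper: the credential field wins over provider options.
    """
    token = credential or key_token  # credential field takes precedence
    if token:
        return {"Authorization": f"Bearer {token}"}
    if username and password:
        raw = f"{username}:{password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    return {}  # no authentication configured
```

With both a credential and a Key Token supplied, only the credential value ends up in the header.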

PgVector

PostgreSQL with the pgvector extension — the best choice for deployments that already use PostgreSQL as their primary database.

Avoids an additional infrastructure dependency
Embeddings live in the same database as your application data
Supports standard PostgreSQL backup, replication, and access control
Connection pooling via HikariCP (max 5 connections per store)

Enable the required extensions in your PostgreSQL instance:

CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS hstore;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

Configuration	Default	Description
JDBC URL	—	PostgreSQL connection string (e.g., `jdbc:postgresql://localhost:5432/turing`)
Username	—	Database user
Password	—	Database password
Table Name	`vector_store`	Table where embeddings are stored
Schema Name	`public`	PostgreSQL schema
Dimensions	—	Vector dimensionality — must match the embedding model
Distance Type	—	`COSINE_DISTANCE`, `EUCLIDEAN_DISTANCE`, or `INNER_PRODUCT`
Index Type	—	`HNSW` or `IVFFLAT`

Table schema created by Turing ES:

CREATE TABLE IF NOT EXISTS "vector_store" (
    id        UUID DEFAULT uuid_generate_v4() PRIMARY KEY,
    content   TEXT,
    metadata  JSON,
    embedding VECTOR
);
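To make the Distance Type setting concrete, the sketch below maps each value to the corresponding pgvector operator and builds a nearest-neighbour query against the table above. It is an illustration only: the function name is hypothetical, the `%s` placeholders follow the psycopg convention, and this is not necessarily the SQL that Turing ES or Spring AI issues.

```python
# pgvector distance operators, keyed by the Distance Type values listed above.
OPERATORS = {
    "COSINE_DISTANCE": "<=>",
    "EUCLIDEAN_DISTANCE": "<->",
    "INNER_PRODUCT": "<#>",  # pgvector returns the negative inner product
}


def similarity_query(distance_type: str, table: str = "vector_store",
                     schema: str = "public") -> str:
    """Return a parameterized top-k query; smaller distance means more similar."""
    op = OPERATORS[distance_type]  # raises KeyError for unsupported values
    return (
        f"SELECT id, content, metadata, embedding {op} %s::vector AS distance "
        f'FROM "{schema}"."{table}" ORDER BY distance LIMIT %s'
    )
```

The first parameter would be the query embedding in pgvector text form (for example `'[0.1, 0.2, 0.3]'`) and the second the number of rows to return. Because `<#>` yields the negative inner product, ascending order ranks the best matches first for all three operators.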

Milvus

A purpose-built, cloud-native vector database designed for high-scale similarity search.

Recommended for large corpora or high-throughput deployments
Supports distributed operation, horizontal scaling, and advanced index management
Managed cloud offering available (Zilliz Cloud)

Configuration	Default	Description
Base URL	`http://localhost:19530`	Milvus service URI
Collection Name	`turing`	Target collection
Database Name	—	Optional database name
Token	—	Authentication token (`username:password` format)
Embedding Dimension	—	Vector dimensionality — must match the embedding model
Metric Type	—	`COSINE`, `L2`, or `IP` (inner product)
Index Type	—	`HNSW`, `IVF_FLAT`, `IVF_SQ8`, or `DISKANN`
Index Parameters	—	JSON string with index-specific params (e.g., `{"M":16,"efConstruction":200}`)
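Because Index Parameters is a free-form JSON string, a malformed value is easy to miss until index creation fails. The following sketch shows one way to check the string before saving it. The helper and the `EXPECTED_PARAMS` table are hypothetical and list only the build parameters commonly supplied for each index type; consult the Milvus documentation for the full set and for defaults.

```python
import json

# Build-time parameters commonly supplied per index type (assumption, not exhaustive).
EXPECTED_PARAMS = {
    "HNSW": {"M", "efConstruction"},
    "IVF_FLAT": {"nlist"},
    "IVF_SQ8": {"nlist"},
    "DISKANN": set(),
}


def parse_index_params(index_type: str, raw: str) -> dict:
    """Parse the Index Parameters JSON string and check the expected keys."""
    params = json.loads(raw) if raw else {}
    if not isinstance(params, dict):
        raise ValueError("Index Parameters must be a JSON object")
    missing = EXPECTED_PARAMS[index_type] - set(params)
    if missing:
        raise ValueError(f"{index_type} is missing parameters: {sorted(missing)}")
    return params
```

For instance, `parse_index_params("HNSW", '{"M":16,"efConstruction":200}')` returns the parsed dictionary, while omitting `efConstruction` raises a `ValueError`.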

Store Comparison

Feature	ChromaDB	PgVector	Milvus
Best for	Dev / small-medium	PostgreSQL shops	Large-scale production
Infrastructure	Standalone container	PostgreSQL extension	Dedicated cluster
Scaling	Single node	PostgreSQL replication	Horizontal / distributed
Index types	Automatic	HNSW, IVFFLAT	HNSW, IVF_FLAT, IVF_SQ8, DISKANN
Distance metrics	Cosine, L2	Cosine, L2, Inner Product	Cosine, L2, Inner Product
Multi-tenant	Yes (tenant + database)	Via schema	Yes (database + collection)
Authentication	Token or Basic Auth	JDBC credentials	Token
Managed cloud	—	Any managed PostgreSQL	Zilliz Cloud

Create / Edit Form

Navigate to Generative AI → Embedding Stores to manage store instances.

General Information

Field	Required	Description
Title	Yes	Display name for this store instance
Description	Yes	Brief description of its purpose
Vendor	Yes	Select the backend: ChromaDB, PgVector, or Milvus. Selecting a vendor applies default values for Endpoint URL and Collection Name.

Connection

Field	Required	Description
Endpoint URL	Yes	Base URL for the store backend
Collection Name	No	Name of the collection or table (vendor-specific default applied)
Credential	No	Authentication token or password, stored encrypted. Leave blank when editing to keep the existing value.

Provider Options

Each vendor exposes additional configuration fields in the Provider Options section. These fields appear dynamically when a vendor is selected. A raw JSON editor is also available for advanced configurations.

Status

Field	Description
Enabled	Toggle to activate or deactivate this store. Disabled stores are not available for selection.

Vendor defaults applied on selection:

Vendor	Default URL	Default Collection
ChromaDB	`http://localhost:8000`	`turing`
PgVector	`jdbc:postgresql://localhost:5432/turing`	`vector_store`
Milvus	`http://localhost:19530`	`turing`

Collection Management

Each store instance provides a Collections page where you can view and manage vector collections.

The collections table shows:

Column	Description
Collection Name	Name of the collection or table
ID	Internal identifier
Document Count	Number of distinct source documents embedded
Chunk Count	Total number of embedding chunks stored
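The difference between the two counts is easiest to see on a toy example. In the sketch below, each chunk carries metadata pointing back to its source document; the `source_id` key is a hypothetical name chosen for illustration, not the metadata field Turing ES actually writes.

```python
def collection_counts(chunk_metadata, key="source_id"):
    """Return (document_count, chunk_count) for a list of chunk metadata dicts."""
    document_count = len({meta[key] for meta in chunk_metadata})  # distinct sources
    chunk_count = len(chunk_metadata)                             # every stored chunk
    return document_count, chunk_count


# Two documents split into three chunks in total.
chunks = [
    {"source_id": "doc-1"},
    {"source_id": "doc-1"},
    {"source_id": "doc-2"},
]
counts = collection_counts(chunks)
```

A long document split into many chunks raises the Chunk Count but adds only one to the Document Count.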

Available actions per collection:

Action	Description
Create	Create a new empty collection in the store
Clear	Remove all embeddings from the collection while keeping the collection structure
Delete	Remove the entire collection and all its data

Clearing or deleting collections

Clearing or deleting a collection permanently removes all stored embeddings. A full re-indexing of all content is required to rebuild the vectors. This action cannot be undone.

System Information

The System Info endpoint returns backend-specific metadata for monitoring and diagnostics:

Backend	Information Returned
ChromaDB	ChromaDB version
PgVector	PostgreSQL version, pgvector extension version
Milvus	Milvus server version

REST API

Store instances are managed via the REST API at /api/store.

Store Instance Endpoints

Method	Endpoint	Description
`GET`	`/api/store`	List all store instances (ordered by title)
`GET`	`/api/store/structure`	Get the structure template for a new instance
`GET`	`/api/store/{id}`	Get a specific store instance
`POST`	`/api/store`	Create a new store instance
`PUT`	`/api/store/{id}`	Update an existing store instance
`DELETE`	`/api/store/{id}`	Delete a store instance

Collection Endpoints

Method	Endpoint	Description
`GET`	`/api/store/{id}/collections`	List all collections in a store
`POST`	`/api/store/{id}/collections/{name}`	Create a new collection
`DELETE`	`/api/store/{id}/collections/{name}`	Delete a collection
`DELETE`	`/api/store/{id}/collections/{name}/clear`	Clear all embeddings from a collection

System Info Endpoint

Method	Endpoint	Description
`GET`	`/api/store/{id}/system-info`	Get store backend version and status

Store Vendor Endpoints

Method	Endpoint	Description
`GET`	`/api/store/vendor`	List all available vendors
`GET`	`/api/store/vendor/{id}`	Get a specific vendor
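The endpoints in the tables above can be exercised from any HTTP client. The sketch below only builds the requests, using Python's standard library. The base URL is an assumption, authentication is omitted because it depends on your deployment, and the helper names are hypothetical.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:2700"  # assumption: replace with your Turing ES host


def store_request(method: str, path: str, base_url: str = BASE_URL):
    """Build (but do not send) a request for a /api/store endpoint."""
    return urllib.request.Request(
        base_url + path, method=method, headers={"Accept": "application/json"}
    )


def list_collections(store_id: str):
    return store_request("GET", f"/api/store/{quote(store_id, safe='')}/collections")


def clear_collection(store_id: str, name: str):
    # Clearing removes all embeddings but keeps the collection itself.
    path = f"/api/store/{quote(store_id, safe='')}/collections/{quote(name, safe='')}/clear"
    return store_request("DELETE", path)


def system_info(store_id: str):
    return store_request("GET", f"/api/store/{quote(store_id, safe='')}/system-info")
```

Passing one of these objects to `urllib.request.urlopen` sends the request. Since clearing cannot be undone, it is worth calling `list_collections` first to confirm the target name.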

Global Configuration

Set the default embedding store in Administration → Settings:

Setting	Description
Default Embedding Store	Which vector database backend to use (ChromaDB, PgVector, or Milvus)

Individual Semantic Navigation Sites can override this setting in their Generative AI tab. The Knowledge Base always uses the global default.
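The resolution order described above fits in a few lines. This is a sketch of the rule as stated, with hypothetical names, not the product's actual code.

```python
from typing import Optional


def effective_store(
    global_default: str,
    site_override: Optional[str] = None,
    is_knowledge_base: bool = False,
) -> str:
    """Pick the embedding store used for a given consumer."""
    if is_knowledge_base:
        return global_default  # the Knowledge Base ignores per-site overrides
    return site_override or global_default
```

A site with no override falls back to the global default, so changing the global setting affects every site that has not chosen its own store.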

Security

Credentials are protected both at rest and in API responses:

Stored encrypted: credentials are encrypted via TurSecretCryptoService before being persisted in the credentialEncrypted column
Never returned: the credential field is transient and write-only. It is accepted on save but never included in API responses
Edit safely: leaving the credential field blank when editing preserves the existing encrypted value
Per-vendor auth: ChromaDB supports Bearer token or Basic Auth; Milvus uses token-based auth; PgVector uses JDBC credentials with connection pooling
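The write-only and blank-preserves-existing rules can be modelled as follows. This is a simplified sketch: the function names are hypothetical, and the `encrypt` callable merely stands in for TurSecretCryptoService.

```python
from typing import Callable, Optional


def next_encrypted_credential(
    existing_encrypted: Optional[str],
    incoming: Optional[str],
    encrypt: Callable[[str], str],
) -> Optional[str]:
    """Decide what to persist when a store instance is saved."""
    if incoming is None or incoming.strip() == "":
        return existing_encrypted  # blank field: keep the stored value
    return encrypt(incoming)       # new value: encrypt before persisting


def to_response(instance: dict) -> dict:
    """Drop credential fields so they never appear in an API response."""
    hidden = {"credential", "credentialEncrypted"}
    return {k: v for k, v in instance.items() if k not in hidden}
```

The consequence for API clients is that a `GET` followed by a `PUT` of the same body does not wipe the credential, because the omitted field is treated as "unchanged".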

Caching

Store instance and vendor data is cached at the repository layer to avoid repeated database reads:

turStoreInstancefindAll — caches the full list of instances
turStoreInstancefindById — caches individual instance lookups
turStoreVendorfindAll / turStoreVendorfindById — caches vendor metadata

Cache entries are invalidated automatically on create, update, or delete.
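The read-through and invalidate-on-write behaviour can be illustrated with an in-memory analogy. The class below is hypothetical and greatly simplified; Turing ES implements this in its own repository layer, and the only names taken from it are the cache names mentioned in the comments.

```python
class CachedStoreRepository:
    """In-memory stand-in for a repository with find-all and find-by-id caches."""

    def __init__(self, rows):
        self._rows = dict(rows)   # plays the role of the database table
        self._all_cache = None    # analogous to turStoreInstancefindAll
        self._id_cache = {}       # analogous to turStoreInstancefindById
        self.db_reads = 0         # counts simulated database reads

    def find_all(self):
        if self._all_cache is None:
            self.db_reads += 1
            self._all_cache = sorted(self._rows.values(), key=lambda r: r["title"])
        return self._all_cache

    def find_by_id(self, store_id):
        if store_id not in self._id_cache:
            self.db_reads += 1
            self._id_cache[store_id] = self._rows.get(store_id)
        return self._id_cache[store_id]

    def save(self, store_id, row):
        self._rows[store_id] = row
        self._invalidate()

    def delete(self, store_id):
        self._rows.pop(store_id, None)
        self._invalidate()

    def _invalidate(self):
        # Any write drops every cached entry so the next read is fresh.
        self._all_cache = None
        self._id_cache.clear()
```

Repeated reads hit the cache and leave `db_reads` unchanged, while the first read after `save` or `delete` goes back to the underlying rows.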

Related Pages

Page	Description
Embedding Models	Configure the models that generate vectors stored here
What is RAG?	How embedding stores fit into the RAG pipeline
GenAI & LLM Configuration	RAG architecture, RAG sources, and system overview
LLM Instances	Configure the LLM providers that supply embedding APIs
Assets	Knowledge Base files that are indexed into the Embedding Store
Semantic Navigation	Generative AI tab: per-site embedding overrides
