Embedding Stores
An Embedding Store is the specialized vector database that persists and queries document embeddings — the numerical vectors generated by the Embedding Model. It enables similarity search, finding the documents most semantically related to a user's query.
Turing ES supports three backends via Spring AI. The active backend is set globally in Administration → Settings and can be overridden per Semantic Navigation Site in its Generative AI tab.
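To make the idea of similarity search concrete, here is a minimal brute-force sketch in plain Python. It is only an illustration of the concept, not how Turing ES or any of the backends below implement search (they rely on approximate indexes). The toy vectors, document IDs, and the `top_k` helper are all assumptions made up for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Return the k document IDs whose embeddings are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda doc_id: cosine_similarity(query, store[doc_id]),
        reverse=True,
    )
    return ranked[:k]

# Toy 3-dimensional embeddings; real models produce far more dimensions.
store = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 0.0, 1.0],
}
nearest = top_k([1.0, 0.05, 0.0], store, k=2)
```

A real store answers the same question ("which stored vectors are closest to this query vector?") but uses an index so it does not have to compare the query against every document.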
Supported Backends
ChromaDB
A lightweight, open-source vector database ideal for development and small to medium deployments.
- Self-hosted, connects via its HTTP API
- Zero infrastructure overhead for teams already running Python tooling
- No special schema setup required — Turing ES manages the collections automatically
- Multi-tenant and multi-database support
Docker Compose quickstart:
services:
  chroma:
    image: chromadb/chroma:latest
    ports:
      - "8000:8000"
| Configuration | Default | Description |
|---|---|---|
| Base URL | http://localhost:8000 | Chroma HTTP API endpoint |
| Collection Name | turing | Target collection name |
| Tenant Name | default_tenant | Chroma tenant identifier |
| Database Name | default_database | Chroma database name |
| Key Token | — | Bearer token for authentication |
| Basic Username / Password | — | Basic auth credentials |
Authentication: ChromaDB supports two methods — Bearer token and Basic auth. Configure either via the credential field or the provider options. When both are present, the credential field takes precedence.
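The precedence rule above can be sketched as a small resolver. This is a hypothetical illustration: the option keys (`keyToken`, `basicUsername`, `basicPassword`), the function name, and the assumption that the credential field is sent as a Bearer token in the `Authorization` header are not taken from the Turing ES source.

```python
import base64

def resolve_chroma_auth(credential=None, options=None):
    """Build an Authorization header; the credential field wins over provider options."""
    options = options or {}
    # 1. Credential field (assumed here to hold a Bearer token) takes precedence.
    if credential:
        return {"Authorization": f"Bearer {credential}"}
    # 2. Otherwise fall back to a token set in the provider options.
    if options.get("keyToken"):
        return {"Authorization": f"Bearer {options['keyToken']}"}
    # 3. Otherwise use Basic auth when both username and password are set.
    user, password = options.get("basicUsername"), options.get("basicPassword")
    if user and password:
        raw = f"{user}:{password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    # 4. No authentication configured.
    return {}
```

For example, `resolve_chroma_auth("abc", {"keyToken": "xyz"})` yields a Bearer header built from `abc`, because the credential field is checked first.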
PgVector
PostgreSQL with the pgvector extension — the best choice for deployments that already use PostgreSQL as their primary database.
- Avoids an additional infrastructure dependency
- Embeddings live in the same database as your application data
- Supports standard PostgreSQL backup, replication, and access control
- Connection pooling via HikariCP (max 5 connections per store)
Enable the required extensions in your PostgreSQL instance:
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS hstore;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
| Configuration | Default | Description |
|---|---|---|
| JDBC URL | — | PostgreSQL connection string (e.g., jdbc:postgresql://localhost:5432/turing) |
| Username | — | Database user |
| Password | — | Database password |
| Table Name | vector_store | Table where embeddings are stored |
| Schema Name | public | PostgreSQL schema |
| Dimensions | — | Vector dimensionality — must match the embedding model |
| Distance Type | — | COSINE_DISTANCE, EUCLIDEAN_DISTANCE, or INNER_PRODUCT |
| Index Type | — | HNSW or IVFFLAT |
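As a rough guide to what the Distance Type options measure, the sketch below computes each one for a pair of toy vectors and shows the kind of dimension check implied by the Dimensions setting. The helper names are hypothetical; the operator notes in the comments refer to pgvector's SQL operators, and how Turing ES maps each option onto them is an assumption here.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means same direction (pgvector operator <=>)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line (L2) distance (pgvector operator <->)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def negative_inner_product(a, b):
    """pgvector's <#> returns the inner product negated, so smaller still means closer."""
    return -sum(x * y for x, y in zip(a, b))

def check_dimensions(vector, expected):
    """Reject vectors whose length differs from the configured Dimensions value."""
    if len(vector) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(vector)}")

a, b = [3.0, 4.0], [4.0, 3.0]
check_dimensions(a, 2)
distances = (cosine_distance(a, b), euclidean_distance(a, b), negative_inner_product(a, b))
```

Cosine distance ignores vector length and is a common default for text embeddings; if the embedding model already produces normalized vectors, inner product gives the same ranking.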
Table schema created by Turing ES:
CREATE TABLE IF NOT EXISTS "vector_store" (
    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY,
    content TEXT,
    metadata JSON,
    embedding VECTOR
);
Milvus
A purpose-built, cloud-native vector database designed for high-scale similarity search.
- Recommended for large corpora or high-throughput deployments
- Supports distributed operation, horizontal scaling, and advanced index management
- Managed cloud offering available (Zilliz Cloud)
| Configuration | Default | Description |
|---|---|---|
| Base URL | http://localhost:19530 | Milvus service URI |
| Collection Name | turing | Target collection |
| Database Name | — | Optional database name |
| Token | — | Authentication token (username:password format) |
| Embedding Dimension | — | Vector dimensionality — must match the embedding model |
| Metric Type | — | COSINE, L2, or IP (inner product) |
| Index Type | — | HNSW, IVF_FLAT, IVF_SQ8, or DISKANN |
| Index Parameters | — | JSON string with index-specific params (e.g., {"M":16,"efConstruction":200}) |
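Because Index Parameters is entered as a raw JSON string, it can help to validate it before saving. The sketch below is a hypothetical pre-check, not part of Turing ES. The key sets follow the build parameters Milvus documents for each index type, and treating them as mandatory is an assumption, since Milvus may apply defaults.

```python
import json

# Assumed build parameters per index type (treated as required for this sketch).
EXPECTED_PARAMS = {
    "HNSW": {"M", "efConstruction"},
    "IVF_FLAT": {"nlist"},
    "IVF_SQ8": {"nlist"},
    "DISKANN": set(),
}

def parse_index_params(index_type, raw):
    """Parse the JSON string and check it carries the keys expected for index_type."""
    if index_type not in EXPECTED_PARAMS:
        raise ValueError(f"unsupported index type: {index_type}")
    params = json.loads(raw) if raw else {}
    if not isinstance(params, dict):
        raise ValueError("index parameters must be a JSON object")
    missing = EXPECTED_PARAMS[index_type] - set(params)
    if missing:
        raise ValueError(f"missing parameters for {index_type}: {sorted(missing)}")
    return params

hnsw = parse_index_params("HNSW", '{"M":16,"efConstruction":200}')
```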
Store Comparison
| Feature | ChromaDB | PgVector | Milvus |
|---|---|---|---|
| Best for | Dev / small-medium | PostgreSQL shops | Large-scale production |
| Infrastructure | Standalone container | PostgreSQL extension | Dedicated cluster |
| Scaling | Single node | PostgreSQL replication | Horizontal / distributed |
| Index types | Automatic | HNSW, IVFFLAT | HNSW, IVF_FLAT, IVF_SQ8, DISKANN |
| Distance metrics | Cosine, L2 | Cosine, L2, Inner Product | Cosine, L2, Inner Product |
| Multi-tenant | Yes (tenant + database) | Via schema | Yes (database + collection) |
| Authentication | Token or Basic Auth | JDBC credentials | Token |
| Managed cloud | — | Any managed PostgreSQL | Zilliz Cloud |
Create / Edit Form
Navigate to Generative AI → Embedding Stores to manage store instances.
General Information
| Field | Required | Description |
|---|---|---|
| Title | Yes | Display name for this store instance |
| Description | Yes | Brief description of its purpose |
| Vendor | Yes | Select the backend: ChromaDB, PgVector, or Milvus. Selecting a vendor applies default values for Endpoint URL and Collection Name. |
Connection
| Field | Required | Description |
|---|---|---|
| Endpoint URL | Yes | Base URL for the store backend |
| Collection Name | No | Name of the collection or table (vendor-specific default applied) |
| Credential | No | Authentication token or password, stored encrypted. Leave blank when editing to keep the existing value. |
Provider Options
Each vendor exposes additional configuration fields in the Provider Options section. These fields appear dynamically when a vendor is selected. A raw JSON editor is also available for advanced configurations.
Status
| Field | Description |
|---|---|
| Enabled | Toggle to activate or deactivate this store. Disabled stores are not available for selection. |
Vendor defaults applied on selection:
| Vendor | Default URL | Default Collection |
|---|---|---|
| ChromaDB | http://localhost:8000 | turing |
| PgVector | jdbc:postgresql://localhost:5432/turing | vector_store |
| Milvus | http://localhost:19530 | turing |
Collection Management
Each store instance provides a Collections page where you can view and manage vector collections.
The collections table shows:
| Column | Description |
|---|---|
| Collection Name | Name of the collection or table |
| ID | Internal identifier |
| Document Count | Number of distinct source documents embedded |
| Chunk Count | Total number of embedding chunks stored |
Available actions per collection:
| Action | Description |
|---|---|
| Create | Create a new empty collection in the store |
| Clear | Remove all embeddings from the collection while keeping the collection structure |
| Delete | Remove the entire collection and all its data |
Clearing or deleting a collection permanently removes all stored embeddings. A full re-indexing of all content is required to rebuild the vectors. This action cannot be undone.
System Information
The System Info endpoint returns backend-specific metadata for monitoring and diagnostics:
| Backend | Information Returned |
|---|---|
| ChromaDB | ChromaDB version |
| PgVector | PostgreSQL version, pgvector extension version |
| Milvus | Milvus server version |
REST API
Store instances are managed via the REST API at /api/store.
Store Instance Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/store | List all store instances (ordered by title) |
| GET | /api/store/structure | Get the structure template for a new instance |
| GET | /api/store/{id} | Get a specific store instance |
| POST | /api/store | Create a new store instance |
| PUT | /api/store/{id} | Update an existing store instance |
| DELETE | /api/store/{id} | Delete a store instance |
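As a rough illustration of calling these endpoints, the sketch below builds (but does not send) two requests with Python's standard library. The base URL, the JSON field names in the payload, and the absence of any API authentication header are all assumptions; check the structure returned by `/api/store/structure` for the real field names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:2700"  # assumption: adjust to your Turing ES host and port

def build_request(method, path, body=None):
    """Create a urllib Request for the store API without sending it."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)

# List all store instances.
list_req = build_request("GET", "/api/store")

# Create a store instance; every field name below is hypothetical.
create_req = build_request("POST", "/api/store", {
    "title": "Local Chroma",
    "description": "Development vector store",
    "url": "http://localhost:8000",
    "collectionName": "turing",
    "enabled": True,
})

# To send a request against a running server:
#     with urllib.request.urlopen(list_req) as resp:
#         instances = json.load(resp)
```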
Collection Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/store/{id}/collections | List all collections in a store |
| POST | /api/store/{id}/collections/{name} | Create a new collection |
| DELETE | /api/store/{id}/collections/{name} | Delete a collection |
| DELETE | /api/store/{id}/collections/{name}/clear | Clear all embeddings from a collection |
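The four collection routes differ only in HTTP method and path suffix, which the hypothetical helper below makes explicit. It only maps an action to a method and path (URL-encoding the store ID and collection name); it does not perform any request, and the function name is an assumption for illustration.

```python
from urllib.parse import quote

def collection_call(action, store_id, name=None):
    """Map a collection action to its (HTTP method, path) pair."""
    base = f"/api/store/{quote(str(store_id), safe='')}/collections"
    if action == "list":
        return ("GET", base)
    if not name:
        raise ValueError(f"a collection name is required for '{action}'")
    target = f"{base}/{quote(name, safe='')}"
    if action == "create":
        return ("POST", target)
    if action == "delete":
        return ("DELETE", target)  # removes the collection and all its data
    if action == "clear":
        return ("DELETE", target + "/clear")  # removes embeddings, keeps the collection
    raise ValueError(f"unknown action: {action}")
```

Note that delete and clear both use the DELETE method, so the `/clear` suffix is the only thing separating "empty this collection" from "remove it entirely".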
System Info Endpoint
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/store/{id}/system-info | Get store backend version and status |
Store Vendor Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/store/vendor | List all available vendors |
| GET | /api/store/vendor/{id} | Get a specific vendor |
Global Configuration
Set the default embedding store in Administration → Settings:
| Setting | Description |
|---|---|
| Default Embedding Store | Which vector database backend to use (ChromaDB, PgVector, or Milvus) |
Individual Semantic Navigation Sites can override this setting in their Generative AI tab. The Knowledge Base always uses the global default.
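The resolution rule described above can be summarized in a few lines. This is a hypothetical sketch of the logic only; the function and parameter names are not taken from Turing ES.

```python
def resolve_store(global_default, site_override=None, is_knowledge_base=False):
    """Pick the embedding store for a request.

    - The Knowledge Base always uses the global default.
    - A Semantic Navigation Site uses its own override when one is set,
      otherwise it falls back to the global default.
    """
    if is_knowledge_base:
        return global_default
    return site_override or global_default
```

For example, `resolve_store("PgVector", site_override="Milvus")` returns `"Milvus"`, while the same call with `is_knowledge_base=True` returns `"PgVector"`.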
Security
Credentials are handled with care at every layer:
- Stored encrypted: credentials are encrypted via TurSecretCryptoService before being persisted in the credentialEncrypted column
- Never returned: the credential field is transient and write-only. It flows in on save but never comes back in API responses
- Edit safely: leaving the credential field blank when editing preserves the existing encrypted value
- Per-vendor auth: ChromaDB supports Bearer token or Basic Auth; Milvus uses token-based auth; PgVector uses JDBC credentials with connection pooling
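The write-only behavior in the list above can be modeled as follows. This is a simplified sketch: `fake_encrypt` is a placeholder that performs no real encryption and merely stands in for TurSecretCryptoService, and hiding the encrypted column from responses is an assumption, since the text only states that the credential field is never returned.

```python
def fake_encrypt(plain):
    """Placeholder only, NOT real encryption; stands in for the crypto service."""
    return "enc:" + plain[::-1]

def apply_credential(existing_encrypted, incoming, encrypt=fake_encrypt):
    """Return the value to persist in the encrypted column on save."""
    # A blank or missing credential on edit keeps the stored encrypted value.
    if incoming is None or incoming.strip() == "":
        return existing_encrypted
    # A non-blank credential replaces it, encrypted before persisting.
    return encrypt(incoming)

def to_response(record):
    """Build the API response body, leaving out secret fields."""
    hidden = {"credential", "credentialEncrypted"}  # second key: assumption
    return {key: value for key, value in record.items() if key not in hidden}
```

The practical consequence for API clients is that a GET followed by a PUT of the same body does not wipe the stored secret, because the missing credential is treated as "keep the existing value".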
Caching
Store instance and vendor data is cached at the repository layer to avoid repeated database reads:
- turStoreInstancefindAll: caches the full list of instances
- turStoreInstancefindById: caches individual instance lookups
- turStoreVendorfindAll / turStoreVendorfindById: caches vendor metadata
Cache entries are invalidated automatically on create, update, or delete.
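The read-through-then-invalidate pattern described above can be illustrated with a toy repository. This is a language-neutral sketch written in Python, not the actual Turing ES caching code; the class, its method names, and the in-memory dictionary standing in for the database are all assumptions.

```python
class CachedRepository:
    """Toy repository: reads are cached, any write clears the cache."""

    def __init__(self, rows):
        self._rows = dict(rows)  # stands in for the database table
        self._cache = {}
        self.db_reads = 0        # counts how often the "database" is hit

    def find_by_id(self, key):
        cache_key = ("findById", key)
        if cache_key not in self._cache:
            self.db_reads += 1
            self._cache[cache_key] = self._rows.get(key)
        return self._cache[cache_key]

    def find_all(self):
        cache_key = ("findAll",)
        if cache_key not in self._cache:
            self.db_reads += 1
            self._cache[cache_key] = sorted(self._rows.values())
        return self._cache[cache_key]

    def save(self, key, value):
        """Create or update a row, then invalidate all cached entries."""
        self._rows[key] = value
        self._cache.clear()

    def delete(self, key):
        """Delete a row, then invalidate all cached entries."""
        self._rows.pop(key, None)
        self._cache.clear()

repo = CachedRepository({"1": "Chroma"})
repo.find_by_id("1")
repo.find_by_id("1")           # served from cache, no second database read
reads_before_write = repo.db_reads
repo.save("1", "Milvus")       # write invalidates the cache
after_write = repo.find_by_id("1")
```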
Related Pages
| Page | Description |
|---|---|
| Embedding Models | Configure the models that generate vectors stored here |
| What is RAG? | How embedding stores fit into the RAG pipeline |
| GenAI & LLM Configuration | RAG architecture, RAG sources, and system overview |
| LLM Instances | Configure the LLM providers that supply embedding APIs |
| Assets | Knowledge Base files that are indexed into the Embedding Store |
| Semantic Navigation | Generative AI tab: per-site embedding overrides |