Indexing Plugins

An Indexing Plugin is the output adapter that delivers processed documents from the Dumont DEP pipeline to a search engine. Dumont DEP supports three targets — you choose one per deployment.


Available Plugins

Plugin | Target | Client Library | Use Case
Turing (default) | Viglet Turing ES | Turing Java SDK 2026.1.17 | Full enterprise search with GenAI, facets, spotlights
Solr | Apache Solr | SolrJ 10.0.0 | Direct Solr indexing without Turing ES
Elasticsearch | Elasticsearch | ES Java Client 9.3.2 | Direct Elasticsearch indexing without Turing ES

Turing ES Plugin (default)

The default plugin delivers documents to Turing ES via its REST API, using the official Turing Java SDK.

Configuration

dumont:
  indexing:
    provider: turing

turing:
  url: http://localhost:2700
  apiKey: your-turing-api-token

Property | Description
turing.url | Base URL of the Turing ES instance
turing.apiKey | API Token created in Turing ES → Administration → API Tokens

How It Works

  1. Receives a batch of Job Items from the message queue
  2. Creates a TurSNServer instance using the Turing Java SDK
  3. Calls TurSNJobUtils.importItems() to submit the batch
  4. Turing ES validates each document against the target SN Site configuration
  5. Documents are queued internally in Turing ES for Solr indexing

API Token required

The Turing plugin cannot deliver content without a valid API Token. Create one in Turing ES → Administration → API Tokens before starting Dumont DEP.
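
Because a missing token only shows up once delivery fails, it can help to check the two Turing settings before the pipeline starts. The following is a minimal sketch of such a fail-fast check, not Dumont DEP code: the class name `TuringSettingsCheck` and its `validate` method are hypothetical, and the check only confirms that a token is present, since only Turing ES can tell whether it is valid.

```java
import java.net.URI;

/** Hypothetical startup check for the turing.url and turing.apiKey settings. */
final class TuringSettingsCheck {

    /** Returns the parsed base URI, or throws if either setting is unusable. */
    static URI validate(String url, String apiKey) {
        if (apiKey == null || apiKey.isBlank()) {
            throw new IllegalStateException(
                "turing.apiKey is empty: create a token in Turing ES -> Administration -> API Tokens");
        }
        if (url == null || url.isBlank()) {
            throw new IllegalStateException("turing.url is empty");
        }
        // URI.create throws IllegalArgumentException on malformed input.
        URI uri = URI.create(url.trim());
        if (uri.getScheme() == null || uri.getHost() == null) {
            throw new IllegalStateException("turing.url must be an absolute URL: " + url);
        }
        return uri;
    }
}
```

Calling `TuringSettingsCheck.validate("http://localhost:2700", token)` at startup would surface a blank token immediately instead of at the first batch.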


Apache Solr Plugin

The Solr plugin delivers documents directly to an Apache Solr collection, bypassing Turing ES entirely. Use this when you want Dumont DEP as a pure data extraction tool without Turing ES features.

Configuration

dumont:
  indexing:
    provider: solr
    solr:
      url: http://localhost:8983/solr
      collection: my-collection

Property | Description
dumont.indexing.solr.url | Apache Solr base URL
dumont.indexing.solr.collection | Target Solr collection name

How It Works

  1. Receives a batch of Job Items from the message queue
  2. Converts each Job Item into a SolrInputDocument
  3. Adds all documents to the Solr collection via SolrJ
  4. Commits the changes

The Solr client is closed automatically when the application shuts down, through a @PreDestroy hook.
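
To make the add-then-commit flow concrete, here is a rough sketch of the same operation expressed against Solr's JSON update endpoint using only the JDK. It is not how the plugin is implemented (the plugin uses SolrJ and SolrInputDocument). The class `SolrUpdateSketch`, its method names, and the flat `Map` standing in for a Job Item are all assumptions made for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch: one batch becomes one JSON update request with commit=true. */
final class SolrUpdateSketch {

    /** Serializes flat documents (string or numeric field values only) as a JSON array. */
    static String toJsonArray(List<Map<String, Object>> docs) {
        StringJoiner array = new StringJoiner(",", "[", "]");
        for (Map<String, Object> doc : docs) {
            StringJoiner object = new StringJoiner(",", "{", "}");
            doc.forEach((field, value) -> object.add(quote(field) + ":"
                + (value instanceof Number ? value.toString() : quote(String.valueOf(value)))));
            array.add(object.toString());
        }
        return array.toString();
    }

    /** Minimal JSON string escaping: quotes, backslashes, and control characters. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Builds the POST request; assumes the collection name needs no URL encoding. */
    static HttpRequest buildUpdateRequest(String solrUrl, String collection,
                                          List<Map<String, Object>> docs) {
        String base = solrUrl.endsWith("/") ? solrUrl.substring(0, solrUrl.length() - 1) : solrUrl;
        // commit=true mirrors the explicit commit step in the list above.
        URI uri = URI.create(base + "/" + collection + "/update?commit=true");
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(toJsonArray(docs)))
            .build();
    }
}
```

The request can be sent with `java.net.http.HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Committing once per batch, rather than once per document, keeps the number of expensive commits low.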


Elasticsearch Plugin

The Elasticsearch plugin delivers documents directly to an Elasticsearch index using bulk requests.

Configuration

dumont:
  indexing:
    provider: elasticsearch
    elasticsearch:
      url: http://localhost:9200
      index: my-index
      username: ~
      password: ~

Property | Description
dumont.indexing.elasticsearch.url | Elasticsearch base URL
dumont.indexing.elasticsearch.index | Target index name
dumont.indexing.elasticsearch.username | Optional authentication username
dumont.indexing.elasticsearch.password | Optional authentication password

How It Works

  1. Receives a batch of Job Items from the message queue
  2. Builds a bulk request containing all documents
  3. Submits the bulk request to Elasticsearch
  4. Logs any per-document errors from the bulk response

Authentication is optional — leave username and password empty for unauthenticated clusters.
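
For readers unfamiliar with the bulk format, the sketch below shows what a bulk request body and an optional Basic authentication header look like at the HTTP level, using only the JDK. It is an illustration under assumptions, not the plugin's code (the plugin uses the Elasticsearch Java Client). `ElasticsearchBulkSketch` and its methods are hypothetical names, documents are assumed to be already serialized as single-line JSON, and whether the plugin authenticates with HTTP Basic is itself an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of the payload sent to POST /_bulk (Content-Type: application/x-ndjson). */
final class ElasticsearchBulkSketch {

    /**
     * Builds the NDJSON body: one action line plus one source line per document.
     * Keys are document ids, values are single-line JSON sources. Ids and the
     * index name are assumed to contain no characters that need JSON escaping.
     */
    static String buildBulkBody(String index, Map<String, String> jsonSourcesById) {
        StringBuilder body = new StringBuilder();
        jsonSourcesById.forEach((id, source) -> {
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(id).append("\"}}\n");
            body.append(source).append('\n'); // the bulk API requires a trailing newline
        });
        return body.toString();
    }

    /** Returns a Basic Authorization header value, or empty when no username is configured. */
    static Optional<String> basicAuthHeader(String username, String password) {
        if (username == null || username.isBlank()) {
            return Optional.empty();
        }
        String credentials = username + ":" + (password == null ? "" : password);
        return Optional.of("Basic "
            + Base64.getEncoder().encodeToString(credentials.getBytes(StandardCharsets.UTF_8)));
    }
}
```

A bulk response can report success overall while individual items fail, which is why the plugin logs per-document errors rather than relying on the HTTP status alone.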


Why Use Turing ES Instead of Solr or Elasticsearch Directly?

The Solr and Elasticsearch plugins deliver raw documents to the search engine — your application is responsible for everything else: building queries, rendering facets, managing spotlights, handling locales, and building the search UI.

Turing ES adds an entire enterprise search platform on top of the search engine. Here's what you get by choosing the Turing plugin over direct Solr or Elasticsearch indexing:

Search Experience Layer

Capability | With Turing ES | Direct Solr / Elasticsearch
Faceted navigation | Configured per site: facet types, sort, AND/OR operators, secondary facets, custom facets with ranges | Build manually with query params
Spotlights | Curated results pinned to search terms, injected at configured positions | Not available; build from scratch
Targeting Rules | Filter results by user profile (department, role, country) at query time | Not available
Merge Providers | Combine documents from two connectors into one enriched result using a join key | Not available
Spell check | Built-in with auto-correction mode | Configure Solr/ES suggester manually
Autocomplete | Ready-to-use endpoint | Configure Solr/ES suggester manually
More Like This | One toggle per site | Configure MLT handler manually
Result ranking | Boost rules with conditions and weights via admin UI | Write boost queries manually
Highlighting | Configurable HTML tags per site | Configure highlight params manually
Self-describing JSON | Response includes pre-built links for pagination, facet filters, and locale switching, so the front-end is a pure rendering layer | Build all query logic client-side

Generative AI & RAG

Capability | With Turing ES | Direct Solr / Elasticsearch
RAG (Retrieval-Augmented Generation) | Documents are automatically embedded as vectors during indexing; users ask questions in natural language and get grounded answers | Not available
AI Agents | Compose assistants that combine LLM + search + web browsing + code execution + MCP tools | Not available
Chat interface | Built-in UI with direct LLM, Semantic Navigation, and AI Agent tabs | Not available
Knowledge Base | Upload files to MinIO; they are automatically indexed as vector embeddings for RAG | Not available
LLM providers | Anthropic Claude, OpenAI, Azure OpenAI, Google Gemini, Ollama (configured via admin UI) | Not available
Tool calling | 27 native tools across 7 categories available to AI Agents | Not available
Token usage monitoring | Dashboard showing LLM consumption by model, day, and month | Not available

Administration & Operations

Capability | With Turing ES | Direct Solr / Elasticsearch
Admin console | Browser-based React UI for all configuration | Solr Admin UI (limited) / Kibana (separate)
Multi-site | Multiple SN Sites on one instance, each with independent fields, facets, and AI settings | Manage collections/indices manually
Multi-language | Locale-aware indexing and search with per-locale Solr cores | Configure manually per core/index
Connector management | Integration page with monitoring, stats, double-check, and indexing manager | Not available
Search metrics | Top search terms by day/week/month/all-time per site | Not available out of the box
Application logs | MongoDB-backed log viewer in the admin console | External log tools required
Security (SSO) | Keycloak OAuth2/OIDC with SAML, LDAP, social login, and MFA | Configure separately per product

Integration

Capability | With Turing ES | Direct Solr / Elasticsearch
REST API | Self-describing search response with pre-built navigation links | Raw query/response
GraphQL | Built-in endpoint | Not available
Java SDK | Official typed client on Maven Central | Use SolrJ / ES Client directly
JavaScript SDK | Official @viglet/turing-sdk on npm | No official SDK

When to Use Direct Solr or Elasticsearch

The Solr and Elasticsearch plugins are appropriate when:

  • You already have a search infrastructure and only need Dumont DEP as a data extraction tool
  • You have your own search UI and query layer built on top of Solr/Elasticsearch
  • You don't need the GenAI, faceted navigation, or admin console features
  • You're integrating with a third-party system that requires direct Solr/Elasticsearch access

For all other cases — especially when building a new search experience — Turing ES is the recommended target because it provides a complete, ready-to-use enterprise search platform with GenAI capabilities on top of the same Apache Solr engine.


Switching Plugins

Change the active plugin by setting dumont.indexing.provider:

# Via JVM property
java -Ddumont.indexing.provider=solr -jar viglet-dumont.jar

# Via environment variable (export it so the java process inherits it)
export DUMONT_INDEXING_PROVIDER=elasticsearch
java -jar viglet-dumont.jar

Only one plugin is active per deployment. All connectors share the same output target.
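
The lookup behind this switch can be pictured with a short sketch. It is hypothetical and not Dumont DEP code: `ProviderResolutionSketch` is an invented name, the precedence of the JVM property over the environment variable is an assumption borrowed from Spring Boot's default ordering, and values from application.yaml are left out for brevity. The fallback to `turing` reflects its role as the default plugin.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of resolving the single active indexing provider. */
final class ProviderResolutionSketch {

    enum Provider { TURING, SOLR, ELASTICSEARCH }

    /**
     * Checks the JVM property first, then the environment variable, then falls
     * back to TURING. An unknown value raises IllegalArgumentException.
     */
    static Provider resolve(Map<String, String> systemProperties, Map<String, String> environment) {
        String raw = systemProperties.get("dumont.indexing.provider");
        if (raw == null || raw.isBlank()) {
            raw = environment.get("DUMONT_INDEXING_PROVIDER");
        }
        if (raw == null || raw.isBlank()) {
            return Provider.TURING;
        }
        return Provider.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }
}
```

Passing the two sources in as maps keeps the function easy to test; in a real process the second argument could simply be `System.getenv()`.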


Page | Description
Configuration Reference | All application.yaml properties
Architecture | Where indexing plugins fit in the pipeline
Turing ES — REST API | Turing ES indexing API reference
Turing ES — Semantic Navigation | SN Sites, facets, spotlights, and search configuration
Turing ES — RAG | Retrieval-Augmented Generation and vector embeddings