SearchConfiguration

org.sagebionetworks.repo.model.search.table.SearchConfiguration

Bundles the index-wide default analyzer and per-column overrides used to build a SearchIndex. A SearchConfiguration is the equivalent of the analyzer wiring in the settings.analysis block of an OpenSearch create-index request — it points at the TextAnalyzer that supplies analysis.analyzer.default (and optionally analysis.analyzer.default_search) for the index, then layers any per-column overrides on top. Bind one to a project (or any entity) and every SearchIndex below it inherits the configuration.

Reference vs inline. Every analyzer / override slot accepts the same shape: either a reference to a saved resource — {"$ref": "{organizationName}-{name}"} — or an inline literal pasted directly. References preserve reuse and central editing; inline literals keep the configuration self-contained but cannot be referenced from elsewhere. The same $ref shape is also used inside a TextAnalyzer's settings.filter registry map to point at a SynonymSet, so curators learn one rule.
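The reference-vs-inline rule can be sketched in client code as follows. This is a minimal illustration, not part of the Synapse API: the helper names is_ref and parse_qualified_name are hypothetical. Splitting on the last hyphen is an assumption that follows from the name rule in the field table (names contain only letters, digits, and underscores).

```python
from typing import Any, Dict, Tuple


def is_ref(slot: Dict[str, Any]) -> bool:
    """True when the slot is exactly {"$ref": "<qualified name>"}."""
    return set(slot.keys()) == {"$ref"} and isinstance(slot["$ref"], str)


def parse_qualified_name(qualified: str) -> Tuple[str, str]:
    """Split "{organizationName}-{name}" into (organizationName, name).

    Assumption: resource names never contain a hyphen, so the last hyphen
    in the string is treated as the separator.
    """
    org, sep, name = qualified.rpartition("-")
    if not sep or not org or not name:
        raise ValueError(f"not a qualified name: {qualified!r}")
    return org, name


# Hypothetical usage: a $ref slot versus an inline literal.
ref_slot = {"$ref": "org.sagebionetworks-SCIENTIFIC"}
inline_slot = {"analyzer": {"default": {"type": "standard"}}}
print(is_ref(ref_slot), is_ref(inline_slot))
print(parse_qualified_name(ref_slot["$ref"]))
```

Anything that is not a bare $ref object is treated here as an inline literal; the server may apply stricter checks.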

Example using a saved TextAnalyzer plus a saved override (both via $ref):

{
  "organizationName": "biomed",
  "name": "publications_v1",
  "defaultAnalyzer": { "$ref": "org.sagebionetworks-SCIENTIFIC" },
  "columnAnalyzerOverrides": [
    { "$ref": "biomed-publications_overrides" }
  ]
}

Same shape, but with an inline override (the analyzer for the override's columns is itself a $ref):

{
  "organizationName": "biomed",
  "name": "publications_v1",
  "defaultAnalyzer": { "$ref": "org.sagebionetworks-SCIENTIFIC" },
  "columnAnalyzerOverrides": [
    {
      "overrides": [
        { "columnName": "disease_code", "analyzer": { "$ref": "biomed-acronym_exact" } },
        { "columnName": "authors",      "analyzer": { "$ref": "biomed-scientific_search" } }
      ]
    }
  ]
}
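How the layering in the two examples above might resolve for a single column can be sketched as below. This is an illustration under stated assumptions, not the server's implementation: effective_analyzer is a hypothetical helper, and resolve is a hypothetical caller-supplied lookup that returns the saved ColumnAnalyzerOverride for a qualified name. The "first match wins" behaviour comes from the columnAnalyzerOverrides description in the field table.

```python
from typing import Any, Callable, Dict, Optional

Slot = Dict[str, Any]


def effective_analyzer(
    config: Dict[str, Any],
    column_name: str,
    resolve: Callable[[str], Dict[str, Any]],
) -> Optional[Slot]:
    """Return the analyzer slot that applies to column_name, or None."""
    for entry in config.get("columnAnalyzerOverrides", []):
        # A {"$ref": ...} entry is swapped for the saved override it names;
        # an inline entry is used as-is.
        override = resolve(entry["$ref"]) if "$ref" in entry else entry
        for item in override.get("overrides", []):
            if item["columnName"] == column_name:
                return item["analyzer"]  # first match wins
    # No override matched: fall back to the index-wide default. None means
    # the system default for the column's data type would apply.
    return config.get("defaultAnalyzer")


# Usage with the inline example above; no saved overrides are needed here.
config = {
    "defaultAnalyzer": {"$ref": "org.sagebionetworks-SCIENTIFIC"},
    "columnAnalyzerOverrides": [
        {
            "overrides": [
                {"columnName": "disease_code", "analyzer": {"$ref": "biomed-acronym_exact"}},
                {"columnName": "authors", "analyzer": {"$ref": "biomed-scientific_search"}},
            ]
        }
    ],
}
print(effective_analyzer(config, "disease_code", resolve=lambda name: {}))
print(effective_analyzer(config, "title", resolve=lambda name: {}))
```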

Asymmetric index/search analysis (e.g. edge_ngram autocomplete) is expressed inside the chosen TextAnalyzer's settings JSON by declaring both analyzer.default and analyzer.default_search; OpenSearch picks the right one per its search-analyzer precedence rules. There is no separate search-time analyzer field on the SearchConfiguration — in line with the OpenSearch docs' guidance that this case is uncommon.
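As an illustration of the asymmetric case, the settings JSON of such a TextAnalyzer might look like the structure built below. This is a sketch only: the tokenizer name autocomplete_tokenizer and the gram sizes are made up for the example, and the envelope a saved TextAnalyzer wraps around its settings is not shown. The edge_ngram tokenizer type and its min_gram, max_gram, and token_chars parameters are standard OpenSearch options.

```python
import json

# Hypothetical analysis settings: index with edge n-grams, search with whole terms.
asymmetric_settings = {
    "tokenizer": {
        "autocomplete_tokenizer": {  # illustrative name
            "type": "edge_ngram",
            "min_gram": 2,           # illustrative sizes
            "max_gram": 10,
            "token_chars": ["letter", "digit"],
        }
    },
    "analyzer": {
        # Index time: emit prefixes of each token so partial input can match.
        "default": {
            "type": "custom",
            "tokenizer": "autocomplete_tokenizer",
            "filter": ["lowercase"],
        },
        # Search time: keep query terms whole instead of n-gramming them.
        "default_search": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase"],
        },
    },
}

print(json.dumps(asymmetric_settings, indent=2))
```

Because both entries live in the same settings block, the SearchConfiguration itself needs no extra field: it points at this analyzer through defaultAnalyzer as usual.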

Synonyms are not wired through the SearchConfiguration. A TextAnalyzer that wants synonyms references a SynonymSet directly via {"$ref": "{organizationName}-{name}"} inside its own settings.filter registry map (see TextAnalyzer).

See SearchConfigBinding for how the configuration attaches to a project, and TextAnalyzer / ColumnAnalyzerOverride for the analyzer components.

Field	Type	Description
id	STRING	The unique ID of this search configuration.
organizationName	STRING	The name of the Organization this resource belongs to. Immutable after creation.
name	STRING	The resource name. Must start with a letter and contain only letters, digits, and underscores. Unique within the organization and immutable after creation. Used as part of the qualified name ({organizationName}-{name}) when referenced by other resources.
description	STRING	Optional description.
defaultAnalyzer	OBJECT	Optional. The analyzer that supplies this index's `analysis.analyzer.default` slot. Either a reference to a saved TextAnalyzer (preferred — supports reuse) written as `{"$ref": "{organizationName}-{name}"}`, or an inline analyzer literal pasted directly. The `$ref` form looks up the row by qualified name; the inline form lives only inside this SearchConfiguration's JSON, has no persisted id, and cannot be reused elsewhere. $ref form: `"defaultAnalyzer": { "$ref": "biomed-publications" }` Inline form — the bare OpenSearch `settings.analysis` block. Allowed root keys are `char_filter`, `tokenizer`, `filter`, and `analyzer`. Refs to SynonymSets are not permitted inside an inline analyzer literal — save a TextAnalyzer and reference it by qualified name to use synonyms. `"defaultAnalyzer": { "tokenizer": { "std": { "type": "standard" } }, "filter": { "english_stop": { "type": "stop", "stopwords": "_english_" } }, "analyzer": { "default": { "type": "custom", "tokenizer": "std", "filter": ["lowercase", "english_stop"] } } }` If the analyzer also declares an `analyzer.default_search` entry, that becomes the index's `analysis.analyzer.default_search` — see OpenSearch's index analyzers and search analyzers for the precedence rules. If omitted, each column falls back to the system default analyzer for its data type (see ColumnTypeToOpenSearchMapping).
columnAnalyzerOverrides	ARRAY<OBJECT>	Optional ordered list of ColumnAnalyzerOverride entries. Each entry is either a reference written as `{"$ref": "{organizationName}-{name}"}` (preferred — supports reuse) or an inline ColumnAnalyzerOverride literal. Inline entries live only inside this SearchConfiguration's JSON. First match wins for duplicated column names across overrides. $ref form: `"columnAnalyzerOverrides": [ { "$ref": "biomed-publications_overrides" } ]` Inline form — the same JSON shape a ColumnAnalyzerOverride record would have, minus identity / audit fields. The inner per-column `analyzer` slot still accepts either a `$ref` to a saved TextAnalyzer or a bare OpenSearch `settings.analysis` block: `"columnAnalyzerOverrides": [ { "overrides": [ { "columnName": "disease_code", "analyzer": { "$ref": "biomed-acronym_exact" } }, { "columnName": "abstract", "analyzer": { "analyzer": { "default": { "type": "custom", "tokenizer": "standard", "filter": ["lowercase"] } } } } ] } ]`
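etag	STRING	Synapse employs an Optimistic Concurrency Control (OCC) scheme to handle concurrent updates. The etag changes every time the resource is updated, so it is used to detect when a client's copy of the resource is out of date.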
createdOn	STRING	The date this resource was created.
createdBy	STRING	The ID of the user that created this resource.
modifiedOn	STRING	The date this resource was last modified.
modifiedBy	STRING	The ID of the user that last modified this resource.
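A few of the constraints in the table above can be checked client-side before submitting a configuration. The sketch below is illustrative only: the function names are hypothetical, the server remains the authority on validation, and "letters" is assumed to mean ASCII letters.

```python
import re
from typing import Any, Dict, List

# Assumption: "letters" means ASCII letters.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
ALLOWED_ROOT_KEYS = {"char_filter", "tokenizer", "filter", "analyzer"}


def is_valid_name(name: str) -> bool:
    """Starts with a letter; only letters, digits, and underscores."""
    return NAME_PATTERN.fullmatch(name) is not None


def contains_ref(value: Any) -> bool:
    """True if a "$ref" key appears anywhere inside the nested value."""
    if isinstance(value, dict):
        return "$ref" in value or any(contains_ref(v) for v in value.values())
    if isinstance(value, list):
        return any(contains_ref(v) for v in value)
    return False


def inline_analyzer_problems(literal: Dict[str, Any]) -> List[str]:
    """List rule violations for an inline defaultAnalyzer literal."""
    problems = []
    unknown = sorted(set(literal) - ALLOWED_ROOT_KEYS)
    if unknown:
        problems.append(f"unexpected root keys: {unknown}")
    if contains_ref(literal):
        problems.append("$ref is not permitted inside an inline analyzer literal")
    return problems


# Usage with the inline defaultAnalyzer example from the table.
inline = {
    "tokenizer": {"std": {"type": "standard"}},
    "filter": {"english_stop": {"type": "stop", "stopwords": "_english_"}},
    "analyzer": {
        "default": {"type": "custom", "tokenizer": "std", "filter": ["lowercase", "english_stop"]}
    },
}
print(is_valid_name("publications_v1"), inline_analyzer_problems(inline))
```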