SearchConfiguration

org.sagebionetworks.repo.model.search.table.SearchConfiguration

Bundles the index-wide default analyzer and per-column overrides used to build a SearchIndex. A SearchConfiguration is the equivalent of the analyzer wiring in the settings.analysis block of an OpenSearch create-index request — it points at the TextAnalyzer that supplies analysis.analyzer.default (and optionally analysis.analyzer.default_search) for the index, then layers any per-column overrides on top. Bind one to a project (or any entity) and every SearchIndex below it inherits the configuration.

Reference vs inline. Every analyzer / override slot accepts the same shape: either a reference to a saved resource — {"$ref": "{organizationName}-{name}"} — or an inline literal pasted directly. References preserve reuse and central editing; inline literals keep the configuration self-contained but cannot be referenced from elsewhere. The same $ref shape is also used inside a TextAnalyzer's settings.filter registry map to point at a SynonymSet, so curators learn one rule.
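
The reference-versus-inline rule can be sketched in a few lines of Python. This is an illustration only, not Synapse's implementation: the helper name parse_slot is hypothetical, and it assumes that a reference object carries exactly one key, "$ref", and that "letters" in the name rule means ASCII letters. Because a resource name may contain only letters, digits, and underscores, the sketch splits the qualified name on its last hyphen.

```python
import re

# Naming rule from the `name` field description: starts with a letter, then
# only letters, digits, and underscores (ASCII is assumed in this sketch).
NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def parse_slot(slot):
    """Classify a slot value as ("ref", (org, name)) or ("inline", literal).

    Hypothetical helper: assumes a reference is an object whose only key
    is "$ref".
    """
    if isinstance(slot, dict) and set(slot) == {"$ref"}:
        qualified = slot["$ref"]
        if not isinstance(qualified, str):
            raise ValueError("$ref must be a string")
        # The name part cannot contain "-", so split on the last hyphen.
        org, sep, name = qualified.rpartition("-")
        if not sep or not org or not NAME_PATTERN.match(name):
            raise ValueError(f"not a qualified name: {qualified!r}")
        return "ref", (org, name)
    # Anything else is treated as an inline literal.
    return "inline", slot
```

For example, parse_slot({"$ref": "org.sagebionetworks-SCIENTIFIC"}) yields the organization org.sagebionetworks and the name SCIENTIFIC, while any object with other keys is passed through as an inline literal.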

Example using a saved TextAnalyzer plus a saved override (both via $ref):

{
  "organizationName": "biomed",
  "name": "publications_v1",
  "defaultAnalyzer": { "$ref": "org.sagebionetworks-SCIENTIFIC" },
  "columnAnalyzerOverrides": [
    { "$ref": "biomed-publications_overrides" }
  ]
}

Same shape, but with an inline override (the analyzer for the override's columns is itself a $ref):

{
  "organizationName": "biomed",
  "name": "publications_v1",
  "defaultAnalyzer": { "$ref": "org.sagebionetworks-SCIENTIFIC" },
  "columnAnalyzerOverrides": [
    {
      "overrides": [
        { "columnName": "disease_code", "analyzer": { "$ref": "biomed-acronym_exact" } },
        { "columnName": "authors",      "analyzer": { "$ref": "biomed-scientific_search" } }
      ]
    }
  ]
}

Asymmetric index/search analysis (e.g. edge_ngram autocomplete) is expressed inside the chosen TextAnalyzer's settings JSON by declaring both analyzer.default and analyzer.default_search; OpenSearch picks the right one per its search-analyzer precedence rules. There is no separate search-time analyzer field on the SearchConfiguration — in line with the OpenSearch docs' guidance that this case is uncommon.
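
As a rough illustration of the asymmetric case, the Python snippet below builds an analysis block that declares both analyzer.default and analyzer.default_search and prints it as JSON. The tokenizer name autocomplete_tokenizer and the gram sizes are illustrative assumptions, not values taken from Synapse; the edge_ngram tokenizer and its min_gram, max_gram, and token_chars parameters are standard OpenSearch settings.

```python
import json

# Sketch of a TextAnalyzer settings block for autocomplete.
# "autocomplete_tokenizer" and the gram sizes are hypothetical choices.
autocomplete_settings = {
    "tokenizer": {
        "autocomplete_tokenizer": {
            "type": "edge_ngram",
            "min_gram": 2,
            "max_gram": 10,
            "token_chars": ["letter", "digit"],
        }
    },
    "analyzer": {
        # Index time: emit prefixes of each term so partial input can match.
        "default": {
            "type": "custom",
            "tokenizer": "autocomplete_tokenizer",
            "filter": ["lowercase"],
        },
        # Search time: tokenize the query normally, without n-grams.
        "default_search": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase"],
        },
    },
}

print(json.dumps(autocomplete_settings, indent=2))
```

Keeping both entries in one settings block is what lets the SearchConfiguration stay free of a separate search-time analyzer field.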

Synonyms are not wired through the SearchConfiguration. A TextAnalyzer that wants synonyms references a SynonymSet directly via {"$ref": "{org}-{name}"} inside its own settings.filter registry map — see TextAnalyzer.

See SearchConfigBinding for how the configuration attaches to a project, and TextAnalyzer and ColumnAnalyzerOverride for the analyzer components.

Field Type Description
id STRING The unique ID of this search configuration.
organizationName STRING The name of the Organization this resource belongs to. Immutable after creation.
name STRING The resource name. Must start with a letter and contain only letters, digits, and underscores. Unique within the organization and immutable after creation. Used as part of the qualified name ({organizationName}-{name}) when referenced by other resources.
description STRING Optional description.
defaultAnalyzer OBJECT

Optional. The analyzer that supplies this index's analysis.analyzer.default slot. Either a reference to a saved TextAnalyzer (preferred — supports reuse) written as {"$ref": "{organizationName}-{name}"}, or an inline analyzer literal pasted directly. The $ref form looks up the row by qualified name; the inline form lives only inside this SearchConfiguration's JSON, has no persisted id, and cannot be reused elsewhere.

$ref form:

"defaultAnalyzer": { "$ref": "biomed-publications" }

Inline form — the bare OpenSearch settings.analysis block. Allowed root keys are char_filter, tokenizer, filter, and analyzer. Refs to SynonymSets are not permitted inside an inline analyzer literal — save a TextAnalyzer and reference it by qualified name to use synonyms.

"defaultAnalyzer": {
  "tokenizer": { "std": { "type": "standard" } },
  "filter":    { "english_stop": { "type": "stop", "stopwords": "_english_" } },
  "analyzer": {
    "default": {
      "type": "custom",
      "tokenizer": "std",
      "filter": ["lowercase", "english_stop"]
    }
  }
}

If the analyzer also declares an analyzer.default_search entry, that entry becomes the index's analysis.analyzer.default_search (see OpenSearch's index analyzers and search analyzers for the precedence rules). If defaultAnalyzer is omitted, each column falls back to the system default analyzer for its data type (see ColumnTypeToOpenSearchMapping).
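
The two constraints on the inline form (a fixed set of root keys, and no references inside the literal) could be checked client-side before submitting a configuration. The following Python sketch is an assumption about how such a check might look, not the server's validation logic; validate_inline_analyzer and contains_ref are hypothetical names, and the sketch conservatively rejects every "$ref" key rather than only SynonymSet references.

```python
ALLOWED_ROOT_KEYS = {"char_filter", "tokenizer", "filter", "analyzer"}


def contains_ref(value):
    """Return True if a "$ref" key appears anywhere in a nested JSON value."""
    if isinstance(value, dict):
        return "$ref" in value or any(contains_ref(v) for v in value.values())
    if isinstance(value, list):
        return any(contains_ref(v) for v in value)
    return False


def validate_inline_analyzer(literal):
    """Return a list of problems; an empty list means the literal passed."""
    problems = []
    unknown = sorted(set(literal) - ALLOWED_ROOT_KEYS)
    if unknown:
        problems.append(f"unexpected root keys: {unknown}")
    # Stricter than the documented rule: any "$ref" is rejected here.
    if contains_ref(literal):
        problems.append("$ref is not permitted inside an inline analyzer literal")
    return problems
```

Run against the inline example above, the function returns an empty list; a literal with a settings root key or a "$ref" nested under filter would produce one problem each.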

columnAnalyzerOverrides ARRAY<OBJECT>

Optional ordered list of ColumnAnalyzerOverride entries. Each entry is either a reference written as {"$ref": "{organizationName}-{name}"} (preferred, because it supports reuse) or an inline ColumnAnalyzerOverride literal. Inline entries live only inside this SearchConfiguration's JSON. When the same column name appears in more than one override, the first match in list order wins.

$ref form:

"columnAnalyzerOverrides": [
  { "$ref": "biomed-publications_overrides" }
]

Inline form — the same JSON shape a ColumnAnalyzerOverride record would have, minus identity / audit fields. The inner per-column analyzer slot still accepts either a $ref to a saved TextAnalyzer or a bare OpenSearch settings.analysis block:

"columnAnalyzerOverrides": [
  {
    "overrides": [
      { "columnName": "disease_code", "analyzer": { "$ref": "biomed-acronym_exact" } },
      { "columnName": "abstract",     "analyzer": {
        "analyzer": {
          "default": { "type": "custom", "tokenizer": "standard", "filter": ["lowercase"] }
        }
      } }
    ]
  }
]
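
The first-match-wins rule for duplicated column names can be sketched as a single pass over the ordered list. This Python sketch is illustrative only: resolve_column_analyzers is a hypothetical helper, and it assumes that any $ref entries have already been expanded into inline ColumnAnalyzerOverride literals and that items within one entry are also considered in order.

```python
def resolve_column_analyzers(override_entries):
    """Map each column name to the analyzer slot of its first occurrence.

    `override_entries` is an ordered list of inline ColumnAnalyzerOverride
    literals shaped like {"overrides": [{"columnName": ..., "analyzer": ...}]}.
    Expanding $ref entries beforehand is assumed and not shown here.
    """
    resolved = {}
    for entry in override_entries:
        for item in entry.get("overrides", []):
            # setdefault keeps the value stored by an earlier entry, so a
            # later duplicate of the same column name is ignored.
            resolved.setdefault(item["columnName"], item["analyzer"])
    return resolved
```

With two entries that both name the authors column, the analyzer from the earlier entry is the one that ends up in the result.
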
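etag STRING Synapse employs an Optimistic Concurrency Control (OCC) scheme to handle concurrent updates. Because the etag changes every time the resource is updated, it is used to detect when a client's copy of the resource is out of date.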
createdOn STRING The date this resource was created.
createdBy STRING The ID of the user that created this resource.
modifiedOn STRING The date this resource was last modified.
modifiedBy STRING The ID of the user that last modified this resource.