# Bulk Semantic Search

> Perform semantic OMOP concept search on multiple natural-language queries simultaneously with optimized batch processing for LLM and AI workflows.

## Overview

This endpoint accepts multiple semantic search queries in a single request and runs each one as a vector similarity search. The vector similarity service is hosted in-house; no third-party services are involved. It suits clinical note processing, batch NLP pipelines, and any workflow that needs high-confidence concept matching across many terms.

Each query returns up to 10 results by default (configurable via `page_size`, maximum 50). A single request can contain up to 25 queries.
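
Because a request holds at most 25 queries, longer term lists have to be split on the client. The following is a minimal sketch of that chunking; the helper name `build_batches` and the `q{index}` scheme for `search_id` are illustrative assumptions, not part of the API.

```python
from typing import Iterator

MAX_SEARCHES_PER_REQUEST = 25  # documented per-request limit


def build_batches(
    terms: list[str], batch_size: int = MAX_SEARCHES_PER_REQUEST
) -> Iterator[dict]:
    """Yield one request body per chunk of at most `batch_size` terms."""
    if not 1 <= batch_size <= MAX_SEARCHES_PER_REQUEST:
        raise ValueError("batch_size must be between 1 and 25")
    for start in range(0, len(terms), batch_size):
        chunk = terms[start:start + batch_size]
        yield {
            "searches": [
                # search_id encodes the term's position in the full list
                # (hypothetical naming scheme, chosen for this sketch)
                {"search_id": f"q{start + offset}", "query": term}
                for offset, term in enumerate(chunk)
            ]
        }


batches = list(build_batches([f"term {i}" for i in range(60)]))
print([len(b["searches"]) for b in batches])  # [25, 25, 10]
```

Each yielded dictionary can be sent as the JSON body of one request; a `defaults` object can be added to it before sending.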

## Request Body

<ParamField body="searches" type="array" required>
  Array of semantic search query objects (1-25 items)

  <Expandable title="Search Object">
    <ParamField body="search_id" type="string" required>
      Unique identifier for this search within the batch
    </ParamField>

    <ParamField body="query" type="string" required>
      Natural language search query (1-500 characters)
    </ParamField>

    <ParamField body="page_size" type="integer" default="10">
      Number of results per search (1-50). Overrides `defaults.page_size` for this search.
    </ParamField>

    <ParamField body="threshold" type="number" default="0.5">
      Minimum similarity score (0.0-1.0). Higher values = stricter matching. Overrides `defaults.threshold` for this search.
    </ParamField>

    <ParamField body="vocabulary_ids" type="string[]">
      Filter results to specific vocabularies (e.g., `["SNOMED", "ICD10CM"]`). Overrides `defaults.vocabulary_ids` for this search.
    </ParamField>

    <ParamField body="domain_ids" type="string[]">
      Filter results to specific domains (e.g., `["Condition", "Drug"]`). Overrides `defaults.domain_ids` for this search.
    </ParamField>

    <ParamField body="standard_concept" type="string">
      Filter by standard concept status: `"S"` (Standard) or `"C"` (Classification). Overrides `defaults.standard_concept` for this search.
    </ParamField>

    <ParamField body="concept_class_id" type="string">
      Filter by concept class (e.g., `"Clinical Finding"`). Overrides `defaults.concept_class_id` for this search.
    </ParamField>
  </Expandable>
</ParamField>

<ParamField body="defaults" type="object">
  Default parameters applied to all searches. Individual searches can override any default.

  <Expandable title="Defaults Object">
    <ParamField body="page_size" type="integer" default="10">
      Default results per search (1-50)
    </ParamField>

    <ParamField body="threshold" type="number" default="0.5">
      Default minimum similarity score (0.0-1.0)
    </ParamField>

    <ParamField body="vocabulary_ids" type="string[]">
      Default vocabulary filter for all searches (e.g., `["SNOMED"]`)
    </ParamField>

    <ParamField body="domain_ids" type="string[]">
      Default domain filter for all searches (e.g., `["Condition"]`)
    </ParamField>

    <ParamField body="standard_concept" type="string">
      Default standard concept filter: `"S"` or `"C"`
    </ParamField>

    <ParamField body="concept_class_id" type="string">
      Default concept class filter
    </ParamField>
  </Expandable>
</ParamField>
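
The override rule can be pictured as a shallow merge. The sketch below is a client-side illustration only: it assumes that a key set on a search object replaces the same key from `defaults` wholesale (so a per-search `vocabulary_ids` replaces the default list rather than extending it). `effective_params` is a hypothetical helper, not an API function.

```python
BUILT_IN_DEFAULTS = {"page_size": 10, "threshold": 0.5}  # documented defaults


def effective_params(defaults: dict, search: dict) -> dict:
    """Return the parameters a search would run with under a shallow-override rule."""
    # Everything except the identifying fields counts as a per-search override.
    overrides = {k: v for k, v in search.items() if k not in ("search_id", "query")}
    # Later dictionaries win: built-in defaults < request defaults < per-search values.
    return {**BUILT_IN_DEFAULTS, **defaults, **overrides}


defaults = {"vocabulary_ids": ["SNOMED"], "standard_concept": "S", "page_size": 5}
search = {"search_id": "s3", "query": "aspirin tablets", "vocabulary_ids": ["RxNorm"]}
print(effective_params(defaults, search))
```

Under this assumption the `s3` search runs against RxNorm only, keeps `standard_concept` and `page_size` from `defaults`, and falls back to the built-in `threshold` of 0.5.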

## Query Parameters

<ParamField query="vocab_release" type="string">
  Specific vocabulary release version (defaults to latest)
</ParamField>

## Response

<ResponseField name="success" type="boolean">
  Indicates if the request was successful
</ResponseField>

<ResponseField name="data" type="object">
  Response data object containing results and summary counts

  <Expandable title="Data Object">
    <ResponseField name="results" type="array">
      Array of search results, one per query

      <Expandable title="Search Result">
        <ResponseField name="search_id" type="string">
          Identifier matching the `search_id` sent in the request
        </ResponseField>

        <ResponseField name="query" type="string">
          Original search query
        </ResponseField>

        <ResponseField name="status" type="string">
          Query execution status: `completed` or `failed`
        </ResponseField>

        <ResponseField name="results" type="array">
          Array of matching concepts with similarity scores

          <Expandable title="Concept Result">
            <ResponseField name="concept_id" type="integer">
              Unique concept identifier
            </ResponseField>

            <ResponseField name="concept_name" type="string">
              Primary concept name
            </ResponseField>

            <ResponseField name="concept_code" type="string">
              Concept code from source vocabulary
            </ResponseField>

            <ResponseField name="vocabulary_id" type="string">
              Source vocabulary
            </ResponseField>

            <ResponseField name="domain_id" type="string">
              Domain classification
            </ResponseField>

            <ResponseField name="concept_class_id" type="string">
              Concept class
            </ResponseField>

            <ResponseField name="standard_concept" type="string | null">
              Standard concept indicator: `S`, `C`, or `null`
            </ResponseField>

            <ResponseField name="similarity_score" type="number">
              Semantic similarity score (0.0-1.0). Higher = more similar.
            </ResponseField>

            <ResponseField name="matched_text" type="string">
              The text that matched (concept name or synonym)
            </ResponseField>
          </Expandable>
        </ResponseField>

        <ResponseField name="error" type="string">
          Error message (only present if query failed)
        </ResponseField>

        <ResponseField name="similarity_threshold" type="number">
          The similarity threshold used for this query
        </ResponseField>

        <ResponseField name="result_count" type="integer">
          Number of results returned for this query
        </ResponseField>

        <ResponseField name="duration" type="number">
          Processing time for this query in milliseconds
        </ResponseField>

        <ResponseField name="query_enhancement" type="object | null">
          Query enhancement details; `null` if the query was not modified

          <Expandable title="Query Enhancement">
            <ResponseField name="original_query" type="string">
              Original query before enhancement
            </ResponseField>

            <ResponseField name="enhanced_query" type="string">
              Enhanced query used for search
            </ResponseField>

            <ResponseField name="abbreviations_expanded" type="string[]">
              List of abbreviations that were expanded
            </ResponseField>

            <ResponseField name="misspellings_corrected" type="string[]">
              List of misspellings that were corrected
            </ResponseField>
          </Expandable>
        </ResponseField>
      </Expandable>
    </ResponseField>

    <ResponseField name="total_searches" type="integer">
      Total number of searches in the request
    </ResponseField>

    <ResponseField name="completed_count" type="integer">
      Number of successfully completed searches
    </ResponseField>

    <ResponseField name="failed_count" type="integer">
      Number of failed searches
    </ResponseField>

    <ResponseField name="total_duration" type="number">
      Total processing time in milliseconds
    </ResponseField>
  </Expandable>
</ResponseField>

<ResponseField name="meta" type="object">
  <Expandable title="Metadata">
    <ResponseField name="request_id" type="string">
      Unique identifier for the request
    </ResponseField>

    <ResponseField name="vocab_release" type="string">
      Vocabulary release version used
    </ResponseField>

    <ResponseField name="timestamp" type="string">
      Request timestamp
    </ResponseField>
  </Expandable>
</ResponseField>
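
Since matches are nested two levels deep (`data.results[].results[]`), it is often convenient to flatten them into rows before loading them into a table or DataFrame. A minimal sketch, assuming only the fields documented above; `flatten_results` is a hypothetical helper and the sample input is hand-written, not real API output.

```python
from typing import Iterator


def flatten_results(body: dict) -> Iterator[tuple[str, int, str, float]]:
    """Yield (search_id, concept_id, concept_name, similarity_score) for every match."""
    for item in body["data"]["results"]:
        if item["status"] != "completed":
            continue  # failed searches carry an `error` instead of matches
        for concept in item["results"]:
            yield (
                item["search_id"],
                concept["concept_id"],
                concept["concept_name"],
                concept["similarity_score"],
            )


# Hand-written sample in the documented response shape.
sample = {
    "success": True,
    "data": {
        "results": [
            {
                "search_id": "s1",
                "status": "completed",
                "results": [
                    {
                        "concept_id": 4329847,
                        "concept_name": "Myocardial infarction",
                        "similarity_score": 0.92,
                    }
                ],
            },
            {
                "search_id": "s2",
                "status": "failed",
                "results": [],
                "error": "Search query is required",
            },
        ]
    },
}

rows = list(flatten_results(sample))
print(rows)
```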

<RequestExample>
  ```bash cURL theme={null}
  curl -X POST "https://api.omophub.com/v1/search/semantic-bulk" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "defaults": {
        "vocabulary_ids": ["SNOMED"],
        "standard_concept": "S",
        "threshold": 0.5,
        "page_size": 5
      },
      "searches": [
        {
          "search_id": "s1",
          "query": "heart attack"
        },
        {
          "search_id": "s2",
          "query": "sugar diabetes",
          "threshold": 0.7
        },
        {
          "search_id": "s3",
          "query": "aspirin tablets",
          "vocabulary_ids": ["RxNorm"],
          "domain_ids": ["Drug"]
        }
      ]
    }'
  ```

  ```javascript JavaScript theme={null}
  const response = await fetch('https://api.omophub.com/v1/search/semantic-bulk', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      defaults: {
        vocabulary_ids: ['SNOMED'],
        standard_concept: 'S',
        threshold: 0.5,
        page_size: 5
      },
      searches: [
        {
          search_id: 's1',
          query: 'heart attack'
        },
        {
          search_id: 's2',
          query: 'sugar diabetes',
          threshold: 0.7
        },
        {
          search_id: 's3',
          query: 'aspirin tablets',
          vocabulary_ids: ['RxNorm'],
          domain_ids: ['Drug']
        }
      ]
    })
  });

  const data = await response.json();
  ```

  ```python Python theme={null}
  import requests

  headers = {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
  }

  payload = {
      'defaults': {
          'vocabulary_ids': ['SNOMED'],
          'standard_concept': 'S',
          'threshold': 0.5,
          'page_size': 5
      },
      'searches': [
          {
              'search_id': 's1',
              'query': 'heart attack'
          },
          {
              'search_id': 's2',
              'query': 'sugar diabetes',
              'threshold': 0.7
          },
          {
              'search_id': 's3',
              'query': 'aspirin tablets',
              'vocabulary_ids': ['RxNorm'],
              'domain_ids': ['Drug']
          }
      ]
  }

  response = requests.post(
      'https://api.omophub.com/v1/search/semantic-bulk',
      headers=headers,
      json=payload
  )

  data = response.json()
  ```
</RequestExample>

<ResponseExample>
  ```json Response theme={null}
  {
    "success": true,
    "data": {
      "results": [
        {
          "search_id": "s1",
          "query": "heart attack",
          "status": "completed",
          "results": [
            {
              "concept_id": 4329847,
              "concept_name": "Myocardial infarction",
              "concept_code": "22298006",
              "vocabulary_id": "SNOMED",
              "domain_id": "Condition",
              "concept_class_id": "Clinical Finding",
              "standard_concept": "S",
              "similarity_score": 0.92,
              "matched_text": "Myocardial infarction"
            },
            {
              "concept_id": 434376,
              "concept_name": "Acute myocardial infarction",
              "concept_code": "57054005",
              "vocabulary_id": "SNOMED",
              "domain_id": "Condition",
              "concept_class_id": "Clinical Finding",
              "standard_concept": "S",
              "similarity_score": 0.89,
              "matched_text": "Acute myocardial infarction"
            }
          ],
          "similarity_threshold": 0.5,
          "result_count": 2,
          "duration": 45,
          "query_enhancement": null
        },
        {
          "search_id": "s2",
          "query": "sugar diabetes",
          "status": "completed",
          "results": [
            {
              "concept_id": 201826,
              "concept_name": "Type 2 diabetes mellitus",
              "concept_code": "44054006",
              "vocabulary_id": "SNOMED",
              "domain_id": "Condition",
              "concept_class_id": "Clinical Finding",
              "standard_concept": "S",
              "similarity_score": 0.88,
              "matched_text": "Type 2 diabetes mellitus"
            },
            {
              "concept_id": 4000678,
              "concept_name": "Diabetes mellitus",
              "concept_code": "73211009",
              "vocabulary_id": "SNOMED",
              "domain_id": "Condition",
              "concept_class_id": "Clinical Finding",
              "standard_concept": "S",
              "similarity_score": 0.85,
              "matched_text": "Diabetes mellitus"
            }
          ],
          "similarity_threshold": 0.7,
          "result_count": 2,
          "duration": 38,
          "query_enhancement": null
        },
        {
          "search_id": "s3",
          "query": "aspirin tablets",
          "status": "completed",
          "results": [
            {
              "concept_id": 1112807,
              "concept_name": "Aspirin",
              "concept_code": "1191",
              "vocabulary_id": "RxNorm",
              "domain_id": "Drug",
              "concept_class_id": "Ingredient",
              "standard_concept": "S",
              "similarity_score": 0.94,
              "matched_text": "Aspirin"
            }
          ],
          "similarity_threshold": 0.5,
          "result_count": 1,
          "duration": 32,
          "query_enhancement": null
        }
      ],
      "total_searches": 3,
      "completed_count": 3,
      "failed_count": 0,
      "total_duration": 156
    },
    "meta": {
      "request_id": "req_sem_bulk_abc123",
      "vocab_release": "2025.2",
      "timestamp": "2025-01-15T10:30:00Z"
    }
  }
  ```
</ResponseExample>

## Key Differences from Bulk Search

| Feature                 | Bulk Semantic Search                           | Bulk Search                     |
| ----------------------- | ---------------------------------------------- | ------------------------------- |
| **Search method**       | Vector similarity (embeddings)                 | Full-text keyword matching      |
| **Score field**         | `similarity_score` (0.0-1.0)                   | `search_score` (relevance rank) |
| **Max queries**         | 25 per request                                 | 50 per request                  |
| **Max page\_size**      | 50                                             | 100                             |
| **Threshold parameter** | Yes (filters by similarity)                    | No                              |
| **Query enhancement**   | Yes (abbreviation expansion, typo correction)  | No                              |
| **Response shape**      | `data` (object with results array and summary) | `data` (direct array)           |

## Use Cases

### Clinical Notes Processing

Process extracted terms from clinical notes in batch:

```python theme={null}
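import requests

headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

# Terms extracted from clinical notes via NLP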
clinical_terms = [
    "chest pain radiating to left arm",
    "shortness of breath on exertion",
    "elevated troponin levels",
    "irregular heartbeat"
]

payload = {
    "defaults": {
        "vocabulary_ids": ["SNOMED"],
        "domain_ids": ["Condition", "Observation"],
        "standard_concept": "S",
        "threshold": 0.6
    },
    "searches": [
        {"search_id": f"note_{i}", "query": term}
        for i, term in enumerate(clinical_terms)
    ]
}

response = requests.post(
    "https://api.omophub.com/v1/search/semantic-bulk",
    headers=headers,
    json=payload
)
```

### High-Confidence Batch Matching

Use a high threshold for automated mapping pipelines where accuracy is critical:

```python theme={null}
payload = {
    "defaults": {
        "threshold": 0.8,
        "standard_concept": "S",
        "page_size": 3
    },
    "searches": [
        {"search_id": "dx1", "query": "heart attack"},
        {"search_id": "dx2", "query": "high blood pressure"},
        {"search_id": "dx3", "query": "sugar diabetes"}
    ]
}
```
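
With a strict threshold, some searches may come back with no matches at all, so an automated pipeline needs a fallback path. The sketch below shows one way to split results into auto-mapped concepts and terms queued for manual review. It assumes that a search with no match above the threshold returns an empty `results` array; `triage` is a hypothetical helper and the sample data is hand-written.

```python
def triage(results: list[dict]) -> tuple[dict, list[str]]:
    """Split searches into auto-mapped concept IDs and IDs needing manual review."""
    mapped, needs_review = {}, []
    for item in results:
        concepts = item.get("results") or []
        if item.get("status") == "completed" and concepts:
            # Pick the best match explicitly instead of relying on result order.
            best = max(concepts, key=lambda c: c["similarity_score"])
            mapped[item["search_id"]] = best["concept_id"]
        else:
            needs_review.append(item["search_id"])
    return mapped, needs_review


# Hand-written sample shaped like `data.results`; not real API output.
sample_results = [
    {
        "search_id": "dx1",
        "status": "completed",
        "results": [{"concept_id": 4329847, "similarity_score": 0.92}],
    },
    {"search_id": "dx2", "status": "completed", "results": []},
]

mapped, needs_review = triage(sample_results)
print(mapped, needs_review)
```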

### Multi-Domain Batch Search

Search across different domains in a single request using per-search overrides:

```python theme={null}
payload = {
    "defaults": {"standard_concept": "S", "threshold": 0.5},
    "searches": [
        {
            "search_id": "cond1",
            "query": "chest pain",
            "domain_ids": ["Condition"],
            "vocabulary_ids": ["SNOMED"]
        },
        {
            "search_id": "drug1",
            "query": "blood thinner medication",
            "domain_ids": ["Drug"],
            "vocabulary_ids": ["RxNorm"]
        },
        {
            "search_id": "lab1",
            "query": "blood sugar test",
            "domain_ids": ["Measurement"],
            "vocabulary_ids": ["LOINC"]
        }
    ]
}
```

## Error Handling

### Per-Query Failure Isolation

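Each query in the batch is processed independently, so a failed query does not affect the others. A failed entry carries `status: "failed"` and an `error` message, while the top-level `success` stays `true`: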

```json theme={null}
{
  "success": true,
  "data": {
    "results": [
      {
        "search_id": "s1",
        "query": "heart attack",
        "status": "completed",
        "results": [...],
        "similarity_threshold": 0.5,
        "result_count": 5,
        "duration": 42,
        "query_enhancement": null
      },
      {
        "search_id": "s2",
        "query": "",
        "status": "failed",
        "results": [],
        "error": "Search query is required",
        "similarity_threshold": 0.5,
        "result_count": 0,
        "duration": 1,
        "query_enhancement": null
      }
    ],
    "total_searches": 2,
    "completed_count": 1,
    "failed_count": 1,
    "total_duration": 48
  }
}
```
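
Because the top-level `success` flag stays `true` even when individual queries fail, client code should check each entry's `status`. A minimal sketch of that check, using a sample shaped like the response above with the elided matches left empty; `partition_results` is a hypothetical helper.

```python
def partition_results(body: dict) -> tuple[list[dict], list[dict]]:
    """Return (completed, failed) search entries from a bulk response body."""
    completed, failed = [], []
    for item in body["data"]["results"]:
        (completed if item["status"] == "completed" else failed).append(item)
    return completed, failed


# Sample input only; matches for s1 are left empty here.
body = {
    "success": True,
    "data": {
        "results": [
            {"search_id": "s1", "query": "heart attack", "status": "completed", "results": []},
            {
                "search_id": "s2",
                "query": "",
                "status": "failed",
                "results": [],
                "error": "Search query is required",
            },
        ],
        "total_searches": 2,
        "completed_count": 1,
        "failed_count": 1,
    },
}

completed, failed = partition_results(body)
for item in failed:
    # Log or re-queue the failed searches by their search_id.
    print(f"{item['search_id']} failed: {item['error']}")
```

The lengths of the two lists can also be compared with `completed_count` and `failed_count` as a sanity check.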

## Related Endpoints

* [Semantic Search](/api-reference/search/semantic-search) - Single query semantic search with pagination
* [Bulk Search](/api-reference/search/bulk-search) - Keyword-based bulk search (up to 50 queries)
* [Basic Search](/api-reference/search/basic-search) - Single query keyword search with pagination
* [Similar Concepts](/api-reference/search/search-similar) - Find concepts similar to a given concept
