# Search Similar Concepts

> Find medical concepts similar to a query using semantic, lexical, or hybrid matching, with filters passed in the request body.

## Overview

This endpoint returns medical concepts that are similar to a query, optionally narrowed by vocabulary, domain, and concept class. Depending on the `algorithm` parameter, candidates are scored by neural embeddings (`semantic`), word overlap (`lexical`), or combined word and character similarity (`hybrid`, the default). The `semantic` algorithm can surface clinically related concepts that share no keywords with the query.

## Request Body

<ParamField body="query" type="string" required>
  Primary search query or concept description to find similar concepts for
</ParamField>

<ParamField body="vocabulary_ids" type="array">
  Target vocabularies to search within (array of strings)

  <br />

  **Examples**: `["SNOMED", "ICD10CM"]`, `["RXNORM", "NDC"]`
</ParamField>

<ParamField body="domain_ids" type="array">
  Clinical domains to focus the similarity search (array of strings)

  <br />

  **Examples**: `["Condition", "Procedure"]`, `["Drug", "Device"]`
</ParamField>

<ParamField body="concept_class_ids" type="array">
  Concept classes to include in similarity search (array of strings)

  <br />

  **Examples**: `["Clinical Finding", "Procedure"]`, `["Ingredient", "Brand Name"]`
</ParamField>

<ParamField body="similarity_threshold" type="number" default="0.7" min="0" max="1.0">
  Minimum similarity score threshold (0 to 1.0)

  <br />

  **Higher values** = More strict similarity matching
</ParamField>

<ParamField body="page_size" type="integer" default="20" min="1" max="100">
  Number of similar concepts to return per page
</ParamField>

<ParamField body="page" type="integer" default="1" min="1">
  Page number (1-based indexing)
</ParamField>

<ParamField body="include_scores" type="boolean" default="true">
  Include similarity scores in the response
</ParamField>

<ParamField body="include_explanations" type="boolean" default="false">
  Include explanations for why concepts are considered similar
</ParamField>

<ParamField body="standard_concept" type="string">
  Filter results by standard concept flag

  <br />

  **Options**: `S` (standard), `C` (classification), `N` (non-standard)
</ParamField>

<ParamField body="include_invalid" type="boolean" default="true">
  Include invalid/deprecated concepts in similarity search
</ParamField>

<ParamField body="algorithm" type="string" default="hybrid">
  Similarity algorithm to use

  <br />

  **Options**:

  * `semantic` - Neural embedding-based similarity. Best for finding conceptually similar terms (e.g., "heart attack" → "Myocardial infarction").
  * `lexical` - Text-based Jaccard word similarity. Good for fuzzy text matching and typo tolerance.
  * `hybrid` (default) - Combines word and character similarity for balanced matching.
</ParamField>
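
The ranges and option lists above can be checked on the client before a request is sent. The following is a minimal sketch rather than part of any official SDK: `build_similar_payload` is a hypothetical helper, and it only enforces the constraints listed in this section for a subset of the parameters.

```python
from typing import Any, Optional, Sequence

ALGORITHMS = {"semantic", "lexical", "hybrid"}
STANDARD_FLAGS = {"S", "C", "N"}


def build_similar_payload(
    query: str,
    *,
    vocabulary_ids: Optional[Sequence[str]] = None,
    domain_ids: Optional[Sequence[str]] = None,
    similarity_threshold: float = 0.7,
    page_size: int = 20,
    page: int = 1,
    algorithm: str = "hybrid",
    standard_concept: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: validate arguments against the documented ranges."""
    if not query.strip():
        raise ValueError("query is required")
    if not 0 <= similarity_threshold <= 1.0:
        raise ValueError("similarity_threshold must be between 0 and 1.0")
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    if page < 1:
        raise ValueError("page must be >= 1")
    if algorithm not in ALGORITHMS:
        raise ValueError(f"algorithm must be one of {sorted(ALGORITHMS)}")
    if standard_concept is not None and standard_concept not in STANDARD_FLAGS:
        raise ValueError("standard_concept must be S, C, or N")

    payload: dict[str, Any] = {
        "query": query,
        "similarity_threshold": similarity_threshold,
        "page_size": page_size,
        "page": page,
        "algorithm": algorithm,
    }
    # Optional filters are only sent when the caller sets them.
    if vocabulary_ids:
        payload["vocabulary_ids"] = list(vocabulary_ids)
    if domain_ids:
        payload["domain_ids"] = list(domain_ids)
    if standard_concept is not None:
        payload["standard_concept"] = standard_concept
    return payload
```

Validating locally avoids a round trip for requests the server would reject anyway, such as a `similarity_threshold` of `1.5`.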

## Response

<ResponseField name="success" type="boolean">
  Indicates if the request was successful
</ResponseField>

<ResponseField name="data" type="object">
  Contains the similar concepts search results

  <Expandable title="data">
    <ResponseField name="similar_concepts" type="array">
      Array of concepts similar to the query

      <Expandable title="similar_concepts">
        <ResponseField name="concept_id" type="integer">
          OMOP concept ID
        </ResponseField>

        <ResponseField name="concept_name" type="string">
          Primary concept name
        </ResponseField>

        <ResponseField name="concept_code" type="string">
          Vocabulary-specific concept code
        </ResponseField>

        <ResponseField name="vocabulary_id" type="string">
          Source vocabulary identifier
        </ResponseField>

        <ResponseField name="domain_id" type="string">
          Clinical domain classification
        </ResponseField>

        <ResponseField name="concept_class_id" type="string">
          Concept class identifier
        </ResponseField>

        <ResponseField name="similarity_score" type="number">
          Similarity score between 0.0 and 1.0 (see `include_scores`)
        </ResponseField>

        <ResponseField name="explanation" type="string">
          Reason for similarity match (if requested)
        </ResponseField>

        <ResponseField name="standard_concept" type="string">
          Standard concept flag (S, C, or N)
        </ResponseField>

        <ResponseField name="synonyms" type="array">
          Alternative names for the concept
        </ResponseField>
      </Expandable>
    </ResponseField>

    <ResponseField name="search_metadata" type="object">
      Information about the similarity search

      <Expandable title="search_metadata">
        <ResponseField name="original_query" type="string">
          The original search query provided
        </ResponseField>

        <ResponseField name="algorithm_used" type="string">
          Similarity algorithm applied (`semantic`, `lexical`, or `hybrid`)
        </ResponseField>

        <ResponseField name="similarity_threshold" type="number">
          Minimum similarity threshold applied
        </ResponseField>

        <ResponseField name="total_candidates" type="integer">
          Approximate count of concepts evaluated for similarity. For the `semantic` algorithm, this is a lower bound based on sampled results.
        </ResponseField>

        <ResponseField name="results_returned" type="integer">
          Number of similar concepts returned
        </ResponseField>

        <ResponseField name="processing_time_ms" type="integer">
          Search processing time in milliseconds
        </ResponseField>

        <ResponseField name="embedding_latency_ms" type="integer">
          Time spent on embedding generation (only present when `algorithm_used` is `semantic`)
        </ResponseField>
      </Expandable>
    </ResponseField>
  </Expandable>
</ResponseField>

<RequestExample>
  ```bash cURL theme={null}
  curl -X POST "https://api.omophub.com/v1/search/similar" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "query": "type 2 diabetes mellitus",
      "vocabulary_ids": ["SNOMED", "ICD10CM"],
      "domain_ids": ["Condition"],
      "similarity_threshold": 0.8,
      "page_size": 10,
      "include_scores": true,
      "include_explanations": true,
      "standard_concept": "S"
    }'
  ```

  ```javascript JavaScript theme={null}
  const response = await fetch('https://api.omophub.com/v1/search/similar', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      query: "type 2 diabetes mellitus",
      vocabulary_ids: ["SNOMED", "ICD10CM"],
      domain_ids: ["Condition"],
      similarity_threshold: 0.8,
      page_size: 10,
      include_scores: true,
      include_explanations: true,
      standard_concept: "S"
    })
  });
  const data = await response.json();
  ```

  ```python Python theme={null}
  import requests

  url = "https://api.omophub.com/v1/search/similar"
  payload = {
      "query": "type 2 diabetes mellitus",
      "vocabulary_ids": ["SNOMED", "ICD10CM"],
      "domain_ids": ["Condition"],
      "similarity_threshold": 0.8,
      "page_size": 10,
      "include_scores": True,
      "include_explanations": True,
      "standard_concept": "S"
  }

  response = requests.post(
      url,
      json=payload,
      headers={"Authorization": "Bearer YOUR_API_KEY"},
      timeout=30,  # avoid waiting indefinitely on network problems
  )
  data = response.json()
  ```
</RequestExample>

<ResponseExample>
  ```json Response theme={null}
  {
    "success": true,
    "data": {
      "similar_concepts": [
        {
          "concept_id": 44054006,
          "concept_name": "Type 2 diabetes mellitus",
          "concept_code": "44054006",
          "vocabulary_id": "SNOMED",
          "domain_id": "Condition",
          "concept_class_id": "Clinical Finding",
          "similarity_score": 1.0,
          "explanation": "Exact match - same clinical concept",
          "standard_concept": "S",
          "synonyms": ["Non-insulin dependent diabetes mellitus", "Adult-onset diabetes"]
        },
        {
          "concept_id": 201826,
          "concept_name": "Type 2 diabetes mellitus without complications",
          "concept_code": "E11.9",
          "vocabulary_id": "ICD10CM",
          "domain_id": "Condition",
          "concept_class_id": "4-char nonbill code",
          "similarity_score": 0.92,
          "explanation": "High similarity - more specific variant of the same condition",
          "standard_concept": "S",
          "synonyms": ["Type II diabetes mellitus uncomplicated"]
        },
        {
          "concept_id": 443729,
          "concept_name": "Diabetes mellitus type 2 with hyperglycemia",
          "concept_code": "443729",
          "vocabulary_id": "SNOMED",
          "domain_id": "Condition",
          "concept_class_id": "Clinical Finding",
          "similarity_score": 0.89,
          "explanation": "High similarity - related complication of the same condition",
          "standard_concept": "S",
          "synonyms": ["Type 2 DM with elevated glucose"]
        }
      ],
      "search_metadata": {
        "original_query": "type 2 diabetes mellitus",
        "algorithm_used": "hybrid",
        "similarity_threshold": 0.8,
        "total_candidates": 125000,
        "results_returned": 3,
        "processing_time_ms": 245
      }
    },
    "meta": {
      "request_id": "req_sim_abc123",
      "timestamp": "2024-01-15T10:30:00Z",
      "vocab_release": "2025.1",
      "pagination": {
        "page": 1,
        "page_size": 10,
        "total_items": 3,
        "total_pages": 1,
        "has_next": false,
        "has_previous": false
      },
      "search": {
        "query": "type 2 diabetes mellitus",
        "total_results": 3,
        "filters_applied": {
          "similarity_type": "concept"
        }
      }
    }
  }
  ```

  ```json Error Response theme={null}
  {
    "success": false,
    "error": {
      "code": "INVALID_THRESHOLD",
      "message": "Similarity threshold must be between 0 and 1.0",
      "details": {
        "provided_threshold": 1.5,
        "valid_range": "0-1.0"
      }
    }
  }
  ```
</ResponseExample>
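
Both envelopes shown in the response examples, the success body and the error body, can be handled in one place. The sketch below assumes the body has already been decoded from JSON. `SimilarConcept`, `SimilarSearchError`, and `parse_similar_response` are hypothetical names, and treating `similarity_score` as possibly absent is an assumption about what happens when `include_scores` is `false`.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SimilarConcept:
    concept_id: int
    concept_name: str
    vocabulary_id: str
    similarity_score: Optional[float]  # assumed absent when scores are not requested


class SimilarSearchError(Exception):
    """Hypothetical exception that carries the API error code."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


def parse_similar_response(body: dict[str, Any]) -> list[SimilarConcept]:
    """Turn a decoded JSON body into a list of SimilarConcept records."""
    if not body.get("success"):
        error = body.get("error", {})
        raise SimilarSearchError(
            error.get("code", "UNKNOWN"), error.get("message", "")
        )
    return [
        SimilarConcept(
            concept_id=item["concept_id"],
            concept_name=item["concept_name"],
            vocabulary_id=item["vocabulary_id"],
            similarity_score=item.get("similarity_score"),
        )
        for item in body["data"]["similar_concepts"]
    ]
```

Raising on `success: false` keeps error handling out of the code that consumes the concept list.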

## Usage Examples

### Basic Similarity Search

Find concepts similar to a medical condition:

```json theme={null}
{
  "query": "hypertension",
  "page_size": 5,
  "similarity_threshold": 0.7
}
```

### Cross-Vocabulary Similarity

Find similar concepts across multiple vocabularies:

```json theme={null}
{
  "query": "myocardial infarction",
  "vocabulary_ids": ["SNOMED", "ICD10CM", "ICD9CM"],
  "domain_ids": ["Condition"],
  "similarity_threshold": 0.8,
  "include_explanations": true
}
```
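
When a search spans several vocabularies, it can help to view the results per vocabulary. This is a small illustrative sketch, assuming `similar_concepts` is the list taken from `data.similar_concepts` in a decoded response; `group_by_vocabulary` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Any


def group_by_vocabulary(
    similar_concepts: list[dict[str, Any]],
) -> dict[str, list[dict[str, Any]]]:
    """Group result items by vocabulary_id, highest score first in each group."""
    groups: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for concept in similar_concepts:
        groups[concept["vocabulary_id"]].append(concept)
    for items in groups.values():
        # Items without a score are treated as 0.0 and therefore sort last.
        items.sort(key=lambda c: c.get("similarity_score", 0.0), reverse=True)
    return dict(groups)
```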

### Pharmacological Similarity

Find similar drug concepts with detailed scoring:

```json theme={null}
{
  "query": "metformin hydrochloride",
  "vocabulary_ids": ["RXNORM"],
  "domain_ids": ["Drug"],
  "concept_class_ids": ["Ingredient", "Clinical Drug"],
  "similarity_threshold": 0.75,
  "include_scores": true,
  "algorithm": "semantic"
}
```

### Semantic Search for Clinical Terms

Use neural embeddings to find conceptually similar terms (even when words don't match):

```json theme={null}
{
  "query": "heart attack",
  "vocabulary_ids": ["SNOMED"],
  "domain_ids": ["Condition"],
  "similarity_threshold": 0.5,
  "algorithm": "semantic",
  "page_size": 10
}
```

This can match "Myocardial infarction" and related concepts even though neither "heart" nor "attack" appears in the clinical term.
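
To see why a word-overlap measure misses this pair, consider the textbook Jaccard index over word sets. The sketch below is illustrative only: the tokenization and scoring the service actually applies for `lexical` are not described here, and `jaccard_word_similarity` is a hypothetical function.

```python
import re


def jaccard_word_similarity(a: str, b: str) -> float:
    """Jaccard index over lowercase word sets: |A & B| / |A | B|."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


# No shared words, so the score is zero despite the clinical equivalence.
print(jaccard_word_similarity("heart attack", "Myocardial infarction"))  # 0.0
# Three of four distinct words are shared.
print(jaccard_word_similarity("type 2 diabetes", "Type 2 diabetes mellitus"))  # 0.75
```

A score of zero falls below any positive `similarity_threshold`, which is why an embedding-based comparison is needed for lay terms such as "heart attack".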

## Algorithm Comparison

| Feature               | `semantic`            | `lexical`               | `hybrid` (default)          |
| --------------------- | --------------------- | ----------------------- | --------------------------- |
| Model                 | Neural embeddings     | Jaccard word similarity | Word + character similarity |
| Best for              | Conceptual similarity | Fuzzy text matching     | Balanced matching           |
| "heart attack" → "MI" | Excellent             | Poor (word mismatch)    | Poor                        |
| Typo tolerance        | Moderate              | Good                    | Good                        |
| Total counts          | Approximate           | Exact                   | Exact                       |
| Speed                 | Fast (15-50ms)        | Fast (50-200ms)         | Fast (50-200ms)             |
| Requirements          | Embedding service     | None                    | None                        |

<Note>
  **Semantic Algorithm Pagination:** When using `algorithm: "semantic"`, `total_candidates` and the pagination counts (`total_items`, `total_pages`) are approximate, which keeps the search fast. Use `meta.pagination.has_next` in the response to determine reliably whether more results exist.
</Note>
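
A pagination loop driven by `has_next` rather than by the total counts might look like the sketch below. It assumes the response layout from the example on this page (`meta.pagination.has_next`). `iter_similar_concepts` and the `post` callable are hypothetical, and `max_pages` is a safety cap added for illustration.

```python
from typing import Any, Callable, Iterator

# A callable that takes a request payload and returns the decoded JSON body.
PostFn = Callable[[dict[str, Any]], dict[str, Any]]


def iter_similar_concepts(
    post: PostFn, payload: dict[str, Any], max_pages: int = 50
) -> Iterator[dict[str, Any]]:
    """Yield concepts page by page until has_next is false."""
    page = 1
    while page <= max_pages:
        # Copy the payload so the caller's dict is not mutated.
        body = post({**payload, "page": page})
        yield from body["data"]["similar_concepts"]
        if not body["meta"]["pagination"]["has_next"]:
            break
        page += 1
```

In practice `post` could wrap an HTTP call such as `requests.post(url, json=payload, headers=headers).json()`. Passing it in as a parameter keeps the loop testable without network access.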

## Related Endpoints

* [GET Search Similar by ID](/api-reference/search/get-search-similar-by-id) - Get similar concepts for a specific concept ID
* [Semantic Search](/api-reference/search/semantic-search) - Dedicated semantic search endpoint using neural embeddings
* [Advanced Search](/api-reference/search/advanced-search) - Advanced search with multiple criteria
