This feature is coming soon and is not yet available in the current API version.

Overview

This endpoint will provide fuzzy search that finds relevant medical concepts even when the search query contains typos, spelling variations, or only approximate matches. It will combine several string matching techniques: edit distance, phonetic matching, and character-based similarity scoring.

Query Parameters

query
string
required
The search term or phrase (supports typos and variations)
vocabulary_ids
string
Target vocabularies for search (comma-separated)
Examples: "SNOMED", "ICD10CM,LOINC", "RXNORM,NDC"
domains
string
Filter results to specific domains (comma-separated)
Examples: "Condition,Procedure", "Drug,Device"
concept_classes
string
Filter to specific concept classes (comma-separated)
fuzzy_threshold
number
default:"0.7"
Minimum fuzzy match score (0.0-1.0, higher = more strict)
edit_distance
integer
default:"2"
Maximum allowed edit distance (1-5)
algorithm
string
default:"hybrid"
Fuzzy matching algorithm
Options: levenshtein, jaro_winkler, soundex, metaphone, hybrid
include_phonetic
boolean
default:"true"
Include phonetic similarity matching
include_abbreviations
boolean
default:"true"
Include abbreviation and acronym matching
case_sensitive
boolean
default:"false"
Whether matching should be case sensitive
word_order_sensitive
boolean
default:"false"
Whether word order affects matching score
min_word_length
integer
default:"3"
Minimum word length for fuzzy matching
boost_exact_matches
boolean
default:"true"
Give higher scores to exact substring matches
standard_concept
string
Filter by standard concept status: S (standard), N (non-standard), C (classification)
include_invalid
boolean
default:"false"
Include invalid/deprecated concepts
include_synonyms
boolean
default:"true"
Search within concept synonyms
include_descriptions
boolean
default:"false"
Search within concept descriptions
language
string
default:"en"
Language for linguistic processing (ISO 639-1 code)
sort_by
string
default:"relevance"
Sort order for results
Options: relevance, alphabetical, concept_id, vocabulary
page
integer
default:"1"
Page number for pagination
page_size
integer
default:"20"
Number of results per page (max 100)
vocab_release
string
Specific vocabulary release version (defaults to latest)
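
The parameters above map directly onto a query string. The following is a minimal Python sketch of how a client might assemble the request URL, assuming the endpoint path shown in the examples on this page. The helper name build_fuzzy_search_url is hypothetical, and the sketch only builds the URL; it does not send a request, since the endpoint is not yet available.

```python
from urllib.parse import urlencode

# Endpoint path as it appears in the curl examples on this page.
BASE_URL = "https://api.omophub.com/v1/search/fuzzy"


def build_fuzzy_search_url(query, **params):
    """Return the GET URL for a fuzzy search (hypothetical helper)."""
    if not query:
        raise ValueError("query is required")
    fields = {"query": query}
    for name, value in params.items():
        if value is None:
            continue  # omit unset parameters so server defaults apply
        if isinstance(value, bool):
            value = "true" if value else "false"  # lowercase booleans
        elif isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)  # comma-separated lists
        fields[name] = value
    # Keep commas unescaped so list parameters stay readable.
    return f"{BASE_URL}?{urlencode(fields, safe=',')}"


url = build_fuzzy_search_url(
    "diabetis",
    vocabulary_ids=["SNOMED", "ICD10CM"],
    fuzzy_threshold=0.8,
    edit_distance=2,
    include_phonetic=True,
)
print(url)
```

Parameters left as None are dropped from the URL, so the documented defaults would apply on the server side.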

Response

success
boolean
Indicates if the request was successful
data
object
meta
object
curl -X GET "https://api.omophub.com/v1/search/fuzzy?query=diabetis&vocabulary_ids=SNOMED,ICD10CM&fuzzy_threshold=0.8&edit_distance=2&include_phonetic=true" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json"
{
  "success": true,
  "data": {
    "query": "diabetis",
    "fuzzy_parameters": {
      "algorithm": "hybrid",
      "fuzzy_threshold": 0.8,
      "edit_distance": 2,
      "include_phonetic": true,
      "case_sensitive": false
    },
    "search_statistics": {
      "total_candidates": 15847,
      "fuzzy_matches": 23,
      "phonetic_matches": 8,
      "exact_substring_matches": 2,
      "processing_time_ms": 1247
    },
    "concepts": [
      {
        "concept_id": 201826,
        "concept_name": "Type 2 diabetes mellitus",
        "concept_code": "44054006",
        "vocabulary_id": "SNOMED",
        "vocabulary_name": "Systematized Nomenclature of Medicine Clinical Terms",
        "domain_id": "Condition",
        "concept_class_id": "Clinical Finding",
        "standard_concept": "S",
        "fuzzy_score": 0.89,
        "match_details": {
          "matched_term": "diabetes mellitus",
          "match_type": "fuzzy",
          "edit_distance": 1,
          "similarity_score": 0.89,
          "phonetic_match": false,
          "match_position": "start",
          "character_differences": [
            {
              "position": 7,
              "expected": "e",
              "found": "i"
            }
          ]
        },
        "alternative_spellings": [
          "diabetes",
          "diabetis",
          "diabetees"
        ],
        "confidence_level": "High"
      },
      {
        "concept_id": 201820,
        "concept_name": "Diabetes mellitus",
        "concept_code": "73211009",
        "vocabulary_id": "SNOMED",
        "vocabulary_name": "Systematized Nomenclature of Medicine Clinical Terms",
        "domain_id": "Condition",
        "concept_class_id": "Clinical Finding",
        "standard_concept": "S",
        "fuzzy_score": 0.92,
        "match_details": {
          "matched_term": "diabetes",
          "match_type": "fuzzy",
          "edit_distance": 1,
          "similarity_score": 0.92,
          "phonetic_match": false,
          "match_position": "full",
          "character_differences": [
            {
              "position": 7,
              "expected": "e",
              "found": "i"
            }
          ]
        },
        "alternative_spellings": [
          "diabetes",
          "diabetis",
          "diabetic"
        ],
        "confidence_level": "High"
      },
      {
        "concept_id": 435216,
        "concept_name": "Type 2 diabetes mellitus",
        "concept_code": "E11",
        "vocabulary_id": "ICD10CM",
        "vocabulary_name": "International Classification of Diseases, Tenth Revision, Clinical Modification",
        "domain_id": "Condition",
        "concept_class_id": "3-char billing code",
        "standard_concept": "S",
        "fuzzy_score": 0.87,
        "match_details": {
          "matched_term": "diabetes mellitus",
          "match_type": "fuzzy",
          "edit_distance": 1,
          "similarity_score": 0.87,
          "phonetic_match": false,
          "match_position": "start",
          "character_differences": [
            {
              "position": 7,
              "expected": "e",
              "found": "i"
            }
          ]
        },
        "alternative_spellings": [
          "diabetes",
          "diabetis"
        ],
        "confidence_level": "High"
      },
      {
        "concept_id": 4048098,
        "concept_name": "Diabetic diet",
        "concept_code": "160670007",
        "vocabulary_id": "SNOMED",
        "vocabulary_name": "Systematized Nomenclature of Medicine Clinical Terms",
        "domain_id": "Observation",
        "concept_class_id": "Regime/therapy",
        "standard_concept": "S",
        "fuzzy_score": 0.83,
        "match_details": {
          "matched_term": "diabetic",
          "match_type": "fuzzy",
          "edit_distance": 1,
          "similarity_score": 0.83,
          "phonetic_match": true,
          "match_position": "start",
          "character_differences": [
            {
              "position": 8,
              "expected": "c",
              "found": "s"
            }
          ]
        },
        "alternative_spellings": [
          "diabetic",
          "diabetis",
          "diabetig"
        ],
        "confidence_level": "Medium"
      }
    ],
    "suggestions": {
      "did_you_mean": [
        "diabetes",
        "diabetic",
        "diabetes mellitus"
      ],
      "alternative_queries": [
        "diabetes mellitus",
        "diabetic condition",
        "blood sugar disorder",
        "hyperglycemia"
      ],
      "common_misspellings": [
        {
          "misspelling": "diabetis",
          "correction": "diabetes",
          "frequency": "common"
        },
        {
          "misspelling": "diabetees",
          "correction": "diabetes",
          "frequency": "rare"
        },
        {
          "misspelling": "diabetess",
          "correction": "diabetes",
          "frequency": "uncommon"
        }
      ]
    }
  },
  "meta": {
    "request_id": "req_fuzzy_search_123",
    "timestamp": "2024-01-15T10:30:00Z",
    "algorithm_version": "fuzzy_v2.1.0",
    "pagination": {
      "page": 1,
      "page_size": 20,
      "total_items": 23,
      "total_pages": 2,
      "has_next": true,
      "has_previous": false
    },
    "vocab_release": "2025.2"
  }
}
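
A client would typically check the success flag and then rank or filter the returned concepts. The sketch below shows one way to do that in Python, using a trimmed copy of the example response above. The function name rank_concepts and the min_score argument are hypothetical, and the field names are assumed to match the example payload.

```python
import json

# Trimmed copy of the example response; only the fields used below are kept.
SAMPLE_RESPONSE = json.loads("""
{
  "success": true,
  "data": {
    "concepts": [
      {"concept_id": 201826, "concept_name": "Type 2 diabetes mellitus",
       "vocabulary_id": "SNOMED", "fuzzy_score": 0.89},
      {"concept_id": 201820, "concept_name": "Diabetes mellitus",
       "vocabulary_id": "SNOMED", "fuzzy_score": 0.92},
      {"concept_id": 435216, "concept_name": "Type 2 diabetes mellitus",
       "vocabulary_id": "ICD10CM", "fuzzy_score": 0.87},
      {"concept_id": 4048098, "concept_name": "Diabetic diet",
       "vocabulary_id": "SNOMED", "fuzzy_score": 0.83}
    ]
  }
}
""")


def rank_concepts(response, min_score=0.0):
    """Return concepts scoring at least min_score, best match first."""
    if not response.get("success"):
        raise RuntimeError("fuzzy search request was not successful")
    concepts = response["data"]["concepts"]
    kept = [c for c in concepts if c["fuzzy_score"] >= min_score]
    return sorted(kept, key=lambda c: c["fuzzy_score"], reverse=True)


for concept in rank_concepts(SAMPLE_RESPONSE, min_score=0.85):
    print(concept["fuzzy_score"], concept["vocabulary_id"], concept["concept_name"])
```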

Usage Examples

Search with a misspelled term:
curl -X GET "https://api.omophub.com/v1/search/fuzzy?query=pnemonia&domains=Condition" \
  -H "Authorization: Bearer YOUR_API_KEY"

Strict Fuzzy Matching

Use a higher threshold and a lower edit distance for more precise matches:
curl -X GET "https://api.omophub.com/v1/search/fuzzy?query=appendisitis&fuzzy_threshold=0.85&edit_distance=1" \
  -H "Authorization: Bearer YOUR_API_KEY"

Include phonetic matching for pronunciation variations:
curl -X GET "https://api.omophub.com/v1/search/fuzzy?query=serkosis&algorithm=hybrid&include_phonetic=true&vocabulary_ids=SNOMED" \
  -H "Authorization: Bearer YOUR_API_KEY"

Search across multiple vocabularies:
curl -X GET "https://api.omophub.com/v1/search/fuzzy?query=diabetis&vocabulary_ids=SNOMED,ICD10CM,LOINC&fuzzy_threshold=0.7" \
  -H "Authorization: Bearer YOUR_API_KEY"

Enable case sensitivity for specific use cases:
curl -X GET "https://api.omophub.com/v1/search/fuzzy?query=DNA&case_sensitive=true&include_abbreviations=true" \
  -H "Authorization: Bearer YOUR_API_KEY"

Algorithm Comparison

Try different algorithms for optimal results:
# Levenshtein distance
curl -X GET "https://api.omophub.com/v1/search/fuzzy?query=miokardial&algorithm=levenshtein&edit_distance=3" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Jaro-Winkler similarity
curl -X GET "https://api.omophub.com/v1/search/fuzzy?query=miokardial&algorithm=jaro_winkler&fuzzy_threshold=0.8" \
  -H "Authorization: Bearer YOUR_API_KEY"

Fuzzy Matching Algorithms

Levenshtein Distance

  • Description: Counts minimum single-character edits (insertions, deletions, substitutions)
  • Best For: Simple typos, OCR errors
  • Performance: Fast
  • Example: “diabetis” → “diabetes” (1 substitution)
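
To make the definition concrete, here is a minimal Python sketch of the classic dynamic-programming Levenshtein distance. It illustrates the technique only and is not the endpoint's implementation. For “diabetis” versus “diabetes” it yields 1, a single substitution of “i” for “e”.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


print(levenshtein("diabetis", "diabetes"))
print(levenshtein("miokardial", "myocardial"))
```

Only two rows of the table are kept at a time, so memory use grows with the shorter string rather than with the product of both lengths.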

Jaro-Winkler Similarity

  • Description: Considers character matches and transpositions, with prefix boost
  • Best For: Names, transposed characters
  • Performance: Moderate
  • Example: “pnemonia” → “pneumonia” (1 insertion; the shared prefix “pne” raises the score)
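
The following Python sketch shows the standard Jaro-Winkler computation, assuming the commonly used prefix scale of 0.1 and a maximum prefix length of 4. Whether the endpoint will use these constants is not stated on this page, so treat them as assumptions.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and close enough in position.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


print(round(jaro_winkler("pnemonia", "pneumonia"), 3))
```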

Soundex

  • Description: Phonetic algorithm based on English pronunciation
  • Best For: Pronunciation variations, names
  • Performance: Very fast
  • Example: “cirosis” → “cirrhosis” (similar sound)
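
As an illustration, this Python sketch implements American Soundex, the common four-character variant. It is an assumption that the endpoint would use this variant. Both “cirosis” and “cirrhosis” encode to the same key, which is why they are treated as phonetic matches.

```python
# Consonant groups that share a Soundex digit.
_GROUPS = (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
           ("l", "4"), ("mn", "5"), ("r", "6"))
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex(word: str) -> str:
    """Return the 4-character American Soundex key, or "" for no letters."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    previous = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _CODES.get(ch, "")  # vowels and y have no code
        if code and code != previous:
            key += code
        previous = code  # a vowel resets this, so a repeated code counts again
    return (key + "000")[:4]  # pad or truncate to four characters


print(soundex("cirosis"), soundex("cirrhosis"))
```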

Metaphone

  • Description: Phonetic algorithm that applies more English pronunciation rules than Soundex, which generally makes it more accurate
  • Best For: Complex phonetic variations
  • Performance: Moderate
  • Example: “nefritis” → “nephritis” (phonetic similarity)

Hybrid Algorithm

  • Description: Combines multiple algorithms with weighted scoring
  • Best For: General-purpose fuzzy search
  • Performance: Moderate to slow (most comprehensive)
  • Example: Uses all methods and selects best matches
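
Weighted scoring can be sketched as a weighted average of per-algorithm scores. The example below is only an illustration: the function name hybrid_score, the component names, and the weights are all hypothetical, since this page does not specify how the hybrid algorithm weighs its inputs. The input scores are what standard Levenshtein (normalized as 1 - distance / length), Jaro-Winkler, and Soundex equality give for “diabetis” versus “diabetes”.

```python
def hybrid_score(scores: dict, weights: dict) -> float:
    """Weighted average of component scores, each expected in [0, 1]."""
    if not scores:
        raise ValueError("at least one component score is required")
    missing = set(scores) - set(weights)
    if missing:
        raise ValueError(f"no weight configured for: {sorted(missing)}")
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(scores[name] * weights[name] for name in scores)
    return weighted / total_weight


# Hypothetical weights; the real algorithm's weighting is not documented here.
WEIGHTS = {"levenshtein": 0.4, "jaro_winkler": 0.4, "phonetic": 0.2}

component_scores = {
    "levenshtein": 0.875,  # 1 - 1/8: one edit over eight characters
    "jaro_winkler": 0.95,
    "phonetic": 1.0,       # both words share the Soundex key D132
}
print(round(hybrid_score(component_scores, WEIGHTS), 2))
```

Dividing by the sum of the weights actually used keeps the result in the 0 to 1 range even when a component, such as the phonetic score, is switched off.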

Match Types and Scoring

Edit Distance Scoring

  • Distance 1: Single character difference (high confidence)
  • Distance 2: Two character differences (medium confidence)
  • Distance 3+: Multiple differences (lower confidence)

Similarity Score Interpretation

  • 0.9-1.0: Very high similarity (likely correct match)
  • 0.8-0.89: High similarity (probable match)
  • 0.7-0.79: Moderate similarity (possible match)
  • 0.6-0.69: Low similarity (uncertain match)
  • Below 0.6: Very low similarity (unlikely match)
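
The bands above can be applied on the client side when deciding which matches to show or flag for review. A minimal Python sketch follows; the function name interpret_similarity and the returned labels are hypothetical and simply mirror the list above.

```python
def interpret_similarity(score: float) -> str:
    """Map a similarity score in [0, 1] to the bands listed above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= 0.9:
        return "very high"
    if score >= 0.8:
        return "high"
    if score >= 0.7:
        return "moderate"
    if score >= 0.6:
        return "low"
    return "very low"


for score in (0.92, 0.83, 0.70, 0.65, 0.30):
    print(score, interpret_similarity(score))
```

Using lower bounds rather than closed ranges means a score such as 0.895 still falls into a band instead of landing between 0.89 and 0.9.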

Match Position Impact

  • Full Match: Entire query matches concept name
  • Start Match: Query matches beginning of concept name
  • End Match: Query matches end of concept name
  • Middle Match: Query matches substring within concept name
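
The four positions can be illustrated with plain substring checks. The Python sketch below assumes the query has already been corrected to the matched term and compares case-insensitively, so it leaves out the fuzzy step itself. The function name match_position is hypothetical, and the return values follow the match_position values in the example response.

```python
from typing import Optional


def match_position(term: str, concept_name: str) -> Optional[str]:
    """Classify where term occurs in concept_name, or None if it is absent."""
    needle = term.casefold()
    haystack = concept_name.casefold()
    if not needle:
        return None
    if needle == haystack:
        return "full"
    if haystack.startswith(needle):
        return "start"
    if haystack.endswith(needle):
        return "end"
    if needle in haystack:
        return "middle"
    return None


print(match_position("diabetes", "Diabetes mellitus"))
print(match_position("diabetes", "Type 2 diabetes mellitus"))
```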

Common Medical Misspellings

Cardiovascular Terms

  • “miokardial” → “myocardial”
  • “arteriosklerosis” → “arteriosclerosis”
  • “hipertension” → “hypertension”

Respiratory Terms

  • “pnemonia” → “pneumonia”
  • “asma” → “asthma”
  • “bronkitis” → “bronchitis”

Endocrine Terms

  • “diabetis” → “diabetes”
  • “hiperglycemia” → “hyperglycemia”
  • “tiroid” → “thyroid”

Neurological Terms

  • “serebral” → “cerebral”
  • “epilepsey” → “epilepsy”
  • “demenshia” → “dementia”

Gastrointestinal Terms

  • “appendisitis” → “appendicitis”
  • “gastroenterits” → “gastroenteritis”
  • “kolitis” → “colitis”

Performance Considerations

Query Optimization

  • Short Queries: Limit edit_distance to 1-2
  • Long Queries: Raise fuzzy_threshold to 0.8 or higher
  • Multiple Words: Keep word_order_sensitive=false (the default) so word order does not affect the score

Algorithm Selection

  • Speed Priority: Use Levenshtein or Soundex
  • Accuracy Priority: Use Hybrid algorithm
  • Phonetic Focus: Use Metaphone or Soundex
  • Balanced Approach: Use Jaro-Winkler

Resource Management

  • Large Vocabularies: Increase fuzzy threshold
  • Real-time Applications: Limit edit distance
  • Batch Processing: Use lower thresholds for recall