This guide walks through the complete path from FHIR-coded clinical data to populated OMOP CDM tables using OMOPHub. It covers vocabulary resolution, domain assignment, standard concept mapping, and CDM table placement - the four steps that every FHIR-to-OMOP transformation pipeline has to solve.
If you’re looking for a resource-by-resource cookbook (Condition → condition_occurrence, Observation → measurement, MedicationStatement → drug_exposure, and so on), see the FHIR Integration guide. This page focuses on the vocabulary standardization layer that sits at the heart of any FHIR-to-OMOP pipeline.

1. Why Vocabulary Resolution Is the Hard Part

Converting FHIR resources to OMOP CDM tables is not primarily a schema transformation problem. The structural mapping - which FHIR fields go into which OMOP columns - is well-documented in the HL7 FHIR-to-OMOP IG. The hard part is vocabulary standardization: taking the coded clinical concepts in your FHIR data and resolving them to the correct OMOP standard concepts. This is hard because:
  • FHIR CodeableConcept fields can contain codes from any vocabulary system - SNOMED CT, ICD-10-CM, LOINC, RxNorm, local hospital codes, or several simultaneously
  • OMOP requires a specific *_concept_id column in each clinical table to point at a standard concept, and the vocabulary domain determines which CDM table the record belongs to
  • The same clinical idea can appear as different codes in different systems, and a single ICD-10 code can map to multiple SNOMED concepts
  • Some FHIR codes are already standard OMOP concepts (most SNOMED codes), while others need mapping via Maps to relationships (ICD-10, NDC, local codes)
OMOPHub handles this resolution layer so your pipeline doesn’t have to maintain a local vocabulary database. See Why OMOPHub vs Self-Hosting for the comparison against hosting ATHENA yourself.

2. The Four-Step Flow

Every FHIR-to-OMOP vocabulary resolution follows the same pattern.
Step 1: Extract the coded concept from the FHIR resource

A FHIR resource contains one or more CodeableConcept or Coding elements. For example, a FHIR Condition resource might contain:
{
  "resourceType": "Condition",
  "code": {
    "coding": [
      {
        "system": "http://snomed.info/sct",
        "code": "44054006",
        "display": "Type 2 diabetes mellitus"
      },
      {
        "system": "http://hl7.org/fhir/sid/icd-10-cm",
        "code": "E11.9",
        "display": "Type 2 diabetes mellitus without complications"
      }
    ]
  }
}
This Condition has two codings for the same clinical idea: one in SNOMED CT, one in ICD-10-CM.
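A minimal sketch of this extraction step in Python, assuming the resource has already been parsed into a dict. The helper name `extract_codings` is hypothetical and not part of the OMOPHub SDK. It skips codings that lack a system or a code, since both elements are optional in FHIR.

```python
def extract_codings(resource: dict, element: str = "code") -> list[dict]:
    """Return the {system, code} pairs found in a CodeableConcept element."""
    concept = resource.get(element) or {}
    codings = []
    for coding in concept.get("coding", []):
        system, code = coding.get("system"), coding.get("code")
        if system and code:  # both are optional in FHIR; skip incomplete codings
            codings.append({"system": system, "code": code})
    return codings


condition = {
    "resourceType": "Condition",
    "code": {
        "coding": [
            {"system": "http://snomed.info/sct", "code": "44054006"},
            {"system": "http://hl7.org/fhir/sid/icd-10-cm", "code": "E11.9"},
            {"display": "free-text only"},  # no system/code, so it is skipped
        ]
    },
}
codings = extract_codings(condition)
```

The `element` parameter is there because other resource types keep their CodeableConcept under a different key (for example `medicationCodeableConcept`).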
Step 2: Resolve to the OMOP standard concept

Send the codings to OMOPHub’s Concept Resolver. It handles vocabulary identification, concept lookup, Maps to traversal, and OHDSI vocabulary preference ranking in a single call:
curl -X POST https://api.omophub.com/v1/fhir/resolve/codeable-concept \
  -H "Authorization: Bearer oh_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "coding": [
      { "system": "http://snomed.info/sct", "code": "44054006" },
      { "system": "http://hl7.org/fhir/sid/icd-10-cm", "code": "E11.9" }
    ],
    "resource_type": "Condition"
  }'
The Resolver returns a best_match based on OHDSI vocabulary preference (SNOMED > RxNorm > LOINC > CVX > ICD-10 for conditions), plus alternatives and unresolved arrays. The important payload is nested under best_match.resolution:
{
  "data": {
    "best_match": {
      "resolution": {
        "source_concept": {
          "concept_id": 45576876,
          "concept_code": "44054006",
          "concept_name": "Type 2 diabetes mellitus",
          "vocabulary_id": "SNOMED",
          "standard_concept": "S"
        },
        "standard_concept": {
          "concept_id": 201826,
          "concept_name": "Type 2 diabetes mellitus",
          "vocabulary_id": "SNOMED",
          "domain_id": "Condition",
          "concept_class_id": "Clinical Finding",
          "standard_concept": "S"
        },
        "mapping_type": "direct",
        "target_table": "condition_occurrence",
        "domain_resource_alignment": "aligned"
      }
    },
    "alternatives": [
      {
        "resolution": {
          "source_concept": {
            "concept_id": 45576876,
            "vocabulary_id": "ICD10CM",
            "concept_code": "E11.9"
          },
          "standard_concept": {
            "concept_id": 201826,
            "concept_name": "Type 2 diabetes mellitus",
            "domain_id": "Condition"
          },
          "mapping_type": "mapped"
        }
      }
    ],
    "unresolved": []
  }
}
Key things to notice:
  • Both the SNOMED code (already standard) and the ICD-10 code (mapped via Maps to) resolve to the same standard concept: 201826
  • The SNOMED coding wins as best_match because SNOMED is the preferred vocabulary for the Condition domain in OHDSI conventions
  • mapping_type tells you what happened: direct (source was already standard), mapped (followed Maps to), semantic_match (fell back to text-based search), or unmapped
  • target_table tells you which CDM table the record belongs to - computed from the standard concept’s domain, not from the FHIR resource type
  • domain_resource_alignment is aligned when the FHIR resource_type you declared matches the concept’s OMOP domain - useful as a sanity signal for mis-coded data
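One way an ETL step might act on these fields is sketched below. It uses only the field names shown in the example response. The function name `classify_resolution` and the returned labels are illustrative choices, and two assumptions are baked in: a missing domain_resource_alignment is treated as aligned, and any other value is treated as a reason to flag the record.

```python
def classify_resolution(resolution: dict) -> str:
    """Decide how a pipeline might treat one resolution object."""
    mapping_type = resolution.get("mapping_type")
    if mapping_type in ("direct", "mapped"):
        # Assumption: anything other than "aligned" deserves a flag.
        alignment = resolution.get("domain_resource_alignment", "aligned")
        return "load" if alignment == "aligned" else "load_and_flag"
    if mapping_type == "semantic_match":
        return "review"  # text-based fallback, worth a human look
    return "unmapped"


best_match = {
    "mapping_type": "direct",
    "target_table": "condition_occurrence",
    "domain_resource_alignment": "aligned",
}
action = classify_resolution(best_match)
```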
Step 3: Read the domain assignment

The domain_id in the standard concept determines which OMOP CDM table the record belongs to. This is critical, and it’s vocabulary-driven, not FHIR-resource-driven.
domain_id      Target OMOP CDM table
-----------    ---------------------
Condition      condition_occurrence
Drug           drug_exposure
Measurement    measurement
Observation    observation
Procedure      procedure_occurrence
Device         device_exposure
Specimen       specimen
Visit          visit_occurrence
The FHIR resource type does NOT always determine the OMOP CDM table. A FHIR Observation resource carrying a blood glucose measurement (LOINC) maps to the measurement table, not observation. A FHIR Condition resource carrying a lab-derived finding might map to measurement. Always use the vocabulary domain (from standard_concept.domain_id, or read resolution.target_table directly) for table assignment.
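The table assignment can be sketched as a small lookup. The mapping below simply restates the domain table from this step. The helper name `target_table_for` is hypothetical, and it prefers the server-computed target_table when one is present.

```python
DOMAIN_TO_TABLE = {
    "Condition": "condition_occurrence",
    "Drug": "drug_exposure",
    "Measurement": "measurement",
    "Observation": "observation",
    "Procedure": "procedure_occurrence",
    "Device": "device_exposure",
    "Specimen": "specimen",
    "Visit": "visit_occurrence",
}


def target_table_for(resolution: dict):
    """Prefer resolution.target_table; fall back to the domain lookup."""
    table = resolution.get("target_table")
    if table:
        return table
    domain = (resolution.get("standard_concept") or {}).get("domain_id")
    return DOMAIN_TO_TABLE.get(domain)  # None when the domain is not listed


# A FHIR Observation whose concept sits in the Measurement domain
table = target_table_for({"standard_concept": {"domain_id": "Measurement"}})
```

Note that the FHIR resource type never enters the function: the decision is made from the concept's domain alone.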
Step 4: Populate the CDM table row

With the standard concept resolved and the target table identified, populate the CDM row:
INSERT INTO condition_occurrence (
  person_id,
  condition_concept_id,        -- 201826 (from standard_concept.concept_id)
  condition_start_date,         -- from FHIR Condition.onsetDateTime
  condition_type_concept_id,    -- 32817 (EHR) or per your convention
  condition_source_value,       -- "44054006" (original source code)
  condition_source_concept_id   -- 45576876 (from source_concept.concept_id)
) VALUES (
  :person_id,
  201826,
  :onset_date,
  32817,
  '44054006',
  45576876
);
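The same insert can be driven from the resolution object with bound parameters. The sketch below uses an in-memory SQLite database and a cut-down condition_occurrence table purely for illustration; the helper name `condition_row`, the person_id, and the onset date are hypothetical, and a real CDM table has more columns.

```python
import sqlite3


def condition_row(person_id, onset_date, resolution, type_concept_id=32817):
    """Build condition_occurrence column values from one resolution."""
    source = resolution.get("source_concept") or {}
    standard = resolution.get("standard_concept") or {}
    return {
        "person_id": person_id,
        "condition_concept_id": standard.get("concept_id", 0),
        "condition_start_date": onset_date,
        "condition_type_concept_id": type_concept_id,
        "condition_source_value": source.get("concept_code"),
        "condition_source_concept_id": source.get("concept_id", 0),
    }


# Values copied from the example response in Step 2
resolution = {
    "source_concept": {"concept_id": 45576876, "concept_code": "44054006"},
    "standard_concept": {"concept_id": 201826, "domain_id": "Condition"},
}
row = condition_row(1, "2021-03-04", resolution)  # hypothetical person and date

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE condition_occurrence (person_id INTEGER, "
    "condition_concept_id INTEGER, condition_start_date TEXT, "
    "condition_type_concept_id INTEGER, condition_source_value TEXT, "
    "condition_source_concept_id INTEGER)"
)
columns = ", ".join(row)
placeholders = ", ".join(f":{name}" for name in row)
conn.execute(
    f"INSERT INTO condition_occurrence ({columns}) VALUES ({placeholders})", row
)
stored = conn.execute(
    "SELECT condition_concept_id, condition_source_value FROM condition_occurrence"
).fetchone()
```

Binding values as named parameters keeps source codes out of the SQL text, which matters once source values come from uncontrolled local systems.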

3. Working with the Python SDK

The same four-step flow, in Python:
import omophub

client = omophub.OMOPHub()

# Step 1: Extract codings from your FHIR resource (you parse the JSON)
fhir_condition = {
    "coding": [
        {"system": "http://snomed.info/sct", "code": "44054006"},
        {"system": "http://hl7.org/fhir/sid/icd-10-cm", "code": "E11.9"},
    ],
}

# Step 2: Resolve
result = client.fhir.resolve_codeable_concept(
    coding=fhir_condition["coding"],
    resource_type="Condition",
)
res = result["best_match"]["resolution"]

# Step 3: Read domain and table assignment
print(f"Standard concept: {res['standard_concept']['concept_id']} ({res['standard_concept']['concept_name']})")
print(f"Domain:           {res['standard_concept']['domain_id']}")
print(f"Target table:     {res['target_table']}")
print(f"Mapping type:     {res['mapping_type']}")

# Step 4: Use the resolved values in your ETL
# insert into the appropriate CDM table with the standard + source concept IDs

4. Batch Processing: The ETL Pattern

In a real ETL pipeline you’re processing thousands of FHIR resources. Don’t resolve one at a time - use the batch endpoint. Deduplicate first, then batch-resolve the unique codings in chunks of 100:
import omophub
import json

client = omophub.OMOPHub()

# Load a FHIR Bundle (e.g., from a Bulk FHIR export)
with open("fhir_bundle.json") as f:
    bundle = json.load(f)

# Step 1: Extract all unique codings across all resources in the bundle
all_codings = set()
for entry in bundle.get("entry", []):
    resource = entry.get("resource", {})
    resource_type = resource.get("resourceType")
    if resource_type == "Condition":
        concept = resource.get("code", {})
    elif resource_type == "MedicationRequest":
        concept = resource.get("medicationCodeableConcept", {})
    else:
        continue  # ... handle other resource types
    for c in concept.get("coding", []):
        # system and code are both optional in FHIR; skip incomplete codings
        if c.get("system") and c.get("code"):
            all_codings.add((c["system"], c["code"]))

print(f"Total resources: {len(bundle.get('entry', [])):,}")
print(f"Unique codings:  {len(all_codings):,}")

# Step 2: Batch resolve unique codings (100 per call)
codings_list = [{"system": s, "code": c} for s, c in all_codings]
cache = {}

for i in range(0, len(codings_list), 100):
    chunk = codings_list[i : i + 100]
    result = client.fhir.resolve_batch(chunk)
    for item in result["results"]:
        if "resolution" in item:
            src = item["resolution"]["source_concept"]
            # Note: the key uses the OMOP vocabulary_id (e.g. "ICD10CM"), not the FHIR system URI
            cache[(src["vocabulary_id"], src["concept_code"])] = item["resolution"]
        else:
            # Failed coding - log for manual review
            print(f"  Failed: {item['error']['code']} - {item['error']['message']}")

# Steps 3-4: Apply the cache to every row in your full dataset
# (pandas merge / SQL JOIN / dict lookup, depending on your pipeline)
The deduplication step is critical. A FHIR Bulk Export with 500,000 Condition resources might contain only 2,000 unique diagnosis codes. Map the 2,000, then join against the full dataset. See Batch & Performance for the full pattern.
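The cache built above is keyed by (vocabulary_id, concept_code), while rows in the full dataset carry FHIR system URIs. A minimal sketch of the join as a dict lookup follows. The `SYSTEM_TO_VOCABULARY` table and the `apply_cache` helper are hypothetical, and the table covers only the two systems used in this guide's example.

```python
SYSTEM_TO_VOCABULARY = {
    "http://snomed.info/sct": "SNOMED",
    "http://hl7.org/fhir/sid/icd-10-cm": "ICD10CM",
}


def apply_cache(rows, cache):
    """Attach the cached standard concept to each row; 0 marks a cache miss."""
    resolved = []
    for row in rows:
        vocabulary = SYSTEM_TO_VOCABULARY.get(row["system"])
        resolution = cache.get((vocabulary, row["code"])) or {}
        standard = resolution.get("standard_concept") or {}
        resolved.append({
            **row,
            "concept_id": standard.get("concept_id", 0),
            "target_table": resolution.get("target_table"),
        })
    return resolved


cache = {
    ("SNOMED", "44054006"): {
        "standard_concept": {"concept_id": 201826},
        "target_table": "condition_occurrence",
    },
}
rows = [
    {"person_id": 1, "system": "http://snomed.info/sct", "code": "44054006"},
    {"person_id": 2, "system": "http://hospital.local/codes", "code": "X1"},
]
resolved_rows = apply_cache(rows, cache)
```

In a pandas or SQL pipeline the same idea becomes a left join on (vocabulary, code), with unmatched rows kept rather than dropped.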

5. Using the FHIR R4 Terminology Service

If your pipeline already speaks FHIR (you’re integrating with HAPI FHIR or EHRbase, or you’re building a spec-conformant client), you can use OMOPHub’s FHIR R4 terminology operations instead of the REST resolver.
$lookup - get concept details for a code:
curl "https://fhir.omophub.com/fhir/r4/CodeSystem/\$lookup?\
system=http://snomed.info/sct&code=44054006" \
  -H "Authorization: Bearer oh_your_api_key"
$translate - map between vocabularies:
curl "https://fhir.omophub.com/fhir/r4/ConceptMap/\$translate?\
system=http://hl7.org/fhir/sid/icd-10-cm&code=E11.9&\
target=http://snomed.info/sct" \
  -H "Authorization: Bearer oh_your_api_key"
$validate-code - check if a code exists:
curl "https://fhir.omophub.com/fhir/r4/CodeSystem/\$validate-code?\
url=http://snomed.info/sct&code=44054006" \
  -H "Authorization: Bearer oh_your_api_key"
These return standard FHIR Parameters responses and can be consumed directly by FHIR-aware clients. See the FHIR Terminology Service overview for the full operation reference.
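If you are not using a FHIR client library, a Parameters resource can be flattened with a few lines of Python. The sketch below relies only on the generic FHIR Parameters structure (a `parameter` array whose entries have a `name` and a `value[x]` key). The sample payload is hand-written for illustration and is not captured OMOPHub output; `parameters_to_dict` is a hypothetical helper.

```python
def parameters_to_dict(parameters: dict) -> dict:
    """Flatten top-level FHIR Parameters entries into {name: value}.

    Entries that carry nested `part` lists instead of a value are skipped,
    and a repeated name keeps only its last value.
    """
    flat = {}
    for param in parameters.get("parameter", []):
        # The value lives under a key such as valueString or valueBoolean.
        value_keys = [key for key in param if key.startswith("value")]
        if value_keys:
            flat[param["name"]] = param[value_keys[0]]
    return flat


sample = {
    "resourceType": "Parameters",
    "parameter": [
        {"name": "name", "valueString": "SNOMED CT"},
        {"name": "display", "valueString": "Type 2 diabetes mellitus"},
        {"name": "property", "part": [{"name": "code", "valueCode": "parent"}]},
    ],
}
lookup = parameters_to_dict(sample)
```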
Concept Resolver vs FHIR Terminology Service - when to use which:
  • Use the Concept Resolver (/v1/fhir/resolve*) when you want the complete OMOP mapping chain in one call - source concept, standard concept, domain, target CDM table, mapping type. This is purpose-built for ETL pipelines and returns OMOPHub’s native JSON envelope.
  • Use the FHIR R4 Terminology Service (/fhir/r4/*) when you’re integrating with FHIR infrastructure (HAPI FHIR, EHRbase, Firely) that expects spec-conformant FHIR operations and OperationOutcome error responses.
Both use the same underlying vocabulary data. The difference is response format and how much resolution logic the server hands you in a single call.

6. Handling Edge Cases

One-to-many mappings

A single ICD-10 code can map to multiple SNOMED standard concepts (e.g. a combination diagnosis that splits into separate Condition and Observation concepts). The single-coding resolver returns alternative_standard_concepts alongside the primary standard_concept. For ETL pipelines, inspect that array and decide whether to write one row per alternative or pick the highest-quality match.
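A sketch of the one-row-per-alternative option. It assumes alternative_standard_concepts sits next to standard_concept inside the resolution object and that each entry has the same shape as standard_concept; neither is confirmed by the example response shown earlier. The concept IDs and source code below are placeholders, not real OMOP concepts.

```python
def rows_for_resolution(resolution: dict, source_value: str) -> list[dict]:
    """Emit one candidate row per standard concept (primary plus alternatives)."""
    concepts = [resolution.get("standard_concept")]
    concepts += resolution.get("alternative_standard_concepts") or []
    rows = []
    for concept in concepts:
        if not concept:
            continue
        rows.append({
            "concept_id": concept["concept_id"],
            "domain_id": concept.get("domain_id"),
            "source_value": source_value,
        })
    return rows


resolution = {
    "standard_concept": {"concept_id": 1001, "domain_id": "Condition"},  # placeholder ID
    "alternative_standard_concepts": [
        {"concept_id": 1002, "domain_id": "Observation"},  # placeholder ID
    ],
}
candidate_rows = rows_for_resolution(resolution, "LOCAL-EXAMPLE")
```

Because each candidate carries its own domain_id, the rows can land in different CDM tables while sharing the same source value.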

Unmapped codes

If a code doesn’t map to any standard concept, the Resolver returns mapping_type: "unmapped" with no standard_concept. Your pipeline should:
  1. Store the source code in the *_source_value field
  2. Set *_concept_id to 0 (OMOP convention for unmapped)
  3. Log the unmapped code for manual review - don’t silently drop records
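These three rules can be sketched as one helper that fills the concept columns and counts unmapped codes for later review. The names `concept_fields` and `unmapped_codes` are hypothetical, and the local code in the example is made up.

```python
from collections import Counter

unmapped_codes = Counter()  # (system, code) -> number of records affected


def concept_fields(prefix: str, system: str, code: str, resolution) -> dict:
    """Return the *_concept_id / *_source_value pair for one record."""
    resolution = resolution or {}
    standard = resolution.get("standard_concept")
    if resolution.get("mapping_type") == "unmapped" or not standard:
        unmapped_codes[(system, code)] += 1  # keep the record, log the code
        concept_id = 0  # OMOP convention for unmapped
    else:
        concept_id = standard["concept_id"]
    return {f"{prefix}_concept_id": concept_id, f"{prefix}_source_value": code}


fields = concept_fields(
    "condition",
    "http://hospital.local/codes",
    "ABC123",  # made-up local code
    {"mapping_type": "unmapped"},
)
```

Counting per code rather than per record gives a review list ordered by impact: the codes that affect the most rows surface first via `unmapped_codes.most_common()`.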

Local / proprietary codes

Hospital-specific codes with custom FHIR system URIs (e.g. http://hospital.local/codes) won’t be in the OMOP vocabulary tables. Two options:
  • Pass a display value with no system/code. The Resolver falls back to semantic search over the display text, scoped to the resource_type domain, and returns mapping_type: "semantic_match" with a similarity_score.
  • Pre-map your local codes to standard vocabularies as part of your site configuration, then send standard codes to the Resolver. See Collaborative Mapping for the shared-mapping-file pattern.
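The second option can be sketched as a lookup applied before codings are sent to the Resolver. The mapping table, the local code "DM2", and the helper name `premap` are all hypothetical; in practice the table would be loaded from your site configuration.

```python
# Hypothetical site configuration: local (system, code) -> standard coding
LOCAL_CODE_MAP = {
    ("http://hospital.local/codes", "DM2"): {
        "system": "http://snomed.info/sct",
        "code": "44054006",
    },
}


def premap(coding: dict) -> dict:
    """Swap a known local coding for its standard equivalent; pass others through."""
    key = (coding.get("system"), coding.get("code"))
    return LOCAL_CODE_MAP.get(key, coding)


incoming = [
    {"system": "http://hospital.local/codes", "code": "DM2"},
    {"system": "http://hl7.org/fhir/sid/icd-10-cm", "code": "E11.9"},
]
outgoing = [premap(c) for c in incoming]
```

Keep the original local code for the *_source_value column; the pre-mapped coding is only what gets sent for resolution.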

7. The Complete Pipeline Architecture

┌──────────────┐     ┌──────────────────┐     ┌──────────────────┐
│  FHIR Source │     │     OMOPHub      │     │     OMOP CDM     │
│              │     │                  │     │                  │
│ Condition    │     │ 1. Deduplicate   │     │ condition_       │
│ Observation  │     │    unique codes  │     │   occurrence     │
│ Medication   │────▶│                  │────▶│ measurement      │
│ Procedure    │     │ 2. Batch resolve │     │ drug_exposure    │
│ ...          │     │    via Concept   │     │ procedure_       │
│              │     │    Resolver      │     │   occurrence     │
│              │     │                  │     │ observation      │
│              │     │ 3. Cache results │     │ ...              │
│              │     │                  │     │                  │
│              │     │ 4. Apply to full │     │                  │
│              │     │    dataset       │     │                  │
└──────────────┘     └──────────────────┘     └──────────────────┘
The first ETL run makes the most Resolver calls. Each subsequent run makes fewer, because the mapping cache grows and only genuinely new codes need resolution. By the third or fourth run, the bottleneck is typically no longer API calls but whatever else your pipeline is doing.
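A sketch of a cache that persists between runs, so only unseen codings reach the Resolver. Here `resolve_fn` stands in for whatever batch call your pipeline makes and is assumed to return one result per input coding, in input order. The file name and the stub resolver are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def cache_key(coding: dict) -> str:
    # JSON object keys must be strings, so join system and code.
    return f"{coding['system']}|{coding['code']}"


def resolve_with_cache(codings, path: Path, resolve_fn):
    """Resolve only codings missing from the on-disk cache, then persist it."""
    cache = json.loads(path.read_text()) if path.exists() else {}
    new = [c for c in codings if cache_key(c) not in cache]
    if new:
        for coding, resolution in zip(new, resolve_fn(new)):
            cache[cache_key(coding)] = resolution
        path.write_text(json.dumps(cache))
    return cache, len(new)


calls = []


def fake_resolve(batch):
    """Stub standing in for the real batch call; records each batch size."""
    calls.append(len(batch))
    return [{"mapping_type": "unmapped"} for _ in batch]


codings = [
    {"system": "http://snomed.info/sct", "code": "44054006"},
    {"system": "http://hl7.org/fhir/sid/icd-10-cm", "code": "E11.9"},
]
with tempfile.TemporaryDirectory() as tmp:
    cache_path = Path(tmp) / "mapping_cache.json"
    _, first_run_new = resolve_with_cache(codings, cache_path, fake_resolve)
    _, second_run_new = resolve_with_cache(codings, cache_path, fake_resolve)
```

A cache like this should be rebuilt or invalidated when the underlying vocabulary release changes, since mappings can be revised between releases.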

FHIR Integration Cookbook

Resource-by-resource mappings: Condition, Observation, MedicationRequest, Procedure, and more, with the exact Coding extraction logic for each.

FHIR Terminology Service

Full operation reference for the FHIR R4 Terminology Service: $lookup, $translate, $validate-code, $expand, $subsumes, $find-matches, $closure, $diff.

Lean ETL Mapping Cache

Build validated mapping caches during development and apply them locally at production speed.

Collaborative Mapping

Share mappings across teams via source_to_concept_map files.

Batch & Performance

Deduplication, batch endpoints, cache patterns for ETL at scale.

Known Limitations

What OMOPHub does not do. FHIR-specific caveats, vocabulary exclusions, and what’s on the roadmap.