If you’re processing thousands or millions of records through OMOPHub, how you structure your API calls matters. This guide covers patterns that reduce your API usage, improve throughput, and keep your pipelines fast.

1. Rule 1: Deduplicate Before You Map

This is the single most impactful optimization. A dataset with 10 million patient records might contain only 3,000 unique diagnosis codes. Map the 3,000, not the 10 million.
```python
import omophub
import pandas as pd

client = omophub.OMOPHub()

# Load your source data
df = pd.read_csv("patient_diagnoses.csv")
print(f"Total rows: {len(df):,}")             # 10,000,000

# Extract unique codes
unique_codes = df["icd10_code"].dropna().unique().tolist()
print(f"Unique codes: {len(unique_codes):,}") # 2,847

# Batch-resolve the unique codes - not the 10 million rows
result = client.mappings.map(
    target_vocabulary="SNOMED",
    source_codes=[
        {"vocabulary_id": "ICD10CM", "concept_code": code}
        for code in unique_codes
    ],
)

# Build a {source_code: target_concept_id} cache
mapping_cache = {}
for m in result["data"]["mappings"]:
    mapping_cache[m["source_concept_code"]] = m["target_concept_id"]

# Apply the cache to all 10M rows - zero additional API calls
df["snomed_concept_id"] = df["icd10_code"].map(mapping_cache)
```
This turns 10 million API calls into ~30 calls (2,847 codes / 100 items per batch = 29 batches, each counting as one call). Same result, ~350,000x fewer calls.

2. Rule 2: Use Batch Endpoints

OMOPHub provides batch and bulk endpoints that process multiple items in a single HTTP request. Each batch request counts as one API call, regardless of how many items are in the batch. Available batch and bulk endpoints:
| Endpoint | Purpose |
|---|---|
| `POST /v1/concepts/batch` | Retrieve up to 100 concepts by ID |
| `POST /v1/concepts/map/batch` | Map up to 100 source codes or source concept IDs to a target vocabulary |
| `POST /v1/concepts/hierarchy/batch` | Batch ancestor / descendant lookups |
| `POST /v1/concepts/relationships/batch` | Batch relationship queries |
| `POST /v1/search/bulk` | Run multiple search queries in one request |
| `POST /v1/search/semantic-bulk` | Batch semantic search with embeddings |
| `POST /v1/fhir/resolve/batch` | FHIR Resolver batch: up to 100 codings per request |
```python
# Instead of this (N API calls):
for concept_id in concept_ids:
    result = client.concepts.get(concept_id)

# Do this (1 API call per 100 concepts):
result = client.concepts.batch(concept_ids[:100])
for concept in result["data"]["concepts"]:
    print(f"{concept['concept_id']}: {concept['concept_name']}")
```
Batch endpoints accept up to 100 items per request. For larger inputs, chunk into groups of 100 and send the chunks sequentially or with controlled concurrency. Each chunk counts as one API call against your quota.
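The chunking step can be sketched with a small helper. This is a sketch, not part of the SDK: `chunked` and `batch_get_concepts` are illustrative names, and the `client.concepts.batch` call and response shape follow the example above.

```python
def chunked(items, size=100):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def batch_get_concepts(client, concept_ids):
    """Resolve any number of concept IDs, 100 per API call."""
    concepts = []
    for chunk in chunked(concept_ids, 100):
        result = client.concepts.batch(chunk)
        concepts.extend(result["data"]["concepts"])
    return concepts
```

With 2,847 IDs this issues 29 sequential requests; swap the loop for a bounded worker pool if you need controlled concurrency.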

3. Rule 3: Cache What’s Stable

Vocabulary data changes only when OHDSI publishes a new ATHENA release (typically every 2 to 3 months). Concept metadata you look up today will return the same result tomorrow, next week, and next month until the next release. Design your caching accordingly:
```python
import json
from pathlib import Path

CACHE_FILE = Path("vocabulary_cache.json")

# Load the cache once per process, not once per lookup
cache = json.loads(CACHE_FILE.read_text()) if CACHE_FILE.exists() else {}

def get_with_cache(client, concept_id):
    key = str(concept_id)
    if key in cache:
        return cache[key]

    # Cache miss: fetch from the API and persist
    result = client.concepts.get(concept_id)
    cache[key] = result
    CACHE_FILE.write_text(json.dumps(cache))
    return result
```
For production ETL pipelines, consider:
  • File cache (JSON, SQLite) for single-machine pipelines
  • Redis cache with a TTL aligned to your vocabulary update cycle (60-90 days)
  • Database table (source_to_concept_map) for team-wide shared caches. See the Collaborative Mapping guide.
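For the single-machine case, the file-cache idea scales better with SQLite than with one big JSON blob, since each write touches one row instead of rewriting the whole file. A minimal sketch; the class, table name, and schema are illustrative, not an OMOPHub convention:

```python
import json
import sqlite3

class ConceptCache:
    """Tiny SQLite-backed cache for concept lookups."""

    def __init__(self, path="vocabulary_cache.db"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS concept_cache ("
            "concept_id TEXT PRIMARY KEY, payload TEXT)"
        )

    def get(self, concept_id):
        row = self.conn.execute(
            "SELECT payload FROM concept_cache WHERE concept_id = ?",
            (str(concept_id),),
        ).fetchone()
        return json.loads(row[0]) if row else None

    def put(self, concept_id, payload):
        self.conn.execute(
            "INSERT OR REPLACE INTO concept_cache VALUES (?, ?)",
            (str(concept_id), json.dumps(payload)),
        )
        self.conn.commit()
```

A cache miss then falls through to `client.concepts.get` and a `put`, exactly as in the JSON example above.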
The Lean ETL Mapping Cache guide walks through the full pattern: build a validated mapping cache with OMOPHub during development, apply it with local lookups in production.

4. Rule 4: Use the Right Search Endpoint

Different search endpoints have different characteristics. Use the most specific one for your use case.
| Endpoint | Best for |
|---|---|
| `GET /v1/concepts/{concept_id}` | Direct lookup when you already have the OMOP concept ID |
| `GET /v1/concepts/by-code` | Lookup by vocabulary code (e.g. ICD-10 E11.9) when you know the vocabulary |
| `GET /v1/search/concepts` | Keyword / full-text search with filters |
| `GET /v1/search/autocomplete` | Prefix matching for search-as-you-type UIs |
| `GET /v1/concepts/semantic-search` | Natural-language or fuzzy matching via embeddings |
If you have the concept ID, don’t search by name. If you have the exact vocabulary code, use by-code instead of text search. Save semantic search for when the user query is ambiguous or phrased in natural language - it handles synonyms, abbreviations, and clinical descriptions, but is meaningfully slower than keyword search.
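That decision rule can be condensed into a small routing helper. Purely illustrative: the function is not part of any SDK, and the endpoint paths come from the table above.

```python
def pick_search_endpoint(concept_id=None, code=None, vocabulary=None,
                         query=None, natural_language=False):
    """Return the most specific OMOPHub endpoint for the inputs at hand."""
    if concept_id is not None:
        return f"/v1/concepts/{concept_id}"     # direct lookup
    if code and vocabulary:
        return "/v1/concepts/by-code"           # exact vocabulary code
    if natural_language:
        return "/v1/concepts/semantic-search"   # fuzzy / natural language
    return "/v1/search/concepts"                # keyword search
```

The ordering encodes the guidance: the more you already know about the input, the cheaper and faster the lookup.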

5. Rule 5: Build Autocomplete Responsibly

If you’re powering a search-as-you-type UI with OMOPHub:
  • Debounce aggressively. Don’t fire an API call on every keystroke; wait 300-500ms after the user stops typing before sending the request.
  • Use the autocomplete endpoint. GET /v1/search/autocomplete is optimized for prefix matching and returns faster than full search.
  • Set a minimum query length. Don’t search for single characters; require at least 3 characters before triggering a search.
  • Cache recent results client-side. If the user types “diab”, gets results, then types “diabe”, the “diab” result set is a superset of the “diabe” matches and can be filtered locally without another request.
```javascript
// Pseudocode for responsible autocomplete
const DEBOUNCE_MS = 400;
const MIN_CHARS = 3;
const cache = new Map();
let debounceTimer;

function onSearchInput(query) {
  clearTimeout(debounceTimer);
  if (query.length < MIN_CHARS) return;

  debounceTimer = setTimeout(async () => {
    if (cache.has(query)) return showResults(cache.get(query));

    const res = await fetch(
      `https://api.omophub.com/v1/search/autocomplete?query=${encodeURIComponent(query)}`,
      { headers: { Authorization: `Bearer ${API_KEY}` } },
    );
    const data = await res.json();
    cache.set(query, data);
    showResults(data);
  }, DEBOUNCE_MS);
}
```
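The local-refinement idea (serving “diabe” from cached “diab” results) is independent of the UI layer. A sketch in Python for clarity, assuming each cached result carries a concept_name field as in the earlier examples; it is only safe when the cached prefix query returned its complete, untruncated match set:

```python
def refine_locally(cache, query):
    """Serve an autocomplete query from a cached shorter prefix, if any.

    Returns the filtered result list, or None if no cached prefix
    applies and a real API call is needed.
    """
    # Try the longest cached prefix first
    for prefix_len in range(len(query) - 1, 0, -1):
        prefix = query[:prefix_len]
        if prefix in cache:
            q = query.lower()
            return [
                r for r in cache[prefix]
                if q in r["concept_name"].lower()
            ]
    return None
```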

6. Rule 6: Limit Hierarchy Depth

Concept hierarchies can be deep: SNOMED "Is a" chains sometimes traverse 20+ levels for highly specialized terms. For most phenotype definitions, 3 to 5 levels is plenty.
```python
# Inefficient: unlimited depth
ancestors = client.hierarchy.ancestors(201826)

# Efficient: clinically relevant depth
ancestors = client.hierarchy.ancestors(
    201826,
    max_levels=5,
    vocabulary_ids=["SNOMED"],       # narrow by vocabulary
    relationship_types=["Is a"],     # narrow by relationship type
)
```
`max_levels` matters both for latency and for the number of concepts you pay to pull back. Cap it at the clinical depth you actually need.

7. Rule 7: Handle Errors and Retries

OMOPHub returns standard HTTP status codes. Build retry logic for transient failures only:
| Status | Meaning | Action |
|---|---|---|
| 200 | Success | Process the response |
| 400 | Bad request | Fix your request; do not retry |
| 401 | Unauthorized | Check your API key |
| 404 | Not found | Concept or code does not exist; do not retry |
| 429 | Rate limited | Back off and retry after the Retry-After header |
| 5xx | Server error | Retry with exponential backoff |
```python
import time

import omophub

def api_call_with_retry(fn, max_retries=3):
    for attempt in range(max_retries):
        try:
            return fn()
        except omophub.RateLimitError as e:
            # Honor the Retry-After header when the server provides one
            wait = getattr(e, "retry_after", None) or (2 ** attempt)
            time.sleep(wait)
        except omophub.ServerError:
            # Exponential backoff: 1s, 2s, 4s, ...
            time.sleep(2 ** attempt)
    raise RuntimeError("Max retries exceeded")
```
Never retry 400-level errors (except 429). A 400 means your request is malformed; retrying will produce the same result. A 404 means the concept or code does not exist in the vocabulary, and your code should flag it for manual review, not loop.

8. Putting It All Together: ETL Pipeline Pattern

Here’s the recommended shape for a production ETL pipeline:
1. Extract unique source codes. Pull a distinct list of codes from your source data. This is your mapping input, not the full patient dataset.
2. Check your local cache. Before hitting the API, check whether you’ve already mapped each code in a previous run. Load your source_to_concept_map cache.
3. Batch-resolve the cache misses. For codes not in the cache, use the batch mapping endpoint (POST /v1/concepts/map/batch) or the FHIR Resolver batch (POST /v1/fhir/resolve/batch). Chunk into groups of 100.
4. Update your local cache. Write the new mappings back to your cache file or database table.
5. Apply mappings to the full dataset. Join the mapping cache against your source data via local lookup (pandas merge, SQL JOIN, dict lookup). No API calls needed for this step.
6. Flag unmapped codes. Any source code with no mapping gets flagged for manual review. Do not silently drop records.
This pattern means your first ETL run makes the most API calls, and every subsequent run makes fewer because the cache grows. By the third or fourth run, you’re hitting the API only for genuinely new codes.
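The six steps condense into a short function. This is a sketch of the flow, not a reference implementation: `run_mapping_pipeline` and `map_batch` are illustrative names, and the actual mapping call (a wrapper around POST /v1/concepts/map/batch, chunked by 100) is passed in as a callable so the structure stays visible.

```python
import pandas as pd

def run_mapping_pipeline(df, code_column, cache, map_batch):
    """Steps 1-6: dedupe, check cache, batch-resolve misses, apply, flag.

    `cache` is a {source_code: concept_id} dict; `map_batch` is any
    callable mapping a list of codes to a {code: concept_id} dict.
    """
    # 1. Unique source codes, not the full patient dataset
    unique_codes = df[code_column].dropna().unique().tolist()

    # 2. Check the local cache
    misses = [c for c in unique_codes if c not in cache]

    # 3-4. Batch-resolve the misses and grow the cache
    if misses:
        cache.update(map_batch(misses))

    # 5. Apply with a local lookup: zero API calls
    df = df.assign(target_concept_id=df[code_column].map(cache))

    # 6. Flag unmapped codes for manual review; never drop them silently
    unmapped = sorted(
        df.loc[df["target_concept_id"].isna(), code_column].dropna().unique()
    )
    return df, unmapped
```

On the second run, codes resolved earlier are already in `cache`, so `map_batch` receives only the genuinely new ones.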
For more on the full end-to-end pattern, see Lean ETL Mapping Cache and Collaborative Mapping.