The most important impact the Standardized Vocabularies have on the ETL process from raw to CDM-formatted data is the Domain of each Concept. Regardless of which source table a record comes from, or which coding scheme represents it, the destination table is determined by the domain_id of the respective Concept. Every ETL has to apply the following logic to each record in the source data:
domain_id | CDM table | Field | Comment |
---|---|---|---|
Generic | Any | Any | Generic Concepts can be in any field that ends in concept_id. |
Gender | PERSON | gender_concept_id | |
Race | PERSON | race_concept_id | |
Ethnicity | PERSON | ethnicity_concept_id | |
Visit | VISIT_OCCURRENCE | visit_concept_id | |
Procedure | PROCEDURE_OCCURRENCE | procedure_concept_id | |
Modifier | PROCEDURE_OCCURRENCE | modifier_concept_id | |
Drug | DRUG_EXPOSURE | drug_concept_id | |
Route | DRUG_EXPOSURE | route_concept_id | |
Unit | MEASUREMENT or OBSERVATION or SPECIMEN | unit_concept_id | Units are used in different contexts. * |
Device | DEVICE_EXPOSURE | device_concept_id | |
Condition | CONDITION_OCCURRENCE | condition_concept_id | |
Measurement | MEASUREMENT | measurement_concept_id | |
Meas Value Operator | MEASUREMENT | operator_concept_id | |
Meas Value | MEASUREMENT | value_as_concept_id | |
Observation | OBSERVATION | observation_concept_id | |
Relationship | FACT_RELATIONSHIP | relationship_concept_id | |
Place of Service | CARE_SITE | place_of_service_concept_id | |
Provider Specialty | PROVIDER | specialty_concept_id | |
Currency | Any *_COST table | currency_concept_id | Currency values appear in any of the *_COST tables. * |
Revenue Code | PROCEDURE_COST | revenue_code_concept_id | |
Specimen | SPECIMEN | specimen_concept_id | |
Spec Anatomic Site | SPECIMEN or MEASUREMENT or OBSERVATION | anatomic_site_concept_id (SPECIMEN) or value_as_concept_id (MEASUREMENT, OBSERVATION) | Anatomical Site Concepts are used to characterize the origin of a Specimen, but also the result of a Measurement or Observation. * |
Spec Disease Status | SPECIMEN | disease_status_concept_id | |
* If there is more than one potential destination table, the ETL needs to identify the context in which a Standard Concept is used and select the right destination from the options listed in this table.
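The routing described by the table can be sketched as a simple lookup. This is only an illustration, not part of any OHDSI tool: the names `SINGLE_DESTINATION`, `CONTEXT_DEPENDENT`, `destination` and `context_table` are hypothetical, only a subset of the rows is included, and the Generic and Currency rows are left out for brevity.

```python
from typing import Dict, Optional, Tuple

# Domains with exactly one destination: domain_id -> (CDM table, field).
# A subset of the rows in the table above.
SINGLE_DESTINATION: Dict[str, Tuple[str, str]] = {
    "Gender": ("PERSON", "gender_concept_id"),
    "Visit": ("VISIT_OCCURRENCE", "visit_concept_id"),
    "Procedure": ("PROCEDURE_OCCURRENCE", "procedure_concept_id"),
    "Drug": ("DRUG_EXPOSURE", "drug_concept_id"),
    "Device": ("DEVICE_EXPOSURE", "device_concept_id"),
    "Condition": ("CONDITION_OCCURRENCE", "condition_concept_id"),
    "Measurement": ("MEASUREMENT", "measurement_concept_id"),
    "Observation": ("OBSERVATION", "observation_concept_id"),
    "Specimen": ("SPECIMEN", "specimen_concept_id"),
}

# Domains marked with * in the table: domain_id -> {CDM table: field}.
CONTEXT_DEPENDENT: Dict[str, Dict[str, str]] = {
    "Unit": {
        "MEASUREMENT": "unit_concept_id",
        "OBSERVATION": "unit_concept_id",
        "SPECIMEN": "unit_concept_id",
    },
    "Spec Anatomic Site": {
        "SPECIMEN": "anatomic_site_concept_id",
        "MEASUREMENT": "value_as_concept_id",
        "OBSERVATION": "value_as_concept_id",
    },
}


def destination(domain_id: str, context_table: Optional[str] = None) -> Tuple[str, str]:
    """Return (CDM table, field) for a Standard Concept's domain_id.

    context_table is a hypothetical argument: for domains with several
    possible destinations, the ETL passes the table it is currently filling.
    """
    if domain_id in SINGLE_DESTINATION:
        return SINGLE_DESTINATION[domain_id]
    if domain_id in CONTEXT_DEPENDENT:
        fields = CONTEXT_DEPENDENT[domain_id]
        if context_table not in fields:
            raise ValueError(
                f"domain_id {domain_id!r} needs a context table, one of {sorted(fields)}"
            )
        return context_table, fields[context_table]
    raise KeyError(f"no route defined for domain_id {domain_id!r}")
```

With this sketch, `destination("Procedure")` selects PROCEDURE_OCCURRENCE regardless of the source table the record came from, while `destination("Unit")` fails until the caller supplies the context, for example `destination("Unit", "MEASUREMENT")`.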
If the mapping process retrieves a Concept of the “Domain” or “Metadata” Domain, an error has snuck into the mapping table. Please report those cases in the CDM-Builder.
The same is true if a Concept from one of the Type Concept Domains, such as “Obs Period Type”, “Death Type”, “Visit Type” or “Procedure Type”, is produced. Although these Type Concepts are valid Concepts and have to be placed into the <domain>_type_concept_id field of the respective CDM tables, they cannot be the result of the mapping process. They denote the origin of the record, and the selection of Type Concepts should be hard-wired into the ETL process.
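A minimal sketch of such a guard follows. All names (`FORBIDDEN_DOMAINS`, `is_type_domain`, `check_mapped_domain`, `build_row`) are hypothetical, and recognizing Type Concept Domains by a " Type" suffix is an assumption based only on the examples named above. The Type Concept is passed in by the ETL itself rather than looked up through the mapping.

```python
from typing import Dict

# Domains that must never come out of the source-to-standard mapping.
FORBIDDEN_DOMAINS = {"Domain", "Metadata"}


def is_type_domain(domain_id: str) -> bool:
    # Assumption: Type Concept Domains end in " Type", as in
    # "Obs Period Type", "Death Type", "Visit Type", "Procedure Type".
    return domain_id.endswith(" Type")


def check_mapped_domain(domain_id: str) -> None:
    """Raise if a mapped Concept belongs to a domain the mapping must not produce."""
    if domain_id in FORBIDDEN_DOMAINS:
        raise ValueError(f"mapping produced a {domain_id!r} Concept: error in the mapping table")
    if is_type_domain(domain_id):
        raise ValueError(f"mapping produced a Type Concept ({domain_id!r}): Type Concepts are set by the ETL")


def build_row(concept_field: str, concept_id: int, domain_id: str,
              type_field: str, type_concept_id: int) -> Dict[str, int]:
    """Combine a mapped Concept with a Type Concept chosen by the ETL.

    type_concept_id is hard-wired by the caller for each source,
    never derived from the mapping.
    """
    check_mapped_domain(domain_id)
    return {concept_field: concept_id, type_field: type_concept_id}
```

Here the Type Concept never passes through `check_mapped_domain`; it is a constant that the ETL chooses for each kind of source record.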