The most important impact the Standardized Vocabularies have on the ETL process from raw to CDM-formatted data is the Domain of each Concept. Irrespective of which source table a record comes from, or which coding scheme represents it, the destination table is determined by the domain_id of the respective Concept. Every ETL has to apply the following logic to each record in the source data:
domain_id | CDM table | Field | Comment |
---|---|---|---|
Generic | Any | Any | Generic Concepts can be in any field that ends in concept_id. |
Gender | PERSON | gender_concept_id | |
Race | PERSON | race_concept_id | |
Ethnicity | PERSON | ethnicity_concept_id | |
Visit | VISIT_OCCURRENCE | visit_concept_id | |
Procedure | PROCEDURE_OCCURRENCE | procedure_concept_id | |
Modifier | PROCEDURE_OCCURRENCE | modifier_concept_id | |
Drug | DRUG_EXPOSURE | drug_concept_id | |
Route | DRUG_EXPOSURE | route_concept_id | |
Unit | MEASUREMENT or OBSERVATION or SPECIMEN | unit_concept_id | Units are used in different contexts. * |
Device | DEVICE_EXPOSURE | device_concept_id | |
Condition | CONDITION_OCCURRENCE | condition_concept_id | |
Measurement | MEASUREMENT | measurement_concept_id | |
Meas Value Operator | MEASUREMENT | operator_concept_id | |
Meas Value | MEASUREMENT | value_as_concept_id | |
Observation | OBSERVATION | observation_concept_id | |
Relationship | FACT_RELATIONSHIP | relationship_concept_id | |
Place of Service | CARE_SITE | place_of_service_concept_id | |
Provider Specialty | PROVIDER | specialty_concept_id | |
Currency | VISIT_COST or PROCEDURE_COST or DRUG_COST or DEVICE_COST | currency_concept_id | Currency values appear in any of the *_COST tables. * |
Revenue Code | PROCEDURE_COST | revenue_code_concept_id | |
Specimen | SPECIMEN | specimen_concept_id | |
Spec Anatomic Site | SPECIMEN or MEASUREMENT or OBSERVATION | anatomic_site_concept_id or value_as_concept_id or value_as_concept_id | Anatomical Site Concepts are used to characterize the origin of a Specimen, but also the result of a Measurement or Observation. * |
Spec Disease Status | SPECIMEN | disease_status_concept_id | |
* If there is more than one potential destination table, the ETL needs to identify the context in which a Standard Concept is used and select the right table from the options listed in this table.
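The routing in the table above can be sketched as a simple lookup. This is only an illustration, not part of any OHDSI tool: the names `SINGLE_DESTINATION`, `MULTI_DESTINATION`, `destination` and the `context_table` parameter are hypothetical, only a few single-destination rows are included, and the "Generic" Domain is left out because it can go into any field that ends in concept_id.

```python
# Hypothetical lookup derived from the table above (abbreviated, names are illustrative).
SINGLE_DESTINATION = {
    "Gender": ("PERSON", "gender_concept_id"),
    "Visit": ("VISIT_OCCURRENCE", "visit_concept_id"),
    "Procedure": ("PROCEDURE_OCCURRENCE", "procedure_concept_id"),
    "Drug": ("DRUG_EXPOSURE", "drug_concept_id"),
    "Condition": ("CONDITION_OCCURRENCE", "condition_concept_id"),
    "Measurement": ("MEASUREMENT", "measurement_concept_id"),
    "Observation": ("OBSERVATION", "observation_concept_id"),
}

# Domains marked with * in the table: the ETL has to supply the context (destination table).
MULTI_DESTINATION = {
    "Unit": {
        "MEASUREMENT": "unit_concept_id",
        "OBSERVATION": "unit_concept_id",
        "SPECIMEN": "unit_concept_id",
    },
    "Currency": {
        "VISIT_COST": "currency_concept_id",
        "PROCEDURE_COST": "currency_concept_id",
        "DRUG_COST": "currency_concept_id",
        "DEVICE_COST": "currency_concept_id",
    },
    "Spec Anatomic Site": {
        "SPECIMEN": "anatomic_site_concept_id",
        "MEASUREMENT": "value_as_concept_id",
        "OBSERVATION": "value_as_concept_id",
    },
}


def destination(domain_id, context_table=None):
    """Return (CDM table, field) for a Concept's domain_id.

    context_table is an assumed parameter: for ambiguous Domains the caller
    states which table the record is being written to.
    """
    if domain_id in SINGLE_DESTINATION:
        return SINGLE_DESTINATION[domain_id]
    if domain_id in MULTI_DESTINATION:
        options = MULTI_DESTINATION[domain_id]
        if context_table not in options:
            raise ValueError(
                f"Domain {domain_id!r} needs a context table, one of {sorted(options)}"
            )
        return context_table, options[context_table]
    raise KeyError(f"No destination known for Domain {domain_id!r}")
```

For example, `destination("Drug")` yields the DRUG_EXPOSURE table and the drug_concept_id field regardless of the source table, while `destination("Unit")` without a context raises an error because the ETL has not said where the unit is being used.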
If the Domain of the destination Concept is “Domain” or “Metadata”, an error has occurred in the construction of the mapping table. Please report in the CDM-Builder forum if you believe there is such an error in the Standardized Vocabularies data.
The same is true if you find a Type Concept Domain, like “Obs Period Type”, “Death Type”, “Visit Type”, “Procedure Type”, etc. These Type Concepts are valid Concepts and have to be placed into the *_type_concept_id field of the respective CDM table. However, they should never be introduced as part of a mapping process: they denote the origin of the record, so there is no equivalent information in the source data. Type Concepts are hard-wired in the ETL process, because how to assign them to records depends on the structure of the source data. Please also report if you find a situation like that.
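An ETL could guard against both situations with a small check on the Domain of each mapped Concept. The sketch below is hypothetical: the function name `mapping_problem` is made up, and detecting Type Concept Domains by the suffix " Type" is an assumption based only on the examples named above.

```python
# Domains that indicate an error in the construction of the mapping table.
INVALID_DOMAINS = {"Domain", "Metadata"}


def mapping_problem(domain_id):
    """Return a description of the problem, or None if the Domain is acceptable.

    Assumption: Type Concept Domains can be recognized by the suffix " Type",
    as in "Obs Period Type", "Death Type", "Visit Type", "Procedure Type".
    """
    if domain_id in INVALID_DOMAINS:
        return f"Domain {domain_id!r} must not be the target of a mapping"
    if domain_id.endswith(" Type"):
        return (
            f"Domain {domain_id!r} is a Type Concept Domain; "
            "Type Concepts are assigned by the ETL, not by mapping"
        )
    return None
```

A record whose mapped Concept yields a non-empty result here would be set aside and reported rather than written to a CDM table.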