Data ETL

The most important way the Standardized Vocabularies affect the ETL process from raw to CDM-formatted data is through the Domain of each Concept. Regardless of which source table a record comes from, or which coding scheme represents it, the destination table is determined by the domain_id of the respective Concept. Every ETL has to apply the following logic to each record in the source data:

| domain_id | CDM table | Field | Comment |
|---|---|---|---|
| Generic | Any | Any | Generic Concepts can be in any field that ends in concept_id. |
| Gender | PERSON | gender_concept_id | |
| Race | PERSON | race_concept_id | |
| Ethnicity | PERSON | ethnicity_concept_id | |
| Visit | VISIT_OCCURRENCE | visit_concept_id | |
| Procedure | PROCEDURE_OCCURRENCE | procedure_concept_id | |
| Modifier | PROCEDURE_OCCURRENCE | modifier_concept_id | |
| Drug | DRUG_EXPOSURE | drug_concept_id | |
| Route | DRUG_EXPOSURE | route_concept_id | |
| Unit | MEASUREMENT or OBSERVATION or SPECIMEN | unit_concept_id | Units are used in different contexts. * |
| Device | DEVICE_EXPOSURE | device_concept_id | |
| Condition | CONDITION_OCCURRENCE | condition_concept_id | |
| Measurement | MEASUREMENT | measurement_concept_id | |
| Meas Value Operator | MEASUREMENT | operator_concept_id | |
| Meas Value | MEASUREMENT | value_as_concept_id | |
| Observation | OBSERVATION | observation_concept_id | |
| Relationship | FACT_RELATIONSHIP | relationship_concept_id | |
| Place of Service | CARE_SITE | place_of_service_concept_id | |
| Provider Specialty | PROVIDER | specialty_concept_id | |
| Currency | Any *_COST table | currency_concept_id | Currency values appear in any of the *_COST tables. * |
| Revenue Code | PROCEDURE_COST | revenue_code_concept_id | |
| Specimen | SPECIMEN | specimen_concept_id | |
| Spec Anatomic Site | SPECIMEN or MEASUREMENT or OBSERVATION | anatomic_site_concept_id or value_as_concept_id or value_as_concept_id | Anatomical Site Concepts are used to characterize the origin of a Specimen, but also the result of a Measurement or Observation. * |
| Spec Disease Status | SPECIMEN | disease_status_concept_id | |

* If there is more than one potential destination table, the ETL needs to identify the context in which a Standard Concept is used and select the right destination table from those listed.
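The routing logic described above can be sketched as a simple lookup. This is a minimal illustration, not a reference implementation: the names DOMAIN_DESTINATIONS, route_record and context_table are hypothetical, and the sketch assumes the ETL already knows the domain_id of the mapped Standard Concept. Generic and Currency Concepts are left out because they are not tied to a fixed list of tables here (Generic can go in any concept_id field, Currency belongs in the cost tables).

```python
# Destination lookup transcribed from the table above:
# domain_id -> list of (CDM table, field) candidates.
# Generic and Currency are intentionally omitted from this sketch.
DOMAIN_DESTINATIONS = {
    "Gender": [("PERSON", "gender_concept_id")],
    "Race": [("PERSON", "race_concept_id")],
    "Ethnicity": [("PERSON", "ethnicity_concept_id")],
    "Visit": [("VISIT_OCCURRENCE", "visit_concept_id")],
    "Procedure": [("PROCEDURE_OCCURRENCE", "procedure_concept_id")],
    "Modifier": [("PROCEDURE_OCCURRENCE", "modifier_concept_id")],
    "Drug": [("DRUG_EXPOSURE", "drug_concept_id")],
    "Route": [("DRUG_EXPOSURE", "route_concept_id")],
    "Unit": [
        ("MEASUREMENT", "unit_concept_id"),
        ("OBSERVATION", "unit_concept_id"),
        ("SPECIMEN", "unit_concept_id"),
    ],
    "Device": [("DEVICE_EXPOSURE", "device_concept_id")],
    "Condition": [("CONDITION_OCCURRENCE", "condition_concept_id")],
    "Measurement": [("MEASUREMENT", "measurement_concept_id")],
    "Meas Value Operator": [("MEASUREMENT", "operator_concept_id")],
    "Meas Value": [("MEASUREMENT", "value_as_concept_id")],
    "Observation": [("OBSERVATION", "observation_concept_id")],
    "Relationship": [("FACT_RELATIONSHIP", "relationship_concept_id")],
    "Place of Service": [("CARE_SITE", "place_of_service_concept_id")],
    "Provider Specialty": [("PROVIDER", "specialty_concept_id")],
    "Revenue Code": [("PROCEDURE_COST", "revenue_code_concept_id")],
    "Specimen": [("SPECIMEN", "specimen_concept_id")],
    "Spec Anatomic Site": [
        ("SPECIMEN", "anatomic_site_concept_id"),
        ("MEASUREMENT", "value_as_concept_id"),
        ("OBSERVATION", "value_as_concept_id"),
    ],
    "Spec Disease Status": [("SPECIMEN", "disease_status_concept_id")],
}


def route_record(domain_id, context_table=None):
    """Return the (CDM table, field) a mapped Concept should be written to.

    context_table is a hypothetical parameter: for domains with several
    candidate tables, the caller states which table the record belongs to.
    """
    candidates = DOMAIN_DESTINATIONS.get(domain_id)
    if candidates is None:
        raise ValueError(f"No destination defined for domain_id {domain_id!r}")
    if len(candidates) == 1:
        return candidates[0]
    # More than one potential destination: the context decides.
    for table, field in candidates:
        if table == context_table:
            return table, field
    raise ValueError(
        f"domain_id {domain_id!r} is ambiguous; context_table must be one of "
        f"{[table for table, _ in candidates]}"
    )
```

For example, route_record("Drug") needs no context, whereas route_record("Unit", "SPECIMEN") uses the context to pick among the three candidate tables. Raising an error for an ambiguous domain without context is a design choice of this sketch; it keeps the ETL from silently picking the wrong table.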

If the mapping process returns a Concept of the “Domain” or “Metadata” Domain, an error has snuck into the mapping table. Please report those cases in the CDM-Builder.

The same is true if the mapping produces a Concept from a Type Concept Domain, such as “Obs Period Type”, “Death Type”, “Visit Type”, “Procedure Type”, etc. Although these Type Concepts are valid Concepts and have to be placed into the <domain>_type_concept_id field of the respective CDM tables, they cannot be the result of the mapping process. They denote the origin of the record, and the selection of Type Concepts should be hard-wired into the ETL process.
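One way to enforce these two rules is a guard that runs on every mapping result, plus a helper that sets the Type Concept from ETL configuration rather than from the mapping. The sketch below is illustrative only: MappingError, check_mapped_domain and with_type_concept are hypothetical names, and detecting Type Concept Domains by a domain_id ending in " Type" is an assumption based on the examples listed above, not a rule stated by the vocabulary.

```python
class MappingError(Exception):
    """Raised when a mapping result lands in a domain it must never produce."""


# Domains that indicate an error in the mapping table itself.
FORBIDDEN_DOMAINS = {"Domain", "Metadata"}


def check_mapped_domain(domain_id):
    """Return domain_id if it is a legitimate mapping result, else raise."""
    if domain_id in FORBIDDEN_DOMAINS:
        raise MappingError(f"Mapping returned a {domain_id!r} Concept; report it")
    # Assumption: Type Concept Domains are recognisable by a ' Type' suffix,
    # as in 'Visit Type' or 'Procedure Type'.
    if domain_id.endswith(" Type"):
        raise MappingError(
            f"Mapping returned a Type Concept ({domain_id!r}); "
            "Type Concepts must be hard-wired in the ETL"
        )
    return domain_id


def with_type_concept(row, field_prefix, type_concept_id):
    """Return a copy of row with <field_prefix>_type_concept_id set.

    type_concept_id comes from the ETL's own configuration for the data
    source being loaded, never from the mapping process.
    """
    return {**row, f"{field_prefix}_type_concept_id": type_concept_id}
```

In use, check_mapped_domain would be called before a record is routed to its destination table, and with_type_concept would be called once per output row with the constant chosen for that source.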