
Standard, Classification and Source Concepts

Within a Domain, Concepts come from a number of Vocabularies, and their codes often have identical or overlapping meanings. To bring order to this situation, each Concept is assigned one of three designations:

Standard Concepts (standard_concept = 'S')

Standard Concepts are the “official” Concepts that are to be used to represent a unique clinical entity in the Standardized Clinical Data Tables. Their Concept ID is written to the respective *_concept_id fields. Usually, Standard Concepts are sourced from well-established Vocabularies that cover the Domain comprehensively and whose Concepts are well-defined. For example, in the Condition Domain this is achieved through the SNOMED Vocabulary. If no comprehensive list of entities is available for a Domain, such as the Device Domain, Standard Concepts come from a variety of different Vocabularies. The same is true for the Procedure Domain.

Classification Concepts (standard_concept = 'C')

These have a hierarchical relationship to Standard Concepts and can therefore be used to query for Standard Concepts using the records of the CONCEPT_ANCESTOR table. However, they themselves cannot appear in the Data Tables. For example, the MedDRA concept for “COPD” has hierarchical relationships to the Standard SNOMED-CT Concepts that are all forms of this disease. Likewise, the Concept 4283987 “ANTICOAGULANTS” of the Vocabulary “VA Class” cannot appear in the DRUG_EXPOSURE or DRUG_ERA tables, but its Descendant Concepts that have the Concept Class “Ingredient”, “Clinical Drug” or “Branded Drug” can.

Classification Concepts may be sourced from different Vocabularies than the Standard Concepts. Note that Classification Concepts are not unique. For example, there are Concepts for the Drug Class “Anticoagulants” coming from the NDF-RT, VA Class, ETC and ATC Vocabularies. Also note that the membership of a class depends on the Vocabulary. In most cases the membership lists of equivalent Classification Concepts are similar or identical, but medical science does not provide a generally agreed-upon standard definition of these classes.
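
The use of a Classification Concept to query for Standard Concepts can be sketched as follows. This is a minimal illustration, not a reference implementation: it builds toy, heavily simplified versions of the CONCEPT and CONCEPT_ANCESTOR tables in an in-memory SQLite database. Only the Concept 4283987 “ANTICOAGULANTS” is taken from the text above; the descendant Concepts, their IDs, their names, the Vocabulary name and the Concept Class recorded for 4283987 are made-up placeholders, and the helper name standard_descendants is hypothetical.

```python
import sqlite3

# Toy, simplified versions of two Vocabulary tables (assumption: only the
# columns needed for this illustration are included).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id       INTEGER PRIMARY KEY,
    concept_name     TEXT,
    vocabulary_id    TEXT,
    concept_class_id TEXT,
    standard_concept TEXT      -- 'S', 'C' or NULL
);
CREATE TABLE concept_ancestor (
    ancestor_concept_id   INTEGER,
    descendant_concept_id INTEGER
);
""")

# 4283987 comes from the text; all other rows are made-up placeholders.
conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?, ?, ?)",
    [
        (4283987, "ANTICOAGULANTS", "VA Class", "VA Class", "C"),
        (9000001, "Example ingredient", "Example Vocabulary", "Ingredient", "S"),
        (9000002, "Example clinical drug", "Example Vocabulary", "Clinical Drug", "S"),
    ],
)
conn.executemany(
    "INSERT INTO concept_ancestor VALUES (?, ?)",
    [
        (4283987, 4283987),  # self-reference row for the class itself
        (4283987, 9000001),
        (4283987, 9000002),
    ],
)


def standard_descendants(conn, ancestor_concept_id):
    """Return (concept_id, concept_name) of all Standard descendants."""
    return conn.execute(
        """
        SELECT c.concept_id, c.concept_name
        FROM concept_ancestor ca
        JOIN concept c ON c.concept_id = ca.descendant_concept_id
        WHERE ca.ancestor_concept_id = ?
          AND c.standard_concept = 'S'   -- only these may appear in Data Tables
        ORDER BY c.concept_id
        """,
        (ancestor_concept_id,),
    ).fetchall()


drug_concepts = standard_descendants(conn, 4283987)
```

In this sketch, the returned IDs are the ones that would be used to filter, for example, the DRUG_EXPOSURE table; the Classification Concept itself is excluded by the standard_concept = 'S' condition.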

Source Concepts (standard_concept = NULL)

These are all remaining Concepts that are not Standard or Classification Concepts. Note that Concepts can change their designation over time: if they are deprecated (valid_end_date is less than 2099-12-31 and invalid_reason = 'D' or 'U'), formerly Standard or Classification Concepts will turn into Source Concepts.

Source Concepts can only appear in the *_source_concept_id fields of the Data Tables. They represent the code in the source data. Each Source Concept is mapped to one or more Standard Concepts during the ETL process. If no mapping is available, the Standard Concept with the concept_id = 0 is written into the *_concept_id field. See Data ETL Section for details.
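
The mapping step described above, including the fallback to concept_id = 0, might look like the following sketch. It assumes that mappings are stored as rows of the CONCEPT_RELATIONSHIP table with relationship_id = 'Maps to', which is the usual OMOP convention but is not stated on this page. The tables are toy, simplified versions; all Concept IDs are made-up placeholders and the helper name map_to_standard is hypothetical.

```python
import sqlite3

# Toy, simplified tables (assumption: only the columns needed here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id       INTEGER PRIMARY KEY,
    standard_concept TEXT      -- 'S', 'C' or NULL
);
CREATE TABLE concept_relationship (
    concept_id_1    INTEGER,
    concept_id_2    INTEGER,
    relationship_id TEXT
);
""")

# Made-up placeholder data: two Source Concepts and two Standard Concepts.
conn.executemany(
    "INSERT INTO concept VALUES (?, ?)",
    [
        (9100001, None),  # Source Concept with two mappings
        (9100002, None),  # Source Concept without any mapping
        (9200001, "S"),
        (9200002, "S"),
    ],
)
conn.executemany(
    "INSERT INTO concept_relationship VALUES (?, ?, ?)",
    [
        (9100001, 9200001, "Maps to"),
        (9100001, 9200002, "Maps to"),
    ],
)


def map_to_standard(conn, source_concept_id):
    """Return the Standard Concept IDs for a Source Concept, or [0] if unmapped."""
    rows = conn.execute(
        """
        SELECT cr.concept_id_2
        FROM concept_relationship cr
        JOIN concept c ON c.concept_id = cr.concept_id_2
        WHERE cr.concept_id_1 = ?
          AND cr.relationship_id = 'Maps to'
          AND c.standard_concept = 'S'
        ORDER BY cr.concept_id_2
        """,
        (source_concept_id,),
    ).fetchall()
    # No mapping available: fall back to concept_id = 0.
    return [row[0] for row in rows] or [0]
```

In an ETL built along these lines, each returned value would be written to the *_concept_id field of its own record, while the Source Concept ID itself goes into the matching *_source_concept_id field.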

For all Concepts in a Domain, this creates the following logical structure:

Source Concepts are mapped to Standard Concepts, and the Standard Concepts have relationships of various semantic natures to each other. In addition, Standard Concepts have hierarchical relationships to each other and to the Classification Concepts, in this case derived from the Vocabularies A, B and C. Hierarchical relationships amongst Classification Concepts generally exist only between Concepts of the same Vocabulary.

All Concepts are stored in the CONCEPT table, all relationships in the CONCEPT_RELATIONSHIP table and all generation-spanning hierarchical relationships in the CONCEPT_ANCESTOR table. The latter are only defined between Standard and Classification Concepts; Source Concepts do not participate in this hierarchical tree structure.

All elements in the Vocabularies have a representation as a Concept in the CONCEPT table. This results in three categories of Concepts:

  • Standard Concepts: A Standard Concept is the “official” Concept that is to be used to represent a unique clinical entity. It is designated as such by standard_concept = 'S'. Only Standard Concepts can appear in the *_concept_id or *_type_concept_id fields of the Standardized Clinical Data Tables of the CDM.
  • Classification Concepts (designated in standard_concept = 'C') have a hierarchical relationship to Standard Concepts and can therefore be used to query for Standard Concepts using the records of the CONCEPT_ANCESTOR table. However, they themselves cannot appear in the Data Tables. For example, the Concept 4283987 “ANTICOAGULANTS” of the Vocabulary “VA Class” cannot appear in the DRUG_EXPOSURE or DRUG_ERA tables, but its Descendant Concepts that have the Concept Class “Ingredient”, “Clinical Drug” or “Branded Drug” can.
  • Source Concepts can only appear in the *_source_concept_id fields of the Data Tables. They represent the code in the source data. Each Source Concept is mapped to one or more Standard Concepts during the ETL process. If no mapping is available, the Standard Concept with the concept_id = 0 is written into the *_concept_id field. See Data ETL Section for details.
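
The three categories above follow directly from the value of the standard_concept field, which can be expressed as a small check. This is only an illustrative sketch; the function names designation and may_populate_concept_id are hypothetical and not part of any OMOP tooling.

```python
def designation(standard_concept):
    """Translate the standard_concept field into its designation."""
    if standard_concept == "S":
        return "Standard"
    if standard_concept == "C":
        return "Classification"
    if standard_concept is None:  # NULL in the CONCEPT table
        return "Source"
    raise ValueError(f"unexpected standard_concept value: {standard_concept!r}")


def may_populate_concept_id(standard_concept):
    """True if a Concept with this designation may be written to a
    *_concept_id or *_type_concept_id field of the Data Tables."""
    return designation(standard_concept) == "Standard"
```

A check like this could be run during ETL validation to flag records whose *_concept_id field holds a Classification or Source Concept.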

The designation of a Concept as Standard, Classification or Source depends on the Vocabulary. See the specifications for each Vocabulary for details.

documentation/vocabulary/standard_classification_and_source_concepts.txt · Last modified: 2015/09/18 18:36 by cgreich