implementation_international_drug_vocabulary

Differences

This shows you the differences between two versions of the page.

implementation_international_drug_vocabulary [2019/07/22 16:27]
ekorchmar
implementation_international_drug_vocabulary [2019/07/24 14:09]
ekorchmar
Line 3: Line 3:
  
 ==== Combined target structure ====
-Standard concepts in [[documentation:vocabulary:drug|Drug domain]] are placed in a single comprehensive hierarchy based on their attributes. To correctly implement a vocabulary in CDM and find or build an counterpart for each source drug concept, following attributes must be extracted:
+Standard concepts in [[documentation:vocabulary:drug|Drug domain]] are placed in a single comprehensive hierarchy based on their attributes. To correctly implement a vocabulary in CDM and find or build an counterpart for each source drug concept, the following attributes must be extracted:
  
   - **Ingredients**: active substance(-s) in pharmacological preparation. Examples: Aspirin, Trastuzumab, Ibuprofen etc.;
Line 15: Line 15:
 Such attributes may be given explicitly in a well-structured source vocabulary or may have to be extracted from drug product names. If drug products or discrete attributes like Ingredients or Brand Names do not have their own codes, new codes have to be constructed by combining the word “OMOP” and a running number. The running number should be unique across all vocabularies. That means that each time a new vocabulary is added or refreshed, the next Concept Code should use the last assigned number (the code without the 'OMOP' string) plus 1.
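The running-number rule can be sketched in a few lines of Python. This is only an illustration: the function name ''next_omop_code'', the absence of zero padding and the choice to start at 1 when no OMOP codes exist yet are assumptions, not part of the documented process.

```python
import re

# Matches codes of the form "OMOP<digits>", e.g. "OMOP12" (assumed format, no padding).
OMOP_CODE = re.compile(r"^OMOP(\d+)$")


def next_omop_code(existing_codes):
    """Return the next 'OMOP<number>' code after the highest number already in use.

    existing_codes: every concept code already assigned across all vocabularies.
    Codes that do not follow the OMOP<number> pattern are ignored.
    """
    numbers = [
        int(match.group(1))
        for code in existing_codes
        if (match := OMOP_CODE.match(code))
    ]
    # Assumption: numbering starts at 1 when no OMOP code has been issued yet.
    return f"OMOP{max(numbers, default=0) + 1}"
```

Because the number has to be unique across all vocabularies, the input should be the full set of existing codes rather than only the codes of the vocabulary being loaded.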
  
-All source vocabulary concepts, extracted attributes and dosage and packaging information must be staged in standardized format of input tables, that can processed to be included into CDM Standardized Vocabularies.
+All source vocabulary concepts, extracted attributes, dosage and packaging information must be staged in standardized format of input tables, that can processed to be included into OHDSI Standardized Vocabularies.
  
 === Challenges and problems ===
Line 42: Line 42:
 | invalid_reason | No | string(1) | Flag indicating whether the Concept is active (today's date between valid_start and valid_end_date), or upgraded ('U') or deprecated ('D'). |
  
-This table is expected to contain as a minimum the comprehensive list of Concept Classes:
+This table is expected to contain concepts having following Concept Classes:
  
   * Drug Product (Branded Drug, Clinical Drug, Marketed Product etc.)
Line 51: Line 51:
   * Device (for source concepts falling outside of the Drug category)
  
-It may contain Branded or Clinical Drug Forms or Components, but if not they will be derived (see below). Note that units should not have their own concept in the DRUG_CONCEPT_STAGE table. Instead, they should be used as verbatim. If the precise Concept Class is not known, it can be included as "Drug Product" and the correct Concept Class will be assigned during the incorporation automatically based on the availability of Strength, Dose Form, Brand Name, Supplier, Quantity and Box Size information.
+It may contain Branded or Clinical Drug Forms or Components, but if not they will be derived (see below). Note that units should not necessarily have an entry in the DRUG_CONCEPT_STAGE table. Instead, they should be used as verbatim. If the precise Concept Class of a Drug Product is relevant, it can be preserved in source_concept_class_id field.
  
 Brand Names that are simple combinations of generic international name of active substance and manufacturer name (e.g. "Aspirin Bayer") should not appear as attributes for Drug Products. Manufacturer information should be stored as a concept with Supplier class.
Line 70: Line 70:
  
 ==RELATIONSHIP_TO_CONCEPT==
-This table should contain the mapping between source codes and Standard Concepts for Ingredients, Brand Names, Dose Forms, Suppliers and Units.It also may contain mapping from source drugs to Standard Concepts for related ATC classes. All other relationships will be ignored.
+This table should contain the mapping between source codes and Standard Concepts for Ingredients, Brand Names, Dose Forms, Suppliers and Units. All other relationships will be ignored.
  
 ^Field^Required^Type^Description^
Line 80: Line 80:
 This table should contain all mappings from the new to existing Concepts and their precedence.
  
-**Units** should be mapped to Standard Concept Units. Weight units should be converted to milligram, volume units should be mapped to milliliter, molar - to millimole with the right conversion factor. The source_code field should contain the verbatim string of the unit. It is highly desirable to only use units that are in use by Standard native RxNorm concepts. Querry [[documentation:cdm:drug_strength|DRUG_STRENGTH table]] for a distinct list.
+**Units** should be mapped to Standard Concept Units. Weight units should be converted to milligram, volume units should be mapped to milliliter, molar - to millimole with the right conversion factor. The source_code field should contain the verbatim string of the unit. It is highly desirable to only use units that are in use by Standard native RxNorm concepts. Query [[documentation:cdm:drug_strength|DRUG_STRENGTH table]] for a distinct list.
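The unit rule above (weight to milligram, volume to milliliter, molar amounts to millimole) amounts to a lookup of a target unit and a conversion factor per verbatim source unit. The following Python sketch illustrates the idea; the ''UNIT_MAP'' contents, the target labels and the function name are hypothetical and would have to be replaced by the units actually found in the source and by the Standard Unit concepts.

```python
# Hypothetical lookup: verbatim source unit (lower-cased) -> (target unit, factor).
# The factor converts one source unit into the target unit.
UNIT_MAP = {
    "kg": ("mg", 1_000_000.0),
    "g": ("mg", 1000.0),
    "mg": ("mg", 1.0),
    "mcg": ("mg", 0.001),
    "l": ("mL", 1000.0),
    "ml": ("mL", 1.0),
    "mol": ("mmol", 1000.0),
    "mmol": ("mmol", 1.0),
}


def to_standard_unit(value, source_unit):
    """Convert a dosage value from a verbatim source unit to its target unit.

    Raises KeyError for units missing from UNIT_MAP, so that unmapped units
    are noticed instead of being passed through silently.
    """
    target_unit, factor = UNIT_MAP[source_unit.strip().lower()]
    return value * factor, target_unit
```

For example, ''to_standard_unit(0.5, "G")'' yields 500.0 with the target unit ''mg''. The verbatim unit string itself stays unchanged in the staging tables; only the mapping carries the conversion factor.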
  
-**Ingredients** must be usually mapped to Standard concepts one to one. If ingredient is given as a mix (e.g. Co-dried gel of Magnesium Carbonate and Aluminium Hydroxide), it should be split in multiple enitities with distinct new codes; each component of the mix must be mapped to standard ingredient.
+**Ingredients** are usually mapped to Standard concepts one to one. If ingredient is given as a mix (e.g. Co-dried gel of Magnesium Carbonate and Aluminium Hydroxide), it should be split in multiple entities with distinct new codes; each component of the mix must be mapped to standard ingredient.
  
 One to many mappings with precedence should be used if:
   * Source ingredient is an ion (like calcium, iron, zinc, etc.), which should be mapped to all its salts;
   * Source ingredient is a herbal extract, which should be mapped to all suitable standard concepts;
-  * Target vocabularies contain duplicates of standard ingredients. This is rare.
+  * Target vocabularies contain logical duplicates among standard ingredients. This is rare. Example: RxNorm contains both 19026739 Pantothenic Acid and 19088079 pantothenate as separate standard ingredients (as of July 2019).
  
 **Dose Forms** are commonly mapped to multiple RxNorm dose forms with precedence. Modified release forms should be first mapped to corresponding forms in RxNorm vocabulary (like Delayed Release Oral Capsule), and then to more generic forms (Oral Capsule) with lower precedence.
Line 97: Line 97:
 |concept_code_2|Yes|string(255)|The other source code of the pair|
  
-This table should contain relationships for each Drug Concept: To the Ingredients (always), the Dose Form (if appropriate),the Supplier (if appropriate) and the Brand Name (if appropriate). All other relationships will be derived and ignored if they exist in the table. The relationships need not be symmetrical, only the one initiating from the Drug Concept is required.
+This table should contain relationships for each Drug Concept: To the Ingredients (always), the Dose Form (if appropriate),the Supplier (if appropriate) and the Brand Name (if appropriate). All other relationships will be derived and ignored if they exist in the table. The relationships don't need to be symmetrical, only the one initiating from the Drug Concept is required.
  
-If Drug Product concept does not have an Ingredient attribute, it will not have any standard mapping target after processing. Supplier attribute will not be considered for concepts without DS_STAGE or PC_STAGE entry since Marketed Product concepts can not exist without dosage information.
+If Drug Product concept does not have an Ingredient attribute, it will be non-standard (as all source concepts) and not have any standard mapping target after processing. Supplier attribute will not be considered for concepts without DS_STAGE or PC_STAGE entry since Marketed Product concepts can't exist without dosage information.
  
 ==DS_STAGE==
Line 109: Line 109:
 |amount_unit|No|string(255)|The verbatim unit of the absolute content (solids)|
 |numerator_value|No|float|The numerator value for a concentration (usually liquid formulations)|
-|numerator_unit|No|string(255)|The verbatim numerator unit of a concentration (liquids)|
-|denominator_value|No|float|The denominator value for a concentration (usally liquid formulations). It should contain a number for Quantified products, and null for everything else.|
-|denominator_unit|No|string(255)|The verbatim denominator unit of a concentration (liquids)|
-|box_size|No|integer|The amount of units per box|
+|numerator_unit|No|string(255)|The verbatim numerator unit of a concentration (usually liquid formulations)|
+|denominator_value|No|float|The denominator value for a concentration (usually liquid formulations). It should contain a number for Quantified products, and null for everything else.|
+|denominator_unit|No|string(255)|The verbatim denominator unit of a concentration (usually liquid formulations)|
+|box_size|No|integer|The number of units per box|
  
 This table contains the dose of each ingredient in each drug, as well as the box_size. For drugs that have no strength information, or have it only for some of their ingredients, the ds_stage records must be omitted. '0' values in ds_stage are only allowed for inert drugs.
 Drug ingredients should match those in internal_relationship_stage.
-If ingredients are mapped to the same one in relationship_to_concept their dosages should be summed up.
+If ingredients are mapped to the same one in relationship_to_concept their dosages should be summed up as for a single ingredient before processing.
 A drug should not contain ingredients in both solid (amount) and liquid (numerator/denominator) form. This might be caused by either a source data aberration or a drug pack, which must be split into separate Drug Products and processed in the PC_STAGE table.
 +If denominator value is given, quantified drug will be created with given denominator value and unit as total volume.
  
-  * Inhalers, enemas or sprays that release certain dosage of active ingredient per activation should also be stored in numerator/denominator form with total number of actuations as denominator (e.g. X MG / Y ACTUAT). Drugs that release active substance over prolonged period of time, like transcutaneous patches or modified release oral pills, can also be stored as numerator/denominator with hours as denominators.
+  * Inhalers, enemas or sprays that release certain dosage of active ingredient per activation should also be stored in numerator/denominator form with total number of actuations as denominator (e.g. X MG / Y ACTUAT).
   * All drugs with fixed amount must have dosage in amount fields and all solutions must have dosage filled in numerator and denominator fields. When liquid drugs in data contain concentration information without volume, DENOMINATOR_VALUE field is left empty.
   * Gases for inhalation must be put in numerator fields with % in unit field without filling denominator fields. It’s the only acceptable use of percents in DS_STAGE. Make sure to convert everything else to mg/ml or mg/mg.
   * Patches, drug implants and other forms that release molecules over a period of time (even extended release tablets or capsules) may also be stored in numerator/denominator form (e.g. X MG / Y HOUR).
-  * Inhalers, enemas or sprays that release certain dosage of active ingredient per activation should also be stored in numerator/​denominator form with total number of actuations as denominator (e.g. X MG / Y ACTUAT). 
   * If dose form for the source concept is given as a soluble powder without a solvent (except powder inhalers), dosage is stored in amount field.
   * For drugs that are administered in a form of oral liquid (solution, suspension, syrup), denominator value of 5 ML should be kept only when we are certain that the dosage is not given “per tbsp.”; if 5 ML is not an actual fixed administered dose (e.g. a sachet or vial), it should be treated as concentration (DENOMINATOR_VALUE = NULL).
   * Box size equal to 1 should be simply stored as NULL.
-  * If source provides conflicted, mixed or incomplete info on one or more ingredients dosage or solution volume, drug should not have entries in ds_stage
+  * Drugs can't have differing information for denominators among different ingredients, skip dosage for some ingredients or have same ingredient with different dosages
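Several of the DS_STAGE rules listed above can be checked row by row before the tables are submitted. The Python sketch below covers three of them: amount and numerator/denominator must not be mixed, percent is acceptable only for gases in the numerator unit with empty denominator fields, and a box size of 1 should be stored as NULL. The class ''DsStageRow'', the function ''check_row'' and the ''is_gas'' flag are hypothetical helpers; the fields ''drug_concept_code'', ''ingredient_concept_code'' and ''amount_value'' are assumed here because they are not part of the excerpt shown in this comparison.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DsStageRow:
    """One DS_STAGE record (first three fields are assumed, see the text above)."""
    drug_concept_code: str
    ingredient_concept_code: str
    amount_value: Optional[float] = None
    amount_unit: Optional[str] = None
    numerator_value: Optional[float] = None
    numerator_unit: Optional[str] = None
    denominator_value: Optional[float] = None
    denominator_unit: Optional[str] = None
    box_size: Optional[int] = None


def check_row(row: DsStageRow, is_gas: bool = False) -> List[str]:
    """Return a list of rule violations for a single row (empty if none found)."""
    problems = []
    has_amount = row.amount_value is not None
    has_numerator = row.numerator_value is not None

    if has_amount and has_numerator:
        problems.append("both amount and numerator/denominator are filled")
    if not has_amount and not has_numerator:
        problems.append("no dosage given; the record should be omitted")

    # Percent is only acceptable for gases: numerator unit '%', no denominator.
    if "%" in (row.amount_unit, row.numerator_unit, row.denominator_unit):
        gas_ok = (
            is_gas
            and row.numerator_unit == "%"
            and row.denominator_value is None
            and row.denominator_unit is None
        )
        if not gas_ok:
            problems.append("percent used outside the gas exception")

    if row.box_size == 1:
        problems.append("box_size of 1 should be stored as NULL")
    return problems
```

Checks that span several rows of the same drug, such as consistent denominators across ingredients or duplicated ingredients with different dosages, would need a grouping step by drug code and are left out of this sketch.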
  
 ==PC_STAGE==
Line 134: Line 134:
 |pack_concept_code|Yes|string(255)|The source code of the Pack, either Branded or Clinical|
 |drug_concept_code|Yes|string(255)|The component drug product in the Pack|
-|amount|No|integer|The number of units of the drug product in drug_concept_code|
+|amount|No|integer|The number of units of the drug product in a pack|
 |box_size|No|integer|The number of packs if the pack is boxed (several packs in a larger container)|
  
-This table contains the composition of a Clinical or Branded Pack: The Clinical or Branded Drug and, number of doses in each box and number of boxes in each pack. If it is a boxed Pack, it will also contain the box size, since Packs have no records in DS_STAGE like the other drug products. Packs are allowed to have branded drugs as components, although usually Brand Name is only attributed to packs as a whole. Supplier may only be atrributed to the pack as a whole.
+This table contains the composition of a Clinical or Branded Pack: The Clinical or Branded Drug and, number of doses in each box and number of boxes in each pack. If it is a boxed Pack, it will also contain the box size, since Packs have no records in DS_STAGE like the other drug products. Packs are allowed to have branded drugs as components, although usually Brand Name is only attributed to packs as a whole. Supplier may only be attributed to the pack as a whole.
 + 
 +Box size equal to 1 can also be omitted.
  
 ==CONCEPT_SYNONYM_STAGE==
Line 152: Line 154:
 ==CONCEPT_RELATIONSHIP_MANUAL==
  
-This table allows to manually map source concepts to existing standard concept in OMOP CDM circumventing standard vocabulary building process. This is useful to represent source concepts that belong to a different domain, preserve relationships inside the vocabulary or to map concepts to standard Drug concepts from outside of RxNorm and RxNorm Extension logic (e.g. standard concepts in [[documentation:vocabulary:atc|ATC]] or [[documentation:vocabulary:cvx|CVX]] vocabulary. It is also recommended to provide manual mapping for drugs that may have poor source representation yet are usually of special interest for researchers, like insulins or vaccines.
+This table allows to manually map source concepts to existing standard concept in OMOP CDM circumventing standard vocabulary building process. This is useful to represent source concepts that should be mapped to a concept from non-drug domain, preserve relationships inside the vocabulary or to map concepts to standard Drug concepts from outside of RxNorm and RxNorm Extension logic (e.g. standard concepts in [[documentation:vocabulary:cvx|CVX]] vocabulary. It is also recommended to provide manual mapping for drugs that may have poor source representation yet are usually of special interest for researchers, like insulins or vaccines.
  
 ^Field^Required^Type^Description^
Line 177: Line 179:
 | Combinations of product components should be unique. These are Ingredient-strength(s) combination, Dose Form, Brand Name, Quantity, Box size. | Only the highest Concept Code is retained, and the other ones are treated as non-standard Concepts and mapped to the highest. |
 | Each product should have links (records in INTERNAL_RELATIONSHIP_STAGE) to all their Ingredients. | The product will be treated as if it had only the linked Ingredients. If no Ingredients are linked, the product will be processed into the CONCEPT_STAGE table, but as an orphan without any related Concept Classes. |
-| Ingredients should be linked to their Standard Counterparts. | These Ingredients are treated as new Standard Ingredients. |
-| Dose Forms should be linked to their Standard Counterparts. | The processing will fail |
-| Brand Names should be linked to their Valid Counterparts. | These Brand Names will be treated as new Concepts. |
-| All % in source dosages should be converted into mg/ml (mg) unless it is a gas. | A drug would not be mapped to it's Standard Conept |
+| Ingredients should be linked to their Standard counterparts if such concepts exist. | These Ingredients are treated as new Standard Ingredients, which may lead to creation of duplicates. |
+| Dose Forms should be linked to their valid counterparts if such concepts exist. | These Dose Forms will be treated as new valid Concepts, which may lead to creation of duplicates |
+| Brand Names should be linked to their valid counterparts if such concepts exist. | These Brand Names will be treated as new valid Concepts, which may lead to creation of duplicates. |
+| Suppliers should be linked to their Valid Counterparts if such concepts exist. | These Suppliers will be treated as new valid Concepts, which may lead to creation of duplicates. |
+| All % in source dosages should be converted into mg/ml (mg) unless it is a gas. | A drug would not be mapped to it's Standard Concept |
 | Marketed Product (a drug that has relationship to its supplier in INTERNAL_RELATIONSHIP_STAGE) should have both dosage and Dose Form | The product won't be processed into CONCEPT_STAGE table. |
 | Boxed drug should have both dosage and Dose Form. | The product won't be processed into CONCEPT_STAGE table. |
Line 190: Line 195:
  
 All proposals to add a new vocabulary into CDM may be submitted (optionally with prepared input tables) as [[https://github.com/OHDSI/Vocabulary-v5.0/issues|issues on GitHub]].
- 
-** Matching ** 
-This step is necessary to add inferred equivalence relationships between new and existing Standard Concepts. All matches are created. 
- 
-The matching considers all components (Ingredient-strength(s),​ Dose Form, Brand Name, Quantity, Box size) in the order of precedence. A 10% mismatch between strength or total volume values is still considered a match. Matching beteween normal and Quantified products compares the Numerator Value of the non-quantied to the Numerator divided by the Denominator Value. 
- 
-** Inferring of missing Concept Classes ** 
- 
-After RxNorm Extension Standard concept with the fullest attribute set will be created for each source entity, all missing preceding Concept Classes are inferred from the existing ones, from bottom upwards. ​ 
- 
-^Concept Class^Defined by^ 
-|Clinical Drug Component|Ingredient-strength. Note that Clinical Components are always single-Ingredient. This is in contrast to all other Concept Classes| 
-|Branded Drug Component|Ingredient-strength(s),​ Brand Name| 
-|Clinical Drug Form|Ingredient(s),​ Dose Form| 
-|Branded Drug Form|Ingredient(s),​ Dose Form, Brand Name| 
-|Clinical Drug|Ingredient-strength(s),​ Dose Form| 
-|Branded Drug|Ingredient-strength(s),​ Dose Form, Brand Name| 
-|Quantified Clinical Drug|Ingredient-strength(s),​ Dose Form, Quantity| 
-|Quantified Branded Drug|Ingredient-strength(s),​ Dose Form, Brand Name, Quantity| 
-|Clinical Drug Box|Ingredient-strength(s),​ Dose Form, Box size| 
-|Branded Drug Box|Ingredient-strength(s),​ Dose Form, Brand Name, Box size| 
-|Quantified Clinical Box|Ingredient-strength(s),​ Dose Form, Quantity, Box size| 
-|Quantified Branded Box|Ingredient-strength(s),​ Dose Form, Brand Name, Quantity, Box size| 
- 
-Even though all drug classes are inferred, only those will be written to the CONCEPT table that have no mapping to an existing equivalent Standard Concept. 
- 
implementation_international_drug_vocabulary.txt · Last modified: 2021/06/09 10:15 by adavydov