RxNorm Extension - an OHDSI resource to represent international drugs

The purpose of this document is to define the process, rules and resulting structure of incorporating international drug vocabularies into an overall RxNorm-like system, called RxNorm Extension.

Drug vocabularies contain drug products and their components. Only about a third of these products are identical in the drug markets of individual countries or jurisdictions of a drug marketing approval agency. Even if the active ingredients are the same, they can differ in their Drug Strengths, Drug Forms, Brand Names, package sizes and manufacturers or distributors.

Therefore, these vocabularies need to be incorporated into the existing Drug Domain in such a way that all existing drugs and their components are correctly mapped, and the missing ones are added as new concepts. This includes a life cycle for each Concept, allowing creation, deprecation and update over time.

The processing script that follows these instructions can be found here.

General structure

The Drug Domain should be organized in the hierarchical structure described under Drug Domain. This structure is based on RxNorm, which also forms the core of the content. RxNorm comprehensively describes the drug market in the United States. It may not contain products sold in the markets of other countries. It also does not contain US medical food or food supplement products.

This structure contains at a minimum (from bottom to top):

Concept Class | Composed of
Branded Drug | Ingredients, their strength, form, brand name
Clinical Drug | Ingredients, their strength, form
Branded Drug Form | Ingredient, form, brand name
Clinical Drug Form | Ingredient, form
Branded Drug Component | Ingredient, strength, brand name
Clinical Drug Component | Ingredient, strength
Dose Form | Form
Brand Name | Brand name
Ingredient | Ingredient
Drug Class | Drug class

It may optionally contain

Concept Class | Composed of
Marketed Product | Ingredients, their strength, form, supplier (brand name and box size are optional)
Quantified Branded Drug Box | Ingredients, their strength, form, brand name, size and box size
Quantified Clinical Drug Box | Ingredients, their strength, form, size and box size
Branded Drug Box | Ingredients, their strength, form, brand name and box size
Clinical Drug Box | Ingredients, their strength, form and box size
Quantified Branded Drug | Ingredients, their strength, form, brand name, size
Quantified Clinical Drug | Ingredients, their strength, form, size
Branded Pack | Branded Drugs, their number (box size is optional)
Clinical Pack | Clinical Drugs, their number (box size is optional)
Supplier | Supplier

Currently not supported in the Standardized Vocabularies:

Concept Class | Composed of | Note
Precise Ingredient | Ingredient | Ingredient used instead
Multiple Ingredient | Ingredients | Single ingredients used instead
Dose Form Group | Dose Form | Explicit Dose Forms used instead

The Concepts are connected through hierarchical and lateral relationships.

Combined target structure

To incorporate a new set of drug information, a structure should be achieved that contains every Concept only once and preserves the RxNorm structure, no matter which vocabulary the additional Concept is coming from. In a way, it should create a mixed RxNorm/drug vocabularies union.

In order to achieve this, any two equivalent Concepts have to be matched through their components: Ingredients to Ingredients, Forms to Forms, Suppliers to Suppliers, etc. Concepts are defined as matching if all components match. For example, a Clinical Drug matches another Clinical Drug if it contains the same Ingredients at the same strength and the same Dose Form.
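
As a minimal sketch of this component-wise matching, assume that Ingredients and Dose Forms have already been mapped to shared identifiers. The class name, field names and ids below are hypothetical and are not taken from the OHDSI processing script. The sketch uses exact equality for strengths; the Matching step described later allows a 10% mismatch.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class ClinicalDrug:
    """Hypothetical container: a Clinical Drug reduced to its components."""
    # Each entry is (ingredient id, strength value, strength unit).
    ingredient_strengths: FrozenSet[Tuple[int, float, str]]
    dose_form: int


def is_equivalent(a: ClinicalDrug, b: ClinicalDrug) -> bool:
    """Two Clinical Drugs match only if every component matches."""
    return (a.ingredient_strengths == b.ingredient_strengths
            and a.dose_form == b.dose_form)


# Illustrative ids only; they are not real concept_id values.
source = ClinicalDrug(frozenset({(1, 500.0, "mg")}), dose_form=10)
rxnorm = ClinicalDrug(frozenset({(1, 500.0, "mg")}), dose_form=10)
other_form = ClinicalDrug(frozenset({(1, 500.0, "mg")}), dose_form=11)
```

Using a frozenset makes the comparison independent of the order in which Ingredients are listed.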

Rules for adding Concepts

To add a Concept for which there is an existing equivalent:

  • It should be recorded as a non-standard (source) Concept.
  • It should be mapped through a “Maps to” relationship to its standard equivalent.
  • All other relationships are optional and for QA and convenience. The standard Concept will take its place as the official representation.

To add a Concept that does not have an equivalent:

  • It should be recorded as a standard Concept (standard_concept = 'S'), with the exception of Brand Names, which should be recorded as non-standard Concepts.
  • It should have hierarchical and lateral relationships in the same manner as RxNorm Concepts do.
  • It should form relationships to relevant drug classes. The relationship_id values of these relationships do not have to follow the RxNorm standard, as they differ for every drug class. Classes are most often defined for Ingredients, but some non-Ingredients may directly designate a Concept Class and “jump over” the Ingredient or even Drug Forms or Drug Components. They will be inferred automatically.

Units used in the strength determination are not added. They must be mapped to a Standard UCUM Concept instead. If a unit is not present in the UCUM vocabulary it has to be added.
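
The two cases above (a Concept with and without an existing equivalent) can be sketched as a small decision function. This is an illustration only: the function name and the returned dictionary layout are hypothetical, and the actual processing works on staging tables rather than on single records.

```python
from typing import Dict, Optional


def stage_decision(concept_class_id: str,
                   equivalent_concept_id: Optional[int]) -> Dict[str, object]:
    """Decide how a new source Concept is recorded (hypothetical helper)."""
    if equivalent_concept_id is not None:
        # An equivalent exists: keep the source Concept non-standard
        # and link it with a "Maps to" relationship.
        return {"standard_concept": None, "maps_to": equivalent_concept_id}
    if concept_class_id == "Brand Name":
        # Exception stated in the rules: Brand Names stay non-standard.
        return {"standard_concept": None, "maps_to": None}
    # No equivalent: the new Concept becomes Standard itself.
    return {"standard_concept": "S", "maps_to": None}
```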

Challenges and problems

To implement a tool to create and maintain the above structure, a number of issues need to be taken care of:

  • Excipients: There is no general agreement on what is an active ingredient and what is an excipient. Therefore, some ingredients, such as gelatine, need to be declared “semiactive”. Generally, if an ingredient can be biologically active but is not present in the preparation for its pharmaceutical properties, it should be considered an excipient. Excipients should be excluded from a drug's list of ingredients.
  • Forms: These are not used the same way across drug vocabularies. For example, RxNorm has a Form “Cream”, but also “Ophthalmic cream”, “Vaginal cream”, “Rectal cream”, “Oral cream” and “Cutaneous cream”, making this Form ambiguous. Instead of a one-to-one mapping, a one-to-many mapping with an order of precedence is required to establish matching equivalence between Forms.
  • Strength: RxNorm normalizes weight units to “mg” and volume units to “mL”, but other vocabularies might not. There might be units like “µg”, “gram-%” or “volume-%”. Special unit conversion tables are needed instead of simple unit mappings. This approach becomes infeasible if units are used where the conversion depends on the molecule, like “mol” or “equivalent”.
  • Ingredient forms: Ingredients might have ambiguous chemical forms, which RxNorm calls “Ingredient” and “Precise Ingredient” (e.g. a salt of the active compound). They have to be mapped to the right Standard RxNorm Ingredient. If there is no RxNorm Ingredient to map to and the drug vocabulary to be added contains several ambiguous forms of the same Ingredient, one of them has to be declared Standard. In rare cases there might be several Standard duplicates of the same Ingredient. In those cases mappings from source vocabularies must be made with precedence. Another problem might occur when the strength is given for a precise ingredient rather than a standard ingredient. An ingredient presented as an aqueous or spirit extract should be treated as the same Ingredient.

Implementation

1. Registering a new drug vocabulary

If a drug vocabulary gets added for the first time, it needs to get listed in the VOCABULARY table.

2. Creation of input tables

The new vocabulary should be prepared in the following tables:

DRUG_CONCEPT_STAGE

Field | Required | Type | Description
concept_name | Yes | string(255) | An unambiguous, meaningful and descriptive name for the Concept in English
domain_id | Yes | string(20) | A foreign key to the DOMAIN table. The standard content is 'Drug', but for non-drugs it could be 'Device' or 'Observation'
vocabulary_id | Yes | string(20) | A foreign key to the VOCABULARY table. The value of this field should be identical for all records, indicating the new vocabulary being added.
concept_class_id | Yes | string(20) | One of the above listed RxNorm Concept Classes
concept_code | Yes | string(50) | The code in the source vocabulary. If the source vocabulary does not contain a code, e.g. for ingredients or dose forms, it will be created automatically (see Concept Codes below)
source_concept_class_id | No | string(20) | Concept class that is given by the source vocabulary
possible_excipient | No | string(1) | A flag only relevant to ingredients, indicating that they may not be active ingredients and could be omitted from an ingredient list.
valid_start_date | No | date | Date when the Concept became valid. This may or may not coincide with the date the product went to market.
valid_end_date | No | date | Date when the Concept became invalid. Market withdrawal does not mean a Concept is invalid.
invalid_reason | No | string(1) | Flag indicating whether the Concept is active (today's date between valid_start_date and valid_end_date), upgraded ('U') or deprecated ('D').

This table is expected to contain as a minimum the comprehensive list of Concept Classes:

  • Drug Product (Branded Drug, Clinical Drug, Marketed Product etc.)
  • Form
  • Brand Name
  • Ingredient
  • Unit
  • Supplier
  • Device (for source concepts falling outside of the Drug category)

It may contain Branded or Clinical Drug Forms or Components, but if not they will be derived (see below). Note that units should not have their own concept in the DRUG_CONCEPT_STAGE table. Instead, they should be used verbatim. If the precise Concept Class is not known, the product can be included as “Drug Product” and the correct Concept Class will be assigned automatically during the incorporation, based on the availability of Strength, Dose Form, Brand Name, Supplier, Quantity and Box Size information.
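
The automatic class assignment can be pictured with the following sketch. It is an assumption about how the attribute combinations in the Concept Class tables translate into a decision rule, not the logic of the actual script. The function name is hypothetical, and returning the placeholder “Drug Product” when neither Strength nor Dose Form is known is a choice made for this illustration.

```python
def infer_concept_class(strength: bool, dose_form: bool, brand_name: bool,
                        supplier: bool, quantity: bool, box_size: bool) -> str:
    """Pick a Concept Class from the attributes available for a product."""
    kind = "Branded" if brand_name else "Clinical"
    if strength and dose_form:
        if supplier:
            # Marketed Product: brand name and box size are optional.
            return "Marketed Product"
        prefix = "Quantified " if quantity else ""
        suffix = " Box" if box_size else ""
        return f"{prefix}{kind} Drug{suffix}"
    if dose_form:
        return f"{kind} Drug Form"
    if strength:
        return f"{kind} Drug Component"
    # Assumption: nothing to decide on, keep the placeholder class.
    return "Drug Product"
```

For example, a product with Strength, Dose Form, Brand Name, Quantity and Box Size resolves to “Quantified Branded Drug Box”, while one with only a Dose Form and a Brand Name resolves to “Branded Drug Form”.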

Brand Names that are simple combinations of generic international name of active substance and manufacturer name should not appear as attributes for Drug Products. Manufacturer information should be stored as a concept with Supplier class.

Concepts that belong to the source vocabulary but do not belong to the Drug domain by OMOP rules should be classified as 'Device'. Typically, these belong to the following substance groups:

  • Radiographic contrasts and dyes (barium sulfate, radiolabeled iodides)
  • Nutritional supplements without stated mechanism of pharmaceutical action (e.g. herbal complexes, homeopathic preparations)
  • Formulas for infant feeding
  • Parenteral nutrition (amino acid and/or lipid mixes)
  • Solution for dialysis, catheter maintenance etc.
  • Blood products for transfusion, blood plasma, autologous and non-autologous transplants of any kind
  • Cosmetics, sunscreens, non-medicated shampoos and soaps, etc.
  • Surgical materials like bone cements
  • Topical/external disinfectants

Animal drugs can be handled as Drugs or Devices, depending on what their role in patient data can be expected to be. Note that only concepts from Drug domain can have attributes.

RELATIONSHIP_TO_CONCEPT

This table should contain the mapping between source codes and Standard Concepts for Ingredients, Brand Names, Dose Forms, Suppliers and Units. It may also contain mappings from source drugs to Standard Concepts for related ATC classes. All other relationships will be ignored.

Field | Required | Type | Description
concept_code_1 | Yes | string(255) | The source code
concept_id_2 | Yes | integer | The existing target Concept
precedence | No | integer | For multiple concept_code_1/concept_id_2 combinations, the order of precedence in which they should be considered for equivalence testing. The mapping with the highest prevalence among the drugs will be used for writing a record to the CONCEPT_RELATIONSHIP table. A missing precedence will be interpreted as precedence 1. Every precedence value should be unique per concept_code_1
conversion_factor | No | float | The factor used to convert the source code to the target Concept. This is usually defined for units

This table should contain all mappings from the new to existing Concepts and their precedence.
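
A minimal sketch of how precedence could be applied when several targets exist for one source code. The `Mapping` type, the function name and the ids are hypothetical. The sketch assumes that a lower precedence number is tried first, and it applies the stated rule that a missing precedence counts as 1.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    """One RELATIONSHIP_TO_CONCEPT row, reduced to three fields."""
    concept_code_1: str
    concept_id_2: int
    precedence: Optional[int] = None


def targets_in_order(rows: List[Mapping], source_code: str) -> List[int]:
    """Return target concept ids for one source code, first choice first."""
    own = [r for r in rows if r.concept_code_1 == source_code]
    # A missing precedence is interpreted as precedence 1.
    ranked = sorted(own, key=lambda r: 1 if r.precedence is None else r.precedence)
    return [r.concept_id_2 for r in ranked]


# Hypothetical source Form "cream" with two candidate targets.
rows = [
    Mapping("cream", 2002, 2),
    Mapping("cream", 2001),      # precedence omitted, treated as 1
    Mapping("tablet", 3001, 1),
]
```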

Units should be mapped to Standard Concept Units. Weight units should be converted to milligram, volume units to milliliter and molar units to millimole, each with the right conversion factor. The source code field (concept_code_1) should contain the verbatim string of the unit. It is highly preferable to use the same units that are in use for valid drugs stored in RxNorm, not just any Standard unit.
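
The following sketch shows how a conversion factor might be applied. It assumes the factor is multiplicative (source value × conversion_factor = value in the target unit). The lookup table is hypothetical and uses unit strings where the real table would hold a concept_id_2 of a UCUM Concept.

```python
from typing import Dict, Tuple

# Hypothetical mappings: verbatim source unit -> (target unit, conversion factor).
UNIT_MAP: Dict[str, Tuple[str, float]] = {
    "g": ("mg", 1000.0),
    "mcg": ("mg", 0.001),
    "mg": ("mg", 1.0),
    "l": ("mL", 1000.0),
    "ml": ("mL", 1.0),
}


def convert(value: float, source_unit: str) -> Tuple[float, str]:
    """Convert a source strength value into the normalized target unit."""
    target_unit, factor = UNIT_MAP[source_unit]
    return value * factor, target_unit
```

With this table, `convert(0.5, "g")` yields 500.0 in "mg". Units whose conversion depends on the molecule, such as “mol”, cannot be handled by a fixed factor like this.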

INTERNAL_RELATIONSHIP_STAGE

Field | Required | Type | Description
concept_code_1 | Yes | string(255) | One source code of the pair
concept_code_2 | Yes | string(255) | The other source code of the pair

This table should contain relationships for each Drug Concept: to the Ingredients (always), the Dose Form (if appropriate), the Supplier (if appropriate) and the Brand Name (if appropriate). All other relationships will be derived and ignored if they exist in the table. The relationships need not be symmetrical; only the one initiating from the Drug Concept is required.

If a Drug Product concept does not have an Ingredient attribute, it will not have any standard mapping target after processing. The Supplier attribute will not be considered for concepts without a DS_STAGE or PC_STAGE entry, since Marketed Product concepts cannot exist without dosage information.

DS_STAGE

Field | Required | Type | Description
drug_concept_code | Yes | string(255) | The source code of the Drug or Drug Component, either Branded or Clinical
ingredient_concept_code | Yes | string(255) | The source code for one of the Ingredients
amount_value | No | float | The numeric value for absolute content (usually solid formulations)
amount_unit | No | string(255) | The verbatim unit of the absolute content (solids)
numerator_value | No | float | The numerator value for a concentration (usually liquid formulations)
numerator_unit | No | string(255) | The verbatim numerator unit of a concentration (liquids)
denominator_value | No | float | The denominator value for a concentration (usually liquid formulations). It should contain a number for Quantified products, and null for everything else.
denominator_unit | No | string(255) | The verbatim denominator unit of a concentration (liquids)
box_size | No | integer | The number of units per box

This table contains the dose of each ingredient in each drug, as well as the box_size. For drugs that have no strength information, or have it only for some of their ingredients, the DS_STAGE record must be omitted. '0' values in DS_STAGE are only allowed for inert drugs. Drug ingredients should match those in INTERNAL_RELATIONSHIP_STAGE. If several ingredients are mapped to the same target in RELATIONSHIP_TO_CONCEPT, their dosages should be summed up. A drug should not contain ingredients in both solid (amount) and liquid (numerator/denominator) form. Such a mix is caused either by a source data aberration or by a drug pack, which must be split into separate Drug Products and processed in the PC_STAGE table.
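
Two of the rules above can be sketched as simple checks: detecting drugs that mix amount and numerator rows, and summing dosages of ingredients that map to the same target. The `DsRow` type, the function names and the sample codes are hypothetical. The summing function assumes that the amounts have already been converted to a common unit.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple


class DsRow(NamedTuple):
    """A DS_STAGE row reduced to the fields needed here."""
    drug_concept_code: str
    ingredient_concept_code: str
    amount_value: Optional[float] = None
    numerator_value: Optional[float] = None


def drugs_mixing_forms(rows: List[DsRow]) -> List[str]:
    """Drug codes that have both amount-based and numerator-based rows."""
    has_amount, has_numerator = set(), set()
    for r in rows:
        if r.amount_value is not None:
            has_amount.add(r.drug_concept_code)
        if r.numerator_value is not None:
            has_numerator.add(r.drug_concept_code)
    return sorted(has_amount & has_numerator)


def summed_amounts(rows: List[DsRow],
                   ingredient_target: Dict[str, int]) -> Dict[Tuple[str, int], float]:
    """Sum amount_value per (drug, mapped target ingredient)."""
    totals: Dict[Tuple[str, int], float] = defaultdict(float)
    for r in rows:
        if r.amount_value is not None:
            key = (r.drug_concept_code, ingredient_target[r.ingredient_concept_code])
            totals[key] += r.amount_value
    return dict(totals)


rows = [
    DsRow("D1", "ING_A", amount_value=100.0),
    DsRow("D1", "ING_A_SALT", amount_value=50.0),  # maps to the same target as ING_A
    DsRow("D2", "ING_B", amount_value=5.0),
    DsRow("D2", "ING_C", numerator_value=10.0),    # D2 mixes both forms
]
targets = {"ING_A": 1, "ING_A_SALT": 1, "ING_B": 2, "ING_C": 3}
```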

Inhalers, enemas or sprays that release a certain dose of active ingredient per actuation should also be stored in numerator/denominator form, with the total number of actuations as the denominator (e.g. X MG / Y ACTUAT). Drugs that release the active substance over a prolonged period of time, like transcutaneous patches or modified-release oral pills, can also be stored as numerator/denominator with hours as the denominator.

PC_STAGE

Field | Required | Type | Description
pack_concept_code | Yes | string(255) | The source code of the Pack, either Branded or Clinical
drug_concept_code | Yes | string(255) | The component drug product in the Pack
amount | No | integer | The number of units of the drug product in drug_concept_code
box_size | No | integer | The number of packs if the pack is boxed (several packs in a larger container)

This table contains the composition of a Clinical or Branded Pack: The Clinical or Branded Drug and the number in each pack. If it is a boxed Pack, it will also contain the box size, since Packs have no records in DS_STAGE like the other drug products.

CONCEPT_SYNONYM_STAGE

This table contains alternative names for concepts. These are either alternative names provided by the source or names in the original languages, since DRUG_CONCEPT_STAGE will contain English names only.

Field | Type | Description
synonym_concept_id | integer |
synonym_name | string(255) | Alternative name of the concept. There is no need to copy the entry from DRUG_CONCEPT_STAGE
synonym_concept_code | string(50) | Concept code in source vocabulary
synonym_vocabulary_id | string(20) | VOCABULARY_ID of source vocabulary
language_concept_id | integer | CONCEPT_ID for Standard concept representing language

4. Concept Codes

Source systems may designate codes for different levels or Concept Classes. All Concepts that are inferred or do not come with a code have to be assigned one. The codes are constructed from the word “OMOP” and a running number. The running number should be unique across all vocabularies. That means that each time a new vocabulary is added or refreshed, the next Concept Code continues from the last one issued: its number (the code without the 'OMOP' string) plus 1.
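
A minimal sketch of this numbering scheme, assuming codes have the shape "OMOP" followed directly by digits with no padding. The function name is hypothetical, and in practice the last issued number would come from the vocabulary tables rather than from an in-memory list.

```python
import re
from typing import Iterable, List


def next_omop_codes(existing_codes: Iterable[str], count: int) -> List[str]:
    """Generate `count` new codes that continue after the highest OMOP number."""
    numbers = []
    for code in existing_codes:
        match = re.fullmatch(r"OMOP(\d+)", code)
        if match:  # ignore codes that come from the source vocabulary
            numbers.append(int(match.group(1)))
    start = max(numbers, default=0) + 1
    return [f"OMOP{n}" for n in range(start, start + count)]
```

For example, with existing codes "OMOP7" and "OMOP12", the next two codes are "OMOP13" and "OMOP14".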

5. Quality of input tables

The input tables need to have the following quality requirements:

Rule | If rule is violated
Each record should be unique in all tables. | The processing will fail.
Concept Codes should be unique and should not repeat for different products. | The processing will fail.
Combinations of product components should be unique. These are Ingredient-strength(s) combination, Dose Form, Brand Name, Quantity, Box size. | Only the highest Concept Code is retained, and the other ones are treated as non-standard Concepts and mapped to the highest.
Each product should have links (records in INTERNAL_RELATIONSHIP_STAGE) to all its Ingredients. | The product will be treated as if it had only the linked Ingredients. If no Ingredients are linked, the product will be processed into the CONCEPT_STAGE table, but as an orphan without any related Concept Classes.
Ingredients should be linked to their Standard Counterparts. | These Ingredients are treated as new Standard Ingredients.
Dose Forms should be linked to their Standard Counterparts. | The processing will fail.
Brand Names should be linked to their Valid Counterparts. | These Brand Names will be treated as new Concepts.
All % in source dosages should be converted into mg/ml (mg) unless it is a gas. | The drug will not be mapped to its Standard Concept.
A Marketed Product (a drug that has a relationship to its supplier in INTERNAL_RELATIONSHIP_STAGE) should have both dosage and Dose Form. | The product won't be processed into the CONCEPT_STAGE table.
A boxed drug should have both dosage and Dose Form. | The product won't be processed into the CONCEPT_STAGE table.
Product ingredients should match in INTERNAL_RELATIONSHIP_STAGE and DS_STAGE. | The processing will fail.
When Ingredients, Dose Forms or other attributes are mapped to multiple targets, precedence values must be present and unique for each source concept. | Processing will create orphaned meaningless branches of RxNorm Extension concepts.

For quality assurance of the input tables you can use the drug_stage_tables_QA.sql script from the project's GitHub repository.
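
To illustrate the kind of checks involved, here is a sketch of two of the rules above: unique Concept Codes, and present, unique precedence values for multi-target mappings. It is not a reproduction of drug_stage_tables_QA.sql; the function names and the tuple layout are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def duplicate_concept_codes(codes: List[str]) -> List[str]:
    """Concept Codes that occur more than once."""
    return sorted(code for code, n in Counter(codes).items() if n > 1)


def codes_with_bad_precedence(
        rows: List[Tuple[str, int, Optional[int]]]) -> List[str]:
    """Source codes mapped to several targets with missing or repeated precedence.

    Each row is (concept_code_1, concept_id_2, precedence).
    """
    by_code: Dict[str, List[Optional[int]]] = {}
    for code, _target, precedence in rows:
        by_code.setdefault(code, []).append(precedence)
    bad = []
    for code, precedences in by_code.items():
        if len(precedences) > 1 and (
                None in precedences
                or len(set(precedences)) != len(precedences)):
            bad.append(code)
    return sorted(bad)
```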

6. Processing

If all 5 tables DRUG_CONCEPT_STAGE, INTERNAL_RELATIONSHIP_STAGE, RELATIONSHIP_TO_CONCEPT, PC_STAGE and DS_STAGE are available, the new terminology can be built:

Inferring of missing Concept Classes

All missing Concept Classes are inferred from the existing ones, from bottom upwards.

Concept Class | Defined by
Clinical Drug Component | Ingredient-strength. Note that Clinical Components are always single-Ingredient. This is in contrast to all other Concept Classes
Branded Drug Component | Ingredient-strength(s), Brand Name
Clinical Drug Form | Ingredient(s), Dose Form
Branded Drug Form | Ingredient(s), Dose Form, Brand Name
Clinical Drug | Ingredient-strength(s), Dose Form
Branded Drug | Ingredient-strength(s), Dose Form, Brand Name
Quantified Clinical Drug | Ingredient-strength(s), Dose Form, Quantity
Quantified Branded Drug | Ingredient-strength(s), Dose Form, Brand Name, Quantity
Clinical Drug Box | Ingredient-strength(s), Dose Form, Box size
Branded Drug Box | Ingredient-strength(s), Dose Form, Brand Name, Box size
Quantified Clinical Box | Ingredient-strength(s), Dose Form, Quantity, Box size
Quantified Branded Box | Ingredient-strength(s), Dose Form, Brand Name, Quantity, Box size

Even though all drug classes are inferred, only those that have no mapping to an equivalent Standard Concept will be written to the CONCEPT table.

Matching

This step is necessary to add inferred equivalence relationships between new and existing Standard Concepts. All matches are created. Links in the RELATIONSHIP_TO_CONCEPT table are ignored.

The matching considers all components (Ingredient-strength(s), Dose Form, Brand Name, Quantity, Box size) in the order of precedence and optionally for records where possible_excipient is set to 1. A 10% mismatch between strength values is still considered a match. Matching between normal and Quantified products compares the Numerator Value of the non-quantified product to the Numerator divided by the Denominator Value of the Quantified one.
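
The strength comparison can be sketched as follows. The text does not say what the 10% is measured against, so this sketch assumes it is relative to the larger of the two values. The function names are hypothetical.

```python
def strengths_match(a: float, b: float, tolerance: float = 0.10) -> bool:
    """True if a and b differ by at most `tolerance`, relative to the larger value."""
    if a == b:
        return True
    return abs(a - b) <= tolerance * max(abs(a), abs(b))


def quantified_matches(plain_numerator: float,
                       quant_numerator: float,
                       quant_denominator: float) -> bool:
    """Compare a non-quantified numerator with numerator/denominator of a Quantified product."""
    return strengths_match(plain_numerator, quant_numerator / quant_denominator)
```

For example, a non-quantified product at 5 mg/mL matches a Quantified product with 500 mg in 100 mL, because 500 / 100 = 5.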

Result

All records in the DRUG_CONCEPT_STAGE table are written to the CONCEPT_STAGE table as follows. The standard_concept field is set to 'S' for all products and Ingredients and Brand Names that have no match to existing Standard Concepts. Dose Forms are always written as non-standard.

All records linking drug products to their Ingredients, Dose Forms, Suppliers and Brand Names are written to the CONCEPT_RELATIONSHIP_STAGE table. Note that this can be a one or two step connection:

  • Ingredients, Dose Forms, Suppliers and Brand Names that have no equivalent in RxNorm (and are therefore Standard Concepts): These are converted from the INTERNAL_RELATIONSHIP_STAGE table.
  • Ingredients, Dose Forms and Brand Names that have an RxNorm equivalent (at least one): These are not written into the CONCEPT_RELATIONSHIP_STAGE table. The RxNorm equivalent is written instead, using the records from the RELATIONSHIP_TO_CONCEPT table with the relationship_id = 'Has standard ing', 'Has standard brand' and 'Has standard form'.

Drug Products and their derivatives (Drug Forms and Components) are connected through CONCEPT_RELATIONSHIP_STAGE records with the following relationship_id values:

Concept Class 1 | Concept Class 2 | Relationship ID
Brand Name | Branded Drug | Brand name of
Brand Name | Branded Drug Comp | Brand name of
Brand Name | Branded Drug Form | Brand name of
Brand Name | Ingredient | Brand name of
Brand Name | Quant Branded Drug | Brand name of
Brand Name | Marketed Product | Brand name of
Branded Drug | Brand Name | RxNorm has ing
Branded Drug | Branded Drug Comp | Consists of
Branded Drug | Branded Drug Form | RxNorm is a
Branded Drug | Branded Pack | Contained in
Branded Drug | Clinical Drug | Tradename of
Branded Drug | Clinical Drug Comp | Consists of
Branded Drug | Dose Form | RxNorm has dose form
Branded Drug | Quant Branded Drug | Has quantified form
Branded Drug | Quant Clinical Drug | Tradename of
Branded Drug | Marketed Product | Has marketed form
Branded Drug Comp | Brand Name | RxNorm has ing
Branded Drug Comp | Branded Drug | Constitutes
Branded Drug Comp | Clinical Drug Comp | Tradename of
Branded Drug Comp | Quant Branded Drug | Constitutes
Branded Drug Form | Brand Name | RxNorm has ing
Branded Drug Form | Branded Drug | RxNorm inverse is a
Branded Drug Form | Clinical Drug Form | Tradename of
Branded Drug Form | Dose Form | RxNorm has dose form
Branded Drug Form | Quant Branded Drug | RxNorm inverse is a
Branded Pack | Branded Drug | Contains
Branded Pack | Clinical Drug | Contains
Branded Pack | Clinical Pack | Tradename of
Branded Pack | Dose Form | RxNorm has dose form
Branded Pack | Quant Branded Drug | Contains
Branded Pack | Quant Clinical Drug | Contains
Branded Pack | Marketed Product | Has marketed form
Clinical Drug | Branded Drug | Has tradename
Clinical Drug | Branded Pack | Contained in
Clinical Drug | Clinical Drug Comp | Consists of
Clinical Drug | Clinical Drug Form | RxNorm is a
Clinical Drug | Clinical Pack | Contained in
Clinical Drug | Dose Form | RxNorm has dose form
Clinical Drug | Quant Branded Drug | Has tradename
Clinical Drug | Quant Clinical Drug | Has quantified form
Clinical Drug | Marketed Product | Has marketed form
Clinical Drug | Marketed Product | Contained in
Clinical Drug Comp | Branded Drug | Constitutes
Clinical Drug Comp | Branded Drug Comp | Has tradename
Clinical Drug Comp | Clinical Drug | Constitutes
Clinical Drug Comp | Ingredient | Has precise ing
Clinical Drug Comp | Ingredient | RxNorm has ing
Clinical Drug Comp | Quant Branded Drug | Constitutes
Clinical Drug Comp | Quant Clinical Drug | Constitutes
Clinical Drug Form | Branded Drug Form | Has tradename
Clinical Drug Form | Clinical Drug | RxNorm inverse is a
Clinical Drug Form | Dose Form | RxNorm has dose form
Clinical Drug Form | Ingredient | RxNorm has ing
Clinical Drug Form | Quant Clinical Drug | RxNorm inverse is a
Clinical Pack | Branded Pack | Has tradename
Clinical Pack | Clinical Drug | Contains
Clinical Pack | Dose Form | RxNorm has dose form
Clinical Pack | Quant Clinical Drug | Contains
Clinical Pack | Marketed Product | Has quantified form
Dose Form | Branded Drug | RxNorm dose form of
Dose Form | Branded Drug Form | RxNorm dose form of
Dose Form | Branded Pack | RxNorm dose form of
Dose Form | Clinical Drug | RxNorm dose form of
Dose Form | Clinical Drug Form | RxNorm dose form of
Dose Form | Clinical Pack | RxNorm dose form of
Dose Form | Quant Branded Drug | RxNorm dose form of
Dose Form | Quant Clinical Drug | RxNorm dose form of
Dose Form | Marketed Product | RxNorm dose form of
Ingredient | Brand Name | Has brand name
Ingredient | Clinical Drug Comp | RxNorm ing of
Ingredient | Clinical Drug Form | RxNorm ing of
Marketed Product | Brand Name | Has brand name
Marketed Product | Branded Drug | Marketed form of
Marketed Product | Branded Pack | Marketed form of
Marketed Product | Clinical Drug | Marketed form of
Marketed Product | Clinical Drug | Contains
Marketed Product | Clinical Pack | Marketed form of
Marketed Product | Dose Form | RxNorm has dose form
Marketed Product | Quant Branded Drug | Marketed form of
Marketed Product | Quant Clinical Drug | Contains
Marketed Product | Quant Clinical Drug | Marketed form of
Marketed Product | Supplier | Has supplier
Quant Branded Drug | Brand Name | RxNorm has ing
Quant Branded Drug | Branded Drug | Quantified form of
Quant Branded Drug | Branded Drug Comp | Consists of
Quant Branded Drug | Branded Drug Form | RxNorm is a
Quant Branded Drug | Branded Pack | Contained in
Quant Branded Drug | Clinical Drug | Tradename of
Quant Branded Drug | Clinical Drug Comp | Consists of
Quant Branded Drug | Dose Form | RxNorm has dose form
Quant Branded Drug | Quant Clinical Drug | Tradename of
Quant Branded Drug | Marketed Product | Has marketed form
Quant Clinical Drug | Branded Drug | Has tradename
Quant Clinical Drug | Branded Pack | Contained in
Quant Clinical Drug | Clinical Drug | Quantified form of
Quant Clinical Drug | Clinical Drug Comp | Consists of
Quant Clinical Drug | Clinical Drug Form | RxNorm is a
Quant Clinical Drug | Clinical Pack | Contained in
Quant Clinical Drug | Dose Form | RxNorm has dose form
Quant Clinical Drug | Quant Branded Drug | Has tradename
Quant Clinical Drug | Marketed Product | Has marketed form
Quant Clinical Drug | Marketed Product | Contained in
Supplier | Marketed Product | Supplier of

From the pack_content, build the name of the Branded and Clinical Packs.

Relationships between any Drug Concept Class and a Classification Concept Class are recorded through the “Drug has drug class” and “Drug class of drug” generic relationship pair.

Finally, a new DRUG_STRENGTH_STAGE table should be created from DS_STAGE and the unit conversions in RELATIONSHIP_TO_CONCEPT, so the content can be added to the DRUG_STRENGTH table. This includes only Drug Concepts that have no mapping to an existing Standard Concept and are now Standard themselves. The ingredient_concept_code field is either the RxNorm equivalent or, if that is unavailable, the code from the newly added vocabulary.

documentation/international_drugs.txt · Last modified: 2019/07/19 15:49 by ekorchmar