This is an old revision of the document!


License

THIS IS OUTDATED. All documentation is now on the github wiki. Please refer there or to the CDM working group for more information

© 2014 Observational Health Data Sciences and Informatics

This work is based on work by the Observational Medical Outcomes Partnership (OMOP) and used under license from the FNIH at http://omop.fnih.org/publiclicense.

All derivative work after the OMOP CDM v4 specification is dedicated to the public domain. Observational Health Data Sciences and Informatics (OHDSI) has waived all copyright and related or neighboring rights to the extent allowed by law.

http://creativecommons.org/publicdomain/zero/1.0/

2014/11/14 11:38 · cgreich

Background

The Observational Medical Outcomes Partnership (OMOP) was a public-private partnership established to inform the appropriate use of observational healthcare databases for studying the effects of medical products. Over the course of the 5-year project and through its community of researchers from industry, government, and academia, OMOP successfully achieved its aims to:

  1. Conduct methodological research to empirically evaluate the performance of various analytical methods on their ability to identify true associations and avoid false findings,
  2. Develop tools and capabilities for transforming, characterizing, and analyzing disparate data sources across the health care delivery spectrum, and
  3. Establish a shared resource so that the broader research community can collaboratively advance the science.

The results of OMOP's research have been widely published and presented at scientific conferences, including annual symposia.

The OMOP Legacy continues…

The community is actively using the OMOP Common Data Model for their various research purposes. Those tools will continue to be maintained and supported, and information about this work is available in the public domain.

The Observational Health Data Sciences and Informatics (OHDSI) has been established as a multi-stakeholder, interdisciplinary collaborative to create open-source solutions that bring out the value of observational health data through large-scale analytics. The OHDSI collaborative includes all of the original OMOP research investigators, and will develop its tools using the OMOP Common Data Model. Learn more at ohdsi.org.

The OMOP Common Data Model will continue to be an open-source, community standard for observational healthcare data. The model specifications and associated work products will be placed in the public domain, and the entire research community is encouraged to use these tools to support everybody's own research activities.

2014/11/15 02:18 · cgreich

CONCEPT table

The Standardized Vocabularies contain records, or Concepts, that uniquely identify each fundamental unit of meaning used to express clinical information in all domain tables of the CDM. Concepts are derived from vocabularies, which represent clinical information across a domain (e.g. conditions, drugs, procedures) through the use of codes and associated descriptions. Some Concepts are designated Standard Concepts, meaning these Concepts can be used as normative expressions of a clinical entity within the OMOP Common Data Model and within standardized analytics. Each Standard Concept belongs to one domain, which defines the location where the Concept would be expected to occur within data tables of the CDM.

Concepts can represent broad categories (like “Cardiovascular disease”), detailed clinical elements (“Myocardial infarction of the anterolateral wall”) or modifying characteristics and attributes that define Concepts at various levels of detail (severity of a disease, associated morphology, etc.).

Records in the Standardized Vocabularies tables are derived from national or international vocabularies such as SNOMED-CT, RxNorm, and LOINC, or custom Concepts defined to cover various aspects of observational data analysis. For a detailed description of these vocabularies, their use in the OMOP CDM and their relationships to each other please refer to the Specifications.

Field | Required | Type | Description
concept_id | Yes | integer | A unique identifier for each Concept across all domains.
concept_name | Yes | varchar(255) | An unambiguous, meaningful and descriptive name for the Concept.
domain_id | Yes | varchar(20) | A foreign key to the DOMAIN table the Concept belongs to.
vocabulary_id | Yes | varchar(20) | A foreign key to the VOCABULARY table indicating from which source the Concept has been adapted.
concept_class_id | Yes | varchar(20) | The attribute or concept class of the Concept. Examples are “Clinical Drug”, “Ingredient”, “Clinical Finding” etc.
standard_concept | No | varchar(1) | This flag determines whether a Concept is a Standard Concept, i.e. is used in the data, a Classification Concept, or a non-standard Source Concept. The allowable values are 'S' (Standard Concept) and 'C' (Classification Concept), otherwise the content is NULL.
concept_code | Yes | varchar(50) | The concept code represents the identifier of the Concept in the source vocabulary, such as SNOMED-CT concept IDs, RxNorm RXCUIs etc. Note that concept codes are not unique across vocabularies.
valid_start_date | Yes | date | The date when the Concept was first recorded. The default value is 1-Jan-1970, meaning the Concept has no (known) date of inception.
valid_end_date | Yes | date | The date when the Concept became invalid because it was deleted or superseded (updated) by a new concept. The default value is 31-Dec-2099, meaning the Concept is valid until it becomes deprecated.
invalid_reason | No | varchar(1) | Reason the Concept was invalidated. Possible values are D (deleted), U (replaced with an update) or NULL when valid_end_date has the default value.

Conventions

Concepts in the Common Data Model are derived from a number of public or proprietary terminologies such as SNOMED-CT and RxNorm, or custom generated to standardize aspects of observational data. Both types of Concepts are integrated based on the following rules:

  • All Concepts are maintained centrally by the CDM and Vocabularies Working Group. Additional concepts can be added, as needed, upon request.
  • For all Concepts, whether they are custom generated or adopted from published terminologies, a unique numeric identifier concept_id is assigned and used as the key to link all observational data to the corresponding Concept reference data.
  • The concept_id of a Concept is persistent, i.e. stays the same for the same Concept between releases of the Standardized Vocabularies.
  • A descriptive name for each Concept is stored as the Concept Name as part of the CONCEPT table. Additional names and descriptions for the Concept are stored as Synonyms in the CONCEPT_SYNONYM table.
  • Each Concept is assigned to a Domain. For Standard Concepts, this is always a single Domain. Source Concepts can be composite or coordinated entities, and therefore can belong to more than one Domain. The domain_id field of the record contains the abbreviation of the Domain, or Domain combination. Please refer to the Standardized Vocabularies Specification for details of the Domain Assignment.
  • For details of the Vocabularies adopted for use in the OMOP CDM refer to the Standardized Vocabularies Specification.
  • Concept Class designations are attributes of Concepts. Each Vocabulary has its own set of permissible Concept Classes, although the same Concept Class can be used by more than one Vocabulary. Depending on the Vocabulary, the Concept Class may categorize Concepts vertically (parallel) or horizontally (hierarchically). See the specification of each Vocabulary for details.
  • Concept Class attributes should not be confused with Classification Concepts. These are separate Concepts that have a hierarchical relationship to Standard Concepts or each other, while Concept Classes are unique Vocabulary-specific attributes for each Concept.
  • For Concepts inherited from published terminologies, the source code is retained in the concept_code field and can be used to reference the source vocabulary.
  • Standard Concepts (designated as 'S' in the standard_concept field) may appear in CDM tables in all *_concept_id fields, whereas Classification Concepts ('C') should not appear in the CDM data, but participate in the construction of the CONCEPT_ANCESTOR table and can be used to identify Descendants that may appear in the data. See CONCEPT_ANCESTOR table. Non-standard Concepts can only appear in *_source_concept_id fields and are not used in the CONCEPT_ANCESTOR table. Please refer to the Standardized Vocabularies Specifications for details of the Standard Concept designation.
  • All logical data elements associated with the various CDM tables (usually in the <domain>_type_concept_id field) are called Type Concepts, including defining characteristics, qualifying attributes etc. They are also stored as Concepts in the CONCEPT table. Since they are generated by OMOP, there is no meaningful concept_code.
  • The lifespan of a Concept is recorded through its valid_start_date, valid_end_date and invalid_reason fields. This allows Concepts to correctly reflect the point in time at which they were defined. Usually, Concepts get deprecated if their meaning was deemed ambiguous, a duplication of another Concept, or in need of revision for scientific reasons. For example, drug ingredients get updated when different salt or isomer variants enter the market. Usually, drugs taken off the market do not cause a deprecation by the terminology vendor. Since observational data are valid with respect to the time they are recorded, it is key for the Standardized Vocabularies to provide even obsolete codes and maintain their relationships to other current Concepts.
  • Concepts without a known instantiated date are assigned valid_start_date of ‘1-Jan-1970’.
  • Concepts that are not invalid are assigned valid_end_date of ‘31-Dec-2099’.
  • Deprecated Concepts (with a valid_end_date before the release date of the Standardized Vocabularies) will have an invalid_reason value of 'D' (deprecated without successor) or 'U' (updated). The updated Concepts have a record in the CONCEPT_RELATIONSHIP table indicating their active replacement Concept.
  • Values for concept_ids generated as part of Standardized Vocabularies will be reserved from 0 to 2,000,000,000. Above this range, concept_ids are available for local use and are guaranteed not to clash with future releases of the Standardized Vocabularies.
2014/11/15 02:32 · cgreich
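
The conventions above for the standard_concept flag, the validity dates and the reserved concept_id range can be sketched in a few lines of Python. This is only an illustration: the Concept dataclass, the function names and the example record (including its concept_id and dates) are made up for this sketch and are not part of the CDM specification.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Upper end of the concept_id range reserved for the Standardized Vocabularies,
# as stated in the conventions above.
RESERVED_MAX_CONCEPT_ID = 2_000_000_000


@dataclass(frozen=True)
class Concept:
    """Hypothetical in-memory view of a few CONCEPT columns."""
    concept_id: int
    concept_name: str
    standard_concept: Optional[str]  # 'S', 'C' or None
    valid_start_date: date
    valid_end_date: date
    invalid_reason: Optional[str]    # 'D', 'U' or None


def concept_kind(concept: Concept) -> str:
    """Translate the standard_concept flag into a readable label."""
    if concept.standard_concept == "S":
        return "standard"
    if concept.standard_concept == "C":
        return "classification"
    # Any other content (normally NULL) marks a non-standard Source Concept.
    return "source"


def is_valid_on(concept: Concept, day: date) -> bool:
    """True if 'day' falls inside the Concept's recorded lifespan."""
    return concept.valid_start_date <= day <= concept.valid_end_date


def is_local_concept_id(concept_id: int) -> bool:
    """True if the id lies above the range reserved for official releases."""
    return concept_id > RESERVED_MAX_CONCEPT_ID


# Made-up records, using the default start and end dates from the table above.
example = Concept(1000, "Example condition", "S",
                  date(1970, 1, 1), date(2099, 12, 31), None)
deprecated = Concept(1001, "Example deprecated code", None,
                     date(1970, 1, 1), date(2010, 1, 1), "D")
```

A local ETL process could use a check like is_local_concept_id to keep site-specific Concepts apart from those delivered with a vocabulary release.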

VOCABULARY table

The VOCABULARY table includes a list of the Vocabularies collected from various sources or created de novo by the OMOP community. This reference table is populated with a single record for each Vocabulary source and includes a descriptive name and other associated attributes for the Vocabulary.

Field | Required | Type | Description
vocabulary_id | Yes | varchar(20) | A unique identifier for each Vocabulary, such as ICD9CM, SNOMED, Visit.
vocabulary_name | Yes | varchar(255) | The name describing the vocabulary, for example “International Classification of Diseases, Ninth Revision, Clinical Modification, Volume 1 and 2 (NCHS)” etc.
vocabulary_reference | Yes | varchar(255) | External reference to documentation or an available download of the vocabulary.
vocabulary_version | Yes | varchar(255) | Version of the Vocabulary as indicated in the source.
vocabulary_concept_id | Yes | integer | A foreign key that refers to a standard concept identifier in the CONCEPT table for the Vocabulary the VOCABULARY record belongs to.

Conventions

  • There is one record for each Vocabulary. One Vocabulary source or vendor can issue several Vocabularies, each of them creating their own record in the VOCABULARY table. However, the choice of whether a Vocabulary contains Concepts of different Concept Classes, or when these different classes constitute separate Vocabularies cannot precisely be decided based on the definition of what constitutes a Vocabulary. For example, the ICD-9 Volume 1 and 2 codes (ICD9CM, containing predominantly conditions and some procedures and observations) and the ICD-9 Volume 3 codes (ICD9Proc, containing predominantly procedures) are realized as two different Vocabularies. On the other hand, SNOMED-CT codes of the class Condition and those of the class Procedure are part of one and the same Vocabulary. Please refer to the Standardized Vocabularies Specifications for details of each Vocabulary.
  • The vocabulary_id field contains an alphanumerical identifier, that can also be used as the abbreviation of the Vocabulary name.
  • The record with vocabulary_id = 'None' is reserved to contain information regarding the current version of the entire Standardized Vocabularies.
  • The vocabulary_name field contains the full official name of the Vocabulary, as well as the source or vendor in parentheses.
  • Each Vocabulary has an entry in the CONCEPT table, which is recorded in the vocabulary_concept_id field. This is for purposes of creating a closed Information Model, where all entities in the OMOP CDM are covered by a unique Concept.
  • In past versions of the VOCABULARY table, the vocabulary_id used to be a numerical value. A conversion table between these old and new IDs is given below:
2014/11/15 02:34 · cgreich
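
The convention that the record with vocabulary_id = 'None' carries the version of the whole release can be sketched as a simple lookup. The example below uses an in-memory SQLite database purely for illustration; the rows, the version strings, the placeholder vocabulary_concept_id values and the function name release_version are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE vocabulary (
        vocabulary_id         VARCHAR(20)  PRIMARY KEY,
        vocabulary_name       VARCHAR(255) NOT NULL,
        vocabulary_reference  VARCHAR(255) NOT NULL,
        vocabulary_version    VARCHAR(255) NOT NULL,
        vocabulary_concept_id INTEGER      NOT NULL
    )
    """
)
# Hypothetical content: one record per Vocabulary plus the reserved 'None' record.
conn.executemany(
    "INSERT INTO vocabulary VALUES (?, ?, ?, ?, ?)",
    [
        ("None", "Example release record", "example reference", "example-release-1", 1),
        ("SNOMED", "Example SNOMED record", "example reference", "example-source-version", 2),
    ],
)


def release_version(connection):
    """Return the version stored on the reserved 'None' record, or None if absent."""
    row = connection.execute(
        "SELECT vocabulary_version FROM vocabulary WHERE vocabulary_id = 'None'"
    ).fetchone()
    return row[0] if row else None
```

Calling release_version(conn) on this toy database returns the made-up string stored on the 'None' record.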

DOMAIN table

The DOMAIN table includes a list of OMOP-defined Domains the Concepts of the Standardized Vocabularies can belong to. A Domain defines the set of allowable Concepts for the standardized fields in the CDM tables. For example, the “Condition” Domain contains Concepts that describe a condition of a patient, and these Concepts can only be stored in the condition_concept_id field of the CONDITION_OCCURRENCE and CONDITION_ERA tables. This reference table is populated with a single record for each Domain and includes a descriptive name for the Domain.

Field | Required | Type | Description
domain_id | Yes | varchar(20) | A unique key for each domain.
domain_name | Yes | varchar(255) | The name describing the Domain, e.g. “Condition”, “Procedure”, “Measurement” etc.
domain_concept_id | Yes | integer | A foreign key that refers to an identifier in the CONCEPT table for the unique Domain Concept the Domain record belongs to.

Conventions

  • There is one record for each Domain. The domains are defined by the tables and fields in the OMOP CDM that can contain Concepts describing all the various aspects of the healthcare experience of a patient.
  • The domain_id field contains an alphanumerical identifier, that can also be used as the abbreviation of the Domain.
  • The domain_name field contains the unabbreviated name of the Domain.
  • Each Domain also has an entry in the CONCEPT table, which is recorded in the domain_concept_id field. This is for purposes of creating a closed Information Model, where all entities in the OMOP CDM are covered by a unique Concept.
  • Past versions of the OMOP CDM did not support the notion of a Domain.
2014/11/15 02:35 · cgreich

CONCEPT_CLASS table

The CONCEPT_CLASS table is a reference table, which includes a list of the classifications used to differentiate Concepts within a given Vocabulary. This reference table is populated with a single record for each Concept Class:

Field | Required | Type | Description
concept_class_id | Yes | varchar(20) | A unique key for each class.
concept_class_name | Yes | varchar(255) | The name describing the Concept Class, e.g. “Clinical Finding”, “Ingredient”, etc.
concept_class_concept_id | Yes | integer | A foreign key that refers to an identifier in the CONCEPT table for the unique Concept Class the record belongs to.

Conventions

  • There is one record for each Concept Class. Concept Classes are used to create additional structure to the Concepts within each Vocabulary. Some Concept Classes are unique to a Vocabulary (for example “Clinical Finding” in SNOMED), but others can be used across different Vocabularies. The separation of Concepts through Concept Classes can be semantically horizontal (each Class subsumes Concepts of the same hierarchical level, akin to sub-Vocabularies within a Vocabulary) or vertical (each Class subsumes Concepts of a certain kind, going across hierarchical levels). For example, Concept Classes in SNOMED are vertical: The classes “Procedure” and “Clinical Finding” define very granular to very generic Concepts. On the other hand, “Clinical Drug” and “Ingredient” Concept Classes define horizontal layers or strata in the RxNorm vocabulary, which all belong to the same concept of a Drug.
  • The concept_class_id field contains an alphanumerical identifier, that can also be used as the abbreviation of the Concept Class.
  • The concept_class_name field contains the unabbreviated name of the Concept Class.
  • Each Concept Class also has an entry in the CONCEPT table, which is recorded in the concept_class_concept_id field. This is for purposes of creating a closed Information Model, where all entities in the OMOP CDM are covered by unique Concepts.
  • Past versions of the OMOP CDM did not have a separate reference table for all Concept Classes. Also, the content of the old concept_class and the new concept_class_id fields are not always identical. A conversion table is given below:
concept_class previously | concept_class_id Version 5
Administrative concept | Admin Concept
Admitting Source | Admitting Source
Anatomical Therapeutic Chemical Classification | ATC
Anatomical Therapeutic Chemical Classification | ATC
APC | Procedure
Attribute | Attribute
Biobank Flag | Biobank Flag
Biological function | Biological Function
Body structure | Body Structure
Brand Name | Brand Name
Branded Drug | Branded Drug
Branded Drug Component | Branded Drug Comp
Branded Drug Form | Branded Drug Form
Branded Pack | Branded Pack
CCS_DIAGNOSIS | Condition
CCS_PROCEDURES | Procedure
Chart Availability | Chart Availability
Chemical Structure | Chemical Structure
Clinical Drug | Clinical Drug
Clinical Drug Component | Clinical Drug Comp
Clinical Drug Form | Clinical Drug Form
Clinical finding | Clinical Finding
Clinical Pack | Clinical Pack
Concept Relationship | Concept Relationship
Condition Occurrence Type | Condition Occur Type
Context-dependent category | Context-dependent
CPT-4 | Procedure
Currency | Currency
Death Type | Death Type
Device Type | Device Type
Discharge Disposition | Discharge Dispo
Discharge Status | Discharge Status
Domain | Domain
Dose Form | Dose Form
DRG | Diagnostic Category
Drug Exposure Type | Drug Exposure Type
Drug Interaction | Drug Interaction
Encounter Type | Encounter Type
Enhanced Therapeutic Classification | ETC
Enrollment Basis | Enrollment Basis
Environment or geographical location | Location
Ethnicity | Ethnicity
Event | Event
Gender | Gender
HCPCS | Procedure
Health Care Provider Specialty | Provider Specialty
HES specialty | Provider Specialty
High Level Group Term | HLGT
High Level Term | HLT
Hispanic | Hispanic
ICD-9-Procedure | Procedure
Indication or Contra-indication | Ind / CI
Ingredient | Ingredient
LOINC Code | Measurement
LOINC Multidimensional Classification | Meas Class
Lowest Level Term | LLT
MDC | Diagnostic Category
Measurement Type | Meas Type
Mechanism of Action | Mechanism of Action
Model component | Model Comp
Morphologic abnormality | Morph Abnormality
MS-DRG | Diagnostic Category
Namespace concept | Namespace Concept
Note Type | Note Type
Observable entity | Observable Entity
Observation Period Type | Obs Period Type
Observation Type | Observation Type
OMOP DOI cohort | Drug Cohort
OMOP HOI cohort | Condition Cohort
OPCS-4 | Procedure
Organism | Organism
Patient Status | Patient Status
Pharmaceutical / biologic product | Pharma/Biol Product
Pharmaceutical Preparations | Pharma Preparation
Pharmacokinetics | PK
Pharmacologic Class | Pharmacologic Class
Physical force | Physical Force
Physical object | Physical Object
Physiologic Effect | Physiologic Effect
Place of Service | Place of Service
Preferred Term | PT
Procedure | Procedure
Procedure Occurrence Type | Procedure Occur Type
Qualifier value | Qualifier Value
Race | Race
Record artifact | Record Artifact
Revenue Code | Revenue Code
Sex | Gender
Social context | Social Context
Special concept | Special Concept
Specimen | Specimen
Staging and scales | Staging / Scales
Standardized MedDRA Query | SMQ
Substance | Substance
System Organ Class | SOC
Therapeutic Class | Therapeutic Class
UCUM | Unit
UCUM Canonical | Canonical Unit
UCUM Custom | Unit
UCUM Standard | Unit
Undefined | Undefined
UNKNOWN | Undefined
VA Class | Drug Class
VA Drug Interaction | Drug Interaction
VA Product | Drug Product
Visit | Visit
Visit Type | Visit Type
2014/11/15 02:37 · cgreich
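
The conversion table is many-to-one: several old concept_class values collapse into a single Version 5 concept_class_id. A small sketch of what that means for a migration script follows. The dictionary holds only a handful of rows copied from the table, and the names LEGACY_TO_V5_CONCEPT_CLASS, to_v5_concept_class and legacy_candidates are illustrative choices for this example.

```python
# Excerpt of the conversion table above; a real script would load every row.
LEGACY_TO_V5_CONCEPT_CLASS = {
    "CPT-4": "Procedure",
    "HCPCS": "Procedure",
    "ICD-9-Procedure": "Procedure",
    "Clinical finding": "Clinical Finding",
    "Branded Drug Component": "Branded Drug Comp",
    "Sex": "Gender",
    "Gender": "Gender",
}


def to_v5_concept_class(legacy_class):
    """Map an old concept_class value to its Version 5 concept_class_id."""
    return LEGACY_TO_V5_CONCEPT_CLASS[legacy_class]


def legacy_candidates(v5_class_id):
    """List every old concept_class that maps to the given Version 5 id."""
    return sorted(
        legacy
        for legacy, v5 in LEGACY_TO_V5_CONCEPT_CLASS.items()
        if v5 == v5_class_id
    )
```

Because legacy_candidates("Procedure") yields more than one old value, the conversion cannot be reversed from the concept_class_id alone.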

CONCEPT_RELATIONSHIP table

The CONCEPT_RELATIONSHIP table contains records that define direct relationships between any two Concepts and the nature or type of the relationship. Each type of a relationship is defined in the RELATIONSHIP table.

Field | Required | Type | Description
concept_id_1 | Yes | integer | A foreign key to a Concept in the CONCEPT table associated with the relationship. Relationships are directional, and this field represents the source concept designation.
concept_id_2 | Yes | integer | A foreign key to a Concept in the CONCEPT table associated with the relationship. Relationships are directional, and this field represents the destination concept designation.
relationship_id | Yes | varchar(20) | A unique identifier to the type or nature of the Relationship as defined in the RELATIONSHIP table.
valid_start_date | Yes | date | The date when the instance of the Concept Relationship is first recorded.
valid_end_date | Yes | date | The date when the Concept Relationship became invalid because it was deleted or superseded (updated) by a new relationship. Default value is 31-Dec-2099.
invalid_reason | No | varchar(1) | Reason the relationship was invalidated. Possible values are 'D' (deleted), 'U' (replaced with an update) or NULL when valid_end_date has the default value.

Conventions

  • Relationships can generally be classified as hierarchical (parent-child) or non-hierarchical (lateral).
  • All Relationships are directional, and each Concept Relationship is represented twice symmetrically within the CONCEPT_RELATIONSHIP table. For example, the two SNOMED concepts of ‘Acute myocardial infarction of the anterior wall’ and ‘Acute myocardial infarction’ have two Concept Relationships: 1- ‘Acute myocardial infarction of the anterior wall’ ‘Is a’ ‘Acute myocardial infarction’, and 2- ‘Acute myocardial infarction’ ‘Subsumes’ ‘Acute myocardial infarction of the anterior wall’.
  • There is one record for each Concept Relationship connecting the same Concepts with the same relationship_id.
  • Since all Concept Relationships exist with their mirror image (concept_id_1 and concept_id_2 swapped, and the relationship_id replaced by the reverse_relationship_id from the RELATIONSHIP table), it is not necessary to query for the existence of a relationship both in the concept_id_1 and concept_id_2 fields.
  • Concept Relationships define direct relationships between Concepts. Indirect relationships through third Concepts are not captured in this table. However, the CONCEPT_ANCESTOR table does this for hierarchical relationships over several “generations” of direct relationships.
  • In previous versions of the CDM, the relationship_id used to be a numerical identifier. See the RELATIONSHIP table.
2014/11/15 02:38 · cgreich
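
The mirror-image convention described above can be illustrated with a toy SQLite database. The sketch keeps only the columns needed for the point, and the concept ids 1000 and 1001 as well as the function names are invented for this example. It shows a consistency check for missing mirror records and a lookup that only needs to filter on concept_id_1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE relationship (
        relationship_id         VARCHAR(20) PRIMARY KEY,
        reverse_relationship_id VARCHAR(20) NOT NULL
    );
    CREATE TABLE concept_relationship (
        concept_id_1    INTEGER     NOT NULL,
        concept_id_2    INTEGER     NOT NULL,
        relationship_id VARCHAR(20) NOT NULL
    );
    INSERT INTO relationship VALUES ('Is a', 'Subsumes'), ('Subsumes', 'Is a');
    -- Hypothetical ids: 1000 = broader concept, 1001 = more specific concept.
    INSERT INTO concept_relationship VALUES
        (1001, 1000, 'Is a'),
        (1000, 1001, 'Subsumes');
    """
)


def missing_mirrors(connection):
    """Return relationship records that lack their swapped, reversed counterpart."""
    return connection.execute(
        """
        SELECT cr.concept_id_1, cr.concept_id_2, cr.relationship_id
        FROM concept_relationship cr
        JOIN relationship r ON r.relationship_id = cr.relationship_id
        LEFT JOIN concept_relationship m
               ON m.concept_id_1 = cr.concept_id_2
              AND m.concept_id_2 = cr.concept_id_1
              AND m.relationship_id = r.reverse_relationship_id
        WHERE m.concept_id_1 IS NULL
        """
    ).fetchall()


def related(connection, concept_id):
    """All direct relationships of a Concept, looking at concept_id_1 only."""
    return connection.execute(
        "SELECT relationship_id, concept_id_2 FROM concept_relationship "
        "WHERE concept_id_1 = ? ORDER BY relationship_id, concept_id_2",
        (concept_id,),
    ).fetchall()
```

Since every record has its mirror, related(conn, 1000) and related(conn, 1001) each find the link between the two Concepts without a second query on concept_id_2.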

RELATIONSHIP table

The RELATIONSHIP table provides a reference list of all types of relationships that can be used to associate any two concepts in the CONCEPT_RELATIONSHIP table.

Field | Required | Type | Description
relationship_id | Yes | varchar(20) | The type of relationship captured by the relationship record.
relationship_name | Yes | varchar(255) | The text that describes the relationship type.
is_hierarchical | Yes | varchar(1) | Defines whether a relationship defines concepts into classes or hierarchies. Values are 1 for hierarchical relationship or 0 if not.
defines_ancestry | Yes | varchar(1) | Defines whether a hierarchical relationship contributes to the concept_ancestor table. These are subsets of the hierarchical relationships. Valid values are 1 or 0.
reverse_relationship_id | Yes | varchar(20) | The identifier for the relationship used to define the reverse relationship between two concepts.
relationship_concept_id | Yes | integer | A foreign key that refers to an identifier in the CONCEPT table for the unique relationship concept.

Conventions

  • There is one record for each Relationship.
  • Relationships are classified as hierarchical (parent-child) or non-hierarchical (lateral).
  • They are used to determine which concept relationship records should be included in the computation of the CONCEPT_ANCESTOR table.
  • The relationship_id field contains an alphanumerical identifier that can also be used as the abbreviation of the Relationship.
  • The relationship_name field contains the unabbreviated name of the Relationship.
  • Relationships all exist symmetrically, i.e. in both directions. The relationship_id of the opposite Relationship is provided in the reverse_relationship_id field.
  • Each Relationship also has an equivalent entry in the CONCEPT table, which is recorded in the relationship_concept_id field. This is for purposes of creating a closed Information Model, where all entities in the OMOP CDM are covered by unique Concepts.
  • Hierarchical Relationships are used to build a hierarchical tree out of the Concepts, which is recorded in the CONCEPT_ANCESTOR table. For example, “has_ingredient” is a Relationship between Concepts of the Concept Class 'Clinical Drug' and those of 'Ingredient', and all Ingredients can be classified as the “parental” hierarchical Concepts for the drug products they are part of. All 'Is a' Relationships are hierarchical.
  • Relationships, including hierarchical ones, can be between Concepts within the same Vocabulary or those adopted from different Vocabulary sources.
  • In past versions of the RELATIONSHIP table, the relationship_id used to be a numerical value. A conversion table between these old and new IDs is given below:
relationship_id previously | relationship_id Version 5
1 | LOINC replaced by
2 | Has precise ing
3 | Has tradename
4 | RxNorm has dose form
5 | Has form
6 | RxNorm has ing
7 | Constitutes
8 | Contains
9 | Reformulation of
10 | Subsumes
11 | NDFRT has dose form
12 | Induces
13 | May diagnose
14 | Has physio effect
15 | Has CI physio effect
16 | NDFRT has ing
17 | Has CI chem class
18 | Has MoA
19 | Has CI MoA
20 | Has PK
21 | May treat
22 | CI to
23 | May prevent
24 | Has metabolites
25 | Has metabolism
26 | May be inhibited by
27 | Has chem structure
28 | NDFRT - RxNorm eq
29 | Has recipient cat
30 | Has proc site
31 | Has priority
32 | Has pathology
33 | Has part of
34 | Has severity
35 | Has revision status
36 | Has access
37 | Has occurrence
38 | Has method
39 | Has laterality
40 | Has interprets
41 | Has indir morph
42 | Has indir device
43 | Has specimen
44 | Has interpretation
45 | Has intent
46 | Has focus
47 | Has manifestation
48 | Has active ing
49 | Has finding site
50 | Has episodicity
51 | Has dir subst
52 | Has dir morph
53 | Has dir device
54 | Has component
55 | Has causative agent
56 | Has asso morph
57 | Has asso finding
58 | Has measurement
59 | Has property
60 | Has scale type
61 | Has time aspect
62 | Has specimen proc
63 | Has specimen source
64 | Has specimen morph
65 | Has specimen topo
66 | Has specimen subst
67 | Has due to
68 | Has relat context
69 | Has dose form
70 | Occurs after
71 | Has asso proc
72 | Has dir proc site
73 | Has indir proc site
74 | Has proc device
75 | Has proc morph
76 | Has finding context
77 | Has proc context
78 | Has temporal context
79 | Finding asso with
80 | Has surgical appr
81 | Using device
82 | Using energy
83 | Using subst
84 | Using acc device
85 | Has clinical course
86 | Has route of admin
87 | Using finding method
88 | Using finding inform
92 | ICD9P - SNOMED eq
93 | CPT4 - SNOMED cat
94 | CPT4 - SNOMED eq
125 | MedDRA - SNOMED eq
126 | Has FDA-appr ind
127 | Has off-label ind
129 | Has CI
130 | ETC - RxNorm
131 | ATC - RxNorm
132 | SMQ - MedDRA
135 | LOINC replaces
136 | Precise ing of
137 | Tradename of
138 | RxNorm dose form of
139 | Form of
140 | RxNorm ing of
141 | Consists of
142 | Contained in
143 | Reformulated in
144 | Is a
145 | NDFRT dose form of
146 | Induced by
147 | Diagnosed through
148 | Physiol effect by
149 | CI physiol effect by
150 | NDFRT ing of
151 | CI chem class of
152 | MoA of
153 | CI MoA of
154 | PK of
155 | May be treated by
156 | CI by
157 | May be prevented by
158 | Metabolite of
159 | Metabolism of
160 | Inhibits effect
161 | Chem structure of
162 | RxNorm - NDFRT eq
163 | Recipient cat of
164 | Proc site of
165 | Priority of
166 | Pathology of
167 | Part of
168 | Severity of
169 | Revision status of
170 | Access of
171 | Occurrence of
172 | Method of
173 | Laterality of
174 | Interprets of
175 | Indir morph of
176 | Indir device of
177 | Specimen of
178 | Interpretation of
179 | Intent of
180 | Focus of
181 | Manifestation of
182 | Active ing of
183 | Finding site of
184 | Episodicity of
185 | Dir subst of
186 | Dir morph of
187 | Dir device of
188 | Component of
189 | Causative agent of
190 | Asso morph of
191 | Asso finding of
192 | Measurement of
193 | Property of
194 | Scale type of
195 | Time aspect of
196 | Specimen proc of
197 | Specimen identity of
198 | Specimen morph of
199 | Specimen topo of
200 | Specimen subst of
201 | Due to of
202 | Relat context of
203 | Dose form of
204 | Occurs before
205 | Asso proc of
206 | Dir proc site of
207 | Indir proc site of
208 | Proc device of
209 | Proc morph of
210 | Finding context of
211 | Proc context of
212 | Temporal context of
213 | Asso with finding
214 | Surgical appr of
215 | Device used by
216 | Energy used by
217 | Subst used by
218 | Acc device used by
219 | Clinical course of
220 | Route of admin of
221 | Finding method of
222 | Finding inform of
226 | SNOMED - ICD9P eq
227 | SNOMED cat - CPT4
228 | SNOMED - CPT4 eq
239 | SNOMED - MedDRA eq
240 | Is FDA-appr ind of
241 | Is off-label ind of
243 | Is CI of
244 | RxNorm - ETC
245 | RxNorm - ATC
246 | MedDRA - SMQ
247 | Ind/CI - SNOMED
248 | SNOMED - ind/CI
275 | Has therap class
276 | Therap class of
277 | Drug-drug inter for
278 | Has drug-drug inter
279 | Has pharma prep
280 | Pharma prep in
281 | Inferred class of
282 | Has inferred class
283 | SNOMED proc - HCPCS
284 | HCPCS - SNOMED proc
285 | RxNorm - NDFRT name
286 | NDFRT - RxNorm name
287 | ETC - RxNorm name
288 | RxNorm - ETC name
289 | ATC - RxNorm name
290 | RxNorm - ATC name
291 | HOI - SNOMED
292 | SNOMED - HOI
293 | DOI - RxNorm
294 | RxNorm - DOI
295 | HOI - MedDRA
296 | MedDRA - HOI
297 | NUCC - CMS Specialty
298 | CMS Specialty - NUCC
299 | DRG - MS-DRG eq
300 | MS-DRG - DRG eq
301 | DRG - MDC cat
302 | MDC cat - DRG
303 | Visit cat - PoS
304 | PoS - Visit cat
305 | VAProd - NDFRT
306 | NDFRT - VAProd
307 | VAProd - RxNorm eq
308 | RxNorm - VAProd eq
309 | RxNorm replaced by
310 | RxNorm replaces
311 | SNOMED replaced by
312 | SNOMED replaces
313 | ICD9P replaced by
314 | ICD9P replaces
315 | Multilex has ing
316 | Multilex ing of
317 | RxNorm - Multilex eq
318 | Multilex - RxNorm eq
319 | Multilex ing - class
320 | Class - Multilex ing
321 | Maps to
322 | Mapped from
325 | Map includes child
326 | Included in map from
327 | Map excludes child
328 | Excluded in map from
345 | UCUM replaced by
346 | UCUM replaces
347 | Concept replaced by
348 | Concept replaces
349 | Concept same_as to
350 | Concept same_as from
351 | Concept alt_to to
352 | Concept alt_to from
353 | Concept poss_eq to
354 | Concept poss_eq from
355 | Concept was_a to
356 | Concept was_a from
357 | SNOMED meas - HCPCS
358 | HCPCS - SNOMED meas
359 | Domain subsumes
360 | Is domain
2014/11/15 02:40 · cgreich
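
When relationship records from an older CDM version are carried over, the numeric relationship_id has to be translated with the table above. The following is a minimal sketch of such a translation. It includes only six rows of the table, and the names LEGACY_RELATIONSHIP_ID and convert_relationship_rows, as well as the concept ids in any input, are assumptions made for this example.

```python
# Excerpt of the conversion table above (old numeric id -> Version 5 id).
LEGACY_RELATIONSHIP_ID = {
    10: "Subsumes",
    144: "Is a",
    321: "Maps to",
    322: "Mapped from",
    347: "Concept replaced by",
    348: "Concept replaces",
}


def convert_relationship_rows(rows):
    """Translate (concept_id_1, concept_id_2, old_numeric_id) tuples.

    Raises KeyError for an id that is not in the lookup, so that unmapped
    relationship types are noticed instead of being silently dropped.
    """
    converted = []
    for concept_id_1, concept_id_2, legacy_id in rows:
        if legacy_id not in LEGACY_RELATIONSHIP_ID:
            raise KeyError(f"no Version 5 relationship_id known for {legacy_id}")
        converted.append(
            (concept_id_1, concept_id_2, LEGACY_RELATIONSHIP_ID[legacy_id])
        )
    return converted
```

Failing on unknown ids is a deliberate choice here: the numbering in the table has gaps (for example between 88 and 92), so a silent default could hide data problems.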

CONCEPT_SYNONYM table

The CONCEPT_SYNONYM table is used to store alternate names and descriptions for Concepts.

Field | Required | Type | Description
concept_id | Yes | integer | A foreign key to the Concept in the CONCEPT table.
concept_synonym_name | Yes | varchar(1000) | The alternative name for the Concept.
language_concept_id | Yes | integer | A foreign key to a Concept representing the language.

Conventions

  • The concept_synonym_name field contains a valid Synonym of a Concept, including the description stored in the concept_name itself. That is, each Concept has at least one Synonym in the CONCEPT_SYNONYM table. For example, for a SNOMED-CT Concept, if the fully specified name is stored as the concept_name of the CONCEPT table, then the Preferred Term and Synonyms associated with the Concept are stored in the CONCEPT_SYNONYM table.
  • Only Synonyms that are active and current are stored in the CONCEPT_SYNONYM table. Tracking synonym/description history and mapping of obsolete synonyms to current Concepts/Synonyms is out of scope for the Standard Vocabularies.
  • Currently, only English Synonyms are included.
2014/11/15 02:43 · cgreich

CONCEPT_ANCESTOR table

The CONCEPT_ANCESTOR table is designed to simplify observational analysis by providing the complete hierarchical relationships between Concepts. Only direct parent-child relationships between Concepts are stored in the CONCEPT_RELATIONSHIP table. To determine higher level ancestry connections, all individual direct relationships would have to be navigated at analysis time. The CONCEPT_ANCESTOR table includes records for all parent-child relationships, as well as grandparent-grandchild relationships and those of any other level of lineage. Using the CONCEPT_ANCESTOR table allows for querying for all descendants of a hierarchical concept. For example, drug ingredients and drug products are all descendants of a drug class ancestor.

This table is entirely derived from the CONCEPT, CONCEPT_RELATIONSHIP and RELATIONSHIP tables.

Field | Required | Type | Description
ancestor_concept_id | Yes | integer | A foreign key to the concept in the concept table for the higher-level concept that forms the ancestor in the relationship.
descendant_concept_id | Yes | integer | A foreign key to the concept in the concept table for the lower-level concept that forms the descendant in the relationship.
min_levels_of_separation | Yes | integer | The minimum separation in number of levels of hierarchy between ancestor and descendant concepts. This is an attribute that is used to simplify hierarchic analysis.
max_levels_of_separation | Yes | integer | The maximum separation in number of levels of hierarchy between ancestor and descendant concepts. This is an attribute that is used to simplify hierarchic analysis.

A path between two concepts can be characterized by the sequence of relationships that need to be traversed in order to reach a descendant concept from an ancestor concept.

For example, for concepts

descendant_concept_id | descendant_concept_name | ancestor_concept_id | ancestor_concept_name | min_levels_of_separation | max_levels_of_separation
313217 | Atrial fibrillation | 321588 | Heart disease | 3 | 4

the shortest path in CONCEPT_RELATIONSHIP is:

313217 Atrial fibrillation Is a 4226399 Fibrillation Is a 44784217 Cardiac arrhythmia Is a 321588 Heart disease

the longest:

313217 Atrial fibrillation Is a 4068155 Atrial arrhythmia Is a 4248028 Supraventricular arrhythmia Is a 44784217 Cardiac arrhythmia Is a 321588 Heart disease
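
To make the two separation values concrete, the following sketch walks the 'Is a' links of the example above and reports the shortest and longest path length. It is only an illustration: the dictionary IS_A and the function levels_of_separation are hypothetical names, the graph holds just the concepts shown above, and the process that actually builds CONCEPT_ANCESTOR may work differently.

```python
# Direct 'Is a' links (child -> parents), copied from the two paths shown above.
IS_A = {
    313217: [4226399, 4068155],  # Atrial fibrillation
    4226399: [44784217],         # Fibrillation
    4068155: [4248028],          # Atrial arrhythmia
    4248028: [44784217],         # Supraventricular arrhythmia
    44784217: [321588],          # Cardiac arrhythmia
    321588: [],                  # Heart disease
}


def levels_of_separation(descendant, ancestor):
    """Return (min, max) number of 'Is a' steps from descendant up to ancestor.

    Returns None when the ancestor cannot be reached. Assumes the hierarchy
    has no cycles.
    """
    if descendant == ancestor:
        return (0, 0)  # each concept is its own ancestor at distance 0
    found = []
    for parent in IS_A.get(descendant, []):
        sub = levels_of_separation(parent, ancestor)
        if sub is not None:
            found.append((sub[0] + 1, sub[1] + 1))
    if not found:
        return None
    return (min(f[0] for f in found), max(f[1] for f in found))


print(levels_of_separation(313217, 321588))  # (3, 4)
```

The result matches the min_levels_of_separation and max_levels_of_separation values in the example record.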

Conventions

  • Each concept is also recorded as an ancestor of itself.
  • Only valid and Standard Concepts participate in the CONCEPT_ANCESTOR table. It is not possible to find ancestors or descendants of deprecated or Source Concepts.
  • Usually, only Concepts of the same Domain are connected through records of the CONCEPT_ANCESTOR table, but there might be exceptions.
2014/11/15 03:16 · cgreich

SOURCE_TO_CONCEPT_MAP table

The source to concept map table is a legacy data structure within the OMOP Common Data Model, recommended for use in ETL processes to maintain local source codes which are not available as Concepts in the Standardized Vocabularies, and to establish mappings for each source code into a Standard Concept as target_concept_ids that can be used to populate the Common Data Model tables. The SOURCE_TO_CONCEPT_MAP table is no longer populated with content within the Standardized Vocabularies published to the OMOP community.

Field | Required | Type | Description
source_code | Yes | varchar(50) | The source code being translated into a Standard Concept.
source_concept_id | Yes | integer | A foreign key to the Source Concept that is being translated into a Standard Concept.
source_vocabulary_id | Yes | varchar(20) | A foreign key to the VOCABULARY table defining the vocabulary of the source code that is being translated to a Standard Concept.
source_code_description | No | varchar(255) | An optional description for the source code. This is included as a convenience to compare the description of the source code to the name of the concept.
target_concept_id | Yes | integer | A foreign key to the target Concept to which the source code is being mapped.
target_vocabulary_id | Yes | varchar(20) | A foreign key to the VOCABULARY table defining the vocabulary of the target Concept.
valid_start_date | Yes | date | The date when the mapping instance was first recorded.
valid_end_date | Yes | date | The date when the mapping instance became invalid because it was deleted or superseded (updated) by a new relationship. Default value is 31-Dec-2099.
invalid_reason | No | varchar(1) | Reason the mapping instance was invalidated. Possible values are D (deleted), U (replaced with an update) or NULL when valid_end_date has the default value.

Conventions

  • This table is no longer used to distribute mapping information between source codes and Standard Concepts for the Standard Vocabularies. Instead, the CONCEPT_RELATIONSHIP table is used for this purpose, using the relationship_id='Maps to'.
  • However, this table can still be used for the translation of local source codes into Standard Concepts.
  • Note: This table should not be used to translate source codes to Source Concepts. The source code of a Source Concept is captured in its concept_code field. If the source codes used in a given database do not follow correct formatting the ETL will have to perform this translation. For example, if ICD-9-CM codes are recorded without a dot the ETL will have to perform a lookup function that allows identifying the correct ICD-9-CM Source Concept (with the dot in the concept_code field).
  • The source_concept_id, or the combination of the fields source_code and source_vocabulary_id, uniquely identifies the source information. It is equivalent to the concept_id_1 field in the CONCEPT_RELATIONSHIP table.
  • If there is no source_concept_id available because the source codes are local and not supported by the Standard Vocabulary, the content of the field is 0 (zero, not null), encoding an undefined concept. However, local Source Concepts can be established (concept_id values above 2,000,000,000).
  • The source_code_description contains an optional description of the source code.
  • The target_concept_id contains the Concept the source code is mapped to. It is equivalent to the concept_id_2 field in the CONCEPT_RELATIONSHIP table.
  • The target_vocabulary_id field contains the vocabulary_id of the target concept. It is a duplication of the same information in the CONCEPT record of the Target Concept.
  • The fields valid_start_date, valid_end_date and invalid_reason are used to define the life cycle of the mapping information. Invalid mapping records should not be used for mapping information.
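
As a rough illustration of the conventions above, the sketch below translates a local source code into its target Concepts while skipping invalidated mapping records. The rows, the vocabulary name "LocalDx", the codes, and the function map_local_code are all made up for this example; a real ETL would query the SOURCE_TO_CONCEPT_MAP table instead of an in-memory list.

```python
from datetime import date

# Hypothetical SOURCE_TO_CONCEPT_MAP rows for local codes (illustrative values only).
SOURCE_TO_CONCEPT_MAP = [
    {"source_code": "AFIB", "source_vocabulary_id": "LocalDx",
     "target_concept_id": 313217,  # Atrial fibrillation
     "valid_start_date": date(1970, 1, 1),
     "valid_end_date": date(2099, 12, 31),
     "invalid_reason": None},
    {"source_code": "HD", "source_vocabulary_id": "LocalDx",
     "target_concept_id": 321588,  # Heart disease
     "valid_start_date": date(1970, 1, 1),
     "valid_end_date": date(2013, 6, 30),
     "invalid_reason": "D"},  # deleted mapping, must not be used
]


def map_local_code(source_code, source_vocabulary_id, rows=SOURCE_TO_CONCEPT_MAP):
    """Return the target_concept_ids of all valid mappings for a local code."""
    return [
        row["target_concept_id"]
        for row in rows
        if row["source_code"] == source_code
        and row["source_vocabulary_id"] == source_vocabulary_id
        and row["invalid_reason"] is None  # NULL means the mapping is still valid
    ]


print(map_local_code("AFIB", "LocalDx"))  # [313217]
print(map_local_code("HD", "LocalDx"))    # [] because the mapping was deleted
```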
2014/11/15 03:28 · cgreich

DRUG_STRENGTH table

This table had the field denominator_value added with version 5.0.1 (5-Apr-2016) of the OMOP CDM.


The DRUG_STRENGTH table contains structured content about the amount or concentration and associated units of a specific ingredient contained within a particular drug product. This table is supplemental information to support standardized analysis of drug utilization.

Field | Required | Type | Description
drug_concept_id | Yes | integer | A foreign key to the Concept in the CONCEPT table representing the identifier for Branded Drug or Clinical Drug Concept.
ingredient_concept_id | Yes | integer | A foreign key to the Concept in the CONCEPT table, representing the identifier for drug Ingredient Concept contained within the drug product.
amount_value | No | float | The numeric value associated with the amount of active ingredient contained within the product.
amount_unit_concept_id | No | integer | A foreign key to the Concept in the CONCEPT table representing the identifier for the Unit for the absolute amount of active ingredient.
numerator_value | No | float | The numeric value associated with the concentration of the active ingredient contained in the product.
numerator_unit_concept_id | No | integer | A foreign key to the Concept in the CONCEPT table representing the identifier for the numerator Unit for the concentration of active ingredient.
denominator_value (V5.0.1) | No | float | The amount of total liquid (or other divisible product, such as ointment, gel, spray, etc.).
denominator_unit_concept_id | No | integer | A foreign key to the Concept in the CONCEPT table representing the identifier for the denominator Unit for the concentration of active ingredient.
valid_start_date | Yes | date | The date when the Concept was first recorded. The default value is 1-Jan-1970.
valid_end_date | Yes | date | The date when the concept became invalid because it was deleted or superseded (updated) by a new Concept. The default value is 31-Dec-2099.
invalid_reason | No | varchar(1) | Reason the concept was invalidated. Possible values are 'D' (deleted), 'U' (replaced with an update) or NULL when valid_end_date has the default value.

Conventions

  • The DRUG_STRENGTH table contains information for each active (non-deprecated) standard drug concept.
  • A drug which contains multiple active Ingredients will result in multiple DRUG_STRENGTH records, one for each active ingredient.
  • Ingredient strength information is provided either as absolute amount (usually for solid formulations) or as concentration (usually for liquid formulations).
  • If the absolute amount is provided (for example, 'Acetaminophen 5 MG Tablet') the amount_value and amount_unit_concept_id are used to define this content (in this case 5 and 'MG').
  • If the concentration is provided (for example 'Acetaminophen 48 MG/ML Oral Solution') the numerator_value in combination with the numerator_unit_concept_id and denominator_unit_concept_id are used to define this content (in this case 48, 'MG' and 'ML').
  • In case of Quantified Clinical or Branded Drugs the denominator_value contains the total amount of the solution (not the amount of the ingredient). In all other drug concept classes the denominator amount is NULL because the concentration is always normalized to the unit of the denominator. So, a product containing 960 mg in 20 mL is provided as 48 mg/mL in the Clinical Drug and Clinical Drug Component, while as a Quantified Clinical Drug it is written as 960 mg/20 mL.
  • If the strength is provided in % (volume and mass percent are not distinguished), it is stored in the numerator_value/numerator_unit_concept_id field combination, with both the denominator_value and denominator_unit_concept_id set to NULL. If it is a Quantified Drug, the total amount of drug is provided in the denominator_value/denominator_unit_concept_id pair. E.g., the 30 G Isoconazole 2% Topical Cream is provided as 2% / (no denominator) in Clinical Drug and Clinical Drug Component, and as 2% / 30 G in the Quantified Drug.
  • Sometimes, one Ingredient is listed with different units within the same drug. This is very rare, and usually happens if there is more than one Precise Ingredient. For example, 'Penicillin G, Benzathine 150000 UNT/ML / Penicillin G, Procaine 150000 MEQ/ML Injectable Suspension' contains Penicillin G in two different forms.
  • Sometimes, different ingredients in liquid drugs are listed with different units in the denominator_unit_concept_id. This is usually the case if the ingredients are liquids themselves (concentration provided as mL/mL) or solid substances (mg/mg). In these cases, the general assumption is made that the density of the drug is that of water, so that 1 g = 1 mL.
  • All Drug vocabularies containing Standard Concepts have entries in the DRUG_STRENGTH table.
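
The relationship between the quantified and the normalized representation of a concentration can be sketched as follows. The function normalized_concentration is a hypothetical helper, not part of the CDM; it only reproduces the arithmetic of the 960 mg / 20 mL example above and ignores units.

```python
def normalized_concentration(numerator_value, denominator_value=None):
    """Return the amount of ingredient per single denominator unit.

    A NULL (None) denominator_value means the record is already normalized,
    as is the case for Clinical Drugs and Clinical Drug Components.
    """
    if denominator_value is None:
        return numerator_value
    if denominator_value <= 0:
        raise ValueError("denominator_value must be positive")
    return numerator_value / denominator_value


# Quantified Clinical Drug: 960 mg in 20 mL
print(normalized_concentration(960, 20))  # 48.0
# Clinical Drug: already stored as 48 mg/mL, denominator_value is NULL
print(normalized_concentration(48))       # 48
```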
2014/11/15 13:18 · cgreich

COHORT_DEFINITION table

The COHORT_DEFINITION table contains records that define a Cohort derived from the data through an associated description and syntax. Upon instantiation (execution of the algorithm), the resulting Cohort is placed into the COHORT table. A Cohort is a set of subjects that satisfy a given combination of inclusion criteria for a duration of time. The COHORT_DEFINITION table provides a standardized structure for maintaining the rules governing the inclusion of a subject into a cohort, and can store operational programming code to instantiate the cohort within the OMOP Common Data Model.

Field | Required | Type | Description
cohort_definition_id | Yes | integer | A unique identifier for each Cohort.
cohort_definition_name | Yes | varchar(255) | A short description of the Cohort.
cohort_definition_description | No | CLOB | A complete description of the Cohort definition.
definition_type_concept_id | Yes | integer | Type defining what kind of Cohort Definition the record represents and how the syntax may be executed.
cohort_definition_syntax | No | CLOB | Syntax or code to operationalize the Cohort definition.
subject_concept_id | Yes | integer | A foreign key to the Concept which defines the domain of subjects that are members of the cohort (e.g., Person, Provider, Visit).
cohort_instantiation_date | No | date | A date to indicate when the Cohort was instantiated in the COHORT table.

Conventions

  • The cohort_definition_syntax does not prescribe any specific syntax or programming language. Typically, it would be any flavor of SQL, a cohort definition language, or a free-text description of the algorithm.
  • The subject_concept_id determines what the individual subjects or entities of the Cohort consist of. In most cases, that would be a Person (patient). But cohorts could also be constructed for Providers, Visits or any other Domain. Note that the Domain is not codified using the alphanumerical domain_id as in the CONCEPT table. Instead, the corresponding Concept is used. The Concepts for each domain can be obtained from the domain_concept_id field of the DOMAIN table.
2014/11/17 18:40 · cgreich

ATTRIBUTE_DEFINITION table

The ATTRIBUTE_DEFINITION table contains records that define Attributes, or covariates, of the members of a Cohort through an associated description and syntax. Upon instantiation (execution of the algorithm), the resulting values are placed into the COHORT_ATTRIBUTE table. Attributes are derived elements that can be selected or calculated for a subject in a Cohort. The ATTRIBUTE_DEFINITION table provides a standardized structure for maintaining the rules governing the calculation of covariates for a subject in a Cohort, and can store operational programming code to instantiate the Attributes for a given Cohort within the OMOP Common Data Model.

Field | Required | Type | Description
attribute_definition_id | Yes | integer | A unique identifier for each Attribute.
attribute_name | Yes | varchar(255) | A short description of the Attribute.
attribute_description | No | CLOB | A complete description of the Attribute definition.
attribute_type_concept_id | Yes | integer | Type defining what kind of Attribute Definition the record represents and how the syntax may be executed.
attribute_syntax | No | CLOB | Syntax or code to operationalize the Attribute definition.

Conventions

  • Like the cohort_definition_syntax field of the COHORT_DEFINITION table, the attribute_syntax field does not prescribe any specific syntax or programming language. Typically, it would be any flavor of SQL, a cohort definition language, or a free-text description of the algorithm.
  • The Attribute Definition is generic and not necessarily related to a specific Cohort Definition; however, the instantiated Attribute is linked to the Cohort records (see the COHORT table below). For example, the Attribute “Age” can be defined as the amount of time between the cohort_start_date of the COHORT table and the year_of_birth, month_of_birth and day_of_birth of the PERSON table. Thus, such an Attribute Definition can be applied and instantiated with any Cohort, as long as it is applied to a Cohort of the same Domain (Person in this case), as defined in the subject_concept_id of the COHORT_DEFINITION table.
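
The “Age” Attribute described above could, for instance, be operationalized along the following lines. This is a sketch rather than a prescribed definition: the function name age_at_cohort_start is hypothetical, and defaulting a missing month_of_birth or day_of_birth to 1 is an assumption made here for illustration only.

```python
from datetime import date


def age_at_cohort_start(cohort_start_date, year_of_birth,
                        month_of_birth=None, day_of_birth=None):
    """Return the age in completed years at cohort_start_date."""
    # Assumption: a missing month or day of birth is treated as 1.
    birth = date(year_of_birth, month_of_birth or 1, day_of_birth or 1)
    years = cohort_start_date.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the cohort start year.
    if (cohort_start_date.month, cohort_start_date.day) < (birth.month, birth.day):
        years -= 1
    return years


print(age_at_cohort_start(date(2014, 11, 17), 1980, 12, 1))  # 33
print(age_at_cohort_start(date(2014, 11, 17), 1980))         # 34
```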
2014/11/17 18:44 · cgreich

CDM_SOURCE table

The CDM_SOURCE table contains detail about the source database and the process used to transform the data into the OMOP Common Data Model.

Field | Required | Type | Description
cdm_source_name | Yes | varchar(255) | The full name of the source.
cdm_source_abbreviation | No | varchar(25) | An abbreviation of the name.
cdm_holder | No | varchar(255) | The name of the organization responsible for the development of the CDM instance.
source_description | No | CLOB | A description of the source data origin and purpose for collection. The description may contain a summary of the period of time that is expected to be covered by this dataset.
source_documentation_reference | No | varchar(255) | URL or other external reference to location of source documentation.
cdm_etl_reference | No | varchar(255) | URL or other external reference to location of ETL specification documentation and ETL source code.
source_release_date | No | date | The date for which the source data are most current, such as the last day of data capture.
cdm_release_date | No | date | The date when the CDM was instantiated.
cdm_version | No | varchar(10) | The version of CDM used.
vocabulary_version | No | varchar(20) | The version of the vocabulary used.

Conventions

  • If a source database is derived from multiple data feeds, the integration of those disparate sources is expected to be documented in the ETL specifications. The source information on each of the databases can be represented as separate records in the CDM_SOURCE table.
  • Currently, there is no mechanism to link individual records in the CDM tables to their source record in the CDM_SOURCE table.
  • The version of the vocabulary can be obtained from the vocabulary_name field in the VOCABULARY table for the record where vocabulary_id='None'.
2014/11/17 19:24 · cgreich
 

PERSON table

This table changed in version 5.1 of the OMOP CDM. The name of the field time_of_birth was changed to birth_datetime.


The Person Domain contains records that uniquely identify each patient in the source data who has time at risk of having clinical observations recorded within the source systems.

Field | Required | Type | Description
person_id | Yes | integer | A unique identifier for each person.
gender_concept_id | Yes | integer | A foreign key that refers to an identifier in the CONCEPT table for the unique gender of the person.
year_of_birth | Yes | integer | The year of birth of the person. For data sources with date of birth, the year is extracted. For data sources where the year of birth is not available, the approximate year of birth is derived based on any age group categorization available.
month_of_birth | No | integer | The month of birth of the person. For data sources that provide the precise date of birth, the month is extracted and stored in this field.
day_of_birth | No | integer | The day of the month of birth of the person. For data sources that provide the precise date of birth, the day is extracted and stored in this field.
birth_datetime | No | datetime | The date and time of birth of the person.
race_concept_id | Yes | integer | A foreign key that refers to an identifier in the CONCEPT table for the unique race of the person.
ethnicity_concept_id | Yes | integer | A foreign key that refers to the standard concept identifier in the Standardized Vocabularies for the ethnicity of the person.
location_id | No | integer | A foreign key to the place of residency for the person in the location table, where the detailed address information is stored.
provider_id | No | integer | A foreign key to the primary care provider the person is seeing in the provider table.
care_site_id | No | integer | A foreign key to the site of primary care in the care_site table, where the details of the care site are stored.
person_source_value | No | varchar(50) | An (encrypted) key derived from the person identifier in the source data. This is necessary when a use case requires a link back to the person data at the source dataset.
gender_source_value | No | varchar(50) | The source code for the gender of the person as it appears in the source data. The person's gender is mapped to a standard gender concept in the Standardized Vocabularies; the original value is stored here for reference.
gender_source_concept_id | No | integer | A foreign key to the gender concept that refers to the code used in the source.
race_source_value | No | varchar(50) | The source code for the race of the person as it appears in the source data. The person's race is mapped to a standard race concept in the Standardized Vocabularies; the original value is stored here for reference.
race_source_concept_id | No | integer | A foreign key to the race concept that refers to the code used in the source.
ethnicity_source_value | No | varchar(50) | The source code for the ethnicity of the person as it appears in the source data. The person's ethnicity is mapped to a standard ethnicity concept in the Standardized Vocabularies; the original code is stored here for reference.
ethnicity_source_concept_id | No | integer | A foreign key to the ethnicity concept that refers to the code used in the source.

Conventions

  • All tables representing patient-related Domains have a foreign-key reference to the person_id field in the PERSON table.
  • Each person record has associated demographic attributes which are assumed to be constant for the patient throughout the course of their periods of observation. For example, the location or gender is expected to have a unique value per person, even though in life these data may change over time.
  • Valid Gender, Race and Ethnicity Concepts each belong to their own Domain.
  • Ethnicity in the OMOP CDM follows the OMB Standards for Data on Race and Ethnicity: Only distinctions between Hispanics and Non-Hispanics are made.
  • Additional information is stored through references to other tables, such as the home address (location_id) or the primary care provider.
  • The Provider refers to the primary care provider (General Practitioner).
  • The Care Site refers to where the Provider typically provides the primary care.
2014/11/17 19:44 · cgreich

OBSERVATION_PERIOD table

The OBSERVATION_PERIOD table contains records which uniquely define the spans of time for which a Person is at-risk to have clinical events recorded within the source systems, even if no events in fact are recorded (healthy patient with no healthcare interactions).

Field | Required | Type | Description
observation_period_id | Yes | integer | A unique identifier for each observation period.
person_id | Yes | integer | A foreign key identifier to the person for whom the observation period is defined. The demographic details of that person are stored in the person table.
observation_period_start_date | Yes | date | The start date of the observation period for which data are available from the data source.
observation_period_end_date | Yes | date | The end date of the observation period for which data are available from the data source.
period_type_concept_id | Yes | integer | A foreign key identifier to the predefined concept in the Standardized Vocabularies reflecting the source of the observation period information.

Conventions

  • One Person may have one or more disjoint observation periods, during which times analyses may assume that clinical events would be captured if observed, and outside of which no clinical events may be recorded.
  • Each Person can have more than one valid OBSERVATION_PERIOD record, but no two observation periods can overlap in time for a given person.
  • As a general assumption, during an Observation Period any clinical event that happens to the patient is expected to be recorded. Conversely, the absence of data indicates that no clinical events occurred to the patient.
  • No clinical data are valid outside an active Observation Period. Clinical data that refer to a time outside of an active Observation Period (diagnoses of previous conditions such as “Old MI”, or medical history) are recorded as Observations. The date of the Observation is the first day of the first Observation Period of a patient.
  • For claims data, observation periods are inferred from the enrollment periods to a health benefit plan.
  • For EHR data, the observation period cannot be determined explicitly, because patients usually do not announce their departure from a certain healthcare provider. The ETL will have to apply some heuristic to make a reasonable guess on what the observation_period should be. Refer to the ETL documentation for details.
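
Because no two observation periods of a Person may overlap, an ETL that derives them from enrollment spans typically has to merge overlapping spans first. The sketch below shows one possible way to do this; merge_periods is a hypothetical helper, and whether directly adjacent spans should also be merged is an ETL decision that this example leaves out.

```python
from datetime import date


def merge_periods(spans):
    """Merge overlapping (start_date, end_date) spans into disjoint periods."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous period: extend it if this span ends later.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


enrollment = [
    (date(2010, 6, 1), date(2011, 5, 31)),
    (date(2010, 1, 1), date(2010, 12, 31)),
    (date(2012, 1, 1), date(2012, 12, 31)),
]
for start, end in merge_periods(enrollment):
    print(start, end)
# 2010-01-01 2011-05-31
# 2012-01-01 2012-12-31
```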
2014/11/17 20:02 · cgreich

SPECIMEN

This table changed in version 5.1 of the OMOP CDM. The field specimen_datetime was added.


The specimen domain contains the records identifying biological samples from a person.

Field | Required | Type | Description
specimen_id | Yes | integer | A unique identifier for each specimen.
person_id | Yes | integer | A foreign key identifier to the Person for whom the Specimen is recorded.
specimen_concept_id | Yes | integer | A foreign key referring to a Standard Concept identifier in the Standardized Vocabularies for the Specimen.
specimen_type_concept_id | Yes | integer | A foreign key referring to the Concept identifier in the Standardized Vocabularies reflecting the system of record from which the Specimen was represented in the source data.
specimen_date | Yes | date | The date the specimen was obtained from the Person.
specimen_datetime | No | datetime | The date and time when the Specimen was obtained from the Person.
quantity | No | float | The amount of specimen collected from the person during the sampling procedure.
unit_concept_id | No | integer | A foreign key to a Standard Concept identifier for the Unit associated with the numeric quantity of the Specimen collection.
anatomic_site_concept_id | No | integer | A foreign key to a Standard Concept identifier for the anatomic location of specimen collection.
disease_status_concept_id | No | integer | A foreign key to a Standard Concept identifier for the Disease Status of specimen collection.
specimen_source_id | No | varchar(50) | The Specimen identifier as it appears in the source data.
specimen_source_value | No | varchar(50) | The Specimen value as it appears in the source data. This value is mapped to a Standard Concept in the Standardized Vocabularies; the original code is stored here for reference.
unit_source_value | No | varchar(50) | The information about the Unit as detailed in the source.
anatomic_site_source_value | No | varchar(50) | The information about the anatomic site as detailed in the source.
disease_status_source_value | No | varchar(50) | The information about the disease status as detailed in the source.

Conventions

  • Anatomic site is coded at the most specific level of granularity possible, such that higher level classifications can be derived using the Standardized Vocabularies.
2014/11/19 22:38 · cgreich

DEATH table

This table changed in version 5.1 of the OMOP CDM. The field death_datetime was added.


The death domain contains the clinical event for how and when a Person dies. A person can have up to one record if the source system contains evidence about the Death, such as:

  • Condition Code in the Header or Detail information of claims
  • Status of enrollment into a health plan
  • Explicit record in EHR data

Field | Required | Type | Description
person_id | Yes | integer | A foreign key identifier to the deceased person. The demographic details of that person are stored in the person table.
death_date | Yes | date | The date the person was deceased. If the precise date including day or month is not known or not allowed, December is used as the default month, and the last day of the month as the default day.
death_datetime | No | datetime | The date and time the person was deceased. If the precise date including day or month is not known or not allowed, December is used as the default month, and the last day of the month as the default day.
death_type_concept_id | Yes | integer | A foreign key referring to the predefined concept identifier in the Standardized Vocabularies reflecting how the death was represented in the source data.
cause_concept_id | No | integer | A foreign key referring to a standard concept identifier in the Standardized Vocabularies for conditions.
cause_source_value | No | varchar(50) | The source code for the cause of death as it appears in the source data. This code is mapped to a standard concept in the Standardized Vocabularies; the original code is stored here for reference.
cause_source_concept_id | No | integer | A foreign key to the concept that refers to the code used in the source. Note, this variable name is abbreviated to ensure it will be allowable across database platforms.

Conventions

  • Living patients should not have any records in the DEATH table.
  • Each Person may have more than one record of death in the source data. It is the task of the ETL to pick the most plausible or most accurate records to be aggregated and stored as a single record in the DEATH table.
  • If the Death Date cannot be precisely determined from the data, the best approximation should be used.
  • Valid Concepts for the cause_concept_id have domain_id='Condition'.
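
The default rule for incomplete death dates given in the field description above (December as the default month, the last day of the month as the default day) can be sketched as follows. The function name impute_death_date is hypothetical, and the sketch assumes that at least the year of death is known.

```python
import calendar
from datetime import date


def impute_death_date(year, month=None, day=None):
    """Fill in a missing month with December and a missing day with the month's last day."""
    if month is None:
        month = 12
    if day is None:
        # monthrange returns (weekday of first day, number of days in month)
        day = calendar.monthrange(year, month)[1]
    return date(year, month, day)


print(impute_death_date(2012))         # 2012-12-31
print(impute_death_date(2012, 2))      # 2012-02-29
print(impute_death_date(2012, 2, 10))  # 2012-02-10
```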
2014/11/19 23:22 · cgreich

VISIT_OCCURRENCE table

This table changed in version 5.1 of the OMOP CDM. The fields visit_start_datetime and visit_end_datetime were added. For 5.0.1, the fields admitting_source_concept_id, discharge_to_concept_id, admitting_source_value and discharge_to_source_value were added.


The VISIT_OCCURRENCE table contains the spans of time a Person continuously receives medical services from one or more providers at a Care Site in a given setting within the health care system. Visits are classified into 4 settings: outpatient care, inpatient confinement, emergency room, and long-term care. Persons may transition between these settings over the course of an episode of care (for example, treatment of a disease onset).

Field | Required | Type | Description
visit_occurrence_id | Yes | integer | A unique identifier for each Person's visit or encounter at a healthcare provider.
person_id | Yes | integer | A foreign key identifier to the Person for whom the visit is recorded. The demographic details of that Person are stored in the PERSON table.
visit_concept_id | Yes | integer | A foreign key that refers to a visit Concept identifier in the Standardized Vocabularies.
visit_start_date | Yes | date | The start date of the visit.
visit_start_datetime | No | datetime | The date and time the visit started.
visit_end_date | Yes | date | The end date of the visit. If this is a one-day visit the end date should match the start date.
visit_end_datetime | No | datetime | The date and time the visit ended.
visit_type_concept_id | Yes | integer | A foreign key to the predefined Concept identifier in the Standardized Vocabularies reflecting the type of source data from which the visit record is derived.
provider_id | No | integer | A foreign key to the provider in the provider table who was associated with the visit.
care_site_id | No | integer | A foreign key to the care site in the care site table that was visited.
admitting_source_concept_id | No | integer | A foreign key to the predefined concept in the Place of Service Vocabulary reflecting the admitting source for a visit.
discharge_to_concept_id | No | integer | A foreign key to the predefined concept in the Place of Service Vocabulary reflecting the discharge disposition (destination) for a visit.
preceding_visit_occurrence_id | No | integer | A foreign key to the VISIT_OCCURRENCE table record of the visit immediately preceding this visit.
visit_source_value | No | string(50) | The source code for the visit as it appears in the source data.
visit_source_concept_id | No | integer | A foreign key to a Concept that refers to the code used in the source.
admitting_source_value | No | string(50) | The source code for the admitting source as it appears in the source data.
discharge_to_source_value | No | string(50) | The source code for the discharge disposition as it appears in the source data.

Conventions

  • A Visit Occurrence is recorded for each visit to a healthcare facility.
  • Valid Visit Concepts belong to the “Visit” domain.
  • Standard Visit Concepts are defined as Inpatient Visit, Outpatient Visit, Emergency Room Visit, Long Term Care Visit and combined ER and Inpatient Visit. The latter is necessary because many EHR systems treat the two interchangeably, which makes them close to impossible to separate. To annotate this correctly, the visit concept “Emergency Room and Inpatient Visit” (concept_id=262) should be used.
  • Handling of death: When a patient died during admission (VISIT_OCCURRENCE.discharge_to_concept_id = 4216643 ‘Patient died’), a record in the DEATH table should be created with death_type_concept_id = 44818516 (EHR discharge status “Expired”).
  • Source Concepts from place of service vocabularies are mapped into these standard visit Concepts in the Standardized Vocabularies.
  • At any one day, there could be more than one visit.
  • One visit may involve multiple providers, in which case the ETL must specify how a single provider id is selected or leave the provider_id field null.
  • One visit may involve multiple Care Sites, in which case the ETL must specify how a single care_site id is selected or leave the care_site_id field null.
  • Visits are recorded in various data sources in different forms with varying levels of standardization. For example:
    • Medical Claims include Inpatient Admissions, Outpatient Services, and Emergency Room visits.
    • Electronic Health Records may capture Person visits as part of the activities recorded, depending on whether the EHR system is used at the different Care Sites.
  • Sequential relationships between Visits within an episode of care are represented through chaining them in the preceding_visit_occurrence_id.
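Two of the conventions above, the handling of death and the chaining of sequential visits, can be sketched in a few lines of Python. This is only an illustration, not part of the CDM specification. It assumes visits are held as plain dictionaries keyed by the field names in the table above, that all visits passed to `chain_visits` belong to one Person and one episode of care, and that the visit end date is an acceptable stand-in for the date of death. The function names and the sample identifiers are hypothetical.

```python
from datetime import date
from typing import Optional

PATIENT_DIED_CONCEPT_ID = 4216643   # 'Patient died', as given in the conventions
EHR_EXPIRED_DEATH_TYPE = 44818516   # EHR discharge status "Expired"


def death_record_from_visit(visit: dict) -> Optional[dict]:
    """Return a DEATH row for a visit that ended in death, otherwise None."""
    if visit.get("discharge_to_concept_id") != PATIENT_DIED_CONCEPT_ID:
        return None
    return {
        "person_id": visit["person_id"],
        # Assumption: the visit end date is used as the death date.
        "death_date": visit["visit_end_date"],
        "death_type_concept_id": EHR_EXPIRED_DEATH_TYPE,
    }


def chain_visits(visits: list) -> list:
    """Return copies of the visits, ordered by start date, each pointing
    to the visit immediately before it (None for the first one)."""
    ordered = sorted(
        visits, key=lambda v: (v["visit_start_date"], v["visit_occurrence_id"])
    )
    chained = []
    previous_id = None
    for visit in ordered:
        chained.append({**visit, "preceding_visit_occurrence_id": previous_id})
        previous_id = visit["visit_occurrence_id"]
    return chained


# Hypothetical sample data: an ER visit followed by an inpatient stay.
visits = [
    {"visit_occurrence_id": 10, "person_id": 42,
     "visit_start_date": date(2014, 3, 2), "visit_end_date": date(2014, 3, 9),
     "discharge_to_concept_id": PATIENT_DIED_CONCEPT_ID},
    {"visit_occurrence_id": 11, "person_id": 42,
     "visit_start_date": date(2014, 3, 1), "visit_end_date": date(2014, 3, 2),
     "discharge_to_concept_id": None},
]

chained = chain_visits(visits)
deaths = [d for d in map(death_record_from_visit, chained) if d is not None]
```

Which date best approximates the death date, and which visits form one episode of care, are ETL decisions that depend on the source data.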
2014/11/19 23:50 · cgreich

PROCEDURE_OCCURRENCE table

This table changed in version 5.1 of the OMOP CDM. The field procedure_datetime was added.


The PROCEDURE_OCCURRENCE table contains records of activities or processes ordered by, or carried out by, a healthcare provider on the patient for a diagnostic or therapeutic purpose. Procedures are present in various data sources in different forms with varying levels of standardization. For example:

  • Medical Claims include procedure codes that are submitted as part of a claim for health services rendered, including procedures performed.
  • Electronic Health Records capture procedures as orders.

Field | Required | Type | Description
procedure_occurrence_id | Yes | integer | A system-generated unique identifier for each Procedure Occurrence.
person_id | Yes | integer | A foreign key identifier to the Person who is subjected to the Procedure. The demographic details of that Person are stored in the PERSON table.
procedure_concept_id | Yes | integer | A foreign key that refers to a standard procedure Concept identifier in the Standardized Vocabularies.
procedure_date | Yes | date | The date on which the Procedure was performed.
procedure_datetime | No | datetime | The date and time at which the Procedure was performed.
procedure_type_concept_id | Yes | integer | A foreign key to the predefined Concept identifier in the Standardized Vocabularies reflecting the type of source data from which the procedure record is derived.
modifier_concept_id | No | integer | A foreign key to a Standard Concept identifier for a modifier to the Procedure (e.g. bilateral).
quantity | No | integer | The quantity of procedures ordered or administered.
provider_id | No | integer | A foreign key to the provider in the PROVIDER table who was responsible for carrying out the procedure.
visit_occurrence_id | No | integer | A foreign key to the visit in the VISIT_OCCURRENCE table during which the Procedure was carried out.
procedure_source_value | No | varchar(50) | The source code for the Procedure as it appears in the source data. This code is mapped to a standard procedure Concept in the Standardized Vocabularies and the original code is stored here for reference. Procedure source codes are typically ICD-9-Proc, CPT-4, HCPCS or OPCS-4 codes.
procedure_source_concept_id | No | integer | A foreign key to a Procedure Concept that refers to the code used in the source.
qualifier_source_value | No | varchar(50) | The source code for the qualifier as it appears in the source data.

Conventions

  • Valid Procedure Concepts belong to the “Procedure” domain. Procedure Concepts are based on a variety of vocabularies: SNOMED-CT, ICD-9-Proc, CPT-4, HCPCS and OPCS-4, but also atypical Vocabularies such as ICD-9-CM or MedDRA.
  • Procedures are expected to be carried out within one day and therefore have no end date.
  • Procedures could involve the application of a drug, in which case the procedural component is recorded in the procedure table and simultaneously the administered drug in the drug exposure table when both the procedural component and drug are identifiable.
  • If the quantity value is omitted, a single procedure is assumed.
  • The Procedure Type defines from where the Procedure Occurrence is drawn or inferred. For administrative claims records, the type indicates whether a Procedure was primary or secondary and its relative position within a claim.
  • The Visit during which the procedure was performed is recorded through a reference to the VISIT_OCCURRENCE table. This information is not always available.
  • The Provider carrying out the procedure is recorded through a reference to the PROVIDER table. This information is not always available.
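The quantity convention above matters when counting procedures: a missing quantity stands for one procedure, not zero. A minimal sketch, assuming rows are plain dictionaries keyed by the field names in the table above (the helper name `effective_quantity` and the sample rows are hypothetical):

```python
def effective_quantity(procedure: dict) -> int:
    """Return the recorded quantity, or 1 when the quantity is omitted."""
    quantity = procedure.get("quantity")
    return 1 if quantity is None else quantity


# Hypothetical sample rows: quantity is null, given, or absent altogether.
procedures = [
    {"procedure_occurrence_id": 1, "quantity": None},
    {"procedure_occurrence_id": 2, "quantity": 3},
    {"procedure_occurrence_id": 3},
]

total_procedures = sum(effective_quantity(p) for p in procedures)
```

Summing the raw quantity column instead would undercount every record where the source did not state a quantity.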
2014/11/20 00:08 · cgreich

DRUG_EXPOSURE table

This table changed in version 5.1 of the OMOP CDM. The fields drug_exposure_start_datetime and drug_exposure_end_datetime were added.


The drug exposure domain captures records about the utilization of a Drug when ingested or otherwise introduced into the body. A Drug is a biochemical substance formulated in such a way that when administered to a Person it will exert a certain physiological effect. Drugs include prescription and over-the-counter medicines, vaccines, and large-molecule biologic therapies. Radiological devices ingested or applied locally do not count as Drugs.

Drug Exposure is inferred from clinical events associated with orders, prescriptions written, pharmacy dispensings, procedural administrations, and other patient-reported information, for example:

  • The “Prescription” section of an EHR captures prescriptions written by physicians or from electronic ordering systems
  • The “Medication list” section of an EHR for both non-prescription products and medications prescribed by other providers
  • Prescriptions filled at dispensing providers such as pharmacies, and then captured in reimbursement claim systems
  • Drugs administered as part of a Procedure, such as chemotherapy or vaccines.

Field | Required | Type | Description
drug_exposure_id | Yes | integer | A system-generated unique identifier for each Drug utilization event.
person_id | Yes | integer | A foreign key identifier to the Person who is subjected to the Drug. The demographic details of that Person are stored in the PERSON table.
drug_concept_id | Yes | integer | A foreign key that refers to a Standard Concept identifier in the Standardized Vocabularies for the Drug concept.
drug_exposure_start_date | Yes | date | The start date for the current instance of Drug utilization. Valid entries include a start date of a prescription, the date a prescription was filled, or the date on which a Drug administration procedure was recorded.
drug_exposure_start_datetime | No | datetime | The start date and time for the current instance of Drug utilization. Valid entries include a start date of a prescription, the date a prescription was filled, or the date on which a Drug administration procedure was recorded.
drug_exposure_end_date | No | date | The end date for the current instance of Drug utilization. It is not available from all sources.
drug_exposure_end_datetime | No | datetime | The end date and time for the current instance of Drug utilization. It is not available from all sources.
drug_type_concept_id | Yes | integer | A foreign key to the predefined Concept identifier in the Standardized Vocabularies reflecting the type of Drug Exposure recorded. It indicates how the Drug Exposure was represented in the source data.
stop_reason | No | varchar(20) | The reason the Drug was stopped. Reasons include regimen completed, changed, removed, etc.
refills | No | integer | The number of refills after the initial prescription. The initial prescription is not counted; values start with 0.
quantity | No | float | The quantity of drug as recorded in the original prescription or dispensing record.
days_supply | No | integer | The number of days of supply of the medication as recorded in the original prescription or dispensing record.
sig | No | clob | The directions (“signetur”) on the Drug prescription as recorded in the original prescription (and printed on the container) or dispensing record.
route_concept_id | No | integer | A foreign key to a predefined concept in the Standardized Vocabularies reflecting the route of administration.
effective_drug_dose | No | float | Numerical value of Drug dose for this Drug Exposure record.
dose_unit_concept_id | No | integer | A foreign key to a predefined concept in the Standardized Vocabularies reflecting the unit in which the effective_drug_dose value is expressed.
lot_number | No | varchar(50) | An identifier assigned to a particular quantity or lot of Drug product from the manufacturer.
provider_id | No | integer | A foreign key to the provider in the PROVIDER table who initiated (prescribed or administered) the Drug Exposure.
visit_occurrence_id | No | integer | A foreign key to the visit in the VISIT_OCCURRENCE table during which the Drug Exposure was initiated.
drug_source_value | No | varchar(50) | The source code for the Drug as it appears in the source data. This code is mapped to a Standard Drug concept in the Standardized Vocabularies and the original code is stored here for reference.
drug_source_concept_id | No | integer | A foreign key to a Drug Concept that refers to the code used in the source.
route_source_value | No | varchar(50) | The information about the route of administration as detailed in the source.
dose_unit_source_value | No | varchar(50) | The information about the dose unit as detailed in the source.

Conventions

  • Valid Concepts for the drug_concept_id field belong to the “Drug” domain. Most Concepts in the Drug domain are based on RxNorm, but some may come from other sources. Concepts are members of the Clinical Drug or Pack, Branded Drug or Pack, Drug Component or Ingredient classes.
  • Source drug identifiers, including NDC codes, Generic Product Identifiers, etc. are mapped to Standard Drug Concepts in the Standardized Vocabularies (e.g., based on RxNorm). When the Drug Source Value of the code cannot be translated into standard Drug Concept IDs, a Drug exposure entry is stored with only the corresponding source_concept_id and drug_source_value and a drug_concept_id of 0.
  • The Drug Concept with the most detailed content of information is preferred during the mapping process. These are indicated in the concept_class_id field of the Concept and are recorded in the following order of precedence: “Marketed Product”, “Branded Pack”, “Clinical Pack”, “Branded Drug”, “Clinical Drug”, “Branded Drug Component”, “Clinical Drug Component”, “Branded Drug Form”, “Clinical Drug Form”, and only if no other information is available “Ingredient”. Note: If only the drug class (e.g. “Diuretic”, “NSAID”) is known, the drug_concept_id should contain 0.
  • A Drug Type is assigned to each Drug Exposure to track from what source the information was drawn or inferred. The valid domain_id for these Concepts is “Drug Type”.
  • The content of the refills field determines the current number of refills, not the number of remaining refills. For example, for a drug prescription with 2 refills, the content of this field for the 3 Drug Exposure events is null, 1 and 2.
  • The route_concept_id refers to a Standard Concept of the “Route” domain. Note: Route information can also be inferred from the Drug product itself by determining the Drug Form of the Concept, creating some partial overlap of the same type of information. However, the route_concept_id could resolve ambiguities of how a certain Drug Form is actually applied. For example, a “Solution” could be used orally or parenterally, and this field will make this determination.
  • The Effective Drug Dose and the Dose Unit Concepts are provided in cases when dose information is explicitly provided, as it is typically for pediatric and chemotherapeutic treatments. The domain_id for the Dose Unit Concept is “Unit”. Note: this information can only be present if the Drug contains a single active ingredient. Combination products which have doses for each ingredient need to be recorded as separate records.
  • The lot_number field contains an identifier assigned from the manufacturer of the Drug product.
  • If possible, the visit in which the drug was prescribed or delivered is recorded in the visit_occurrence_id field through a reference to the visit table.
  • If possible, the prescribing or administering provider (physician or nurse) is recorded in the provider_id field through a reference to the provider table.
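The order of precedence for drug Concept classes described above can be expressed as a small ranking function. The sketch below is only an illustration: it assumes the ETL has already collected the candidate Standard Concepts for one source drug as dictionaries with `concept_id` and `concept_class_id` keys. The function name and the sample concept identifiers are hypothetical placeholders, not real vocabulary entries.

```python
# Concept classes from most to least detailed, as listed in the conventions.
CLASS_PRECEDENCE = [
    "Marketed Product",
    "Branded Pack",
    "Clinical Pack",
    "Branded Drug",
    "Clinical Drug",
    "Branded Drug Component",
    "Clinical Drug Component",
    "Branded Drug Form",
    "Clinical Drug Form",
    "Ingredient",
]
RANK = {name: position for position, name in enumerate(CLASS_PRECEDENCE)}


def pick_drug_concept_id(candidates: list) -> int:
    """Return the concept_id of the most detailed candidate.

    Returns 0 when no candidate falls into one of the listed classes,
    for example when only a drug class such as "NSAID" is known.
    """
    ranked = [c for c in candidates if c["concept_class_id"] in RANK]
    if not ranked:
        return 0
    best = min(ranked, key=lambda c: RANK[c["concept_class_id"]])
    return best["concept_id"]


# Hypothetical candidates for one source code (placeholder concept ids).
candidates = [
    {"concept_id": 1001, "concept_class_id": "Ingredient"},
    {"concept_id": 1002, "concept_class_id": "Clinical Drug"},
    {"concept_id": 1003, "concept_class_id": "Clinical Drug Form"},
]

chosen = pick_drug_concept_id(candidates)
unmapped = pick_drug_concept_id([])
```

When the result is 0, the drug_source_value and drug_source_concept_id fields still carry the original information, as the conventions require.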
2014/12/04 08:25 · cgreich

DEVICE_EXPOSURE table

This table changed in version 5.1 of the OMOP CDM. The fields device_exposure_start_datetime and device_exposure_end_datetime were added.


The device exposure domain captures information about a person’s exposure to a foreign physical object or instrument which is used for diagnostic or therapeutic purposes through a mechanism beyond chemical action. Devices include implantable objects (e.g. pacemakers, stents, artificial joints), medical equipment and supplies (e.g. bandages, crutches, syringes), other instruments used in medical procedures (e.g. sutures, defibrillators) and material used in clinical care (e.g. adhesives, body material, dental material, surgical material).

Field | Required | Type | Description
device_exposure_id | Yes | integer | A system-generated unique identifier for each Device Exposure.
person_id | Yes | integer | A foreign key identifier to the Person who is subjected to the Device. The demographic details of that Person are stored in the PERSON table.
device_concept_id | Yes | integer | A foreign key that refers to a Standard Concept identifier in the Standardized Vocabularies for the Device concept.
device_exposure_start_date | Yes | date | The date the Device or supply was applied or used.
device_exposure_start_datetime | No | datetime | The date and time the Device or supply was applied or used.
device_exposure_end_date | No | date | The date the Device or supply was removed from use.
device_exposure_end_datetime | No | datetime | The date and time the Device or supply was removed from use.
device_type_concept_id | Yes | integer | A foreign key to the predefined Concept identifier in the Standardized Vocabularies reflecting the type of Device Exposure recorded. It indicates how the Device Exposure was represented in the source data.
unique_device_id | No | varchar(50) | A UDI or equivalent identifying the instance of the Device used in the Person.
quantity | No | integer | The number of individual Devices used for the exposure.
provider_id | No | integer | A foreign key to the provider in the PROVIDER table who initiated or administered the Device.
visit_occurrence_id | No | integer | A foreign key to the visit in the VISIT_OCCURRENCE table during which the Device was used.
device_source_value | No | varchar(50) | The source code for the Device as it appears in the source data. This code is mapped to a standard Device Concept in the Standardized Vocabularies and the original code is stored here for reference.
device_source_concept_id | No | integer | A foreign key to a Device Concept that refers to the code used in the source.

Conventions

  • The distinction between Devices or supplies and procedures is sometimes blurry, but the former are physical objects while the latter are actions, often to apply a Device or supply.
  • For medical devices that are regulated by the FDA, a Unique Device Identification (UDI) is provided if available in the data source, and is recorded in the unique_device_id field.
  • Valid Device Concepts belong to the “Device” domain. The Concepts of this domain are derived from the DI portion of a UDI or based on other source vocabularies, like HCPCS.
  • A Device Type is assigned to each Device Exposure to track from what source the information was drawn or inferred. The valid domain_id for these Concepts is “Device Type”.
  • The Visit during which the Device was first used is recorded through a reference to the VISIT_OCCURRENCE table. This information is not always available.
  • The Provider exposing the patient to the Device is recorded through a reference to the PROVIDER table. This information is not always available.
2014/12/04 09:05 · cgreich

CONDITION_OCCURRENCE table

This table changed in version 5.1 of the OMOP CDM. The fields condition_start_datetime and condition_end_datetime were added. For 5.0.1, this table changed in version 5.X of the OMOP CDM. The fields condition_status_concept_id and condition_status_source_value were added.


Conditions are records of a Person suggesting the presence of a disease or medical condition stated as a diagnosis, a sign or a symptom, which is either observed by a Provider or reported by the patient. Conditions are recorded in different sources and levels of standardization, for example:

  • Medical claims data include diagnoses coded in ICD-9-CM that are submitted as part of a reimbursement claim for health services.
  • EHRs may capture Person Conditions in the form of diagnosis codes or symptoms.

Field | Required | Type | Description
condition_occurrence_id | Yes | integer | A unique identifier for each Condition Occurrence event.
person_id | Yes | integer | A foreign key identifier to the Person who is experiencing the condition. The demographic details of that Person are stored in the PERSON table.
condition_concept_id | Yes | integer | A foreign key that refers to a Standard Condition Concept identifier in the Standardized Vocabularies.
condition_start_date | Yes | date | The date when the instance of the Condition is recorded.
condition_start_datetime | No | datetime | The date and time when the instance of the Condition is recorded.
condition_end_date | No | date | The date when the instance of the Condition is considered to have ended.
condition_end_datetime | No | datetime | The date and time when the instance of the Condition is considered to have ended.
condition_type_concept_id | Yes | integer | A foreign key to the predefined Concept identifier in the Standardized Vocabularies reflecting the source data from which the condition was recorded, the level of standardization, and the type of occurrence.
stop_reason | No | varchar(20) | The reason that the condition was no longer present, as indicated in the source data.
provider_id | No | integer | A foreign key to the Provider in the PROVIDER table who was responsible for capturing (diagnosing) the Condition.
visit_occurrence_id | No | integer | A foreign key to the visit in the VISIT_OCCURRENCE table during which the Condition was determined (diagnosed).
condition_status_concept_id | No | integer | A foreign key to the predefined concept in the standard vocabulary reflecting the condition status.
condition_source_concept_id | No | integer | A foreign key to a Condition Concept that refers to the code used in the source.
condition_source_value | No | varchar(50) | The source code for the condition as it appears in the source data. This code is mapped to a standard condition concept in the Standardized Vocabularies and the original code is stored here for reference.
condition_status_source_value | No | varchar(50) | The source code for the condition status as it appears in the source data.

Conventions

  • Valid Condition Concepts belong to the “Condition” domain.
  • Condition records are typically inferred from diagnostic codes recorded in the source data. Such code systems, like ICD-9-CM, ICD-10-CM, Read etc., provide comprehensive coverage of conditions. However, if the diagnostic code in the source does not define a condition, but rather an observation or a procedure, then such information is not stored in the CONDITION_OCCURRENCE table, but in the respective tables instead.
  • Source Condition identifiers are mapped to Standard Concepts for Conditions in the Standardized Vocabularies. When the source code cannot be translated into a Standard Concept, a CONDITION_OCCURRENCE entry is stored with only the corresponding source_concept_id and source_value, while the condition_concept_id is set to 0.
  • Family history and past diagnoses (“history of”) are not recorded in the CONDITION_OCCURRENCE table. Instead, they are listed in the OBSERVATION table.
  • Codes written in the process of establishing the diagnosis, such as “question of” and “rule out”, are not represented here. Instead, they are listed in the OBSERVATION table, if they are used for analyses.
  • A Condition Occurrence Type is assigned based on the data source and type of condition attribute, for example:
    • ICD-9-CM Primary Diagnosis from inpatient and outpatient Claims
    • ICD-9-CM Secondary Diagnoses from inpatient and outpatient Claims
    • Diagnoses or problems recorded in an EHR.
  • The Stop Reason indicates why a Condition is no longer valid with respect to the purpose within the source data. Typical values include “Discharged”, “Resolved”, etc. Note that a Stop Reason does not necessarily imply that the condition is no longer occurring.
  • Condition source codes are typically ICD-9-CM, Read or ICD-10 diagnosis codes from medical claims or discharge status/visit diagnosis codes from EHRs.
  • The Condition Status reflects when the condition was diagnosed, implying a different depth of diagnostic work-up:
    • Admitting diagnosis: use concept_id 4203942
    • Preliminary diagnosis: use concept_id 4033240
    • Final diagnosis: use concept_id 4230359 – should also be used for ‘Discharge diagnosis’
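The Condition Status assignments listed above amount to a small lookup table. A minimal sketch follows, assuming the source delivers the status as free-text labels. The label spellings, the function name, and the choice to return None for an unrecognized label are assumptions of this example, not rules of the CDM. None is used here because condition_status_concept_id is not a required field.

```python
from typing import Optional

# Concept ids as given in the conventions; the label spellings are assumed.
CONDITION_STATUS_CONCEPTS = {
    "admitting diagnosis": 4203942,
    "preliminary diagnosis": 4033240,
    "final diagnosis": 4230359,
    # A discharge diagnosis is recorded with the 'Final diagnosis' concept.
    "discharge diagnosis": 4230359,
}


def condition_status_concept_id(source_value: Optional[str]) -> Optional[int]:
    """Map a source status label to its concept id, or None if unknown."""
    if source_value is None:
        return None
    return CONDITION_STATUS_CONCEPTS.get(source_value.strip().lower())


admitting = condition_status_concept_id("Admitting diagnosis")
discharge = condition_status_concept_id(" Discharge Diagnosis ")
unknown = condition_status_concept_id("working hypothesis")
```

The original label would still be kept in condition_status_source_value, so an unrecognized status is not lost.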
2014/12/04 09:37 · cgreich

MEASUREMENT table

This table changed in version 5.1 of the OMOP CDM. The field measurement_datetime was added.


The MEASUREMENT table contains records of Measurements, i.e. structured values (numerical or categorical) obtained through systematic and standardized examination or testing of a Person or a Person's sample. The MEASUREMENT table contains both orders and results of such Measurements as laboratory tests, vital signs, quantitative findings from pathology reports, etc.

Field | Required | Type | Description
measurement_id | Yes | integer | A unique identifier for each Measurement.
person_id | Yes | integer | A foreign key identifier to the Person about whom the measurement was recorded. The demographic details of that Person are stored in the PERSON table.
measurement_concept_id | Yes | integer | A foreign key to the standard measurement concept identifier in the Standardized Vocabularies.
measurement_date | Yes | date | The date of the Measurement.
measurement_datetime | No | datetime | The date and time of the Measurement. Some database systems don't have a time datatype. To accommodate all temporal analyses, the datetime datatype can be used (combining measurement_date and measurement_time).
measurement_type_concept_id | Yes | integer | A foreign key to the predefined Concept in the Standardized Vocabularies reflecting the provenance from where the Measurement record was recorded.
operator_concept_id | No | integer | A foreign key identifier to the predefined Concept in the Standardized Vocabularies reflecting the mathematical operator that is applied to the value_as_number. Operators are <, ≤, =, ≥, >.
value_as_number | No | float | A Measurement result where the result is expressed as a numeric value.
value_as_concept_id | No | integer | A foreign key to a Measurement result represented as a Concept from the Standardized Vocabularies (e.g., positive/negative, present/absent, low/high, etc.).
unit_concept_id | No | integer | A foreign key to a Standard Concept ID of Measurement Units in the Standardized Vocabularies.
range_low | No | float | The lower limit of the normal range of the Measurement result. The lower range is assumed to be of the same unit of measure as the Measurement value.
range_high | No | float | The upper limit of the normal range of the Measurement. The upper range is assumed to be of the same unit of measure as the Measurement value.
provider_id | No | integer | A foreign key to the provider in the PROVIDER table who was responsible for initiating or obtaining the measurement.
visit_occurrence_id | No | integer | A foreign key to the Visit in the VISIT_OCCURRENCE table during which the Measurement was recorded.
measurement_source_value | No | varchar(50) | The Measurement name as it appears in the source data. This code is mapped to a Standard Concept in the Standardized Vocabularies and the original code is stored here for reference.
measurement_source_concept_id | No | integer | A foreign key to a Concept in the Standardized Vocabularies that refers to the code used in the source.
unit_source_value | No | varchar(50) | The source code for the unit as it appears in the source data. This code is mapped to a standard unit concept in the Standardized Vocabularies and the original code is stored here for reference.
value_source_value | No | varchar(50) | The source value associated with the content of the value_as_number or value_as_concept_id as stored in the source data.

Conventions

  • Measurements differ from Observations in that they require a standardized test or some other activity to generate a quantitative or qualitative result. For example, LOINC 1755-8 concept_id 3027035 'Albumin [Mass/time] in 24 hour Urine' is the lab test to measure a certain chemical in a urine sample.
  • Even though each Measurement always has a result, the fields value_as_number and value_as_concept_id are not mandatory. When the result is not known, the Measurement record represents just the fact that the corresponding Measurement was carried out, which in itself is already useful information for some use cases.
  • Valid Measurement Concepts (measurement_concept_id) belong to the 'Measurement' domain, but could overlap with the 'Observation' domain. This is due to the fact that there is a continuum between systematic examination or testing (Measurement) and a simple determination of fact (Observation). When the Measurement Source Value of the code cannot be translated into a standard Measurement Concept ID, a Measurement entry is stored with only the corresponding source_concept_id and measurement_source_value and a measurement_concept_id of 0.
  • Measurements are stored as attribute-value pairs, with the attribute as the Measurement Concept and the value representing the result. The value can be a Concept (stored in value_as_concept_id), or a numerical value (value_as_number) with a Unit (unit_concept_id).
  • Valid Concepts for the value_as_concept_id field belong to the 'Meas Value' domain.
  • For some Measurement Concepts, the result is included in the test. For example, ICD10 concept_id 45595451 “Presence of alcohol in blood, level not specified” indicates a Measurement and the result (present). In those situations, the CONCEPT_RELATIONSHIP table in addition to the “Maps to” record contains a second record with the relationship_id set to “Maps to value”. In this example, the “Maps to” relationship directs to 4041715 “Blood ethanol measurement” as well as a “Maps to value” record to 4181412 “Present”.
  • The operator_concept_id is optionally given for relative Measurements where the precise value is not available but its relation to a certain benchmarking value is. For example, this can be used for minimal detection thresholds of a test.
  • The meaning of Concept 4172703 for '=' is identical to omission of an operator_concept_id value. Since the use of this field is rare, it is important when devising analyses not to forget to test this field for values different from '='.
  • Valid Concepts for the operator_concept_id field belong to the 'Meas Value Operator' domain.
  • The Unit is optional even if a value_as_number is provided.
  • If reference ranges for the upper and lower limit of normal are provided (typically by a laboratory), these are stored in the range_high and range_low fields. Ranges have the same unit as the value_as_number.
  • The Visit during which the Measurement was made is recorded through a reference to the VISIT_OCCURRENCE table. This information is not always available.
  • The Provider making the Measurement is recorded through a reference to the PROVIDER table. This information is not always available.
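The operator convention above is easy to overlook in analyses: a record such as "< 5" must not be treated as the exact value 5. A minimal sketch of the check follows, assuming measurement rows are plain dictionaries keyed by the field names in the table above. The function name is hypothetical, and `LESS_THAN_PLACEHOLDER` is an invented stand-in for whichever 'Meas Value Operator' Concept the vocabulary provides for '<'.

```python
EQUALS_OPERATOR_CONCEPT_ID = 4172703   # '=' as given in the conventions
LESS_THAN_PLACEHOLDER = -1             # hypothetical stand-in, not a real concept id


def is_exact_value(measurement: dict) -> bool:
    """True when value_as_number is a precise result, not a bound.

    An omitted operator_concept_id has the same meaning as '='.
    """
    operator = measurement.get("operator_concept_id")
    return operator is None or operator == EQUALS_OPERATOR_CONCEPT_ID


# Hypothetical sample rows for one lab test.
measurements = [
    {"measurement_id": 1, "value_as_number": 4.2, "operator_concept_id": None},
    {"measurement_id": 2, "value_as_number": 5.0,
     "operator_concept_id": EQUALS_OPERATOR_CONCEPT_ID},
    # Below the detection threshold: recorded as "< 0.5".
    {"measurement_id": 3, "value_as_number": 0.5,
     "operator_concept_id": LESS_THAN_PLACEHOLDER},
]

exact_values = [m["value_as_number"] for m in measurements if is_exact_value(m)]
```

Records that fail the check are not necessarily discarded. How bounded values enter an analysis is a decision for the analyst.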
2014/12/05 20:43 · cgreich

NOTE table

This table changed in version 5.2 of the OMOP CDM. The fields note_datetime, note_class_concept_id, note_title, encoding_concept_id, language_concept_id were added.


The NOTE table captures unstructured information that was recorded by a provider about a patient in free text notes on a given date.

Field | Required | Type | Description
note_id | Yes | integer | A unique identifier for each note.
person_id | Yes | integer | A foreign key identifier to the Person about whom the Note was recorded. The demographic details of that Person are stored in the PERSON table.
note_date | Yes | date | The date the note was recorded.
note_datetime | No | datetime | The date and time the note was recorded.
note_type_concept_id | Yes | integer | A foreign key to the predefined Concept in the Standardized Vocabularies reflecting the type, origin or provenance of the Note.
note_class_concept_id | Yes | integer | A foreign key to the predefined Concept in the Standardized Vocabularies reflecting the HL7 LOINC Document Type Vocabulary classification of the note.
note_title | No | string(250) | The title of the Note as it appears in the source.
note_text | No | RDBMS-dependent text | The content of the Note.
encoding_concept_id | Yes | integer | A foreign key to the predefined Concept in the Standardized Vocabularies reflecting the note character encoding type.
language_concept_id | Yes | integer | A foreign key to the predefined Concept in the Standardized Vocabularies reflecting the language of the note.
provider_id | No | integer | A foreign key to the Provider in the PROVIDER table who took the Note.
visit_occurrence_id | No | integer | A foreign key to the Visit in the VISIT_OCCURRENCE table when the Note was taken.

Conventions

  • The NOTE table contains free text (in ASCII, or preferably in UTF8 format) taken by a healthcare Provider.
  • The Visit during which the note was written is recorded through a reference to the VISIT_OCCURRENCE table. This information is not always available.
  • The Provider making the note is recorded through a reference to the PROVIDER table. This information is not always available.
  • The type of note_text is CLOB or string(MAX), depending on the RDBMS.
2014/12/05 20:43 · cgreich

OBSERVATION table


This table changed in version 5.1 of the OMOP CDM. The field observation_datetime was added.


The OBSERVATION table captures clinical facts about a Person obtained in the context of examination, questioning or a procedure. Any data that cannot be represented by any other domain, such as social and lifestyle facts, medical history or family history, is recorded here.

Field | Required | Type | Description
observation_id | Yes | integer | A unique identifier for each observation.
person_id | Yes | integer | A foreign key identifier to the Person about whom the observation was recorded. The demographic details of that Person are stored in the PERSON table.
observation_concept_id | Yes | integer | A foreign key to the standard observation concept identifier in the Standardized Vocabularies.
observation_date | Yes | date | The date of the observation.
observation_datetime | No | datetime | The date and time of the observation.
observation_type_concept_id | Yes | integer | A foreign key to the predefined concept identifier in the Standardized Vocabularies reflecting the type of the observation.
value_as_number | No | float | The observation result stored as a number. This is applicable to observations where the result is expressed as a numeric value.
value_as_string | No | varchar(60) | The observation result stored as a string. This is applicable to observations where the result is expressed as verbatim text.
value_as_concept_id | No | integer | A foreign key to an observation result stored as a Concept ID. This is applicable to observations where the result can be expressed as a Standard Concept from the Standardized Vocabularies (e.g., positive/negative, present/absent, low/high, etc.).
qualifier_concept_id | No | integer | A foreign key to a Standard Concept ID for a qualifier (e.g., severity of drug-drug interaction alert).
unit_concept_id | No | integer | A foreign key to a Standard Concept ID of measurement units in the Standardized Vocabularies.
provider_id | No | integer | A foreign key to the provider in the PROVIDER table who was responsible for making the observation.
visit_occurrence_id | No | integer | A foreign key to the visit in the VISIT_OCCURRENCE table during which the observation was recorded.
observation_source_value | No | varchar(50) | The observation code as it appears in the source data. This code is mapped to a Standard Concept in the Standardized Vocabularies and the original code is stored here for reference.
observation_source_concept_id | No | integer | A foreign key to a Concept that refers to the code used in the source.
unit_source_value | No | varchar(50) | The source code for the unit as it appears in the source data. This code is mapped to a standard unit concept in the Standardized Vocabularies and the original code is stored here for reference.
qualifier_source_value | No | varchar(50) | The source value associated with a qualifier to characterize the observation.

Conventions

  • Observations differ from Measurements in that they do not require a standardized test or some other activity to generate a clinical fact. Typical observations are medical history, family history, the stated need for certain treatment, social circumstances, lifestyle choices, healthcare utilization patterns, etc. If generating the clinical fact requires standardized testing, such as lab testing or imaging, and leads to a standardized result, the data item is recorded in the MEASUREMENT table. If the clinical fact observed determines a sign, symptom, diagnosis of a disease or other medical condition, it is recorded in the CONDITION_OCCURRENCE table.
  • Valid Observation Concepts are not enforced to be from any domain. They still should be Standard Concepts, and they typically belong to the “Observation” or sometimes “Measurement” domain.
  • Observations can be stored as attribute-value pairs, with the attribute as the Observation Concept and the value representing the clinical fact. This fact can be a Concept (stored in value_as_concept_id), a numerical value (value_as_number) or a verbatim string (value_as_string). Even though Observations do not have an explicit result, the clinical fact can be stated separately from the type of Observation in the value_as_ fields.
  • It is recommended that observations that are suggestive statements of positive assertion have a value of “Yes” (concept_id=4188539) recorded, even though the null value is equivalent.
  • Valid Concepts of the value_as_concept_id field are not enforced, but typically belong to the “Meas Value” domain.
  • For numerical facts, a Unit can be provided in the unit_concept_id field.
  • For facts represented as Concepts no domain membership is enforced.
  • Note that the value of value_as_concept_id may be provided through mapping from a source Concept which contains the content of the Observation. In those situations, the CONCEPT_RELATIONSHIP table in addition to the “Maps to” record contains a second record with the relationship_id set to “Maps to value”. For example, ICD9CM V17.5 concept_id 44828510 “Family history of asthma” has a “Maps to” relationship to 4167217 “Family history of clinical finding” as well as a “Maps to value” record to 317009 “Asthma”.
  • The qualifier_concept_id field contains all attributes specifying the clinical fact further, such as degrees, severities, drug-drug interaction alerts, etc.
  • The Visit during which the observation was made is recorded through a reference to the VISIT_OCCURRENCE table. This information is not always available.
  • The Provider making the observation is recorded through a reference to the PROVIDER table. This information is not always available.
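
The “Maps to” / “Maps to value” convention above can be sketched as follows. This is only an illustration: the in-memory list stands in for the CONCEPT_RELATIONSHIP table, the function name map_source_concept is hypothetical, and the only data used is the ICD9CM V17.5 example quoted in the conventions.

```python
# Toy stand-in for rows of the CONCEPT_RELATIONSHIP table:
# (concept_id_1, relationship_id, concept_id_2)
CONCEPT_RELATIONSHIP = [
    (44828510, "Maps to", 4167217),       # Family history of clinical finding
    (44828510, "Maps to value", 317009),  # Asthma
]


def map_source_concept(source_concept_id, relationships=CONCEPT_RELATIONSHIP):
    """Derive OBSERVATION fields from a source concept (hypothetical helper)."""
    observation_concept_id = None
    value_as_concept_id = None
    for concept_id_1, relationship_id, concept_id_2 in relationships:
        if concept_id_1 != source_concept_id:
            continue
        if relationship_id == "Maps to":
            observation_concept_id = concept_id_2
        elif relationship_id == "Maps to value":
            value_as_concept_id = concept_id_2
    return {
        "observation_concept_id": observation_concept_id,
        "observation_source_concept_id": source_concept_id,
        "value_as_concept_id": value_as_concept_id,
    }


record = map_source_concept(44828510)
print(record)
```

A source concept without a “Maps to value” record simply leaves value_as_concept_id empty in this sketch.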
2014/12/05 20:44 · cgreich

FACT_RELATIONSHIP table


The FACT_RELATIONSHIP table contains records about the relationships between facts stored as records in any table of the CDM. Relationships can be defined between facts from the same domain (table), or different domains. Examples of Fact Relationships include: Person relationships (parent-child), care site relationships (hierarchical organizational structure of facilities within a health system), indication relationship (between drug exposures and associated conditions), usage relationships (of devices during the course of an associated procedure), or facts derived from one another (measurements derived from an associated specimen).

Field | Required | Type | Description
domain_concept_id_1 | Yes | integer | The concept representing the domain of fact one, from which the corresponding table can be inferred.
fact_id_1 | Yes | integer | The unique identifier in the table corresponding to the domain of fact one.
domain_concept_id_2 | Yes | integer | The concept representing the domain of fact two, from which the corresponding table can be inferred.
fact_id_2 | Yes | integer | The unique identifier in the table corresponding to the domain of fact two.
relationship_concept_id | Yes | integer | A foreign key to a Standard Concept ID of relationship in the Standardized Vocabularies.

Conventions

  • All relationships are directional, and each relationship is represented twice symmetrically within the FACT_RELATIONSHIP table. For example, if person_id = 1 is the mother of person_id = 2, two records are in the FACT_RELATIONSHIP table (all strings are in fact concept_id records in the CONCEPT table):
    • Person, 1, Person, 2, parent of
    • Person, 2, Person, 1, child of
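
A minimal sketch of how an ETL step might emit both directions of a relationship. The strings stand in for concept_ids exactly as in the example above, and the REVERSE lookup and the symmetric_pair helper are hypothetical names introduced only for this illustration.

```python
from typing import NamedTuple


class FactRelationship(NamedTuple):
    # Strings are used in place of concept_ids, mirroring the example above.
    domain_concept_id_1: str
    fact_id_1: int
    domain_concept_id_2: str
    fact_id_2: int
    relationship_concept_id: str


# Hypothetical lookup from a relationship to its reverse direction.
REVERSE = {"parent of": "child of", "child of": "parent of"}


def symmetric_pair(domain_1, fact_1, domain_2, fact_2, relationship):
    """Return the forward record and its mirrored reverse record."""
    forward = FactRelationship(domain_1, fact_1, domain_2, fact_2, relationship)
    backward = FactRelationship(domain_2, fact_2, domain_1, fact_1, REVERSE[relationship])
    return [forward, backward]


rows = symmetric_pair("Person", 1, "Person", 2, "parent of")
for row in rows:
    print(tuple(row))
```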
2014/12/05 20:47 · cgreich
 

LOCATION table


The LOCATION table represents a generic way to capture physical location or address information of Persons and Care Sites.

Field | Required | Type | Description
location_id | Yes | integer | A unique identifier for each geographic location.
address_1 | No | varchar(50) | The address field 1, typically used for the street address, as it appears in the source data.
address_2 | No | varchar(50) | The address field 2, typically used for additional detail such as buildings, suites, floors, as it appears in the source data.
city | No | varchar(50) | The city field as it appears in the source data.
state | No | varchar(2) | The state field as it appears in the source data.
zip | No | varchar(9) | The zip or postal code.
county | No | varchar(20) | The county.
location_source_value | No | varchar(50) | The verbatim information that is used to uniquely identify the location as it appears in the source data.

Conventions

  • Each address or Location is unique and is present only once in the table.
  • Locations do not contain names, such as the name of a hospital. In order to construct a full address that can be used in the postal service, the address information from the Location needs to be combined with information from the Care Site. The PERSON table does not contain name information at all.
  • All fields in the LOCATION table contain the verbatim data in the source; no mapping or normalization takes place. None of the fields are mandatory. If the source data have no Location information at all, all Locations are represented by a single record. Typically, source data contain full or partial zip or postal codes or county or census district information.
  • Zip codes are handled as strings of up to 9 characters in length. For US addresses, these represent either a 3-digit abbreviated Zip code as provided by many sources for patient protection reasons, the full 5-digit Zip or the 9-digit (ZIP + 4) codes. Unless there are specific reasons to do otherwise, analytical methods should expect and utilize only the first 3 digits. For international addresses, different rules apply.
  • The county information can be provided and is not redundant with information from the zip codes as not all of these have an unambiguous county designation.
  • No country information is expected as source data are always collected within a single country.
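
The zip code convention above could be applied in analysis code along these lines. This is a sketch under stated assumptions: the helper name zip3 is hypothetical, only the three US formats described above (3, 5 or 9 digits) are accepted, and anything else, including international codes, is returned as None rather than guessed at.

```python
def zip3(zip_value):
    """Return the first 3 digits of a US zip code string, or None (hypothetical helper)."""
    if zip_value is None:
        return None
    # Keep digits only, so a stray separator does not break the check.
    digits = "".join(ch for ch in zip_value if ch.isdigit())
    if len(digits) not in (3, 5, 9):
        return None  # not one of the three US formats described above
    return digits[:3]


for value in ("021", "02139", "021391234", "SW1A", None):
    print(repr(value), "->", repr(zip3(value)))
```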
2014/12/05 21:02 · cgreich

CARE_SITE table


The CARE_SITE table contains a list of uniquely identified institutional (physical or organizational) units where healthcare delivery is practiced (offices, wards, hospitals, clinics, etc.).

Field | Required | Type | Description
care_site_id | Yes | integer | A unique identifier for each Care Site.
care_site_name | No | varchar(255) | The verbatim description or name of the Care Site as it appears in the source data.
place_of_service_concept_id | No | integer | A foreign key that refers to a Place of Service Concept ID in the Standardized Vocabularies.
location_id | No | integer | A foreign key to the geographic Location in the LOCATION table, where the detailed address information is stored.
care_site_source_value | No | varchar(50) | The identifier for the Care Site in the source data, stored here for reference.
place_of_service_source_value | No | varchar(50) | The source code for the Place of Service as it appears in the source data, stored here for reference.

Conventions

  • Care site is a unique combination of location_id and place_of_service_source_value.
  • Every record in the VISIT_OCCURRENCE table may have only one Care Site.
  • Care site does not take into account the provider (human) information such as specialty.
  • Many source data do not make a distinction between individual and institutional providers. The CARE_SITE table contains the institutional providers.
  • If the source, instead of uniquely identifying individual Care Sites, only provides limited information such as Place of Service, generic or “pooled” Care Site records are listed in the CARE_SITE table.
  • There are hierarchical and business relationships between Care Sites. For example, wards can belong to clinics or departments, which can in turn belong to hospitals, which in turn can belong to hospital systems, which in turn can belong to HMOs.
  • The relationships between Care Sites are defined in the FACT_RELATIONSHIP table.
  • The Care Site Source Value typically contains the name of the Care Site.
  • The Place of Service Concepts belong to the Domain 'Place of Service'.
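
As an illustration of the first convention (a Care Site is a unique combination of location_id and place_of_service_source_value), an ETL might assign identifiers as sketched below. The function name, the sequential numbering scheme and the sample values are assumptions made for this example only.

```python
def assign_care_site_ids(source_rows):
    """Map each distinct (location_id, place_of_service_source_value) to one care_site_id."""
    care_site_ids = {}
    for row in source_rows:
        key = (row["location_id"], row["place_of_service_source_value"])
        if key not in care_site_ids:
            # Sequential ids are an arbitrary choice for this sketch.
            care_site_ids[key] = len(care_site_ids) + 1
    return care_site_ids


# Invented source rows: the first and third describe the same Care Site.
source_rows = [
    {"location_id": 10, "place_of_service_source_value": "INPATIENT"},
    {"location_id": 10, "place_of_service_source_value": "OUTPATIENT"},
    {"location_id": 10, "place_of_service_source_value": "INPATIENT"},
]
care_sites = assign_care_site_ids(source_rows)
print(care_sites)
```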
2014/12/05 21:02 · cgreich

PROVIDER table


The PROVIDER table contains a list of uniquely identified healthcare providers. These are individuals providing hands-on healthcare to patients, such as physicians, nurses, midwives, physical therapists etc.

Field | Required | Type | Description
provider_id | Yes | integer | A unique identifier for each Provider.
provider_name | No | varchar(50) | A description of the Provider.
npi | No | varchar(20) | The National Provider Identifier (NPI) of the provider.
dea | No | varchar(20) | The Drug Enforcement Administration (DEA) number of the provider.
specialty_concept_id | No | integer | A foreign key to a Standard Specialty Concept ID in the Standardized Vocabularies.
care_site_id | No | integer | A foreign key to the main Care Site where the provider is practicing.
year_of_birth | No | integer | The year of birth of the Provider.
gender_concept_id | No | integer | The gender of the Provider.
provider_source_value | No | varchar(50) | The identifier used for the Provider in the source data, stored here for reference.
specialty_source_value | No | varchar(50) | The source code for the Provider specialty as it appears in the source data, stored here for reference.
specialty_source_concept_id | No | integer | A foreign key to a Concept that refers to the code used in the source.
gender_source_value | No | varchar(50) | The gender code for the Provider as it appears in the source data, stored here for reference.
gender_source_concept_id | No | integer | A foreign key to a Concept that refers to the code used in the source.

Conventions

  • Many sources do not make a distinction between individual and institutional providers. The PROVIDER table contains the individual providers.
  • If the source, instead of uniquely identifying individual providers, only provides limited information such as specialty, generic or “pooled” Provider records are listed in the PROVIDER table.
  • A single Provider cannot be listed twice (be duplicated) in the table. If a Provider has more than one Specialty, the main or most often exerted specialty should be recorded.
  • Valid Specialty Concepts belong to the 'Specialty' domain.
  • The care_site_id represents a fixed relationship between a Provider and her main Care Site. Providers are also linked to Care Sites through Condition, Procedure and Visit records.
2014/12/05 21:03 · cgreich
 

PAYER_PLAN_PERIOD table


The PAYER_PLAN_PERIOD table captures details of the period of time that a Person is continuously enrolled under a specific health Plan benefit structure from a given Payer. Each Person receiving healthcare is typically covered by a health benefit plan, which pays for (fully or partially), or directly provides, the care. These benefit plans are provided by payers, such as health insurers or state or government agencies. In each plan the details of the health benefits are defined for the Person or her family, and the health benefit Plan might change over time, typically with increasing utilization (reaching certain cost thresholds such as deductibles), plan availability and purchasing choices of the Person. The unique combinations of Payer organizations, health benefit Plans and time periods in which they are valid for a Person are recorded in this table.

Field | Required | Type | Description
payer_plan_period_id | Yes | integer | An identifier for each unique combination of payer, plan, family code and time span.
person_id | Yes | integer | A foreign key identifier to the Person covered by the payer. The demographic details of that Person are stored in the PERSON table.
payer_plan_period_start_date | Yes | date | The start date of the payer plan period.
payer_plan_period_end_date | Yes | date | The end date of the payer plan period.
payer_source_value | No | varchar(50) | The source code for the payer as it appears in the source data.
plan_source_value | No | varchar(50) | The source code for the Person's health benefit plan as it appears in the source data.
family_source_value | No | varchar(50) | The source code for the Person's family as it appears in the source data.

Conventions

  • Different Payers have different designs for their health benefit Plans. The PAYER_PLAN_PERIOD table does not capture all details of the plan design or the relationship between Plans or the cost of healthcare triggering a change from one Plan to another. However, it allows identifying the unique combination of Payer (insurer), Plan (determining healthcare benefits and limits) and Person. Typically, depending on healthcare utilization, a Person may have one or many subsequent Plans during coverage by a single Payer.
  • Payer or Plan information is not normalized or recorded as part of the Standardized Vocabularies. Instead, Payers and Plans are referred to only by their source_value.
  • Typically, family members are covered under the same Plan as the Person. In those cases, the payer_source_value, plan_source_value and family_source_value are identical.
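
A short sketch of how the table might be queried to find the Plan covering a Person on a given date. The rows are invented, the function name periods_covering is hypothetical, and treating the end date as inclusive is an assumption of this example rather than something the text above states.

```python
from datetime import date

# Toy PAYER_PLAN_PERIOD rows; all values are invented for illustration.
PAYER_PLAN_PERIOD = [
    {
        "payer_plan_period_id": 1,
        "person_id": 1,
        "payer_plan_period_start_date": date(2013, 1, 1),
        "payer_plan_period_end_date": date(2013, 6, 30),
        "payer_source_value": "PAYER_A",
        "plan_source_value": "PLAN_BASIC",
    },
    {
        "payer_plan_period_id": 2,
        "person_id": 1,
        "payer_plan_period_start_date": date(2013, 7, 1),
        "payer_plan_period_end_date": date(2013, 12, 31),
        "payer_source_value": "PAYER_A",
        "plan_source_value": "PLAN_PLUS",
    },
]


def periods_covering(person_id, on_date, periods=PAYER_PLAN_PERIOD):
    """Return the payer plan periods of a Person that include on_date (end date inclusive)."""
    return [
        p for p in periods
        if p["person_id"] == person_id
        and p["payer_plan_period_start_date"] <= on_date <= p["payer_plan_period_end_date"]
    ]


covering = periods_covering(1, date(2013, 8, 15))
print([p["plan_source_value"] for p in covering])
```

The two rows show one Person moving between subsequent Plans of a single Payer, as described in the first convention.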
2014/12/05 21:04 · cgreich

VISIT_COST table

As of version 5.0.1 (5-Apr-2016), this table is no longer part of the OMOP CDM. It is replaced by the COST table.

For prior definition, see below.


The VISIT_COST table captures the cost of a Visit of a Person not itemized to specific procedures, drugs, or devices used during the Visit.

Field | Required | Type | Description
visit_cost_id | Yes | integer | A unique identifier for each visit cost record.
visit_occurrence_id | Yes | integer | A foreign key identifier to the visit record for which cost data are recorded.
currency_concept_id | No | integer | A concept representing the 3-letter code used to delineate international currencies, such as USD for US Dollar.
paid_copay | No | float | The amount paid by the Person as a fixed contribution to the expenses. Copay does not contribute to the out_of_pocket expenses.
paid_coinsurance | No | float | The amount paid by the Person as a joint assumption of risk. Typically, this is a percentage of the expenses defined by the health benefit Plan after the Person's deductible is exceeded.
paid_toward_deductible | No | float | The amount paid by the Person that is counted toward the deductible defined by the health benefit Plan.
paid_by_payer | No | float | The amount paid by the Payer (insurer). If there is more than one Payer, several VISIT_COST records indicate that fact.
paid_by_coordination_benefits | No | float | The amount paid by a secondary Payer through the coordination of benefits.
total_out_of_pocket | No | float | The total amount paid by the Person as a share of the expenses, excluding the copay.
total_paid | No | float | The total amount paid for the expenses of the Visit.
payer_plan_period_id | No | integer | A foreign key to the PAYER_PLAN_PERIOD table, where the details of the Payer, Plan and Family are stored.

Conventions

  • The cost of the Visit may contain just board and food, but could also include the entire cost of everything that happened to the patient during the Visit.
  • All other conventions apply as in the PROCEDURE_COST table.
2014/12/05 21:05 · cgreich

PROCEDURE_COST table

As of version 5.0.1 (5-Apr-2016), this table is no longer part of the OMOP CDM. It is replaced by the COST table.

For prior definition, see below.


The PROCEDURE_COST table captures the cost of a Procedure performed on a Person. The information about the cost is only derived from the amounts paid for the Procedure. This is in contrast to the Drug Cost data, which also contain information about the true amount charged by the distributor. In addition, Revenue codes are captured.

Field | Required | Type | Description
procedure_cost_id | Yes | integer | A unique identifier for each procedure cost record.
procedure_occurrence_id | Yes | integer | A foreign key identifier to the procedure record for which cost data are recorded.
currency_concept_id | No | integer | A concept representing the 3-letter code used to delineate international currencies, such as USD for US Dollar.
paid_copay | No | float | The amount paid by the Person as a fixed contribution to the expenses. Copay does not contribute to the out_of_pocket expenses.
paid_coinsurance | No | float | The amount paid by the Person as a joint assumption of risk. Typically, this is a percentage of the expenses defined by the health benefit Plan after the Person's deductible is exceeded.
paid_toward_deductible | No | float | The amount paid by the Person that is counted toward the deductible defined by the health benefit Plan.
paid_by_payer | No | float | The amount paid by the Payer. If there is more than one Payer, several PROCEDURE_COST records indicate that fact.
paid_by_coordination_benefits | No | float | The amount paid by a secondary Payer through the coordination of benefits.
total_out_of_pocket | No | float | The total amount paid by the Person as a share of the expenses.
total_paid | No | float | The total amount paid for the expenses of the Procedure.
revenue_code_concept_id | No | integer | A foreign key referring to a Standard Concept ID in the Standardized Vocabularies for Revenue codes.
payer_plan_period_id | No | integer | A foreign key to the PAYER_PLAN_PERIOD table, where the details of the payer, plan and family are stored.
revenue_code_source_value | No | varchar(50) | The source code for the Revenue code as it appears in the source data, stored here for reference.

Conventions

  • Each Procedure Occurrence may have any number of corresponding records in the PROCEDURE_COST table, but often it is none (cost data not captured) or one (one payment per Procedure). They are linked directly through the Procedure Occurrence ID field.
  • If Procedure payments are bundled, the cost of such a bundle might be represented in only one of the component Procedures. The FACT_RELATIONSHIP table contains the relationship between the charged Procedure and the Procedure belonging to the bundle.
  • The amounts paid are:
    • Copay – a fixed amount to be paid by the Person
    • Coinsurance – a relative amount of the total paid by the Person
    • Deductible – an amount of money paid by the Person before the Payer starts contributing
    • Primary Payer – the amount the primary Payer pays towards the total
    • Coordination of Benefits – the amount a secondary Payer or Family Plan pays towards the total
    • Out of Pocket = Copay + Coinsurance + Deductible
    • Total – the total amount paid for the Procedure
  • The amounts in various payment components should equal the total, so Copay + Coinsurance + Deductible + Primary Payer + COB = Total Paid. In reality, this is not always reflected in the source data. It is up to the ETL to determine how to deal with quality problems in the data.
  • The revenue_code_concept_id indicates which service within a provider is charging for the Procedure.
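
Since the source data do not always satisfy Copay + Coinsurance + Deductible + Primary Payer + COB = Total Paid, an ETL may want to flag records that do not balance. The sketch below is one possible check, not a prescribed rule: the function names, the rounding tolerance and the choice to treat missing components as zero are all assumptions of this example.

```python
# Payment component fields of a PROCEDURE_COST record, as listed in the table above.
COMPONENTS = (
    "paid_copay",
    "paid_coinsurance",
    "paid_toward_deductible",
    "paid_by_payer",
    "paid_by_coordination_benefits",
)


def payment_discrepancy(cost_record):
    """Return total_paid minus the sum of the components, or None if total_paid is missing."""
    total = cost_record.get("total_paid")
    if total is None:
        return None
    # Assumption: a missing (None) component is counted as zero.
    component_sum = sum(cost_record.get(name) or 0.0 for name in COMPONENTS)
    return total - component_sum


def is_balanced(cost_record, tolerance=0.005):
    """True if the components add up to total_paid within a small tolerance."""
    diff = payment_discrepancy(cost_record)
    return diff is not None and abs(diff) <= tolerance


# Invented example record.
balanced = {
    "paid_copay": 20.0,
    "paid_coinsurance": 30.0,
    "paid_toward_deductible": 50.0,
    "paid_by_payer": 400.0,
    "paid_by_coordination_benefits": None,
    "total_paid": 500.0,
}
unbalanced = dict(balanced, total_paid=520.0)
print(is_balanced(balanced), payment_discrepancy(unbalanced))
```

How to resolve a flagged record is left to the ETL, as stated above.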
2014/12/05 21:05 · cgreich

DRUG_COST table

As of version 5.0.1 (5-Apr-2016), this table is no longer part of the OMOP CDM. It is replaced by the COST table.

For prior definition, see below.


The DRUG_COST table captures records containing the cost of a Drug Exposure. The information about the cost is defined by the amount of money paid by the Person and Payer for the Drug, as well as the charged cost of the Drug. In addition, each record stores a reference to the health plan information in the PAYER_PLAN_PERIOD table that is responsible for the determination of the cost as well as some of the payments.

Field | Required | Type | Description
drug_cost_id | Yes | integer | A unique identifier for each DRUG_COST record.
drug_exposure_id | Yes | integer | A foreign key identifier to the Drug record for which cost data are recorded.
currency_concept_id | No | integer | A concept representing the 3-letter code used to delineate international currencies, such as USD for US Dollar.
paid_copay | No | float | The amount paid by the Person as a fixed contribution to the expenses. Copay does not contribute to the out of pocket expenses.
paid_coinsurance | No | float | The amount paid by the Person as a joint assumption of risk. Typically, this is a percentage of the expenses defined by the Payer Plan after the Person's deductible is exceeded.
paid_toward_deductible | No | float | The amount paid by the Person that is counted toward the deductible defined by the Payer Plan.
paid_by_payer | No | float | The amount paid by the Payer. If there is more than one Payer, several DRUG_COST records indicate that fact.
paid_by_coordination_benefits | No | float | The amount paid by a secondary Payer through the coordination of benefits.
total_out_of_pocket | No | float | The total amount paid by the Person as a share of the expenses.
total_paid | No | float | The total amount paid for the expenses of drug exposure.
ingredient_cost | No | float | The portion of the drug expenses due to the cost charged by the manufacturer for the drug, typically a percentage of the Average Wholesale Price.
dispensing_fee | No | float | The portion of the drug expenses due to the dispensing fee charged by the pharmacy, typically a fixed amount.
average_wholesale_price | No | float | List price of a Drug set by the manufacturer.
payer_plan_period_id | No | integer | A foreign key to the PAYER_PLAN_PERIOD table, where the details of the Payer, Plan and Family are stored.

Conventions

  • Each Drug Exposure may have any number of corresponding records in the DRUG_COST table, but usually it is none (no cost data recorded) or one. They are linked directly through the drug_exposure_id field.
  • The amounts paid are:
    • Copay – a fixed amount to be paid by the Person
    • Coinsurance – a relative amount of the total paid by the Person
    • Deductible – an amount of money paid by the Person before the Payer starts contributing
    • Primary Payer – the amount the primary Payer pays towards the total
    • Coordination of Benefits – the amount a secondary Payer or Family Plan pays towards the total
    • Out of Pocket = Copay + Coinsurance + Deductible
    • Total – the total amount paid for the Drug Exposure
  • Drug costs are:
    • Ingredient Cost – the amount charged by the wholesale distributor or manufacturer
    • Dispensing Fee – the amount charged by the pharmacy
    • Sales Tax – usually very small and typically not provided by most source data, and therefore not included in the CDM
  • The amount paid should equal the cost, so Copay + Coinsurance + Deductible + Primary Payer + Coordination of Benefits = Total Paid = Ingredient Cost + Dispensing Fee. In reality, this is not always reflected in the source data. It is up to the ETL to determine how to deal with quality problems in the data.
  • The Average Wholesale Price is the list price of the drug, but not the price actually charged or paid.
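
The cost side of the identity above (Total Paid = Ingredient Cost + Dispensing Fee) can be checked with a small helper such as the one sketched here. The function name drug_cost_gap and the sample amounts are invented for illustration; returning None when a field is missing is a design choice of this sketch, not a CDM rule.

```python
def drug_cost_gap(drug_cost):
    """Return total_paid - (ingredient_cost + dispensing_fee), or None if a field is missing."""
    needed = ("total_paid", "ingredient_cost", "dispensing_fee")
    if any(drug_cost.get(name) is None for name in needed):
        return None
    return drug_cost["total_paid"] - (
        drug_cost["ingredient_cost"] + drug_cost["dispensing_fee"]
    )


# Invented DRUG_COST records.
consistent = {"total_paid": 45.0, "ingredient_cost": 40.0, "dispensing_fee": 5.0}
inconsistent = {"total_paid": 45.0, "ingredient_cost": 40.0, "dispensing_fee": 3.5}
incomplete = {"total_paid": 45.0, "ingredient_cost": None, "dispensing_fee": 5.0}

for rec in (consistent, inconsistent, incomplete):
    print(drug_cost_gap(rec))
```

A non-zero gap may reflect an omitted component such as sales tax, or a data quality problem that the ETL has to handle.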
2014/12/05 21:06 · cgreich

DEVICE_COST table

As of version 5.0.1 (5-Apr-2016), this table is no longer part of the OMOP CDM. It is replaced by the COST table.

For prior definition, see below.


The DEVICE_COST table captures the cost of a medical Device or supply used on a Person. The information about the cost is only derived from the amounts paid for the device.

Field | Required | Type | Description
device_cost_id | Yes | integer | A unique identifier for each DEVICE_COST record.
device_exposure_id | Yes | integer | A foreign key identifier to the DEVICE_EXPOSURE record for which cost data are recorded.
currency_concept_id | No | integer | A concept representing the 3-letter code used to delineate international currencies, such as USD for US Dollar.
paid_copay | No | float | The amount paid by the Person as a fixed contribution to the expenses. Copay does not contribute to the out_of_pocket expenses.
paid_coinsurance | No | float | The amount paid by the Person as a joint assumption of risk. Typically, this is a percentage of the expenses defined by the Payer Plan after the Person's deductible is exceeded.
paid_toward_deductible | No | float | The amount paid by the Person that is counted toward the deductible defined by the Payer Plan.
paid_by_payer | No | float | The amount paid by the Payer. If there is more than one Payer, several DEVICE_COST records indicate that fact.
paid_by_coordination_benefits | No | float | The amount paid by a secondary Payer through the coordination of benefits.
total_out_of_pocket | No | float | The total amount paid by the Person as a share of the expenses, excluding the copay.
total_paid | No | float | The total amount paid for the expenses of the Device.
payer_plan_period_id | No | integer | A foreign key to the PAYER_PLAN_PERIOD table, where the details of the Payer, Plan and Family are stored.

Conventions

  • If the Device is derived from a Procedure record, the conventions for the equivalent fields in the PROCEDURE_COST table apply (see above).
2014/12/05 21:06 · cgreich
 

COHORT table


The COHORT table contains records of subjects that satisfy a given set of criteria for a duration of time. The definition of the cohort is contained within the COHORT_DEFINITION table. Cohorts can be constructed of patients (Persons), Providers or Visits.

Field | Required | Type | Description
cohort_definition_id | Yes | integer | A foreign key to a record in the COHORT_DEFINITION table containing relevant Cohort Definition information.
subject_id | Yes | integer | A foreign key to the subject in the cohort. This can refer to a record in the PERSON, PROVIDER or VISIT_OCCURRENCE table.
cohort_start_date | Yes | date | The date when the Cohort Definition criteria for the Person, Provider or Visit first match.
cohort_end_date | Yes | date | The date when the Cohort Definition criteria for the Person, Provider or Visit no longer match or the Cohort membership was terminated.

Conventions

  • The core of a Cohort is the unifying definition or feature of the Cohort. This is captured in the cohort_definition_id. For example, Cohorts can include patients diagnosed with a specific condition, patients exposed to a particular drug, or Providers who have performed a specific Procedure.
  • Cohort records must have a Start Date.
  • Cohort records must have an End Date, which may be set to the Start Date or to a censored date derived from the Observation Period Start Date.
  • Cohort records must contain a Subject Id, which can refer to the Person, Provider, or Visit record. The Cohort Definition will define the type of subject through the subject concept id.
2014/12/05 21:07 · cgreich

COHORT_ATTRIBUTE table


The COHORT_ATTRIBUTE table contains attributes associated with each subject within a cohort, as defined by a given set of criteria for a duration of time. The definition of the Cohort Attribute is contained in the ATTRIBUTE_DEFINITION table.

Field | Required | Type | Description
cohort_definition_id | Yes | integer | A foreign key to a record in the COHORT_DEFINITION table containing relevant Cohort Definition information.
subject_id | Yes | integer | A foreign key to the subject in the Cohort. This can refer to a record in the PERSON, PROVIDER or VISIT_OCCURRENCE table.
cohort_start_date | Yes | date | The date when the Cohort Definition criteria for the Person, Provider or Visit first match.
cohort_end_date | Yes | date | The date when the Cohort Definition criteria for the Person, Provider or Visit no longer match or the Cohort membership was terminated.
attribute_definition_id | Yes | integer | A foreign key to a record in the ATTRIBUTE_DEFINITION table containing relevant Attribute Definition information.
value_as_number | No | float | The attribute result stored as a number. This is applicable to attributes where the result is expressed as a numeric value.
value_as_concept_id | No | integer | The attribute result stored as a Concept ID. This is applicable to attributes where the result is expressed as a categorical value.

Conventions

  • Each record in the COHORT_ATTRIBUTE table is linked to a specific record in the COHORT table, identified by matching cohort_definition_id, subject_id, cohort_start_date and cohort_end_date fields.
  • It adds calculated covariates (for example age, BMI) or composite scales (for example the Charlson index) to the Cohort records.
  • The unifying definition or feature of the Cohort Attribute is captured in the attribute_definition_id referring to a record in the ATTRIBUTE_DEFINITION table.
  • The actual result or value of the Cohort Attribute (covariate, index value) is captured in the value_as_number field (if the value is numeric) or the value_as_concept_id field (if the value is a concept).
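
As a rough illustration of the linking convention above, the following sketch uses Python's built-in sqlite3 module with cut-down versions of the two tables. The sample rows and the attribute_definition_id values (1 for age, 2 for BMI) are invented for the example and are not part of the specification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cohort (
    cohort_definition_id INTEGER NOT NULL,
    subject_id           INTEGER NOT NULL,
    cohort_start_date    TEXT    NOT NULL,
    cohort_end_date      TEXT    NOT NULL
);
CREATE TABLE cohort_attribute (
    cohort_definition_id    INTEGER NOT NULL,
    subject_id              INTEGER NOT NULL,
    cohort_start_date       TEXT    NOT NULL,
    cohort_end_date         TEXT    NOT NULL,
    attribute_definition_id INTEGER NOT NULL,
    value_as_number         REAL,
    value_as_concept_id     INTEGER
);
""")

# Hypothetical sample data: one cohort record and three attribute records.
conn.execute("INSERT INTO cohort VALUES (1, 101, '2014-01-01', '2014-06-30')")
conn.executemany(
    "INSERT INTO cohort_attribute VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 101, "2014-01-01", "2014-06-30", 1, 54.0, None),  # age (assumed id 1)
        (1, 101, "2014-01-01", "2014-06-30", 2, 27.3, None),  # BMI (assumed id 2)
        # Different dates: this row does not match the COHORT record above.
        (1, 101, "2013-01-01", "2013-06-30", 1, 53.0, None),
    ],
)


def attributes_for(conn, cohort_definition_id, subject_id):
    """Return the attributes linked to a cohort record via all four key fields."""
    return conn.execute(
        """
        SELECT ca.attribute_definition_id, ca.value_as_number, ca.value_as_concept_id
        FROM cohort c
        JOIN cohort_attribute ca
          ON ca.cohort_definition_id = c.cohort_definition_id
         AND ca.subject_id           = c.subject_id
         AND ca.cohort_start_date    = c.cohort_start_date
         AND ca.cohort_end_date      = c.cohort_end_date
        WHERE c.cohort_definition_id = ? AND c.subject_id = ?
        ORDER BY ca.attribute_definition_id
        """,
        (cohort_definition_id, subject_id),
    ).fetchall()


print(attributes_for(conn, 1, 101))
```

Only the two attribute rows whose four key fields all match the COHORT record are returned; the row with different dates is left out.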
2014/12/05 21:08 · cgreich

DRUG_ERA table

A Drug Era is defined as a span of time when the Person is assumed to be exposed to a particular active ingredient. A Drug Era is not the same as a Drug Exposure: Exposures are individual records corresponding to the source records of when a Drug was delivered to the Person, while Drug Eras are continuous periods produced by combining successive Drug Exposures under certain rules.

Field | Required | Type | Description
drug_era_id | Yes | integer | A unique identifier for each Drug Era.
person_id | Yes | integer | A foreign key identifier to the Person who is subjected to the Drug during the Drug Era. The demographic details of that Person are stored in the PERSON table.
drug_concept_id | Yes | integer | A foreign key that refers to a Standard Concept identifier in the Standardized Vocabularies for the Ingredient Concept.
drug_era_start_date | Yes | date | The start date for the Drug Era constructed from the individual instances of Drug Exposures. It is the start date of the very first chronologically recorded instance of utilization of a Drug.
drug_era_end_date | Yes | date | The end date for the Drug Era constructed from the individual instances of Drug Exposures. It is the end date of the final continuously recorded instance of utilization of a Drug.
drug_exposure_count | No | integer | The number of individual Drug Exposure occurrences used to construct the Drug Era.
gap_days | No | integer | The number of days that are not covered by DRUG_EXPOSURE records that were used to make up the era record.

Conventions

  • Drug Eras are derived from records in the DRUG_EXPOSURE table using a standardized algorithm.
  • Each Drug Era corresponds to one or many Drug Exposures that form a continuous interval and contain the same Drug Ingredient (active compound).
  • The drug_concept_id field only contains Concepts that have the concept_class 'Ingredient'. The Ingredient is derived from the Drug Concepts in the DRUG_EXPOSURE table that are aggregated into the Drug Era record.
  • The Drug Era Start Date is the start date of the first Drug Exposure.
  • The Drug Era End Date is the end date of the last Drug Exposure. The End Date of each Drug Exposure is either taken from the field drug_exposure_end_date or, as it is typically not available, inferred using the following rules:
    • For pharmacy prescription data, the date when the drug was dispensed plus the number of days of supply are used to extrapolate the End Date for the Drug Exposure. Depending on the country-specific healthcare system, this supply information is either explicitly provided in the days_supply field or inferred from package size or similar information.
    • For Procedure Drugs, usually the drug is administered on a single date (i.e., the administration date).
    • A standard Persistence Window of 30 days (gap, slack) is permitted between two subsequent extrapolated DRUG_EXPOSURE records for them to be merged into a single Drug Era.
  • The Gap Days determine how many total drug-free days are observed between all Drug Exposure events that contribute to a DRUG_ERA record. It is assumed that the drugs are “not stockpiled” by the patient, i.e. that if a new drug prescription or refill is observed (a new DRUG_EXPOSURE record is written), the remaining supply from the previous events is abandoned.
  • The difference between Persistence Window and Gap Days is that the former is the maximum drug-free time allowed between two subsequent DRUG_EXPOSURE records, while the latter is the sum of actual drug-free days for the given Drug Era under the above assumption of non-stockpiling.
  • The choice of a standard Persistence Window of 30 days and the non-stockpiling assumption is arbitrary, but has been shown to deliver good results in drug-outcome estimation. Other problems, such as estimation of drug compliance, may require a different or drug-dependent Persistence Window/stockpiling assumption. Researchers are encouraged to consider creating their own Drug Eras with different parameters as Cohorts and store them in the COHORT table.
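
The conventions above can be sketched in Python as follows. This is a minimal sketch and not the standardized algorithm itself: the function names and the sample Lisinopril dates are invented for illustration, the gap between two exposures is assumed to be the plain date difference between the running end date and the next start date, and the running end date is assumed to be the latest end date seen so far.

```python
from datetime import date, timedelta

PERSISTENCE_WINDOW_DAYS = 30  # standard window named in the conventions


def infer_end_date(start, end=None, days_supply=None):
    """Recorded end date if present, else start + days of supply, else the start date."""
    if end is not None:
        return end
    if days_supply:
        return start + timedelta(days=days_supply)
    return start  # single-day administration, e.g. Procedure Drugs


def build_drug_eras(exposures, window=PERSISTENCE_WINDOW_DAYS):
    """Merge (start_date, end_date) pairs for ONE person and ONE ingredient into eras."""
    eras = []
    for start, end in sorted(exposures):
        if eras:
            era = eras[-1]
            gap = (start - era["drug_era_end_date"]).days
            if gap <= window:
                # Same era: extend it and add only real drug-free days.
                # Overlapping supply is not carried forward (no stockpiling).
                era["drug_era_end_date"] = max(era["drug_era_end_date"], end)
                era["drug_exposure_count"] += 1
                era["gap_days"] += max(gap, 0)
                continue
        eras.append(
            {
                "drug_era_start_date": start,
                "drug_era_end_date": end,
                "drug_exposure_count": 1,
                "gap_days": 0,
            }
        )
    return eras


# Hypothetical Lisinopril dispensings: (start date, recorded end date, days of supply).
lisinopril = [
    (date(2014, 1, 1), None, 30),
    (date(2014, 2, 10), None, 30),
    (date(2014, 6, 1), None, 30),
]
exposures = [(s, infer_end_date(s, e, d)) for s, e, d in lisinopril]
eras = build_drug_eras(exposures)
for era in eras:
    print(era)
```

In this sample the first two dispensings are 10 days apart and merge into one era with 10 gap days, while the third starts more than 30 days after the previous end date and opens a second era. Passing a different `window` value is one way to experiment with alternative Persistence Windows.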

Application of the Drug Era rules to the generation of two Lisinopril eras.

2014/12/05 21:09 · cgreich

DOSE_ERA table

A Dose Era is defined as a span of time when the Person is assumed to be exposed to a constant dose of a specific active ingredient.

Field | Required | Type | Description
dose_era_id | Yes | integer | A unique identifier for each Dose Era.
person_id | Yes | integer | A foreign key identifier to the Person who is subjected to the drug during the Dose Era. The demographic details of that Person are stored in the PERSON table.
drug_concept_id | Yes | integer | A foreign key that refers to a Standard Concept identifier in the Standardized Vocabularies for the active Ingredient Concept.
unit_concept_id | Yes | integer | A foreign key that refers to a Standard Concept identifier in the Standardized Vocabularies for the unit concept.
dose_value | Yes | float | The numeric value of the dose.
dose_era_start_date | Yes | date | The start date for the Dose Era constructed from the individual instances of Drug Exposures. It is the start date of the very first chronologically recorded instance of utilization of a drug.
dose_era_end_date | Yes | date | The end date for the Dose Era constructed from the individual instances of Drug Exposures. It is the end date of the final continuously recorded instance of utilization of a drug.

Conventions

  • Dose Eras will be derived from records in the DRUG_EXPOSURE table and the Dose information from the DRUG_STRENGTH table using a standardized algorithm.
  • Each Dose Era corresponds to one or many Drug Exposures that form a continuous interval and contain the same Drug Ingredient (active compound) at the same effective daily dose.
  • Dose Form information is not taken into account. So, if the patient changes between different formulations, or different manufacturers with the same formulation, the Dose Era still spans the entire time of exposure to the Ingredient.
  • The daily dose is calculated for each DRUG_EXPOSURE record by dividing the total dose of the record by its duration.
  • The total dose of a DRUG_EXPOSURE record is calculated with the help of the DRUG_STRENGTH table, which contains the dosage information for each drug, as follows:
    1. Tablets and other fixed amount formulations
       Example: Acetaminophen (Paracetamol) 500 mg, 20 tablets.
       DRUG_STRENGTH: The denominator_unit is empty.
       DRUG_EXPOSURE: The quantity refers to the number of pieces, e.g. tablets. In the example: 20.
       Ingredient dose = quantity x amount_value [amount_unit_concept_id]
       Acetaminophen dose = 20 x 500 mg = 10,000 mg
    2. Puffs of an inhaler
       Note: This does not differ from use case 1, except that the DRUG_STRENGTH table may put {actuat} in the denominator unit. In this case the strength is provided in the numerator.
       DRUG_STRENGTH: The denominator_unit is {actuat}.
       DRUG_EXPOSURE: The quantity refers to the number of pieces, e.g. puffs.
       Ingredient dose = quantity x numerator_value [numerator_unit_concept_id]
    3. Quantified Drugs which are formulated as a concentration
       Example: The Clinical Drug is Acetaminophen 250 mg/mL in a 5 mL oral suspension. The Quantified Clinical Drug would have 1250 mg / 5 mL in the DRUG_STRENGTH table. Two suspensions are dispensed.
       DRUG_STRENGTH: The denominator_unit is either mg or mL. The denominator_value might be different from 1.
       DRUG_EXPOSURE: The quantity refers to a fraction or multiple of the pack. In the example: 2.
       Ingredient dose = quantity x numerator_value [numerator_unit_concept_id]
       Acetaminophen dose = 2 x 1250 mg = 2500 mg
    4. Drugs with the total amount provided in quantity, e.g. chemotherapeutics
       Example: 42799258 “Benzyl Alcohol 0.1 ML/ML / Pramoxine hydrochloride 0.01 MG/MG Topical Gel” dispensed in a 1.25 oz pack.
       DRUG_STRENGTH: The denominator_unit is either mg or mL. In the example: Benzyl Alcohol in mL and Pramoxine hydrochloride in mg.
       DRUG_EXPOSURE: The quantity refers to mL or g. In the example: 1.25 x 30 (conversion factor oz → mL) = 37.5, taken as 37 here.
       Ingredient dose = quantity x numerator_value [numerator_unit_concept_id]
       Benzyl Alcohol dose = 37 x 0.1 mL = 3.7 mL
       Pramoxine hydrochloride dose = 37 x 0.01 mg x 1000 = 370 mg
       Note: The analytical side should check the denominator in the DRUG_STRENGTH table. As mg is used for the second ingredient, the factor 1000 is applied to convert between g and mg.
    5. Compounded drugs
       Example: Ibuprofen 20%/Piroxicam 1% Cream, 30 mL in 5 mL tubes.
       DRUG_STRENGTH: Entries are needed for the ingredients Ibuprofen and Piroxicam, probably with an amount_value of 1 and a unit of mg.
       DRUG_EXPOSURE: The quantity refers to the total amount of the compound. Use one record in the DRUG_EXPOSURE table for each compound. In the example: 20% Ibuprofen of 30 mL = 6 mL, 1% Piroxicam of 30 mL = 0.3 mL.
       Ingredient dose = depends on the drugs involved: one of the use cases above.
       Ibuprofen dose = 6 x 1 mg x 1000 = 6000 mg
       Piroxicam dose = 0.3 x 1 mg x 1000 = 300 mg
       Note: The analytical side determines that the denominator for both ingredients in the DRUG_STRENGTH table is mg and applies the factor 1000 to convert between mL/g and mg.
    6. Drugs with the active ingredient released over time, e.g. patches
       Example: Ethinyl Estradiol 0.000833 MG/HR / norelgestromin 0.00625 MG/HR Weekly Transdermal Patch
       DRUG_STRENGTH: The denominator units refer to hour. In the example: Ethinyl Estradiol 0.000833 mg/h / norelgestromin 0.00625 mg/h.
       DRUG_EXPOSURE: The quantity refers to the number of pieces. In the example: 1 patch.
       Ingredient rate = numerator_value [numerator_unit_concept_id]
       Ethinyl Estradiol rate = 0.000833 mg/h
       norelgestromin rate = 0.00625 mg/h
       Note: This can be converted to a daily dose by multiplying by 24 (assuming 1 patch at a time for at least 24 hours).
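
The arithmetic in use cases 1, 3 and 6 can be checked with a few lines of Python. This is only a sketch: the function names are made up for this example, unit conversion is left out, and the 10-day exposure duration used for the daily dose is an assumed value that does not appear in the examples above.

```python
import math


def total_dose_fixed_amount(quantity, amount_value):
    """Use case 1: number of pieces times the amount per piece."""
    return quantity * amount_value


def total_dose_from_numerator(quantity, numerator_value):
    """Use cases 2 and 3: quantity times the numerator of the strength."""
    return quantity * numerator_value


def daily_dose(total_dose, duration_days):
    """Total dose of a DRUG_EXPOSURE record divided by its duration in days."""
    if duration_days <= 0:
        raise ValueError("duration_days must be positive")
    return total_dose / duration_days


def daily_dose_from_hourly_rate(numerator_value_per_hour):
    """Use case 6: hourly release rate times 24, assuming one patch worn all day."""
    return numerator_value_per_hour * 24


# Use case 1: 20 tablets of Acetaminophen 500 mg.
acetaminophen_tablets_mg = total_dose_fixed_amount(20, 500)

# Use case 3: two packs of the 1250 mg / 5 mL suspension.
acetaminophen_suspension_mg = total_dose_from_numerator(2, 1250)

# Daily dose for the tablets over an ASSUMED exposure duration of 10 days.
acetaminophen_daily_mg = daily_dose(acetaminophen_tablets_mg, 10)

# Use case 6: norelgestromin released at 0.00625 mg/h.
norelgestromin_daily_mg = daily_dose_from_hourly_rate(0.00625)

print(acetaminophen_tablets_mg, acetaminophen_suspension_mg)
print(acetaminophen_daily_mg, norelgestromin_daily_mg)
print(math.isclose(norelgestromin_daily_mg, 0.15))
```

The hourly rate is compared with `math.isclose` because the multiplication is done in floating point.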
2014/12/05 21:11 · cgreich

CONDITION_ERA table

A Condition Era is defined as a span of time when the Person is assumed to have a given condition. Similar to Drug Eras, Condition Eras are chronological periods of Condition Occurrence. Combining individual Condition Occurrences into a single Condition Era serves two purposes:

  • It allows aggregation of chronic conditions that require frequent ongoing care, instead of treating each Condition Occurrence as an independent event.
  • It allows aggregation of multiple, closely timed doctor visits for the same Condition to avoid double-counting the Condition Occurrences.

For example, consider a Person who visits her Primary Care Physician (PCP) and who is referred to a specialist. At a later time, the Person visits the specialist, who confirms the PCP’s original diagnosis and provides the appropriate treatment to resolve the condition. These two independent doctor visits should be aggregated into one Condition Era.

Field | Required | Type | Description
condition_era_id | Yes | integer | A unique identifier for each Condition Era.
person_id | Yes | integer | A foreign key identifier to the Person who is experiencing the Condition during the Condition Era. The demographic details of that Person are stored in the PERSON table.
condition_concept_id | Yes | integer | A foreign key that refers to a standard Condition Concept identifier in the Standardized Vocabularies.
condition_era_start_date | Yes | date | The start date for the Condition Era constructed from the individual instances of Condition Occurrences. It is the start date of the very first chronologically recorded instance of the Condition.
condition_era_end_date | Yes | date | The end date for the Condition Era constructed from the individual instances of Condition Occurrences. It is the end date of the final continuously recorded instance of the Condition.
condition_occurrence_count | No | integer | The number of individual Condition Occurrences used to construct the Condition Era.

Conventions

  • Condition Era records will be derived from the records in the CONDITION_OCCURRENCE table using a standardized algorithm.
  • Each Condition Era corresponds to one or many Condition Occurrence records that form a continuous interval.
  • The condition_concept_id field contains Concepts that are identical to those of the CONDITION_OCCURRENCE table records that make up the Condition Era. In contrast to Drug Eras, Condition Eras are not aggregated to contain Conditions of different hierarchical layers.
  • The Condition Era Start Date is the start date of the first Condition Occurrence.
  • The Condition Era End Date is the end date of the last Condition Occurrence.
  • Condition Eras are built with a Persistence Window of 30 days, meaning that if no occurrence of the same condition_concept_id happens within 30 days of any one occurrence, the end date of that occurrence is considered the condition_era_end_date.
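
For one Person, the grouping described in these conventions might look like the sketch below. It is an illustration under stated assumptions, not the standardized algorithm: the function name, the concept ids 111 and 222 and all dates are invented, and an occurrence without an end date is assumed to end on its start date.

```python
from collections import defaultdict
from datetime import date


def build_condition_eras(occurrences, window_days=30):
    """Group (condition_concept_id, start_date, end_date) tuples of ONE person into eras."""
    by_concept = defaultdict(list)
    for concept_id, start, end in occurrences:
        # Assumption of this sketch: a missing end date means a single-day occurrence.
        by_concept[concept_id].append((start, end or start))

    eras = []
    for concept_id in sorted(by_concept):
        era = None
        for start, end in sorted(by_concept[concept_id]):
            within_window = (
                era is not None
                and (start - era["condition_era_end_date"]).days <= window_days
            )
            if within_window:
                era["condition_era_end_date"] = max(era["condition_era_end_date"], end)
                era["condition_occurrence_count"] += 1
            else:
                era = {
                    "condition_concept_id": concept_id,
                    "condition_era_start_date": start,
                    "condition_era_end_date": end,
                    "condition_occurrence_count": 1,
                }
                eras.append(era)
    return eras


# Hypothetical data: a PCP visit and a specialist visit 19 days apart for
# concept 111, a later recurrence of 111, and an unrelated concept 222.
occurrences = [
    (111, date(2014, 3, 1), None),
    (111, date(2014, 3, 20), None),
    (111, date(2014, 9, 5), None),
    (222, date(2014, 3, 10), None),
]
condition_eras = build_condition_eras(occurrences)
for era in condition_eras:
    print(era)
```

With this sample data, the two closely timed visits collapse into a single era with an occurrence count of 2, the recurrence months later starts a new era, and the other concept is never merged with them.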
2014/12/05 21:12 · cgreich
documentation/cdm/single-page.1417484545.txt.gz · Last modified: 2014/12/02 01:42 by jduke