This is an old revision of the document!


Proposing person: Vojtech Huser

Discussion link: http://forums.ohdsi.org/t/metadata-extension-to-cdm/1746/1

Table CDM_SOURCE provides metadata. (http://www.ohdsi.org/web/wiki/doku.php?id=documentation:cdm:cdm_source)

Use case

  1. only run certain data quality checks when they are appropriate to the dataset (e.g., general population dataset)
  2. display metadata within Atlas-Achilles Web

CDM changes

The proposal adds a single table to the CDM specs.

new METADATA table

Tablename: METADATA

This table relies on concept_ids that already exist for CDM tables. In Atlas, search for them using advanced search and selecting Metadata.

Column | Description | Data_type
METADATA_CONCEPT_ID | OMOP Vocabulary CONCEPT_ID that identifies the information you wish to track (e.g. 8 for metadata about a Visit) | INT
METADATA_TYPE_CONCEPT_ID | OMOP Vocabulary CONCEPT_ID that identifies the type of information you wish to track (e.g. 1 for metadata about Domains such as a Visit) | INT
NAME | Name of the CONCEPT_ID stored in METADATA_CONCEPT_ID; if there is no applicable CONCEPT_ID, NAME can be used to represent the data stored (e.g. CDM_BUILDER VERSION) | VARCHAR(250)
VALUE | The metadata value you wish to capture | NVARCHAR

Example records:

METADATA_CONCEPT_ID | METADATA_TYPE_CONCEPT_ID | NAME | VALUE
8 | 1 | VISIT | For the outpatient visits, all activity that is recorded on a single day for a person is considered to have occurred during one visit with the visit start and end date corresponding to this date.
0 | 0 | CDM_BUILDER VERSION | 1.8.0.9
0 | 0 | DATASET_TYPE | Clinical Trial Data
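
As a rough illustration of the table above, the following Python sketch creates METADATA in an in-memory SQLite database, loads the three example records, and looks one up by NAME. SQLite is only a stand-in for whatever database hosts the CDM, NVARCHAR is mapped to TEXT, and the helper names (build_metadata, get_metadata_value) are made up for this example.

```python
import sqlite3

# Assumption: SQLite stands in for the CDM database; NVARCHAR becomes TEXT here.
DDL = """
CREATE TABLE metadata (
    metadata_concept_id      INTEGER,
    metadata_type_concept_id INTEGER,
    name                     VARCHAR(250),
    value                    TEXT
)
"""

# The three example records from the proposal.
EXAMPLE_ROWS = [
    (8, 1, "VISIT",
     "For the outpatient visits, all activity that is recorded on a single day "
     "for a person is considered to have occurred during one visit with the "
     "visit start and end date corresponding to this date."),
    (0, 0, "CDM_BUILDER VERSION", "1.8.0.9"),
    (0, 0, "DATASET_TYPE", "Clinical Trial Data"),
]


def build_metadata(conn):
    """Create the METADATA table and load the example records (hypothetical helper)."""
    conn.execute(DDL)
    conn.executemany("INSERT INTO metadata VALUES (?, ?, ?, ?)", EXAMPLE_ROWS)
    conn.commit()


def get_metadata_value(conn, name):
    """Return VALUE for the first row with the given NAME, or None if absent."""
    row = conn.execute(
        "SELECT value FROM metadata WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
build_metadata(conn)
print(get_metadata_value(conn, "DATASET_TYPE"))  # prints: Clinical Trial Data
```

Looking rows up by NAME works for entries that have no applicable concept (concept_id 0); entries with a real concept could be looked up by METADATA_CONCEPT_ID instead.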

CDM_SOURCE table

We want to add the following to the specs for the CDM_SOURCE table. Only one row is expected in this table.

END OF PROPOSAL












Text below only reflects some historical notes related to the proposal above.

Details 1

Proposing person: Patrick Ryan, Martijn Schuemie, Ajit Londhe, & Erica Voss

(may need to be updated)

Additionally, we would like the CDM_SOURCE table to store metadata about each of the domains. Our idea is to implement this by adding one column to the CDM_SOURCE table for each domain in the CDM (i.e. CDM_SOURCE.VISIT_OCCURRENCE, CDM_SOURCE.PERSON, etc.). This would allow us to display information about a specific domain in an ACHILLES report. For example, the VISIT_OCCURRENCE logic in PREMIER is fairly complex, and displaying a description of that logic at the point where someone is reviewing the data in ACHILLES would be beneficial.

Here is an example of some text for JMDC:

Database as a whole

(already has a column) The JMDC database consists of data from 60 Society-Managed Health Insurances covering workers aged 18 to 65 and their dependents (children younger than 18 years old and elderly people older than 65 years old). Elderly people (particularly those aged 66 or older) are less representative of the whole population of the nation. Among people younger than 66 years old, the proportion of children younger than 18 years old in JMDC is approximately the same as the proportion in the whole nation. JMDC data includes data on the membership status of the insured people and claims data provided by insurers under contract. Claims data are derived from monthly claims issued by clinics, hospitals and community pharmacies.

Person

JMDC covers workers aged 18 to 65 and their dependents (children younger than 18 years old and elderly people older than 65 years old). Elderly people (particularly those aged 66 or older) are less representative of the whole population of the nation. Among people younger than 66 years old, the proportion of children younger than 18 years old in JMDC is approximately the same as the proportion in the whole nation. Only the year of birth is available, not the day or month.

Observation_period

The observation period is defined as the time of enrollment in the health insurance. If the member is a dependent, the enrollment depends on the enrollment of the main beneficiary.

Care_site

Care sites in JMDC are institutions where care is provided, typically a department in a hospital.


Details 2
  • clarify that one row is expected in this table
  • add column DATASET_TYPE_CONCEPT_ID. Definition: reference to a concept_id in the OHDSI/OMOP Terminology (class = “Dataset Type”) that indicates what type of data is in the dataset. Set to NULL if none of the concepts correctly characterizes the data. For large samples of a specialized population defined by insurance (e.g., US Medicaid), use the general population concepts.
    • Values are: General population EHR data, General population claims data, General Population EHR + Claims Data, Clinical Trial Data

Advanced Data Quality checks (inside Achilles Heel) would take advantage of this information in this new column.
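
As a minimal sketch of how a data quality tool might use this column, the Python snippet below picks which checks to run from the declared dataset type. The check names are hypothetical, and the dataset type is passed as a label rather than a concept_id because this proposal does not list the concept IDs.

```python
# Dataset type labels taken from the "Values are" list above.
GENERAL_POPULATION_TYPES = {
    "General population EHR data",
    "General population claims data",
    "General Population EHR + Claims Data",
}


def applicable_checks(dataset_type):
    """Return the names of checks to run for a dataset type (None means undeclared).

    Both check names are hypothetical placeholders, not real Achilles Heel rules.
    """
    checks = ["required_fields_populated"]  # assumed to apply to every dataset
    if dataset_type in GENERAL_POPULATION_TYPES:
        # Only meaningful when the data are expected to resemble a general population.
        checks.append("age_distribution_plausible")
    return checks


print(applicable_checks("Clinical Trial Data"))
print(applicable_checks("General population claims data"))
```

With an undeclared type (NULL in the table, None here), only the checks that apply to every dataset are returned.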

DATASET_TYPE_CONCEPT_ID
  • if you don't want to (or can't) declare the type of data, use concept 0 (*)
  • Clinical trial data (dataset type) (*)
  • Multiple sources (dataset type)
  • Registry data (dataset type)
  • Predominantly Electronic Health Record data (dataset type)
  • Predominantly Administrative/Claims data (dataset type)
  • Predominantly Health Information Exchange data (dataset type)
  • Data limited to a single medical specialty/clinical domain, not covering general population (dataset type) (*)

Predominantly means that at least 51% of significant records come from a given source. Inpatient vs outpatient data can be determined from visit types and does not need to be classified above.
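
The 51% rule can be sketched in Python as follows. This assumes the caller has already decided which records count as "significant" and has counted them per source; the function name and the source labels are illustrative only.

```python
def predominant_source(record_counts, threshold_percent=51):
    """Return the source holding at least threshold_percent of records, else None.

    record_counts maps a source label to its number of significant records.
    Integer arithmetic avoids floating-point rounding at the 51% boundary.
    """
    total = sum(record_counts.values())
    if total == 0:
        return None
    source, count = max(record_counts.items(), key=lambda item: item[1])
    if count * 100 >= threshold_percent * total:
        return source
    return None


print(predominant_source({"EHR": 70, "claims": 30}))  # prints: EHR
print(predominant_source({"EHR": 50, "claims": 50}))  # prints: None
```

When no source reaches the threshold, the function returns None and the choice of dataset type is left to the person describing the dataset.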


Column | Description | Data type
DATASET_TYPE_CONCEPT_ID | Type of dataset. Reference to OMOP Concept that provides dataset type classification. | integer

Details 3

Proposing person: Ajit Londhe, & Erica Voss

We would like to propose the following table to hold metadata:

Tablename: METADATA

Column | Description | Data_type
METADATA_CONCEPT_ID | OMOP Vocabulary CONCEPT_ID that identifies the information you wish to track (e.g. 8 for metadata about a Visit) | INT
METADATA_TYPE_CONCEPT_ID | OMOP Vocabulary CONCEPT_ID that identifies the type of information you wish to track (e.g. 1 for metadata about Domains such as a Visit) | INT
NAME | Name of the CONCEPT_ID stored in METADATA_CONCEPT_ID; if there is no applicable CONCEPT_ID, NAME can be used to represent the data stored (e.g. CDM_BUILDER VERSION) | VARCHAR(250)
VALUE | The metadata value you wish to capture | NVARCHAR

Example records:

METADATA_CONCEPT_ID | METADATA_TYPE_CONCEPT_ID | NAME | VALUE
8 | 1 | VISIT | For the outpatient visits, all activity that is recorded on a single day for a person is considered to have occurred during one visit with the visit start and end date corresponding to this date.
0 | 0 | CDM_BUILDER VERSION | 1.8.0.9

NOTES: the original table was

Column | Description | Data type
DATASET_TYPE_CONCEPT_ID | Type of dataset. Reference to OMOP Concept that provides dataset type classification. | integer
PERSON | | text
OBSERVATION_PERIOD | | text
VISIT_OCCURRENCE | Description of the logic used to populate the table (column name indicates the table). | text
PROCEDURE_OCCURRENCE | Description of the logic used to populate the table (column name indicates the table). | text
CONDITION_OCCURRENCE | Description of the logic used to populate the table (column name indicates the table). | text
DRUG_EXPOSURE | Description of the logic used to populate the table (column name indicates the table). | text
MEASUREMENT | Description of the logic used to populate the table (column name indicates the table). | text

documentation/next_cdm/metadata.1477511046.txt.gz · Last modified: 2016/10/26 19:44 (external edit)