Ingest NAACCR into the OMOP vocabulary tables. #6
Rimma's instructions:
|
@dimshitc http://applications.naaccr.org/querybuilder/default.aspx?Version=18 Here is the filtered list: |
Here is @dimshitc's first cut at putting NAACCR into a computable format: https://docs.google.com/spreadsheets/d/1dEvy9xWhEfy-AnjhItMOe5MHbaUUrMZwSk2rZCzxO0k/edit?usp=sharing |
Procedures details:
|
Histological grades:
1 G1: Well differentiated (with some insignificant name variations)
A Well differentiated
H High grade
M TP53 or CTNNB Mutation - Adrenal Gland
Each site uses only numbers OR only letters, plus the number 9 (unknown); 1-4 is synonymous with A-D, though. How should we create a concept, except for "M": concept_code = 'Grade_M_Adranal_Breast'? Should we neglect the name difference? |
We should only ingest into the OMOP vocabulary tables NAACCR data items from the following NAACCR sections:
This covers 442 NAACCR items. This will help us focus on the good stuff in NAACCR and weed out the cruft. Some NAACCR items are site-independent and some are site-dependent.
Ingesting Site-independent NAACCR Items into the OMOP Vocabulary Tables
Site-independent NAACCR items should be fairly straightforward to ingest into the OMOP vocabulary tables. For example, NAACCR item #410 'Laterality' should be ingested like so:
CONCEPT
CONCEPT_RELATIONSHIP
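The CONCEPT and CONCEPT_RELATIONSHIP rows for a site-independent item can be sketched roughly as follows. This is a minimal illustration, not the rows originally posted in this issue: the `concept_class_id` values, the `item@value` concept_code format, the 'Has Answer'/'Answer of' relationship choice, the placeholder dates, the starting concept_id, and the two Laterality value labels are all assumptions and should be checked against the NAACCR data dictionary and the OMOP conventions the project settles on. The domains ('Measurement' for the item, 'Meas Value' for its values) and vocabulary_id 'NAACCR' follow the discussion in this thread.

```python
from datetime import date

VOCABULARY_ID = "NAACCR"


def concept_row(concept_id, name, domain_id, concept_class_id, concept_code):
    """Build one CONCEPT row as a dict keyed by OMOP CONCEPT column names."""
    return {
        "concept_id": concept_id,
        "concept_name": name,
        "domain_id": domain_id,
        "vocabulary_id": VOCABULARY_ID,
        "concept_class_id": concept_class_id,  # assumed class names
        "standard_concept": None,  # left unset; not decided in this thread
        "concept_code": concept_code,
        "valid_start_date": date(1970, 1, 1),  # placeholder dates
        "valid_end_date": date(2099, 12, 31),
        "invalid_reason": None,
    }


def ingest_item(item_number, item_name, values, first_concept_id):
    """Return (concepts, relationships) for one item and its coded values.

    `values` maps a NAACCR value code to its label.
    """
    item_id = first_concept_id
    concepts = [
        concept_row(item_id, item_name, "Measurement",
                    "NAACCR Variable", str(item_number))
    ]
    relationships = []
    for offset, (code, label) in enumerate(values.items(), start=1):
        value_id = first_concept_id + offset
        concepts.append(
            concept_row(value_id, label, "Meas Value",
                        "NAACCR Value", f"{item_number}@{code}")
        )
        # Link the item to its value in both directions (assumed relationship ids).
        relationships.append({"concept_id_1": item_id, "concept_id_2": value_id,
                              "relationship_id": "Has Answer"})
        relationships.append({"concept_id_1": value_id, "concept_id_2": item_id,
                              "relationship_id": "Answer of"})
    return concepts, relationships


# Illustrative subset of Laterality values; labels are assumptions.
laterality_values = {"1": "Right: origin of primary", "2": "Left: origin of primary"}
concepts, relationships = ingest_item(410, "Laterality", laterality_values,
                                      first_concept_id=2_000_000_000)
```

The starting id of 2,000,000,000 is only a stand-in for whatever concept_id range the vocabulary is eventually assigned.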
Ingesting Site-dependent NAACCR Items into the OMOP Vocabulary Tables
Site-dependent NAACCR items are not as straightforward to ingest into the OMOP vocabulary tables. There are 3 different bindings of site-dependent NAACCR items:
SEER EOD includes NAACCR SSDI. So we can focus on SEER EOD and SEER TNM. Both of these bindings are bound to schemas. Schemas are high-level names of anatomic sites that bundle combinations of ICD-O-3 site/histology. The schemas for SEER EOD and SEER TNM overlap but are not equivalent. Some schema examples:
Some of the NAACCR items bound to an EOD schema/TNM schema have possible values that do not vary based on schema. Others have possible values that do vary based on schema. Here is an example of a non-varying NAACCR item from two EOD schemas:
This NAACCR item can be ingested into the OMOP vocabulary tables like so:
CONCEPT
CONCEPT_RELATIONSHIP
Here is an example of a NAACCR item that has varying possible values across EOD schemas:
This NAACCR item can be ingested into the OMOP vocabulary tables like so:
CONCEPT
CONCEPT_RELATIONSHIP
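One way to handle the two cases (values that vary by schema and values that do not) is to qualify the value's concept_code with the schema only when the value lists actually differ. The sketch below is an assumption about how that could look, not the scheme used in the tables above: the `schema@item@value` code format, the function names, the item number 9999, and the schema names are all hypothetical.

```python
def value_concept_code(item_number, value_code, schema=None):
    """Build a concept_code; prefix with the schema only when one is given."""
    parts = [str(item_number), str(value_code)]
    if schema is not None:
        parts.insert(0, schema)
    return "@".join(parts)


def value_concepts(item_number, values_by_schema):
    """Map concept_code -> label for one schema-bound item.

    `values_by_schema` maps a schema name to {value_code: label}.
    If every schema lists identical values, one shared set of codes is
    produced; otherwise each schema gets its own schema-qualified codes.
    """
    distinct_value_sets = {
        tuple(sorted(values.items())) for values in values_by_schema.values()
    }
    varies = len(distinct_value_sets) > 1

    codes = {}
    for schema, values in values_by_schema.items():
        for value_code, label in values.items():
            key = value_concept_code(item_number, value_code,
                                     schema if varies else None)
            codes[key] = label
    return codes


# Hypothetical item whose values are the same in both schemas.
same = value_concepts(9999, {
    "schema_a": {"0": "Absent", "1": "Present"},
    "schema_b": {"0": "Absent", "1": "Present"},
})

# Hypothetical item whose values differ between schemas.
different = value_concepts(9999, {
    "schema_a": {"1": "Meaning in schema A"},
    "schema_b": {"1": "Meaning in schema B"},
})
```

With this approach a non-varying item yields one value concept per code regardless of how many schemas it is bound to, while a varying item yields one value concept per schema and code.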
|
domain_id = 'NAACCR' can't be right. It should be Observation or Measurement, depending on where you want this data to go. |
@dimshitc That is a typo. Sorry. Fixed. The domain for most NAACCR attributes/questions should be 'Measurement' and for NAACCR values/answers should be 'Meas Value'. The vocabulary_id should be NAACCR. Some treatment-related NAACCR attributes/questions and values/answers will be in the new Treatment domain (I will work on the Treatment NAACCR items/item answers this weekend). |
The first version of this has been completed. |
(Except for M-9727, 9733, 9741-9742, 9764-9809, 9832, 9840-9931, 9945-9946, 9950-9967 and 9975-9992)", "row":[ {}, {}]}. The "site_inclusions" key details the range of ICD-O topography/anatomical sites bound to this list of surgery codes. The "row" key is an array of hashes that represents the site-specific surgery code hierarchy. Some of the hashes in the "row" key are for line-break formatting and instructional text. Ignore all entries in the "row" key that have a "line_break": true key/value pair or a "code": "" key/value pair. The hierarchical structure of the "row" key is based on the order of the entries. The "level" key can go from 0 to 3.
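The order-plus-level rule described above can be turned into a tree along these lines. This is a rough sketch under stated assumptions: only the "code", "level", and "line_break" keys come from the description; the "description" key, the `build_hierarchy` name, and the sample rows are hypothetical, and entries with no "code" key at all are skipped the same way as entries with an empty code.

```python
def build_hierarchy(rows):
    """Nest surgery-code rows into a tree using entry order and "level"."""
    roots = []
    stack = []  # (level, node) pairs for the current chain of ancestors
    for row in rows:
        # Skip formatting and instructional entries.
        if row.get("line_break") is True or row.get("code", "") == "":
            continue
        level = int(row.get("level", 0))
        node = {
            "code": row["code"],
            "description": row.get("description", ""),  # assumed key
            "children": [],
        }
        # Drop ancestors at the same or a deeper level than this row.
        while stack and stack[-1][0] >= level:
            stack.pop()
        if stack:
            stack[-1][1]["children"].append(node)
        else:
            roots.append(node)
        stack.append((level, node))
    return roots


# Hypothetical rows, only to show the shape of the input.
sample_rows = [
    {"code": "00", "level": 0},
    {"line_break": True},
    {"code": "", "level": 0, "description": "instructional text"},
    {"code": "10", "level": 0},
    {"code": "11", "level": 1},
    {"code": "12", "level": 1},
    {"code": "13", "level": 2},
    {"code": "20", "level": 0},
]
tree = build_hierarchy(sample_rows)
```

Each row is attached to the nearest preceding row with a lower level, so a row that skips a level still lands under a sensible parent.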
row.json
documentation.html