Christian Reich, Rimma Belankaya, Dmytry Dymshyts, Andrew Williams, Robert Miller, Michael Gurley
Reviewed Dmytry's latest progress on the NAACCR ingestion script.
Discussed moving NAACCR schemas into a new structure so that CONCEPT.concept_code stays clean, that is, contains nothing that is not in the NAACCR source data. We decided that for version 1 we will keep NAACCR schemas in the vocabulary and include NAACCR schema IDs in site-dependent NAACCR items and NAACCR Item Codes. As a result, ETL developers will need to know to use Rimma's ETL code to map NAACCR data into OMOP.
Agreed that mapping NAACCR schemas to ICD site/histology combinations is a must-have for version 1 of NAACCR ingestion. Rimma's ETL SQL will not work without it.
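To illustrate why the ETL depends on this mapping, here is a minimal sketch of a site/histology-to-schema lookup. It is not Rimma's ETL SQL; the table name, the function name, and every key and value are hypothetical placeholders chosen for illustration.

```python
# Hypothetical lookup table: (ICD site, histology) -> NAACCR schema ID.
# All keys and values are placeholders, not real codes.
SCHEMA_BY_SITE_HISTOLOGY = {
    ("SITE_A", "HIST_1"): "schema_a",
    ("SITE_B", "HIST_2"): "schema_b",
}


def resolve_schema(site, histology, mapping=SCHEMA_BY_SITE_HISTOLOGY):
    """Return the schema ID for a site/histology pair, or raise if unmapped."""
    try:
        return mapping[(site, histology)]
    except KeyError:
        # Without a schema, a site-dependent item cannot be resolved.
        raise LookupError(
            f"No NAACCR schema mapped for site={site!r}, histology={histology!r}"
        ) from None
```

An unmapped combination raises an error instead of returning a default, which mirrors the point in the notes that the ETL cannot proceed without the mapping.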
Discussed whether NAACCR site-dependent variables should be duplicated across schemas. Dmytry's first version of NAACCR ingestion duplicates them. We agreed that they should not be duplicated. Dmytry said he can deduplicate them.
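One possible shape for the deduplication is sketched below. It is not Dmytry's script. It assumes that rows sharing the same item number are duplicates of one variable, and it keeps a list of the schemas each variable appears in. The function name, the row layout, and the sample values are hypothetical.

```python
def dedupe_site_dependent_items(rows):
    """Collapse rows that repeat the same item across schemas.

    rows: iterable of (schema_id, item_number, item_name) tuples.
    Returns a dict keyed by item_number; each value holds the item name
    (taken from the first row seen) and the schemas it appeared in.
    """
    merged = {}
    for schema_id, item_number, item_name in rows:
        entry = merged.setdefault(
            item_number, {"item_name": item_name, "schemas": []}
        )
        # Record each schema once, preserving first-seen order.
        if schema_id not in entry["schemas"]:
            entry["schemas"].append(schema_id)
    return merged


# Placeholder rows: item "1001" is repeated under two schemas.
rows = [
    ("schema_a", "1001", "Variable X"),
    ("schema_b", "1001", "Variable X"),
    ("schema_a", "1002", "Variable Y"),
]
deduped = dedupe_site_dependent_items(rows)
```

Keeping the schema list on the merged record preserves which schemas a variable belongs to without storing the variable once per schema.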