====== Minutes_Meeting_02032016 ======

==== Attendees ====

Hua Xu, Jon Duke, George Hripcsak, Karthik Natarajan, Anupama Gururaj, Mark Khayter, Min Jiang, Alexandre Yahi, Noemie Elhadad, Juan M Banda, Olga Patterson, Lian Hu

==== Agenda ====

{{:projects:workgroups:nlp_wg_meeting_02032016_final.pdf|}}

  - Minimal Model Presentation – Alex
  - Note-type mapping Presentation – Karthik
  - Share existing ontologies from Vanderbilt (Hua) and Regenstrief (Jon)
  - Share strategies for combining data from different searches – Jon
  - Report on WG for commenting – Hua
  - Wrappers for cTAKES and MetaMap – Min
  - Improvements to search engine set up using MT samples – Min
  - Textual Data Representation – Discussion
  - Goals of 2016
  - Change of meeting time

===Minutes===

  - Minimal model presentation - Alex {{:projects:workgroups:ohdsi_nlp_wg_yahi.pdf|}}
        - The model is based on the SHARE-N model and adapted to the current data structure. It incorporates other semantic types; not all of the modifiers are available in cTAKES yet.
        - The processed notes came from the eMERGE cohort at Columbia: about 60,000 notes covering 1,700 patients. The original patient count was 3,200.
        - In theory, a set containing the combination of minimal modifiers can be generated. In practice, the question is whether the data can be trusted enough to add to the OHDSI tables. Only the highest-confidence data (with maximum PPV) should be added to the tables.
        - Next steps:
          - Look at the note sections to determine the errors.
          - Work with Sunny to generate the NLP outputs for the phenotyping data
          - Evaluate by comparisons with structured data
          - Make the system more robust
          - Generate a protocol and/or annotation guidelines
          - Share the data as a gold standard with manually annotated CUIs
          - Try Alex's script on different datasets and evaluate it across notes from different institutions
          - Identify a minimal set of notes to work with when making recommendations to the OHDSI community
          - Identify sets of concepts that are not reliable; negation is a very good example.
          - Continue discussion of NLP system evaluation across different sites
  - The NLP-WG will meet on the second Wednesday of every month

===Action Items===

  - Note-type mapping Presentation - Karthik
  - Share existing ontologies from Vanderbilt (Hua) and Regenstrief (Jon)
  - Share strategies for combining data from different searches - Jon
  - Report on WG for commenting - Hua
  - Wrappers for cTAKES and MetaMap - Min
  - Improvements to search engine set up using MT samples - Min
  - Textual Data Representation - Discussion
  - NLP system evaluation across different sites - Discussion