This shows you the differences between two versions of the page.

projects:workgroups:minutes [2015/10/07 16:10]
anu_gururaj [Agenda]
projects:workgroups:minutes [2015/10/22 19:19]
Line 3: Line 3:
 ==== Attendees ====
-Hua Xu, Jon Duke, Noemie Elhadad, Anupama Gururaj
+Hua Xu, Jon Duke, Noemie Elhadad, Anupama Gururaj, Alexandre Yahi, Thomas Ginter, Olga Patterson, George Hripcsak, Vojtech Huser
 ==== Agenda ====
-  - Presentation by Dr. Jon Duke, Title: Regenstrief NLP platform and approach to validation of phenotypes
-  - Presentation by Dr. Noemie Elhadad, Title: NLP schemas and clinical NLP tools in ShARe
-  - Discussion
+  - IRB for use of clinical text
+  - Clinical text data storage and representation schema
+    - Presentation by Dr. Noemie Elhadad
+    - Title: NLP schemas and clinical NLP tools in ShARe, {{:projects:workgroups:15ohdsi_nlp_share.pdf|File}}
+        - Output converted from unstructured text can take the form of structured data, bag of words, or word embeddings. Structured data and bag of words are the most useful in the current context.
+        - The ShARe schema for structured output draws on several initiatives, such as SHARP and THYME.
+    - Discussion – Next steps
+        - The table structure for storing concept-level NLP outputs is to be determined
+        - It is sufficient to start with structured output
+        - A concept table should be generated, with a concept ID in each row along with the note IDs
+        - The OMOP vocabulary is to be used to aggregate concepts to a higher level, to manage and condense the number of concepts
+        - The next step is to go through all the columns exhaustively for all attributes, merge them, and then decide which attributes should be used in the table
+  - NLP tools/pipelines for ETL
+  - Use cases, e.g., phenotyping for cohort selection using NLP outputs
+    - Presentation by Dr. Jon Duke, Title: Regenstrief NLP platform and approach to validation of phenotypes
+        - The NLP platform consists of a state machine with a regex-based system
+        - The NLP data analysis tool is currently being used for data analytics at Regenstrief
+        - The tool has text search capabilities and was demonstrated at the meeting
+    - Discussion – Next steps
+        - Need to determine whether the API for keyword search, based on Solr or Elasticsearch, can be shared
+  - Discussion
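
For illustration only, the concept table and the OMOP-based aggregation raised in the schema discussion could look roughly like the sketch below. The table structure was still to be determined at the meeting, so this is not the workgroup's design: the table name ''note_concept'', its two columns, and all concept and note IDs are made up for the example. The ''concept_ancestor'' table uses the ancestor/descendant column names of the OMOP vocabulary table of the same name, filled here with a toy hierarchy.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a toy concept table and hierarchy."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical concept-level NLP output: one row per (note, concept).
        CREATE TABLE note_concept (
            note_id    INTEGER NOT NULL,
            concept_id INTEGER NOT NULL
        );
        -- Column names follow the OMOP CONCEPT_ANCESTOR table; rows are invented.
        CREATE TABLE concept_ancestor (
            ancestor_concept_id   INTEGER NOT NULL,
            descendant_concept_id INTEGER NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO note_concept VALUES (?, ?)",
        [(1, 101), (1, 102), (2, 102), (3, 201)],
    )
    conn.executemany(
        "INSERT INTO concept_ancestor VALUES (?, ?)",
        [(100, 101), (100, 102), (200, 201)],
    )
    conn.commit()
    return conn


def notes_per_ancestor(conn):
    """Roll concepts up to their ancestors and count distinct notes per ancestor."""
    rows = conn.execute(
        """
        SELECT ca.ancestor_concept_id, COUNT(DISTINCT nc.note_id)
        FROM note_concept AS nc
        JOIN concept_ancestor AS ca
          ON ca.descendant_concept_id = nc.concept_id
        GROUP BY ca.ancestor_concept_id
        ORDER BY ca.ancestor_concept_id
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(notes_per_ancestor(build_demo_db()))
```

Rolling the three fine-grained concepts up to two ancestors is the kind of condensing the vocabulary step is meant to provide; counting distinct notes keeps a note from being counted twice when it mentions several descendants of the same ancestor.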
projects/workgroups/minutes.txt · Last modified: 2015/10/29 18:36 by anu_gururaj