Digital Library

Title:      EXTRACT CLINICAL MEASUREMENT VALUES USING A REGULAR EXPRESSION PATTERN DISCOVERY ALGORITHM VS SUPPORT VECTOR MACHINE
Author(s):      Douglas Redd, Bryan Gibson, Maureen A. Murtaugh, Joseph Goulet and Qing Zeng-Treitler
ISBN:      978-989-8533-77-7
Editors:      Mário Macedo and Piet Kommers
Year:      2018
Edition:      Single
Keywords:      Medical Informatics, Clinical Informatics, Natural Language Processing, Machine Learning, Regular Expressions
Type:      Full Paper
First Page:      29
Last Page:      36
Language:      English
Paper Abstract:      Background: Clinical measurements are commonly embedded in free-text clinical notes. They can be extracted using natural language processing, but doing so can be resource-intensive and the results often generalize poorly. We demonstrate a new approach, regular expression discovery for extraction (REDEx), a supervised machine learning algorithm we developed that automatically generates regular expressions to extract measurements with reduced effort. Results: We compare REDEx with a support vector machine (SVM) on the task of body weight extraction. A total of 968 weight values were annotated in 300 clinical notes and used to train the REDEx and SVM models. REDEx automatically generated 98 regular expressions. In 10-fold cross-validation, the REDEx model consistently outperformed the SVM model, with precision 0.99 vs. 0.85, recall 0.98 vs. 0.87, F1-score 0.99 vs. 0.86, and accuracy 0.98 vs. 0.82.
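Illustrative sketch: To make the task in the abstract concrete, the following Python fragment shows what regular-expression-based weight extraction and its evaluation might look like. It is not the REDEx algorithm and the pattern is not one of the 98 expressions REDEx generated; the pattern, the function names (`extract_weights`, `precision_recall_f1`), and the sample note are all hypothetical and hand-written for illustration only.

```python
import re

# Hypothetical hand-written pattern (an assumption for illustration);
# REDEx learns its expressions from annotated notes instead.
WEIGHT_PATTERN = re.compile(
    r"\b(?:weight|wt)\b\.?\s*[:=]?\s*"      # cue word, optional punctuation
    r"(\d{2,3}(?:\.\d+)?)\s*"               # numeric value, e.g. 182.5
    r"(lbs?|pounds?|kgs?|kilograms?)\b",    # unit
    re.IGNORECASE,
)


def extract_weights(note):
    """Return (value, unit) pairs for every weight mention found in a note."""
    return [
        (float(m.group(1)), m.group(2).lower())
        for m in WEIGHT_PATTERN.finditer(note)
    ]


def precision_recall_f1(predicted, gold):
    """Score a collection of predicted items against gold annotations."""
    pred_set, gold_set = set(predicted), set(gold)
    true_pos = len(pred_set & gold_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Invented sample note, not taken from the study data.
    note = "Pt reports wt: 182.5 lbs today. Weight 83 kg at last visit. BP 120/80."
    print(extract_weights(note))
```

A single hand-written pattern like this one covers only a few phrasings. The approach described in the abstract instead generates many expressions automatically from the annotated examples.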
   
