Prevalence and Evaluation of Potential Abbreviations in Intensive Care Documentation: A Clinical Language Exploration, Annotation Research: Utilizing Open Source and Commercial Applications
[Thesis]
Brundage, David M.
Srinivasan, Shankar
Rutgers The State University of New Jersey, Rutgers School of Health Professions
2020
139 p.
Ph.D.
Rutgers The State University of New Jersey, Rutgers School of Health Professions
2020
Introduction: Abbreviations are often used in clinical documentation to reduce the time spent documenting in electronic health records and to save space. Abbreviations pose a specific challenge in healthcare because they often have multiple meanings. This ambiguity is a patient safety issue when clinicians do not properly understand the intended use of an abbreviation, and a health literacy issue for patients as they try to understand what a provider's note says about the care provided. Plenty of research has been done on a clinician's ability to disambiguate abbreviations, but little work has been done to assess how clinicians use abbreviations or to create tools that help administrators and clinicians explore the documentation of their providers. Methods: A semi-supervised approach was taken to identify potential abbreviations within the MIMIC-III database. Over 400 million word tokens were compared to a list of approved abbreviations for Beth Israel Deaconess Hospital. The results of this semi-supervised identification were used to analyze the use and prevalence of abbreviations within the dataset. Results: 463,175,566 raw word tokens were compared to a list of 1,742 approved abbreviations. On average, every document within MIMIC contained almost 14 abbreviation tokens; roughly 9% of an average note consists of potential abbreviations. In some notes, potential abbreviations made up almost 26% of tokens. A note created by an RN contains 21.87 potential abbreviations on average, and a note created by an MD contains 11.39, a substantial difference between RN and MD notes.
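The list-matching step described in the Methods can be illustrated with a minimal Python sketch. It is not the thesis's pipeline: the tokenizer, the case-insensitive matching, the five-entry `APPROVED` set, and the two sample notes are all assumptions made up for illustration.

```python
import re
from statistics import mean

# Hypothetical stand-in for the hospital's approved abbreviation list.
APPROVED = {"pt", "hr", "bp", "sob", "prn"}


def tokenize(text):
    """Split a note into lowercase alphanumeric tokens (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def abbreviation_stats(note, approved=APPROVED):
    """Count tokens that appear in the approved list and their share of the note."""
    tokens = tokenize(note)
    hits = [t for t in tokens if t in approved]
    share = len(hits) / len(tokens) if tokens else 0.0
    return {"tokens": len(tokens), "abbreviations": len(hits), "share": share}


# Invented example notes, not taken from MIMIC-III.
notes = [
    "Pt c/o SOB overnight, HR 110, BP stable.",
    "Patient resting comfortably, no acute distress.",
]
stats = [abbreviation_stats(n) for n in notes]
avg_count = mean(s["abbreviations"] for s in stats)
```

Because a token is counted whenever it matches a list entry, ordinary words that happen to coincide with an abbreviation are counted too, which is one reason the abstract speaks of "potential" abbreviations.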
MIMIC note events contain a substantial amount of potential abbreviations. Using the Medrec2vec word embedding model, we extracted the ten most similar terms for each approved abbreviation at Beth Israel Deaconess (BID) and assessed whether the vector space captured the semantic meaning of the abbreviated term. Of the 1,743 abbreviations approved by BID, the word embedding model accurately extracted the semantic relationship for 963 terms. For 620 abbreviations the model did not extract the appropriate semantic term, and 160 terms were not found within the vector space of the model. Our model achieved a precision of .60, a recall of .85, and an F1 of .71. While it performed decently using only term similarity, it struggled when abbreviations had multiple meanings. Conclusion: Using the MIMIC data set, we have shown that clinical abbreviations and complex clinical jargon make up a measurable portion of provider documentation: 8.22% of all words within the MIMIC note events table are terms found in the Beth Israel Deaconess approved abbreviation list. We have also shown that abbreviations in medical text can be replaced to provide additional context to patients.
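The reported precision, recall, and F1 can be related to the three counts in the abstract with a short worked example. The abstract does not state how the counts map onto the confusion matrix, so the mapping below is an assumption: the 963 correct extractions are treated as true positives, the 620 incorrect extractions as false positives, and the 160 terms missing from the vector space as false negatives.

```python
import math

# Counts from the abstract; their roles as TP/FP/FN are an assumption.
TP = 963  # semantic relationship extracted correctly
FP = 620  # an incorrect term was extracted
FN = 160  # abbreviation not found in the model's vector space

precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)


def truncate2(x):
    """Truncate (not round) a non-negative value to two decimal places."""
    return math.floor(x * 100) / 100


print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
print(truncate2(precision), truncate2(recall), truncate2(f1))
```

Under this assumed mapping the values are about 0.608, 0.858, and 0.712. They match the reported .60, .85, and .71 when truncated to two decimals, whereas rounding would give .61 and .86 for precision and recall. The three counts also sum to 1,743, the number of approved abbreviations given for this experiment.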
Health care management
Information science