BMC Medical Informatics and Decision Making
Empirical advances with text mining of electronic health records
Research Article
T. Delespierre [1], P. Denormandie [2], L. Josseran [3], A. Bar-Hen [4]
[1] Institut du Bien Vieillir Korian, 21-25 rue Balzac, 75008, Paris, France; Research lab: EA 4047, UFR des Sciences de la Santé Simone Veil, UVSQ Université Paris-Saclay, 2 Avenue de la Source de la Bièvre, 78180, Montigny le Bretonneux, France; MNH Group, 185 rue de Bercy, 75012, Paris, France; Research lab: EA 4047, UFR des Sciences de la Santé Simone Veil, UVSQ Université Paris-Saclay, 2 Avenue de la Source de la Bièvre, 78180, Montigny le Bretonneux, France; UFR de Mathématiques et Informatique, Université de Paris Descartes, 45 rue des Saints-Pères, 75006, Paris, France
Keywords: Nursing homes; SQL query; Information extraction; Named entity recognition; Data mining; Text mining; Word cloud; Multiple component analysis; Principal component analysis; Hierarchical clustering
DOI: 10.1186/s12911-017-0519-0
Received 2016-12-07, accepted 2017-08-04, published 2017
Source: Springer
【 Abstract 】
Background: Korian is a private group specializing in medical accommodations for elderly and dependent people. A professional data warehouse (DWH) established in 2010 hosts all of the residents' data. Inside this information system (IS), clinical narratives (CNs) were used only by medical staff, as a tool for linking residents' care. The objective of this study was to show that, through qualitative and quantitative textual analysis of a relatively small, well-defined sample of physiotherapy CNs, it was possible to build a physiotherapy corpus and, through this process, generate a new body of knowledge by adding relevant information to describe the residents' care and lives.

Methods: Meaningful words were extracted through Structured Query Language (SQL) queries, using the LIKE operator and wildcards to perform pattern matching, followed by text mining and a word cloud using R® packages. Another step involved principal component and multiple correspondence analyses, plus clustering, on the same residents' sample as well as on other health data, using a health model measuring the residents' care-level needs.

Results: By combining these techniques, physiotherapy treatments could be characterized by a list of constructed keywords, and the residents' health characteristics were built. Feeding defects or health outlier groups could be detected, physiotherapy residents' data and their health data were matched, and differences in health situations showed qualitative and quantitative differences in physiotherapy narratives.

Conclusions: This two-stage textual experiment showed that text mining and data mining techniques provide convenient tools to improve residents' health and quality of care by adding new, simple, usable data to the electronic health record (EHR). When used with a normalized physiotherapy problem list, text mining through information extraction (IE), named entity recognition (NER) and data mining (DM) can provide a real advantage in describing health care, adding new medical material and helping to integrate the EHR system into the health staff work environment.
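
The keyword-extraction step described in the Methods (SQL pattern matching with LIKE and wildcards, followed by word counts of the kind that feed a word cloud) can be illustrated with a minimal sketch. It is not the authors' implementation: the study used the Korian data warehouse and R packages, whereas this sketch uses Python with an in-memory SQLite database. The table `narratives`, its columns, the sample notes, the stopword list and the helper names `find_notes` and `term_frequencies` are all hypothetical.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical clinical narratives; table name, column names and texts are
# illustrative assumptions and do not come from the Korian data warehouse.
NOTES = [
    (1, "Walking rehabilitation with walker, balance exercises"),
    (2, "Resident refused the session today"),
    (3, "Massage of the left shoulder, then walking in the corridor"),
    (4, "Balance work and transfer training"),
]

# Small illustrative stopword list (an assumption, not the study's list).
STOPWORDS = frozenset({"the", "of", "in", "and", "with", "then"})


def find_notes(conn, pattern):
    """Return (resident_id, note) rows whose text matches a LIKE pattern.

    '%' matches any run of characters, so '%walk%' matches 'walking' and
    'walker'. SQLite's LIKE is case-insensitive for ASCII letters by default.
    """
    cur = conn.execute(
        "SELECT resident_id, note FROM narratives "
        "WHERE note LIKE ? ORDER BY resident_id",
        (pattern,),
    )
    return cur.fetchall()


def term_frequencies(texts):
    """Count lowercase alphabetic tokens, skipping stopwords."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE narratives (resident_id INTEGER, note TEXT)")
conn.executemany("INSERT INTO narratives VALUES (?, ?)", NOTES)

walking = find_notes(conn, "%walk%")    # notes mentioning walking or a walker
balance = find_notes(conn, "%balanc%")  # notes mentioning balance work
freqs = term_frequencies(note for _, note in NOTES)
```

The term counts in `freqs` are the kind of input a word cloud is drawn from; the principal component, multiple correspondence and clustering analyses mentioned in the Methods are separate steps not covered by this sketch.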
【 License 】
CC BY
© The Author(s). 2017