Journal Article Details
Heliyon, Volume 6
Assessment of supervised classifiers for the task of detecting messages with suicidal ideation
Héctor Andrés Melgar Sasieta [1], José Manuel Gómez Soriano [2], Roberto Wellington Acuña Caicedo [3]
[1] Corresponding author
[2] Departamento de Ingeniería, Sección de Ingeniería Informática, Escuela de Posgrado, Pontificia Universidad Católica del Perú, Lima, Peru
[3] Carrera de Tecnología de la Información, Universidad Estatal del Sur de Manabí, Ecuador
Keywords: Computer science; Suicidal ideation; Supervised classifiers; Machine learning; Social networks; Automatic classification
DOI  :  
Source: DOAJ
【 Abstract 】

According to the World Health Organization (WHO), close to 800,000 people worldwide die by suicide each year, and many more attempt it. Consequently, the WHO recognizes suicide as a global public health priority that affects not only rich countries but also poor and middle-income countries. This study systematically analyzes 28 supervised classifiers, using different features of the Life Corpus, to detect messages with suicidal ideation and depression and to determine whether these classifiers can be used in an automatic online prevention system. The Life Corpus used in this research is a bilingual text corpus (English and Spanish) oriented to the detection of suicidal ideation. The corpus was constructed by retrieving texts from several social networks, and its quality was measured using mutual annotation agreement. The experiments determined that the best-performing classifier was KStar with the corpus features POS-SYNSETS-NUM, which achieved a ROC area of 0.81036 and an F-measure of 0.7148. The research fulfilled its objective of discovering which supervised classifiers and which features are most suitable for the automatic classification of messages with suicidal ideation using the Life Corpus. In addition, given the imbalance of the results, a new precision measure was developed, called the Two-dimensional Accuracy and Recovery Index (GDP). In unbalanced systems it can provide better results than the usual measures of result quality (F-measure, ROC area), and can thus increase the number of detected messages at risk of suicidal ideation at the cost of receiving more messages unrelated to suicide, or vice versa.

【 License 】

Unknown   
