Using Web Data in the Medical Domain 2010. | |
Can ProMED-mail Bootstrap Blogs? Automatic Labeling of Victim-reporting Sentences | |
Computer Science; Library, Information and Archival Science | |
Avaré Stewart ; Kerstin Denecke | |
Others : http://ceur-ws.org/Vol-572/paper3.pdf PID : 40856 |
|
Source: CEUR | |
【 Abstract 】
Due to the proliferation of social media data and user-generated content available, monitoring trends or using this data in other scenarios becomes more interesting. Our research focuses on the extraction of information on health events from user-generated content with the objective to support Epidemic Intelligence. Specifically, we describe and evaluate a method for identifying sentences relevant for event extraction. Labeled data is unavailable for this task and manual annotation is expensive. Therefore, in order to reduce the number of labeled examples, we apply a bootstrapping algorithm for this task. In more detail, we will study the suitability of a classifier trained on one text type (e-mails) for the classification of texts of another text type (blogs).
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
Can ProMED-mail Bootstrap Blogs? Automatic Labeling of Victim-reporting Sentences | 496KB | download |