Applied Sciences
Enhancement of Text Analysis Using Context-Aware Normalization of Social Media Informal Text
Jebran Khan [1], Sungchang Lee [1]
[1] School of Electronics and Information Engineering, Korea Aerospace University, Deogyang-gu, Goyang-si 412-791, Korea
Keywords: social media; noisy text; informal text; LSTM; BERT; text normalization
DOI: 10.3390/app11178172
Source: DOAJ
【 Abstract 】
We propose a generic social media Textual Variations Handler (TVH), independent of application and data variations, that handles a wide range of noise in the textual data generated by various social media (SM) applications to enhance text analysis. The aim is to build an effective hybrid normalization technique that uses the useful information in noisy text in its intended form, rather than filtering it out, so that SM text can be analyzed better. The proposed TVH performs context-aware text normalization based on intended meaning to avoid wrong word substitutions. We integrate the TVH with state-of-the-art (SOTA) deep-learning-based text analysis methods to enhance their performance on noisy SM text data. In simulation, the proposed scheme shows promising improvement in the text analysis of informal SM text in terms of precision, recall, accuracy, and F1-score.
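The idea of context-aware substitution can be illustrated with a small toy sketch. This is not the authors' TVH, whose internals are not described in the abstract. It swaps in a hand-written cue-word lexicon where a learned model would normally sit. The names `CANDIDATES`, `tokenize`, and `normalize`, the lexicon entries, and the `window` parameter are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon (an assumption, not from the paper): each noisy
# token maps to candidate standard forms, and each candidate carries cue
# words that suggest that meaning when they appear nearby.
CANDIDATES = {
    "gr8": {"great": set()},
    "u": {"you": set()},
    "ur": {
        "your": {"phone", "car", "idea", "post"},
        "you are": {"welcome", "right", "late", "awesome"},
    },
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def normalize(text, window=2):
    """Replace known noisy tokens with the candidate that best fits the context.

    Tokens missing from the lexicon are kept unchanged rather than dropped.
    Ties are resolved in favor of the first candidate listed in the lexicon.
    """
    tokens = tokenize(text)
    out = []
    for i, tok in enumerate(tokens):
        options = CANDIDATES.get(tok)
        if options is None:
            out.append(tok)
            continue
        # Context is the set of up to `window` tokens on each side.
        context = set(tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window])
        # Score each candidate by how many of its cue words occur in context.
        best = max(options, key=lambda cand: len(options[cand] & context))
        out.append(best)
    return " ".join(out)


print(normalize("ur phone is gr8"))
print(normalize("ur welcome"))
```

In this sketch the same noisy token "ur" resolves to "your" next to "phone" and to "you are" next to "welcome", which is the kind of wrong-word substitution a context-free lookup table cannot avoid.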
【 License 】
Unknown