Journal article details
Proceedings of the XXth Conference of Open Innovations Association FRUCT
Avoiding Unintended Bias in Toxicity Classification with Neural Networks
Sergey Morzhov [1]
[1] P.G. Demidov Yaroslavl State University, Russia;
Keywords: toxicity; natural language processing; nlp; deep learning; recurrent neural networks; rnn; lstm; gru; attention mechanism; word embedding; fasttext; glove; bert
DOI: 10.23919/FRUCT48808.2020.9087368
Source: DOAJ
【 Abstract 】

The growing popularity of online platforms that allow users to communicate with each other, exchange opinions about various events and leave comments has contributed to the development of natural language processing algorithms. The tens of millions of messages published every day by the users of a given social network must be analyzed in real time for moderation, to prevent the spread of illegal or offensive information, threats and other types of toxic comments. Such a large volume of information can be processed quickly enough only automatically. That is why it is necessary to find a way to teach a computer to "understand" a text written by a human. This is a non-trivial task, even if "understand" here means only to detect or classify. The rapid development of machine learning technologies has led to the widespread adoption of new algorithms. Many tasks that were long considered almost impossible to solve with a computer can now be solved successfully with deep learning. In this article the author presents new algorithms that successfully address the problem of toxic comment detection using deep learning and neural networks. Furthermore, the article presents the results of the developed algorithms, as well as of their ensemble, evaluated on a large dataset collected and annotated by Google and Jigsaw.
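As a rough illustration of the model family suggested by the keywords (a recurrent network over pretrained word embeddings), the following is a minimal sketch in Keras. The layer sizes, the randomly initialized stand-in for a GloVe/fastText embedding matrix, and the single binary toxicity label are assumptions made for illustration only, not the paper's exact architecture or ensemble.

    # Illustrative sketch of a bidirectional-LSTM toxicity classifier.
    # Hyperparameters and the embedding matrix below are assumptions,
    # not the configuration used in the paper.
    import numpy as np
    import tensorflow as tf

    VOCAB_SIZE = 50_000   # assumed vocabulary size
    EMBED_DIM = 300       # typical GloVe / fastText dimensionality
    MAX_LEN = 200         # assumed maximum comment length in tokens

    # Stand-in for a pretrained embedding matrix; in practice GloVe or
    # fastText vectors would be loaded here.
    embedding_matrix = np.random.normal(size=(VOCAB_SIZE, EMBED_DIM)).astype("float32")

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(
            VOCAB_SIZE, EMBED_DIM,
            embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
            trainable=False,  # keep the pretrained vectors fixed
        ),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
        tf.keras.layers.GlobalMaxPooling1D(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # probability that the comment is toxic
    ])

    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])

    # Toy forward pass on random token ids, just to show the expected shapes.
    dummy_batch = np.random.randint(0, VOCAB_SIZE, size=(2, MAX_LEN))
    print(model.predict(dummy_batch))  # -> two toxicity probabilities in [0, 1]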

【 License 】

Unknown   
