Paper Title
Multi-Label Classification of Toxic Comments using Fast-Text and CNN
Abstract
Social networking and online conversation platforms give us the power and ease to share our views and
ideas. However, many people take these platforms for granted and treat them as
an opportunity to harass and target others, leading to cyber-attacks and cyberbullying, traumatic experiences, and,
in extreme cases, suicide attempts. Manually identifying and classifying such comments is a long, tiresome, and
unreliable process. To address this challenge, we aim to develop a deep learning system that identifies such
negative content in online discussion sections and then classifies it under the appropriate labels. Our proposed model
applies a text-based Convolutional Neural Network (CNN) to word embeddings produced by the Fast-Text method to analyze the
comments. Fast-Text has shown more efficient results than the Word2Vec and GloVe models. Our model aims to improve
the detection of different types of toxicity and thereby improve the social media experience. The dataset used for building the model is
drawn from Wikipedia’s talk page edits.
Keywords - CNN - Convolutional Neural Network, RNN - Recurrent Neural Network, TF-IDF - Term Frequency-Inverse
Document Frequency, SVM - Support Vector Machine, GloVe - Global Vectors for Word Representation.