CYBERBULLYING DETECTION ON SOCIAL MEDIA: LEVERAGING TF-IDF AND LSTM FOR ROBUST CLASSIFICATION
Keywords:
Cyberbullying, TF-IDF, LSTM, social media, deep learning.
Abstract
This study investigates cyberbullying detection on social media by combining a traditional text representation with a deep learning classifier. Specifically, it pairs Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction with a Long Short-Term Memory (LSTM) model. The dataset, consisting of over 47,000 labeled tweets, was preprocessed to reduce noise and improve feature extraction. The LSTM model, a sequential architecture of embedding, LSTM, and dense layers, achieved an accuracy of 91%, demonstrating its effectiveness in capturing contextual information. The TF-IDF features, in turn, provided interpretability that complements the deep learning approach. Despite the high overall accuracy, challenges persist in distinguishing between certain classes, particularly Class 3 and Class 4, which the study attributes to their inherent complexity.
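To make the TF-IDF weighting concrete, the following is a minimal sketch in plain Python. It is not the study's implementation: the whitespace tokenizer, the unsmoothed weight tf × log(N / df), the function names, and the toy sentences are all assumptions chosen for illustration, and library implementations often use smoothed variants of the formula.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; real tweet
    # preprocessing would also handle URLs, mentions, punctuation, etc.
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy examples, not taken from the study's dataset.
docs = [
    "you are so stupid",
    "you are so kind",
    "have a nice day",
]
weights = tf_idf(docs)
```

In this toy corpus, a word that appears in only one sentence (such as "stupid") receives a higher weight than words shared across sentences (such as "you"), and a word present in every document would receive a weight of zero. This direct link between a term and its weight is what makes TF-IDF features easy to inspect.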