Investigating the resilience of fake news detection algorithms against adversarial manipulations
Rany Zeidan, Tel Aviv University. Advisor: Prof. Joachim Meyer
Abstract:
The pervasive spread of fake news presents significant challenges to public trust and social stability. This seminar explores the resilience of fake news detection algorithms against adversarial manipulations and proposes defensive strategies to make them more robust. We worked with several datasets from Kaggle, primarily the WELFake dataset, which contains more than 71,000 news articles split roughly evenly between fake and real news. After exploratory data analysis and cleaning, we engineered additional features such as text length, capitalization metrics, and sentiment scores. We then trained multiple models, including traditional machine learning algorithms and deep learning architectures such as LSTM, achieving over 95% accuracy. Using SHAP analysis, we identified the key features that distinguish real from fake news. We then manipulated these influential features (e.g., body length, percentage of uppercase characters in titles, polarity, subjectivity, and counts of exclamation marks, quotes, hashtags, and @ symbols), as an adversarial agent might do to impair algorithmic classification. These manipulations caused a significant drop in the model's recall for fake news, exposing its vulnerability to adversarial attacks. To counteract them, we implemented continuous monitoring and defensive retraining. Incorporating adversarially modified data into the training process significantly improved the model's robustness: even a small percentage of such data in retraining restored detection performance to its previous levels.
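As an illustration of the kind of surface features and manipulations described above, the following minimal sketch computes a few of them and applies a simple attacker-style rewrite. The function names `surface_features` and `adversarial_rewrite` and the specific rewrite rules are assumptions made for this example, not the pipeline used in the study. Polarity and subjectivity are omitted because they require a sentiment library.

```python
import re

def surface_features(title, body):
    """Compute a few hand-crafted surface features of a news article.

    Illustrative only: feature names and definitions are assumptions.
    """
    letters = [c for c in title if c.isalpha()]
    upper = sum(1 for c in letters if c.isupper())
    return {
        "body_length": len(body),
        # Share of uppercase letters among the letters of the title, in percent.
        "title_upper_pct": 100.0 * upper / len(letters) if letters else 0.0,
        "exclamation_count": body.count("!"),
        "quote_count": body.count('"'),
        "hashtag_count": len(re.findall(r"#\w+", body)),
        "at_count": body.count("@"),
    }

def adversarial_rewrite(title, body):
    """Hypothetical attacker edit that tones down the features above."""
    new_title = title.capitalize()                     # only the first letter stays uppercase
    new_body = body.replace("!", ".")                  # remove exclamation marks
    new_body = re.sub(r"[#@](\w+)", r"\1", new_body)   # strip hashtag and mention markers
    return new_title, new_body

title = "SHOCKING: They LIED!"
body = "You won't BELIEVE this!!! #scandal @everyone"
before = surface_features(title, body)
after = surface_features(*adversarial_rewrite(title, body))
```

The rewritten article keeps its wording but scores lower on every feature shown here, which is the kind of shift that could move a feature-based classifier toward the "real" class.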
Our findings highlight the susceptibility of machine learning models to adversarial feature manipulation and underscore the need for ongoing adaptation in fake news detection systems. The study concludes that constant performance monitoring and strategic retraining on a mix of original and modified data are crucial for maintaining and improving detection accuracy over time, so that these systems remain effective against evolving adversarial tactics.
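The monitoring and retraining loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the helper names (`recall`, `needs_retraining`, `mix_training_data`), the 0.05 recall-drop threshold, and the 10% adversarial share are all hypothetical and are not values reported by the study.

```python
import random

def recall(y_true, y_pred, positive=1):
    """Recall for the positive class (here assumed to be 1 = fake news)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0

def needs_retraining(baseline_recall, current_recall, max_drop=0.05):
    """Flag the model when recall falls more than max_drop below the baseline.

    The threshold is an assumed value chosen for illustration.
    """
    return (baseline_recall - current_recall) > max_drop

def mix_training_data(original, adversarial, adv_fraction=0.1, seed=0):
    """Return the original examples plus enough adversarial ones that they
    make up about adv_fraction of the result (0 <= adv_fraction < 1)."""
    rng = random.Random(seed)
    n_adv = round(adv_fraction * len(original) / (1 - adv_fraction))
    n_adv = min(n_adv, len(adversarial))   # cannot sample more than is available
    mixed = list(original) + rng.sample(list(adversarial), n_adv)
    rng.shuffle(mixed)
    return mixed

# Toy usage: 90 original and 50 adversarially modified examples.
original = [("orig", i) for i in range(90)]
adversarial = [("adv", i) for i in range(50)]
mixed = mix_training_data(original, adversarial, adv_fraction=0.1)
```

In practice the mixed set would be passed to whatever training routine the detector uses, and the recall check would be repeated on fresh data after each retraining round.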
Bio:
Rany Zeidan holds a B.Sc. degree in Computer Science from the Technion - IIT and has previously worked in engineering development both in Israel and abroad. He is currently an M.Sc. student in Industrial Engineering at Tel Aviv University, specializing in AI and Data Science.