Salecha, Aadesh
2021-07-23
2021-07-23
2021-05
https://hdl.handle.net/11299/222250

Honors Thesis. 2021. Author: Aadesh Salecha. Major: Computer Science. Advisor: Jaideep Srivastava. 1 computer file (PDF); 152 pages.

Social media platforms like Twitter and Facebook have made the world a more connected place and have become indispensable parts of our lives. However, these networks have also become conducive environments for massive diffusion of misinformation. These platforms generate huge volumes of data, a sizable portion of which consists of what has popularly come to be known as fake news. These sites are also plagued with automated bots which serve as catalysts for the dispersion of misinformation whilst also making it harder for researchers to study misinformation by exponentially increasing the volume of data generated. This thesis is a part of a larger effort by researchers to advance our understanding of the spread of misinformation and its characteristics. In this thesis I first outline an approach we used to build a massive fake news dataset that was rich enough to capture complex behavioural patterns. Next, I describe an approach that we used to build machine learning models to detect false information spreaders on Twitter and present an empirical validation of our models that yield accuracies of over 90%. Finally, I propose a pipeline to filter out bots from these datasets by building on existing state-of-the-art bot detection techniques. I also present a comprehensive analysis of the effects that these bots have on fake news spreader detection. I conclude that a bot filtration phase is essential in ensuring optimal performance of models in predicting likely spreaders.

en
Summa Cum Laude
College of Science and Engineering
Computer Science
Empirical Study of the Spread of Misinformation: A Big Data Approach
Thesis or Dissertation