Twitter Sentiment Analysis of Lagos State 2023 Gubernatorial Election Using BERT

Many cutting-edge language models have been used in the past to forecast election results. Sentiment analysis aids in opinion mining – a common experiment used to detect public opinions – on a given topic. Twitter has gained popularity and established itself as a crucial instrument for analyzing public opinion on elections and other trending issues. The unexpected but interesting results of recently held Nigeria's presidential


Introduction
Sentiment analysis models built on public political discourse have gained wide popularity in the field of political methodology (Roberts, 2018), the study of the quantitative and qualitative approaches used to understand politics and political systems (Achen, 2002). Several studies have presented methods that combine statistics, mathematics and formal theory to understand politics. In the past, election outcomes were typically anticipated using analytical and statistical prediction techniques that relied on surveys and qualitative methods, on the study of political party manifestos, and on trends in the mainstream media (Ascher, 1982; Remmer, 1993). Anticipating elections has become increasingly challenging, yet interesting, because of greater competition and opposition in government, especially in countries with multi-party systems like Nigeria (Cameron, Barrett, & Stewardson, 2016; Oluro & Bamigbose, 2021). Election forecasting has a long history and serves several purposes: helping politicians adapt their strategies and ideas, helping scientists improve their forecasting models, assisting the media in informing readers and viewers, and satisfying the curiosity of citizens, among others (Sanders, 2023).
Both before and after the social media revolution, surveys have frequently been used to identify opinions prior to elections; however, this approach is limited by the difficulty of constructing an appropriate sampling procedure, which makes it hard to obtain a representative sample of political viewpoints. Social media platforms such as Twitter, Facebook, Reddit and Instagram have helped, to some extent, to overcome the "systematically inappropriate" sampling procedure of survey studies and have become prominent tools of political campaigning and activism during elections. Recent advances in Natural Language Understanding (NLU) and Natural Language Processing (NLP) have improved the reliability of prediction models built for unstructured and unlabelled datasets. Social media has radically upended traditional campaign norms and tactics in national elections through its volume, velocity, scope and tactics of use (Allcott & Gentzkow, 2017; Hendricks & Schill, 2017). Studies also showed that, even if it cannot be said categorically that social media single-handedly elected Donald Trump, his social media campaign strategies changed the way social media will be used in future elections (Hendricks & Schill, 2017; Wani & Alone, 2014). Twitter users, specifically, commonly seek information to digest or disseminate. Twitter is a unique platform in that trends form easily around topics through hashtags, as opposed to the individual posts that are prominent on other social media outlets; it is therefore easier to track users' moods and directions regarding selected topics. Notably, in a study examining the potential of the Twittersphere, the authors found that, of all sampled categories, political hashtags, and therefore political discussions, were the most persistent on Twitter (Kwak, Lee, Park, & Moon, 2010).
In recent years there has been growing interest in the use of NLP and other artificial intelligence techniques to predict election results, driven by the large volume of social media data generated daily. These techniques include sentiment analysis and topic modeling, in addition to more advanced models that incorporate deep learning and fundamental statistical techniques. Sentiment analysis and opinion mining are two important aspects of NLP that help to classify and investigate the behavior and attitudes of social media users towards brands, events, companies, customer services and elections. Algorithms exist that automate the extraction of emotions from users' posts by processing unstructured text and building models that extract knowledge from it. In this study, we use Twitter, one of the prominent social networking sites, to analyze the upcoming 2023 Lagos State gubernatorial election. Twitter has 316 million monthly active users and, on average, about 500 million tweets are posted daily. A review of related works in a subsequent section shows that Twitter is one of the most powerful tools in political analysis. In today's fast-paced global network, information spreads digitally between users and shapes the way they feel about an event, which makes it crucial to understand the polarity of thought, emotions and sentiment of the masses.
This study was conducted to understand opinions on the 2023 Lagos State gubernatorial election using a Twitter dataset. It aims to capture, process and evaluate public opinion from three main perspectives: sentiment and timeline analysis of the personal accounts of the contesting candidates; sentiment and general tweet analysis of the public on the three candidates; and general tweet analysis of the public on the Lagos State 2023 elections without reference to any of the candidates. The study therefore concentrates on the following methodology:
i. Identifying keywords, hashtags and accounts which holistically explain the subject matter and capture the top three contestants for the Lagos State 2023 gubernatorial election.
ii. Scraping the tweets through the Twitter API using Python programming.
iii. Data cleaning (removing white spaces, links, punctuation, stop words and retweets, and tokenization).
iv. Developing, pretraining and tuning a BERT model and training it with the existing annotated IMDB dataset.
v. Using Natural Language Understanding techniques such as sentence polarity, topic modeling, entity extraction, word frequency, word clouds, etc. to understand the personal profiles of the three strongest Lagos gubernatorial candidates.
vi. Analyzing the results to predict the direction of the Lagos State election to be conducted on March 18, 2023.

Study Premise
Lagos State is the smallest of the thirty-six states in Nigeria (the Federal Capital Territory inclusive), but it has the highest urban population, comprising 27.4% of the national population. Lagos is Africa's leading New Partnership for Africa's Development (NEPAD) city and the world's sixth megacity, having attained megacity status in 1995. It is essentially a Yoruba environment, but with diverse cultures and traditions; it is inhabited by the Yorubas, Aworis and Ogus, together with other pioneer immigrant settlers (Edos, Saros, Brazilians, Kannike/Tapa), collectively called Lagosians but also referred to as the Ekos. Lagos earned its megacity title through its strong economic foundation, maritime location and socio-political significance, which led to a high rate of migration to the State.
Politically, Lagos State consists of 20 Local Government Areas (LGAs) and 37 Local Council Development Areas (LCDAs). The ruling political party in Lagos is the All Progressives Congress (APC), one of the three major parties in Nigeria alongside the People's Democratic Party (PDP) and the Labour Party (LP). Unlike in the past, when Presidential election results were dominated by the APC and PDP only, the Presidential election held on 25th February, 2023 was keenly contested by three political parties, with the winning distribution shown in Figure 1. This was correctly predicted by the authors in a previous study (Olabanjo et al., 2022). After the Independent National Electoral Commission (INEC), the election-conducting body, announced the results of the Presidential election, there were allegations of rigging, misconduct and misappropriation in the electoral process, improper conduct, and unacceptable result transmission from polling booths, amongst others, which electrified the Twitter space. This agitation, together with the fact that the APC, the winning party in the 2023 Presidential election, did not win Lagos, marks a major shift from the status quo in which the APC always won the Lagos State gubernatorial seat. It implies that the result of the forthcoming governorship election in Lagos State will be as interestingly dicey as that of the Presidential election; hence the justification for this study.

Related Works
Sentiment analysis has been used to predict the opinions of citizens on a US election using Twitter data (Wang, Can, Kazemzadeh, Bar, & Narayanan, 2012). The authors used 17,000 tweets to train a Naive Bayes model which, by categorizing the tweets as positive, negative, neutral or not-sure, was able to predict outcomes, although with less than 60% accuracy. This supported the analysis of real-time tweets, which provided invaluable insights into the public's perceptions of each candidate. Another study (Neogi, Garg, Mishra, & Dwivedi, 2021) scraped over 20,000 tweets and categorized them as favorable, negative or neutral in order to obtain international perspectives on a farmers' protest which took place in India. Both TF-IDF and Bag of Words were employed in the analysis; Bag of Words performed better than TF-IDF.
Other authors (Chandra & Saini, 2021; Somula, Dinesh Kumar, Aravindharamanan, & Govinda, 2020) also established that Twitter can be used as an election indicator and can predict, before the election is conducted, the people's favorite candidate to emerge as the winner. People expressed positive opinions about Donald Trump in almost all states of the United States prior to the election; 1,000,000 tweets were collected from users in different states and subjected to sentiment analysis. A study (Kausar, Soosaimanickam, Nasar, & Applications, 2021) used Twitter data to examine how different countries affected by the coronavirus coped with the situation. A total of 50,000 tweets posted in English were analyzed to capture the opinions and emotions of people concerning the pandemic in their respective countries. Studies have also combined different features and used ensemble models to improve the detection of tweet polarity for sentiment analysis, and established that unsupervised and ensemble machine learning models outperformed other classical machine learning models in detecting opinions from text (Bibi et al., 2022; Carvalho & Plastino, 2021). Authors have also used Twitter data to examine political homophily in the 2016 American Presidential Election, for which 4.6 million tweets were collected (Caetano, Lima, Santos, Marques-Neto, & applications, 2018).
In some studies, authors used the ratio of the positive message rate to the negative message rate in Twitter data to predict the likely winner of a forthcoming election, and showed that these opinions could be used to predict the candidate who would emerge as the winner (Yavari, Hassanpour, Rahimpour Cami, & Mahdavi, 2022). Twitter data has also been used to establish that social media is used not only to express opinions but also to share ideas and opinions with other users. Authors used 100,000 tweets to predict the 2009 German federal election, showing that Twitter could serve as a political indicator for the election (Tumasjan, Sprenger, Sandner, & Welpe, 2010), and another study (Schmidt et al., 2022) predicted the outcome of the 2021 German federal election using 58,000 tweets, establishing that traditional machine learning methods like Naive Bayes performed worse than transformer-based models like Bidirectional Encoder Representations from Transformers (BERT).
Authors (Razzaq, Qamar, & Bilal, 2014) analyzed social media sentiments about political parties to study and forecast Pakistan's general election. They used supervised machine learning algorithms to classify tweets as positive, negative or neutral. The findings of their experiment show that social media content can be a useful indicator of the political behavior of various parties. In another study (Singh, Sawhney, & Kahlon, 2017), 90,154 tweets were analyzed and the results were compared with the actual election results; the model predicted the winning party accurately. Twitter is indeed used as a platform for political deliberation. The political preferences shown in basic tweets come close to traditional election polls; put simply, the number of tweets collected per party reflects the votes it receives in the election. It was therefore concluded that Twitter can be considered a good indicator of real-time political sentiment (Tumasjan, Sprenger, Sandner, & Welpe, 2011).

Materials and Methods
This study analyzes tweets from the political discourse around the upcoming 2023 gubernatorial election in Lagos State and its prominent contesting candidates. This section describes in detail the approach used to discover people's sentiments around this area of public interest. Figure 2 shows the workflow of the BERT framework designed for this study. Every tweet with political content carries a neutral, positive or negative sentiment for or against a party or candidate. The sentiment contained in a tweet, especially when it is specific to a candidate, is not easy to compute algorithmically, since the expression of emotion varies with the personality, region and cultural background of each person. Because the dataset is unstructured (unlabelled), sentiment analysis is challenging owing to the peculiar features, context and semantics of each tweet. The study framework is divided into four phases: the data collection and preprocessing phase, the BERT design phase, the Natural Language Understanding Inference (NLUI) phase and the performance evaluation phase.

a. Data Collection and Preprocessing Phase:
This phase is concerned with the systematic extraction of relevant tweets concerning the Lagos State gubernatorial election from Twitter. Two categories of data were extracted using the Python snscrape library: the personal tweets of the three most prominent candidates in the Lagos election race, and the public tweets showing the interest of potential voters in each of these candidates. After scraping, only the attributes of interest were retained for the study. Preprocessing (Algorithm 1) was then carried out to remove noisy, irrelevant and inconsistent content from the dataset.

Algorithm 1. Preprocessing Tweets
Input: Twitter posts
Output: Clean, preprocessed data with no noise
For each comment in the Twitter data file, initialize a temporary empty string processedTweet to store the output, then:
1. Replace all URLs or https:// links with the word 'URL' using regular expression methods and store the result in processedTweet.
2. Replace all '@username' with the word 'AT_USER' and store the result in processedTweet.
3. Filter all #hashtags and RT from the comment and store the result in processedTweet.
4. Look for repetitions of two or more characters and replace them with the character itself. Store the result in processedTweet.
5. Filter all additional special characters (: n | [ ] ; : {} -+ ( ) < > ? ! @ # % *,) from the comment. Store the result in processedTweet.
6. Remove the word 'URL' which was introduced in step 1 and store the result in processedTweet.
7. Remove the word 'AT_USER' which was introduced in step 2 and store the result in processedTweet.
The output of this phase is a fully preprocessed dataset ready to be passed into the BERT model.
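The steps of Algorithm 1 can be sketched in Python with the standard `re` module. This is a minimal illustration rather than the code used in the study: the function name `preprocess_tweet`, the exact regular expressions, the decision to drop whole hashtags, and the reading of step 4 as "collapse runs of three or more identical letters" are all assumptions made for the example.

```python
import re

# Step 5 character set, transcribed from Algorithm 1 (assumed to be literal characters).
SPECIAL_CHARS = re.compile(r"[:|\[\];{}\-+()<>?!@#%*,]")


def preprocess_tweet(tweet: str) -> str:
    """Apply the seven steps of Algorithm 1 to one tweet (illustrative sketch)."""
    processed = tweet
    # Step 1: replace links with the placeholder 'URL'.
    processed = re.sub(r"https?://\S+|www\.\S+", "URL", processed)
    # Step 2: replace @username mentions with the placeholder 'AT_USER'.
    processed = re.sub(r"@\w+", "AT_USER", processed)
    # Step 3: filter the retweet marker and hashtags (assumption: drop the whole hashtag).
    processed = re.sub(r"\bRT\b", " ", processed)
    processed = re.sub(r"#\w+", " ", processed)
    # Step 4: collapse elongated letters, e.g. 'Sooooo' -> 'So'
    # (assumption: only runs of three or more identical letters are collapsed).
    processed = re.sub(r"([A-Za-z])\1{2,}", r"\1", processed)
    # Step 5: filter the remaining special characters.
    processed = SPECIAL_CHARS.sub(" ", processed)
    # Steps 6 and 7: remove the placeholders introduced in steps 1 and 2.
    processed = re.sub(r"\bURL\b", " ", processed)
    processed = re.sub(r"\bAT_USER\b", " ", processed)
    # Normalize the white space left behind by the removals.
    return " ".join(processed.split())
```

For instance, `preprocess_tweet("RT @GRVLagos: Sooooo happy!!! #LagosDecides https://t.co/abc123")` reduces the tweet to its plain-text content. Replacing links and mentions with placeholders first, and deleting the placeholders only at the end, keeps the special-character filter in step 5 from leaving fragments of URLs or usernames in the text.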

b. BERT Design Phase:
This phase involves the tokenization of the tweets, stemming and lemmatization, and the finetuning and evaluation of BERT. Tokenization is the process of separating a corpus into its smallest syntactically meaningful units. It is a basic stage in text data modeling that helps in understanding the meaning of a text by looking at the order in which the words are used. Porter stemmers were used to remove suffixes and return each word to its base form (Jivani, 2011). The result is then passed into the BERT pipeline (Figure 3) for sentiment analysis. The Bidirectional Encoder Representations from Transformers (BERT) model was developed by Google AI Language and has been applied to a wide range of NLP tasks (Devlin et al., 2018; Ravichandiran, 2021). Its main strength is that it applies bidirectional training of the transformer, a popular attention model, to language modeling. Earlier similar models looked at text sequences either from left to right or from a combination of left-to-right and right-to-left training. BERT models have demonstrated that bidirectionally trained language models can comprehend the context and flow of language more deeply than single-direction language models (Lee & Hsiang, 2020). A Linear Support Vector Classifier was used to test the accuracy of our deep models. The Support Vector Machine (SVM) is one of the most robust prediction techniques and is based on a statistical learning framework (Noble, 2006). Its workings and model development strategies have been explained broadly in existing works (Pisner & Schnyer, 2020; Suthaharan, 2016; Widodo & Yang, 2007). The BERT pipeline shown in Figure 3 has two main phases, pre-training and finetuning: the first is concerned with pretraining on unlabeled data, while the second deals with modifying the learning parameters of the model. Finetuning is repeated until optimum performance is achieved.
BERT models use a multi-layer bidirectional transformer encoder; the configuration described here has 6 layers in the encoder (the published BERT-base configuration has 12). Each word entered into the attention layer of the model is converted into Query (Q), Key (K) and Value (V) vectors, representing respectively the actual word, the keyword or meaning of the word, and the value of the intent, purpose or polarity of the word (Di Caro & Grella, 2013; Fan & Chang, 2010). The finetuning and evaluation process deals with the selection of hyperparameters and experimentally compares the results of training and testing. The major hyperparameters which require tuning in a BERT model are the learning rate, the number of epochs and the batch size.
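To make the role of Q, K and V concrete, the following is a minimal sketch of single-head scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, the operation at the core of each transformer encoder layer. It is written in plain Python for readability; the function names and toy vectors are illustrative assumptions, and the learned projection matrices that produce Q, K and V in a real BERT model are omitted.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Return one context vector per query row: softmax(q k^T / sqrt(d_k)) v."""
    d_k = len(k[0])
    output = []
    for query in q:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(query, key)) / math.sqrt(d_k)
            for key in k
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        context = [
            sum(w * value[j] for w, value in zip(weights, v))
            for j in range(len(v[0]))
        ]
        output.append(context)
    return output
```

When two keys are identical, the query attends to both equally and the output is the average of their value vectors; this is the sense in which each token's representation becomes a context-dependent mixture of the other tokens in the tweet.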
c. Natural Language Understanding Phase:
Three tasks are embedded in this phase to aid our understanding of the dataset, especially as it relates to the three contesting candidates under study. Time-series frequency analysis was used to identify the tweet pattern and the growth of voters' interest in the candidates. Topic modeling is a technique for the unsupervised categorization of Twitter documents which helps to identify natural groups of words even when we are not certain what the outcome will be. In this study, Latent Dirichlet Allocation (LDA), a particularly popular algorithm for this task, was used (Wei & Croft, 2006; Yu & Yang, 2001). A $k$-dimensional Dirichlet random variable $\theta$ takes values in the $(k-1)$-simplex, that is, a $k$-vector $\theta$ lies in the $(k-1)$-simplex if $\theta_i \geq 0$ and $\sum_{i=1}^{k} \theta_i = 1$, and has the probability density function on this simplex given in Equation (1), where $\alpha$ is a $k$-vector with components $\alpha_i > 0$ and $\Gamma(x)$ is the Gamma function.

$$p(\theta \mid \alpha) = \frac{\Gamma\left(\sum_{i=1}^{k} \alpha_i\right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \, \theta_1^{\alpha_1 - 1} \cdots \theta_k^{\alpha_k - 1} \qquad (1)$$
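As a small numerical check on the Dirichlet density referred to as Equation (1), the sketch below evaluates the standard form, Gamma(sum of alpha_i) divided by the product of Gamma(alpha_i), times the product of theta_i^(alpha_i - 1), in log space using `math.lgamma`. The function names are hypothetical and the code is only an illustration of the formula, not part of the study's LDA implementation; for simplicity it evaluates interior points of the simplex only.

```python
import math
from typing import Sequence


def dirichlet_log_pdf(theta: Sequence[float], alpha: Sequence[float]) -> float:
    """Log-density of a Dirichlet(alpha) distribution at an interior simplex point."""
    if len(theta) != len(alpha):
        raise ValueError("theta and alpha must have the same length")
    if any(a <= 0 for a in alpha):
        raise ValueError("every alpha_i must be positive")
    if any(t <= 0 for t in theta) or abs(sum(theta) - 1.0) > 1e-9:
        raise ValueError("theta must have positive components that sum to 1")
    # Log of the normalizing constant Gamma(sum alpha) / prod Gamma(alpha_i).
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    # Log of the kernel prod theta_i^(alpha_i - 1).
    log_kernel = sum((a - 1.0) * math.log(t) for a, t in zip(alpha, theta))
    return log_norm + log_kernel


def dirichlet_pdf(theta: Sequence[float], alpha: Sequence[float]) -> float:
    """Density obtained by exponentiating the log-density."""
    return math.exp(dirichlet_log_pdf(theta, alpha))
```

With all alpha_i equal to 1 the density is flat over the simplex, so every topic mixture is equally likely a priori; values of alpha_i below 1 concentrate mass near the corners, which corresponds to documents dominated by a few topics.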

Results
In this section, we present and discuss the results obtained in this study with their implications for the set goals. Results are recorded in terms of the dataset, processing, parameter tuning, inferences and performance evaluation of the BERT sentiment model.

a. Dataset
Our dataset covers the period from immediately after the announcement of the Presidential election results (March 1, 2023) until two days before the gubernatorial election. Twitter users showed keen interest in the Lagos State election, hence the large number of tweets scraped for this study. Figure 4 shows the size of the dataset as it passed through each preprocessing task. Figure 3 is a representation of the change in the size of the dataset during each preprocessing task. The raw dataset was reduced from 302MB to 201MB after the preprocessing tasks were completed. We then examined the personal Twitter accounts (@jidesanwoolu, @GRVLagos and @officialjandor) of the three Lagos gubernatorial election candidates of interest from March 1, 2023, the day attention shifted from the national election to the state election, to March 17, 2023 (the eve of the state election). The first dataset contained the personal tweets of the candidates; Table 4 shows their tweet summarization, tweet frequency and the impressions made by their personal tweets. The percentage total impression is given in Figure 6. Figure 5 shows the tweet counts of the respective candidates per day for the seventeen-day study period. The results also show that Jandor has the lowest percentage of total impressions (near zero), tweets and followers. Our social network analysis of each of the three candidates for the period covered in this study produced Figure 7. It was used to understand the Twitter activities of the verified friends of the candidates in terms of retweets and mentions. It shows that Sanwoolu maintained the strongest and most active verified network of Twitter friends relative to GRVLagos and Jandor. Jandor, on the other hand, has many friends who are dormant and do not engage with his posts.
Sanwoolu has prominently active verified friends, including the President-elect, Bola Ahmed Tinubu, and Babatunde Raji Fashola (Minister for Works and Housing of Nigeria), among others. The BERT model was applied to the public tweets about the three candidates to determine the sentiments of the public concerning them. Tweets were categorized as Positive, Neutral or Negative, represented by 1, 0 and -1 respectively. Table 2 presents the public tweets as scraped from Twitter. It shows that GRVLagos of the Labour Party amassed the most public tweets and impressions. Figure 9 shows the sentiments of the tweets concerning the candidates as determined by the pretrained BERT model. This figure also shows that Jandor made the fewest total public impressions so far in this gubernatorial election contest, while GRVLagos has made the greatest impression on Twitter, with the highest number of tweets and the highest overall positive public sentiment as adjudged by our BERT model. The tuned parameters yield different results across the experimental stages of parameter tuning with different learning rates (LR). The precision, recall and f1-measure obtained as the learning rate varies are shown in Figure 9, which also shows that a learning rate of 1e-7 gave the best performance. This indicates that the smaller the learning rate, the higher the accuracy, and the larger the number of epochs, the higher the accuracy.
Figure 9. Results of the parameter tuning per epoch.
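For readers who want to reproduce the evaluation metrics, the sketch below shows one common way to compute per-class precision, recall and f1-measure for the three sentiment labels (1, 0 and -1) and their macro average. The function names are hypothetical, and whether the study reported macro, micro or weighted averages is not stated, so the macro average here is an assumption.

```python
from typing import Dict, Sequence, Tuple

LABELS = (-1, 0, 1)  # Negative, Neutral, Positive

Scores = Tuple[float, float, float]  # (precision, recall, f1)


def per_class_scores(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[int, Scores]:
    """Precision, recall and f1 for each sentiment label, one-vs-rest."""
    scores: Dict[int, Scores] = {}
    for label in LABELS:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores: Dict[int, Scores]) -> Scores:
    """Unweighted mean of each metric across the labels."""
    n = len(scores)
    precision = sum(s[0] for s in scores.values()) / n
    recall = sum(s[1] for s in scores.values()) / n
    f1 = sum(s[2] for s in scores.values()) / n
    return (precision, recall, f1)
```

Calling `per_class_scores(y_true, y_pred)` on the held-out labels and predictions from each learning-rate setting, then comparing the macro averages, gives the kind of per-setting comparison summarized in the parameter tuning figure.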

Discussion
In this study, we proposed a modified BERT model for understanding the discourse around the Lagos State 2023 gubernatorial election. Lagos State is a megacity with the highest population and the highest internally generated revenue when compared with the other 35 states in Nigeria. Agitations arising from the results of the just-concluded Presidential election have brought unprecedented attention to the State election. In order to fully understand the Twitter crowd behind each of the three prominent candidates, personal and public tweets were examined from March 1, 2023 to March 17, 2023. The BERT model was used to establish the sentiments of the public tweets. Our experiments confirmed that:
i. Sanwoolu, being an active government official, maintained the highest number of followers.
ii. The dominant discourse in the negative Sanwoolu tweets is the popular EndSARS protest, which allegedly took the lives of peaceful protesters in Lagos.
iii. The dominant phrase in the negative GRVLagos tweets is "no man's land", a discussion of whether he is actually a son of the soil or qualified to rule Lagos given his maternal Eastern Nigerian descent.
iv. Sanwoolu, however, maintained the strongest social media network, with many in-power personnel in his active circle.
v. GRVLagos has the highest Twitter impressions and the highest positive impressions of the three candidates.
vi. Jandor does not pull the Twitter crowd, making the election a potential two-horse race between Sanwoolu and GRVLagos.
The proposed model, which gave a relatively high accuracy compared with non-deep-learning models, was developed to understand the forthcoming elections in Lagos State but can be adapted to any election scenario.

Conclusion
This study shows that sentiment analysis research can significantly inform elections by shedding light on voter sentiment and public opinion. Sentiment analysis examines social media messages and posts, news articles and other sources of data to ascertain the sentiment or emotion underlying them, using Natural Language Processing and machine learning techniques. Such analysis can assist political parties and candidates in understanding how voters feel about them and their policies during an election. By examining social media posts and news articles, sentiment analysis studies can uncover trends in how people react to political events and speeches. Political campaigns can use this information to modify their strategies and tactics as necessary. For instance, if sentiment analysis reveals that people disapprove of a specific policy proposal, a campaign may change or drop that proposal to win more votes. Furthermore, campaigns can use sentiment analysis to pinpoint the major topics that resonate with voters and adjust their messaging to highlight those issues. Overall, sentiment analysis can provide valuable insights into public opinion and the mood of the electorate, helping political campaigns make more informed decisions and adjust their strategies to appeal better to voters.
The BERT model has also proven to be an effective sentiment model. BERT is powerful because, instead of only predicting the subsequent word in a sequence, it accounts for all words in the sequence and so arrives at a deeper understanding of the context; its potency has been demonstrated widely in the literature.