In this practical, we will apply sentiment analysis to Twitter data. First, we extract tweets by connecting to Twitter; then we assign a sentiment score to each tweet.
The following packages are required to do this practical.
- Python 3.6.0b1
- pip install tweepy
- pip install -U nltk
- pip install twython
Figure 1 shows the script; we will go through the code line by line.
- First, import the relevant packages: nltk and json. From tweepy, import Stream, OAuthHandler, StreamListener, and the other names the script uses.
- If you are running this for the first time, download the latest version of the VADER lexicon (for example with nltk.download('vader_lexicon')).
- To connect to Twitter, create a developer account and an app; this gives you the access keys. Insert these keys into the variables cAPIKey, cAPIsecret, aAPItoken, and aAPIsecret, as shown in figure 1.
- To authenticate, pass the keys to the OAuthHandler() function and store the result in a variable named authentication.
- Next, create the Twitter stream with the Stream() function, passing authentication and an instance of the read_data class as the input parameters.
- The read_data class receives the incoming Twitter data by building on StreamListener.
- Consider the on_data definition. It takes self and data as input parameters. The data parameter holds the full record for a tweet in JSON format, which contains many fields.
- Of these fields, we need only the one that holds the tweet text. That text is assigned to the tweet_text variable.
- The tweet text is then encoded as UTF-8, ignoring any characters that cause encoding errors. The result is stored in the text_data variable.
- Once the tweet text is available, we use SentimentIntensityAnalyzer() to compute sentiment scores for it.
- Then we print each tweet together with its sentiment scores.
- Here we limit the number of tweets to 3, so only 3 records are printed. The limit is enforced by the if statement: once n exceeds 3, the condition fails.
- We can filter the Twitter stream with the twitterStream.filter() function. Here we keep only tweets that match the keyword cricket.
- Then let’s execute the code. The output is shown in figure 2.
- Observe the output. It shows the negative score, neutral score, positive score, and compound score for each tweet. Also, note that the tweets contain the cricket keyword.
- If we want to load more tweets, say 10, just change the limit on n in the if statement, as shown in figure 3.
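
The per-tweet steps described above (parse the JSON, pull out the tweet text, encode it, score it, and stop after a limit) can be tried without a Twitter connection. The following is a minimal sketch, not the script from figure 1: the names handle_tweet, score_fn, state, and stand_in_scorer are hypothetical, and the stand-in scorer only mimics the shape of the dictionary that VADER returns rather than computing real sentiment.

```python
import json


def stand_in_scorer(text):
    # Hypothetical placeholder: returns a fixed, neutral result with the
    # same four keys that VADER reports (neg, neu, pos, compound).
    return {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": 0.0}


def handle_tweet(raw_json, score_fn, state, limit=3):
    """Process one raw stream message.

    Returns True while more tweets are wanted and False once `limit`
    tweets have been handled (an assumed stop convention).
    """
    tweet = json.loads(raw_json)              # the message arrives as JSON text
    tweet_text = tweet.get("text", "")        # keep only the tweet text field
    # Encode to UTF-8 bytes, dropping characters that cannot be encoded.
    text_data = tweet_text.encode("utf-8", errors="ignore")
    scores = score_fn(tweet_text)             # score the tweet text
    print(text_data, scores)
    state["n"] += 1                           # count the tweets handled so far
    return state["n"] < limit


if __name__ == "__main__":
    # Made-up sample messages standing in for live stream data.
    samples = [
        json.dumps({"text": "What a great cricket match!"}),
        json.dumps({"text": "Rain stopped the cricket again."}),
        json.dumps({"text": "Cricket practice at six."}),
        json.dumps({"text": "This one is past the limit."}),
    ]
    state = {"n": 0}
    for message in samples:
        if not handle_tweet(message, stand_in_scorer, state, limit=3):
            break
```

In a real run, score_fn could be SentimentIntensityAnalyzer().polarity_scores from nltk.sentiment.vader, and the body of handle_tweet would sit inside the listener's on_data method. Raising the limit argument from 3 to 10 corresponds to the change described for figure 3.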