Using OAuth to Download a Twitter Sentiment Analysis Dataset

While working on my Twitter sentiment analysis project, I looked into the available sources for a dataset and found this one by Niek Sanders. It is a dataset of hand-classified tweets for training and testing sentiment analysis algorithms. The zip file includes a Python script, install.py, that downloads the tweets listed in the corpus.csv file. The corpus.csv file does not contain the tweets themselves, only their IDs. The tweets have to be downloaded through the API because the API agreement restricts sharing them.
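To give an idea of what install.py has to work with, here is a minimal Python 3 sketch of reading the tweet IDs out of the corpus file. I am assuming each row holds a topic, a sentiment label and a tweet ID, in that order; the function name and the sample rows are made up for illustration, so check them against the real corpus.csv.

```python
import csv
import io


def read_corpus_rows(csv_file):
    """Return (topic, sentiment, tweet_id) tuples from an open corpus file.

    Assumes three comma-separated, double-quoted columns per row.
    """
    rows = []
    for row in csv.reader(csv_file, delimiter=",", quotechar='"'):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        rows.append((row[0], row[1], row[2]))
    return rows


# Hypothetical sample rows, not taken from the real corpus.
sample = io.StringIO('"apple","positive","1111"\n"google","neutral","2222"\n')
entries = read_corpus_rows(sample)
tweet_ids = [tweet_id for _, _, tweet_id in entries]
```

With the real file you would pass `open("corpus.csv", newline="")` instead of the in-memory sample.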

This was a great place to start, but when I tried to download the tweets I got this error: {"errors": [{"message": "The Twitter REST API v1 is no longer active. Please migrate to API v1.1. https://dev.twitter.com/docs/api/1.1/overview.", "code": 68}]}. I noticed that many people are facing the same problem, and I could not find a single resource with all the details needed to get past it, so I decided to compile them here. Hope you find it useful.
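If you want your script to fail loudly on this instead of quietly saving the error text as a tweet, a small check like the following may help. It is only a sketch in Python 3: the function name is my own, and it looks for nothing more than the error code 68 shown in the response quoted above.

```python
import json


def is_v1_retired_error(response_text):
    """Return True if the response body carries the 'API v1 is no longer active' error."""
    try:
        payload = json.loads(response_text)
    except ValueError:
        return False  # not JSON at all
    if not isinstance(payload, dict):
        return False  # a JSON list or scalar cannot hold an "errors" key
    errors = payload.get("errors", [])
    if not isinstance(errors, list):
        return False
    return any(isinstance(e, dict) and e.get("code") == 68 for e in errors)


# The error body quoted in this post.
sample_response = (
    '{"errors": [{"message": "The Twitter REST API v1 is no longer active. '
    'Please migrate to API v1.1. https://dev.twitter.com/docs/api/1.1/overview.", '
    '"code": 68}]}'
)

if is_v1_retired_error(sample_response):
    print("Still calling API v1; switch the request to v1.1 with OAuth.")
```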

The error is caused by the change in API version: version 1.0 can no longer be used, and you need to migrate to version 1.1 to download the dataset mentioned above. The new version no longer serves tweets publicly; it requires OAuth (which Twitter uses to provide authorized access to its API) for accessing and downloading tweets. So install.py needs to be modified to use OAuth when it downloads the tweets.
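To make "use OAuth" a little more concrete, here is a rough Python 3 sketch of what an OAuth 1.0a signed request involves, written with the standard library only. This is not the library-based function I ended up modifying; it just shows how the Authorization header is put together with HMAC-SHA1. The endpoint URL, the function names and the placeholder credentials are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import secrets
import time
from urllib.parse import quote


def percent_encode(value):
    # OAuth 1.0a uses RFC 3986 encoding: only unreserved characters stay as-is.
    return quote(str(value), safe="-._~")


def build_auth_header(method, base_url, query_params, consumer_key,
                      consumer_secret, token, token_secret,
                      nonce=None, timestamp=None):
    """Return the value of an OAuth 1.0a Authorization header (HMAC-SHA1)."""
    oauth_params = {
        "oauth_consumer_key": consumer_key,
        "oauth_nonce": secrets.token_hex(16) if nonce is None else nonce,
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_timestamp": str(int(time.time()) if timestamp is None else timestamp),
        "oauth_token": token,
        "oauth_version": "1.0",
    }
    # The signature covers the OAuth parameters and the query parameters.
    all_params = {**query_params, **oauth_params}
    encoded = sorted((percent_encode(k), percent_encode(v))
                     for k, v in all_params.items())
    param_string = "&".join(f"{k}={v}" for k, v in encoded)
    base_string = "&".join([method.upper(),
                            percent_encode(base_url),
                            percent_encode(param_string)])
    signing_key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    digest = hmac.new(signing_key.encode(), base_string.encode(),
                      hashlib.sha1).digest()
    oauth_params["oauth_signature"] = base64.b64encode(digest).decode()
    return "OAuth " + ", ".join(
        f'{percent_encode(k)}="{percent_encode(v)}"'
        for k, v in sorted(oauth_params.items())
    )


# Assumed v1.1 endpoint for fetching one tweet by ID; credentials are placeholders.
url = "https://api.twitter.com/1.1/statuses/show.json"
header = build_auth_header("GET", url, {"id": "1111"},
                           "CONSUMER_KEY", "CONSUMER_SECRET",
                           "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
# The header would be sent as "Authorization" on a GET to url + "?id=1111".
```

In practice an OAuth library does all of this for you; the sketch is only meant to show why all four credentials are needed for every request.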

This is a good reference for making the API migration and using OAuth. It does not work as is, though, and some changes had to be made to get it working. For example,

body=post_body

had to be changed to

body=""

You first need to generate the consumer_key, consumer_secret, access_token_key and access_token_secret for the OAuth request function. You can do this by creating your application here, which gives you the consumer key and consumer secret. In the details tab of your application, you can then generate the access token and access token secret.
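Once you have the four values, it is probably wiser to keep them out of the script itself. Here is one way to do that in Python 3, reading them from environment variables. The variable names (TWITTER_CONSUMER_KEY and so on) and the function name are my own choice, not something Twitter or install.py defines.

```python
import os

# The four values the OAuth request function needs.
REQUIRED = ("consumer_key", "consumer_secret",
            "access_token_key", "access_token_secret")


def load_credentials(env=None):
    """Read the four OAuth values from environment-style variables.

    Looks for hypothetical names such as TWITTER_CONSUMER_KEY and raises
    ValueError listing whichever ones are missing or empty.
    """
    env = os.environ if env is None else env
    creds = {name: env.get("TWITTER_" + name.upper(), "") for name in REQUIRED}
    missing = [name for name, value in creds.items() if not value]
    if missing:
        raise ValueError("missing credentials: " + ", ".join(missing))
    return creds


# Example with placeholder values passed in directly instead of os.environ.
example_env = {
    "TWITTER_CONSUMER_KEY": "ck",
    "TWITTER_CONSUMER_SECRET": "cs",
    "TWITTER_ACCESS_TOKEN_KEY": "tk",
    "TWITTER_ACCESS_TOKEN_SECRET": "ts",
}
creds = load_credentials(example_env)
```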
