Streaming Tweets from Twitter to Database




Streaming tweets from the Twitter API v1.1

First let's cover streaming tweets from Twitter. You're going to need a Twitter dev account. Sometimes Twitter uses dev.twitter.com to advertise various things they expect devs to be interested in. The problem is they sometimes make it hard to get to where you want to be. Just in case they've done this and the video's method isn't available, here's the link: Twitter Apps

Next, make a new application, filling in your name, description, website, agree to their terms, do the captcha, and create the application.

Once submitted successfully, you should be presented with a page where you can see your consumer key and consumer secret. Now you need an access token, so scroll down and click on "create my access token."

After a few moments, refresh, and you should be able to see the access key and access token. Once you have that, you're going to need to get Tweepy, which is a Python module for streaming Twitter tweets.

It is probably easiest to download and install Tweepy via pip if you're using a current version of Python.

pip install tweepy

You can also install using the setup.py method that I used in the video.

Once you have Tweepy, let's show a very basic example of its use:

from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener


#consumer key, consumer secret, access token, access secret.
ckey=""
csecret=""
atoken=""
asecret=""

class listener(StreamListener):

    def on_data(self, data):
        print(data)
        return(True)

    def on_error(self, status):
        print status

auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)

twitterStream = Stream(auth, listener())
twitterStream.filter(track=["car"])
		

Now, revisiting our code from the MySQL tutorial series, how might we insert this data into a database?

Code from MySQL with Python tutorials:

import MySQLdb
import time

#        replace mysql.server with "localhost" if you are running via your own server!
#                        server       MySQL username	MySQL pass  Database name.
conn = MySQLdb.connect("mysql.server","beginneraccount","cookies","beginneraccount$tutorial")

c = conn.cursor()

username='python'

tweet='man im so cool'

c.execute("INSERT INTO taula (time, username, tweet) VALUES (%s,%s,%s)",
            (time.time(), username, tweet))

conn.commit()


			

Looks like all we really need to do is connect these two scripts, so let's mesh them like so:

from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import MySQLdb
import time
import json

Above, we're adding the imports of MySQLdb and time. MySQLdb is the module we're going to use to interact with our MySQL database, and time is to get our current time. Finally, we're importing json. This module is part of your standard library, and you will not need to download/install it. We're going to be using the json module to interact with the returned json that the Twitter API throws at us.

#        replace mysql.server with "localhost" if you are running via your own server!
#                        server       MySQL username	MySQL pass  Database name.
conn = MySQLdb.connect("mysql.server","beginneraccount","cookies","beginneraccount$tutorial")

c = conn.cursor()


#consumer key, consumer secret, access token, access secret.
ckey=""
csecret=""
atoken=""
asecret=""

Here, we've meshed some of our main global variables. Added to the existing Twitter script is mainly the connection and the cursor definition.


class listener(StreamListener):

    def on_data(self, data):

        all_data = json.loads(data)
        tweet = all_data["text"]
        username = all_data["user"]["screen_name"]

        c.execute("INSERT INTO taula (time, username, tweet) VALUES (%s,%s,%s)",
            (time.time(), username, tweet))

        conn.commit()

        print((username,tweet))

        return True

Now we've written our listener class, with quite a few changes. We use json.loads to load the string data from the Twitter API data. Then, we identify the "tweet" as the "text" section of the json data. We then identify the username by visiting the all_data json, then navigate to the "user" part, and then use the "screen_name" identifier that Twitter gives to the user's name.

Once we have that data, we execute an INSERT query, inserting the time.time() value for time, the username, and the tweet data. Finally, we commit the data.

For our viewing pleasure, we are also printing out the username and tweet data, so that we can see it's working.

Next, we finish out the script, nothing major changed here:


    def on_error(self, status):
        print status

auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)

twitterStream = Stream(auth, listener())
twitterStream.filter(track=["car"])

	  

All together now:

from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import MySQLdb
import time
import json



#        replace mysql.server with "localhost" if you are running via your own server!
#                        server       MySQL username	MySQL pass  Database name.
conn = MySQLdb.connect("mysql.server","beginneraccount","cookies","beginneraccount$tutorial")

c = conn.cursor()


#consumer key, consumer secret, access token, access secret.
ckey=""
csecret=""
atoken=""
asecret=""

class listener(StreamListener):

    def on_data(self, data):
        all_data = json.loads(data)

        tweet = all_data["text"]

        username = all_data["user"]["screen_name"]

        c.execute("INSERT INTO taula (time, username, tweet) VALUES (%s,%s,%s)",
            (time.time(), username, tweet))

        conn.commit()

        print((username,tweet))

        return True

    def on_error(self, status):
        print status

auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)

twitterStream = Stream(auth, listener())
twitterStream.filter(track=["car"])

Want more tutorials? Head to the