Over the past few days Twitter announced that it is terminating its relationships with resellers of the main “fire hose” Twitter data stream. It’s termed a fire hose because of the sheer volume of tweets generated. There are something like 6,000 tweets generated per second, a lot of data by anyone’s standards. Twitter offers a variety of other ways of getting tweets through an “API”, an Application Programming Interface that enables software written in R, Python and many other languages to access Twitter data. However, this gives you only a sample of tweets, not a real-time stream of all tweets. You can also access Twitter from user accounts and usually find what you want, but by no means can you guarantee that you get all the tweets that you wanted on a subject. That’s fine though: these streams and accounts are free, and it is hard to complain about something that is free. You can read more about these facilities at https://dev.twitter.com/.
Some years ago a company called Gnip, since acquired by Twitter, was reselling fire hose access, along with Datasift (datasift.com) and NTT Data (nttdata.com), the latter providing Japanese language tweets only. These companies provided a framework for access to the Twitter fire hose for a fee. Now they can no longer resell the Twitter fire hose data. Datasift has many social media streams available, including information from Facebook, but losing Twitter access has to be a blow. A couple of weeks ago I decided to try to get access to the Twitter fire hose for a social media project I will be working on. I had used a free Gnip account some years ago when they first started and I now had access to the public Twitter API, so I wanted to see what the fire hose would cost. I thought it would be logical to ask Twitter, as they had acquired Gnip recently, and also to ask Datasift. I dutifully went through the sales contact process for Twitter and eventually had a call with what I thought was a sales person and his supervisor. I say “thought” because it proved very hard to get any sort of structured price out of them. The conversation with Twitter was very opaque, and I only received a vague idea of price. In the end I gave up asking. Oh well.
I decided to call Datasift next, and I will confess I was expecting something like the same treatment. Perhaps Twitter is very strict about who buys their data and the “Kafkaesque” lead qualification process was required by Twitter to get access to the fire hose. After all, it is Twitter’s data and they can do as they please. Happily, Datasift proved to be extremely efficient and I quickly got exactly the information I needed. I got a firm price for Twitter fire hose access, and I was told what other facilities I would need from Datasift and what they would cost. My sales person, James Johnson, was the epitome of helpfulness and professionalism. As it turned out the cost was too high for the current project, but within bounds if I were starting any sort of social media analysis business. Datasift does have a range of other social media sources such as Reddit, Tumblr etc. According to the CEO of Datasift, Nick Halstead, Datasift will soon be providing Facebook topic data. Datasift also has an easy-to-use sign-up process for accounts and it costs nothing to experiment a little with their data feeds. If you are interested in raw social media data I recommend Datasift.
I am truly disappointed that Twitter has seen fit to cut Datasift off from the fire hose. I’m looking forward to the Facebook topic data from Datasift. I am sure I will be told clearly the price and conditions of usage. As for Twitter, who knows? Maybe one day they will make a profit…