Sunday, August 07, 2011

About the @mention constellations

Update: find out how to make one of these.

So, about this @mention constellations stuff.

The FAQ:

  • What exactly am I looking at? The main visualization is a map of Twitter mentions on June 21st. Each dot is a Twitter account. Each arrow from one dot to another represents one account mentioning another. Despite the scale of the diagram, the underlying dataset is relatively tiny: less than 10 minutes of conversation.
  • Why do some accounts seem to mention themselves? Occasionally accounts do actually mention themselves.
  • Can I get the data? The data behind this particular visualization of June 21st? No. Data to make the same thing for another day? Sure, you can get more than enough to produce these things, for free, from the Twitter Streaming API.
  • What's the blob in the middle? Technically speaking it's the largest connected component of the mention graph. I just uploaded a detailed look inside it.
  • What software did you use to make this? Mainly Graphviz.
  • Sure, but how exactly did you make it? I took a sample of Tweets from Twitter's internal Hadoop cluster. I used a tiny Python script to extract the mentions. I loaded the data into a local MySQL instance. I queried MySQL for a sample of the mentions. I formatted the sample into dot using Perl, then laid out and rendered a PNG using Graphviz.
  • Is this your job at Twitter? No, this is a hobby project.
  • What's it like to work at Twitter? Very cool indeed. If you're interested, I wrote some stuff about my transition from Google to Twitter at http://isaa.ch/workthoughts.
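
For anyone who wants to try this on their own data, here is a rough sketch of the middle of that pipeline: pulling mentions out of tweet text, writing them out as dot, and finding the largest connected component (the "blob"). This is not the script I actually used. It skips Hadoop and MySQL, does the dot formatting in Python rather than Perl, and everything named here (`SAMPLE_TWEETS`, `extract_mentions`, `to_dot`, `largest_component`, the regular expression) is made up for illustration. The regex is a simplification and will also match things like email addresses.

```python
import re
from collections import defaultdict

# Simplified pattern for @mentions (an assumption, not Twitter's own rules).
MENTION_RE = re.compile(r"@(\w+)")

# Hypothetical sample input: (author, tweet text) pairs.
SAMPLE_TWEETS = [
    ("alice", "hi @bob and @carol"),
    ("bob", "@alice hello"),
    ("dave", "@erin lunch?"),
    ("frank", "talking to myself @frank"),
]


def extract_mentions(tweets):
    """Yield one (author, mentioned_account) pair per mention."""
    for author, text in tweets:
        for name in MENTION_RE.findall(text):
            yield author.lower(), name.lower()


def to_dot(edges):
    """Format the distinct edges as a Graphviz directed graph."""
    lines = ["digraph mentions {"]
    for src, dst in sorted(set(edges)):
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


def largest_component(edges):
    """Return the largest connected component, ignoring arrow direction."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    seen, best = set(), set()
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first walk over everything reachable from this account.
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nxt in neighbors[node]:
                if nxt not in component:
                    component.add(nxt)
                    stack.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


if __name__ == "__main__":
    edges = list(extract_mentions(SAMPLE_TWEETS))
    print(to_dot(edges))
    print(sorted(largest_component(edges)))
```

If you save the dot output to a file, say `mentions.dot`, one of the Graphviz layout programs can render it, for example `sfdp -Tpng mentions.dot -o mentions.png`. Which layout program suits your graph is something to experiment with. Note that the `frank` row produces a self-loop, which is how an account ends up appearing to mention itself.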