Friday, December 30, 2011

Electoral reform, epilogue

After my last post people kept sending in more things, and I keep discovering more from following the leads they give me. The topic also has a deep academic history, which I should have guessed. I thought I'd wrap up here with a quick summary before moving on.

A friend Matt pointed me at a 2005 New York Times article entitled "Why Vote?". It's fascinating and worth a read, but the bit about "your chances of winning a lottery and of affecting an election are pretty similar" is demonstrably false, as shown in the last post: fact is that you're trillions upon trillions of times more likely to win the lottery than to affect an election.

As mentioned (and misattributed) in the article, Anthony Downs, the pre-eminent economist and political scientist, in his 1957 work "An Economic Theory of Democracy", concluded that "a rational individual should abstain from voting". This, "Downs's Paradox", can be stated as:

In voting, compute the benefits (B) of having one's candidate win and weight them by the infinitesimal probability (P) that one's vote will be decisive. Then, since voting is costly (takes time, mainly), calculate the overall reward (R), proportional to the probability of actually turning out, as R = P×B - C. Because P is so small, this will be negative for almost any positive C. Thus, no rational individual should vote.
And yet supposedly rational people do in fact vote—hence the paradox.

Some attempts to resolve this issue have introduced a new term, D, to the equation, representing the reward one gets from expressing oneself, participating generally in the democratic process, or fulfilling an endogenous sense of civic duty. Then the reward, and with it the probability of turning out, becomes R = P×B - C + D, which may be positive even for minuscule P, explaining the empirically observed turnout.
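For the numerically minded, here is a minimal Python sketch of that calculus. The symbols P, B, C and D come straight from the text; the function name and the example figures are made up purely for illustration.

```python
def voting_reward(p: float, b: float, c: float, d: float = 0.0) -> float:
    """Reward from voting, R = P*B - C + D, using the symbols from the text.

    p: probability that one's vote is decisive
    b: benefit of having one's candidate win
    c: cost of voting
    d: reward from expression or civic duty (0 in the original paradox)
    """
    return p * b - c + d


# Hypothetical numbers, chosen only to show the sign of R flipping.
without_duty = voting_reward(p=1e-9, b=10_000, c=1.0)
with_duty = voting_reward(p=1e-9, b=10_000, c=1.0, d=2.0)
print(without_duty < 0, with_duty > 0)
```

With a tiny P the P×B term all but vanishes, so the sign of R is settled almost entirely by whether D outweighs C.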

This leads us to the research on the Swiss system which is referenced by the Times article. In that paper Patricia Funk postulates another factor contributing to the term D: that of exogenous social pressure to vote. From the paper,

The key innovation of this paper is to use a natural experiment, which allows me to shed light on this particular motivator to vote: the introduction of optional mail voting in Switzerland.

The intuition behind this experiment lies in the opposite effects, postal voting (or other modern voting tools such as internet voting) have on economic and social incentives to vote. Concerning the first, the obvious effect is a reduction in voting costs, with a positive effect on turnout. Secondly, mail-in or internet voting renders the voting act unobservable. If social pressure matters for voting decisions, the presence of mail-in ballots provides an opportunity to escape. Therefore, the more social concerns matter for voting decisions, the more distinctive the mail ballot system’s trade-off between cost reduction and a reduction in social incentives.

While previous voting models cannot easily account for a negative turnout effect of mail-in or internet voting alternatives, a positive turnout effect is consistent with both traditional voting models and with those that include a concern for social motives. The sharpest test for social pressure arises from looking at the effect of postal voting in different-sized communities. A large number of anthropological studies have documented that social control is particularly strong in small and close-knit communities. People know each other and gossip about who does their civic duty and who does not. Therefore, the relief from social pressure is supposedly the highest in small communities and ceteris paribus, also this negative "social effect" on turnout.

What the study found was that the reduction in voting cost afforded by the opportunity to vote by post didn't result in any statistically significant increase in turnout. In fact,

Turnout declined up to 7 percentage points in the [administrative division] with the highest share (i.e. 36%) of citizens living in small communities. A replication of the same procedure with community-level data confirms that the turnout decrease was particularly a “small-community”-phenomenon.

That is to say, once the Swiss were no longer obligated by the social pressure in close-knit communities to show their faces at the polling places, they didn't.


One final note from my friend Andrew, who pointed me at the Asimov short story "Franchise" in which

the computer Multivac selects a single person to answer a number of questions. Multivac will then use the answers and other data to determine what the results of an election would be, avoiding the need for an actual election to be held.

I couldn't find an ebook so I ordered the paperback. Sounds like a good read.

Update 9.37pm December 30th: I couldn't leave it alone. Two final points, and then changing topic:
  1. Read Hannah's comment on the original post. It brings in Tocqueville, about whom I now feel I should be more educated and you might too.
  2. I also wanted to add a super thought experiment which Robin threw in: what if government representatives themselves were simply chosen at random from the electorate? "It's super-duper jury service. Which, as an analogy, I realize, does not exactly fill a person with optimism," he writes but of course you've got to think past the implementation difficulties. Personally I'm pretty sure it'd be utter chaos in the short term but in the long term I figure the infrastructure surrounding these hapless "politicians" would adapt to the new regime. Probably they'd make it like the last.

Thursday, December 29, 2011

Electoral reform, redux

I had a great set of responses to my recent post about electoral reform. I got emails, Tweets, blog comments, and comments on Google+. Many interesting thoughts and articles, which I thought I'd respond to in this follow-up.

Some folks argued with the decision not to vote. How could that be a good idea? What if everybody did that? This one's easy. Firstly, that'd be great. Secondly, it's irrelevant. Consider: I might do my bit about overpopulation by deciding not to have kids. Sure, if everybody did that then the consequences would be catastrophic but that doesn't disqualify it from being a rational individual choice. Indeed, many people do make that choice with net positive effect.

Nick provided an interesting post on the "fallacy of the deciding vote". It's a well reasoned piece, but premise (2) renders it inapplicable: the argument at hand isn't whether my individual vote unilaterally decides the result (which obviously it doesn't, as the article points out, in all but the most degenerate cases). It's about the probability that adding or subtracting my individual vote has an effect on the outcome. I don't need my vote to be the single deciding vote by any means.

So that we're clear, before we move on let's call a given individual voter democratically impotent in an election if the election has an identical outcome with or without their vote. Conversely the voter is democratically potent if removing their vote changes the outcome.

Dominic commented on the post presenting a syllogism:

Premise: Your vote is the same as everyone else's
Premise: Your vote literally counts for nothing
Conclusion: Everybody's vote counts for nothing

...and yet, we have a result. I suggest that premise 2 is faulty
and he's right. My hyperbolic "I literally count for nothing" should really have been "in the span of my lifetime, with overwhelming probability, I'm democratically impotent in every election in which I'm eligible to vote". There's a mathematical distinction but barely a practical one.

Interestingly, as you remove more and more voters from the electorate, the chances of any remaining voter being democratically potent increase significantly. Thomas Pogge, Yale's Professor of Philosophy and International Affairs, showed in a post last year that in a small population of 63 voters, with each voter choosing independently and with equal probability between two electoral candidates, a given voter has a full 10% chance of casting a deciding vote.

I like those odds, but in large populations things look much bleaker.

In the simple equiprobabilistic model which Pogge presents, in an electorate of 100,001 people the probability of a given voter casting a deciding vote is about 1 in 400. Not bad at all! Assume instead, though, a slight general preference in the populace for one candidate over the other, let's say 49% to 51%, and suddenly your 1 in 400 chance of affecting the outcome in Pogge's model sinks to 1 in 193 billion.

Scale up from there to a population of just a million, and you as an individual voter are democratically impotent with a novemvigintillion to 1 probability¹. If you voted in an election of this kind every second for as long as you lived (or even, hey, every nanosecond for the entire lifetime of the universe), chances are—by an inconceivable margin—that you'd not affect the outcome of even a single one.
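If you'd like to check these figures yourself, here is a rough Python sketch of the model as I read it: your vote is decisive exactly when everybody else splits evenly, which is a binomial probability. The function names are my own, and I'm assuming "a million" means a million voters other than you. It works in logarithms because the raw numbers underflow quickly.

```python
from math import exp, lgamma, log


def log_tie_probability(other_voters: int, p: float = 0.5) -> float:
    """Natural log of the chance that the other voters split exactly evenly.

    Each voter independently picks candidate A with probability p.
    """
    if other_voters % 2 != 0:
        raise ValueError("an exact tie needs an even number of other voters")
    half = other_voters // 2
    # Log of the binomial coefficient C(other_voters, half), via log-gamma.
    log_ways = lgamma(other_voters + 1) - 2 * lgamma(half + 1)
    return log_ways + half * (log(p) + log(1.0 - p))


def one_in(other_voters: int, p: float = 0.5) -> float:
    """Odds of casting the deciding vote, expressed as '1 in N'."""
    return exp(-log_tie_probability(other_voters, p))


print(one_in(62))               # 63 voters, even split
print(one_in(100_000))          # 100,001 voters, even split
print(one_in(100_000, 0.49))    # 100,001 voters, 49%/51% preference
print(one_in(1_000_000, 0.49))  # a million other voters, 49%/51%
```

The second argument is the only thing that changes between the friendly odds and the hopeless ones, so it is worth nudging it a little either side of 0.5 to see how sensitive the model is.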

Little wonder, then, that cynicism about democracy comes so easy. Richard passed along a link to a passionate soi-disant rant including:

I am being sadly sincere when I describe [democracy] as a system which is much better at giving the feeling of participation than actual participation. To me, this is one of the terrible things about democracy (and part of why it is so successful) - because voting lets people feel like they can influence things. Even if they don't vote, they feel like they could have voted.

But any one vote never matters...
…as indeed is demonstrated above.

So we come back to that idea again of selecting the election winner by picking a single ballot at random and going with that. Now one's chances of democratic potency in an electorate of a million people voting between two candidates with a 49%/51% baseline preference is simply a million to one. Still sounds like long odds but it's a trillion trillion trillion trillion trillion trillion trillion times better than before. In this system people are thus vastly more individually empowered, the tyranny of the majority is ameliorated, and the end result remains proportionally representative, albeit with added statistical noise. It's a win-win-win!
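To make the "proportionally representative, with added statistical noise" claim concrete, here is a small simulation sketch in Python. The 49/51 electorate, the function names and the fixed seed are all illustrative assumptions rather than anything from the original proposal.

```python
import random
from collections import Counter
from typing import Dict, List


def random_ballot_winner(ballots: List[str], rng: random.Random) -> str:
    """Pick one ballot at random; whoever it names wins the election."""
    return rng.choice(ballots)


def win_shares(ballots: List[str], elections: int, seed: int = 0) -> Dict[str, float]:
    """Re-run the same election many times and report each candidate's win rate."""
    rng = random.Random(seed)
    wins = Counter(random_ballot_winner(ballots, rng) for _ in range(elections))
    return {candidate: count / elections for candidate, count in wins.items()}


# A hypothetical electorate with a 49%/51% split between candidates A and B.
ballots = ["A"] * 49 + ["B"] * 51
shares = win_shares(ballots, elections=10_000)
print(shares)
```

Over many elections each candidate's share of wins should hover around their share of the vote, which is the sense in which the random ballot stays representative.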

At the end of the day, though, as James noted, whoever wins an election it's always a politician. A solution to that problem is left as an exercise for the reader.

Next in this series: Electoral reform, epilogue

¹ a novemvigintillion is a billion billion billion billion billion billion billion billion billion billion; about ten billion times the number of atoms in the universe.

Tuesday, December 27, 2011

Christmas 2011

A brief interlude to regular programming: I thought I'd blog a few pictures from a Christmas Day in San Francisco. I already put a bunch from earlier in the season on Flickr, as well as a set from Christmas Eve, but here are some from December 25 itself.

Probably for the last Christmas in a while, we got up late in the morning: Lux slept an incredible 15 hours and didn't wake up until 10.30am. By that time I'd already had a lie-in and a good chat with my mom at home in the UK—both generous presents themselves, and altogether a lovely start to the day. When Lux did wake we had a family breakfast

Blueberries for breakfast
and opened some presents.

It was a beautiful day, clear and bright and dry, so we headed up Bernal Hill for a walk. I feel very lucky to live in this climate, on this hill, in this place

Bernal Christmas
with my favorite people
Bean and rocks

Seriously. The most precious companions (and one more family member on the way) as well as this view ten minutes' walk from your house on Christmas Day?

San Francisco on Christmas
Truly I'm blessed.

Tuesday, December 20, 2011

Electoral reform

I don't vote. I've never voted. It's a long story.

If, though, at the end of my life you count the number of elections in which I was eligible to vote, whose ultimate outcome was decided by a single vote, and therefore could have been affected by mine, I bet you count zero. Add my vote, take my vote, no change in the result in any election. I literally count for nothing.

My vote would count for more in the following system I first saw described nearly ten years ago:

Instead of counting the ballot papers, and declaring the winner to be whoever got most votes, the ballot papers would be put into a tombola, thoroughly mixed and one ballot paper taken out. The winner would be whoever was voted for on that ballot paper.

To encourage a high turnout, only the winning candidate would retain his deposit. The remaining deposits would be given, as a prize, to whomever cast the vote which was taken out of the tombola.

In this arrangement I'm actually more likely as an individual to have my own vote affect the outcome than in the current system. It doesn't seem any less democratic: as the original post says it's just a statistically noisy form of proportional representation.

I note the following advantages over the "first past the post" system:

The future of politics is right here! What's not to love?

Now read Electoral reform, redux, for more on the math and philosophy of this proposal—and why not to vote in the current system.

Lasting value

As I said, my dad's a teacher. And a few days ago @pichipsandgravy tweeted at me.

Now it turns out that yes, when I was growing up my dad taught kids in elementary and middle schools in the north of England. And to my utter shame at the time the family car was indeed a yellow Citroën 2CV¹.

So I wrote

and then over the course of a few more Tweets (from me in San Francisco to @pichipsandgravy on an oil rig in the North Sea) established that yes, my dad was his teacher at school about 30 years ago. He'd learned guitar from my dad, and still plays. He remembered my father being a particularly special teacher; was hoping to get back in touch to say thanks.

My wife Wendy is a teacher too, and (on a smaller timescale) gets the same thing: kids of classes past making touching personal gestures of appreciation; being a part of the neighborhood's social history; local families recognizing her on the street and excited to see her.

It makes you think! I wrote my dad "I can guarantee you that nobody's going to be writing to me in 30 years being appreciative of my work and how it's affected their life!" and I believe it.

¹ which was desperately uncool until the same color and model featured in the Bond film For Your Eyes Only in 1981.

Monday, December 19, 2011

On conformity

My dad's a teacher. In his study when I was growing up he had a poem on a sheet of paper. I've always loved it.

He always wanted to explain things
But no-one cared
So he drew
Sometimes he would draw and it wasn't anything
He wanted to carve it in stone
Or write it in the sky
He would lie out on the grass
And look up at the sky
And it would be only the sky and him that needed saying
And it was after that
He drew the picture
It was a beautiful picture
He kept it under his pillow
And would let no one see it
And he would look at it every night
And think about it
And when it was dark
And his eyes were closed
He could still see it
And it was all of him
And he loved it
When he started school he brought it with him
Not to show anyone but just to have it with him
Like a friend
It was funny about school
He sat in a square brown desk
Like all the other square brown desks
And he thought it should be red
And his room was a square brown room
Like all the other rooms
And it was tight and close
And stiff
He hated to hold the pencil and chalk
With his arms stiff and his feet flat on the floor
With the teacher watching
And watching
The teacher came and smiled at him
She told him to wear a tie
Like all the other boys
He said he didn't like them
And she said it didn't matter
After that they drew
And he drew all yellow
And it was the way he felt about morning
And it was beautiful
The teacher came and smiled at him
"What's this?" she said
"Why don't you draw something like Ken's drawing?"
"Isn't that beautiful?"
After that his mother bought him a tie
And he always drew airplanes and rocket ships
Like everyone else
And he threw the old picture away
And when he lay out alone and looked out at the sky
It was big and blue and all of everything
But he wasn't anymore
He was square inside and brown
And his hands were stiff
And he was like everyone else
And the things inside him that needed saying
Didn't need it anymore
It had stopped pushing
It was crushed
Like everything else.

It's beautiful, tragic and moving, but what really takes your breath away is the coda: the author, a high school senior, committed suicide two weeks after submitting it as an English assignment.

Sadly, the actual story behind the work is unclear and likely lost to history. Still, it gives you pause.

Wednesday, December 07, 2011

The Littlest Twitter Term Counter

A basic need of many folks I work with is to count hashtags or terms on Twitter in real time.

The below isn't a complete solution by any means but it does the job in the simplest sunny-day case. If you, or your technical team, are looking for the most basic starting point then this is it:

curl -d 'track=TERM' -uUSERNAME:PASSWORD -s -o - \ \
| perl -e '$|++; while (<>) {m/\{/ and ++$i and print qq{\r$i}}'

Fill in the values in CAPS (use your Twitter credentials) and you're all set.
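If Perl isn't your thing, the counting half of that pipeline can be sketched in Python. This is only an illustration of the same idea (count every line that looks like it contains a JSON object); the names `running_count` and `main` are mine, and it assumes the stream delivers one status per line.

```python
import sys
from typing import Iterable, Iterator


def running_count(lines: Iterable[str]) -> Iterator[int]:
    """Yield a running total of the lines that contain an opening brace."""
    count = 0
    for line in lines:
        if "{" in line:  # crude check, same as the Perl regex above
            count += 1
            yield count


def main() -> None:
    """Read the stream from standard input and redraw the counter in place."""
    for total in running_count(sys.stdin):
        print(f"\r{total}", end="", flush=True)
```

To use it, add a call to `main()` at the bottom of the script and pipe the curl output into it in place of the Perl command.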

Tuesday, November 29, 2011

Google Multiple Accounts

Right on the heels of the sublime, here's the ridiculous: Google's support for multiple accounts.

First of all, though, don't get me wrong; I worked at Google for years. I left voluntarily, and still consider myself extremely fortunate to have worked there. I'm very lucky to count many current Googlers amongst my friends, and by no means am I irrationally antagonistic towards Google.

This multiple login madness has driven me over the edge, though. It's been on Quora, on Twitter, on Buzz (I lost the link, but the thread included Jeff Huber) and on Google+. Amongst people with whom I work it's inciting fury and despair in equal measure.

Here's a simple reproducible example: let's start off with a fresh new Chrome Incognito Window:

2011 11 29 07 49 06 pm
and let's sign in to my work account. Here's my work email.
2011 11 29 07 52 12 pm

I'm going to use this "multiple accounts" feature to sign in with my personal account too. First

2011 11 29 07 52 58 pm
and then
2011 11 29 07 53 16 pm
and, username and password later, boom:
2011 11 29 07 54 31 pm

You see those two tabs there? Two perfectly coexisting Gmail tabs. One for work Gmail; one for personal Gmail; they both work. This, ladies and gents, is Google Multiple Accounts.

Thing is, though, I'm about to click a YouTube link in my work Gmail:

2011 11 29 07 55 58 pm
which works well enough and opens the YouTube "watch page". Great video! I want to share it…
2011 11 29 07 57 13 pm
…but I have to log in…
2011 11 29 07 57 35 pm
and BOOM the trapdoor opens and you're in login hell.

First you get this:

2011 11 29 07 57 53 pm
and of course you click the only link which is going to take you forward. Or so you think. In fact that link takes you to this:
2011 11 29 07 58 35 pm
and I mean DIRECTLY, no login form in sight.

Again, meaning to make forward progress you click "Sign in":

2011 11 29 07 58 54 pm
and you're greeted by a welcoming login form:
2011 11 29 07 59 19 pm
at which you log in, only to be returned directly to this:
2011 11 29 07 59 46 pm
But wait, look above. See the "hepwori" at the top right? I did actually sign in successfully, but got this error page nonetheless.

Already I see the Google guys rolling their eyes and sighing about the YouTube guys. But wait, there's more.

First of all, falling through this trapdoor has caused not just one, but both of my email tabs to log themselves out. Here's the work tab:

2011 11 29 08 00 58 pm
And personal:
2011 11 29 08 00 40 pm

So OK, perhaps this is still YouTube's fault? Let me ask you this. At this point let's say I open a new tab with ⌘T. In the Chrome address bar I type "foo" and do a Google search. Which account, if any, has this search added to its Web History?

Of course you've no idea. Nobody does. It's all random. Sign into two Google accounts at the same time and you're opening yourself up to the undefined, and good luck with that. Steer clear in particular of Blogger, YouTube, AdWords and Analytics… but basically your best bet is not expecting any of this stuff to work at all.

Google Apps customers are afflicted daily, and it's maddening. Google's silence on it is a particular shame, and—with due respect to many of my Googler friends with this suggestion—multiple browser instances or profiles aren't the answer. Either support multi-account sign-in in a single browser, or don't.

Update: a commenter on Google+ writes:

Sometimes [tabs with multiple logins] all work (ie co-exist) and then, for no reason i can fathom, i get logged out and i spend two days signing in to various accounts to try get back to where i was (ie, multiple accounts happily co-existing)
and yes, this is another part of the problem: the sheer fickleness of the system. On my home laptop and work laptop I have the same set of tabs open and yet the behavior is different when logging into additional services.

I've said it before and I'll say it again: grr!

Saturday, November 26, 2011


I took some photos of the West-Coast Hepworths today.







And Wendy took a photo of me:


This one, though, is from the hospital this last week. We're going to meet the subject in June 2012 and until then it's called Ce Ce. We're excited.

Ce Ce

Monday, November 21, 2011

Trauma equals traffic

I've been running this blog for over a decade now. In that time the second most trafficked post is Broken Wrist, Day 15. Still today, "broken wrist" features in two of the top ten keyword sets which bring traffic to this blog. To say I'm surprised would be an understatement.

But give the people what they want, right? So here's another story of personal medical trauma.

For four recent months I had no strength in my left shoulder and couldn't raise that elbow above shoulder height. It doesn't sound particularly awful but it interfered with even simple things like trying to put on a t-shirt, or carrying @thebeean in my left arm, or getting her in her car seat (inherently a left-handed activity). My shoulder would fall out of joint even with the weight of cradling Lux's head in my arm at bedtime.

It started with a fall down the stairs in early August: in socks, rushing down our polished wooden staircase one evening to comfort a crying @thebeean, I slipped and landed on my backside but also on my left hand. In medical circles, I later learned, they call this a "foosh": Fall Onto Out-Stretched Hand. My arm was straight and locked, and all the force went directly to my shoulder.

It was only a couple of weeks until I realized that the injury wasn't minor, and only a couple of weeks after that that I gave up waiting for it to get better on its own. I went to see my GP.

I was referred to an orthopedic surgeon. At first he thought I had a sprained rotator cuff. He sent me for physiotherapy. I went to a bunch of appointments, did a slew of exercises at home, but there was really no improvement: after six weeks I still had severely restricted upward motion in my left arm. My physiotherapist sent me back to the orthopedic surgeon with a suspected torn labrum.

The surgeon took another look and decided it wasn't a labral tear and was indeed a sprained and inflamed rotator cuff. He explained my options: I could go for an MRI, or I could get a steroid shot into the joint. He also explained that the most likely outcome of the MRI would be that steroid shot. So hey, I opted for the shot, skipping the MRI.

It was unpleasant, no doubt about it. The doctor froze the flesh on the outside first—I couldn't feel the needle, it's true—but the cortisone burned and swelled and burned deep inside my shoulder joint. I certainly can't recommend it for the experience alone. He left the room and told me to gently move my arm around to spread the chemicals internally. He'd be back in five minutes.

You know what? During those five minutes I experienced what felt like nothing short of a healing miracle. By the time the doctor came back I'd got 170° of motion back into my left arm, up from a constant 110° over the last sixteen weeks. Right now, just over two weeks on, I'm using my arm fully again—including to carry Lux around. I can ride my bike again without discomfort; it's fantastic.

Here's the super gal I can now carry around properly:

Day 476
Overall I'd recommend the shot.

Sunday, November 13, 2011

Day 470

Today marks 470 days since @thebeean was born.

Day 469

I'm still trying to take at least one photo each week.


Saturday, November 05, 2011

More Thoughts on Platforms

One of my very favorite books is The Systems Bible. It's a fantastic blend of parody, wicked humor, sharp insight, and thought-provoking truisms about systems. The particular genius is that the "laws of systemantics" around which the book revolves apply so well across so many diverse types of systems: political systems, social systems, management systems, business systems, transport systems, people systems, security systems, taxation systems, legal systems, computer systems and so on.

I can't recommend it highly enough.

Talking of systems, one of the very best systems practitioners I know left a comment on my blog post about platforms the other day:

When you say "You need to design this stuff in from the very beginning" ... I disagree. I'm not sure you meant it because later on you say "design ... with a platform mindset" which is (to my mind) quite different.

I don't know of many platforms which were created, from day one, as platforms. Platforms tend to emerge from successful products. Any of the things I've ever done that approached platform-ness started out as specific solutions to specific problems which after various rounds of a-ha! moments eventually evolved into platform-like things which supported the original solutions but also an entire class of solutions to similar problems.

If you start by trying to create a platform, you will likely fail. On the other hand, if in the course of building your product and iterating it into what customers actually want, thus making it wildly successful - if in the course of doing that you are "designing with a platform mindset", then you have a fighting chance of making the next evolutionary leap - extracting and then abstracting the platform from the product.

He's right, of course. And what else would you expect of one of the smartest systems guys around? My rhetoric went too far, and in fact his counter-argument is a simple analogue of laws number 15 and 16. So there's that.

I think the broader point remains, though: the best external platforms are built on top of solid internal platforms—and you'll struggle to deliver the former without first developing the latter.

Thursday, October 27, 2011

Perspectives on parenting

One of my most admired tech bloggers, Jeff Atwood, wrote a great post this week, "On Parenthood". It's moving and astute and insightful, and includes this:

My feelings on this matter are complex. I made a graph. You know, for the children.
That one percent makes all the difference
I also liked
I compare the process [of becoming a parent] to becoming a vampire, your old self dies in a sad and painful way, but then you come out the other side with immortality, super strength and a taste for human blood
The whole thing is worth a read.

Then a few days later I came across "Laws Concerning Food and Drink; Household Principles; Lamentations of the Father" [PDF]. This passage tickled me in particular

When you chew your food, keep your mouth closed until you have swallowed, and do not open it to show your brother or your sister what is within; I say to you, do not so, even if your brother or your sister has done the same to you. Eat your food only; do not eat that which is not food; neither seize the table between your jaws, nor use the raiment of the table to wipe your lips. I say again to you, do not touch it, but leave it as it is. And though your stick of carrot does indeed resemble a marker, draw not with it upon the table, even in pretend, for we do not do that, that is why. And though the pieces of broccoli are very like small trees, do not stand them upright to make a forest, because we do not do that, that is why. Sit just as I have told you, and do not lean to one side or the other, nor slide down until you are nearly slid away. Heed me; for if you sit like that, your hair will go into the syrup. And now behold, even as I have said, it has come to pass.
and also
On Screaming
Do not scream; for it is as if you scream all the time. If you are given a plate on which two foods you do not wish to touch each other are touching each other, your voice rises up even to the ceiling, while you point to the offense with the finger of your right hand; but I say to you, scream not, only remonstrate gently with the server, that the server may correct the fault. Likewise if you receive a portion of fish from which every piece of herbal seasoning has not been scraped off, and the herbal seasoning is loathsome to you, and steeped in vileness, again I say, refrain from screaming. Though the vileness overwhelm you, and cause you a faint unto death, make not that sound from within your throat, neither cover your face, nor press your fingers to your nose. For even now I have made the fish as it should be; behold, I eat of it myself, yet do not die.

Again, read the whole thing. It's worth it.

As if I know something (@thebeean is but 15 months old) I've compared the arrival of kids to taking a sudden 90° turn in life. Nothing slows down but everything's different: the terrain, the climate, and the look of the horizon. It's a whole thing.

day 449

Wednesday, October 26, 2011

Long and short clicks, and short patience

I worked at Google 2006–2010 and I remember it as being unique in many ways. While I was there, though, one particular thing which distinguished the core product—Google Search—from other similarly popular web properties was that it had a stated aim of reducing "time on site per visit".

For Google, a top-level measure of its success as a search engine is how quickly it can provide you with great results for your query and send you on your merry way. The better it does, the happier you are as a user, the more likely you are to use the product, and the more opportunity there is to show you relevant ads.

In order to make sure that you're not just being sent away for the sake of it, though, Google decided to measure both "short clicks" and "long clicks". Let's say I click on a Google Search result, find it useless, and immediately click the Back button. Google measures that as a "short click", ie. I wasn't away for very long. The implicit signal is that the search result didn't answer my query well. Google takes note of this.

In contrast, a "long click" is one where I navigate to a search result and don't come back straightaway. The implication is that I found what I was looking for.
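As a toy illustration of the distinction, here is a Python sketch that labels a click by how long the user stayed away. The ten-second threshold and all of the names are hypothetical; I have no idea what cut-off, if any, Google actually uses.

```python
from typing import Optional

# Hypothetical cut-off, in seconds, between a short click and a long click.
SHORT_CLICK_SECONDS = 10.0


def classify_click(seconds_away: Optional[float],
                   threshold: float = SHORT_CLICK_SECONDS) -> str:
    """Label a search-result click as 'short' or 'long'.

    seconds_away is how long the user was gone before returning to the
    results page, or None if they never came back at all.
    """
    if seconds_away is None or seconds_away >= threshold:
        return "long"
    return "short"


print(classify_click(2.0))   # bounced straight back
print(classify_click(None))  # never returned
```

A real system would presumably weigh many such signals together rather than trust a single bounce.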

Knowing this context it was interesting for me to see this feature for the first time in Google Search. Search for [hunter walk twitter] to find Hunter's Twitter account, click on the #1 result, then click your back button. You'll be offered this screen:

The feature's at least six months old but this is the first time that I've seen such a direct manifestation of short click feedback in the user interface.

I'm impressed at the option to customize my search results this way. Never forget, in fact, just how darned impressive Google Search is. It's surprising, though, that this is such a coarse setting. Offering to remove all results for all queries henceforth seems a little severe as a proposed reaction to a single short click for a single query.

Sunday, October 23, 2011

What I Have Learned: Attorney-Client Privileged and Confidential

Disclaimer: I am not a lawyer.

If you work in the technology sector in the States then at some point in your career you've probably received an email that started with the words in the title of this post.

Sometimes for effect the sender of the email may have bolded them:

Attorney-Client Privileged and Confidential

Others are known to go with a bold & italic strategy:

Attorney-Client Privileged and Confidential

And of course there is the ever-popular ALL-CAPS approach:

ATTORNEY-CLIENT PRIVILEGED AND CONFIDENTIAL


So what does this phrase mean? Is it really necessary? From what types of situation is it meant to protect you and your company? Here's what I've found out.

The last question is the easiest to answer. The law of civil procedure includes provisions for the process of "discovery" whereby each party involved in a civil action can request documents and other evidence, or can compel the production of evidence by using subpoenas or other means. Much of what goes into a company's day-to-day business is "subject to discovery", ie. may be required to be turned over to the legal counterparty (ie. the “other side”) in the suit. That includes product source code, documentation, reports, letters, faxes, files, and phone records.

Most crucially, though, for tech companies: all emails are by default subject to discovery.

This is where our famous phrase comes in: excluded from the discovery process are materials which fall under "attorney-client privilege". Attorney-client privilege is a legal structure which protects all communications where a client seeks or obtains legal advice from an attorney. Such "privileged" conversations may be withheld from the opposing party.

Legally, one needs to consider a number of factors when looking at whether a particular email communication falls under attorney-client privilege; EITHER

  • the email is from a lawyer representing the company, and dispenses legal advice or opinion. Such communications are almost always privileged

or ALL OF THE FOLLOWING must apply:

  • the communication has to be with a lawyer who represents the company, since attorney-client privilege, as the name may suggest, only covers communication between a client (you) and an attorney (an edge case might be an email to someone else which explicitly asks for a lawyer to be added to the thread);
  • the communication must be clearly and directly asking for legal advice or guidance; and
  • the communication must not be broadly distributed (eg. if you email a smaller mailing list it is fine, but if you choose to cc a large group then the communication is likely no longer covered)

Note that "the communication includes a banner identifying the email as attorney-client privileged" is not one of the criteria! In fact the text "Attorney-client privileged" is itself not significant in any way, ie. the communication would be privileged—or not—with or without the banner. The only reasons to keep the text in place are

  • to make explicit, to teams performing legal discovery, that in the opinion of the sender this message is protected under attorney-client privilege; and
  • to remind recipients of the email that the material is sensitive and subject to privilege. Forwarding such an email usually removes it from the umbrella of privilege.

The final piece of legal advice is never to take legal advice from someone who's not a lawyer. The above is a guideline only; when in doubt, always consult an actual attorney.


UPDATE 12 July 2013 Interesting additional notes from an actual corporate attorney:

  1. Attorney-client privilege exists under both state and federal law in the United States. There are a few state-specific differences that can affect when/whether the privilege attaches to a communication.
  2. Attorney-client privilege works differently, if at all, outside the United States. For example: at a global company with employees in, say, the UK—when a UK-based employee seeks legal advice from US-based attorneys, attorney-client privilege does not exist for that communication in the UK.

Thursday, October 20, 2011

Thoughts on Platforms

I'm a big fan of Steve Yegge. I think that his recent platforms rant is, once you look past the haha-Google-doesn't-get-it and haha-Google-employee-bashes-Google+-on-Google+, a contemporary classic essay on platforms and technology strategy. Articulate, entertaining, irreverent and really insightful—like Steve himself, I'm told.

A friend of mine sent round an email recently asking for thoughts on it. What was Yegge trying to say? Here's what I think: he was trying to say a number of things.

  1. A key business/product truism: "a platform-less product will always be replaced by an equivalent platformized product". If your product isn't a platform then prepare to be disrupted by someone who builds a platform and a competing product on top. Platforms enable unimagined/uncontemplated user needs to be met. Ultimately, platforms win.
  2. You can't just build a product, slap an API on top and call it done. That's not a platform and it's probably not even a good API. You need to design this stuff in from the very beginning.
  3. Once you've decided that you're going to build platforms (rather than adding an API as an afterthought), the way to get good at exposing platforms externally is to live them internally.
  4. To live platforms internally they need to pervade your organization, all your thinking and your internal architecture. It can't be just "hey we have an RPC interface". How is that service registered and discovered programmatically? How are quota and rate limits managed? Pluggable authentication? Front-to-back end-to-end tracing? How is it regression-tested as it evolves? Can anyone in the company just pick up the docs for the service and start calling it? Is there an SLA? How is capacity planned and provisioned? Talking of docs, do you have reference manuals and tutorials, code samples, lists of error messages and so on? Who gets paged if it goes down as a result of a supporting service going down? You don't have a platform unless you have a consistent, uniform approach to all these horizontals supporting a vertical API.
  5. If you design your internal systems with a platform mindset then every day you're dogfooding your own platform efforts. It'll be easier to expose a rich and thorough platform to your customers and users, and the quality of the platform you expose will be higher.

The part of Yegge's story about Bezos reminded me of my own similar experience. In 1997 I started work at an investment bank in London. The CIO there had just decided that from now on every app developed internally should be delivered in a browser. If you don't build web-enabled apps then you're out. He had mugs and t-shirts made with slogans such as "web or dead" and "wired or fired". He gave them out to employees. I think I still have my web or dead one.

But this seemed like insanity at the time. Internet Explorer 3 was the default browser. The investment bank I'd just come from didn't even have internet access, or browsers, on their desktops. There were many people at the bank who thought that this web app idea was the most stupid idea ever, particularly those who only knew COBOL. Or had spent years building thick clients in Visual C++ or VBA.

I was impressionable, though, kept my head down and became good at building web apps.

And! Once the bank had developed web-based trading apps and web-based finance apps and web-based legal apps and web-based book-building apps for internal use... waddyaknow we could make these same apps available directly to our customers in their browsers. This was utterly revolutionary—and gave us an enormous market advantage, literally years ahead of our competitors.

It couldn't have happened without everyone internally living, breathing, and being hired and fired around, this internal mandate for browser-based apps. The skills we built and lessons we learned while solving for our internal apps were all directly applicable to the turning inside-out of these same apps.

I think that's similar in spirit to Yegge's point. You'll get good at externalizing platforms by internalizing them.

Sunday, September 11, 2011

DIY @mention constellations, Part V

Last time we got as far as sourcing bulk data from the Twitter Streaming API and producing (over the course of several hours compute-time) a beautiful set of constellations like this:

Full Mention Graph
upon which we wondered:
  • how can we make the rendering faster?
  • how can we get rid of the fairly dull stuff around the edge?
  • how can we get a more detailed view of the center?

It turns out that there's one part of Graphviz which addresses all of these questions: ccomps.

Graphs like this are made up of a finite number of connected components. You can think of these as the distinct separable mention islands which make up the diagram. What if there were a way to plot only the n largest of the connected components? What if there were a way to plot the single largest connected component? Turns out that there is.

Graphviz, as well as having tools like dot and sfdp and neato for plotting graphs, also includes a tool called ccomps which can separate out the connected components of a graph.

Let's start by picking out the blob at the center of our big graph. Zooming in, it looks like this:

Ccomp0 small
so let's take our file mention-graph.gv and pass it through ccomps before rendering it:
ccomps -zX#0 mention-graph.gv | sfdp -Gbgcolor=black -Ncolor=white -Ecolor=white -Nwidth=0.02 -Nheight=0.02 -Nfixedsize=true -Nlabel='' -Earrowsize=0.4 -Gsize=1.5 -Gratio=fill -Tpng > ccomp0.large.png
which (in mere seconds!) gives us
Ccomp0 large

Here we used ccomps -zX#0 to pick out the zeroth connected component of the graph defined in mention-graph.gv, ie. the largest one.
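If you're curious what "connected component" means in code, here's a minimal sketch in Python. It isn't how ccomps is implemented. It's a plain breadth-first search that ignores edge direction (my assumption about what makes a mention island), and the function name and the toy edge list are made up for illustration.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Return the components (sets of nodes) of an edge list, largest first.

    Edge direction is ignored: a mention links two accounts either way round.
    """
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    seen = set()
    components = []
    for start in neighbours:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for other in neighbours[node]:
                if other not in seen:
                    seen.add(other)
                    component.add(other)
                    queue.append(other)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Toy graph: one island of three, one of two, and a self-mention.
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("f", "f")]
islands = connected_components(edges)
print(islands[0])
```

Sorting largest first means that islands[0] plays the same role here as component #0 does in the ccomps command above.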

That was easy. The other two birds, making the rendering of the big graph faster and removing the dull stuff around the edge, we kill with one stone. We use ccomps to pick out the largest 1,001 connected components of the graph and plot only those:

ccomps -zX#0-1000 mention-graph.gv | \
grep "-" | cat <(echo "digraph mentions {") - <(echo "}") | \
sfdp -Gbgcolor=black -Ncolor=white -Ecolor=white \
  -Nwidth=0.02 -Nheight=0.02 -Nfixedsize=true \
  -Nlabel='' -Earrowsize=0.4 -Gsize=75 -Gratio=fill \
  -Tpng > ccomp0-1000.large.png
which gives us this graph, comparable to the original but with a render time in seconds instead of hours:
Ccomp0 1000 large

You'll notice that we snuck a little trick in there, which was flattening the output of ccomps using grep|cat<(echo). That little one-liner takes a single graph composed of many wholly connected subgraphs and flattens it to a single graph of many connected components. There's no structural change to the graph but a flat graph renders more quickly.
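For anyone who finds the shell one-liner opaque, the same flattening idea can be sketched in Python. This is an approximation rather than a drop-in replacement: it keeps lines containing "->" (slightly stricter than the grep, which keeps any line with a hyphen), and the sample input is a hand-written guess at what multi-component output might look like, not real ccomps output.

```python
def flatten_dot(lines, name="mentions"):
    """Keep only the edge lines and wrap them in a single digraph."""
    edge_lines = [line.strip() for line in lines if "->" in line]
    return "\n".join([f"digraph {name} {{", *edge_lines, "}"])


# Hypothetical input: two components, each in its own wrapper.
sample = """digraph mentions_cc_0 {
    "a" -> "b";
}
digraph mentions_cc_1 {
    "c" -> "d";
}""".splitlines()

flat = flatten_dot(sample)
print(flat)
```

Note that, like the grep version, this throws away anything that isn't an edge, so nodes without edges and any per-node attributes are lost.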

There are a couple other tricks you'll learn to use too:

  • separate layout (sfdp) from rendering (neato -s -n2)
  • use tee to save the output from stages in a pipeline
and you'll notice that the mention-graph shell script does these things as well as the flattening trick (if you're interested you can take the script apart to see how it works in detail). Separating layout from rendering is a powerful technique if you want to first lay a graph out and then colorize it or label it before rasterizing.
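Here's a rough sketch of the layout/render split, driven from Python rather than the shell. It assumes Graphviz is installed and on your PATH, and that sfdp run without a -T flag writes DOT annotated with node positions (its default output, as far as I know). The function names and file names are made up for illustration.

```python
import subprocess


def layout_command(source, laid_out):
    """Stage 1: sfdp computes positions and writes them out as DOT."""
    return ["sfdp", "-o", laid_out, source]


def render_command(laid_out, image):
    """Stage 2: neato -s -n2 reuses the stored positions instead of laying out again."""
    return ["neato", "-s", "-n2", "-Tpng", "-o", image, laid_out]


def layout_then_render(source, laid_out, image):
    # The expensive step. Saving its output to a file is the job tee does in a pipeline.
    subprocess.run(layout_command(source, laid_out), check=True)
    # The cheap step. Rerun this alone after recoloring or labelling `laid_out`.
    subprocess.run(render_command(laid_out, image), check=True)


print(" ".join(render_command("mention-graph.laid-out.gv", "mention-graph.png")))
```

Calling layout_then_render("mention-graph.gv", "mention-graph.laid-out.gv", "mention-graph.png") would run both stages. After that, only the second stage needs repeating when you change the styling.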

Something you'll want to play with as well (particularly for graphs larger than a few million edges) is removing certain nodes, particularly those with either a high degree (eg. remove accounts which are mentioned a lot) or a low degree (eg. remove accounts that are mentioned only once or twice). In the beginning I used SQL to do this min/max pruning but eventually got bored of waiting for SQL and wrote some Python instead.
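The Python I wrote isn't reproduced here, but a minimal sketch of min/max pruning might look like the following. It's an illustration under assumptions: "degree" here means the number of edge endpoints touching an account (inbound plus outbound), degrees are computed once rather than recomputed after pruning, and the function name and toy data are made up.

```python
from collections import Counter


def prune_by_degree(edges, min_degree=1, max_degree=None):
    """Drop every edge that touches an account outside the degree bounds."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1

    def keep(account):
        if degree[account] < min_degree:
            return False
        return max_degree is None or degree[account] <= max_degree

    return [(src, dst) for src, dst in edges if keep(src) and keep(dst)]


# Toy graph: "hub" is mentioned a lot; c, d and e each appear only once.
edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("a", "b"), ("d", "e")]
without_stragglers = prune_by_degree(edges, min_degree=2)
without_hubs = prune_by_degree(edges, max_degree=2)
print(without_stragglers)
print(without_hubs)
```

Because the counts are taken before anything is removed, pruning can leave behind accounts whose remaining degree is below the minimum. Running the function a second time is one way to tidy that up.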

This is pretty much the end of the @mention constellations series. I've had enormous fun generating these graphics, developing these techniques, learning and teaching along the way. It gives me huge satisfaction to see that @ialexs has picked up on this work and is taking it to the next level with such beautiful creations as this.

Saturday, August 27, 2011

DIY @mention constellations, Part IV

So far there's been Part I, Part II and Part III, and now we're going to take a look at applying Graphviz to large Twitter mention graphs.

Getting the data is easy enough: see the embedded 39-line Python script in the sample code. So let's say you run that Python code to give you 50,000 random mentions from the Twitter firehose. Let's say that having done that, having sort'd and uniq'd, and added a one-line header and one-line footer you have, like I do, a 47,375-line file which begins with

digraph mentions {
"0001am_" -> "kaaly_"
"000eca000" -> "000eca000"
"000eca000" -> "kira_moka"
"000parra" -> "amolosflips_"
"00_dag" -> "nishinoakihiro"
"00alliesmeaton" -> "caitlanpratt"
"00kuro" -> "sena1029"
"00nelht" -> "gvwriters"
"00rico00" -> "tsubo0307"
and ends with
"zwackleby" -> "gauravh1"
"zxicee" -> "parnnnparnnn"
"zyhafiyah" -> "amyshaheera"
"zyhnlyh" -> "elsaaps"
"zymecca" -> "wowkonyol"
"zz0_ee" -> "becky_aisha"
"zzangfia" -> "somin_somu"
"zzz_ho" -> "dewwanna"
"zzzoob" -> "ko5712"
that is, you have a directed graph in DOT format representing unique mentions amongst a random sample of 50,000 from Twitter.
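The steps from raw Tweets to that file can be sketched in a few lines of Python. This is not the 39-line script from the sample code, just an illustration of the extract, sort, uniq and wrap steps. The regular expression, the lower-casing and the example Tweet texts are my assumptions. The usernames are borrowed from the excerpt above.

```python
import re

# Simplistic pattern: it would also match the "@domain" part of an email address.
MENTION = re.compile(r"@(\w+)")


def mention_edges(tweets):
    """Yield (author, mentioned) pairs, lower-cased, from (author, text) tuples."""
    for author, text in tweets:
        for name in MENTION.findall(text):
            yield author.lower(), name.lower()


def to_dot(edges, name="mentions"):
    """Sort and de-duplicate edges, then add the one-line header and footer."""
    lines = [f'"{src}" -> "{dst}"' for src, dst in sorted(set(edges))]
    return "\n".join([f"digraph {name} {{", *lines, "}"])


# Made-up Tweet texts for illustration.
tweets = [
    ("zzzoob", "hello @ko5712"),
    ("000eca000", "note to self @000eca000 and @kira_moka"),
    ("zzzoob", "hello again @ko5712"),
]
dot = to_dot(mention_edges(tweets))
print(dot)
```

The duplicate mention of ko5712 collapses to a single edge, which is why 50,000 mentions end up as somewhat fewer lines in the file.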

Having seen how simple Graphviz is, you'll probably render a graph directly from this file, with a command like

sfdp -Gbgcolor=black -Ncolor=white -Ecolor=white -Nwidth=0.02 \
    -Nheight=0.02 -Nfixedsize=true -Nlabel='' -Earrowsize=0.4 \
    -Gsize=75 -Gratio=fill -Tpng mentions.gv > mentions.png
from which you'd get this image, after waiting a long long time (on my machine about four hours):
Full Mention Graph

Most likely you'd be agog, like I was when I ran this process for the first time. And then, like I did, you'd wonder how to make it faster and how to get rid of the fairly dull stuff round the edge. You might also wonder, like I did, what's inside that blob in the center.

That's for next time.

Next: Part V

Sunday, August 21, 2011

DIY @mention constellations, Part III

So you checked out Part I and Part II, no doubt. You probably installed Graphviz, maybe you tried out the shell script I posted on pastebin, and perhaps you have your own one of these now:

Basic Mention Graph

If so, nice going. This post is going to dial it wayyy back and start at the beginning working with some basic Graphviz functionality.

Graphviz is a suite of software tools for working with graph data, where you can think of a graph as a set of nodes, some of which are connected by edges. At its most basic a graph is a bag of dots with a bunch of lines connecting some dots to other dots. As you'll see, there's more than one way of representing this visually.

But talking of dots, Graphviz works with data assembled into a format known itself as DOT. Here's an example of a graph defined in the DOT language:

digraph basic {
    x -> y
    y -> x
    y -> z
    z -> a
    z -> a
    a -> x
}
It defines a graph where the nodes are identified by the letters x, y, z and a.

Save the above graph definition as a text file called basic.gv (Graphviz documents have extension .gv by convention); we're going to use the basic Graphviz commands to visualize this graph in different ways.

At the command line, run this:

dot basic.gv -Tpng > basic-dot.png
and here's what you'll get as output:
Basic dot
Simple! This represents our graph perfectly. Try this:
neato basic.gv -Tpng > basic-neato.png
to get
Basic neato
And then there's twopi and circo:
twopi basic.gv -Tpng > basic-twopi.png
Basic twopi
circo basic.gv -Tpng > basic-circo.png
results in
Basic circo

Finally, we're going to add some styling options to the graph. Run this command:

circo basic.gv -Gbgcolor=black -Ecolor=yellow -Earrowsize=0.3 -Epenwidth=0.4 -Nlabel='' -Nwidth=0 -Nheight=0 -Nfixedsize=true -Gsize=4 -Gratio=fill -Tpng > basic-circo-options.png
to get the output
Basic circo options
You can see how this works. The -G options apply to the whole Graph. The -N options to the Nodes and the -E options to the Edges. There's online documentation covering all the various options.
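If you end up generating these commands from a script, the -G/-N/-E convention maps neatly onto three dictionaries. Here's a small Python sketch of that idea. The graphviz_flags helper is hypothetical (it's not part of Graphviz) and the attribute values are simply the ones from the command above.

```python
def graphviz_flags(graph=None, node=None, edge=None):
    """Turn attribute dicts into -G (graph), -N (node) and -E (edge) flags."""
    flags = []
    for prefix, attrs in (("-G", graph), ("-N", node), ("-E", edge)):
        for key, value in (attrs or {}).items():
            flags.append(f"{prefix}{key}={value}")
    return flags


flags = graphviz_flags(
    graph={"bgcolor": "black", "size": 4, "ratio": "fill"},
    node={"label": "", "width": 0, "height": 0, "fixedsize": "true"},
    edge={"color": "yellow", "arrowsize": 0.3, "penwidth": 0.4},
)
# As an argument list, "-Nlabel=" is what the shell passes for -Nlabel=''.
command = ["circo", "basic.gv", *flags, "-Tpng"]
print(" ".join(command))
```

The resulting list can be handed to subprocess.run with stdout redirected to a file, which sidesteps shell quoting for the empty label.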

Now imagine that rather than basic.gv

digraph basic {
    x -> y
    y -> x
    y -> z
    z -> a
    z -> a
    a -> x
}
we have instead a graph representing Twitter users @mentioning each other. In the next post we'll look at how we could apply Graphviz to that.

In the meantime, here's where you can ultimately take this. A 369 megapixel graph of the largest connected component amongst 30m mentions on Twitter, with the most active 80k users removed. I'm working on identifying the clumps; I suspect that they're geographic or language regions.

Next: Part IV

Thursday, August 18, 2011

DIY @mention constellations, Part II

At work I wrote a document entitled "Pig for Dilettantes and Cargo-Culters". If you're the kind of person who's at least once used Terminal on the Mac, but know little about distributed computing or Twitter's big data schemata, then following the steps in that document is probably the fastest way to get to the point of being able to extract meaningful data out of the Twitter Hadoop cluster. From there you can explore, tweak the scripts, and eventually you'll be able to get the data that you're actually interested in.

In a similar vein I present this post. If you've never opened Terminal on the Mac then this probably isn't for you. If you know basically what's going on at the command line, and you're a hardy explorer/experimenter, then read on.

First of all, I presume you've read Part I and have installed Graphviz. Both are required, I'm afraid. Not strictly required is a Mac, but if you're running something other than OS X then you're likely going to need to make some small adaptations for your platform.

So, with Graphviz installed, check out the mention-graph shell script I put on pastebin. Copy it, save it to your Mac as mention-graph, chmod +x it, and you're set.

Using this script, I just ran a constellation of 50,000 live mentions from the Twitter Streaming API, 75 inches square (72 dpi), by running "./mention-graph -n 50000 -u isaach -o -v -s 75":

Screen shot 2011 08 18 at 11 26 16 PM
and here's the output it dropped as mention-graph.png:
Basic Mention Graph

You need to supply your Twitter credentials (the above command, which you should edit to use your own username, will ask for your password) and note that this script sends them in the clear to Twitter. If this worries you then feel free to either edit the script to meet your security standards, or create a Twitter account dedicated to this kind of use, separate from your primary account.

Next time: what this all means and how to take it further. In the meantime, let me know on Twitter how you get on.

Next: Part III

DIY @mention constellations, Part I

People really enjoyed the mention constellation thing. I'm chuffed that on Twitter I've received Tweets about it from five continents, and the thing's been written about by friends and strangers alike. People at work liked it too, which means a lot to me.

A couple of questions I got stood out: (a) can I get the data?; and (b) can I get the code?

The answer to both is yes!

I'm going to do two things. First of all I'm going to post a complete, free, end-to-end solution for generating something like this:

Basic Mention Graph
Secondly, I'm going to explain how the code works.

First of all, though, you need to install Graphviz. It's straightforward, especially on a Mac. Go!

Next: Part II

Sunday, August 07, 2011

About the @mention constellations

Update: find out how to make one of these.

So, about this @mention constellations stuff.

The FAQ:

  • What exactly am I looking at? The main visualization is a map of Twitter mentions on June 21st. Each dot is a Twitter account. Each arrow dot-to-dot illustrates one account mentioning another. Despite the scale of the diagram the underlying dataset is relatively tiny: less than 10 minutes of conversation.
  • Why do some accounts seem to mention themselves? Occasionally accounts do actually mention themselves.
  • Can I get the data? Behind this visualization of June 21st in particular? No. In order to make the same thing from another day? Sure, you can get more than enough data to produce these things for free from the Twitter Streaming API.
  • What's the blob in the middle? Technically speaking it's the largest connected component of the mention graph. I just uploaded a detailed look inside it.
  • What software did you use to make this? Mainly Graphviz.
  • Sure, but how exactly did you make it? I took a sample of Tweets from Twitter's internal Hadoop cluster. I used a tiny Python script to extract the mentions. I loaded the data into a local MySQL instance. I queried MySQL for a sample of the mentions. I formatted the sample into dot using Perl and I laid out and rendered a PNG using Graphviz.
  • Is this your job at Twitter? No, this is a hobby project.
  • What's it like to work at Twitter? Very cool indeed. If you're interested I wrote some stuff about my transition from Google to Twitter at

Tuesday, August 02, 2011

More @mention constellations

The previous post showed a glimpse of a work in progress. Today marks the first formal checkpoint of my hobby project and I'm proud to present the first full iteration at

What you're looking at is a visualization of a sample of Twitter @mentions on one day in late June. Each vertex is a Twitter account. Each directed edge is a mention of one Twitter account by another. You can see some accounts which get mentioned a lot (lots of inbound arrows to a central point) and accounts which do a lot of mentioning (lots of outbound arrows from a central point; these are mainly automata).

I find it absolutely captivating to explore. It's like a safari of conversational molecules.

Coming soon: more details about how I generated this visualization, and how you can produce your own.

Wednesday, July 27, 2011

@mention constellations

I've been lucky enough working at Twitter to entertain a hobby project or two. My current obsession is "@mention constellations". Let me show you them:

@mention constellations

What you're looking at is a small section of a larger graph showing Twitter users mentioning other Twitter users. Each vertex is a Twitter account. Each directed edge is a mention of one account by another. In this image you can see some accounts which get mentioned a lot (lots of inbound arrows to a central point) and accounts which do a lot of mentioning (lots of outbound arrows from a central point). The latter are mainly automata.

To me, in this presentation, the many distinct configurations look like galaxies. Or perhaps viruses. Can you recognize the basic phyla in this ecosystem? Some commonality, a lot of diversity; it's a menagerie of conversational molecules akin to the patterns one finds in Conway's game of life.

I'm working with Graphviz to produce these images, and I have hopes for Gephi although it's not there yet.

Thursday, July 21, 2011

July 20th

shrieky baby is shrieky
getting my lion on
this one for @dickc RT @Kurt_Vonnegut: Peculiar travel suggestions are dancing lessons from God.
whoah. @wendyverse and I just witnessed @thebeean's first steps. we're gobsmacked.
wow wow wow RT @astro_aggie: This is best picture I've ever taken. Atlantis and aurora last week. Stunning! #FromSpace
by popular request, video of some of @thebeean's first steps this morning.…. um, pretend the underpants aren't there.
thanks @AlFranken for this most delightful takedown of a Focus on the Family witness at today's #DOMA hearing:…
i for one would like to learn more about "Windows Party Mode"
lolwut? private static Nothing nothing = new Nothing();
wow, google shuttering google labs. surprising.…
I liked a @YouTube video Rebecca Black - My Moment (NO AUTO-TUNE Version!)
long day is long. headed home.
not the most attractive window title for the default finder window in #lion:…
how long until the first pc oem reverses scroll direction on a windows laptop?
tonight @wendyverse will be modeling the twitter @townhall feetie from @OriginalFeetie:
wow, @salgar just sent me a message on Google Buzz. that dude is kicking it old-skool. respect!
the fact that apple's iPad business is bigger than dell's pc business is just mind-blowing.

Tuesday, July 19, 2011

July 19th

haha! "Murdochs Vow to Launch Full Investigation to Find Out Who is Running Company They Are in Charge Of"…
priceless! RT @caroldecker: God will you all knob off! I do not look like RB!!!!! You mother fuckers!!!
wow just wow. "Apple vs. NASDAQ, Microsoft, and Google. From the day of Google’s IPO through today"… /cc @kevinthau
prototyping a new blog format. next up: inline media.…

Monday, July 18, 2011

July 18th

this web page is very pleased with itself for intercepting right-clicks:…
>boggle<. NotW phone hacking whistleblower found dead and "the death is currently being treated as unexplained"…
what a great idea! "Netflix for Baby Clothes" /cc @hmnoise @wendyverse
weirder and weirder. police examine bag with computer, papers and phone found in trashcan near Rebekah Brooks's home:…
wow. "i understand if u smoke weed, but no touching the children. ALL RACES WELCOME BESIDE ORIENTAL!"… (via @grauface)
shit's got serious. i'm about to install Eclipse, lord help me.
already frustrated beyond belief with Eclipse. that didn't take long.
this evening's view from the front deck @ Casa Nueva Crapworth

Saturday, June 18, 2011

Catching Up

It's been so long since I last posted I figured a catch-up was in order. I think I have the Instagrams to tell the story since last time.

So obviously the bean's doing great:

We bought a new house:
empty house is empty
and filled it with boxes:
My commute goes over Bernal Hill now:
bernal commute #1
which looks lovely in the evening too:
On June 1st we launched the new Twitter Search:
twitter search+photos launch room
and celebrated with drinks that evening
cheers to the launch of the new twitter #search
and dinner the next week

After the launch I needed some time off, so took a week's staycation beginning with a walk over Bernal (Friday)

evening in bernal
and then a trip to Ikea (Saturday)
Jag, Robot
a walk in the Mission (Sunday)
a trip downtown (Monday)
a picnic in the park (Tuesday)
#staycation picnic with @thebeean
a trip to the wine country (Thursday)
yay!#staycation wine tasting
and then, on Friday 17th June, my 38th birthday, a birthday card made of elephant poo:
.@wendyverse sure picked a special birthday card
dinner at a favorite restaurant
a trip to the animal shelter
feral cat entrance
and adoption of Chaucer the cat:
Chaucer the cat

Today was @thebeean's 322nd day:

day 322
and she met Chaucer:

And now you're caught up.