Dataclysm by Christian Rudder

I'm currently writing a novel set in the near future. It imagines the consequences of those endless breadcrumb trails we're currently leaving behind us as we navigate our data-driven world. We know we're under constant surveillance - especially since the Snowden revelations. 'Big data' is just the latest way for corporations and governments to learn everything about us. We sometimes forget what a recent phenomenon all this is.

The truly massive data sets held by the Googles, the Facebooks and the Apples - and by the GCHQs and the NSAs - are just a few years old. So although they give their owners intensely detailed snapshots of our current behaviour and preferences, the arrow of time runs only a few years back inside them. As Christian Rudder points out in this peppy, accessible book, it's hard to predict how much richer these portraits will become when every living being starts leaving a data trail the moment they're born, and only stops self-recording when they're dead. But even though this new field is only taking its first tentative steps, it's already telling us things that were impossible to know a few years back, in the data dark ages.

Rudder is better positioned than most to understand this. As the top data wonk at dating site OK Cupid, he has unfettered access to one of the world's largest databases of human emotions and desires. He uses this book - which is a refined and beautifully designed expansion of a popular blog - to share a few dozen striking insights that, between them, enrich our understanding of how we relate to one another; how we hold ourselves apart; and who we are in our own right. His insights are about people as an aggregate mass, not as individuals. This book contains none of those homey anecdotes every other popular science book uses to bring its subject to life. It's uncompromisingly about the data; but the stories it tells are all the more powerful for illustrating things that are common to us all - or at least to very large cohorts of us - rather than what sets each of us apart.

To show how this works, here's one of Rudder's findings about our preferences for romantic partners. It's a great example of a thing we might believe intuitively, but would never have been able to prove until hundreds of thousands of people all chose to log their most intimate feelings on a website called OK Cupid. On the site, as in so many aspects of life, people give each other a star rating for appeal, out of 5. As you'd expect, some people get consistently high ratings, some consistently mediocre, and some just bad. Such is life. But in his second chapter, Rudder concentrates instead on those people who divide opinion - those whose star ratings show the highest variance. In other words, those who people disagree about most.
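If 'variance' sounds abstract, a few lines of Python make it concrete. This is only a sketch: the two sets of star ratings are invented for illustration and come from neither the book nor OK Cupid, and the function name rating_spread is hypothetical. It shows two people with the same average rating but very different levels of disagreement among the raters.

```python
from statistics import mean, pvariance

def rating_spread(ratings):
    """Return (average, variance) for a list of 1-5 star ratings."""
    return mean(ratings), pvariance(ratings)

# Invented ratings, for illustration only (not data from the book).
consistent = [3, 4, 3, 4, 3, 4]  # everyone roughly agrees
divisive = [1, 5, 5, 1, 5, 4]    # raters love or hate this person

for label, ratings in (("consistent", consistent), ("divisive", divisive)):
    average, variance = rating_spread(ratings)
    print(f"{label}: average={average}, variance={variance}")
```

Both profiles average 3.5 stars, but the second has a far higher variance. It's that second kind of profile Rudder is interested in.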

He compares the most consistently highly-scoring straight women* against those women with the most diverse scores - the highest variance - and looks at how many messages straight men send to these women. This metric - messages sent - is a more reliable measure of romantic interest than stated preference. It leads to dates and to, you know, the other thing. What's fascinating is that the uncontroversially attractive women are not the ones who get the most action. The opinion-dividers reliably outperform them - by 10%.

I think there's something profound going on here. To be, as the French say, jolie laide - beautifully ugly - seems to inspire the greatest passion in others. These men who are voting with their feet - or with some other part of their anatomy - for women with the highest variance of attractiveness must on some level be feeling: She is the one for me. Specifically for me, and for nobody else. After all, who wants what everyone else wants? That's just so generic.

I suspect this applies to other passions, too. If I look at the Amazon rating for the last book I reviewed here, the extraordinary Dept. of Speculation by Jenny Offill, I see it has a rating of only 3.4 out of 5. Which is just bonkers, in my humble opinion. But look how these ratings are distributed:

That is a pretty good example of variance. Some people evidently hated it. And yet this is the book I've had more people passionately recommend to me than any other in at least a year.  

Could it be that with fiction, as with sex, we save our greatest passions for the most obscure objects of our desire?

* If the bare reporting of the male gaze makes you uneasy, that's kind of the point. The book, like data analytics, takes no position; it simply reports. But of course it's impossible to separate this kind of analysis from the gender-political baggage that comes when a man rates a woman for attractiveness. Rest assured, though: although this example is about straight males rating straight females, the book also contains a wealth of equally unstinting data provided by people with a raft of other sexual preferences.