Data Science of the Facebook World

The ever-insightful Stephen Wolfram has another graph-heavy post, this time compiling data from Facebook analytics:

More than a million people have now used our Wolfram|Alpha Personal Analytics for Facebook. And as part of our latest update, in addition to collecting some anonymized statistics, we launched a Data Donor program that allows people to contribute detailed data to us for research purposes.

A few weeks ago we decided to start analyzing all this data. And I have to say that if nothing else it’s been a terrific example of the power of Mathematica and the Wolfram Language for doing data science. (It’ll also be good fodder for the Data Science course I’m starting to create.)

We’d always planned to use the data we collect to enhance our Personal Analytics system. But I couldn’t resist also trying to do some basic science with it.

I’ve always been interested in people and the trajectories of their lives. But I’ve never been able to combine that with my interest in science. Until now. And it’s been quite a thrill over the past few weeks to see the results we’ve been able to get. Sometimes confirming impressions I’ve had; sometimes showing things I never would have guessed. And all along reminding me of phenomena I’ve studied scientifically in A New Kind of Science.

So what does the data look like? Here are the social networks of a few Data Donors, with clusters of friends given different colors. (Anyone can find their own network using Wolfram|Alpha, or the SocialMediaData function in Mathematica.)

It’s a pretty fascinating read.

My favorite graph was this one, showing the distribution of your Facebook friends’ ages versus your own age:

The age of your Facebook friends versus your age.

It’s also quite interesting how the marriage statistics from Facebook line up with the official Census data:

Facebook marriage age vs. Census data.

For a lot more analysis, read Stephen Wolfram’s entire post.

Your Typing Style as a Password

Paul-Jean Letourneau, Lead Developer for Wolfram|Alpha, recently read a New York Times article describing how we may one day be able to skip the password and log in simply by typing a user name (or some other string). The takeaway is that the way you type the characters can serve as an identifier unique to you.

Using Mathematica, Letourneau then decided to analyze his own typing signature by measuring the timing between keystrokes as he typed “wolfram.” He details everything in this blog post.

Using this fun little typing interface, I feel like I actually learned something about the way my colleagues and I type. The time to type two letters with the same finger on the same hand takes twice as long as with different fingers. The faster you type, the more your typing speed will fluctuate. The more your typing speed fluctuates, the harder it will be to distinguish you from another person based on your typing style. Of course we’ve really just scratched the surface of what’s possible and what would actually be necessary in order to build a keystroke-based authentication system. But we’ve uncovered some trends in typing behavior that would help in building such a system.

It is fascinating to see research and practice brought together. You can even test your own typing profile by installing the CDF (Computable Document Format) player in your browser. Very cool!
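To make the idea of a typing profile concrete, here is a minimal sketch in Python rather than Mathematica. It is not Letourneau's code: the function names `digraph_intervals` and `typing_profile` and all of the timestamps are invented for illustration. It assumes each keystroke is recorded as a (key, press time in seconds) pair and summarizes the time between consecutive key presses for each two-letter pair.

```python
from statistics import mean, stdev

def digraph_intervals(events):
    """events: list of (key, press_time_seconds) in typing order.
    Returns {two-letter pair: [seconds between the two presses, ...]}."""
    intervals = {}
    for (k1, t1), (k2, t2) in zip(events, events[1:]):
        intervals.setdefault(k1 + k2, []).append(t2 - t1)
    return intervals

def typing_profile(samples):
    """samples: list of event lists, one per repetition of the same word.
    Returns {pair: (mean interval, standard deviation)}."""
    merged = {}
    for events in samples:
        for pair, values in digraph_intervals(events).items():
            merged.setdefault(pair, []).extend(values)
    return {
        pair: (mean(v), stdev(v) if len(v) > 1 else 0.0)
        for pair, v in merged.items()
    }

# Made-up timestamps for two repetitions of "wolf" (hypothetical data).
samples = [
    [("w", 0.00), ("o", 0.12), ("l", 0.30), ("f", 0.41)],
    [("w", 0.00), ("o", 0.14), ("l", 0.34), ("f", 0.43)],
]
profile = typing_profile(samples)
for pair, (avg, spread) in profile.items():
    print(pair, round(avg, 3), round(spread, 3))
```

In this invented data the pair "ol" is the slowest, which mirrors the observation quoted above, since o and l are struck with the same finger in standard touch typing. A real system would need many more samples and some way of comparing one profile against another.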

Finding Waldo with Mathematica

If you are a fan of Where’s Waldo? and have a bit of a nerdy side, you’ll appreciate that someone figured out how to find Waldo using Mathematica:

Finding Waldo with the help of Mathematica.

The author describes his technique and provides the relevant code:

First, I’m filtering out all colours that aren’t red

waldo = Import["http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/DepartmentStore.jpg"]; 
red = Fold[ImageSubtract, #[[1]], Rest[#]] &@ColorSeparate[waldo]; 

Next, I’m calculating the correlation of this image with a simple black and white pattern to find the red and white transitions in the shirt.

corr = ImageCorrelate[red,
   Image@Join[ConstantArray[1, {2, 4}], ConstantArray[0, {2, 4}]],
   NormalizedSquaredEuclideanDistance];

I use Binarize to pick out the pixels in the image with a sufficiently high correlation and draw a white circle around them, using Dilation, to emphasize them.

pos = Dilation[ColorNegate[Binarize[corr, .12]], DiskMatrix[30]]; 

I had to play around a little with the level. If the level is too high, too many false positives are picked out.

Finally I’m combining this result with the original image to get the result above

found = ImageMultiply[waldo, ImageAdd[ColorConvert[pos, "GrayLevel"], .5]]

Amazing.
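For readers who don't use Mathematica, the first step of the walkthrough (keeping only the red) can be sketched in plain Python. This is an illustration, not the author's code: the function name `isolate_red` and the tiny test image are made up, and it assumes channel values between 0 and 1 with negative results clipped to zero, which is how repeated subtraction behaves on a byte image.

```python
def isolate_red(pixels):
    """pixels: rows of (r, g, b) tuples with channel values in 0..1.
    Returns rows of max(r - g - b, 0), so only strongly red pixels stay bright."""
    return [[max(r - g - b, 0.0) for (r, g, b) in row] for row in pixels]

# A hypothetical 2x2 image: pure red, white, grey, and a reddish pixel.
image = [
    [(1.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(0.5, 0.5, 0.5), (0.9, 0.1, 0.2)],
]
red = isolate_red(image)
for row in red:
    print([round(v, 2) for v in row])
```

White and grey pixels drop to zero because their green and blue channels cancel the red, which is why the red and white stripes of the shirt stand out as alternating bright and dark bands for the correlation step.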

###

(via Kottke)

The Intersection of Math and Pasta

The New York Times has a short piece on Sander Huisman, a graduate student in physics at the University of Twente in the Netherlands, who decided to plot pasta shapes in his favorite software, Mathematica (I prefer MATLAB myself, though I used Mathematica in college and grad school).

Mr. Huisman figured out the five lines or so of Mathematica computer code that would generate the shape of the pasta he had been eating — gemelli, a helixlike twist — and a dozen others. “Most shapes are very easy to create indeed,” he said.

Here is a rendering of one of the pasta shapes he posted to his blog:

Pasta Rendering in Mathematica

You can see the other Mathematica renderings in Sander’s blog post. Fun and tasty!
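To give a feel for why such shapes take only a few lines, here is a rough Python sketch of a gemelli-like twist. It does not use Huisman's equations, which are not reproduced here. The function name `twisted_strands` and its parameters are invented, and the shape is simply assumed to be two strands winding around a common axis half a turn apart.

```python
import math

def twisted_strands(radius=1.0, turns=2.0, height=4.0, steps=50):
    """Return two lists of (x, y, z) points: the centre lines of two strands
    winding around the z axis, half a turn apart (a hypothetical model)."""
    strands = ([], [])
    for i in range(steps + 1):
        t = i / steps                        # runs from 0 to 1 along the pasta
        angle = 2 * math.pi * turns * t      # how far the strands have twisted
        z = height * t
        for k, strand in enumerate(strands):
            a = angle + k * math.pi          # second strand is offset by half a turn
            strand.append((radius * math.cos(a), radius * math.sin(a), z))
    return strands

first, second = twisted_strands()
print(first[0], second[0])
```

Feeding the two point lists to any 3D plotting tool draws the double helix. Thickening each centre line into a tube is what would turn it into something that looks like pasta.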

###

(via Gourmet Pigs)