December 24, 2012

Data Prophet: Nate Silver


I'm summarizing "Data Prophet," an article in Wired, October 2012. It's an interview with Nate Silver, a statistician and data scientist.

He pointed out a number of problems with our typical predictive models.

Out-of-Sample Problems


Here we have no data on a phenomenon because we never collected it in the first place. For example, if your model excludes real estate as an industry, then whatever rise, fall, or crisis happens in real estate, you will have no data on it. Your economic prediction will also miss the problems and effects arising from real estate.

Overfitting Problem


Here, you're mistaking mere coincidences or co-occurrences for patterns. For example, you observe that whenever ice cream sales go up, murder cases go up too, and you assume there is a pattern. But if you curbed ice cream sales, would you prevent any murders?

Nate advises that instead of looking for ideas, we should accept and see what is really there in the data.
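The ice cream example can be made concrete with a small simulation. This is only a sketch on made-up numbers, and it rests on an assumption that is not from the interview: that a hidden third factor (here, hypothetically, daily temperature) pushes both ice cream sales and murder cases up. Every name and coefficient below is invented for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumption: temperature drives both series; neither one causes the other.
temperature = [rng.uniform(0, 35) for _ in range(365)]
ice_cream = [20 + 3.0 * t + rng.gauss(0, 10) for t in temperature]
murders = [2 + 0.1 * t + rng.gauss(0, 1) for t in temperature]

# Looks like a pattern: the two series rise and fall together.
raw = pearson(ice_cream, murders)

# Take temperature out of both series and compare what remains.
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(murders, temperature))

print(f"raw correlation:      {raw:.2f}")
print(f"after removing temp.: {adjusted:.2f}")
```

In this made-up world the raw correlation is strong, yet it mostly disappears once temperature is accounted for, so curbing ice cream sales would change nothing.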

Problem of Over-reacting to New Data


Your model may be too flexible, or too sensitive to new data or new types of data.
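A minimal sketch of what "too sensitive" can look like. It uses a running estimate that moves a fixed fraction of the way toward each new observation (simple exponential smoothing, a technique chosen here for illustration, not one named in the interview). The readings and the two weights are made up.

```python
def running_estimate(observations, weight_on_new):
    """Update an estimate by moving it a fraction toward each new reading."""
    estimate = observations[0]
    history = [estimate]
    for obs in observations[1:]:
        estimate = estimate + weight_on_new * (obs - estimate)
        history.append(estimate)
    return history

# Hypothetical readings: steady around 10, with one surprising value of 30.
observations = [10, 11, 9, 10, 30, 10, 11, 9]

jumpy = running_estimate(observations, weight_on_new=0.9)   # trusts new data a lot
steady = running_estimate(observations, weight_on_new=0.2)  # trusts new data a little

print("jumpy peak: ", round(max(jumpy), 2))
print("steady peak:", round(max(steady), 2))
```

The jumpy estimate leaps to about 28 on the single odd reading, while the steady one only rises to about 14. Which weight is right depends on whether the surprise is real, which is exactly what a good framework is supposed to help you judge.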

Here I can offer a personal experience. Recently I read in Newsweek about a neurosurgeon's life-after-death experience. He said that he got to a strange place and met strange beings, and that the senses were fused there, so that he could see a sound and hear a color at the same time. I believed every word of it. That is, I believed that a person like him, who has lived the way he has, can get to a place like that.

Nate Silver wisely warns that more data is worse if we don't have a good framework, and that we'd better stick to basic models if we are prone to over-reacting to new data. Wise poker players, stock investors, and soldiers all know this.

I asked myself: what if I got there myself? I would hear-see a sound-smell-sight. Then I would see-hear it dissolve and another sense arise ... dissolve and arise ... dissolve and arise ... dissolve and arise ... I hope my basic, simplistic model will work there too.


The Need to Test in the Right Environment


An example of the problem: you build your model with data from New Mexico, and then you test the model in New Mexico again.

According to Nate Silver, the right environment is "an environment where you don’t know what’s going to happen."
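Here is a rough sketch of why testing where you trained tells you so little. The "model" below simply memorizes its training data and answers with the nearest remembered case (a nearest-neighbor lookup, picked for illustration and not taken from the interview). The data is synthetic, and the names `train` and `holdout` are hypothetical stand-ins for "New Mexico" and "somewhere you have not looked yet."

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

def make_data(n):
    """Synthetic points: y is roughly 2x, plus noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + rng.gauss(0, 1)) for x in xs]

train = make_data(50)    # the data the model was built from
holdout = make_data(50)  # data the model has never seen

def predict(x):
    """Answer with the y of the closest x seen during training."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]

def mean_abs_error(data):
    return sum(abs(predict(x) - y) for x, y in data) / len(data)

in_sample = mean_abs_error(train)        # tested where it was trained
out_of_sample = mean_abs_error(holdout)  # tested where the answer is unknown

print("error on training data:", in_sample)
print("error on unseen data:  ", round(out_of_sample, 2))
```

On its own training data the memorizing model looks perfect, because it is just reciting answers it already knows. Only the unseen data shows how well it really predicts.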
