Tuesday, October 25, 2016

Zerohedge doesn't know what oversampling is

Zerohedge - ermahgerd, Podesta recommends oversampling to rig polls!!!1!

Sorry, Russian disinfo stooges and the mouth-breathing idiocracy that loves them, but that's not what "oversampling" is.

When your dataset has heterogeneous subcategories (such as when you're polling and you know that certain racial groups vote differently than others), the heterogeneity will wreck your margin of error if you sample too few people from a subcategory.

For example, if your sample of 100 includes only 5 black women, the result from that subsample can quite easily turn out non-representative: it's all too possible to get 4 out of 5 responding Republican when you know the subpopulation should on average respond around 1 out of 5.
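To put rough numbers on that, here's a minimal sketch using the plain binomial model. It assumes simple random sampling within the subgroup and takes the "1 out of 5" above as the true share. The function names and the 40% threshold are mine, picked only for illustration.

```python
from math import comb, sqrt

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance that k or more of n respondents say yes."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def standard_error(n, p):
    """Standard error of a sample proportion under simple random sampling."""
    return sqrt(p * (1 - p) / n)

TRUE_SHARE = 0.2  # the "1 out of 5" from the text, assumed to be the true subgroup share

for n in (5, 50):
    # Smallest count that puts the subsample at 40% or more (double the true share).
    k = (40 * n + 99) // 100
    print(
        f"n={n:>2}: standard error = {standard_error(n, TRUE_SHARE):.3f}, "
        f"P(subsample shows >= 40%) = {prob_at_least(k, n, TRUE_SHARE):.4f}"
    )
```

Going from 5 to 50 respondents cuts the standard error by a factor of about three (the square root of 10), and that's the whole point of sampling more of a small group.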

In the aggregate, that will wreck your survey's margin of error.

So you oversample small heterogeneous minority groups to get more reliable estimates. Your survey will poll a higher proportion of black males and black females than their proportion of the population, as well as a higher proportion of Latinos, the 19-25 age group, the sub-$20,000 earnings group, and so on. With larger samples of the subgroups, your results will be more representative.

Then you downweight the result from that oversample so that it fits back into the aggregate.
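Here's a minimal sketch of that oversample-then-downweight step. Every number in it is made up for illustration (a hypothetical two-group population with `group_a` at 10% of the population but 30% of the sample), and the function names are mine, not any polling firm's actual method.

```python
# Hypothetical figures, for illustration only.
population_share = {"group_a": 0.10, "group_b": 0.90}  # known shares of the population
sample_size = {"group_a": 300, "group_b": 700}         # group_a is oversampled (30% of sample)
yes_count = {"group_a": 60, "group_b": 420}            # respondents answering "yes" per group

def raw_estimate(sample_size, yes_count):
    """Unweighted share of 'yes' answers: biased when a group is oversampled."""
    return sum(yes_count.values()) / sum(sample_size.values())

def respondent_weights(population_share, sample_size):
    """Per-respondent weight = population share / sample share for that group."""
    total = sum(sample_size.values())
    return {g: population_share[g] / (sample_size[g] / total) for g in population_share}

def weighted_estimate(population_share, sample_size, yes_count):
    """Each group's 'yes' rate, scaled back to that group's true share of the population."""
    return sum(
        population_share[g] * yes_count[g] / sample_size[g] for g in population_share
    )

print("weights: ", respondent_weights(population_share, sample_size))
print("raw:     ", round(raw_estimate(sample_size, yes_count), 3))
print("weighted:", round(weighted_estimate(population_share, sample_size, yes_count), 3))
```

The oversampled group gets a weight below 1 and everyone else gets a weight above 1, so the aggregate reflects the population's real makeup while the small group's own estimate still rests on 300 respondents instead of a handful.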

This is what is done in market research, and anyone who's taken 2nd-year stats knows this. FFS, anyone who's worked at a market research office phoning people knows what a fucking oversample is.

But Zerohedge and the rest of the Russian-funded Fear Uncertainty & Doubt campaign know you haven't taken 2nd-year stats, so they run bullshit stories and skew them to make you think there's some grand fucking conspiracy.

This, by the way, is why I'm deleting Zerohedge-style comments on this blog without even really reading them. You get to have your comments posted again once you demonstrate you've educated yourself.
