Dec 13

Statistics and Polling

This is a bit of a change of pace, but I got some inquiries about this and thought I would offer my own two cents on something that often confuses people. My qualifications for this are two-fold:

  1. In my past life I was a professor who taught classes in Statistics;
  2. I have worked for a political consulting company that among other things performed polling for clients.

So you can use this in deciding if you want to pay any attention to what I have to say on the subject. :)

To get started, the basic question is one of epistemology: how do we know what we say we know? In the case of statistics, the basic mathematics began to be developed as a way of analyzing gambling. When you play poker, a hand with three of a kind beats a hand with two pair because two pair (which shows up 4.75% of the time) is more likely than three of a kind (which shows up 2.11% of the time). After its start in gambling, statistics took a big step during the Napoleonic wars, when for the first time large armies met and the casualties mounted up. Some doctors realized that gathering evidence about wounds and their treatment would lead them to select the best treatments. But the key factor is that this is all based on probability.

And the best way to think about probability is to think about what would happen if you did the same thing over and over again. You might well get a range of outcomes, but some outcomes would show up more often. This is the first thing that throws a lot of people, because they often have the sense that if something is unlikely, it won’t happen at all. That is simply untrue. Unlikely things will happen, just not as often. As a joke has it, if you are one in a million, there are 1,500 people in China exactly like you. The heritage of gambling persists in the technique called Monte Carlo simulation, which runs an experiment many, many times, often via a computer algorithm that generates random data to test theories. John von Neumann understood the significance of this approach and programmed one of the first computers, ENIAC, to carry out Monte Carlo simulations.
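If you want to see the Monte Carlo idea in action, here is a little sketch in Python. Everything in it is my own toy code, not anything from a real poker library: it deals a huge number of random five-card hands and simply counts how often two pair and three of a kind turn up.

```python
import random
from collections import Counter

# Build a standard 52-card deck as (rank, suit) pairs.
RANKS = list(range(2, 15))   # 2 through 10, then J=11, Q=12, K=13, A=14
SUITS = ["clubs", "diamonds", "hearts", "spades"]
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]

def classify(hand):
    """Label a 5-card hand using only the pattern of rank counts."""
    counts = sorted(Counter(rank for rank, _ in hand).values(), reverse=True)
    if counts[0] == 3 and counts[1] == 1:
        return "three of a kind"
    if counts[0] == 2 and counts[1] == 2:
        return "two pair"
    return "something else"

def simulate(trials=1_000_000):
    tallies = Counter()
    for _ in range(trials):
        tallies[classify(random.sample(DECK, 5))] += 1
    for label in ("two pair", "three of a kind"):
        print(f"{label}: {100 * tallies[label] / trials:.2f}%")

simulate()
# Typical output (it varies a little from run to run, which is the point):
#   two pair: 4.75%
#   three of a kind: 2.11%
```

Run it and the tallies come out very close to the textbook 4.75% and 2.11%, which is exactly the kind of confirmation a Monte Carlo run gives you.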

The next key concept is called the Law of Large Numbers, which in layman’s terms says that if you repeat an experiment many times, the average result should come out close to the expected result. Note that it is the average we are talking about here. Any particular experiment could give weird results that are nothing like the expected result, and that is to be expected in a distribution of results. But when you average across experiments, the occasional high ones are offset by the occasional low ones, and the average result is pretty good. To get this you need to do it many, many times: the more times you repeat the experiment, the closer your average should come to the expected result.
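Here is what that looks like in a quick sketch, again just my own toy code using Python’s standard library: flip a simulated fair coin and print the running share of heads at a few checkpoints.

```python
import random

def running_share_of_heads(total_flips=100_000):
    """Flip a simulated fair coin and report the share of heads at checkpoints."""
    checkpoints = {10, 100, 1_000, 10_000, 100_000}
    heads = 0
    for flip in range(1, total_flips + 1):
        heads += random.random() < 0.5   # True counts as 1, False as 0
        if flip in checkpoints:
            print(f"after {flip:>6} flips: {heads / flip:.4f} heads")

running_share_of_heads()
# The early numbers wander around; the later ones crowd in on 0.5000.
```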

Our third key concept is Random Sampling. This says that every member of a population has an equal chance of being selected for a sample, where the population is whatever group you want to make a claim about. If you want to make a claim about left-handed Mormons, your sample should exclude any right-handed people and any Lutherans, but it should give every left-handed Mormon an equal chance of selection. This is where a lot of problems can arise. For instance, many medical studies in the 20th century included all or mostly men, but the results were applied to all adults; this is now recognized as a big problem in medicine. When this happens we call the problem sampling bias.
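To make the idea concrete, here is a small sketch with a toy population whose attributes I have simply invented: first narrow the frame down to the group you want to make claims about, then give every member of that frame the same chance of selection.

```python
import random

# A toy population; the attributes and their frequencies are invented
# purely for illustration.
population = [
    {"id": i,
     "handedness": random.choice(["left", "right"]),
     "religion": random.choice(["Mormon", "Lutheran", "Catholic", "none"])}
    for i in range(100_000)
]

# The group we actually want to make claims about:
frame = [p for p in population
         if p["handedness"] == "left" and p["religion"] == "Mormon"]

# Simple random sampling: every member of the frame has the same chance.
sample = random.sample(frame, k=min(500, len(frame)))
print(f"frame size: {len(frame)}, sample size: {len(sample)}")
```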

So, with these basic concepts (and see, I did not use any math yet!) we can start to look at polling, and just how good it is or isn’t as the case may be. And it is often very good, but history does show some big blunders along the way.

The first thing to get out of the way is that sampling, done properly, works. This is a mathematical fact and has been proven many times over. You may have trouble believing that 1,000 people are an accurate measure of what a million people, or even 100 million people, will do, but in fact it does work. When there are problems, it is usually because someone made a mistake, such as drawing a sample that is not truly an unbiased sample from the population in question. This does happen, and you need to be careful about it when examining polling results. In the earlier part of the twentieth century some polls were done via telephone surveys, but because telephones were not universally available at that time, these polls overstated the views of more affluent people, who were more likely to have phones. By the latter part of the century telephone surveys were perfectly valid, because almost everyone had a phone (and the few who didn’t were not likely to be voters anyway). But now we have a different problem, in that many people (myself included) have gone to using mobile phones exclusively, while the sampling methods in many cases relied solely on landline telephones. Polling outfits are adjusting for this, but it is still worth checking. You also need to watch out for ways pollsters will limit the sample. A big issue is whether you should include all registered voters (in the U.S. you need to be registered before you can vote; I am not familiar with how other countries handle this), or whether you want to limit it to “likely voters”. Deciding who is a “likely voter” is a place where some serious bias can creep in, since it is purely a judgement call by the pollster.
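To see how a bad frame skews things, here is a toy simulation of that old telephone problem. Every number in it is invented purely to show the mechanism, not to describe any real electorate: affluent people are made more likely to own a phone and more likely to support candidate A, and then we compare the whole population to the phone owners alone.

```python
import random

random.seed(42)

# Invent a population of 100,000 voters. Every number here is made up to
# illustrate the mechanism, not to describe any real election.
population = []
for _ in range(100_000):
    affluent = random.random() < 0.30                      # 30% affluent
    has_phone = random.random() < (0.90 if affluent else 0.40)
    supports_a = random.random() < (0.65 if affluent else 0.45)
    population.append((has_phone, supports_a))

def support_rate(people):
    return sum(supports for _, supports in people) / len(people)

phone_owners = [person for person in population if person[0]]

print(f"true support in the whole population: {support_rate(population):.1%}")
print(f"support among phone owners only:      {support_rate(phone_owners):.1%}")
# The phone-only "frame" overstates candidate A by several points, because
# phone ownership is correlated with affluence, which is correlated with
# supporting A.
```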

So how do we know that samples work? We have two strong pieces of evidence. First, we know from Monte Carlo simulations how well samples compare to the underlying populations in controlled experiments. You create a population with known parameters, pull a bunch of samples, and see how well they match up to the known population. Second, we have the results of many surveys which we can compare to what actually happens when an election (for instance) is held. Both of these give us confidence that we understand the fundamental mathematics involved.
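The first kind of check looks roughly like this in miniature; the particular numbers (52% true support, samples of 1,000) are arbitrary choices for the sketch.

```python
import random

random.seed(1)

TRUE_SUPPORT = 0.52    # a population parameter we invent, so we know the answer
SAMPLE_SIZE = 1_000
NUM_SAMPLES = 10_000

estimates = []
for _ in range(NUM_SAMPLES):
    # Each simulated respondent supports the measure with probability TRUE_SUPPORT.
    hits = sum(random.random() < TRUE_SUPPORT for _ in range(SAMPLE_SIZE))
    estimates.append(hits / SAMPLE_SIZE)

mean_estimate = sum(estimates) / NUM_SAMPLES
worst_miss = max(abs(e - TRUE_SUPPORT) for e in estimates)

print(f"true value:        {TRUE_SUPPORT:.3f}")
print(f"mean of estimates: {mean_estimate:.3f}")
print(f"largest miss:      {worst_miss:.3f}")
# The estimates average out almost exactly to the true value, and even the
# single worst sample out of 10,000 misses by only a handful of points.
```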

The next concept to understand is the Confidence Interval. This comes from the fact that even an unbiased sample will not match the population exactly. To see what I mean, consider what happens if you toss a fair (unbiased) coin. If it is a truly fair coin, you should get heads 50% of the time, on average, and tails 50% of the time. But the key here is “on average”. If you tossed this coin 100 times, would you always get exactly 50 heads and 50 tails? Of course not. You might get 48 heads and 52 tails the first time, 53 heads and 47 tails the second time, and so on. If you did this a whole bunch of times and averaged your results, you would get ever closer to that 50/50 split, but probably never hit it exactly. What this means is that your results will be close to what is in the population most of the time, but terms like “close” and “most of the time” are very imprecise. How close, and how often, really should be specified more precisely, and we can do that with the Confidence Interval. This starts with the “how often” question, and the standard usually used is 95% of the time. This is called the 95% confidence level, though sometimes the complement is used and it gets referred to as “accurate to the .05 level.” These are essentially the same thing for our purposes. And if you are a real statistician, please remember that this podcast is not intended to be a graduate-level statistics course, but rather a guide for the intelligent lay person who wants to understand the subject. The 95% level of confidence is somewhat arbitrary, and in some scientific applications it is raised or lowered, but in polling you can think of it as the “best practice” industry standard.
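Here is the coin-toss story carried through to the confidence claim: toss a fair coin 100 times, repeat that experiment thousands of times, and count how often the observed share of heads lands within the usual 95% margin of error of the true 50%. The 1.96 multiplier is the standard normal value that corresponds to 95% confidence.

```python
import math
import random

random.seed(7)

P = 0.5          # a fair coin
N = 100          # tosses per experiment
TRIALS = 10_000

# Standard error of the sample proportion, and the usual 95% margin of error.
standard_error = math.sqrt(P * (1 - P) / N)
margin = 1.96 * standard_error   # about 0.098, roughly 10 heads either way

inside = 0
for _ in range(TRIALS):
    heads = sum(random.random() < P for _ in range(N))
    if abs(heads / N - P) <= margin:
        inside += 1

print(f"margin of error: +/- {margin:.3f}")
print(f"share of experiments that land inside it: {inside / TRIALS:.1%}")
# You should see something close to 95%, which is what the "95% confidence"
# label is promising.
```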

The other part, the “how close” question, is not at all arbitrary. It is formally called the Margin of Error, and once you have chosen the level of confidence, it is a pretty straightforward function of the sample size. In other words, if you toss a coin ten times, getting six heads and four tails is very likely. But if you toss it 100 times, getting 60 heads and 40 tails is much less likely. So the bigger the sample size, the closer it should match the population. You might think that pollsters would therefore use very large sample sizes to get better accuracy, but you run into a problem. Sampling has a linear cost: if you double the sample size, you double the cost of the survey. If that resulted in double the accuracy it might be worth it, but it doesn’t. The margin of error shrinks with the square root of the sample size, so doubling the sample only narrows the margin by about 30%. Is that worth spending twice the money? Not really. So you are looking for a sweet spot where the cost of the survey is not too much, but the accuracy is acceptable.
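For the curious, here is the textbook margin-of-error formula for a sample proportion, in a short sketch. Keep in mind this is the idealized simple-random-sample version; real pollsters layer weighting and design adjustments on top of it.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Textbook 95% margin of error for a sample proportion, assuming a
    simple random sample (real pollsters adjust this for weighting)."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (250, 500, 1_000, 2_000, 4_000):
    print(f"n = {n:>5}: +/- {100 * margin_of_error(n):.1f} points")

# n =   250: +/- 6.2 points
# n =   500: +/- 4.4 points
# n =  1000: +/- 3.1 points
# n =  2000: +/- 2.2 points
# n =  4000: +/- 1.5 points
# Each doubling of the sample (and of the cost) only shrinks the margin by
# a factor of about 1/sqrt(2), which is classic diminishing returns.
```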

Any reputable poll should make available some basic information about the survey. The facts that should be reported include:

  • When the poll was taken. Timing can mean a lot. If one candidate was caught having sex with a live man or a dead woman, as the joke has it, it matters a lot whether the poll was taken before or after that fact came out in the news.
  • How big a sample was it?
  • What kinds of people were sampled? Was there an attempt to limit it to likely voters?
  • What is the margin of error?
  • What is the confidence interval?

Now a reputable pollster will make these available, but that does not mean they will be reported in a newspaper or television story about the poll. Or they may be buried in a footnote. But these factors all affect how you should interpret the poll.

Example: http://www.politico.com/story/2013/12/polls-obamacare-100967.html

In this brief news report we don’t get everything, but we do get a lot of it. This story is about two polls just done (as I write this) on people’s opinions regarding “Obamacare”.

The Pew survey of 2,001 adults was conducted Dec. 3 to Dec. 8 and has a margin of error of plus-or-minus 2.6 percentage points.

The Quinnipiac survey of 2,692 voters was conducted from Dec. 3 to Dec. 9 and has a margin of error of plus-or-minus 1.9 percentage points.

What I would note is that the first poll says it was a poll of “adults”, while the second poll was one of “voters”. That makes me wonder about any differences in the results (and the polls did indeed have different results). They were sampling different populations, so the results are not directly comparable. If the purpose of the survey is to look at how people in general feel, a survey of adults probably makes sense. If the purpose is to forecast how this will affect candidates in the 2014 elections, the second poll may be more relevant.

Second, note that the survey with the larger sample size had a slightly smaller margin of error. That is what we should expect to see.
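As a quick check, here is that same idealized formula applied to the two reported sample sizes. Quinnipiac’s figure matches almost exactly, while Pew’s reported 2.6 points is a bit larger than the naive calculation, which is common since pollsters often inflate the margin to account for weighting and design effects.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    # Idealized margin for a simple random sample; real polls usually report
    # a slightly larger figure to account for weighting and design effects.
    return z * math.sqrt(p * (1 - p) / n)

print(f"Pew (n=2,001):        +/- {100 * margin_of_error(2001):.1f} points (reported 2.6)")
print(f"Quinnipiac (n=2,692): +/- {100 * margin_of_error(2692):.1f} points (reported 1.9)")
# Pew (n=2,001):        +/- 2.2 points (reported 2.6)
# Quinnipiac (n=2,692): +/- 1.9 points (reported 1.9)
```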

Third, note that the second poll was “in the field” as we say for one more day than the first poll. Does that matter? It might if some very significant news event happened on the 9th of December that might affect the results.

What I don’t see in this report is any explanation of how the people were contacted, but when I went to the pollsters’ web sites, here is what I found on the Quinnipiac site:

From December 3 – 9, Quinnipiac University surveyed 2,692 registered voters nationwide with a margin of error of +/- 1.9 percentage points. Live interviewers call land lines and cell phones.

So if you dig you can get all of this. And note that they specifically mentioned calling cellphones as part of their sample.

One final thing to point out is that if you accept a 95% confidence level, that means that by definition approximately one out of every 20 polls will be, to use the technical term, “batcrap crazy”. That is why you should never assign too much significance to any one poll, particularly if it gives you results different from all other polls. You are probably looking at that one-in-twenty poll that should be ignored. There is a human tendency to seize on such a poll if it tells you what you want to hear, but that is usually a mistake. It is when a number of pollsters do a number of polls and get roughly the same result that you should start to believe it. That does not mean they will agree exactly; there is still the usual margin of error. That is why a poll that shows one candidate getting 51% of the vote and her opponent getting 49% will be described as a “dead heat”. With a margin of error of two points, that candidate could really be getting anywhere between 49% and 53%, even assuming the poll is accurate and unbiased.
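A quick simulation makes both of those points at once: poll a perfectly tied race over and over with samples of 1,000, and see how often a single poll lands outside its own margin of error, and how often it shows one candidate “leading” 51-49 anyway. The margin figure below is the rough 95% value for a sample of 1,000.

```python
import random

random.seed(2014)

N = 1_000          # respondents per simulated poll
POLLS = 10_000     # number of simulated polls of a dead-even race
MARGIN = 0.031     # roughly the 95% margin of error for a sample of 1,000

outside_margin = 0
shows_a_lead = 0
for _ in range(POLLS):
    share = sum(random.random() < 0.5 for _ in range(N)) / N
    if abs(share - 0.5) > MARGIN:
        outside_margin += 1
    if abs(share - 0.5) >= 0.01:        # a 51-49 split or wider
        shows_a_lead += 1

print(f"polls outside their own margin of error: {outside_margin / POLLS:.1%}")
print(f"polls showing a 51-49 'lead' or better:  {shows_a_lead / POLLS:.1%}")
# About 5% of perfectly sound polls land outside their own margin of error,
# and roughly half the polls on a dead-even race still show somebody "ahead".
```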

Nov 04

Review of Beyond Fear: Thinking Sensibly About Security In An Uncertain World

Beyond Fear: Thinking Sensibly about Security in an Uncertain World by Bruce Schneier
My rating: 5 of 5 stars

Bruce wrote this book in 2003 in response to 9/11 and the changes in security practices it led to in the U.S. He criticizes many of the security measures taken as “security theater” that makes it look like something is being done without actually accomplishing anything useful. His criticisms are probably nothing terribly new to people in 2013, when many have come to similar conclusions, but what I think is more important in this book is that he attempts to lay out a rational way of thinking about security. Security can never be 100% in a world of human beings, and security always entails trade-offs that make it a cost-benefit decision. As an example, you would never hire an armed guard to protect the empty bottles you are returning for the 10-cent deposit. That just doesn’t make sense. Bruce lays out a five-point analysis you can apply to any security plan, asking what you are trying to protect, what the costs of the protection are, whether the proposed solution will actually work, and so on. It is a good analysis and worth a read if you want to learn how to think intelligently about security.

View all my reviews

Nov 04

Review of The Code Book

The Code Book: The Science of Secrecy from Ancient Egypt to Quantum Cryptography by Simon Singh
My rating: 5 of 5 stars

This book is a very good review of the history of encryption and explains the basic principles involved. It is a lot like David Kahn’s The Codebreakers, but is available for a good deal less. Beginning with Herodotus and some secrecy measures from the Persian Wars, it moves forward through Arab scholars and medieval developments, right up to the asymmetric public key encryption used today. Highly recommended for anyone who wants an overview of the issues but is not looking to dive into the mathematics.

View all my reviews

Aug 21

Review of The Dream Machine: J.C.R. Licklider and the Revolution That Made Computing Personal

The Dream Machine: J.C.R. Licklider and the Revolution That Made Computing Personal by M. Mitchell Waldrop
My rating: 5 of 5 stars

Having just read Katie Hafner’s Where Wizards Stay Up Late, I was ready to tackle this book, which is both deeper and more ambitious. Where Hafner’s book was purely about the origin of the Internet, Waldrop takes on the whole idea of personal computing. Licklider provides the focus for this book, for while he played a crucial role in promoting networking, his true aim was always what he termed a symbiotic partnership between humans and computers, and for him networking was just a necessary step to getting there. That is one of the reasons Licklider provided crucial support to Doug Engelbart, for instance. Even when Licklider was out of the picture (during the heyday of Xerox PARC, for instance), Waldrop keeps his focus on the development of the personal computer. If you like this kind of history and want to know just who did what in those early days, this book is indispensable.

View all my reviews

Aug 21

Review of Where Wizards Stay Up Late

Where Wizards Stay Up Late: The Origins Of The Internet by Katie Hafner
My rating: 5 of 5 stars

This is a classic book that sat on my shelf for a while before I finally picked it up and read it. It was very rewarding. It tells the story of how the Internet came to be, and opens with one of the pioneers explaining that he wants to kill the myth that the Internet was designed to withstand a nuclear war. It wasn’t, and most of the people involved never thought about it (though Paul Baran did, apparently). But the way it happened is fascinating, and the people who pulled it off were some of the best and brightest in technology. I recommend it highly.

View all my reviews

Mar 24

What’s Wrong With Free, Anyway?

So I am at my LUG meeting the other night listening to a spirited discussion, which is pretty normal for us. We have a lot of very opinionated people there, and there is never a lack of discussion. The trick is getting a word in edgewise, and normally three people are all talking at once trying to grab the floor. In this case, it got to piracy, the music industry, bit torrent, etc. One person tried to make the argument that bit torrent promotes piracy and is harming the industry, and seemed genuinely surprised that no one in the room agreed with him. But we all agreed that the music business had changed irrevocably, and that there would never again be a group as big as The Beatles. But why is that? I tend to think a necessary precondition for anyone getting that big is that they would first have to be that good, and in my own curmudgeonly way I don’t think any of the current acts are that good. Now, if you like to discuss the current music scene and the music business, I always recommend you read The Lefsetz Letter, by Bob Lefsetz. He is constantly explaining that the music world is different now, that you can’t just go into the studio, cut an album, and let the riches roll in.

I think the new music business is about the relationship the artist has with the fans, and it does not rely on mass media in any way. One of the things the Internet has done is kill broadcasting and bring us narrowcasting instead. By this I mean that instead of attracting a mass audience, you go after a niche audience that wants what you offer. And to get that audience you need to work on your relationships. A very eloquent explanation is given by Amanda Palmer in her TED talk. She frames the question beautifully by saying that the industry is focused on how to make people pay for music, while she focuses on how to let people pay for music. Notice how the language changes when you do this, and what it implies. When you talk about making people pay, you are using the language of force, the language you use with enemies, the language of conflict and confrontation. Is it any wonder the industry is imploding? Any business that treats its customers like the enemy does not have a long future in front of it. But if you follow Amanda Palmer and talk about letting people help you, that is the language of trust, of mutual respect.

This has implications beyond the obvious one of treating your customers better. Amanda Palmer recorded an album on a traditional music label, sold 25,000 copies, and was considered a failure. Then she left the label, started a Kickstarter campaign to fund her next recording project, and raised $1.2 million. From whom? About 25,000 fans. In other words, she has a hard-core audience of about 25,000 who love what she does and will support it. For record labels, that is not enough. And for certain rock stars with a sense of entitlement it is not enough, since they want mansions and expensive sports cars. But it seems to be enough for someone who just wants to make an honest living. This is the niche audience you get in an environment of narrowcasting, not the mass audience we used to get from broadcasting.

I see this in my own music tastes. There are a half-dozen artists from whom I will buy any product they put out, and I bet you haven’t heard of them. They are not mass artists. One of them, Jonatha Brooke, just ran a campaign on PledgeMusic to raise the money for her next album, and I was happy to make my own pledge in return for a CD when it is done, plus updates and photos while it is being made. And you can be sure I will buy a ticket to her show any time she is in town. That is not to say I don’t enjoy music from some of the “big” acts. About 7 years ago I bought tickets for The Who. What I got was two tickets that cost over $100 each and seats so far from the stage that I had trouble even seeing the JumboTron. When Jonatha comes to town, she will play a local club that seats about 400 max, the tickets will cost about $25, and I will be maybe 20′ away from her. And she will stay after the show to sell and sign CDs and talk to her fans. It is artists like this that I support with my money, because I feel some relationship with them. But by the same token, if they didn’t make enough money to keep going, these artists would stop doing what they do. So my feeling is that I support you, and you give me something I want. Amanda Palmer puts her music out on the Internet without DRM, but she asks people to pay her for it, and they do.

I think this is something we can learn from in the Free Software community. If you focus on getting something for nothing, that is not sustainable as a model. Not only do developers have to eat, I think they need to know that people value their work and are willing to support it. And I think that can happen with small-scale applications; in the age of narrowcasting that is viable, but only if the support is there. All too many people are looking for something free of charge, and get outraged when they can’t get it. This showed up recently when Google decided to end Google Reader. This free-of-charge application was cancelled because the market was not large enough to make it viable, and that explanation does make sense. Google is one of the world’s largest corporations, and they operate at a very large scale. They simply cannot afford to put resources into small projects. I have heard that Reader’s usage was in the neighborhood of 10-20 million people. A petition to keep it gathered 150,000 signatures. And while those may sound like large numbers, for Google they are tiny. They need 100 million users to make something worth their while.

But for a smaller developer, a market of 1-2 million might be plenty. Imagine this developer could provide a “cloud” service, similar to what Google offered, that would cost $2 per month. That would be $24 per year, and from 1 million customers it would be $24 million. That is quite enough to run a good RSS reader service, and it is completely sustainable. The service would have sufficient predictable income to maintain and develop the product, and it could develop a community of users who are passionate about it. The same reasoning applies to downloadable software, even “free software”, if you use that term the way I do, to denote software that gives you the Four Freedoms the Free Software Foundation has published. But the key is to understand that you need to support software that you rely on. If you only want “free-of-charge” software, you will probably pay for it with your personal information or by watching ads, and you will be at the mercy of companies that will drop the product any time it suits them. I think you will find that this rarely happens in the free software community as long as a project has a passionate community that supports it, the way Amanda Palmer’s fans support her.

So what software are you passionate about? And how do you support it?

Listen to the audio version of this post on Hacker Public Radio!

Jan 31

Tablet share numbers prove I was right

In November 2011 I made the claim here, and on my blog, that by the end of 2012 Apple and Android would have essentially equal market shares in tablets. And the fourth-quarter 2012 numbers show that I was correct. I doubt there are any wonderful prizes for this, but there it is.

My prediction was based on one simple observation: at the time I made it, the relative market shares of iOS and Android in the tablet market had tracked, with a lag, their market shares in the smartphone market. So I looked at how long it took for Android to achieve parity in the smartphone market and predicted it would do the same in the tablet market. And there is no reason to think this won’t continue. The point of rough parity in the smartphone market came in November 2010, and since then Android’s share has only grown, to the point that the worldwide share of Android is now around 4x that of iOS. So I expect a firm lead to develop by the end of 2013, and by the end of 2014 total dominance for Android.

Jan 23

Review of The Daemon, the Gnu, and the Penguin

The Daemon, the Gnu, and the Penguin by Peter H. Salus
My rating: 4 of 5 stars

I give this a high rating because it does what it sets out to do very well. Peter Salus was involved in the history of Unix and Linux, which makes him a good guide to that history. He presents it in a straightforward and spare style, so don’t expect a gripping page turner. But if you want good, accurate data on who did what and when, this book will deliver. Also, it is a relatively quick read because of that spare style.

View all my reviews

Oct 12

Data-Driven Objectivity

I recently had an exchange online with someone I tend to like, and it was about self-driving cars. My correspondent said that he would never, under any circumstances, get into a self-driven car. In fact, he seemed to think that self-driven cars would lead to carnage on the roads. My own opinion is that human-driven cars have already led to very demonstrable carnage, and that in all likelihood computers would do a better job. As you might imagine, this did not impress my correspondent in the least. When I observed that his objections were irrational, he said I should choose my words more carefully, but that he would overlook the insult this time.

Possibly that was a bad way to phrase my objection, but it is also, in the strict sense of the term, precisely the proper word to use. What I was saying is that his view had no basis in data or facts, and was purely an emotional response. We all have those, and I’m not claiming any superiority on that ground. But when the Enlightenment philosophers talked of reason, it was in contrast to religion and superstition, and it really did mean thinking in terms of data, facts, and logic. It is my own view that this type of thinking bears the major responsibility for the progress the human race has made in science and technology over the last few centuries. And it is also my view that this type of thinking is under severe attack these days.

The hallmark of rational thinking is that it starts from a basis in observed facts, but always keeps a willingness to revise the conclusion if new facts come to light. If that seems reasonable to you, good. Now consider how the worst insult you can pin on a politician is “flip-flopping”. The great 20th-century economist John Maynard Keynes was accused of this and responded, “When my information changes, I alter my conclusions. What do you do, sir?” That is how a rational person thinks. There are people who attack science as being of no use because occasionally scientists change their minds about what is going on. But that is an uninformed (to be most charitable about it) view. Science is a process of deriving the best possible explanations for the data we have, while always standing ready to discard them in favor of other explanations when new data comes in. That may bother people who insist on iron-clad certainty in everything, but in fact it works. If it didn’t work, you wouldn’t be reading this. (Did you ever notice the irony of television commentators attacking scientists? You might think the plans for television were found in the Bible/Koran/etc.)

One of the biggest obstacles to clear, rational thinking is what is termed confirmation bias. This is the tendency of people to see the evidence that supports their view while simultaneously ignoring any evidence that does not. This is why the only studies that are given real credibility are what we call “double-blind” studies. An example is a drug trial. We know there is a tendency for people to get better because they believe they are being given a new drug. In addition, we know that just being given attention helps. So in a good study we take great care to divide the sample into two groups, with one group getting the great new drug and the other group getting something that looks exactly like it but has no active ingredient. It may be a sugar pill, or a saline injection, just so long as the patient cannot tell which group they are in. But the bias can also be on the experimenter side. If a team of doctors has devoted years to developing a new drug, they will naturally have some investment in wanting it to succeed. And that can lead to seeing results that are not there, or even to “suggesting” in subconscious ways to the patient that they are, or are not, getting the drug. So none of those doctors can be a part of it either. Clinicians are recruited who know only that they have two groups, A and B, and have no idea which is which. This is the classic double-blind study: neither the patient nor the experimenter knows who is getting the drug and who isn’t.

The reason we need to be this careful is that people are, by and large, irrational. People will be afraid of flying in an airplane but think nothing of getting into a car and driving, even though every bit of data says that driving is far more dangerous. People are far more afraid of sharks than they are of the food they eat, though more people die every year from food poisoning than are ever killed by sharks. And we all suffer from a massive case of the Lake Wobegon effect, in that we all tend to think we are above average, even though by definition roughly half of us are below average on any given characteristic. We are just not good judges of our own capabilities in most cases.

But the worst case is the person who is absolutely certain, no matter what he is certain of. Certainty is the great enemy of rationality. Years ago, Jacob Bronowski filmed a series called The Ascent of Man. In one scene, he stood in a puddle outside Auschwitz and talked about people who had certainty, and said, “I beg of you to consider the possibility that you may be wrong.” This is the hallmark of a rational person; this is the standard by which every scientist is judged. If you know anyone who can say “This is what I think, but I might be wrong,” you will have found the rarest kind of person, and you should cultivate their acquaintance. This type of wisdom is all too rare. And if you ever find a politician who says that, please vote for them, no matter what their party affiliation. They are worth infinitely more than a hundred of the kind who have never changed their minds about anything.

Oct 09

KDE Manifesto Released

The KDE project has released its Manifesto. Since this is my desktop of choice, I thought I should mention it. It is very good:

The KDE Manifesto
We are a community of technologists, designers, writers and advocates who work to ensure freedom for all people through our software.
Because of this work we have come to value:
Open Governance to ensure engagement in our leadership and decision processes;
Free Software to ensure the result of our work is available to all people;
Inclusivity to ensure that people of all origins are welcome to join us and participate;
Innovation to ensure that new ideas constantly emerge to better serve people;
Common Ownership to ensure that we stay united;
End-User Focus to ensure our work is useful to all people.
That is, in pursuit of our goal, we have found these items essential to define and stay true to ourselves.

The reason things like this matter is that free software is about a lot more than just selling a bunch of software and having an IPO to get rich. It is about our values and empowering people to use software to make their lives better.
