Monday, October 10, 2016

Identifying bias in samples and surveys

Terms to know
-Biased wording--Question wording that can lead people to favor or avoid certain responses. e.g. A high school wanted to know what percent of its students smoke cigarettes. Counselors selected a random sample of students to take a survey on drug use. One of the questions reads, "If you are under the age of 18, do you illegally smoke cigarettes?" Suggesting that smoking is illegal might make students who smoke less likely to admit that they do.
-Response bias--The tendency of a person to answer questions untruthfully or misleadingly. e.g. A high school wanted to know what percent of its students smoke cigarettes. During the week when students visited the counselors to schedule classes, the counselors asked every student in person whether they smoked cigarettes. High school students who smoke aren't likely to admit it to their counselor. At the same time, it's doubtful that students would lie in the other direction: students who don't smoke probably wouldn't say that they do.
-Undercoverage--When some members of the population have no chance of being included in the sample. e.g. A senator wanted to know how people in her state felt about internet privacy issues. She conducted a poll by calling 100 people whose names were randomly sampled from the phone book (note that mobile phones and unlisted numbers aren't in phone books). The senator's office called those numbers until they got a response from all 100 people chosen. Since the senator used the phone book, people who only use mobile phones, have unlisted numbers, or don't have a phone at all can't possibly be in the sample.
-Convenience sampling--Choosing a sample that is readily available, without using any randomization. e.g. David hosts a podcast and he is curious how much his listeners like his show. He decides to poll the 100 listeners who send him fan emails. Polling the 100 listeners who send him fan emails means that David simply chose a sample that was available to him without using any randomization. This is a convenience sample, which almost always produces biased results.
-Voluntary response sampling--Letting members of the population choose whether or not they will be in the sample. e.g. David hosts a podcast and he is curious how much his listeners like his show. He decides to start an online poll, and he asks his listeners to visit his website and participate in the poll. Asking all of his listeners to respond to an online poll means that David let members of the population choose whether or not they would be in the sample. This is a voluntary response sample, which almost always produces biased results.
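To see why a voluntary response sample tends to be biased, here is a minimal simulation sketch in Python. Every number in it is a made-up assumption for illustration (the share of listeners who love the show, and how likely fans and non-fans are to answer the poll), and the function name `simulate_voluntary_response` is hypothetical. It compares a simple random sample with a poll in which listeners decide for themselves whether to respond.

```python
import random


def simulate_voluntary_response(n_listeners=100_000, share_fans=0.4,
                                p_respond_fan=0.5, p_respond_other=0.05,
                                sample_size=1000, seed=0):
    """Compare a simple random sample with a voluntary response poll.

    All parameter values are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)

    # Each listener either loves the show (True) or does not (False).
    population = [rng.random() < share_fans for _ in range(n_listeners)]
    true_share = sum(population) / n_listeners

    # Simple random sample: every listener is equally likely to be chosen.
    srs = rng.sample(population, sample_size)
    srs_estimate = sum(srs) / sample_size

    # Voluntary response: listeners choose whether to answer, and fans
    # are assumed to be far more likely to take part than everyone else.
    volunteers = [
        loves for loves in population
        if rng.random() < (p_respond_fan if loves else p_respond_other)
    ]
    voluntary_estimate = sum(volunteers) / len(volunteers)

    return true_share, srs_estimate, voluntary_estimate


true_share, srs_estimate, voluntary_estimate = simulate_voluntary_response()
print(f"true share of fans:       {true_share:.3f}")
print(f"simple random sample:     {srs_estimate:.3f}")
print(f"voluntary response poll:  {voluntary_estimate:.3f}")
```

Under these assumptions the random sample lands close to the true share, while the voluntary poll lands far above it, because fans are overrepresented among the people who chose to respond.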
FAQ
1. What is the most concerning source of bias in this scenario?
This kind of question simply asks you to identify which of the terms above applies to the scenario.
2. Which direction of bias is more likely in this scenario? (Underestimate / overestimate / unbiased estimate)
**Consider the feelings or circumstances of the people being asked, and how these affect who ends up in the sample and how they answer.
e.g.1 David hosts a podcast and he is curious how much his listeners like his show. He decides to poll the 100 listeners who send him fan emails. (Overestimate) The results are probably an overestimate of the percentage of all listeners that love the show, because listeners who send fan email to David's show are probably more likely to love his show compared to a typical listener.
e.g.2 A senator wanted to know how people in her state felt about internet privacy issues. She conducted a poll by calling people using random digit dialing, where computers randomly generate phone numbers so unlisted and mobile numbers can still be reached. They called over 1000 random phone numbers (most people didn't answer) until they had reached 1000 respondents. (Underestimate) People who didn't answer their phone probably feel more strongly about privacy issues than the typical person. Having them in the sample probably would have changed the results to show more people concerned about privacy, so the poll likely underestimates that concern.
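The direction of bias in e.g.2 can be checked with a small simulation. This is only a sketch under invented assumptions: the true share of people concerned about privacy and the two answer probabilities are made up, and the function name `simulate_nonresponse` is hypothetical. The code dials random people until it has 1000 respondents, with privacy-concerned people assumed to be less likely to pick up.

```python
import random


def simulate_nonresponse(share_concerned=0.5, p_answer_concerned=0.1,
                         p_answer_other=0.3, target_respondents=1000, seed=0):
    """Dial random people until enough of them answer the phone.

    All parameter values are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)
    responses = []
    calls = 0

    while len(responses) < target_respondents:
        calls += 1
        # Random digit dialing reaches a random member of the population.
        concerned = rng.random() < share_concerned
        # Assumption: people concerned about privacy answer less often.
        p_answer = p_answer_concerned if concerned else p_answer_other
        if rng.random() < p_answer:
            responses.append(concerned)

    estimate = sum(responses) / len(responses)
    return estimate, calls


estimate, calls = simulate_nonresponse()
print("true share concerned:      0.500")
print(f"estimate from respondents: {estimate:.3f}")
print(f"calls needed:              {calls}")
```

Even though the phone numbers are chosen at random, the estimate falls well below the assumed true share, because the people who answer differ from the people who don't.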
