White Paper:

Things to Think About When Doing a Survey

Part 3: Sampling

Sampling – the most important and least understood part of survey research

 

Question:  What is the most important musical event in the last 10 years?

 

This question was in a recent internet poll.  Most people will be surprised to learn that the most popular answer was the musical episode of Buffy the Vampire Slayer.  Elton John’s reworking of “Candle in the Wind” at Princess Diana’s funeral was second. (Source: tvguide.com).   Do you believe that result?  It’s nonsense.  But why is it nonsense?

 

The question was put on the website.  Anyone who came to the website could answer the question.  But there was no control over who answered the question or even who came or didn’t come to the site.  The people who answered the question were representative of nobody.  The advent of the web has made getting this type of useless information easier. But how do you get good information?

 

The important part is getting a statistical sample of the population.  A statistical sample is one in which every member of the entire target population has, excuse the geek-speak, a “known probability of selection.”  That means that we know the odds that each member will be included in the sample, which doesn’t necessarily mean that each member has the same probability of being selected.

 

For example, an association has 40,000 members and wants to do a survey of members.  It selects 1,000 members for the survey.  In this case, each member has a 1 in 40 chance of being in the survey sample.  Now, let’s say that it will be an e-mail survey and the association has e-mail addresses for only 30,000 of its members, so it selects 1,000 e-mail addresses.  Now, members with a valid e-mail address have a 1 in 30 chance of being in the sample and the members without valid e-mail addresses have zero chance.
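
The arithmetic in this example can be written out as a short sketch. It is in Python, and the function and variable names are my own choices for illustration, not part of any survey package.

```python
from fractions import Fraction


def selection_probability(sample_size, frame_size):
    """Chance that any one member of the sampling frame is drawn."""
    return Fraction(sample_size, frame_size)


members = 40_000      # everyone in the association
with_email = 30_000   # members with a valid e-mail address
sample_size = 1_000

# Sampling from the full membership list
p_full_list = selection_probability(sample_size, members)

# Sampling from the e-mail list only
p_with_email = selection_probability(sample_size, with_email)
p_without_email = Fraction(0)  # never on the list, so never selected

# Share of the membership that the e-mail list cannot reach
uncovered_share = Fraction(members - with_email, members)

print("Full list:      ", p_full_list)
print("With e-mail:    ", p_with_email)
print("Without e-mail: ", p_without_email)
print("Share not covered by the e-mail list:", uncovered_share)
```

Every probability is known, which is what makes this a statistical sample. The last figure is the share of members the e-mail list leaves out, and that is where the bias can come in.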

 

This is still a statistical sample, but it has a built-in bias.  The association needs to be sure that there is not a difference between those with valid e-mails and those without (this could be done with a quick phone survey of a few questions to a small number of those without a valid e-mail address).

 

Here is another example, this one with unequal probabilities.  I did a survey of long-haul trucking companies for the Federal Highway Administration.  About 80 percent of trucking companies are owner/operators, that is, the company has one truck that the owner drives.  A simple random sample of trucking companies could consist almost entirely of owner/operators and include few, if any, of the larger companies which have thousands of trucks.

 

To ensure that we included the largest companies in our sample, the list of trucking companies was sorted into three groupings: large companies, mid-size companies, and small companies including owner/operators.  We then took a sample of each group.  There are two methods we could choose to pull these samples.  We could sample so that each member of each group had the same probability of selection – or we could choose an unequal probability, for example by choosing the same number of sample cases from each group.  Notice the difference between these methods in the tables below:

 

 

Method 1:  Equal Probability of Selection

                       Total Number    Number       Probability
                       of Firms        in Sample    of Selection
Large companies           2,000            40       1 in 50
Mid-size companies       10,000           200       1 in 50
Small companies          48,000           960       1 in 50
Total                    60,000         1,200       1 in 50

 

Method 2:  Equal Number in Sample Groups

                       Total Number    Number       Probability
                       of Firms        in Sample    of Selection
Large companies           2,000           400       1 in 5
Mid-size companies       10,000           400       1 in 25
Small companies          48,000           400       1 in 120
Total                    60,000         1,200       1 in 50
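
For readers who want to check the numbers, here is a minimal sketch in Python that reproduces both allocations from the group sizes. The function names are mine, and the integer division works only because these figures divide evenly; a real allocation would need a rounding rule.

```python
# Number of firms in each group, taken from the tables above
strata = {"Large": 2_000, "Mid-size": 10_000, "Small": 48_000}
total_sample = 1_200


def proportional_allocation(strata, total_sample):
    """Method 1: each group's sample is proportional to its size."""
    population = sum(strata.values())
    # Exact here because the numbers divide evenly (an assumption of this sketch)
    return {name: total_sample * size // population
            for name, size in strata.items()}


def equal_allocation(strata, total_sample):
    """Method 2: the same number of cases from every group."""
    per_group = total_sample // len(strata)
    return {name: per_group for name in strata}


def one_in(population_size, sample_size):
    """Express a selection probability as '1 in N'."""
    return population_size / sample_size


method_1 = proportional_allocation(strata, total_sample)
method_2 = equal_allocation(strata, total_sample)

for label, allocation in (("Method 1", method_1), ("Method 2", method_2)):
    print(label)
    for name, n in allocation.items():
        print(f"  {name:<10} sample {n:>5}   1 in {one_in(strata[name], n):g}")
```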

 

What are the advantages of the two methods?  With the first method, we can easily make generalizations of the total population (and with reduced variance).  But if we wanted to make comparisons between the small and the large companies, we may not have the statistical power to detect a difference, since there are only 40 large companies in our sample.

 

With the second method, we would have the power to make group comparisons.  However, in order to make generalizations of the total population, the results of each group would need to be weighted to account for the unequal probability of selection.
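
Here is a rough sketch, in Python, of what that weighting looks like. Each group's weight is the inverse of its probability of selection. The "yes" percentages are made-up numbers used only to show the mechanics; they are not results from the trucking survey.

```python
# Group sizes and sample sizes from Method 2
population = {"Large": 2_000, "Mid-size": 10_000, "Small": 48_000}
sampled = {"Large": 400, "Mid-size": 400, "Small": 400}

# HYPOTHETICAL share of each group answering "yes" to some question
yes_share = {"Large": 0.60, "Mid-size": 0.50, "Small": 0.20}

# Weight = 1 / probability of selection = firms in group / firms sampled
weights = {g: population[g] / sampled[g] for g in population}


def unweighted_share(yes_share, sampled):
    """Treats every sampled firm alike, ignoring how it was selected."""
    yes = sum(sampled[g] * yes_share[g] for g in sampled)
    return yes / sum(sampled.values())


def weighted_share(yes_share, sampled, weights):
    """Each sampled firm counts for the number of firms it represents."""
    yes = sum(weights[g] * sampled[g] * yes_share[g] for g in sampled)
    return yes / sum(weights[g] * sampled[g] for g in sampled)


print("Weights:", weights)
print(f"Unweighted: {unweighted_share(yes_share, sampled):.3f}")
print(f"Weighted:   {weighted_share(yes_share, sampled, weights):.3f}")
```

With these invented numbers the unweighted figure overstates the population share, because the large companies are heavily over-represented in the sample. The weighted figure puts the small companies back at their true 80 percent of the population.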

 

Sampling from a list can be straightforward, but there are common mistakes. One of the most common is not getting a random sample.  I know of an organization that pulled a sample from its database.  The computer database administrator was not given guidance, so he pulled the first 1,000 cases from the membership list.  Unfortunately, the list was sorted by zip code, so the organization got a nice census of the members in New England and New Jersey.
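
The mistake is easy to reproduce. The sketch below, in Python, builds a made-up membership list sorted by zip code (the zip codes are randomly generated, not real member data) and compares the first 1,000 cases with a random draw of 1,000 using the standard library's random.sample.

```python
import random

rng = random.Random(2003)  # fixed seed so the sketch is repeatable

# HYPOTHETICAL membership list: 40,000 five-digit zip codes, sorted
zip_codes = sorted(f"{rng.randrange(1000, 100000):05d}" for _ in range(40_000))

# What the database administrator did: take the top of a sorted list
first_1000 = zip_codes[:1000]

# What should have been done: give every member the same chance
random_1000 = rng.sample(zip_codes, 1000)


def zip_range(sample):
    """Lowest and highest zip code in a sample."""
    return min(sample), max(sample)


print("First 1,000 cases span zip codes", zip_range(first_1000))
print("Random 1,000 cases span zip codes", zip_range(random_1000))
```

The first 1,000 cases all sit at the low end of the zip code range, while the random draw spreads across the whole list.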

 

Sampling when there is not a list is even more complex.  One way to get around this is to sample a “parent” organization and create a list from those organizations to get a sample.  For example, to interview hospital nurses, you may select a small number of hospitals. You ask each hospital to supply a list of nurses and you choose a certain number from each hospital.  There are statistical costs for this type of sample: nurses from the same hospital tend to resemble one another, so the results are usually less precise than those from a simple random sample of the same size.
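
A minimal sketch of the hospital example, in Python, is below. The hospital names and nurse rosters are invented, and the function is my own illustration rather than a standard routine. It also records each nurse's probability of selection, which is the chance the hospital was picked times the chance the nurse was picked within it.

```python
import random


def two_stage_sample(rosters, n_hospitals, n_per_hospital, rng):
    """Pick hospitals at random, then pick nurses within each one."""
    chosen = rng.sample(sorted(rosters), n_hospitals)
    p_hospital = n_hospitals / len(rosters)
    sample = []
    for hospital in chosen:
        nurses = rosters[hospital]
        k = min(n_per_hospital, len(nurses))
        # Known probability of selection for every nurse in this hospital
        p_nurse = p_hospital * (k / len(nurses))
        for nurse in rng.sample(nurses, k):
            sample.append((hospital, nurse, p_nurse))
    return sample


# HYPOTHETICAL rosters: 20 hospitals with 50 to 240 nurses each
rosters = {
    f"Hospital {i:02d}": [f"nurse-{i:02d}-{j:03d}" for j in range(50 + 10 * i)]
    for i in range(20)
}

rng = random.Random(42)  # fixed seed so the sketch is repeatable
nurse_sample = two_stage_sample(rosters, n_hospitals=5, n_per_hospital=10, rng=rng)

print(len(nurse_sample), "nurses selected from",
      len({hospital for hospital, _, _ in nurse_sample}), "hospitals")
```

Because hospitals differ in size, nurses in small hospitals have a higher probability of selection than nurses in large ones, so the results would need weighting just as in the trucking example.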

 

Surveying the general public has its own issues.  First, there is no way at the present time to do a general public survey using e-mail addresses.  There are companies trying to make it work, some of them very good companies, but the methodology is not yet there.  And if you try it, you are setting yourself up for some crazy results, such as the Buffy the Vampire Slayer musical result.

 


© Copyright 2003, Adirondack Communications Inc. All rights reserved.