Statistics

Sample Size Calculator

 

Determine Sample Size

Confidence Level: 99%, 95%, 75%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%
Confidence Interval:
Population: (can be left blank)
Collected Data Accuracy: 100%, 90%, 80%, 70%, 60%, 50%, 40%, 30%

Sample size required:
Survey Accuracy (%):

 

Sample Size Stuff

The confidence interval (often called the margin of error) is the plus-or-minus band around the reported result.  For example, if your confidence interval is 2 and 50% of your sample answers in your favor, then you estimate that between 48% and 52% of the whole population would answer the same way.

The confidence level tells you how sure you can be that the true value lies within that band.  For any scientific study, one where the results really matter, most researchers use the 95% confidence level, or at least start at this level and then try to go higher.

When used together, you can make statements such as, "I am 95% sure that 48-52% of the population feels this way."  Unfortunately, this also means that a statement like the following, drawn from some other survey or poll, could be just as valid:  "I am 25% confident that 75-85% of the population feels this way."  This, while valid, is close to worthless because of the extremely low confidence level.  For market research, it might still be useful when targeting an audience for a product, as that percentage of the population could turn into sales.  For more scientific processes, it just isn't accurate enough.

Standard Deviation:
In summary, this is roughly the average distance between each data point and the entire group's average.  If your data follows a nice bell shaped curve, about 68% of it resides within 1 standard deviation of the average and about 95% within 2 standard deviations (in either direction).

If the data sits closer to the average, the standard deviation value itself is smaller and the bell shaped curve is "skinnier."  Whatever its width, as your confidence level goes up, you must go out more standard deviations to cover that share of the curve.  These numbers are used often in statistics and have been simplified into a Zval (z value), the number of standard deviations you must include to reach a specific confidence level: about 1.96 for 95% and about 2.58 for 99%.  Or, how far out you can go with your data and still get agreement.  The table of these values is in any statistics textbook, and the calculator above uses them directly.  (They are the standard two-tail values: you "chop off both ends of the bell curve" since the standard deviation extends in both the plus and minus directions.)
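
As an illustration of the Zval lookup described above, here is a minimal sketch that computes the two-tail value instead of reading it from a textbook table.  It assumes a normal (bell shaped) distribution; the function name z_value is hypothetical and is not taken from the calculator on this page.

```python
from statistics import NormalDist


def z_value(confidence_level: float) -> float:
    """Two-tail z value for a confidence level given as a fraction (e.g. 0.95)."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # "Chop off both ends": split the leftover probability evenly between the tails.
    upper_tail = (1.0 - confidence_level) / 2.0
    # inv_cdf returns how many standard deviations out we must go
    # so that only upper_tail of the curve lies beyond that point.
    return NormalDist().inv_cdf(1.0 - upper_tail)


if __name__ == "__main__":
    for level in (0.25, 0.50, 0.95, 0.99):
        print(f"{level:.0%} confidence level -> z = {z_value(level):.3f}")
```

The higher the confidence level, the larger the value returned, which is the "go out more standard deviations" effect in numbers.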

In summary, the above calculator will help determine how many samples you need to evaluate in order to achieve your desired confidence level and confidence interval.  If you take a very small number of samples, then for the same confidence interval your confidence level will also be small.  How accurate your results need to be depends on how important the data is to you and your constituents.
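
The page does not show the formula behind the calculator, so the following is only a sketch of the common textbook approach: the required sample size for a proportion, with an optional correction when the population is known.  The function name and parameters are hypothetical, the worst-case proportion of 50% is an assumption, and the "Collected Data Accuracy" input is not modeled because its definition is not given here.

```python
import math
from statistics import NormalDist


def required_sample_size(confidence_level, interval_pct, population=None, proportion=0.5):
    """Estimate how many samples are needed.

    confidence_level: fraction such as 0.95
    interval_pct:     plus-or-minus band in percentage points, such as 2
    population:       total group size, or None when large or unknown
    proportion:       expected share answering "yes" (0.5 is the worst case)
    """
    # Two-tail z value for the requested confidence level.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence_level) / 2.0)
    e = interval_pct / 100.0
    # Sample size for an effectively unlimited population.
    n = (z * z) * proportion * (1.0 - proportion) / (e * e)
    if population is not None:
        # Finite population correction: a known, small group needs fewer samples.
        n = n / (1.0 + (n - 1.0) / population)
    return math.ceil(n)


if __name__ == "__main__":
    print(required_sample_size(0.95, 2))          # large or unknown population
    print(required_sample_size(0.95, 2, 10_000))  # known population of 10,000
```

Under these assumptions, a 95% confidence level with a confidence interval of 2 calls for roughly 2,400 samples when the population is large, and somewhat fewer when the population is only 10,000.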

 


Factors that Affect Confidence Intervals

There are three factors that determine the size of the confidence interval for a given confidence level. These are: sample size, percentage and population size.


Sample Size

The larger your sample, the more sure you can be that their answers truly reflect the population. This means that for a given confidence level, the larger your sample size, the smaller your confidence interval. However, the relationship is not linear (i.e., doubling the sample size does not halve the confidence interval). The interval shrinks with the square root of the sample size, so it takes four times the sample to halve it.
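
To make the non-linear relationship concrete, here is a small sketch that computes the confidence interval for several sample sizes.  It assumes the usual normal approximation for a proportion at the worst-case percentage of 50%; the name margin_of_error is hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(sample_size, confidence_level=0.95, proportion=0.5):
    """Plus-or-minus band, in percentage points, for a large population."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence_level) / 2.0)
    return 100.0 * z * math.sqrt(proportion * (1.0 - proportion) / sample_size)


if __name__ == "__main__":
    for n in (500, 1000, 2000):
        print(f"sample of {n:>4}: +/- {margin_of_error(n):.2f} points")
```

Going from 500 to 1,000 samples narrows the band by a factor of about 0.71, not 0.5; it takes 2,000 samples to cut the 500-sample band in half.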


Population Size

How many people are there in the group your sample represents? This may be the number of people in a city you are studying, the number of people who buy new cars, etc. Often you may not know the exact population size. This is not a problem. The mathematics of probability shows that the size of the population is irrelevant unless the size of the sample exceeds a few percent of the total population you are examining. This means that a sample of 500 people is equally useful in examining the opinions of a state of 15,000,000 as it would be for a city of 100,000. For this reason, The Survey System ignores the population size when it is "large" or unknown. Population size is only likely to be a factor when you work with a relatively small and known group of people (e.g., the members of an association).
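
The 500-person example above can be checked with the standard finite population correction.  This is a sketch under the same normal-approximation assumption, with a worst-case percentage of 50%; the function name is hypothetical and the page does not state that its calculator uses exactly this correction.

```python
import math
from statistics import NormalDist


def margin_with_population(sample_size, population, confidence_level=0.95, proportion=0.5):
    """Plus-or-minus band, in percentage points, for a known population size."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence_level) / 2.0)
    base = z * math.sqrt(proportion * (1.0 - proportion) / sample_size)
    # Finite population correction: close to 1 when the sample is a tiny
    # share of the population, noticeably below 1 when it is a large share.
    correction = math.sqrt((population - sample_size) / (population - 1))
    return 100.0 * base * correction


if __name__ == "__main__":
    for label, size in (("state", 15_000_000), ("city", 100_000), ("association", 1_000)):
        print(f"{label:<12} N={size:>10,}: +/- {margin_with_population(500, size):.2f} points")
```

The state and the city come out nearly identical, while the 1,000-member association, where the sample is half the population, gets a visibly tighter band.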

The confidence interval calculations assume you have a genuine random sample of the relevant population.  If your sample is not truly random, you cannot rely on the intervals. Non-random samples usually result from some flaw in the sampling procedure. An example of such a flaw is to only call people during the day, and miss almost everyone who works. For most purposes, the non-working population cannot be assumed to accurately represent the entire (working and non-working) population.  Many things can distort the collected data.  You can, and probably should, spend considerable time studying the data collection process to reduce or eliminate them.  A few issues that can cause flaws in your data collection are:
Response Bias:   This happens when forms or surveys are mailed, or people are directed to a website (for example) to fill out a survey.   Those people who feel the strongest about a topic may complete the survey, while the "normals" just skip it.    This sways the results.
Only calling people:   This discriminates against people without phones, or people without immediate access to a phone, such as those in an office or factory with shared extensions.
Only a web-based survey:   This discriminates against people without easy internet access.
Non-random due to regional issues:   The particular regions of the country, city, or county where the data is taken from could impact the results.
Failure to recognize where stratification is required:   For example, suppose the "pollee" has some motivation to come out strongly against you.    "Pollees" with this same motivation need to be stratified out, and their results need to be adjusted before those results can be applied as a sample.   Another common example is an area where a support service is being surveyed.   If the support was good, but the answer the customer received was unpleasant, the results will be biased and will not truly reflect the quality of the support service.

 

 

Find Confidence Interval

Confidence Level: 95%, 99%
Sample Size:
Population:
Percentage:

Confidence Interval:

 


 

 

 

Author Info Copyright 2005
Last Modified : 09/05/05 07:05 PM