A study of an association between which ear is used for cell phone calls and whether the subject is left-handed or right-handed began with a survey e-mailed to 5000 people belonging to an otology online group, and 717 surveys were returned. (Otology relates to the ear and hearing.) What percentage of the 5000 surveys were returned? Does that response rate appear to be low? In general, what is a problem with a very low response rate?
Constructing a Confidence Interval for the Difference between Two Population Proportions
To determine whether a new instructional technology improves students' scores, a professor wants to know whether the percentage of students who passed the class is higher among those who used the instructional technology than among those who did not. Records show that 45 out of 50 randomly selected students who were in classes that used the instructional technology passed the class and 38 out of 51 randomly selected students who were in classes that did not use the instructional technology passed the class. Construct a 95% confidence interval for the true difference between the proportion of students using the technology who passed and the proportion of students not using the technology who passed.
We are going to show how to construct the confidence interval first without a TI-83/84 Plus calculator and then with one.
Step 1: Find the point estimate.
First, we'll let Population 1 be those students who used the new technology and Population 2 be those students who did not. Next, we need to calculate the sample proportions. The sample proportion for Sample 1 (using instructional technology) is calculated as follows.
$$\hat{p}_1 = \frac{x_1}{n_1} = \frac{45}{50} = 0.9$$
The sample proportion for Sample 2 (without the instructional technology) is found as follows.
$$\hat{p}_2 = \frac{x_2}{n_2} = \frac{38}{51} \approx 0.745098$$
Now that we have the sample proportions, we can calculate the point estimate.
$$\hat{p}_1 - \hat{p}_2 \approx 0.9 - 0.745098 = 0.154902$$
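The arithmetic in Step 1 can be checked with a few lines of Python. This is only a sketch; the variable names are chosen here for illustration and are not part of the example.

```python
# Counts taken from the example: number who passed and sample size per group.
x1, n1 = 45, 50   # Sample 1: used the instructional technology
x2, n2 = 38, 51   # Sample 2: did not use the technology

p_hat_1 = x1 / n1                    # sample proportion for Sample 1
p_hat_2 = x2 / n2                    # sample proportion for Sample 2
point_estimate = p_hat_1 - p_hat_2   # point estimate of p1 - p2

print(round(p_hat_1, 6), round(p_hat_2, 6), round(point_estimate, 6))
# prints: 0.9 0.745098 0.154902
```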
Step 2: Find the margin of error.
Notice that the samples are indeed independent of one another. Because they are two separate groups of students, they are not connected in any way. We can assume that the other necessary conditions are met to allow us to use the standard normal distribution to calculate the margin of error. The level of confidence is $c = 0.95$, so the critical value is $z_{\alpha/2} = z_{0.05/2} = z_{0.025} = 1.96$. Substituting the values into the formula gives us the following.
$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = 1.96\sqrt{\frac{0.9(1-0.9)}{50} + \frac{0.745098(1-0.745098)}{51}} \approx 0.145675$$
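As a check on Step 2, the sketch below computes the critical value and the margin of error in Python. It assumes the usual margin of error formula for two independent proportions, E = z·sqrt(p̂1(1−p̂1)/n1 + p̂2(1−p̂2)/n2), and rounds the critical value to 1.96 as the example does; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 45, 50   # Sample 1: passes, sample size
x2, n2 = 38, 51   # Sample 2: passes, sample size
confidence = 0.95

p_hat_1, p_hat_2 = x1 / n1, x2 / n2

# Critical value z_{alpha/2}: the z-score with area alpha/2 to its right.
alpha = 1 - confidence
z_crit = round(NormalDist().inv_cdf(1 - alpha / 2), 2)  # 1.96, rounded as in the text

# Standard error of the difference between the two sample proportions.
standard_error = sqrt(p_hat_1 * (1 - p_hat_1) / n1 + p_hat_2 * (1 - p_hat_2) / n2)
margin_of_error = z_crit * standard_error

print(z_crit, round(margin_of_error, 6))
```

Dropping the `round(..., 2)` call uses the unrounded critical value, which changes the margin of error slightly in the later decimal places.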
Step 3: Subtract the margin of error from and add the margin of error to the point estimate.
Subtracting the margin of error from the point estimate and then adding the margin of error to the point estimate gives us the following endpoints of the confidence interval.
Lower endpoint: $(\hat{p}_1 - \hat{p}_2) - E = 0.154902 - 0.145675 \approx 0.009$
Upper endpoint: $(\hat{p}_1 - \hat{p}_2) + E = 0.154902 + 0.145675 \approx 0.301$
Thus, the 95% confidence interval for the difference between the two population proportions ranges from 0.009 to 0.301. The confidence interval can be written mathematically using either inequality symbols or interval notation, as shown below.
$$0.009 < p_1 - p_2 < 0.301 \quad \text{or} \quad (0.009,\ 0.301)$$
Therefore, we are 95% confident that the percentage of students who passed the class is between 0.9 and 30.1 percentage points higher for the population of students who used the new instructional technology (Population 1) than for the population of students who did not use the technology (Population 2). Because the entire interval lies above 0, the professor can conclude with 95% confidence that the new instructional technology improves students' scores.
To calculate the confidence interval for the difference between two proportions on the calculator, we don't need to find the individual sample proportions; we just need to enter the number of successes and the sample size for each sample, as well as the level of confidence. Press STAT, scroll to TESTS, and then choose option B:2-PropZInt. x1 is the number of successes from the first sample and n1 is the first sample's size. Similarly, x2 is the number of successes from the second sample and n2 is the second sample's size. As usual, C-Level is the confidence level, which must be entered as a decimal. The data should be entered as shown in the first screenshot below. After you select Calculate and press ENTER, the results will be displayed on the screen as shown in the second screenshot below.
[Screenshot 1: 2-PropZInt data entry screen with x1 = 45, n1 = 50, x2 = 38, n2 = 51, and C-Level = .95.]
[Screenshot 2: 2-PropZInt results screen showing (.00923, .30057), p̂1 = .9, p̂2 = .7450980392, n1 = 50, and n2 = 51.]
Notice that the calculator gives the same interval but with more decimal places. The interpretation of the confidence interval is still the same. The proportion of students passing the class was higher for the population of students who used the new instructional technology than for the population of students who did not use the technology.
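The same interval can be reproduced outside the calculator. The following is a minimal Python sketch, not the calculator's own implementation: the function name `two_prop_z_interval` is hypothetical, and it assumes the standard normal margin of error used in Step 2. Because it keeps the unrounded critical value instead of 1.96, its endpoints agree with the calculator's extra decimal places.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, confidence):
    """Confidence interval for p1 - p2 from two independent samples.

    x1, x2 are the numbers of successes; n1, n2 are the sample sizes;
    confidence is the level of confidence as a decimal (for example 0.95).
    """
    p_hat_1, p_hat_2 = x1 / n1, x2 / n2
    point_estimate = p_hat_1 - p_hat_2

    # Unrounded critical value z_{alpha/2} from the standard normal distribution.
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    margin = z_crit * sqrt(
        p_hat_1 * (1 - p_hat_1) / n1 + p_hat_2 * (1 - p_hat_2) / n2
    )
    return point_estimate - margin, point_estimate + margin


lower, upper = two_prop_z_interval(45, 50, 38, 51, 0.95)
print(round(lower, 5), round(upper, 5))
# prints: 0.00923 0.30057
```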
In this section we will turn our attention to comparing two population proportions. Once again, there are times when we aren't necessarily focused on the exact proportion, but rather how proportions from two populations compare, that is, if they are equal, or if one is larger than the other.
When we were comparing population means, we constructed a confidence interval for the difference between the two population means. Similarly, when comparing two population proportions, we use a confidence interval for the difference between the population proportions. The best point estimate for the difference is $\hat{p}_1 - \hat{p}_2$. In this section we will restrict our discussion to comparing two population proportions when the following conditions are met. Notice that the conditions are similar to those discussed for estimating a single population proportion.
All possible samples of a given size have an equal probability of being chosen; that is, simple random samples are used.
The samples are independent.
The conditions for a binomial distribution are met for both samples.
The sample sizes are large enough to ensure that $n_1\hat{p}_1 \geq 5$, $n_1(1-\hat{p}_1) \geq 5$, $n_2\hat{p}_2 \geq 5$, and $n_2(1-\hat{p}_2) \geq 5$.
When these conditions are met, we can apply the Central Limit Theorem to the sampling distribution of the differences between the sample proportions for two independent samples. This means that we will use the standard normal distribution to calculate the margin of error of a confidence interval for the difference between two population proportions. You can assume that the necessary criteria are met for all examples and exercises in this lesson.
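The sample-size condition is easy to check programmatically. The sketch below uses a hypothetical helper, `large_enough`, and relies on the fact that n·p̂ is simply the number of successes and n·(1 − p̂) is the number of failures; comparing the integer counts avoids floating-point rounding at the boundary value of 5.

```python
def large_enough(x, n, threshold=5):
    """Check n*p_hat >= threshold and n*(1 - p_hat) >= threshold.

    With p_hat = x / n, these are just the success count x and the
    failure count n - x, so the integer counts are compared directly.
    """
    successes = x
    failures = n - x
    return successes >= threshold and failures >= threshold


# Counts from the instructional technology example: 45 of 50 and 38 of 51 passed.
print(large_enough(45, 50), large_enough(38, 51))
# prints: True True
```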
$$p = \frac{x}{N} = \frac{\text{\# of successes}}{\text{population size}}$$
$$\hat{p} = \frac{x}{n} = \frac{\text{\# of successes}}{\text{sample size}}$$
Properties of a Binomial Distribution
The experiment consists of a fixed number, $n$, of identical trials.
Each trial is independent of the others.
For each trial, there are only two possible outcomes. For counting purposes, one outcome is labeled a success, and the other a failure.
For every trial, the probability of getting a success is called $p$. The probability of getting a failure is then $1 - p$.
The binomial random variable, $X$, counts the number of successes in $n$ trials.
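To make these properties concrete, the following sketch simulates one binomial experiment in Python. The function name `count_successes`, the seed, and the values n = 50 and p = 0.9 are illustrative choices, not taken from the text.

```python
import random


def count_successes(n, p, rng):
    """Run n independent trials, each a success with probability p,
    and return X, the number of successes."""
    return sum(1 for _ in range(n) if rng.random() < p)


rng = random.Random(0)            # seeded so the run is repeatable
x = count_successes(n=50, p=0.9, rng=rng)
print(x, "successes out of 50 trials")
```

Each call uses a fixed number of trials, the same success probability on every trial, and only two outcomes per trial, which mirrors the properties listed above.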
If there are $n$ pairs of data values and the population distribution of the paired differences is approximately normal, then the sampling distribution for the sample statistic $\bar{d}$ follows a t-distribution with $n - 1$ degrees of freedom. Hence, the formula for the margin of error is as follows. This is the same formula that is used when estimating a single population mean when $\sigma$ is unknown, because with paired data we treat the paired differences as a single set of sample data rather than using the data from the two samples separately.
Margin of Error of a Confidence Interval for the Mean of the Paired Differences for Two Populations (σ Unknown, Dependent Samples)
When both population standard deviations are unknown, the samples taken are dependent, simple random samples of paired data, and either the number of pairs of data values in the sample data is greater than or equal to 30 or the population distribution of the paired differences is approximately normal, the margin of error of a confidence interval for the mean of the paired differences for two populations is given by
$$E = t_{\alpha/2} \cdot \frac{s_d}{\sqrt{n}}$$
where
$t_{\alpha/2}$ is the critical value for the level of confidence, $c = 1 - \alpha$, such that the area under the t-distribution with $n - 1$ degrees of freedom to the right of $t_{\alpha/2}$ is equal to $\frac{\alpha}{2}$,
$s_d$ is the sample standard deviation of the paired differences for the sample data, and
$n$ is the number of paired differences in the sample data.
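The sketch below shows how this margin of error could be computed in Python, assuming the single-sample form E = t·s_d/√n applied to the paired differences. The function name `paired_margin_of_error` and the ten differences are hypothetical. The critical value is passed in as an argument, as if read from a t-table; 2.262 is the table value for 9 degrees of freedom with an area of 0.025 in the right tail.

```python
from math import sqrt
from statistics import stdev


def paired_margin_of_error(differences, t_crit):
    """Margin of error E = t_crit * s_d / sqrt(n) for paired differences.

    differences: the paired differences d_i from the sample data
    t_crit: critical value t_{alpha/2} with n - 1 degrees of freedom,
            supplied by the caller (for example from a t-table)
    """
    n = len(differences)
    s_d = stdev(differences)   # sample standard deviation (divides by n - 1)
    return t_crit * s_d / sqrt(n)


# Hypothetical paired differences (n = 10, so 9 degrees of freedom).
diffs = [2, -1, 3, 0, 4, 1, 2, -2, 3, 1]
E = paired_margin_of_error(diffs, t_crit=2.262)   # 95% level of confidence
print(round(E, 4))
```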
To use paired data to construct a confidence interval, the following conditions must be met.
All possible samples of a given size have an equal probability of being chosen; that is, simple random samples are used.
The samples are dependent.
Both population standard deviations, $\sigma_1$ and $\sigma_2$, are unknown.
Either the number of pairs of data values in the sample data is greater than or equal to 30 or the population distribution of the paired differences is approximately normal.
In this lesson, you may assume that these conditions are met for all examples and exercises involving paired data.
The value that we want to estimate is the mean of the paired differences for the two populations of dependent data, $\mu_d$. Recall that the first step in constructing a confidence interval is to find the point estimate, and the best point estimate for a population mean is a sample mean. Therefore, the mean of the paired differences for the sample data, $\bar{d}$, is the point estimate used here.
Formula: Mean of Paired Differences
When two dependent samples consist of paired data, the mean of the paired differences for the sample data is given by
$$\bar{d} = \frac{\sum d_i}{n}$$
where
$d_i$ is the paired difference for the $i$th pair of data values and
$n$ is the number of paired differences in the sample data.
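A short Python sketch of this calculation follows. The before and after scores are hypothetical, and taking each difference as "after minus before" is an assumption made here for illustration; the text does not fix the direction of subtraction.

```python
# Hypothetical paired data: each position is one subject measured twice.
before = [72, 65, 80, 58, 77]
after = [75, 70, 78, 66, 81]

# Paired differences d_i (assumed direction: after minus before).
differences = [a - b for a, b in zip(after, before)]

n = len(differences)            # number of paired differences
d_bar = sum(differences) / n    # mean of the paired differences

print(differences, d_bar)
# prints: [3, 5, -2, 8, 4] 3.6
```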