Reading Assignment #15 - Due 4/2/12

Here's your next reading assignment. Read Sections 5.1-5.2 in your textbook and answer the following questions by 8 a.m., Monday, April 2nd. Be sure to log in (using the link near the bottom of the sidebar) to the blog before leaving your answers in the comment section below.

  1. Suppose you've developed a vehicle back-up camera and you suspect that it will help drivers parallel park more quickly. You have twelve volunteers parallel park an SUV without the camera, and then you have a different twelve volunteers parallel park the same SUV while using the camera. You compare the time it takes each group of volunteers to park the SUV. Are these samples paired or unpaired? Explain your answer.
  2. Suppose that the drinking water in two cities, Phoenix and Las Vegas, is sampled and tested for arsenic. The average arsenic level in the 50 Phoenix samples was 12.5 parts per billion (ppb) with a standard deviation of 7.63. The average arsenic level in the 50 Las Vegas samples was 15.4 ppb with a standard deviation of 15.3. Conduct a hypothesis test using these data to determine if the arsenic levels in these two cities are different.
  3. Answer the question posed in Exercise 5.14 on page 201.
  4. What question do you have about the reading?

41 thoughts on “Reading Assignment #15 - Due 4/2/12”

  1. 1. It is paired, since the two samples have a "special" connection.
    2. They are the same. (P value around 8.188)
    3. Since SE squared is Variance, that equation functions similarly.
    4. No questions.

  2. 1. These samples are unpaired. An observation in one sample is not related to any one specific observation in the other. These samples would be paired if, for example, the same volunteers parked the SUV with and without the cameras, then you could compare each person's performance.

    2. Ho: μP - μLA = 0. In other words, there is no difference between the arsenic levels in the drinking water in Phoenix and LA. Ha: μP - μLA Not= 0, or there is a difference in the arsenic levels. A z-score of -1.1994 leads us to a p-value of 0.230373. Since this value is not less than .05, we do not reject the null.

    3. The formula for the variance of a linear combination of random variables is as follows: Var(aX + bY) = a^2 Var(x) + b^2 Var(Y). Since the Var(X-Y) would be the Var(1X + (-1)Y) = Var(x) + Var(Y), which is where this rewrite of equation 5.13 is derived from.

  3. 1. They are paired because they share the same correspondence in the time it takes to park the car.

    2.Ho: There is no difference in the arsenic levels
    Ha: There is a difference in arsenic levels

    SE: 1.084701802
    Z Score: 2.673545848
    P = 2*(1-0.99621)
    =0.00758

    Since it is less than .05, the null hypothesis is rejected and there is a difference

    3. It comes from accounting for the difference in variance.

    4. None

  4. 1. The data collected is unpaired - Changing to a new set of 12 drivers means that the comparison between Parking Sample Without Camera 1 and Parking Sample With Camera 1 is not a direct observation on the camera's effectiveness, but also on differences between two completely different drivers.

    2. Using the method described in Section 5.2.3,
    Ho = There is no difference in means between Phoenix and Las Vegas.
    Ha = There is a difference (Mu(diff) =/= 0) between means in Phoenix and Las Vegas.

    SE = 2.418

    We are 95% confident that the arsenic level in Phoenix is higher than that in Las Vegas by .48 ppb to 5.32 ppb.

    3.(SE)^2 represents the variance of a sample, so the variance of X-Y is equal to var(X) - var(Y), or (SE)x^2 - (SE)y^2.

    4. What is the functionality of determining data about the difference of two means rather than simply comparing data from the two different sample sets?

  5. 1) The samples are paired. The time it takes for each volunteer to park an SUV with the camera has a connection with exactly one of the times it takes for each volunteer to park an SUV without the camera. The difference in outcomes of each pair of observation could be easily taken.
    2)
    Ho: m1 - m2 = 0
    Ha: m1 - m2 != 0
    SE = 2.4
    Z = 1.21
    p-value = 2 * .1131 = .2262
    The p-value is not very small, so we fail to reject Ho and we conclude that the arsenic levels in two cities are not different.
    3) var(x+y) = a^2 * var(x) + b^2 * var(y)
    4) I don't think the book clearly states how to find p-value for the difference of two means. We calculate z-score; then look at the probability; then do 1-that probability. And finally p-value = 2 * our result. Is it right?

  6. 1.They are paired because there is a correspondence between the data.

    2. Ho: Average Arsenic levels are the same (diff = 0)
    Ha: Average Arsenic levels are different (diff != 0)

    Z = ((15.4-12.5) - 0) / ((15.3-7.63)/sqrt(50)) = 2.67 -> 0.9962
    1- 0.9962 = .0038

    p-value = 2 ∗ (one tail area) ≈ 2 ∗ .0038 = 0.0076
    p < .05 so reject Ho

    3. SE =

  7. 1) Unpaired. The observations are independent because different drivers are used for each condition (camera and no camera).

    2) Null Hypothesis : There is no difference. Alternative hypothesis : There is difference. Ndiff = 50. Meandiff =2.9 SDdiff = 7.67
    Standard Error = 1.08
    Z-score = 2.69
    P-score = 0.007
    Conclusion : No difference.

    3) Var(aX+bY) =( a^2 Var X ) + (b^2 Var Y)
    a = 1, b= -1 but since it is squared, the negative sign disappear.

    4) No question.

  8. 1. The samples are unpaired because they came from 2 totally different groups of subjects and the order is irrelevant. If the same group of subjects parked the two cars, the test would be paired because each subject could be matched to their own time in the other trial.

    2. Ho: mup-mulv=0
    Ha: mup-mulv not= 0

    SE=sqrt(sp^2/np+slv^2/nlv)=2.418
    z=(mulv-mup)/SE=3.21
    p=P(z>=(x-0)/SE)=P(z>=3.213)=1-.9993=.0007

    3. if X and Y are independent random variables with standard deviations sx and sy, then the linear combination z=aX+bY has a variance sz^2=a^2sx^2+b^2sy^2. since for z=X-Y, a=1 and b=-1, the variance of z is sx^2+sy^2.

  9. 1) No, the trials involve different people with two distinct capabilities and are therefore independent.
    2) H0 : The difference between the two cities' arsenic levels are the same
    HA: The difference between the two cities' arsenic levels are different.
    3) By the rules of probability theory, we know that the variance of two combined variables is equal to the sum of their individual variances. Since the square root of the variance is the standard deviation and the standard error is the standard deviation of a sample mean, we can find the standard error of two combined variables by taking the square root of the sum of their standard errors.
    4) Is it acceptable to have the alternate hypothesis be a one-sided test? All the examples in the book were two-sided.

  10. 1. These are unpaired. Since it is a different 12 people doing each trial, there is no correspondence between two specific results.
    2. z=1.19. If alpha =.05, there is no significant difference in the water.
    3. It is like one of our problem set theories. If we square the st. deviation, we get the variance. so square both deviations, and you get the variances. Add them (since they are squared, you don't subtract them anymore), and then take the square root of the numbers.
    4. The reading was straightforward.

  11. 1) Since the samples are independent (each observation in the first set has no correspondence with any observation in the second set), the samples are unpaired.
    2)
    H0: There is no difference between the average arsenic level between the two cities.
    HA: There is a difference between the average arsenic level between the two cities.
    p-value = 2*P(Z>2.9-0/(2.418)) = 2*P(Z>1.20) = 2*0.115 = 0.23
    Since p-value is greater than 5% significance level, we fail to reject the null.
    3) Since the new random variable can be modeled as Z = Y - X, the standard error of Z will be sqrt(a^2*SE(Y)^2 + b^2*SE(X)^2) = sqrt(SE(Y)^2 - SE(X)^2).
    4) No question.

  12. 1. These samples are not paired because the second sample has a different group of people than the first. In order for data to be paired, each observation in the first set must correspond to exactly one observation in the second set of data. With 2 different sets of drivers, the data cannot be considered paired because each driver in the first set does not correspond to a driver in the second set in a meaningful way.

    2. H0 difference = 0
    Ha difference =/= 0
    SE= [(7.63^2)/(50)+(15.3^2)/(50)]^(1/2)=2.42
    z=(15.4-12.5)/(2.42)=1.20
    p=2(1-0.8849)=0.2302
    Because of this high p value, we fail to reject the null hypothesis.

    3. Squaring the standard error of the difference between the two means is the same as adding the square of the standard error of the two populations because the standard error represents the variance of the data. Thus the variance between two sets of data is the same as the sum of their individual variances.

    4. Is it possible to set up a hypothesis test between two data points like in question 2 in the conventional way? (i.e H0=12.5 and Ha=/= 12.5)?

  13. 1. Unpaired, in order for the samples to be paired, you would need to have only one group of twelve volunteers. Each of these volunteers would have to park twice, once using the camera and once without using the camera. Since there are two groups of volunteers in the actual experiment, there is no special correspondence between the two sets of observations and thus the data is unpaired.

    2.
    H0: μ (difference) = 0
    H1: μ =/= 0

    Point estimate: 15.4 ppb - 12.5 ppb = 2.9 ppb
    Standard error: sqrt( (7.63^2)/(50) + (15.3^2)/(50) ) = 2.418 ppb
    Z Score: 1.20
    p-value: 0.1151 * 2 (two-tailed test) = 0.2302

    23.02% chance that the difference "2.9 ppb" was produced through random variation when the actual difference in population means was "0 ppb." There is not strong enough evidence to reject the null.

    3. The standard error squared is simply the variance of the estimate. If we have two samples, "x" and "y", we may compute the variance of their difference "x-y" by considering it a linear combination:
    VAR(aX + bY) = a^2 * VAR(X) + b^2 * VAR(Y)
    a = 1, b = -1
    VAR (x-y) = VAR(x) + VAR(y)
    SE^2 (x-y) = SE^2(x) + SE^2(y)
    SE (x-y) = sqrt( SE^2(x) + SE^2(y) )

    4. Who will win the 2012 NCAA Men's Basketball Championship?

  14. 1. The samples are paired because each time the volunteers park the same SUV. Therefore, each parking time without a camera has a natural correspondence to the parking times with a camera on the same SUV. When two sets of observations have this correspondence they are said to be paired.

    2. The two samples are independent of each other, so not paired. So a point estimate of the mu s can be found. x(bar, PH)- x(bar, LV)= 15.4 - 12.5= 2.9. Because n>50 and the data doesn't seem too skewed, we can assume the data is nearly normal. And because they aren't paired, we can assume that the difference in sample means can be using a normal distribution.

    SEx(bar,lv)-x(bar,ph)= Radical (sample Std(lv)^2/50 + sample Std(ph)^2/50)= 2.42

    Use p-values for this hypothesis test with a significance level of alpha = .05. Find the upper tail area using p-values.
    Z=2.9 - 0/2.42 = 1.21 -> upper tail-> 1- .8869= .1171.

    Because the P-value is greater than .05, we accept Ha and reject Ho.

    3. Just the same as the Var(x-y)= var(x)+var(y), SE^2(x1-x2)= SE^2(x1)+SE^2(x2).

    4. Could we have done #2 using a confidence interval instead?

  15. 1. These samples are unpaired because each observation cannot be paired with exactly one observation in another set. All twelve samples in each set could be matched up with any sample in the other set.
    2. The p-value is found to be 0.2302, which is larger than the significance value, 0.05 for a 95% confidence interval. Therefore, we fail to reject the null hypothesis, that there is no difference in the arsenic levels. There is insufficient evidence to show that the difference in the arsenic levels is significant.
    3. I cannot process this right now.
    4. I would like exercise 5.14 explained because it makes me feel like I don't fully understand variance.

  16. 1. These samples are unpaired because each consists of a different set of twelve drivers. It would be paired if the same 12 drivers were asked to parallel park with the camera and then again without the camera.

    2. H0: The arsenic content is the same for the two cities μ1-μ2 = 0
    HA: μ1-μ2 ≠ 0
    x1-x2 = 2.9
    SEx1-x2 = 1.803
    z = (2.9-0)/(1.803) = 1.61
    p = 2*0.0537 = 0.1074
    This result is not significant at the α=0.05 level, so we fail to reject the null hypothesis and cannot show that there is any difference between the arsenic levels in the two water supplies.

    3. When you want to find the variance of a linear combination of variables X and Y of the form aX + bY, the general formula is (a^2)Var(X) + (b^2)Var(Y). In this case we want the variance of of X-Y so Var(X-Y) = Var(X) +Var(Y). The minus sign disappears because the coefficient -1 is squared. Variance is just the standard error squared which can be subbed in giving the equation in problem 5.14.

    4. Isn't a hypothesis test with null hypothesis H0: μ1-μ2 = 0 the same as one with H0: μ1=μ2? which we've been doing up until now.

  17. 1) Unpaired. You cannot specify which data point in the first set correlates with which data point in the second set, since the drivers are different.
    2) Ho: (P Phoenix Arsenic levels = LV Arsenic levels)
    Ha: (P Phoenix Arsenic levels =/ LV Arsenic levels)
    3) The variance of two random variables X and Y is SE^2 of X + SE^2 of Y
    4) Still not sure on 5.14

  18. 1. They are unpaired because the drivers are different in the two groups. If there were only 12 drivers altogether and we measured their parking times with and without the camera, then they would be paired.
    2. H0: There is no difference between arsenic levels in these two cities.
    Ha: There is a difference
    The difference of mean is 2.9
    The SE is 2.42
    So Z = 2.9/2.42 is 1.2. According to the table, upper tail is 0.1539
    Because it's two side, we double the result and get p value 0.31. It is larger than the significance value 0.05 so we fail to reject H0.
    3. Calculate the square root of both side, the RHS is the form in formula 5.13
    4. Whether we always use the two side in hypothesis testing?

  19. 1. I would say that the samples are paired because you are testing the parallel park speed of the cars and there is data for with and without a backup camera.

    2. H0 : there is no difference in the means
    HA : There is a difference between the two cities means

    3. Because the two means are completely independent, the difference of their sample means can be determined by calculating each Standard error separately and then adding them together.

    4. Why are the standard errors added together and not subtracted when you are looking for the difference?

  20. 1) No because there is no direct connection from each of the first 12 to exactly one of the other 12.
    2) H0: they are the same, HA: they are different. Construct a 95% confidence interval with the difference of the means and the SEx1-x2
    3) SE^2 is variance and the variance of a*var(x)+b*var(y) = a^2*var(x)+b^2*var(y) so then that becomes SEx1-y1=(var(x)+var(y))^(1/2)
    4) none

  21. 1) Unpaired because the two tests are independent from each other.
    2) The null hypothesis is that the arsenic levels are the same and the alternate hypothesis is that they are different. Z = 2.42, p = .0078 so the null hypothesis is rejected.
    3) The variance of (X - Y)^2 = s.d.(x)^2 + s.d.(y)^2. So since standard error is based off of variance it will have the same algebraic relation as variance which is why S.E.(x-y)^2 = S.E(x)^2 + S.E.(y)^2

  22. 1. These samples are unpaired because two different sets of volunteers were used; one to park the SUV without the camera and the other set to park the same SUV with the camera. Here it is important to see if it helps the same driver to park with the camera, and the current data does not have the one-to-one correspondence between the two sets.
    2. diff = mean arsenic level in Las Vegas - mean arsenic level in Phoenix
    Null hypothesis: mean_diff is zero
    Alternate hypothesis: mean_diff is not equal to zero
    In this case, the point estimate is 15.4-12.5 = 2.9
    The standard error is computed to be 2.418
    The p-value is computed to be .2302 (this is a two sided test)
    Using the significance value of .05, the p-value is found to be bigger than the significance level, hence we fail to reject the null hypothesis.
    3. If Z = X-Y, then var(Z) = (1)^2*var(X) + (-1)^2*var(Y) = var(X)+var(Y)
    Since the standard error squared, SE^2 represents the variance of the estimate, it follows that SE^2_(x1-x2) = SE^2_x1 + SE^2_x2
    4. What is the difference when we try to find the difference in population means of paired and unpaired data?

  23. 1. The samples are unpaired. In order for the data sets to be paired, both sets (with the camera and without it) would have to consist of the same drivers parallel parking.
    2. The null hypothesis is that the difference in the means of the arsenic levels in the two cities is zero, while the alternative hypothesis is that the difference in the means is not zero. The SE is sqrt((7.63^2/50)+(15.3^2/50)) = 2.42. The Z-score is then ((15.4-12.5)-0)/2.42 = 1.20. The p-value is 2*(1-.8849) = .2302, which is well above the normal acceptable significance level of 0.05. Therefore, we cannot reject the null hypothesis, and we cannot say that the arsenic levels are different in the two cities.
    3. The standard error squared is the variance of the estimate. So, plugging in the differences of the data points into the general variance formula would yield this equation.
    4. Does the standard error formula for a difference of sample means have any relation to the pythagorean theorem, or is it simply coincidence that the two formulas look the same?

  24. 1) Unpaired - in order to be paired you would need to use the same volunteers both times. Each observation in one set does not have a special correspondence or connection with exactly one observation in the other data set.
    2) H0 = water samples in the two cities are equal in average arsenic content
    HA = water samples in the two cities are not equal in average arsenic content
    3) SE[x1-x2] = sqrt((SE[x1])^2 +(SE[x2])^2)
    = sqrt(s1^2/n1 + s2^2/n2)
    SE[x1-x2] ^2 = SE[x1] ^2+ SE[x2] ^2
    Because you add variance when taking the difference in mean - it should grow with a broader sample
    4) My question is actually concerning my former answer - did I get 3 right? I had no idea how to explain it...

  25. 1. The two samples are unpaired as each observation in the first sample does not have a direct connection to a specific observation in the second sample.

    2. H0: µdiff = 0
    HA: µdiff does not equal zero

    µdiff = 15.4-12.5 = 2.9
    SEdiff = ((7.63^2 + 15.3^2)/50)^1/2 = 2.42
    Z = 2.9-0/2.42 = 1.20
    Z(1.2) = .1151
    a = 2*.1151 = .23
    Since a > .05, we cannot reject the null hypothesis.

    3. For a linear combination of two random variables X and Y the variance is given by a^2*Var(X)+b^2*Var(Y) where a and b are fixed numbers. For the difference of two means (X1- X2), a and b are 1 and -1. The net result is that Var(X1-X2) = 1^2*Var(X1)+(-1)^2*Var(X2) = Var(X1)+Var(X2)

  26. Suppose you've developed a vehicle back-up camera and you suspect that it will help drivers parallel park more quickly. You have twelve volunteers parallel park an SUV without the camera, and then you have a different twelve volunteers parallel park the same SUV while using the camera. You compare the time it takes each group of volunteers to park the SUV. Are these samples paired or unpaired? Explain your answer.
    These samples are unpaired because different volunteers were used in the two experiments for each car. Thus, there are two differences in each trial, the person driving and camera or not camera. Thus you cannot pair together these trials. Another way to look at it is simply that since there are different drivers, drivers are independent in what we would call pairs, thus the data is unpaired since dependency must exist in order to identify pairs.

    Suppose that the drinking water in two cities, Phoenix and Las Vegas, is sampled and tested for arsenic. The average arsenic level in the 50 Phoenix samples was 12.5 parts per billion (ppb) with a standard deviation of 7.63. The average arsenic level in the 50 Las Vegas samples was 15.4 ppb with a standard deviation of 15.3. Conduct a hypothesis test using these data to determine if the arsenic levels in these two cities are different.
    H0: muP - muLV = 0
    HA: muP - muLV != 0
    SE(muP-muLV) = 2.4179
    xP - xLV = -2.8

    p(z 0.1230 * 2 = .246

    We fail to reject H0 for HA and maintain that there is no difference in arsenic levels in LV and Phoenix.

    Answer the question posed in Exercise 5.14 on page 201.

    We discussed this when we discussed linearity of random variables. Std and variance are not linear but adhere to the equation shown. Thus taking the sqrt of both sides and substituting our SE formula, we get the SE formula for difference of means.

    What question do you have about the reading?

    What is z* (in words, z star)?

  27. 1) Suppose you've developed a vehicle back-up camera and you suspect that it will help drivers parallel park more quickly. You have twelve volunteers parallel park an SUV without the camera, and then you have a different twelve volunteers parallel park the same SUV while using the camera. You compare the time it takes each group of volunteers to park the SUV. Are these samples paired or unpaired? Explain your answer.
    > Yes, the link between the two data sets is the fact that the same person was tested in each

    2) Suppose that the drinking water in two cities, Phoenix and Las Vegas, is sampled and tested for arsenic. The average arsenic level in the 50 Phoenix samples was 12.5 parts per billion (ppb) with a standard deviation of 7.63. The average arsenic level in the 50 Las Vegas samples was 15.4 ppb with a standard deviation of 15.3. Conduct a hypothesis test using these data to determine if the arsenic levels in these two cities are different.
    > H0: μdiff = 0. There is no difference.
    HA: μdiff ≠ 0. There is a difference.
    I do not know how to continue from here

    3) Answer the question posed in Exercise 5.14 on page 201.
    > This equation comes from the fact that the standard deviation of the difference between 2 variables is the square root of the sum of the squares. You get this equations when you square both sides

    4) What question do you have about the reading?
    > Problem 2 above.

  28. 1. Unpaired. The data would be paired if the original 12 participants parked without and then with the backup camera. But because they were different individuals, the data is unpaired.

    2.
    H0: mu-diff = 0
    HA: mu-diff != 0
    95% CI: (-7.61, 1.81)
    Fail to reject null hypothesis

    3. It's like when we had the formula combining sample data. The equation for variance was: (S = sigma)
    Sz^2 = (a^2)*(Sx^2) + (b^2)*(Sy^2)

    4. April fools? This was posted really late...

  29. 1.) The tests are not paired because although the same SUV is used in a driving test with the camera and without, there are different drivers, thus the variables have changed and we cannot compare the difference accurately.

    2) The null hypothesis is that the arsenic levels in the water of Phoenix and Las Vegas are the same. The alternative hypothesis is that they are different. A hypothesis test returns a low p-value, indicating that we should reject the null hypothesis. The water levels in the two cities are most likely different.

    3.) The formula can be explained by considering the two test variables as random variables. We are after the random variable that is the difference between the two, X - Y. The variance of such a combination is Var(X - Y) = Var(X) + Var(Y) and since the variance is the square of the standard deviation (standard error) then we have

    (SE_X-Y)^2 = (SE_X)^2 + (SE_Y)^2

    4.) Why exactly is it so that Var(X - Y) = Var(X) + Var(Y)? I don't think I ever learned why that was, just that it was.

  30. 1. unpaired, because different volunteers are used in the two tests. Thus, the two results are independent of each other.

    2.H0: AAL (average arsenic level) of Las Vegas - AAl of Phoenix = 0
    Ha: AAL of Las Vegas - AAL of Phoenix does not equal 0
    SE = 2.42
    Z = (15.4 - 12.5) / 2.42 = 1.2
    p-value = 0.115
    double of the p-value is greater than the significance value, 0.05. Thus, we fail to reject the null hypothesis.

    3. the variance of the difference in mean values of the two variables is equal to the sum of the variances of the two variables. As the difference in mean values of the two variables is predicted, the variance of their difference has become larger because it's harder to "predict" the result of two variables.

  31. 1) They are not paired, as there is no connection between individuals in the two test groups.

    2) Ho: mean_diff = 0 (There is no difference between the average arsenic levels in Phoenix and Las Vegas.)
    Ha: mean_diff != 0 (There is a difference one way or the other between arsenic levels in the two cities.)

    3) From probability theory: var(ax+by) = a^2*var(x) + b^2*var(y). We are simply applying this to a difference between two samples so var(x-y) = var(x) + var(y).

    4) Is it ever relevant to try a one sided test on differential data? Say, if we want to see only if one is greater than another.

  32. 1. No they aren’t because there is no exact relationship between data points in the tests. Maybe if the SUV drivers were the same.

    2. H0: There is no difference (as this is always the H0, according to the book), or the meanDifference = 0
    HA: There is a difference, or meanDifference != 0
    All the data seem to match the qualifications of a normal distribution.

    3. This is simply the mean variance of the data. As we know, the SE^2 will equal the variance for a variable in X. So all we are doing is just taking the variance of X minus the variance of Y, then (logically) squaring it to get the variance for a paired data point.
    4. None.

  33. 1) unpaired. The drivers are different, so there is no pair of w/ camera and w/o camera data for any one driver.

    2) H_0: levels are the same (difference equals 0)
    H_a: levels are different (difference does not equal 0)
    Conduct test using (Phoenix minus Vegas)
    mean: -2.9 ppb
    SE: 2.418 ppb
    Z = (-2.9 - 0)/2.418 = -1.2
    Yields a p-value of .2302, providing insufficient evidence in support of H_a

    3) Let X and Y be the square of the SE of two unpaired datasets (their variance). Because the variance is always positive (it is a squared value), the SE squared associated with the linear combination x-y, where x and y are the means of the datasets in question is equal to X+Y.

    4) none

  34. 1. It seems that these samples are unpaired. The definition of paired samples states that, "Two sets of observations are paired if each observation in one set has a special correspondence or connection with exactly one observation in the other data set." The individual park times of each of the twelve cars are not related in this "special correspondence" as is required to be a paired sample.
    2. Suppose that the drinking water in two cities, Phoenix and Las Vegas, is sampled and tested for arsenic. The average arsenic level in the 50 Phoenix samples was 12.5 parts per billion (ppb) with a standard deviation of 7.63. The average arsenic level in the 50 Las Vegas samples was 15.4 ppb with a standard deviation of 15.3. Conduct a hypothesis test using these data to determine if the arsenic levels in these two cities are different.
    Ho: Phoenix arsenic =/= Las Vegas arsenic
    Ha: Phoenix arsenic == Las Vegas arsenic
    3. It came from the formula for Standard Deviation of y-x which is: Stddev(y-x)= sqrt(sy^2-sx^2).
    4. Everything seems straight forward, I think. Might think of something later, though.

  35. 1. The samples are not paired because it is not the same twelve people that do the parallel parking without the camera, that do it with it.

    2. H0: The difference in Arsenic level is 0
    Ha: The difference is not 0.
    SE = 2.42; Z =((15.4-12.5)-0)/2.42 = 1.19
    p-value = 2* 0.117 = 0.234
    With this p-value we cannot reject H0, and cannot say conclusively if the Arsenic levels are different.

    3. SE^2 of difference in mean is the variance of the difference in the two observed means. This is equal to the addition of the variance of the two means because variance has no negative quantity and therefore the variance of each mean is added.

  36. The data is paired since we can match a point from each set on the person who parked the SUV.

    H0: mu_Phoenix - mu_LasVegas = 0
    HA: same as above, but not equal

    The sample error is a type of standard deviation, which increases in the same fashion as in 5.14.

    How can we determine when a pair has no bearing
    and what's the threshold for the correspondence between the variables?

    PS -- would be helpful to post readings further in advance

  37. 1. They are unpaired. Although they follow the first condition for being paired data, that each set must have the same number of data points, they break the second because there isn't a 1-1 relationship between the relative data in each set. The 3rd driver to try parallel parking without the camera has no special relationship to the 3rd driver in the camera test above its relationships with the other 11 drivers

    2. H0: mu1 = mu2

    t = (~XT - ~XC)/sqrt(varT/nT + varC/nC)
    = (12.5-15.4)/sqrt(7.63/50 + 15.3/50)
    = -2.5/0.4586
    = 5.4513737462

    Since 5.45 is greater than the corresponding t-test value for 95%, we can see that the results are significant. Because they are significant, we know that the means of the arsenic level for the two cities are in all probability different.

    3. We know that for two random variables X and Y, sigma^2_(X-Y) = sigma^2_(X) + sigma^2_(Y). The standard error is basically just a constant times the variance, so the equation is essentially the same except with standard error instead of variance.

    4. What's another example of paired data?

  38. 1. They are unpaired. Save for the fact that both groups drove the same SUV, there is no direct relationship between individual members of each group.

    2. Ho: There is no difference between the arsenic levels (uP - uL = 0)
    Ha: There is a difference between the arsenic levels (uP - uL =/ 0)
    z = 0.234. We fail to reject the null hypothesis.

    3. It is the difference of two squares, minus their sums.

    4. No questions.

  39. 1. These samples are unpaired because the samples consists of separate individuals. If it was done with and without the camera using the same people the sample would be paired.
    2. Null: mu between arsenic level in Phoenix and Las Vegas = 0.
    Alternate: mu between arsenic level in Phoenix and Las Vegas does not equal 0.
    3. The variance of x and y added together is equal to the variance between x plus the variance between y.
    4. Why does it matter to know whether samples are paired or unpaired?

  40. 1) There is no relation between the individuals in the tests so they aren't paired.
    2) Ho: mean difference = 0
    Ha: mean difference not = 0
    3) We use the following variance equation var(x-y) = var(x) + var(y).
    4) When is it appropriate to use the two sided test?

  41. 1)
    The samples would be unpaired. Since a particular data point in sample A may have more than one correspondence in sample B, the samples would not be paired. Also, both samples would be independent of one another because different people were used in both experiments.

    2)
    x_P = 12.5, n_P = 50, s_P = 7.63; x_LV = 15.4, n_LV = 50, s_LV = 15.3
    H0 = u_P - u_LV = 0
    HA = u_P - u_LV != 0

    x_d = x_P-x_LV = -2.9
    u_d = u_P-u_LV

    p = 2*P(x_d < -2.9 | u_d = 0)
    SE_d = sqrt(7.63^2/50 + 15.3^2/50) = sqrt(1.164+4.682) = 5.846
    Z = (-2.9-0)/5.846 = -0.496
    p = 2*P(Z < -1.61) = 2 * 0.3085 = 0.617
    High p-value, so we can reject the HA, so the arsenic levels are the same in both cities.

    3)
    SE = sqrt(std^2/n). The SE of a difference of means is just a vector that describes the connection (or disconnect) between two different data points. You can use the Pythagorean theorem to describe the components of the SE of a difference of means. So, (SE_x-y) ^2 = (SE_x)^2 + (SE_y)^2.
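
For anyone checking their arithmetic on question 2, here is a minimal Python sketch of the two-sided test from the summary statistics in the question. It assumes the normal model for a difference of two sample means applies (both samples have 50 observations), and it uses the rule quoted throughout the thread, SE^2 of the difference = SE^2 of one mean + SE^2 of the other. The function name `two_sample_z_test` is made up for illustration and is not from the textbook.

```python
import math


def two_sample_z_test(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sided z-test for a difference of two independent sample means.

    Hypothetical helper for illustration: works from summary statistics only.
    """
    # The variances of the two sample means add, even though the means are subtracted.
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    # Null hypothesis: the difference in population means is 0.
    z = (mean1 - mean2) / se
    # Two-sided tail area under the standard normal curve:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_value


# Las Vegas minus Phoenix, using the numbers given in question 2.
se, z, p = two_sample_z_test(15.4, 15.3, 50, 12.5, 7.63, 50)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Swapping the order of the two cities only flips the sign of z; the two-sided p-value stays the same, which is why answers computed as Phoenix minus Las Vegas and Las Vegas minus Phoenix reach the same conclusion.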
