Pstat 120B
Final Study Guide
Here is a sampling of questions to help you study for the final exam. I will draw some questions with slight changes from the problems listed here. These are suggested areas of focus for your studying for the exam. I do not guarantee that these are the only topics on the exam but do recommend studying at least all these topics to help prepare.
You should also review integration techniques such as substitution and integration by parts, the definitions of mean and variance, MGFs, and other prerequisite material from Calculus and 120A as needed.
You should also review lecture and homework material for comprehensive preparation for the exam.
In Week 10, TAs will go over some of these problems and you can also bring additional questions that you would like your TA to review.
1. Here is a partial list of items covered in the course. Also, refer to summaries in each slide deck and the first question in each homework.
(a) Partial list of topics:
• Distributions from the table booklet and common ones like Gaussian (normal), binomial, Bernoulli, geometric, Poisson, uniform, exponential, gamma, chi-square, t, F
• Definition of mean and variance of random variables, and mean and variance of linear functions of random variables
• Various methods for finding the distribution of functions of random variables (i.e., statistics)
• Mean, Variance, and approximate sampling distribution of the sample mean, sample variance, Central Limit Theorem and applications
• Point estimation and Properties of point estimators: bias, MSE, consistency, relative efficiency, sufficiency
• Interval estimation, confidence intervals
• Method of moments, Maximum likelihood method for estimation
• Rao-Blackwell theorem and MVUE
• 5 steps in a hypothesis test for a population parameter and logic of hypothesis tests.
• Large sample hypothesis tests for mean μ and assumptions
• Small sample hypothesis tests for mean μ, difference in means μ1 - μ2, variance σ² and assumptions
(b) Terminology/definitions:
• random sample
• statistic
• sampling distribution
• sample mean
• population mean
• sample variance
• population variance
• convergence in distribution
• estimator
• estimate
• estimand
• point estimator
• bias
• mean square error (MSE)
• error of estimation
• unbiased estimator
• convergence in probability
• interval estimator
• confidence interval
• confidence coefficient
• consistent estimator
• relative efficiency
• sufficient statistic
• likelihood function
• Hypothesis tests: null, alternative, test statistic and its distribution, rejection region and p-value
• Type 1 error, Type 2 error and Power of a test.
(c) Partial list of results
• mean and variance of the sample mean
• bias and MSE of the sample mean as an estimator of the population mean
• central limit theorem
• Markov’s inequality
• Chebyshev’s inequality
• weak law of large numbers
• consistency of the sample variance
• variance vanishing and unbiasedness imply consistency
• factorization theorem and sufficiency
• assumptions, test statistics and sampling distributions for different types of hypothesis tests
2. Let Y1, Y2,..., Yn be independent Bernoulli random variables with probability mass function,
(a) Calculate the moment-generating function for Y1 .
(b) Find the moment-generating function for
(c) Give the name of the distribution of W.
3. Suppose that two electronic components in the guidance system for a missile operate independently and that each has a length of life governed by the exponential distribution with mean 1 (with measurements in hundreds of hours). Find the
(a) probability density function for the average length of life of the two components
(b) mean and variance of this average, using the answer in part (a). Check your answer by computing the mean and variance, using Theorem 5.12.
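A quick simulation can serve as a sanity check on part (b). The sketch below is not a derivation: it draws many averages of two independent exponential lifetimes with mean 1 and prints their sample mean and variance, which should land near the values μ = 1 and σ²/n = 1/2 that the sample-mean results predict. The seed and number of draws are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable (arbitrary choice)


def simulate_average_lifetimes(n_draws):
    """Return n_draws averages of two independent Exponential(mean 1) lifetimes."""
    return [
        (random.expovariate(1.0) + random.expovariate(1.0)) / 2
        for _ in range(n_draws)
    ]


averages = simulate_average_lifetimes(200_000)
sim_mean = statistics.fmean(averages)
sim_var = statistics.variance(averages)

print(f"simulated mean of the average     = {sim_mean:.3f}")
print(f"simulated variance of the average = {sim_var:.3f}")
```

If the mean and variance you compute from the density in part (a) disagree noticeably with these printed values, recheck the density.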
4. Let X be a random variable with density given by
fX (x) = 2(1 − x), 0 ≤ x ≤ 1
Use the method of distribution functions (i.e., the CDF method) to find the densities in parts (a–b).
(a) Find the probability density function of U1 = 2X − 1.
(b) Find the probability density function of U2 = X².
Use the method of inverse transformations to find the densities in parts (c–d) and compare your answers to parts (a–b).
(c) Find the probability density function of U1 = 2X − 1. Compare to part (a).
(d) Find the probability density function of U2 = X². Compare to part (b).
(e) Compute E(U1), E(U2) using the derived densities fU1 (u) and fU2 (u).
(f) Find E(U1), E(U2) by first finding the moments of X and then using the properties of expectation. Do your answers match those of the previous part?
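The expectations in parts (e) and (f) can be checked numerically. The sketch below draws X by inverting its CDF, F(x) = 1 − (1 − x)² on [0, 1], and then averages U1 = 2X − 1 and U2 = X² over many draws. It assumes U2 = X² as the second transformation; the seed and sample size are arbitrary.

```python
import math
import random
import statistics

random.seed(1)  # arbitrary seed for repeatability


def draw_x():
    """Draw X with density 2(1 - x) on [0, 1] via the inverse CDF.

    F(x) = 1 - (1 - x)^2, so solving F(x) = u gives x = 1 - sqrt(1 - u).
    """
    return 1.0 - math.sqrt(1.0 - random.random())


xs = [draw_x() for _ in range(200_000)]
mean_u1 = statistics.fmean([2 * x - 1 for x in xs])  # estimate of E(U1)
mean_u2 = statistics.fmean([x * x for x in xs])      # estimate of E(U2)

print(f"simulated E(U1) = {mean_u1:.3f}")
print(f"simulated E(U2) = {mean_u2:.3f}")
```

Both printed values should agree, up to simulation error, with the answers obtained from the derived densities and from the moments of X.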
5. Suppose the morning prices (in dollars per gallon) at two neighboring gas stations Y1 and Y2 are independent random variables, each with a density function given by
On a given morning, a customer is going to buy gas from whichever station is less expensive. Find the
(a) probability density function for the price per gallon the customer will pay.
(b) expected cost per gallon that the customer will pay.
6. Let Y1, Y2, . . . , Y5 be a random sample of size 5 from a normal population with mean 0 and variance 1, and let Y6 be another independent observation from the same population. Additionally, let W ∼ χ²(5) and U ∼ χ²(4) (chi-square with 5 and 4 degrees of freedom, respectively) be two more random variables, independent of all other random variables. Find the distributions of the following random variables and explain your reasoning.
7. A car manufacturer numbers their tanks uniformly over the interval (1,θ). Suppose X1, X2,..., Xn denotes a random sample of tanks.
(a) Compute the bias and show that X̄ is a biased estimator of θ.
(b) Find a function of X̄ that is an unbiased estimator of θ.
(c) Find the MSE when X̄ is used as an estimator of θ.
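A simulation can help confirm the algebra in this problem. The sketch below assumes illustrative values θ = 10 and n = 5 (both hypothetical, not part of the problem) and estimates the bias and MSE of X̄ as an estimator of θ. It also averages the candidate 2X̄ − 1, which follows from E(X̄) = (1 + θ)/2 and is offered only as one estimator to test.

```python
import random
import statistics

random.seed(2)  # arbitrary seed for repeatability

THETA = 10.0     # hypothetical true parameter, chosen only for illustration
N = 5            # hypothetical sample size
TRIALS = 100_000

# Sample means of N iid Uniform(1, THETA) observations
means = [
    statistics.fmean([random.uniform(1.0, THETA) for _ in range(N)])
    for _ in range(TRIALS)
]

bias_sim = statistics.fmean(means) - THETA
mse_sim = statistics.fmean([(m - THETA) ** 2 for m in means])
unbiased_sim = statistics.fmean([2 * m - 1 for m in means])

print(f"simulated bias of the sample mean = {bias_sim:.3f}")
print(f"simulated MSE of the sample mean  = {mse_sim:.3f}")
print(f"simulated mean of 2*Xbar - 1      = {unbiased_sim:.3f}")
```

Plug θ = 10 and n = 5 into your formulas for the bias and MSE and compare with the printed values.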
8. Assume that Y1, Y2,..., Yn is a sample of size n from an exponential distribution with mean θ .
(a) Using the method of moment-generating functions, show that 2ΣYi/θ has a χ² distribution with 2n degrees of freedom.
(b) Use the distribution of 2ΣYi/θ above to derive a 95% confidence interval for θ.
(c) If a sample of size n = 7 yields ȳ = 4.77, use the result from part (b) to give a 95% confidence interval for θ.
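Before relying on the pivotal quantity, it can be reassuring to check it by simulation. A χ² random variable with 2n degrees of freedom has mean 2n and variance 4n, so the sketch below simulates 2ΣYi/θ for n = 7 and a hypothetical θ = 3 (chosen only for illustration) and compares the sample mean and variance with 14 and 28. Matching two moments does not prove the distributional claim; it is only a consistency check.

```python
import random
import statistics

random.seed(3)  # arbitrary seed for repeatability

THETA = 3.0     # hypothetical mean of the exponential distribution
N = 7           # sample size, as in part (c)
TRIALS = 100_000


def pivot():
    """Return 2 * sum(Y_i) / theta for one sample of N exponential draws."""
    # random.expovariate takes the rate, which is 1 / mean
    total = sum(random.expovariate(1.0 / THETA) for _ in range(N))
    return 2.0 * total / THETA


pivots = [pivot() for _ in range(TRIALS)]
pivot_mean = statistics.fmean(pivots)
pivot_var = statistics.variance(pivots)

print(f"simulated mean     = {pivot_mean:.2f} (chi-square with 2n df has mean {2 * N})")
print(f"simulated variance = {pivot_var:.2f} (chi-square with 2n df has variance {4 * N})")
```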
9. Suppose X1, X2,..., Xn are iid from N(250, 2.5²).
(a) What is a 90% confidence interval for µ when n = 36? Your answer will be in terms of X̄.
(b) What is the margin of error of the confidence interval you computed above?
(c) Determine the sample size we would need so that the length of the 95% confidence interval for µ is 1.25.
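The arithmetic in parts (b) and (c) can be checked with a few lines of code. The sketch below assumes the population standard deviation is σ = 2.5 (reading the second parameter as the variance 2.5²) and uses the standard normal quantile from Python's `statistics.NormalDist`. The helper names are illustrative only.

```python
import math
from statistics import NormalDist

SIGMA = 2.5  # assumed population standard deviation, i.e. variance 2.5^2


def z_critical(confidence):
    """Two-sided standard normal critical value for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(confidence, sigma, n):
    """Half-width of the z-interval for the mean with known sigma."""
    return z_critical(confidence) * sigma / math.sqrt(n)


def sample_size_for_length(confidence, sigma, length):
    """Smallest n whose z-interval has total length at most `length`."""
    return math.ceil((2 * z_critical(confidence) * sigma / length) ** 2)


margin_90 = margin_of_error(0.90, SIGMA, 36)
n_needed = sample_size_for_length(0.95, SIGMA, 1.25)

print(f"90% margin of error with n = 36: {margin_90:.4f}")
print(f"n needed for a 95% interval of length 1.25: {n_needed}")
```

Note that the sample size is rounded up, since rounding down would give an interval slightly longer than requested.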
10. Suppose that you want to estimate the mean pH of rainfalls in an area that suffers from heavy pollution due to the discharge of smoke from a power plant. Assume that σ is in the neighborhood of 0.5 pH and that you want your estimate to lie within 0.1 of µ with probability near 0.95. Approximately how many rainfalls must be included in your sample (one pH reading per rainfall)? Would it be valid to select all of your water specimens from a single rainfall? Explain.
11. A small amount of the trace element selenium, from 50 to 200 micrograms (µg) per day, is considered essential to good health. Suppose that independent random samples of n1 = n2 = 30 adults were selected from two regions of the United States, and a day’s intake of selenium, from both liquids and solids, was recorded for each person. The mean and standard deviation of the selenium daily intakes for the 30 adults from region 1 were ȳ1 = 167.1 µg and s1 = 24.3 µg, respectively. The corresponding statistics for the 30 adults from region 2 were ȳ2 = 140.9 µg and s2 = 17.6 µg. Find a 95% confidence interval for the difference in the mean selenium intake for the two regions. Interpret the interval.
Then set this problem up as a hypothesis test: test whether there is a difference in mean selenium intake between the two regions, using both the rejection region and the p-value.
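The numbers in this problem can be checked with a short script. The sketch below treats both samples as large (n = 30 each) and uses the normal approximation, with the sample standard deviations standing in for the population values. That is one reasonable reading of the problem, not the only possible approach. Variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary statistics from the problem statement
ybar1, s1, n1 = 167.1, 24.3, 30
ybar2, s2, n2 = 140.9, 17.6, 30

diff = ybar1 - ybar2
std_error = math.sqrt(s1**2 / n1 + s2**2 / n2)

# 95% large-sample confidence interval for mu1 - mu2
z_crit = NormalDist().inv_cdf(0.975)
ci = (diff - z_crit * std_error, diff + z_crit * std_error)

# Two-sided z test of H0: mu1 - mu2 = 0
z_stat = diff / std_error
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"95% CI for mu1 - mu2: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"z = {z_stat:.2f}, reject H0 at 5% if |z| > {z_crit:.2f}")
print(f"two-sided p-value = {p_value:.2g}")
```

The rejection-region and p-value approaches should lead to the same decision, and that decision should agree with whether the interval contains 0.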
12. Do SAT scores for high school students differ depending on the students’ intended field of study? Fifteen students who intended to major in engineering were compared with 15 students who intended to major in language and literature. Given in the accompanying table are the means and standard deviations of the scores on the verbal and mathematics portion of the SAT for the two groups of students:
Note: You may assume SAT scores are normally distributed and that the two samples are drawn from populations with equal variance.
(a) Construct a 95% confidence interval for the difference in average verbal scores of students majoring in engineering and of those majoring in language/literature.
(b) Construct a 95% confidence interval for the difference in average math scores of students majoring in engineering and of those majoring in language/literature.
(c) Interpret the results obtained in parts (a) and (b).
(d) What assumptions are necessary for the methods used previously to be valid?
(e) Try setting this problem up as a hypothesis test and perform an appropriate hypothesis test using Rejection Region and p-values as well.
13. Shear strength measurements derived from unconfined compression tests for two types of soils gave the results shown in the following table (measurements in tons per square foot).
Soil Type I     Soil Type II
n1 = 30         n2 = 35
ȳ1 = 1.65       ȳ2 = 1.43
s1 = 0.26       s2 = 0.22
(a) Do the soils appear to differ with respect to average shear strength, at the 1% significance level?
(b) Construct a 99% confidence interval for the difference in mean shear strengths for two soil types. Interpret this interval. Based on this interval, should the null hypothesis be rejected?
(c) Try setting this problem up as a hypothesis test and perform an appropriate hypothesis test using Rejection Region and p-values as well.
14. Operators of gasoline-fueled vehicles complain about the price of gasoline in gas stations. According to the American Petroleum Institute, the federal gas tax per gallon is constant (18.4¢ as of January 13, 2005), but state and local taxes vary from 7.5¢ to 32.10¢ for n = 18 key metropolitan areas around the country. The total tax per gallon for gasoline at each of these 18 locations is given next. Suppose that these measurements constitute a random sample of size 18:
42.89 53.91 48.55 47.90 47.73 46.61
40.45 39.65 38.65 37.95 36.80 35.95
35.09 35.04 34.95 33.45 28.99 27.45
Is there sufficient evidence to claim that the average per gallon gas tax is less than 45¢?
(a) Use the t table in the appendix to bound the p-value associated with the test.
(b) Construct a 95% confidence interval for the average per gallon gas tax in the United States.
(c) Is there sufficient evidence to claim that the average per gallon gas tax is less than 45¢? Is the conclusion of the hypothesis test the same based on the calculations in the previous two parts? Explain.
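The summary statistics for this problem are tedious by hand, so a short script is a useful check. The sketch below computes the sample mean, sample standard deviation, and the one-sample t statistic for H0: µ = 45 against Ha: µ < 45. The two critical values for 17 degrees of freedom are hard-coded from a standard t table rather than computed, so treat them as table lookups.

```python
import math
import statistics

# Total tax per gallon (cents) at the 18 locations in the problem
taxes = [
    42.89, 53.91, 48.55, 47.90, 47.73, 46.61,
    40.45, 39.65, 38.65, 37.95, 36.80, 35.95,
    35.09, 35.04, 34.95, 33.45, 28.99, 27.45,
]

n = len(taxes)
ybar = statistics.fmean(taxes)
s = statistics.stdev(taxes)           # sample standard deviation (n - 1 divisor)
std_error = s / math.sqrt(n)

MU_0 = 45.0
t_stat = (ybar - MU_0) / std_error    # compare with a t distribution, n - 1 = 17 df

# Critical values copied from a standard t table for 17 df (not computed here)
T_05_17 = 1.740    # upper-tail area 0.05
T_025_17 = 2.110   # upper-tail area 0.025

reject_at_5pct = t_stat < -T_05_17    # lower-tail test
ci_95 = (ybar - T_025_17 * std_error, ybar + T_025_17 * std_error)

print(f"n = {n}, ybar = {ybar:.3f}, s = {s:.3f}")
print(f"t = {t_stat:.3f}, reject H0 at the 5% level: {reject_at_5pct}")
print(f"95% CI for the mean: ({ci_95[0]:.2f}, {ci_95[1]:.2f})")
```

To bound the p-value in part (a), locate |t| between adjacent columns of the 17 df row of the t table.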
15. The EPA has set a maximum noise level for heavy trucks at 83 decibels (dB). The manner in which this limit is applied will greatly affect the trucking industry and the public. One way to apply the limit is to require all trucks to conform to the noise limit. A second but less satisfactory method is to require the truck fleet’s mean noise level to be less than the limit. If the latter rule is adopted, variation in the noise level from truck to truck becomes important because a large value of σ² would imply that many trucks exceed the limit, even if the mean fleet level were 83 dB. A random sample of six heavy trucks produced the following noise levels (in decibels):
85.4 86.8 86.1 85.3 84.8 86.0.
If noise levels are normally distributed, use the above data to construct a 90% confidence interval for σ², the variance of the truck noise-emission readings. Interpret your results.
Try setting this problem up as a hypothesis test and perform an appropriate hypothesis test using Rejection Region. Can you use p-values here as well?
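The interval for σ² relies on the fact that (n − 1)S²/σ² has a χ² distribution with n − 1 degrees of freedom under normality. The sketch below carries out the arithmetic for the six readings. The two χ² quantiles for 5 degrees of freedom are hard-coded from a standard chi-square table, so they are table lookups rather than computed values.

```python
import statistics

# Noise levels (dB) for the six sampled trucks
noise = [85.4, 86.8, 86.1, 85.3, 84.8, 86.0]

n = len(noise)
s_squared = statistics.variance(noise)   # sample variance (n - 1 divisor)

# Chi-square table values for 5 degrees of freedom (copied from a table)
CHI2_UPPER_05 = 11.0705    # value with area 0.05 to its right
CHI2_UPPER_95 = 1.145476   # value with area 0.95 to its right

ci_90 = (
    (n - 1) * s_squared / CHI2_UPPER_05,
    (n - 1) * s_squared / CHI2_UPPER_95,
)

print(f"s^2 = {s_squared:.4f}")
print(f"90% CI for sigma^2: ({ci_90[0]:.3f}, {ci_90[1]:.3f})")
```

Notice that the interval is not symmetric about s², because the χ² distribution is skewed.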
16. Let X1, X2,..., Xn be iid Uniform(0,θ), and let Yn = max{X1, X2,..., Xn}. Show that Yn is a consistent estimator of θ.
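One route to consistency is through the CDF of the maximum: since Yn never exceeds θ, for 0 < ε < θ we have P(|Yn − θ| > ε) = P(Yn < θ − ε) = ((θ − ε)/θ)^n. The sketch below evaluates this probability for growing n, using hypothetical values θ = 1 and ε = 0.05 chosen only for illustration, to show how quickly it shrinks.

```python
THETA = 1.0      # hypothetical parameter value, for illustration only
EPSILON = 0.05   # hypothetical tolerance, must satisfy 0 < EPSILON < THETA


def prob_far_from_theta(n, theta=THETA, eps=EPSILON):
    """P(|Y_n - theta| > eps) for the maximum of n iid Uniform(0, theta) draws.

    Because Y_n <= theta, this equals P(Y_n < theta - eps) = ((theta - eps) / theta) ** n.
    """
    return ((theta - eps) / theta) ** n


sizes = [10, 100, 1000]
probs = [prob_far_from_theta(n) for n in sizes]

for n, p in zip(sizes, probs):
    print(f"n = {n:5d}: P(|Yn - theta| > {EPSILON}) = {p:.3g}")
```

The numbers only illustrate the limit; the written solution still needs the argument that the probability tends to 0 for every fixed ε > 0.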
17. Let Y1, Y2,..., Yn denote a random sample from the uniform distribution over the interval (0,θ).
(a) Show that Y(n) = max{Y1, Y2,..., Yn} is sufficient for θ .
(b) Use Y(n) to find an MVUE of θ .
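A simulation can be used to test a candidate answer to part (b). The sketch below assumes hypothetical values θ = 4 and n = 10 and checks the candidate ((n + 1)/n)·Y(n), which comes from E[Y(n)] = nθ/(n + 1). For comparison it also computes 2Ȳ, another unbiased estimator of θ, so the two variances can be contrasted.

```python
import random
import statistics

random.seed(4)  # arbitrary seed for repeatability

THETA = 4.0     # hypothetical true parameter, for illustration only
N = 10          # hypothetical sample size
TRIALS = 50_000

max_based = []   # candidate estimator (N + 1) / N * max
mean_based = []  # comparison estimator 2 * sample mean

for _ in range(TRIALS):
    sample = [random.uniform(0.0, THETA) for _ in range(N)]
    max_based.append((N + 1) / N * max(sample))
    mean_based.append(2 * statistics.fmean(sample))

mean_max_based = statistics.fmean(max_based)
var_max_based = statistics.variance(max_based)
var_mean_based = statistics.variance(mean_based)

print(f"mean of (n+1)/n * Y(n) = {mean_max_based:.3f} (theta = {THETA})")
print(f"variance of (n+1)/n * Y(n) = {var_max_based:.3f}")
print(f"variance of 2 * Ybar       = {var_mean_based:.3f}")
```

A smaller simulated variance is consistent with the MVUE claim but does not prove it; the proof goes through sufficiency and the Rao-Blackwell theorem.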
18. Suppose that Y1, Y2,..., Yn constitute a random sample from a uniform distribution with probability density function
(a) Obtain the MLE of θ .
(b) Obtain the MLE for the variance of the underlying distribution.
19. Suppose that Y1,..., Yn denote a random sample from the Poisson distribution with mean λ .
(a) Find the method-of-moments estimate, λ̂_MOM, for λ.
(b) Find the maximum likelihood estimate, λ̂_MLE, for λ.
(c) Find the expected value and variance of λ̂_MLE.
(d) Show that λ̂_MLE is consistent for λ.
(e) What is the MLE for P(Y = 0) = e^(−λ)?
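A numerical maximization can be used to check the calculus in part (b). The sketch below uses a small made-up data set (hypothetical counts, not from the course) and maximizes the Poisson log-likelihood over a grid of λ values; the maximizer should coincide with the sample mean. For part (e), it applies the invariance property of MLEs and plugs the estimate into e^(−λ).

```python
import math

# Hypothetical Poisson counts, invented purely for illustration
counts = [2, 0, 3, 1, 4, 2, 1, 0, 2, 3]


def log_likelihood(lam, data):
    """Poisson log-likelihood: sum of y*log(lam) - lam - log(y!)."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in data)


# Grid search over lambda in (0, 10] with step 0.001
grid = [k / 1000 for k in range(1, 10_001)]
lam_grid = max(grid, key=lambda lam: log_likelihood(lam, counts))

ybar = sum(counts) / len(counts)
p0_mle = math.exp(-lam_grid)   # MLE of P(Y = 0) by the invariance property

print(f"grid maximizer of the log-likelihood = {lam_grid:.3f}")
print(f"sample mean                          = {ybar:.3f}")
print(f"plug-in estimate of P(Y = 0)         = {p0_mle:.4f}")
```

A grid search is used here only because it needs no extra libraries; setting the derivative of the log-likelihood to zero gives the same answer in closed form.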
20. Suppose that X1,..., Xn is a random sample from an exponential distribution with mean β .
(a) Show that X̄ is a sufficient statistic.
(b) Construct an unbiased estimator, β̂, for β based on the sufficient statistic X̄.
(c) Find the variance of this estimator.
(d) Is the estimator from (b) consistent?
(e) Is the estimator the MVUE?
(f) Find the MLE of the population variance β².
21. Let Y1, Y2,..., Yn denote a random sample from the probability density function
(a) Find the MLE for θ. Hint: the support involves θ.
(b) Is the MLE you found a sufficient statistic?
22. Let Y1, Y2,..., Yn denote a random sample from the probability density function
(a) Find an estimator for θ by using method of moments.
(b) Is this estimator a sufficient statistic for θ?