辅导MATH 304 5 Assignment、讲解R function、辅导讲解R程序、Matlab语言程序讲解留学生
2018-09-24
Lab Assignment #1 - EstimationDue: Monday, 9/24 at the beginning of class.Disclaimer: Labs typically take more time to complete than a regular homework assignment.In this lab you will:• Explore bias & variance;• Numerically calculate MLE’s.Format:The lab has two sections which will walk you through each topic and example code (there is a .R filecontaining all of code provided in this lab). At the end of each section, you will find a box with a labquestion. This is to be completed for your homework due Monday 9/24. You must include all of your code(with questions properly labeled and appropriate comments), your answers to any questions (as comments)and relevant graphs.Remember: to get the help file for any R function, just type ?functionname in the console.Section #1: Bias & VariancePart INow that we are all experts at simulating data and writing for loops, we will apply this knowledge to somedart throwing.Recall from lecture:•ˆθ is a random variable.It is a function of X1, . . . , Xn and will therefore produce a different estimate from each different sample.• As a RV, ˆθ has a sampling distribution.This is the distribution of values of ˆθ produced from all possible samples of the same size from thepopulation.• We would like ˆθ to be unbiased.The sampling distribution of ˆθ is centered at θ.• We would like ˆθ to have small variance.The sampling distribution of ˆθ is concentrated as much as possible around θ.We are going to simulate data to provide estimates of θ that have varying properties (low bias/low variance,high bias/high variance, low bias/high variance, high bias/low variance).To begin, we must first create the target. Run the following code:t = seq(0, 2*pi, length = 1000)# Circle centered at (0,0) with radius 2coords = t(rbind(sin(t)*2, cos(t)*2))plot(coords, type = ’l’, xlab = "", ylab = "", xaxt = ’n’, yaxt = ’n’,A. Flynt, Department of Mathematics, Bucknell University, MATH 304 1xlim = c(-3, 3), ylim = c(-3, 3))# Circle centered at (0,0) with radius 1.75coords.out = t(rbind(sin(t)*1.75, cos(t)*1.75))lines(coords.out, type = ’l’)# Circle centered at (0,0) with radius 1c.1 = t(rbind(sin(t)*1, cos(t)*1))lines(c.1, type = ’l’)# Circle centered at (0,0) with radius .75c.2 = t(rbind(sin(t)*.75, cos(t)*.75))lines(c.2, type = ’l’)# Circle centered at (0,0) with radius .25coords.in = t(rbind(sin(t)*.25, cos(t)*.25))lines(coords.in, type = ’l’)# Circle centered at (0,0) with radius .1coords.in.in = t(rbind(sin(t)*.1, cos(t)*.1))# Color in center circle greenpolygon(coords.in.in, col = ’green’)Notice that the outer circle of the target has a radius of 2. When simulating data, we are going to makedraws from two separate Uniform distributions. One that represents our x-coordinate and one thatrepresents our y-coordinate. If we want to produce estimates centered on the bullseye (low bias) that couldoccur at the far edges of the target (high variance), then we will simulate data from two Uniformdistributions with parameters -2 and 2 as follows:# Initiate vectors for (x, y) coordinatesx.ran = y.ran = NULLfor(i in 1:10){# Randomly draw a dart from a Uniform(-2, 2).# Here, left of the bullseye is negative and right of the bullseye is positive.x.ran[i] = runif(1, -2, 2)# Here, above the bullseye is positive and below the bullseye is negative.y.ran[i] = runif(1, -2, 2)# Draw the dart at the (x,y) coordinates from the previous drawpoints(x.ran, y.ran, pch="X", cex=2, col = ’black’)# Wait 2 seconds between each throwSys.sleep(2)}A. Flynt, Department of Mathematics, Bucknell University, MATH 304 2You now have 10 black X’s on your target that represent observations with low bias and high variance.We can calculate the bias and variance as follows:# Bias of throws. Just the sum of the average in each (x,y) directionBias.black = mean(x.ran) + mean(y.ran)# Variance of throws. Just the sum of the variance of each (x,y) directionVariance.black = var(x.ran) + var(y.ran)Lab Question #1You will now generate darts for the scenarios of low variance/low bias (color red), low variance/high bias(color blue), high variance/high bias (color orange). For each set of darts, calculate bias and varianceusing similar names. Then add the following legends to your target:# Legend for bias of each scenariolegend("topleft", title = "Bias", legend=round(c(Bias.black, Bias.red,Bias.blue, Bias.orange), 2), col = c("black", "red", "blue", "orange"), pch="X")# Legend for variance of each scenariolegend("topright", title = "Variance", legend = round(c(Variance.black,Variance.red, Variance.blue, Variance.orange), 2), col = c("black", "red","blue", "orange"), pch="X")Part IIWe are now going to run some simulations based on the example that we used in the “Properties of PointEstimators” packet from lecture.Let X1, X2, . . . , Xniid∼ Uniform(0, θ). Recall, that ˆθMOM = 2X¯ and ˆθMLE∗ =n+1n X(n) are two unbiasedestimators for θ.We are going to run a simulation to see how these two estimators perform.The idea is to draw random samples of size 25 from a Uniform(0, 12). For each sample, compute 2¯x and26/25 × max{x1, x2, . . . , xn} and record those values. Repeat this 1000 times.Lab Question #2(a) Write and run the code for the simulation described above.(b) Create two histograms of the 1000 replications, one for each estimate.Make sure that your histograms have the same x and y axes (this can be done with the argumentsxlim = c(a, b) and ylim = c(d, e), where you have to decide on a, b, d, e).(c) Which estimator exhibits the smallest amount of variability? Is that what you expected? Why orwhy not?A. Flynt, Department of Mathematics, Bucknell University, MATH 304 3Section #2: Calculating MLE’sIn this final section, we will calculate maximum likelihood estimators for particular distributions using R.As always, the first thing to do is write down the log-likelihood function. We have not spent time on creatingfunctions yet, so I will just show you the syntax for how this works:# You will first give your function a name.# Then specify the necessary parameters (pars) and data (object) for your function.# You will open the function ({)# Then you will specify the form of your log-likelihood (logl)# You need to tell the function what it should return when you use it# Note: the function that we will use to optimize our likelihood only minimizes# functions, therefore we will return the negative log-likelihood# Finally, close your function (})name = function(pars, object){logl = loglikelihood functionreturn(-logl)}ExampleLet X1, X2, . . . , Xniid∼ Poisson(λ). Then, the log-likelihood function is given by:`(λ) = Xni=1xiln(λ) − nλ −Xni=1ln(xi!).Since the last term does not include the parameter λ, it can be safely ignored for optimization. Thus, thekernel of the log-likelihood function is:`(λ) ∝Xni=1xiln(λ) − nλ.We can program this function using the following syntax:poisson.lik = function(lambda, x){logl = sum(x)*log(lambda) - length(x)*lambdareturn(-logl)}Once the log-likelihood function has been declared, then the nlm command can be invoked. The minimumspecification of this command is nlm(log-likelihood, starting values, data). Here, starting valuesis your (vector of) starting value(s) for the optimization of your parameter.We will now simulate a data set of 1000 observations from a Poisson(λ = 3.25) and numerically find themaximum likelihood estimate.# Set the seed so that we all generate the same data set.set.seed(288)poiss.data = rpois(1000, 3.25)A. Flynt, Department of Mathematics, Bucknell University, MATH 304 4# Note that we are using 1 as the starting value.out = nlm(poisson.lik, 1, x = poiss.data)pois.mle = out$estimatepois.mle# [1] 3.242How did we do? Not too bad considering the true value of λ was 3.25. Is this the value that you wereexpecting? Why or why not?Lab Question #3Recall that on HW #9, I asked you to write down (but not solve) the equations that you would need tomaximize to find the MLE’s for (α, β) from a Gamma distribution. These partial derivatives are notsomething that we would want to do by hand (taking the derivative of the gamma function is not toofun), so instead, we will find the MLE’s numerically.(a) Write the code for gamma.lik, the log-likelihood function of a Gamma(α, β). Note, that thisfunction will now have a vector for the pars argument.(b) Read in “gamma data.txt” from Moodle and optimize this data with starting values of (1, 1) for(α, β).Note: if you use read.table(), R will read in the data as a data frame and your likelihoodfunction will not be able to handle an object of that class. To avoid this issue, you will have to pulloff the x column from the data and pass that into your likelihood function.(c) What are your MLE’s for (α, β)?A. Flynt, Department of Mathematics, Bucknell University, MATH 304 5