2025-05-20 Assignment help: STAT3600 Statistical Analysis Assignment 2, debugging R programs

STAT3600

Statistical Analysis

Assignment 2 (submit Q4, Q5)

Deadline: 14 Mar, 2024

Note: (1) Numeric values should be presented to 4 decimal places. (2) Do not use a computer for Q1 to Q4, and show the intermediate steps.

1.   A psychiatrist wants to know whether the level of pathology (Y) in psychotic patients 6 months after treatment can be predicted with reasonable accuracy from knowledge of pretreatment symptom ratings of thinking disturbance (X1) and hostile suspiciousness (X2). The data collected on 15 patients are stored in ‘pathology.dat’. Consider a multiple linear regression model with Y as the dependent variable and X1 and X2 as the independent variables.

a.   Write down the regression model. State clearly the assumptions.

b.   Find the least squares estimates of the regression coefficients. Interpret the results.

c.   Construct the ANOVA table and hence test whether there is a regression of Y on X1 and X2 at the 5% level of significance.

d.   Estimate the covariance matrix of the estimates.

e.   Find a 95% confidence interval for each partial regression coefficient.

f.   Test whether β1 = 32 or not at the 5% level of significance.

g.   Test whether β2  = -10 or not at the 5% level of significance.

h.   Calculate R².

i.    Considering a case with x1 = 3 and x2 = 6, find the predicted level of pathology, the confidence interval for the mean response and the prediction interval, each at the 95% confidence level.
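Parts (b), (d) and (h) all follow from the matrix form b = (X'X)^{-1}X'Y. The question asks for hand calculation, so the following is only a sketch for checking the arithmetic afterwards. It is written in plain Python as an assumption (the assignment does not prescribe software here), the helper names `inverse` and `ols` are made up for illustration, and the five data rows are hypothetical stand-ins, not values from ‘pathology.dat’.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def ols(x_rows, y):
    """Least squares fit of y on the regressors in x_rows plus an intercept."""
    X = [[1.0] + [float(v) for v in r] for r in x_rows]  # prepend intercept column
    Y = [[float(v)] for v in y]
    n, p = len(X), len(X[0])
    xtx_inv = inverse(matmul(transpose(X), X))           # (X'X)^{-1}
    b = matmul(xtx_inv, matmul(transpose(X), Y))         # b = (X'X)^{-1} X'Y
    fitted = [row[0] for row in matmul(X, b)]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ybar = sum(y) / n
    sst = sum((yi - ybar) ** 2 for yi in y)
    mse = sse / (n - p)                                  # unbiased estimate of sigma^2
    cov = [[mse * v for v in row] for row in xtx_inv]    # estimated Cov(b) = MSE (X'X)^{-1}
    return {"beta": [r[0] for r in b], "sse": sse, "mse": mse,
            "r2": 1 - sse / sst, "cov": cov}

# Hypothetical data for illustration only (NOT the pathology data).
demo_x = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
demo_y = [1.3, 3.8, 2.1, 6.2, 2.9]
fit = ols(demo_x, demo_y)
print([round(v, 4) for v in fit["beta"]], round(fit["r2"], 4))
```

The standard error of each coefficient is the square root of the matching diagonal entry of `fit["cov"]`; the t and F critical values needed for parts (c), (e), (f), (g) and (i) still have to come from tables.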

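2.   Consider a general linear hypothesis

$$H_0: C\beta = d \quad \text{versus} \quad H_1: C\beta \neq d,$$

where C is of dimensions r × p with rank r and d is of dimensions r × 1. Prove that

a.   under the reduced model, the least squares estimator is

$$\hat{\beta}_r = \hat{\beta} - (X'X)^{-1}C'\,[C(X'X)^{-1}C']^{-1}(C\hat{\beta} - d),$$

where $\hat{\beta} = (X'X)^{-1}X'Y$ is the least squares estimator under the full model;

b.   the difference SSEr − SSEf can be expressed as

$$SSE_r - SSE_f = (C\hat{\beta} - d)'\,[C(X'X)^{-1}C']^{-1}(C\hat{\beta} - d).$$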
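One possible route to both parts, sketched under the assumption that the full model is Y = Xβ + ε with X of full column rank p and that the hypothesis is H0: Cβ = d, is to minimize the residual sum of squares subject to the constraint with a Lagrange multiplier λ (a symbol introduced here for illustration):

```latex
\begin{aligned}
L(\beta,\lambda) &= (Y - X\beta)'(Y - X\beta) + 2\lambda'(C\beta - d),\\
\frac{\partial L}{\partial \beta} = 0 \;&\Rightarrow\; \hat\beta_r = \hat\beta - (X'X)^{-1}C'\lambda,
  \qquad \hat\beta = (X'X)^{-1}X'Y,\\
C\hat\beta_r = d \;&\Rightarrow\; \lambda = [C(X'X)^{-1}C']^{-1}(C\hat\beta - d),\\
SSE_r - SSE_f &= (\hat\beta - \hat\beta_r)'X'X(\hat\beta - \hat\beta_r)
  = (C\hat\beta - d)'[C(X'X)^{-1}C']^{-1}(C\hat\beta - d).
\end{aligned}
```

The first equality in the last line uses X'(Y − Xβ̂) = 0, so the cross term vanishes when SSEr is expanded around β̂. A full proof should also justify that C(X'X)^{-1}C' is invertible, which follows from C having rank r.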

3.   Consider a multiple linear regression model

yi = β0 + β1xi1 + … + βpxip + εi

where xij are constants, βj are parameters, εi are iid N(0, σ²) and i = 1, … , n. A weighted least squares estimator for βj is obtained by minimizing

$$\sum_{i=1}^{n} w_i \,(y_i - \beta_0 - \beta_1 x_{i1} - \dots - \beta_p x_{ip})^2$$

where wi are some predefined known constant values and Prove that the estimators of the regression coefficients are given as

and the variance-covariance matrix of the estimator is

where W is a diagonal matrix with diagonal entries w1, … , wn.
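A sketch of the matrix argument, assuming the criterion is the weighted sum of squares Σ wi (yi − β0 − β1xi1 − … − βpxip)², written as Q(β) with X the n × (p + 1) design matrix (notation introduced here), and assuming X'WX is invertible:

```latex
\begin{aligned}
Q(\beta) &= (Y - X\beta)'W(Y - X\beta),\\
\frac{\partial Q}{\partial \beta} = -2X'W(Y - X\beta) = 0 \;&\Rightarrow\; \hat\beta_w = (X'WX)^{-1}X'WY,\\
\operatorname{Var}(\hat\beta_w) &= (X'WX)^{-1}X'W\,\operatorname{Var}(Y)\,WX\,(X'WX)^{-1}.
\end{aligned}
```

With Var(Y) = σ²I, as the iid assumption states, the last line becomes σ²(X'WX)^{-1}X'W²X(X'WX)^{-1}. It collapses to σ²(X'WX)^{-1} only under the different assumption Var(εi) = σ²/wi, that is, Var(Y) = σ²W^{-1}.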

4.    Do not use a computer. You are given the following matrices computed for a regression analysis Y = β0 + β1X1 + β2X2 + ε.

The matrices are properly ordered according to the regression function given above.

a.   Calculate the LSE of the regression coefficients. Describe the effects of the regressors on the response variable quantitatively.

b.   Calculate SSE and MSE.

c.   Calculate the standard errors of the estimates.

d.   Calculate a 90% confidence interval for each of β1 and β2.

e.   Construct an ANOVA table. Test whether there is a regression of Y on X1 and X2 at the 5% level of significance.

f.   Calculate R². Comment on the goodness of fit of the model.

g.   Test at the 5% level of significance whether each of X1 and X2 is effective.

h.   Test at the 5% level of significance whether β1 + 2β2 = 0.

i.    Test at the 5% level of significance whether β1 + 2β2 = 0 and β1 + β2 = 1, simultaneously.

j.    Estimate the mean of Y when (X1, X2) = (1, -1). Construct a 95% confidence interval for the estimate.

k.   Estimate the means of Y for two cases where (X1, X2) = (1, -1) and (X1, X2) = (-1, 0.5). Construct 90% simultaneous confidence intervals based on (i) Bonferroni’s method and (ii) Scheffé’s method.
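The question is meant to be worked by hand, but the chain of calculations in parts (a) to (c) and (h) can be sketched in a few lines for checking afterwards. The sketch assumes the given matrices are (X'X)^{-1}, X'Y and Y'Y together with the sample size n; the function names and every number below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def summary_from_matrices(xtx_inv, xty, yty, n):
    """Return (b, SSE, MSE, standard errors) from (X'X)^{-1}, X'Y, Y'Y and n."""
    p = len(xty)                          # number of coefficients, intercept included
    b = matvec(xtx_inv, xty)              # b = (X'X)^{-1} X'Y
    sse = yty - dot(b, xty)               # SSE = Y'Y - b'X'Y
    mse = sse / (n - p)
    se = [math.sqrt(mse * xtx_inv[j][j]) for j in range(p)]
    return b, sse, mse, se

def t_for_linear_combination(c, b, xtx_inv, mse, target=0.0):
    """t statistic for H0: c'beta = target, with n - p degrees of freedom."""
    est = dot(c, b)
    se = math.sqrt(mse * dot(c, matvec(xtx_inv, c)))
    return (est - target) / se

# Hypothetical inputs for illustration only (NOT the matrices of the assignment).
XTX_INV = [[0.1, 0.0, 0.0],
           [0.0, 0.5, 0.0],
           [0.0, 0.0, 0.25]]
XTY = [20.0, 4.0, -8.0]
YTY = 500.0
N = 13

b, sse, mse, se = summary_from_matrices(XTX_INV, XTY, YTY, N)
t_h = t_for_linear_combination([0.0, 1.0, 2.0], b, XTX_INV, mse)  # H0: beta1 + 2*beta2 = 0
print([round(v, 4) for v in b], round(sse, 4), round(mse, 4), round(t_h, 4))
```

The same `t_for_linear_combination` call with c = (1, x1, x2) gives the standardized estimate of a mean response, which is the building block for parts (j) and (k); the Bonferroni and Scheffé multipliers themselves come from t and F tables.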

5.    This study aimed to explore the relationship between aggravated insomnia and COVID-19-induced psychological impact on the public, lifestyle changes, and anxiety about the future. The data are stored in ‘insomnia.csv’ and the variables are given as follows.

spiegel: Spiegel Sleep Questionnaire score, 0 – 42. The higher the score, the more severe the insomnia.

fcv: FCV-19S, level of fear of COVID-19, 0 – 35. The higher the score, the greater the fear of COVID-19.

sas: Severity of an individual’s anxiety status, 0 – 80. The higher the score, the more severe the anxiety.

sds: Severity of an individual’s depression status, 0 – 80. The higher the score, the more severe the depression.

age: 18 or above.

a.   Formulate a multiple linear regression model for the dataset, using spiegel as the response and the remaining variables as regressors.

b.   Calculate the LSEs of the regression coefficients and their respective standard errors.

c.   Test the significance of each regression coefficient at the 5% level of significance.

d.   Construct an ANOVA table and test whether there is a regression of spiegel on the regressors at the 5% level of significance.

e.   Calculate the R² statistic for the model. Do you think the model is adequate to explain the variation of severity of insomnia among the subjects under study?

f.   Construct 90% confidence intervals for the regression coefficients.

g.   Describe how the significant regressors affect the severity of insomnia.

h.   Test at the 5% level of significance whether both the coefficients of sds and age are zero.

i.    Test at the 5% level of significance whether the coefficients of sas and sds are the same.

j.   Predict the value of spiegel for an individual with the following values. Construct a 95% prediction interval.

fcv   sas   sds   age
12    20    15    45
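As a starting point for the computer work in this question, the design matrix for part (a) can be assembled as in the sketch below. It assumes that ‘insomnia.csv’ has a header row whose column names match the variable names listed above (spiegel, fcv, sas, sds, age); that layout is not confirmed by the assignment, and the function names `load_design` and `new_case_row` are hypothetical. Python is used here only for illustration.

```python
import csv

REGRESSORS = ["fcv", "sas", "sds", "age"]

def load_design(path, response="spiegel", regressors=REGRESSORS):
    """Read the CSV and return (X, y), where X has a leading column of ones."""
    X, y = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):     # assumes a header row with these names
            X.append([1.0] + [float(row[name]) for name in regressors])
            y.append(float(row[response]))
    return X, y

def new_case_row(fcv, sas, sds, age):
    """Row of the design matrix for one new individual, in the same column order."""
    return [1.0, float(fcv), float(sas), float(sds), float(age)]
```

Calling `load_design("insomnia.csv")` would give the X and y needed for b = (X'X)^{-1}X'Y, and `new_case_row(12, 20, 15, 45)` gives the vector for the individual in part (j). The joint tests in parts (h) and (i) are general linear hypotheses Cβ = d, with C selecting the sds and age coefficients in (h) and contrasting the sas and sds coefficients in (i).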