
2023-11-22 Linear Models Computational Statistics & Probability

Computational Statistics & Probability

Problem Set 2 - Linear Models

Due: 23:59:59 22.nov.2023

Fall 2023

Instructions

Assignments must be submitted through Canvas. See the course Canvas page for policies covering collaboration, acceptable file formats (.Rmd & .pdf), and late submissions. Completed assignments must include executable code (.Rmd) and a corresponding knitted markdown file (.pdf). An R Markdown cheat sheet is available.

1. Multiple Regression & Causal Models

Return to the Howell1 dataset and consider the causal relationship between age and weight in children. Let’s define a child as anyone younger than 13 and assume that age influences weight directly and through age-related physical changes that occur during development. Each child’s height will serve as a measured proxy for these unmeasured physical attributes. We summarize this causal background knowledge by the following DAG:

[DAG: A → H, A → W, H → W]

where A, H, and W represent the random variables age, height, and weight, respectively.
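To make the phrase "total causal effect" concrete, the DAG can be turned into a small generative simulation. The sketch below uses base R only, with ordinary least squares (`lm`) standing in for the Bayesian regression the problems ask for. Every coefficient and noise level is a made-up assumption for illustration, not an estimate from Howell1.

```r
# Synthetic children simulated from the DAG A -> H, A -> W, H -> W.
# All numeric values below are illustrative assumptions, not Howell1 estimates.
set.seed(1)
n <- 5000
b_AH <- 5.0   # assumed cm of height gained per year of age
b_HW <- 0.2   # assumed kg of weight gained per cm of height
b_AW <- 0.5   # assumed direct kg of weight gained per year of age

A <- runif(n, 0, 13)                                    # age in years
H <- 50 + b_AH * A + rnorm(n, 0, 5)                     # height depends on age
W <- 3.5 + b_AW * A + b_HW * (H - 50) + rnorm(n, 0, 1)  # weight depends on age and height

# Total effect of A on W: regress W on A alone, leaving the path through H open.
total_effect <- unname(coef(lm(W ~ A))["A"])

# Direct effect of A on W: also condition on H, which blocks the path A -> H -> W.
direct_effect <- unname(coef(lm(W ~ A + H))["A"])

# Direct path plus indirect path, computed from the assumed coefficients.
implied_total <- b_AW + b_AH * b_HW
```

In this synthetic setting, `total_effect` should land close to `implied_total`, while `direct_effect` should land close to `b_AW`. The gap between the two is the part of the age effect that flows through height.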

a) What is the total causal effect of year-by-year growth of !Kung children on their weight? To answer this question, construct a linear regression (m1a) to estimate the total causal effect of each year of growth on a !Kung child’s weight. Assume that average birth weight is 3.5 kg. Use prior predictive simulation to assess the implications of your priors.
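As a rough illustration of what prior predictive simulation involves, the sketch below draws intercepts and slopes from placeholder priors and computes the regression lines they imply, before any data are used. It uses base R only; the specific prior distributions and their parameters are assumptions chosen for illustration, not recommended answers.

```r
# Prior predictive simulation for a linear model W = a + b * A.
# The priors below are placeholder assumptions chosen for illustration only.
set.seed(2)
n_lines <- 100
age_seq <- 0:12

a <- rnorm(n_lines, mean = 3.5, sd = 1)         # intercept: weight at age 0, centred on 3.5 kg
b <- rlnorm(n_lines, meanlog = 0, sdlog = 0.5)  # slope: kg per year, positive by construction

# One row per prior draw, one column per age: the expected weight implied by that draw.
prior_mu <- a + outer(b, age_seq)

# Range of expected weights the priors imply at age 12.
range_at_12 <- range(prior_mu[, length(age_seq)])
```

Each row of `prior_mu` can be drawn as a line over `age_seq` to check whether the priors permit implausible growth curves, such as children who lose weight with age or who are far heavier than any child could be.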

b) What is the total causal effect of height on weight? Construct a linear regression (m1b) to estimate the total causal effect of height on a !Kung child’s weight. Use prior predictive simulation to assess the implications of your priors.

c) After knowing the age of a !Kung child, what additional value is there in also knowing the child’s height? Conversely, after knowing the height of a !Kung child, what additional value is there in also knowing the child’s age?

2. Causal Influence with Categorical Variables

The causal relationship between age and weight might be different for girls and boys.

a) To investigate whether this is so, construct a single linear regression (m2) with a categorical variable for sex to estimate the total causal effect of age on weight separately for !Kung boys and girls. Plot your data and overlay the two regression lines, one for girls and one for boys.


Hint: You should construct an index variable S for sex by changing the coding of d$male from {0, 1} to {1, 2}. This can be done succinctly with the line S=d$male+1.
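The recoding in the hint can be tried on a toy data frame first. The sketch below is base R only; the data frame `d` here is a made-up stand-in for Howell1, and the per-sex intercepts are hypothetical values used only to show how an index variable selects a parameter.

```r
# Toy stand-in for the Howell1 data frame; the values are made up.
d <- data.frame(male = c(0, 1, 1, 0), age = c(3, 7, 10, 1))

# Recode the 0/1 indicator as a 1/2 index: 1 = girl, 2 = boy.
d$S <- d$male + 1

# An index variable can pick out one parameter per category.
a <- c(girls = 4.0, boys = 4.5)   # hypothetical per-sex intercepts
a_by_child <- unname(a[d$S])      # intercept that applies to each child
```

With an index variable, both categories get their own parameter and can be given the same prior, which is harder to arrange with a 0/1 indicator.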

b) Does the causal relationship between age and weight differ for !Kung girls and boys? Provide one or more posterior contrast plots as a summary and explain your answer.

# HINT: the following code can be used to make a posterior contrast plot
library(rethinking)  # provides sim(), shade(), and PI()

# simulate weights at each age for each sex, then take the contrast
age_seq <- 0:12
mu1 <- sim( m2 , data=list( A=age_seq , S=rep(1,13) ) ) # girls
mu2 <- sim( m2 , data=list( A=age_seq , S=rep(2,13) ) ) # boys
mu_contrast <- mu2 - mu1 # boys minus girls, draw by draw

plot( NULL , xlim=c(0,13) , ylim=c(-15,15) , xlab="age" ,
    main="posterior contrast plot" ,
    ylab="weight difference (boys-girls)" )
for ( p in c(0.5,0.67,0.89,0.99) ) # nested percentile intervals
    shade( apply(mu_contrast,2,PI,prob=p) , age_seq )
abline( h=0 , lty=2 , lwd=2 ) # dashed reference line at no difference
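The contrast step in the hint can also be inspected numerically. The following base R sketch uses made-up draws in place of the output of `sim()`, and `quantile` in place of `PI`, on the assumption that a percentile interval is the central quantile interval. All means and standard deviations are hypothetical.

```r
# Made-up draws standing in for the output of sim(); rows are draws, columns are ages 0..12.
set.seed(3)
age_seq <- 0:12
n_draws <- 1000
mu_girls <- sapply(age_seq, function(a) rnorm(n_draws, 4.0 + 1.3 * a, 2))
mu_boys  <- sapply(age_seq, function(a) rnorm(n_draws, 4.5 + 1.4 * a, 2))

# Element-wise difference, draw by draw: positive values mean the boy is heavier.
contrast <- mu_boys - mu_girls

# Central 89% interval of the contrast at each age (2 x 13 matrix: lower and upper bound).
prob <- 0.89
lower_tail <- (1 - prob) / 2
interval_89 <- apply(contrast, 2, quantile, probs = c(lower_tail, 1 - lower_tail))

# Share of draws in which the simulated boy is heavier, at each age.
p_boys_heavier <- colMeans(contrast > 0)
```

An interval that straddles zero at a given age indicates that the simulated difference could plausibly go either way at that age, which is the kind of statement a contrast plot is meant to support.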
