Computational Statistics & Probability
Problem Set 2 - Linear Models
Due: 23:59:59 22.nov.2023
Fall 2023
Instructions
Assignments must be submitted through Canvas. See the course Canvas page for policies covering collaboration,
acceptable file formats (.Rmd & .pdf), and late submissions. Completed assignments must include executable
code (.Rmd) and a corresponding knitted markdown file (.pdf). An R Markdown cheat sheet is available.
1. Multiple Regression & Causal Models
Return to the Howell1 dataset and consider the causal relationship between age and weight in children.
Let’s define a child as anyone younger than 13 and assume that age influences weight directly and through
age-related physical changes that occur during development. Each child’s height will serve as a measured
proxy for these unmeasured physical attributes. We summarize this causal background knowledge by the
following DAG:
[Figure: DAG with edges A -> H, H -> W, and A -> W]
where A, H and W represent random variables age, height, and weight, respectively.
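To make the assumptions encoded in the DAG concrete, the following is a minimal generative sketch in base R. The function name `sim_children` and every coefficient value are hypothetical illustration choices, not estimates from the Howell1 data; the only thing taken from the text is the structure (age affects height, and both age and height affect weight).

```r
# Hypothetical generative simulation for the DAG A -> H, H -> W, A -> W.
# All coefficient values are made up for illustration only.
sim_children <- function(n, bAH = 5, bHW = 0.5, bAW = 0.1) {
  A <- runif(n, 0, 13)                              # age in years (children under 13)
  H <- rnorm(n, mean = 50 + bAH * A, sd = 5)        # height depends on age
  W <- rnorm(n, mean = bHW * H + bAW * A, sd = 2)   # weight depends on height and age
  data.frame(A = A, H = H, W = W)
}

set.seed(1)
sim_d <- sim_children(200)

# Regressing W on A alone targets the total effect of age,
# which in this linear simulation is bAW + bAH * bHW = 0.1 + 5 * 0.5 = 2.6.
total_effect <- unname(coef(lm(W ~ A, data = sim_d))["A"])
```

In a linear simulation like this one, the slope of W on A alone should land close to the sum of the direct path and the path through H, which is one way to check an analysis against a known answer before turning to the real data.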
a) What is the total causal effect of year-by-year growth of !Kung children on their weight? To answer this
question, construct a linear regression (m1a) to estimate the total causal effect of each year of growth on a
!Kung child’s weight. Assume that average birth weight is 3.5 kg. Use prior predictive simulation to assess
the implications of your priors.
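As an illustration of what a prior predictive simulation can look like, here is a minimal base R sketch. The specific priors (a Normal(3.5, 1) intercept and a Log-Normal(0, 0.5) slope) are assumptions chosen only for demonstration; you are expected to choose and justify your own. The intercept is centred on 3.5 kg because it represents expected weight at age 0.

```r
# Prior predictive sketch for a model of the form mu = a + b * A.
# The priors below are illustrative assumptions, not recommended values.
set.seed(2)
n_lines <- 50
a <- rnorm(n_lines, mean = 3.5, sd = 1)          # expected weight at birth (kg)
b <- rlnorm(n_lines, meanlog = 0, sdlog = 0.5)   # kg gained per year, forced positive

# Draw one regression line per prior sample over the age range of interest.
plot(NULL, xlim = c(0, 13), ylim = c(0, 40),
     xlab = "age (years)", ylab = "weight (kg)",
     main = "prior predictive lines")
for (i in seq_len(n_lines)) abline(a = a[i], b = b[i], col = "grey40")
```

If the lines imply impossible outcomes, such as children losing weight as they age or weighing far more than any child plausibly could, that is a signal to revise the priors before fitting.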
b) What is the total causal effect of height on weight? Construct a linear regression (m1b) to estimate the
total causal effect of height on a !Kung child’s weight. Use prior predictive simulation to assess the implications
of your priors.
c) After knowing the age of a !Kung child, what additional value is there in also knowing the child’s height?
Conversely, after knowing the height of a !Kung child, what additional value is there in also knowing the
child’s age?
2. Causal Influence with Categorical Variables
The causal relationship between age and weight might be different for girls and boys.
a) To investigate whether this is so, construct a single linear regression (m2) with a categorical variable for
sex to estimate the total causal effect of age on weight separately for !Kung boys and girls. Plot your data
and overlay the two regression lines, one for girls and one for boys.
Hint: You should construct an index variable S for sex and change the coding for d$male from {0,1} to {1,
2}. This can be done succinctly with the line S=d$male+1.
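The recoding in the hint can be sketched on a toy vector as follows. The vector `male` is a hypothetical stand-in for d$male, and the values in `alpha` are made-up placeholders; the point is only to show why a {1, 2} coding is convenient: it lets S index directly into a vector holding one parameter per category.

```r
# Hypothetical stand-in for d$male (0 = girl, 1 = boy).
male <- c(0, 1, 1, 0, 1)

# Index variable: 1 = girl, 2 = boy.
S <- male + 1

# Made-up per-sex parameter values, one entry per category.
alpha <- c(3.4, 3.6)

# Indexing by S selects the matching parameter for each child.
alpha_per_child <- alpha[S]
```

The same indexing pattern is what allows a single model to carry separate intercepts and slopes for girls and boys.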
b) Does the causal relationship between age and weight differ for !Kung girls and boys? Provide one or more
posterior contrast plots as a summary and explain your answer.
# HINT: the following code can be used to make a posterior contrast plot
# contrast at each age, vector
seq <- 0:12
mu1 <- sim(m2,data=list(A=seq,S=rep(1,13))) # girls
mu2 <- sim(m2,data=list(A=seq,S=rep(2,13))) # boys
mu_contrast <- mu1
for ( i in 1:13 ) mu_contrast[,i] <- mu2[,i] - mu1[,i]
plot( NULL , xlim=c(0,13) , ylim=c(-15,15) , xlab="age" ,
      main = "posterior contrast plot",
      ylab="weight difference (boys-girls)" )
for ( p in c(0.5,0.67,0.89,0.99) ) # credibility intervals
    shade( apply(mu_contrast,2,PI,prob=p) , seq )
abline(h=0,lty=2,lwd=2)