Problem Set 1
1. A study aims to estimate the causal effect of sanitary drinking water on infant mortality. Researchers randomly sample 200 villages from the population of all villages. 70 of these villages have unsanitary drinking water whereas 130 of them have sanitary drinking water. In the 70 villages with unsanitary drinking water, infant mortality rates are, on average, 40 per 1,000 live births. In the 130 villages with sanitary drinking water, infant mortality rates are, on average, 5 per 1,000 live births.
a. Is this study an experiment or not? If yes, which type of experiment? If not, why not?
b. List any concerns you have about interpreting the difference in infant mortality rates between the two types of villages as the causal effect of sanitary drinking water on infant mortality.
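For reference, the raw difference that part (b) refers to can be computed directly from the figures given in the problem. The sketch below is only arithmetic on those figures. The variable names are illustrative, and the result is a descriptive difference between groups, not an established causal effect.

```python
# Average infant mortality per 1,000 live births, as given in Problem 1.
mortality_unsanitary = 40  # average across the 70 villages with unsanitary water
mortality_sanitary = 5     # average across the 130 villages with sanitary water

# Raw (observed) difference in group means: sanitary minus unsanitary.
naive_difference = mortality_sanitary - mortality_unsanitary
print(naive_difference)  # -35, i.e. 35 fewer deaths per 1,000 live births
```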
2. Researchers randomly assign overweight participants to a treatment group or a control group. Participants in the treatment group receive incentives to walk 10,000 steps a day. Participants in the control group do not receive incentives. The researchers compare the average weight loss in the treatment group and the control group.
a. Is this study an experiment or not? If yes, which type of experiment? If not, why not?
b. Which of the following is a concern for interpreting the differences between the treatment and control groups as the causal impact of the intervention? Evaluate each item and explain your reasoning.
i. The intervention started in the spring and ended in the summer. People try to lose more weight in the summer.
ii. Before the study started, about half the participants were already walking 10,000 steps.
iii. Before the study started, about three-quarters of the participants in the treatment group were already walking 10,000 steps, and about one-quarter of the participants in the control group were already walking 10,000 steps.
iv. Participants are overweight and are not randomly drawn from the population.
3. A charity runs an experiment to understand which messages increase charitable donations. They call 1,000 of their donors and randomize them to a message that either focuses on the success that the charity has achieved or focuses on how much more needs to be done. The charity asks the donors each of the questions below. Categorize each question as behavioral or hypothetical and briefly explain why.
a. Will you stay on the call an extra ten minutes to receive additional information?
b. On a 1-7 scale, how likely are you to donate again in the future?
c. Will you click on the link we sent to you to learn more about opportunities to donate?
d. Would you like to make a donation today?
4. For each statement below, explain whether the observed correlation is likely to imply causation. Discuss, based on the lecture, what other reason(s) might be at play. Provide examples or counterarguments that illustrate the importance of considering alternative explanations.
a. Statement A: "A study finds a strong positive correlation between ice cream sales and the number of drowning incidents at a beach. Therefore, consuming more ice cream causes an increase in drowning incidents."
b. Statement B: “Research shows students in charter schools have higher scores on standardized tests than students in public schools. Hence, it must be the case that going to a charter school increases students’ test scores.”
c. Statement C: “Research shows a negative correlation between the number of hours spent studying and the frequency of party attendance among college students. Hence, studying less causes students to attend more parties.”
d. Statement D: "In a survey, a positive correlation is found between the frequency of exercise and reported happiness levels. Therefore, engaging in more physical activity causes people to be happier."
5. In class, we discussed three core assumptions for the unbiasedness of the difference-in-means estimator. Identify which of these core assumptions is violated in each of the scenarios below:
a. You run an experiment among college students to test the effect of incentives to walk (treatment) vs. no incentives to walk (control) on average steps per day. Students who receive incentives to walk (D=1) encourage their friends who don't receive incentives (D=0) to walk with them.
b. A researcher wants to test how winning large sums of money in a national lottery affects people’s views about the estate tax. The researcher interviews a random sample of individuals and compares the answers of those who won large sums to the answers of those who won nothing or small sums.
c. A researcher wants to measure whether a sales team wearing professional attire affects sales in a store. To that end, they provide suits to employees in a set of randomly selected stores.
6. Suppose that an experiment was performed on the villages in Table 2.1 (the one we discussed in class, reproduced below), such that two villages are allocated to the treatment group and the other five villages are allocated to the control group.
a. Suppose that the researcher randomly selects Villages 3 and 7 from the set of seven villages to be placed in the treatment group. Calculate the difference-in-means estimate. Is this estimate biased or unbiased? Why? Note: You might need to make calculations to prove your point.
b. Suppose that instead of using random assignment, the researcher selected Villages 3 and 7 because treatment can be administered more easily in those villages. Will this procedure result in a biased or unbiased estimate?
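The calculations mentioned in the note to part (a) could be organized along the following lines. This is a minimal sketch, not a solution: the potential outcomes below are hypothetical placeholders and are NOT the Table 2.1 values, and the names `y0`, `y1`, and `diff_in_means` are illustrative. It computes the difference-in-means estimate for one assignment, then averages the estimate over every equally likely way of choosing two treated villages out of seven and compares that average with the true average treatment effect.

```python
from fractions import Fraction
from itertools import combinations
from statistics import mean

# HYPOTHETICAL potential outcomes for seven villages. These are NOT the
# Table 2.1 values; substitute the table's untreated and treated columns.
y0 = [Fraction(v) for v in (4, 6, 8, 5, 7, 9, 3)]    # outcome if untreated
y1 = [Fraction(v) for v in (6, 6, 12, 5, 10, 9, 8)]  # outcome if treated

def diff_in_means(treated, y0, y1):
    """Difference-in-means estimate for one assignment (treated = village indices)."""
    treated = set(treated)
    treated_mean = mean([y1[i] for i in treated])
    control_mean = mean([y0[i] for i in range(len(y0)) if i not in treated])
    return treated_mean - control_mean

n = len(y0)

# True average treatment effect: mean of the unit-level effects Y(1) - Y(0).
true_ate = mean([y1[i] - y0[i] for i in range(n)])

# Every way to choose 2 treated villages out of 7; each is equally likely
# under complete random assignment.
estimates = [diff_in_means(pair, y0, y1) for pair in combinations(range(n), 2)]
expected_estimate = mean(estimates)

# One particular draw: Villages 3 and 7 (zero-based indices 2 and 6).
one_draw = diff_in_means((2, 6), y0, y1)

print("Estimate when Villages 3 and 7 are treated:", one_draw)
print("Average estimate over all assignments:", expected_estimate)
print("True ATE:", true_ate)
```

A single estimate can differ from the true average treatment effect; the comparison that matters for bias is between the average over all possible random assignments and the true effect. Exact fractions are used so that comparison is not affected by floating-point rounding.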