L1022: Coursework Project 100% for 2024-25
Maximum project length: 3000 words (no minimum requirement)
A Cross-Country Analysis of Adolescent Fertility
Adolescent fertility poses health risks for both mother and baby. Having children very early in life can also reduce a woman’s opportunities and her social and economic well-being.
The aim of this project is to look at the rate of adolescent fertility (women who give birth aged 15 to 19) across countries and to explore potential determinants, as well as outcomes. What can lead to higher or lower teenage fertility? And what are the risks of having a high fertility rate among very young women?
For this project, you are given country-level data from the year 2018 for a random sample of countries. Your dataset includes the following variables:
· “Adolescent fertility rate (births per 1,000 women ages 15-19)” (AFR for short): This measures how many girls/women give birth in their teenage years (specifically aged 15 to 19).
· “Health expenditure per capita (current US$)” (HE_pc): Current expenditure on healthcare goods and services per year per person.
· “GDP per capita (current US$)” (GDP_pc): GDP per capita is gross domestic product divided by midyear population, measured in 2018 US dollars.
· “Compulsory education in years” (CEdu): Number of years that children are legally obliged to attend school.
· “Low-birthweight babies (% of births)” (LBW): The percentage of babies born weighing less than 2.5 kg.
· “Maternal mortality (per 100,000 live births)” (MMR): The maternal mortality rate is the estimated number of women who die from pregnancy-related causes while pregnant or within 42 days of pregnancy termination per 100,000 live births.
· “A woman can get a job in the same way as a man (1=yes; 0=no)” (WJob): Measures whether there are legal restrictions on a woman’s ability to work (0 means that such restrictions exist).
For your data analysis, perform the following steps and write up the results. (Use the write-up instructions document to help you with the write-up.)
1. Describe the data, using summary statistics and graphs, as appropriate. For all variables, show basic descriptive statistics and discuss them. Use at least four graphs to show important or interesting features of the data.
2. Calculate the Pearson correlation coefficients between AFR and LBW, as well as AFR and MMR. Test the statistical significance for each of the coefficients and comment on your results. Note that LBW has missing values (meaning that for some countries we do not know the share of babies born with low birth weight). You will need to exclude the countries with missing values when calculating the correlation coefficient between AFR and LBW. However, you should include these countries in your other analysis (in all of the following steps), so DO NOT remove them from your dataset.
3. Calculate the Pearson correlation coefficients between AFR, HE_pc, GDP_pc, and CEdu. Compute all pairwise correlations, which should give you 6 correlation coefficients. Comment on the results. (There is no need to test statistical significance here.)
4. Test whether the adolescent fertility rate (AFR) is lower (on average) in countries where women can get a job in the same way a man can (i.e. where WJob=1).
5. Estimate the following regression model:
where the i subscript corresponds to country i. Interpret the a and b coefficients that you obtain, and comment on their economic significance. Formally test the statistical significance of the slope ( b ) coefficient. (Write out the test procedure fully.)
6. Estimate the following regression model:
where the i subscript corresponds to country i. Interpret the slope coefficients that you obtain, and comment on their economic and statistical significance. (You do not need to write out the tests for statistical significance here, just state the results.)
7. Interpret the R-squared from this regression (from part 6) and test its statistical significance.
8. Generate a new variable that captures the natural logarithm of the health expenditure per capita: Ln_HE_pc = ln(HE_pc)
9. Re-estimate the regression equation from part 6 with the new variable instead of simple health expenditure:
10. Which of the three regression models fits the data best? Why?
11. Using the last regression (from part 9) predict the adolescent fertility rate in a country with 8 years of compulsory education, $200 health expenditure per capita, and where women cannot get a job in the same way a man can (i.e. there are legal barriers to women’s ability to work).
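As an illustration of steps 2 and 3, the sketch below shows one possible way to compute a Pearson correlation on complete cases only, leaving the full dataset untouched, and to form the usual t statistic for testing whether the correlation is zero. The function names and the numbers are made up for illustration and are not the coursework data; missing values are assumed to be coded as None.

```python
import math
from itertools import combinations


def pearson_complete_cases(x, y):
    """Return (r, n) using only pairs where both values are present.

    Missing values are assumed to be coded as None. The input lists
    are not modified, so the full dataset stays available for later steps.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy), n


def correlation_t_stat(r, n):
    """t statistic for H0: rho = 0, to be compared with a t(n - 2) critical value."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Made-up illustrative values, NOT the coursework data.
afr = [10.0, 45.0, 80.0, 20.0, 60.0, 95.0]
lbw = [6.0, None, 14.0, 7.0, 11.0, None]

r, n = pearson_complete_cases(afr, lbw)
t = correlation_t_stat(r, n)
print(f"r = {r:.3f}, n = {n}, t = {t:.2f}, df = {n - 2}")

# Step 3: four variables give six distinct pairs.
pairs_to_correlate = list(combinations(["AFR", "HE_pc", "GDP_pc", "CEdu"], 2))
print(pairs_to_correlate)
```

Note that n here is the number of countries with both values observed, so the degrees of freedom for the AFR and LBW test are smaller than for the AFR and MMR test.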
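For step 4, one possible approach is a one-sided two-sample t-test comparing mean AFR between the WJob=1 and WJob=0 groups. The sketch below uses the unequal-variance (Welch) form, which is an assumption on our part; the module may expect a different variant, such as the pooled-variance test. The numbers are made up and are not the coursework data.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch t statistic and approximate degrees of freedom for mean(a) - mean(b)."""
    na, nb = len(group_a), len(group_b)
    # Squared standard errors of each group mean (sample variance / n).
    se2_a = statistics.variance(group_a) / na
    se2_b = statistics.variance(group_b) / nb
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Made-up illustrative values, NOT the coursework data.
afr = [12.0, 30.0, 25.0, 70.0, 90.0, 55.0, 18.0, 85.0]
wjob = [1, 1, 1, 0, 0, 0, 1, 0]

afr_wjob1 = [a for a, w in zip(afr, wjob) if w == 1]
afr_wjob0 = [a for a, w in zip(afr, wjob) if w == 0]

# H0: mean AFR is the same (or higher) where WJob=1; H1: it is lower.
# A large negative t is evidence in favour of H1.
t, df = welch_t(afr_wjob1, afr_wjob0)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Because the alternative hypothesis is directional (AFR is lower where WJob=1), the statistic is compared with a lower-tail critical value rather than a two-sided one.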
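For step 5, the calculation behind a simple regression and the slope test can be sketched as follows. The regressor used here (years of compulsory education) is only a placeholder chosen for illustration, since the sketch does not depend on which explanatory variable the model specifies; the data are made up.

```python
import math


def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return (a, b, se_b, t_b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual variance uses n - 2 degrees of freedom (two estimated coefficients).
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(sse / (n - 2) / sxx)
    return a, b, se_b, b / se_b


# Made-up illustrative values, NOT the coursework data.
x = [6.0, 8.0, 9.0, 10.0, 12.0]      # placeholder regressor
y = [90.0, 70.0, 65.0, 50.0, 30.0]   # placeholder AFR values

a, b, se_b, t_b = simple_ols(x, y)
print(f"a = {a:.2f}, b = {b:.2f}, se(b) = {se_b:.3f}, t = {t_b:.2f}")
```

The t statistic tests H0: b = 0 against H1: b ≠ 0 and is compared with a t critical value on n - 2 degrees of freedom.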
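For step 7, the overall significance of a regression is commonly tested with an F statistic built from the R-squared. A minimal sketch, with hypothetical values for R-squared, the number of countries n, and the number of slope coefficients k:

```python
def overall_f_stat(r_squared, n, k):
    """F statistic for H0: all k slope coefficients are zero.

    Compare with an F critical value on (k, n - k - 1) degrees of freedom.
    """
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))


# Hypothetical values for illustration only, not results from the coursework data.
f_stat = overall_f_stat(r_squared=0.5, n=34, k=3)
print(f"F = {f_stat:.2f} with (3, 30) degrees of freedom")
```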
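Steps 8 and 11 can be sketched as below. The sketch assumes that the regression in part 9 has CEdu, Ln_HE_pc, and WJob as its explanatory variables, which is inferred from the inputs listed in part 11. The coefficient values are hypothetical placeholders and must be replaced with your own estimates.

```python
import math


def predict_afr(coefs, cedu, he_pc, wjob):
    """Predicted AFR from a model with an intercept, CEdu, ln(HE_pc), and WJob."""
    ln_he_pc = math.log(he_pc)  # natural logarithm, as in step 8
    return (coefs["const"]
            + coefs["CEdu"] * cedu
            + coefs["Ln_HE_pc"] * ln_he_pc
            + coefs["WJob"] * wjob)


# Hypothetical coefficients for illustration only, NOT estimates from the data.
coefs = {"const": 150.0, "CEdu": -2.0, "Ln_HE_pc": -12.0, "WJob": -8.0}

# 8 years of compulsory education, $200 health expenditure per capita, WJob = 0.
prediction = predict_afr(coefs, cedu=8, he_pc=200.0, wjob=0)
print(f"Predicted AFR: {prediction:.1f} births per 1,000 women aged 15-19")
```

The health expenditure figure enters the prediction as ln(200), not as 200, because the model is estimated on the logged variable.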