Econ 275 (Baseler): Problem Set 4
**Due Thursday Apr 17 at the beginning of class**
**Please use a pen and not a pencil if you hand-write.**
1 Credit Market Model
Consider the following model of the credit market: A borrower needs to invest W + L = I in a high-yield technology, where W denotes her initial wealth and L her requested loan. The source of capital market imperfection is ex post moral hazard. Namely, once the return σI is realized, the borrower can either repay immediately, or she can stall (voluntarily default). Stalling revenues away from the lender has a cost to the borrower (who has to keep ahead of the lender), and let this cost be a fixed proportion τ of total revenues σI. Finally, if the borrower defaults on her repayment obligation, the lender can invest effort into debt collection. Specifically, assume that a lender who incurs a non-monetary effort cost L · C(p) has probability p of collecting his due repayment rL.
1. What is the borrower’s expected payoff if she defaults and anticipates that the lender will exert monitoring effort level p?
2. Write down the condition under which the borrower will decide not to (strategically) default.
3. Assume that a larger loan increases the value of defaulting to the borrower, and that the lender does not want the borrower to try to default. Show that this implies a limit on how much the lender will be willing to lend at a fixed p (Hint: use your answer to part 2 to find the loan size limit L* such that the borrower will default if L > L* and will not default if L ≤ L*). What happens to this limit when the productivity of capital σ goes up and when the interest rate r goes down? Give concise intuitions for your results.
4. Turning to the choice of the optimal monitoring policy p, write down the maximization problem for a risk-neutral lender. Hint: choose p to maximize the lender's profit function when the borrower is trying to default.
5. In the special case where C(p) = −c · ln(1 − p), what p will the lender choose in terms of r? (Hint: you need to differentiate the profit function with respect to p and set this equal to 0. Because the profit function is concave, the highest profit will be achieved where this derivative equals 0.)
6. Show that with p optimally chosen by the above formula (i.e. assuming C(p) = −c·ln(1−p)), the credit limit does not depend on the interest rate. Give some intuition for your result.
2 Information frictions in the migration decision (part 2)
This problem is a continuation of Problem 1 from PS2. In that problem set you measured the impact of an information experiment that informed rural Kenyan households about average urban earnings on migration over the next year. On Blackboard, under the PS4 folder, download two datasets called migration_baseline.dta and migration_endline.dta.
Remember the Stata tips from previous problem sets - they apply similarly here. You can find new Stata tips at the end of this problem set.
Note that these datasets are for class use only. If you want to use these data for any purpose other than the class, you need to discuss with me first.
Questions
1. The baseline dataset includes information collected at baseline, before households were assigned to treatment or control. The endline dataset includes information collected about 1 year after baseline on the same set of households. It includes information about how many people migrated from the household, where they went, how much money they earned and in what type of job, and other aspects of their lives in the city. To evaluate the impact of the experiment, you will need to merge the baseline and endline data. You should merge using the hh_id variable, which uniquely identifies households in each dataset (see PS3 for merging tips if you’ve forgotten how to merge). How many households are in the baseline dataset? Out of those households, how many are you able to match to endline data?
2. The households which could not be matched to endline data are called “attriters” - households that could not be found for an endline survey. This can happen if the entire household relocates, or refuses to participate in the survey. One potential concern is attrition bias: that the type of households that attrit varies across treatment status. Tell a plausible story that could generate differential attrition in this context.
3. Let’s test for differential attrition formally. Generate a new variable called “attrited” which equals 1 if the household is not in the endline survey, and 0 if it is in the endline survey (Hint: when a household was not administered an endline survey, all their endline data will be missing). Test whether the attrition rate is significantly different between treatment and control. Report and interpret your result.
4. Are the characteristics of the non-attriters comparable across treatment and control? In other words, do we have “balance” if we compare the baseline characteristics (each baseline variable listed in the table above) of those individuals observed at endline?
5. An important debate in economics is whether the allocation of workers across space is efficient. Ignoring for a moment differences in amenities, the implication of an efficient equilibrium allocation is that workers who move across space will not earn more. In other words, the marginal return to migration is 0. Recall from PS2 that this experiment increased the migration rate in the treatment group - that is, it created exogenous variation in the number of migrants. In this question you are going to use that exogenous variation to evaluate the efficiency hypothesis.
(a) The test you want to run is asinh(Y_i) = α + βM_i + ϵ_i, where Y_i is Nairobi income of family i, M_i is the number of migrants who traveled to the capital city of Nairobi from household i, and ϵ_i is an error term. asinh is called the inverse hyperbolic sine function: it has approximately the same interpretation as log, but unlike log it is defined for Y_i = 0 (Hint: you will need to generate a new variable that takes the asinh transform of Y_i). If M_i were randomly assigned, then β would measure the causal effect on income of sending 1 additional migrant to Nairobi. Estimate this exact specification using endline data. What coefficient do you find for β? Explain that coefficient in words. Is this equal to the causal returns to migration, or not? Why/why not?
(b) The experiment offers a suitable instrument for M_i. What variable is it? Test the inclusion restriction with a regression and report your result. Justify the exclusion restriction in words.
(c) Estimate the instrumental variable specification of the equation in 5a. See Stata tips below for how to do this. What do you find for β now? Is it larger or smaller than the estimate from 5a? What does this tell you about the nature of selection into migration? Interpret your estimate of β in terms of the efficiency hypothesis discussed above.
(d) The β you estimated in 5c has another name: the treatment on the treated. That is, it’s the return to migrating among households who migrated because of the information treatment. Do you expect this number to be smaller, larger, or the same as the intent-to-treat (ITT)? Why? (You do not need to estimate the ITT - just explain in words.)
(e) There is one concern we need to rule out: what if migration to Nairobi decreases income earned outside of Nairobi? Then it’s possible that the experiment could be income-decreasing even if it increases income earned in Nairobi. Test this concern by regressing total household income (taking the asinh transform) at endline on treatment. Do you find evidence for or against this concern? Interpret your finding.
Stata Tip: Merging Datasets
Sometimes, different variables for the same units of observation (for example, households) belong to different datasets. You might need to merge these datasets to perform your statistical analysis. To do it in Stata, you just need to know the following:
• The merging variable (id): the variable that uniquely identifies each observation and does not change across datasets.
• Why does Stata need it? Imagine that households are not sorted in the same way across the datasets that you need to merge. The id variable is going to inform Stata that, say, the first household in dataset 1 is the 45th in dataset 2. Then, Stata uses this information to match the values of each variable to the right household.
• What do you need to do to merge in Stata?
Imagine that you need to merge dataset1 and dataset2. The merging variable is id. Both files are in your working directory, so you don’t need to write the full path to the files in Stata.
Then you type:
*Open dataset1
use dataset1
*Merge in the second dataset
merge 1:1 id using dataset2
rename _merge merge1
Adapt this script to answer question 1. Look at the Stata help for the merge command (type help merge) to find out what the variable _merge is. This variable is created after each merge and it provides fundamental information about the merging process.
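As a minimal sketch of what _merge records, the do-file fragment below builds two tiny made-up datasets and merges them. The variable names (wealth, income), the household ids, and the temporary file are hypothetical and are not part of the class data; run the lines together in a do-file so the tempfile macro resolves.

```stata
* Toy illustration only: ids and values are invented for this example.
clear
input id income
1 100
3 300
end
tempfile toy_endline
save `toy_endline'

clear
input id wealth
1 10
2 20
3 30
end

* Merge the toy endline file into the toy baseline data in memory
merge 1:1 id using `toy_endline'

* _merge == 1: observation found only in the data in memory (master)
* _merge == 2: observation found only in the using file
* _merge == 3: observation matched in both
tabulate _merge
```

In this toy case household 2 has no endline record, so it is the only observation that is not matched; counting observations by the value of _merge is one way to see how many units matched.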
Stata Tip: IV Regression
Instrumental variable (IV) regression proceeds in two stages:
• First stage: X_i = γ + δZ_i + ν_i
• Second stage: Y_i = α + βX̂_i + ϵ_i
Here Z is called the instrument and X is the endogenous variable. X̂_i are the fitted values from the first stage (that is: X̂_i = γ + δZ_i). You don’t need to worry about this, as Stata lets you estimate the second stage with a single command. That command is:
ivreg Y (X = Z), r
This tells Stata to estimate the effect of X on Y, instrumenting X with Z. The , r at the end tells it to use robust standard errors (don’t worry about what this means).
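To see how the two stages relate to the single command, here is a small simulation sketch. Everything in it is assumed for illustration: the data are randomly generated, the variable names (z, u, x, y) are hypothetical, and the true effect of x on y is set to 2 by construction. It uses ivregress 2sls, the current official Stata command for two-stage least squares, rather than the older ivreg syntax shown above.

```stata
* Simulated data only: nothing here comes from the class datasets.
clear
set seed 275
set obs 5000
gen double z = runiform() < 0.5           // randomly assigned binary instrument
gen double u = rnormal()                  // unobserved factor driving both x and y
gen double x = 0.5*z + u + rnormal()      // endogenous regressor
gen double y = 1 + 2*x + 3*u + rnormal()  // true effect of x on y is 2

* OLS: biased here because u affects both x and y
regress y x, vce(robust)

* First stage by hand, then save the fitted values
regress x z, vce(robust)
predict double xhat, xb

* Second stage by hand: same point estimate as 2SLS,
* but these standard errors are not valid
regress y xhat
scalar b_manual = _b[xhat]

* One-step 2SLS: same coefficient, with correct robust standard errors
ivregress 2sls y (x = z), vce(robust)
scalar b_iv = _b[x]
display "manual: " b_manual "   ivregress: " b_iv
```

The two displayed coefficients should agree up to rounding. The one-step command is still preferable because the standard errors from the hand-run second stage ignore that xhat was itself estimated.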
Stata Tip: Asinh Transform
The inverse hyperbolic sine (asinh) is a function with a derivative very similar to that of the logarithm function. But, conveniently, it does not drop observations of “0”. Taking a log transform drops these observations because log(0) is undefined. In this problem set you will analyze the effect of migration on income after taking an asinh transform. To generate a new variable called asinh_income from a variable called income, type
gen asinh_income = asinh(income)
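The short sketch below compares asinh with the log on a few made-up income values (the numbers are hypothetical, not from the class data). It relies on the identity asinh(y) = ln(y + sqrt(y^2 + 1)), which is close to ln(2y) once y is large.

```stata
* Toy values only, chosen to include a zero and some large incomes.
clear
input income
0
1
100
10000
end

gen asinh_income = asinh(income)
gen log_income   = ln(income)      // missing when income == 0
gen log_2income  = ln(2*income)    // asinh is close to this for large income

list income asinh_income log_income log_2income
```

The zero-income row keeps a value of 0 under asinh but becomes missing under the log, and for the larger incomes the asinh column is nearly identical to ln(2*income).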