Demographic characteristics

The problem set again uses the Stata dataset WAGE2.dta. The dataset
contains information on monthly earnings, employment history,
education, demographic characteristics, and two test scores for 935 men in
1980:
wage      monthly earnings (in 1976 USD)
hours     average weekly hours of work
IQ        IQ (intelligence quotient) score
educ      years of education
age       age in years
married   =1 if the person is married; 0 otherwise
black     =1 if the person is black; 0 otherwise
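
As a starting point, the dataset can be loaded and checked against the description above. This is a minimal sketch that assumes WAGE2.dta sits in the current working directory, and that lwage may or may not already be in the file (it is not in the variable list above, so the code creates it only if it is missing).

```stata
* Assumption: WAGE2.dta is in the current working directory; adjust the path if not.
use WAGE2.dta, clear

* Check the variables listed above and the number of observations.
describe wage hours IQ educ age married black
summarize wage hours IQ educ age married black
count

* Assumption: lwage might not ship with the file, so create it only if missing.
capture confirm variable lwage
if _rc != 0 {
    generate lwage = ln(wage)
}
```

The count command should report the sample size stated above.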
Problem 1: $R^2$ and hypothesis testing in simple regression (20 points total)

  1. (5 points) Estimate the following linear regression model and paste your
    results below:
    $\text{lwage} = \beta_0 + \beta_1\,\text{educ} + u$
    Hint: You did this for PS2.
    What is the $R^2$ of the regression and how do you interpret it? What does it tell
    you about the extent to which wages are causally affected by education?
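  2. (5 points) Using the information on SST, SSE, and SSR in the Stata output,
    show how the $R^2$ for this regression was calculated.
    Hint: $R^2$ = SSE/SST, where SSE is the explained sum of squares (labeled
    "Model" in the Stata output) and SST is the total sum of squares. Plug the
    figures for SSE and SST into this formula and solve.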
  3. (10 points) Use the output from your regression to test the hypothesis that
    education is unrelated to log-wage. Use the steps shown in the cookbook that we reviewed in class. (You can find this in the
    “Cookbooks & crib sheets” folder on D2L.) Start by stating the null and the
    alternative hypotheses in terms of the notation used in class. Use a
    significance level of 1%. What do you conclude and why? Be sure to explain
    your reasoning.
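
The mechanics behind questions 2 and 3 can be sketched in Stata using the results that regress stores after estimation. This is only an illustration: it assumes WAGE2.dta is in the working directory and already contains lwage, and the scalar names (sse, ssr, sst, tstat, tcrit) are made up for this example. It does not replace writing out the steps from the cookbook.

```stata
* Assumption: WAGE2.dta is in the working directory and contains lwage.
use WAGE2.dta, clear
regress lwage educ

* Sums of squares in Wooldridge's notation:
*   SSE (explained) = "Model" SS    -> e(mss)
*   SSR (residual)  = "Residual" SS -> e(rss)
*   SST (total)     = SSE + SSR
scalar sse = e(mss)
scalar ssr = e(rss)
scalar sst = sse + ssr
display "R-squared by hand:    " sse/sst
display "R-squared from Stata: " e(r2)

* Two-sided t test of H0: beta1 = 0 against H1: beta1 != 0 at the 1% level.
scalar tstat = _b[educ]/_se[educ]
scalar tcrit = invttail(e(df_r), 0.005)
display "t statistic:                 " tstat
display "1% two-sided critical value: " tcrit
display "two-sided p-value:           " 2*ttail(e(df_r), abs(tstat))
```

The null is rejected at the 1% level when the absolute value of the t statistic exceeds the critical value, or equivalently when the p-value is below 0.01.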

Problem 2: Ability bias in the estimated return to education (30 points total)

Suppose that the population model for log-earnings (lwage) is given by:
    $\text{lwage} = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{ability} + v$
A person’s ability is not observed, so you estimate the simple linear
regression model:
    $\text{lwage} = \beta_0 + \beta_1\,\text{educ} + u$
where ability is included in the error term u.
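
To make the link between the two error terms explicit, the two models can be written side by side. This is a sketch of the substitution using only the symbols defined above, not an additional assumption.

```latex
\begin{align*}
\text{lwage} &= \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{ability} + v
  && \text{(population model)} \\
\text{lwage} &= \beta_0 + \beta_1\,\text{educ} + u,
  \qquad u = \beta_2\,\text{ability} + v
  && \text{(estimated model)}
\end{align*}
```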
  4. (5 points) What is the OLS estimate of the slope coefficient on education
    from the simple linear regression and how do you interpret it?
    Hint: You did this in Problem Set 2, and you estimated this regression above
    in Problem 1. The dependent variable is in log terms, so interpret the slope
    coefficient as a percentage change.
  5. (5 points) What is the key condition for the OLS estimator of $\beta_1$ you
    obtained in question 1 to be an unbiased estimator of the effect of education
    on the wage? Does this condition hold? Why or why not?
    Hint: Think of ability as an omitted variable in the lwage model. (Be sure to
    read the section on “Omitted Variable Bias: The Simple Case” in section 3.3
    of Wooldridge.)
  6. (5 points) What is the direction of the bias of the OLS estimator of $\beta_1$ in the
    simple linear regression model? Explain your answer using the 2x2 OVB
    matrix in Wooldridge and the slides.
    Hint: Think about two things. First, how are lwage and ability related in
    the population wage model (that is, what is the likely sign of the
    slope coefficient on ability)? Second, how are the variables ability and educ
    related (that is, do workers with more ability tend to have more or less
    education than workers with less ability, on average)?
  7. (5 points) Estimate a model of lwage on educ and IQ. (In Stata, type: reg
    lwage educ IQ.) What is the OLS estimate of $\beta_1$ from this regression and how
    do you interpret it? Why is it smaller than the estimate you obtained in
    question 1? Explain.
  8. (5 points) Now estimate the auxiliary regression of IQ on educ. Interpret
    the coefficient on educ in this regression. Is the relationship you estimate
    between IQ and educ consistent with your answer to question 6 above?
  9. (5 points) Show that the bias you discussed in question 7 can be written in
    terms of the OVB formula.
    Hint: First, write down the OVB formula:
    $\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2\,\tilde{\delta}_1$
    An easy way to write this without all the Greek letters and superscripts is:
    beta1-tilde = beta1-hat + (beta2-hat)(delta1-tilde). Now plug in the numbers
    for each term and show that the OVB formula is correct. Remember that
    beta1-tilde comes from the regression of lwage on just educ; that beta1-hat
    and beta2-hat come from the regression of lwage on both educ and IQ; and
    that delta1-tilde comes from the regression of IQ on educ.
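
The three regressions behind the OVB formula can be collected in one short do-file. This is a sketch under the assumption that WAGE2.dta is in the working directory and contains lwage; the scalar names (b1_tilde, b1_hat, b2_hat, d1_tilde) are invented here to mirror the plain-text notation in the hint.

```stata
* Assumption: WAGE2.dta is in the working directory and contains lwage.
use WAGE2.dta, clear

* Short regression: lwage on educ only -> beta1-tilde
regress lwage educ
scalar b1_tilde = _b[educ]

* Long regression: lwage on educ and IQ -> beta1-hat and beta2-hat
regress lwage educ IQ
scalar b1_hat = _b[educ]
scalar b2_hat = _b[IQ]

* Auxiliary regression: IQ on educ -> delta1-tilde
regress IQ educ
scalar d1_tilde = _b[educ]

* Compare the two sides of the OVB formula.
display "beta1-tilde:                           " b1_tilde
display "beta1-hat + (beta2-hat)(delta1-tilde): " b1_hat + b2_hat*d1_tilde
```

The two displayed numbers agree exactly only when all three regressions use the same observations, so check that the reported sample size is the same in each output.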

Sample Solution