Use the uploaded Excel file and any statistical software (e.g., RStudio or SPSS, but not MS Excel) to answer the following questions. In your submission, include your code and output pasted into a Word document alongside your answers. Also upload the output or code file.


These data are for full-time workers, defined as workers employed more than 35 hours per week for at least 48 weeks in the previous year. 

Variables and their codes:

FEMALE: 1 if female; 0 if male

YEAR: Year

AHE: Average Hourly Earnings (Dependent variable, Y)

BACHELOR: 1 if worker has a bachelor’s degree; 0 if worker has a high school degree

AGE: Age (Independent variable, X)


1. Draw a scatter plot of AHE (Y-axis) against AGE (X-axis).

2. Repeat #1, but create a separate scatter plot for each value of the BACHELOR variable. [You should have two scatter plots here!]

3. Run descriptive statistics for the dataset. 

4. Repeat #3 but separately for males and females.

5. A researcher believes that AHE for female workers is the same irrespective of their educational status. Run a hypothesis test to test this claim. Use a 5% level of significance (i.e., α = 0.05). What are the null and alternative hypotheses of this test? Interpret the result.

6. Calculate the correlation coefficient between AHE and AGE.

7. a. Run a regression of AHE (Y) on AGE (X) separately for males and females.

b. Interpret the coefficients. 

c. For males, what is the predicted AHE when AGE is 33? What about for a 33-year-old female?
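
Note on questions 5 to 7: the calculations behind them can be sketched as follows. This is only an illustrative outline in Python, assuming a Welch two-sample t-test for question 5 (the assignment does not prescribe a particular test) and a simple least-squares regression of AHE on AGE for question 7. The rows are made-up values, not the data from the Excel file, and the helper names (welch_t, pearson_r, ols) are hypothetical; any statistical package provides built-in equivalents.

```python
import math
from statistics import mean, variance

# Made-up rows for illustration only, NOT the assignment data.
# Column order (assumed): FEMALE, BACHELOR, AGE, AHE
rows = [
    (0, 0, 26, 14.0), (0, 1, 28, 20.0), (0, 0, 30, 16.0),
    (0, 1, 32, 24.0), (0, 1, 34, 25.0),
    (1, 0, 25, 12.0), (1, 0, 27, 13.5), (1, 0, 31, 14.0),
    (1, 1, 26, 17.0), (1, 1, 29, 19.5), (1, 1, 33, 21.0),
]


def welch_t(a, b):
    """Welch two-sample t statistic and its approximate degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Question 5: H0 is that mean AHE is equal for females with and without a bachelor's degree.
female_hs = [r[3] for r in rows if r[0] == 1 and r[1] == 0]
female_ba = [r[3] for r in rows if r[0] == 1 and r[1] == 1]
t_stat, dof = welch_t(female_ba, female_hs)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")

# Question 6: correlation between AGE and AHE over all rows.
print(f"r = {pearson_r([r[2] for r in rows], [r[3] for r in rows]):.3f}")

# Question 7: separate regressions by gender, then the fitted value at AGE = 33.
for label, flag in (("male", 0), ("female", 1)):
    ages = [r[2] for r in rows if r[0] == flag]
    ahe = [r[3] for r in rows if r[0] == flag]
    b0, b1 = ols(ages, ahe)
    print(f"{label}: AHE = {b0:.2f} + {b1:.2f} * AGE, predicted at 33: {b0 + b1 * 33:.2f}")
```

The sketch stops at the t statistic and its degrees of freedom. The p-value comes from the t distribution with those degrees of freedom, which statistical software reports directly and which is then compared with α = 0.05.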

