Date: 2019-06-04 11:36

Statistics 108

Homework Assignment 4

Correction to problem 5 chapter 6

Note: all problems listed here are to be written up and handed in on the due date provided. Homework is posted concurrently with the class material. Sections may be assigned at different times but have the same due date as previously assigned problems. For the problems in chapter 7 you should use the package LEAPS.

The student survey data set comes from a survey given to a sample of students to investigate variables that might be related to GPA. We are interested in fitting a model with GPA as the outcome and the other variables as potential predictors (exercise, TV, siblings, verbalSAT, mathSAT, SAT, piercings). The chapter 6 and chapter 7 problems below all refer to this data set.

Chapter 6 problems below have a due date of June 5

(P1) For each of the potential predictors, obtain a scatterplot of GPA plotted against the predictor. Do any of the scatterplots suggest a linear relationship with GPA?

(P2) Obtain the ANOVA table for the regression of GPA on the list of predictors above, omitting mathSAT and verbalSAT. State the null hypothesis being tested by the F-statistic in the context of this problem. Do not use generic terms such as H0 : β1 = 0. Use terminology such as H0 : βSAT = 0. Be very clear about what the hypothesis tests. State your conclusion for the model.

(P3) Obtain the t-tests for each of the variables in the model. Explain, in your own words and in context, what hypothesis is being tested. State your conclusions in context and identify variables that are significant and those that are not on the basis of the p-value.

(P4) For the variables SAT and TV, obtain the t-test from the simple linear regression of GPA on each variable. Compare the test value, standard error and error degrees of freedom to those of the t-tests obtained in problem 3.

(P5) Calculate the average GPA for students who watch 5 hours of TV, have 2 siblings, exercise 10 hours, have 3 piercings and have an SAT score of 1200. Find the standard error and calculate a 95% confidence interval.
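As a rough illustration of the calculation behind (P5), the sketch below estimates the mean response at a given predictor profile x0 and its standard error, s * sqrt(x0' (X'X)^-1 x0), where s² is the mean squared error. It is written in Python with NumPy purely for illustration. The function names, the column order and the data are assumptions: the data are randomly generated stand-ins, not the student survey, and the critical value 2.032 is the 0.975 quantile of the t distribution with 34 degrees of freedom taken from a t table.

```python
import numpy as np

def mean_response(X, y, x0):
    """Return (estimate, standard error, error df) for the mean response at x0.

    X is the n-by-p design matrix whose first column is all ones,
    y is the response vector, x0 is a length-p profile in the same column order.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y               # least-squares coefficients
    resid = y - X @ beta
    df_error = n - p
    mse = resid @ resid / df_error         # estimate of the error variance
    estimate = x0 @ beta
    std_error = np.sqrt(mse * (x0 @ xtx_inv @ x0))
    return float(estimate), float(std_error), df_error

def confidence_interval(estimate, std_error, t_crit):
    """Two-sided interval: estimate plus or minus t_crit standard errors."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width

# Synthetic stand-in data (NOT the student survey). Assumed column order:
# intercept, TV, siblings, exercise, piercings, SAT.
rng = np.random.default_rng(0)
n = 40
X = np.column_stack([
    np.ones(n),
    rng.uniform(0, 20, n),       # TV hours
    rng.integers(0, 5, n),       # siblings
    rng.uniform(0, 15, n),       # exercise hours
    rng.integers(0, 6, n),       # piercings
    rng.uniform(900, 1500, n),   # SAT
])
y = 1.0 + 0.002 * X[:, 5] + rng.normal(0, 0.3, n)   # made-up GPA values

x0 = [1, 5, 2, 10, 3, 1200]      # profile from the problem, intercept first
estimate, std_error, df_error = mean_response(X, y, x0)
# 2.032 is the 0.975 t quantile for df_error = 34, read from a t table.
low, high = confidence_interval(estimate, std_error, t_crit=2.032)
print(estimate, std_error, df_error, (low, high))
```

With the real data the error degrees of freedom, and therefore the critical value, will differ from the 34 and 2.032 used here.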

(P6) Obtain the coefficient of determination for the model with SAT and for the model where SAT is dropped and the two variables mathSAT and verbalSAT are included. Which model has the higher coefficient of determination?

(P7) Obtain the plot of the residuals against the fitted values. Do you see any patterns that might indicate a violation of the regression assumptions? If yes, which ones?

(P8) Obtain the normal probability plot of the residuals.

Chapter 7 problems have a due date of June 5

(P1) Obtain the partial sum of squares and partial F-test for the hypothesis H0 : βsiblings = βTV = 0. Provide the test statistic and the numerator and denominator degrees of freedom, and state your conclusions.
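The partial F-test in (P1) compares the error sum of squares of the full model with that of the reduced model in which the tested coefficients are set to zero: F = ((SSE_reduced - SSE_full) / q) / (SSE_full / (n - p)), where q is the number of coefficients tested and p the number of coefficients in the full model. A minimal Python sketch of that arithmetic follows. The helper names and the data are assumptions made for illustration (the data are random stand-ins, not the student survey), and the assignment itself may expect the computation in other software.

```python
import numpy as np

def sse(X, y):
    """Error (residual) sum of squares from the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def partial_f(sse_reduced, sse_full, num_df, den_df):
    """Partial F statistic for dropping num_df coefficients from the full model."""
    extra_ss = sse_reduced - sse_full         # partial (extra) sum of squares
    return (extra_ss / num_df) / (sse_full / den_df)

# Synthetic stand-in data (NOT the student survey). Assumed predictor order:
# exercise, TV, siblings, piercings, SAT.
rng = np.random.default_rng(1)
n = 40
predictors = rng.normal(size=(n, 5))
y = 3.0 + 0.4 * predictors[:, 4] + rng.normal(0, 0.3, n)

X_full = np.column_stack([np.ones(n), predictors])   # column 0 is the intercept
X_reduced = np.delete(X_full, [2, 3], axis=1)        # drop the TV and siblings columns

sse_full = sse(X_full, y)
sse_reduced = sse(X_reduced, y)
num_df = X_full.shape[1] - X_reduced.shape[1]        # 2 coefficients set to zero
den_df = n - X_full.shape[1]                         # error df of the full model
f_stat = partial_f(sse_reduced, sse_full, num_df, den_df)
print(f_stat, num_df, den_df)
```

The statistic is then compared with an F distribution on num_df and den_df degrees of freedom.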

(P2) In the model building process we can choose various techniques for identifying which of the potential predictors are useful and which ones are not. Use forward selection and backward elimination to build a model for GPA. Do you arrive at the same model? Show your steps and identify the order in which variables are entered in forward selection and removed in backward elimination.

(P3) Obtain the adjusted R² (R²adj) for each of the models fit in the stepwise procedure. If you choose this criterion for model fitting, do you obtain the highest R²adj for the model selected through backward elimination?
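For reference, the adjusted R² penalizes the error sum of squares by its degrees of freedom: R²adj = 1 - (SSE / (n - p)) / (SST / (n - 1)), with p the number of estimated coefficients including the intercept. The short Python sketch below shows the formula. The function name and the numbers are hypothetical and chosen only to show that an extra predictor can lower R²adj even though the SSE falls.

```python
def adjusted_r2(sse, sst, n, p):
    """Adjusted R-squared; p counts all coefficients, including the intercept."""
    return 1.0 - (sse / (n - p)) / (sst / (n - 1))

# Hypothetical values, not taken from the student survey.
n, sst = 30, 100.0
smaller_model = adjusted_r2(sse=40.0, sst=sst, n=n, p=3)
larger_model = adjusted_r2(sse=39.5, sst=sst, n=n, p=4)   # one more predictor
print(smaller_model, larger_model)
```

Here the larger model has the smaller SSE but the lower adjusted R², because the drop in SSE does not make up for the lost error degree of freedom.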

(P4) On the basis of Mallows' Cp, which is the best model with 2, 3 or 4 predictors?
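One common form of the criterion is Cp = SSE_p / MSE_full - (n - 2p), where p counts the coefficients of the candidate model including the intercept and MSE_full comes from the model with all predictors. Check this against the definition used in class. The Python sketch below evaluates Cp for every subset of predictors by brute force. It is an illustrative stand-in for what an all-subsets routine reports, not the LEAPS package itself, and the function names and random data are assumptions.

```python
from itertools import combinations

import numpy as np

def sse(X, y):
    """Error sum of squares from the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def mallows_cp_table(predictors, y):
    """Return {tuple of predictor column indices: Cp} for every non-empty subset."""
    n, k = predictors.shape
    ones = np.ones((n, 1))
    mse_full = sse(np.hstack([ones, predictors]), y) / (n - (k + 1))
    table = {}
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            p = size + 1                       # coefficients including intercept
            sse_p = sse(np.hstack([ones, predictors[:, list(subset)]]), y)
            table[subset] = sse_p / mse_full - (n - 2 * p)
    return table

# Synthetic stand-in data (NOT the student survey): only columns 0 and 1 matter.
rng = np.random.default_rng(2)
n, k = 60, 5
predictors = rng.normal(size=(n, k))
y = 3.0 + predictors[:, 0] - predictors[:, 1] + rng.normal(0, 0.3, n)

table = mallows_cp_table(predictors, y)
for size in (2, 3, 4):
    best = min((s for s in table if len(s) == size), key=table.get)
    print(size, best, round(table[best], 2))
```

Under this definition the full model always has Cp equal to its own p, so the comparison is only informative for the smaller subsets.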

(P5) Try fitting a model with only mathSAT, verbalSAT and SAT. What happens to the parameter estimates βmathSAT, βverbalSAT, βSAT? Do you have an explanation?
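One thing worth checking for (P5) is whether the three predictors are linearly dependent. The Python sketch below assumes, and this is only an assumption since the assignment does not state it, that SAT is the sum of mathSAT and verbalSAT. The scores are made up for illustration. Under that assumption the design matrix loses a rank, and different coefficient vectors give exactly the same fitted values.

```python
import numpy as np

# Made-up scores for illustration only (NOT the student survey).
math_sat = np.array([520.0, 610.0, 480.0, 700.0, 560.0, 650.0])
verbal_sat = np.array([540.0, 580.0, 500.0, 690.0, 600.0, 610.0])
total_sat = math_sat + verbal_sat      # ASSUMPTION: SAT = mathSAT + verbalSAT

# Design matrix with columns: intercept, mathSAT, verbalSAT, SAT.
X = np.column_stack([np.ones(6), math_sat, verbal_sat, total_sat])
rank = np.linalg.matrix_rank(X)
print(X.shape[1], rank)                # number of columns versus rank

# Two different coefficient vectors that produce identical fitted values.
beta_a = np.array([1.0, 0.002, 0.001, 0.0])
beta_b = np.array([1.0, 0.0, -0.001, 0.002])
same_fit = np.allclose(X @ beta_a, X @ beta_b)
print(same_fit)
```

If the rank is smaller than the number of columns, the least-squares coefficients are not uniquely determined, which is the behaviour to look for in the software output.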