# Lab 5

Name: ______________________________________________________

You are working on an alternative energy source, and biomass is a key component. You want to predict above-ground biomass (Bio) for this region, and you believe that biomass is related to five substrate (subsoil) variables: salinity (sal), water acidity (pH), potassium (K), sodium (Na), and zinc (Zn). Your crew collects information on biomass and these five variables for 45 plots.

1) Before you create a regression model, you must examine the relationship between each of the five predictor variables and biomass (the response variable). Create five scatterplots using biomass as the response variable (y) and each of the predictor variables in turn as x. Compute the linear correlation coefficient for each pair. Describe the relationships.

To create the scatterplots, choose GRAPH>Scatterplot>Simple>OK. The response variable (y-variable) is Bio, and each of the five predictor variables is an x-variable. Look at the scatterplots and describe each relationship below. Next, compute the correlation coefficient for each pair (STAT>Basic Statistics>Correlation) and write the r-value below. You can compute all of the correlations at once by creating a correlation matrix: put Bio and all five predictor variables in the Variables box together.

Correlation (r)                           Description

Bio v. sal ______________________________________________________

Bio v. pH ______________________________________________________

Bio v. K _______________________________________________________

Bio v. Na ______________________________________________________

Bio v. Zn ______________________________________________________

Circle the above pair that has the strongest linear relationship.
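If you want to check a Minitab r-value by hand, the correlation step can be sketched in Python as below. This is only an illustration: the numbers are made up (they are not the lab data, and only two predictors are shown), and the function name `pearson_r` is a hypothetical helper, not part of Minitab.

```python
import math

def pearson_r(x, y):
    """Pearson linear correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values, NOT the 45 plots from the lab.
data = {
    "Bio": [10.0, 12.0, 15.0, 19.0, 24.0, 30.0],
    "pH": [4.1, 4.3, 4.2, 4.8, 5.0, 5.3],
    "K": [900.0, 950.0, 870.0, 910.0, 940.0, 880.0],
}

# Correlate each predictor with the response, as in the worksheet table.
predictors = [name for name in data if name != "Bio"]
correlations = {name: pearson_r(data[name], data["Bio"]) for name in predictors}

for name, r in correlations.items():
    print(f"Bio v. {name}: r = {r:+.3f}")

# The strongest linear relationship has the largest |r|, whatever its sign.
strongest = max(correlations, key=lambda name: abs(correlations[name]))
print("Strongest linear relationship: Bio v.", strongest)
```

Note that strength is judged by the absolute value of r, so a strongly negative correlation can be the one to circle.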

2) You are now going to create four regression models using the predictor variables. You will compare the adjusted R2, regression standard error, p-values for each coefficient, and the residuals for each model. Using this information, you will select the best model and state your reasons for this choice.
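For reference, the two summary statistics you will compare can be written out as follows. This assumes the usual textbook definitions, with n the number of plots (45 here), p the number of predictor variables in the model, SSE the residual sum of squares, and SST the total sum of squares; these symbols are introduced here for illustration and do not appear in the Minitab output.

```latex
% Regression standard error (reported as S):
S = \sqrt{\frac{\mathrm{SSE}}{n - p - 1}}

% Adjusted R^2:
R^2_{\mathrm{adj}} = 1 - \frac{\mathrm{SSE}/(n - p - 1)}{\mathrm{SST}/(n - 1)}
```

For the full model, p = 5, so the residual degrees of freedom are 45 - 5 - 1 = 39. Each time you remove a variable, p drops by one, so adjusted R2 can rise even though the ordinary R2 cannot.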

Begin with the full model using all five predictor variables. STAT>Regression>General Regression. Put Bio in the Response box and all five predictor variables in the Model box (see image). Click Results and make sure that Regression equation, Coefficient table, Display confidence intervals, Summary of Model, and Analysis of Variance table are checked (see image), then click OK. Click Graphs and make sure that, under Residual Plots, Individual plots and Residual versus Fits are selected (see image), then click OK.

#### MODEL 1

Write the regression model _______________________________________________

Write the adj. R2 ________________________________________________________

Write the regression standard error _________________________________________

Examine the residual plot. Are there any problems? ____________________________

Write the variables which are NOT significant ________________________________
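The quantities recorded above can also be reproduced outside Minitab. The sketch below fits a least-squares model in plain Python and reports the coefficients, adjusted R2, regression standard error, and a t-statistic for each coefficient. It is a minimal illustration under stated assumptions: the data are made up (8 plots and two predictors, not the lab data), and `solve` and `fit_ols` are hypothetical helper names. It reports t-statistics rather than p-values; within one model all coefficients share the same degrees of freedom, so the predictor with the smallest |t| is the one with the highest p-value.

```python
import math

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_ols(columns, y):
    """Least-squares fit of y on the predictor columns plus an intercept."""
    n, p = len(y), len(columns)
    k = p + 1
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(XtX, Xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    sse = sum(e * e for e in residuals)
    mean_y = sum(y) / n
    sst = sum((yi - mean_y) ** 2 for yi in y)
    df = n - p - 1                      # residual degrees of freedom
    s = math.sqrt(sse / df)             # regression standard error
    adj_r2 = 1 - (sse / df) / (sst / (n - 1))
    t_stats = []
    for j in range(k):
        # j-th diagonal entry of (X'X)^-1 scales the coefficient's variance.
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        inv_jj = solve(XtX, unit)[j]
        t_stats.append(beta[j] / (s * math.sqrt(inv_jj)))
    return {"beta": beta, "adj_r2": adj_r2, "s": s,
            "t": t_stats, "residuals": residuals}

# Made-up illustrative values, NOT the lab data.
names = ["pH", "K"]
pH = [4.1, 4.3, 4.2, 4.8, 5.0, 5.3, 4.6, 4.9]
K = [900.0, 950.0, 870.0, 910.0, 940.0, 880.0, 920.0, 890.0]
bio = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0, 18.0, 22.0]

model = fit_ols([pH, K], bio)
print("Coefficients (intercept first):", [round(b, 4) for b in model["beta"]])
print("Adjusted R2:", round(model["adj_r2"], 4))
print("Regression standard error S:", round(model["s"], 4))

# t_stats[0] belongs to the intercept, so the predictors start at index 1.
t_by_name = dict(zip(names, model["t"][1:]))
weakest = min(names, key=lambda nm: abs(t_by_name[nm]))
print("Least significant predictor (smallest |t|):", weakest)
```

The residuals returned by `fit_ols` are the values plotted against the fitted values in the Residual versus Fits plot; a pattern such as a curve or a funnel shape in that plot signals a problem with the model.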

#### MODEL 2

Now remove the LEAST significant variable (highest p-value) and repeat the steps using only the remaining variables.

Write the regression model _______________________________________________

Write the adj. R2 ________________________________________________________

Write the regression standard error _________________________________________

Examine the residual plot. Are there any problems? ____________________________

Write the variables which are NOT significant ________________________________

#### MODEL 3

Now remove the LEAST significant variable (highest p-value) and repeat the steps using only the remaining variables.

Write the regression model _______________________________________________

Write the adj. R2 ________________________________________________________

Write the regression standard error _________________________________________

Examine the residual plot. Are there any problems? ____________________________

Write the variables which are NOT significant ________________________________

#### MODEL 4

Now remove the LEAST significant variable (highest p-value) and repeat the steps using only the remaining variables.

Write the regression model _______________________________________________

Write the adj. R2 ________________________________________________________

Write the regression standard error _________________________________________

Examine the residual plot. Are there any problems? ____________________________

Write the variables which are NOT significant ________________________________

3) Select the best model and state your reasons for selecting this model.