Save your script in the homework/ folder, named full_name_hw##.R (for example, justin_pomeranz_hw14.R), and run it using Source with echo.
Here we go for our final linear-model example. It is unique in that it combines a categorical explanatory variable with a continuous explanatory variable. What are we up to? We are combining regression and one-way ANOVA! Yes we are.
# load libraries
library(dplyr)
library(ggplot2)
Download the limpet.csv file from D2L and put it in your “data” folder. Read it into R and convert it to a tibble with the following command.
# read in data for problem 1
limp <- read.csv("data/limpet.csv") %>%
  as_tibble()
The dataset we will use is limpet.csv and is originally from Quinn and Keough’s 2002 book Experimental Design and Data Analysis for Biologists. The data relate egg production by limpets to four density conditions in two seasons. The response variable (y) is egg production (EGGS) and the independent variables (x’s) are DENSITY (continuous) and SEASON (categorical).
Because we are examining egg production along a continuous density
gradient, this is essentially a study of density-dependent reproduction.
The experimental manipulation of density was implemented in spring and
in summer. Thus, a motivation for collecting these data could be ‘does
the density dependence of egg production differ between spring and
summer’?
Density dependence is the idea that as the number of individuals sharing a food resource increases (i.e. their density goes up), each individual gets a smaller and smaller portion of that food, leading to reduced reproductive success.
1.1 Print the names() and head() of your data object here.
1.2 Add a comment to your script with each variable and its class. Recall that SEASON and DENSITY are our predictor variables.
1.3 Produce a scatter plot of the data with EGGS as the response variable and DENSITY as the predictor variable.
1.4 Does there appear to be a linear relationship between the number of eggs and increasing population density? Why or why not?
1.5 Add a line of best fit to your scatter plot with geom_smooth(method = "lm").
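As a minimal sketch of the technique in 1.5, the code below layers a least-squares line over a scatter plot. It uses simulated stand-in data, not the limpet data: the `toy` data frame and its `density` and `eggs` columns are made up purely for illustration.

```r
library(ggplot2)

# Simulated stand-in data (NOT the limpet data): a noisy negative trend
set.seed(1)
toy <- data.frame(density = rep(c(8, 15, 30, 45), each = 6))
toy$eggs <- 2.5 - 0.03 * toy$density + rnorm(nrow(toy), sd = 0.3)

# Points first, then a straight line fit by least squares.
# By default geom_smooth() also draws a shaded 95% confidence band.
p <- ggplot(toy, aes(x = density, y = eggs)) +
  geom_point() +
  geom_smooth(method = "lm")
p
```

Adding `se = FALSE` inside `geom_smooth()` would drop the shaded band if it clutters the plot.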
1.6 Now, make another scatter plot (without the line) which includes SEASON mapped to the color aesthetic.
1.7 Does it appear that there is a different relationship between spring and summer? In other words, do the colors seem to “group” on one side/part of the graph, or are the colors randomly or evenly distributed across the graph?
1.8 Repeat the plot in 1.6, but now add a line of best fit. Below the plot, add a comment as to whether or not your answer in 1.7 is supported or if you’ve changed your mind.
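The sketch below shows one way a grouping variable can be mapped to color, again on simulated stand-in data (the `toy` data frame and its `density`, `season`, and `eggs` columns are hypothetical). Because `color` is set in the top-level `aes()`, `geom_smooth()` inherits the grouping and fits one line per group rather than a single pooled line.

```r
library(ggplot2)

# Simulated stand-in data (NOT the limpet data): two groups, same slope,
# different intercepts
set.seed(2)
toy <- data.frame(
  density = rep(c(8, 15, 30, 45), times = 6),
  season  = rep(c("spring", "summer"), each = 12)
)
toy$eggs <- ifelse(toy$season == "spring", 2.6, 1.8) -
  0.03 * toy$density + rnorm(nrow(toy), sd = 0.25)

# color = season in the top-level aes() applies to every layer,
# so both the points and the fitted lines are split by group
p <- ggplot(toy, aes(x = density, y = eggs, color = season)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
p
```

Leaving out the `geom_smooth()` layer gives the points-only version of the same plot.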
1.9 Let’s check the normality of residuals with a Q-Q plot. Make the plot, and add a comment below discussing whether we meet or fail our assumption.
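A Q-Q plot of residuals needs a fitted model to take the residuals from. The sketch below assumes a model fit on simulated stand-in data (the `toy` data frame, its columns, and the `fit` object are hypothetical) and uses `stat_qq()` with `stat_qq_line()` from ggplot2. Base R's `qqnorm()` and `qqline()` would be a reasonable alternative.

```r
library(ggplot2)

# Simulated stand-in data (NOT the limpet data)
set.seed(3)
toy <- data.frame(
  density = rep(c(8, 15, 30, 45), times = 6),
  season  = rep(c("spring", "summer"), each = 12)
)
toy$eggs <- ifelse(toy$season == "spring", 2.6, 1.8) -
  0.03 * toy$density + rnorm(nrow(toy), sd = 0.25)

# Fit a model, then pull out its residuals
fit <- lm(eggs ~ density * season, data = toy)
res <- data.frame(residual = resid(fit))

# Sample quantiles of the residuals against theoretical normal quantiles;
# points falling close to the reference line suggest approximate normality
qq <- ggplot(res, aes(sample = residual)) +
  stat_qq() +
  stat_qq_line()
qq
```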
1.10 Fit a model using lm() for the limpet data, and save it as a new data object. Make sure to include the main effects of both explanatory variables AND the interaction between them.
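In R's formula syntax, `a * b` is shorthand for both main effects plus their interaction (`a + b + a:b`). The sketch below illustrates this on simulated stand-in data. The `toy` data frame, its columns, and the object names `fit` and `fit_long` are hypothetical.

```r
# Simulated stand-in data (NOT the limpet data)
set.seed(4)
toy <- data.frame(
  density = rep(c(8, 15, 30, 45), times = 6),
  season  = rep(c("spring", "summer"), each = 12)
)
toy$eggs <- ifelse(toy$season == "spring", 2.6, 1.8) -
  0.03 * toy$density + rnorm(nrow(toy), sd = 0.25)

# Short form: main effects of density and season, plus their interaction
fit <- lm(eggs ~ density * season, data = toy)

# Long form: the same model written out term by term
fit_long <- lm(eggs ~ density + season + density:season, data = toy)

# Both fits estimate the same four coefficients
names(coef(fit))
all.equal(coef(fit), coef(fit_long))
```

Because `season` is stored as text, `lm()` treats it as a factor with levels in alphabetical order, so `spring` becomes the reference group here.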
1.11 Before we look at the model output, let’s make a fitted vs. residual plot to check our assumption of equal variance of residuals.
1.12 Add a comment to interpret the plot made above. Does this model appear to have equal variance in the residuals? Why or why not?
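One way to build a fitted vs. residual plot is to extract both quantities from the model with `fitted()` and `resid()` and plot them against each other. The sketch below does this for a model fit on simulated stand-in data (the `toy` data frame, `fit`, and `diag_df` are hypothetical names).

```r
library(ggplot2)

# Simulated stand-in data (NOT the limpet data)
set.seed(5)
toy <- data.frame(
  density = rep(c(8, 15, 30, 45), times = 6),
  season  = rep(c("spring", "summer"), each = 12)
)
toy$eggs <- ifelse(toy$season == "spring", 2.6, 1.8) -
  0.03 * toy$density + rnorm(nrow(toy), sd = 0.25)

fit <- lm(eggs ~ density * season, data = toy)

# One row per observation: the model's prediction and what is left over
diag_df <- data.frame(fitted = fitted(fit), residual = resid(fit))

# Under equal variance, the vertical spread around the zero line should
# look roughly constant from left to right (no funnel or fan shape)
p <- ggplot(diag_df, aes(x = fitted, y = residual)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed")
p
```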
1.13 Now, use the anova() function on your model fit object above to print out an ANOVA table for your linear regression.
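As a rough illustration of what `anova()` returns for a model with an interaction, the sketch below fits the same kind of model to simulated stand-in data (the `toy` data frame, `fit`, and `tab` are hypothetical names). For a single `lm` fit, `anova()` gives a sequential table: one row per term, in the order the terms appear in the formula, followed by a residual row.

```r
# Simulated stand-in data (NOT the limpet data)
set.seed(6)
toy <- data.frame(
  density = rep(c(8, 15, 30, 45), times = 6),
  season  = rep(c("spring", "summer"), each = 12)
)
toy$eggs <- ifelse(toy$season == "spring", 2.6, 1.8) -
  0.03 * toy$density + rnorm(nrow(toy), sd = 0.25)

fit <- lm(eggs ~ density * season, data = toy)

# Rows: continuous predictor, categorical predictor, interaction, residuals
tab <- anova(fit)
tab

# Degrees of freedom: 1 per slope term, (groups - 1) for the factor,
# and n minus the number of estimated coefficients for the residuals
tab$Df
```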
1.14 Add a comment to your script interpreting the ANOVA table for our ANCOVA analysis. Discuss what the predictor variable means, and how that relates to a linear model. Recall which variables were continuous or categorical.
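HINT: Continuous predictors indicate whether there is a relationship with the response (linear regression).
Categorical predictors indicate whether there is a difference in means between groups (different intercepts).
The interaction term indicates whether the slope differs between groups.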
1.15 Print out the summary() of the linear model object you created.
1.16 Add a comment to your script describing what each coefficient means. Be sure to specify what group (if necessary) the coefficient refers to.
1.17 Add a comment which has the adjusted R-squared value from our model as well as an interpretation for what it means.
1.18 Add a comment to your script writing out the full equations for our linear model.
\[EGGS_{spring} = \beta_0 + \beta_1 * DENSITY \]
\[EGGS_{summer} = (\beta_0+\beta_{0 \Delta}) + (\beta_1 +\beta_{1 \Delta}) * DENSITY \]
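The two equations above can be connected to the output of `coef()`. In the sketch below, on simulated stand-in data (the `toy` data frame and all object names are hypothetical), the reference group's intercept and slope are read off directly, and the second group's are obtained by adding the difference terms. As a sanity check, the second group's values are compared with a separate regression fit to that group alone.

```r
# Simulated stand-in data (NOT the limpet data): the two groups differ
# in both intercept and slope
set.seed(7)
toy <- data.frame(
  density = rep(c(8, 15, 30, 45), times = 6),
  season  = rep(c("spring", "summer"), each = 12)
)
toy$eggs <- ifelse(toy$season == "spring", 2.6, 1.8) +
  ifelse(toy$season == "spring", -0.03, -0.02) * toy$density +
  rnorm(nrow(toy), sd = 0.25)

fit <- lm(eggs ~ density * season, data = toy)
b <- coef(fit)

# Reference group (spring): beta_0 and beta_1 as estimated
spring <- c(intercept = b[["(Intercept)"]], slope = b[["density"]])

# Second group (summer): reference values plus the difference terms
summer <- c(
  intercept = b[["(Intercept)"]] + b[["seasonsummer"]],
  slope     = b[["density"]] + b[["density:seasonsummer"]]
)

# Check: a regression on the summer rows alone gives the same line
summer_only <- coef(lm(eggs ~ density, data = subset(toy, season == "summer")))
all.equal(unname(summer), unname(summer_only))
```

The check works because a model with both a group-specific intercept and a group-specific slope fits each group's line independently. Only the residual variance is pooled across groups.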