homework/
folderfull_name_hw##.R
justin_pomeranz_hw08.R
Source with echo
# load libraries
library(dplyr)
library(ggplot2)
# read in data
bt_samp <- read_csv("data/body_temp_hr.csv")
We will be formally testing whether the true population mean of human body temperatures is eaqual to 98.6 degrees Fahrenheit.
1.1 Add a comment to your homework script stating the null and
alternative hypotheses. Make sure to put each hypothesis on its own
line. Make sure that your alternative hypothesis is 2-tailed.
1.2 Extract the body_temp
column/variable from
bt_samp
and store it as a vector called b
.
1.3 Create new variables for the mean, SD and SEM of the vector
b
. Be sure that the values print out when you source your
script so that I can check your work.
1.4 Calculate a t-statistic and print it out to your console.
1.5 calculate the critical value of t for this analysis and store it as a variable.
HINT use the qt()
function, and
make sure you calculate the correct degrees of freedom for this data
set.
Give it a different name than what was used in the lab.
1.6 Add a comment interpreting how the t-stat compares to our t-critical value. Do we reject or fail to reject our null hypothesis?
1.7 Calculate the p-value for your t-statistic in this analysis.
1.8 Analyze your b
vector using the lm()
function. Make sure to correctly account for your null value of 98.6
degrees.
summary()
function to print out the results1.9 Add a comment to your script interpreting the results of your analysis. Be sure to refer back to the null and alternative hypotheses (reject or fail to reject?) include the t-statistic, the degrees of freedom, and the p-value in your answer.
scandens
We want to know if the average beak length (blength
) has
changed in the population of Geospiza scandens living on Daphne
Major in the Galapagos islands between 1975 and 2012.
We will once again be using the Geospiza scandens data from
the darwin-finches.csv
.
# read in full data set
finch <- read_csv("data/galapago-finches.csv")
# filter out scandens data and make a tibble
scandens <- finch %>%
filter(species == "scandens")
# change the year column to a character data type
# this is necessary to run a 2-sample t-test
# If we leave year as an integer, the lm() function
# will treat it as continuous and will calculate statistics for a linear regression.
scandens$year <- as.character(scandens$year)
All of the following questions refer to the scandens
object.
2.1 Add a comment to your homework script stating the null and alternative hypotheses. Make sure to put each hypothesis on its own line. Make sure that your alternative hypothesis is 2-tailed.
2.2 Use dplyr
, group_by()
,
summarize()
and mutate()
to calculate the
following statistics for the blength
variable in each
year
:
1.3 Use the lm()
function to perform a 2-sample t-test
on whether or not the average blength
(in mm) is equal
between the two years of data.
+ Note that you will be using the scandens
data for this
step, not the summarized data you calculated in 1.2 above.
+ Make sure to save your model-fit as an object, and then use the
summary()
function to print out the results.
1.4 Add a comment to your script interpreting the results of your analysis.
2.3 Use the lm()
function to perform a 2-sample t-test
on whether or not the average blength
(in mm) is equal
between the two years of data.
summary()
function to print out the results.2.4 What is the estimate and standard error for blength
in the “1975” sample? What is the estimate and standard error for the
difference in blength
between “1975” and the
“2012” samples?
2.5 Add a comment to your script interpreting the results of your analysis. Be sure to refer back to the null and alternative hypotheses (reject or fail to reject?) include the t-statistic, the degrees of freedom, and the p-value in your answer.
darwin_maize
For this problem, we will be returning to the
darwin_maize
data. Recall that the original experimental
design included growing plants from different fertilization categories
(self vs. cross). The individual plants were taken from the same parent,
so are “paired” in this sense. Here, we will be testing the hypothesis
that there is a difference in height
between the Self and
Cross type
.
darwin_maize
For this problem, we will be using the darwin_maize.csv
available on D2L. Download this csv file into your data/
folder and read it into R.
darwin_maize <- read_csv("data/darwin_maize.csv")
3.1 Add a comment to your homework script stating the null and alternative hypotheses. Make sure to put each hypothesis on its own line. Make sure that your alternative hypothesis is 2-tailed.
3.2. Make a new data object called darwin_diff
which
includes the difference
in heights
for each
pair
in the data set. + As in the lab, the way the data is
structured does not make this task simple, but we can use
group_by()
function before using the -diff()
function inside of mutate()
to accomplish this.
+ See the caterpillar
example from the lab.
+ Hint your new object should contain the following
columns: pair
and difference
.
+ Make sure to print out your new darwin_diff
object at the
end of this problem.
3.3 Use lm()
function to formally test your
hypothesis.
+ Store the output of lm()
in a new data object, and then
use the summary()
function to print out the results.
3.4 Add a comment to your script interpreting the results of your analysis.
# read in the data
darwin_maize <- read_csv(here::here("data/darwin_maize.csv"))
3.1 Add a comment to your script describing the null and alternative hypotheses. Put each hypothesis on its own line.
3.2 You will need to modify the structure of the
darwin_maize
object in order to formally test these
hypotheses. Follow the example in lab, and save your modified data as a
new object.
pair
and you will
want to calculate the difference in the height
variable.3.3 Use the lm()
function to calculate a formal test on
the modified data object.
Hint: If you modified the data correctly in 3.2, you should be
setting up an “intercept-only” model inside of
lm()
.
Be sure to store your model fit in a new object and run the
summary()
function on it.
3.4 Add a comment to your script describing the estimated average and standard error for the difference between the two groups.
3.5 Add a comment to your script interpreting the results of your analysis. Be sure to refer back to the null and alternative hypotheses (reject or fail to reject?) include the t-statistic, the degrees of freedom, and the p-value in your answer.