Save your script in the homework/ folder, named full_name_hw##.R (for example, justin_pomeranz_hw05.R).
Source with echo
Make sure the dplyr and ggplot2 packages are downloaded on the machine you’re using. You can check by clicking the Packages tab in the Files, Plots, Packages… panel and scrolling down. If a package is missing, install it, for example:
install.packages("dplyr")
Load the packages:
# load libraries
library(dplyr)
library(ggplot2)
For this homework assignment, we will be working with real data from “Darwin’s Finches” in the Galapagos Islands. Famously, the variation in beak sizes of these birds is one of the observations often credited with inspiring Darwin’s formulation of his Theory of Natural Selection (although in reality he just brought back finch specimens, and museum archivists were the ones to realize the diversity of species on the islands).
Figure: By John Gould (14.Sep.1804 - 3.Feb.1881), from “Voyage of the Beagle”. Public Domain, available at http://darwin-online.org.uk/converted/published/1845_Beagle_F14/1845_Beagle_F14_fig07.jpg
The beaks of the Galapagos finches were the primary detail of interest. For birds, beaks are the primary tool at their disposal, and the type (size, shape, etc.) of beak you have determines what types of food resources are available to you, and the types of “jobs” you can do.
Regardless of Darwin’s personal affinity (or lack thereof) for these birds, they have nonetheless played an outsized role in the history of biology. In more recent times, Drs. Peter and Rosemary Grant have spent decades on the islands following the birds through time and documenting changes in finch morphology in response to drought and other environmental conditions. Their work has been well documented, including in the popular science book The Beak of the Finch by Jonathan Weiner, and much of this work is summarized in an educational documentary from HHMI.
In this homework assignment, we will be working with a subset of the Grants’ beak measurement data for two of the species, Geospiza fortis and G. scandens, collected on Daphne Major Island in 1975 and 2012. The full data set is available at Data Dryad.
Be sure to download the data, galapago-finches.csv, from D2L and place it in the data/ folder in your R project.
Copy the following code and put it at the top of your homework script:
finch <- read.csv("data/galapago-finches.csv")
Get to know your data by using the names(), dim(), head(), and str() functions.
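As a rough illustration of what these four functions report, here is a minimal sketch on a small made-up data frame. The object name toy and its values are hypothetical, and the year column is an assumed name; only species, blength, and bdepth are named in this assignment.

```r
# Toy stand-in for the finch data. The values are made up, and the
# "year" column name is an assumption.
toy <- data.frame(
  species = c("fortis", "fortis", "scandens", "scandens"),
  year    = c(1975, 2012, 1975, 2012),
  blength = c(10.5, 10.9, 14.1, 13.6),
  bdepth  = c(9.4, 9.1, 9.0, 9.2)
)

names(toy)  # column names
dim(toy)    # number of rows, then number of columns
head(toy)   # first six rows (here, all four)
str(toy)    # type of each column plus a preview of its values
```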
Which species are present? To determine this, use the following code:
unique(finch$species)
Split the data based on species, and make two new data frames, one called fortis and one called scandens.
* Use the filter() function from dplyr.
* Use == in the filter function, for example:
fortis <- finch %>% filter(species == "fortis")
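To see what == does inside filter(), here is a minimal sketch on made-up rows. The toy and toy_fortis objects and their values are hypothetical, not the assignment data.

```r
library(dplyr)

# Made-up rows; only the species column matters here.
toy <- data.frame(
  species = c("fortis", "scandens", "fortis", "scandens", "fortis"),
  blength = c(10.5, 14.1, 10.9, 13.6, 11.2)
)

# == tests each row for equality; filter() keeps rows where the test is TRUE.
# A single = names an argument instead, so filter(species = "fortis") errors.
toy_fortis <- toy %>% filter(species == "fortis")

nrow(toy_fortis)            # number of rows kept
unique(toy_fortis$species)  # only "fortis" should remain
```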
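Split each species data frame by year to make four new data frames:
fortis1975
fortis2012
scandens1975
scandens2012
Check the dimensions of each new data frame with the dim() function. Do these dimensions make sense? You may want to rerun the dim() from problem 1.1 above.
fortis1975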
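Use ggplot() to make separate histograms for both the blength and the bdepth variables.
* Map the variable in the aesthetics, e.g. aes(x = blength)
* Put + at the end of each function in order to add another layer.
* Do not put + at the last line.
* Use geom_histogram(bins = 15)
* Use the labs(title = "...") function to add a title with the species name and year.
* Add theme_bw() if you want to.
Start any written answer with # to make it a comment in your R script.
Calculate the mean, SD and SEM of blength for the fortis1975 object.
Calculate a 95% and a 99% CI for blength for the fortis1975 object. Your 95% CI will have a 2* and your 99% will have a 3* in the calculation.
Write any interpretation as a comment, starting with #, in your R script.
fortis2012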
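The histogram recipe described for fortis1975 (aes(), geom_histogram(), labs(), and theme_bw() chained with +) can be sketched roughly as follows. The data frame toy, its values, and the title text are made up for illustration.

```r
library(ggplot2)

# Made-up beak lengths standing in for one species-year data frame.
toy <- data.frame(
  blength = c(9.8, 10.1, 10.4, 10.5, 10.7, 10.9, 11.0, 11.3, 11.6, 12.0)
)

p <- ggplot(toy, aes(x = blength)) +       # + at the end of a line adds a layer
  geom_histogram(bins = 15) +
  labs(title = "Toy species, toy year") +  # placeholder title
  theme_bw()                               # no + on the last line

p  # printing the object draws the plot
```

Swapping blength for bdepth in aes() would give the second histogram, assuming the data frame has that column.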
Calculate the mean, SD and SEM for beak length for the fortis species from 2012.
Calculate a 95% and 99% CI for beak length for the fortis species from 2012.
Compare and interpret the 95% CIs for the fortis species from 1975 and 2012.
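The mean, SD, SEM, and CI calculations can be sketched on a made-up vector as follows. The sketch assumes the usual definition SEM = SD / sqrt(n) and uses the 2* and 3* multipliers from the hint in this assignment. The vector x and all object names are hypothetical.

```r
# Made-up beak lengths; not the Grants' data.
x <- c(10.2, 10.8, 11.1, 10.5, 10.9, 11.4, 10.6, 10.7)

x_mean <- mean(x)
x_sd   <- sd(x)
x_sem  <- x_sd / sqrt(length(x))  # assumed definition: SEM = SD / sqrt(n)

# Lower and upper bounds, using the 2* and 3* multipliers from the hint.
ci95 <- c(lower = x_mean - 2 * x_sem, upper = x_mean + 2 * x_sem)
ci99 <- c(lower = x_mean - 3 * x_sem, upper = x_mean + 3 * x_sem)

ci95
ci99
```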
scandens beak depth
Calculate the mean, SD, SEM, and 95% confidence interval for the blength variable for the scandens species in both 1975 and 2012.
Interpret the 95% confidence intervals for the scandens species in the two years.
In the above example we split the original data object into four based on combinations of finch species and survey year. However, we could have performed all of these calculations on the original data set using dplyr. Using dplyr functions and the original, complete, finch data object, calculate the mean, SD, SEM, and 95% CIs for beak length for each combination of year and species.
Hints
* Recall the group_by() function.
* summarize() collapses a large data set down to one row per group.
* The mutate() function can be used to make new column(s) based on calculations using values already in columns.
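How the three hinted functions fit together can be sketched on made-up data as follows. This is one possible arrangement, not the required answer. The toy data frame, the year column name, and the output column names are assumptions, and the CI uses the 2* multiplier from earlier in the assignment.

```r
library(dplyr)

# Made-up data: two species, two years, three birds in each combination.
toy <- data.frame(
  species = rep(c("fortis", "scandens"), each = 6),
  year    = rep(rep(c(1975, 2012), each = 3), times = 2),
  blength = c(10.2, 10.8, 11.1, 10.5, 10.9, 11.4,
              13.9, 14.3, 14.0, 13.2, 13.6, 13.4)
)

toy_summary <- toy %>%
  group_by(species, year) %>%      # one group per species-year combination
  summarize(
    mean_bl = mean(blength),
    sd_bl   = sd(blength),
    n       = n(),                 # number of rows in the group
    .groups = "drop"               # return an ungrouped result
  ) %>%
  mutate(
    sem_bl  = sd_bl / sqrt(n),     # new columns built from existing ones
    lower95 = mean_bl - 2 * sem_bl,
    upper95 = mean_bl + 2 * sem_bl
  )

toy_summary
```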