- We’ve learned about two general ways to store data, vectors and data frames
- Vectors store a single set of values with the same type
- Data frames store multiple sets of values, one in each column, that can have different types
- These two ways of storing data are related to one another
- A data frame is a bunch of equal length vectors that are grouped together
- So, we can extract vectors from data frames and we can also make data frames from vectors
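As a minimal sketch of the idea that a data frame is a group of equal length vectors, the example below builds a tiny data frame and checks the length and type of each column. The `plots` data frame and its values are made up for illustration and are not part of the Portal data.

```r
# Hypothetical example data, not from the Portal dataset
plots <- data.frame(
  plot_id = c(1, 2, 3),
  habitat = c("grass", "shrub", "grass")
)

# Number of rows in the data frame
nrow(plots)

# Each column holds one value per row, so every length matches nrow(plots)
sapply(plots, length)

# The columns can still have different types
sapply(plots, class)
```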
Extracting vectors from data frames
- There are several ways to extract a vector from a data frame
- Let’s look at these using the Portal data
- We’ll start by loading the `surveys` table into R
surveys <- read.csv("surveys.csv")
- One common approach to extracting a column into a vector is to use `[]`
- Remember that `[]` also means “give me a piece of something”
- Let’s get the `species_id` column
- `"species_id"` has to be in quotes because we aren’t using `dplyr`
surveys["species_id"]
- This actually returns a one column data frame, not a vector
- To extract a single column as a vector we use two sets of `[]`
- Think of the second set of `[]` as getting the single vector from inside the one column data frame
surveys[["species_id"]]
- We can also do this using `$`
- The `$` in R is shorthand for `[[]]` in cases where the piece we want to get has a name
- So, we start with the object we want a part of, our `surveys` data frame
- Then the `$` with no spaces around it
- And then the name of the `species_id` column (without quotes, just to be confusing)
surveys$species_id
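To compare the three approaches side by side, here is a short sketch that checks what each one returns. It uses a small stand-in data frame, `surveys_small`, with made-up values so that it runs without `surveys.csv`; the same checks should work on the full `surveys` table.

```r
# Small stand-in for the surveys table (hypothetical values)
surveys_small <- data.frame(
  species_id = c("DM", "PP", "DO"),
  weight = c(40, 17, 52),
  stringsAsFactors = FALSE  # keep text as character in older versions of R
)

one_col_df <- surveys_small["species_id"]      # one set of []: a data frame
vec_brackets <- surveys_small[["species_id"]]  # two sets of []: a vector
vec_dollar <- surveys_small$species_id         # $: the same vector

class(one_col_df)
class(vec_brackets)

# Both vector approaches give exactly the same result
identical(vec_brackets, vec_dollar)
```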
Combining vectors to make a data frame
- We can also combine vectors to make a data frame
- We can make a data frame using the `data.frame` function
- It takes one argument for each column in the data frame
- The argument includes the name of the column we want in the data frame, `=`, and the name of the vector whose values we want in that column
- Just like `mutate` and `summarize`
- So we give it the arguments `sites` and `density`
density_data <- data.frame(sites = c("a", "a", "b", "c"), density = c(2.8, 3.2, 1.5, 3.8))
- If we look in the Global Environment we can see that there is a new data frame called `density_data` and it has our two vectors as columns
- We could also make this directly using the vectors that are already stored in variables
sites <- c("a", "a", "b", "c")
density <- c(2.8, 3.2, 1.5, 3.8)
density_data <- data.frame(sites = sites, density = density)
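Besides looking in the Global Environment, we can check the result from the console. This is a small sketch using base R functions to inspect the data frame; it repeats the vector definitions so that it runs on its own.

```r
sites <- c("a", "a", "b", "c")
density <- c(2.8, 3.2, 1.5, 3.8)
density_data <- data.frame(sites = sites, density = density)

str(density_data)    # compact summary of the columns and their types
names(density_data)  # column names, taken from the left side of each =
dim(density_data)    # number of rows, then number of columns
```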
- We can also add columns to the data frame that only include a single value without first creating a vector
- We do this by providing a name for the new column, an equals sign, and the value that we want to occur in every row
- For example, if all of this data was collected in the same year and we wanted to add that year as a column in our data frame we could do it like this
density_data_year <- data.frame(year = 2000, sites = sites, density = density)
- `year =` sets the name of the column in the data frame
- And `2000` is the value that will occur in every row of that column
- If we run this and look at the `density_data_year` data frame we’ll see that it includes the `year` column with `2000` in every row
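A quick way to confirm that the single value was repeated for every row is to extract the new column and check it. This is a sketch that reuses the same vectors as above, redefined here so the block runs on its own.

```r
sites <- c("a", "a", "b", "c")
density <- c(2.8, 3.2, 1.5, 3.8)
density_data_year <- data.frame(year = 2000, sites = sites, density = density)

# The single value is repeated to match the length of the other columns
density_data_year$year
length(density_data_year$year) == nrow(density_data_year)

# Every row has the same year
all(density_data_year$year == 2000)
```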
Summary
- So, that’s the basic idea behind how vectors and data frames are related and how to convert between them.
- A data frame is a set of equal length vectors
- We can extract a column of a data frame into a vector using either `$` or two sets of `[]`
- We can combine vectors into data frames using the `data.frame` function, which takes a series of arguments, one vector for each column we want to create in the data frame.