Learning Objectives
Following this assignment students should be able to:
- connect to a remote database and execute simple queries
- integrate database and R workflow
- export output data from R to database
- tidy data tables with redundant fields or overfilled cells
Reading
Topics
- Cleaning messy data using tidyr
Readings (optional)
Lecture Notes
Exercises
Tree Biomass (100 pts)
Estimating the total amount of biomass (the total mass of all individuals) in forests is important for understanding the global carbon budget and how the earth will respond to increases in carbon dioxide emissions. We can estimate the mass of a tree based on its diameter.
There are lots of equations for estimating the mass of a tree from its diameter, but one good option is the equation:
Mass = 0.124 * Diameter^2.53
where Mass is measured in kg of dry above-ground biomass and Diameter is in cm DBH (Brown 1997).
We’re going to estimate the total tree biomass for trees in a 96 hectare area of the Western Ghats in India. The data needs to be tidied before all of the tree stems can be used for analysis. If Macroplot_data_Rev.txt is not already in your working directory, download a copy.
- Use pivot_longer() to create a longer data frame with one row for each measured stem. Use dplyr’s filter() function to remove all of the girths that are zero. Store this longer data frame in a variable and also display it.
- Write a function that takes a vector of tree diameters as an argument and returns a vector of tree masses using the equation above. Test it using mass_from_diameter(22).
- Stems are measured in girth (i.e., circumference) rather than diameter. Write a function that takes a vector of circumferences as an argument and returns a vector of diameters (diameter = circumference / pi). Test it using diameter_from_circumference(26).
- Use the two functions you’ve written and dplyr to add a mass column to your longer data frame. Store this data in a variable and display it.
- Estimate the total biomass by summing the mass of all of the stems in the dataset.
- Use separate() to split the SpCode column into GenusCode and SpEpCode columns and then use group_by and summarize to calculate the total biomass for each unique GenusCode.
- Use ggplot to make a histogram of the diameter values. Make the x label "Diameter [cm]" and the y label "Number of Stems".
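As one possible starting point, the reshaping and biomass steps might be sketched as follows. This is a minimal sketch rather than a reference solution: the toy data frame and its girth_1 and girth_2 column names are assumptions made for illustration and are not taken from Macroplot_data_Rev.txt, so the column selection will need adjusting for the real file.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the real data. The column names girth_1 and girth_2
# are hypothetical; check the actual names in Macroplot_data_Rev.txt.
trees_wide <- tibble(
  SpCode  = c("ABCD", "EFGH"),
  girth_1 = c(26, 40),
  girth_2 = c(0, 31)
)

# Convert stem circumference (girth) to diameter, in the same units.
diameter_from_circumference <- function(circumference) {
  circumference / pi
}

# Mass in kg from diameter in cm, using Mass = 0.124 * Diameter^2.53.
mass_from_diameter <- function(diameter) {
  0.124 * diameter^2.53
}

trees_long <- trees_wide %>%
  # One row per measured stem instead of one column per stem.
  pivot_longer(starts_with("girth_"), names_to = "stem", values_to = "girth") %>%
  # A girth of zero means no stem was recorded in that slot.
  filter(girth != 0) %>%
  # Both helpers are vectorized, so they work directly inside mutate().
  mutate(
    diameter = diameter_from_circumference(girth),
    mass     = mass_from_diameter(diameter)
  )

trees_long

# Total biomass is the sum of the per-stem masses.
total_biomass <- sum(trees_long$mass)
total_biomass
```

Because both helper functions operate on whole vectors, they can be tested on single values and then reused unchanged on a full column.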
- Use