Installing packages for this course

While base R has a great deal of essential functionality, most of the power of R comes from the rapidly growing list of user-created and contributed ‘packages’. A package is simply a bundle of functions and tools, sometimes also including example datasets, basic documentation, and even tutorial ‘vignettes’. You can see all the official R packages by going here: https://cran.r-project.org/web/packages/.

The most common way to install a package in R is with the install.packages() function. For instance, to install the package ggplot2, run:

install.packages("ggplot2")

Remember that you only need to install a package once (although you may have to update packages occasionally; see the green Update button in the Packages tab in RStudio). When you want to actually use a package (for example ggplot2), you load it like this:

library(ggplot2)

If your call to library() works, usually nothing visible happens. If you see errors, your package may be out of date (and need to be updated or reinstalled), or some important dependencies may be missing. Dependencies are other packages on which this package depends. They are typically installed automatically along with the package, but sometimes something is missing. If so, simply install the missing package and then try calling library(ggplot2) again.
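To see which dependencies a package declares, you can read the Imports and Depends fields of its description. This is a minimal sketch using base R's packageDescription(); it uses the built-in stats package so that it runs on any installation, and swapping in "ggplot2" assumes that package is already installed.

```r
# Look up the packages that another package declares as dependencies.
# "stats" ships with R, so this runs even before anything is installed;
# replace it with "ggplot2" once that package is on your machine.
pkg <- "stats"

imports <- packageDescription(pkg, fields = "Imports")
depends <- packageDescription(pkg, fields = "Depends")

imports  # packages whose functions this package uses internally
depends  # packages (and the R version) this package requires
```

If a name listed here is reported as missing when you call library(), installing that package is usually the fix.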

Notice that for the function install.packages('yourPackage') you must put quotes around the package name. In contrast, the function library(yourPackage) does not need quotes.
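A small sketch of the quoting rule follows. It uses the built-in stats package so it runs without installing anything. Note that library() also accepts a quoted name, and its character.only argument lets you pass a name stored in a variable (the variable pkg here is just for illustration).

```r
# library() accepts a bare name or a quoted name
library(stats)
library("stats")

# When the name is stored in a variable, character.only = TRUE tells
# library() to use the variable's value rather than its name
pkg <- "stats"
library(pkg, character.only = TRUE)

# install.packages() needs a character string:
# install.packages("ggplot2")   # quoted: works
# install.packages(ggplot2)     # unquoted: R looks for an object called
#                               # ggplot2 and stops with an error
```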

As you submit each installation request, note the output in your R console. If you get a warning that says installation was not possible because you are missing a package ‘yourPackage’, that suggests you are missing a dependency (i.e. something the main package needs to work correctly). Try installing the package mentioned in the error. If you have trouble, reach out to the TAs!

Installing packages used for general data science

For the rest of this page, copy and paste the provided code in order to install the packages necessary for this course. Notice that if you hover to the right of a code chunk in the HTML version of the eBook, you will see a copy icon for quick copying and pasting.

Although you are copying and pasting the code, take a moment to look at the output. Did you get any error messages saying that a package did not install? A warning message does not necessarily mean failure; the package may have installed fine anyway. If there is an error message, however, the package probably did not install.

To see if a package was installed, try loading it by typing library(yourPackage). If nothing happens (no errors) then all is good!
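If you would rather get a TRUE/FALSE answer than an error, base R's requireNamespace() checks whether a package can be loaded without attaching it. A minimal sketch, assuming you are checking ggplot2 from the earlier example:

```r
pkg <- "ggplot2"

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
ok <- requireNamespace(pkg, quietly = TRUE)

if (ok) {
  message(pkg, " is installed, version ", as.character(packageVersion(pkg)))
} else {
  message(pkg, " is not installed; try install.packages('", pkg, "')")
}
```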

These packages will support some of our general work in R:

  • rmarkdown allows the creation of mixed output documents that combine code, documentation and results in a single, readable format.
  • The packages tinytex and knitr are necessary for creating R documents, including the PDF output that will be required for submitting assignments.
  • We will use many data manipulation tools from the tidyverse. You can learn more about the tidyverse here: https://tidyverse.tidyverse.org/, and you can see applications of tidyverse packages in the R for Epidemiologists Handbook. The tidyverse is actually a collection of data science tools including the visualization/plotting package ggplot2 and the data manipulation package dplyr. For that reason, when you install.packages('tidyverse') below, you are actually installing multiple packages!
  • The packages here and pacman are utilities to help simplify file pathnames and package loading behavior.

install.packages('tidyverse')
install.packages(c('pacman', 'here'))
install.packages(c('tinytex', 'rmarkdown', 'knitr'))

# install_tinytex() installs the TinyTeX LaTeX distribution on your
# computer, which is necessary for rendering (creating) PDFs
tinytex::install_tinytex()
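Once these packages are installed, pacman and here might be used roughly as follows. This is a sketch only: the folder data and the file example.csv are hypothetical names, and the guards simply skip each step if the package is not yet installed.

```r
# pacman::p_load() attaches several packages in one call.
# By default it also tries to install any that are missing;
# install = FALSE turns that off so this sketch never downloads anything.
if (requireNamespace("pacman", quietly = TRUE)) {
  pacman::p_load(knitr, rmarkdown, install = FALSE)
}

# here::here() builds a file path starting from the project's top-level
# folder, so the same code works regardless of the working directory.
# "data" and "example.csv" are made-up names for illustration.
if (requireNamespace("here", quietly = TRUE)) {
  data_path <- here::here("data", "example.csv")
} else {
  data_path <- file.path("data", "example.csv")  # plain relative path as a fallback
}

data_path
```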

Installing packages used for geographic data

There are many ways to get the data we want for spatial epidemiology into R. Because we often (but don’t always) use census geographies as aggregating units, and census populations as denominators, the following packages will be useful. They are designed to quickly extract both geographic boundary files (e.g. ‘shapefiles’) as well as attribute data from the US Census website via an API.

install.packages(c('tidycensus','tigris')) 

help('census_api_key','tidycensus')

NOTE: To be able to interact with the Census Bureau API through R, you will need a personalized “API key”. When you enter the second line of code above (the help() function), you will see information on how to:

  1. Request a key from the Census Bureau
  2. Enter your custom key into your machine so that it is available when needed.

We will not need the Census API key for a couple of weeks, but it is good to start now and ask for help if you have trouble!
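As a rough sketch of the second step, tidycensus provides census_api_key() to store the key, and the stored key can then be read back from the CENSUS_API_KEY environment variable. The string "YOUR_KEY_HERE" is a placeholder, not a real key, so the storing line is left commented out until you have your own.

```r
# Run once, with your own key in place of the placeholder.
# install = TRUE saves the key to your .Renviron file for future sessions.
# tidycensus::census_api_key("YOUR_KEY_HERE", install = TRUE)

# After restarting R, check whether a key is available
key <- Sys.getenv("CENSUS_API_KEY")

if (nzchar(key)) {
  message("A Census API key is available in this session.")
} else {
  message("No Census API key found yet.")
}
```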

Installing packages used for spatial data manipulation & visualization

This section installs a set of tools specific to our goals of importing, exporting, manipulating, visualizing, and analyzing spatial data.

  • The first line of packages has functions for defining, importing, exporting, and manipulating spatial data.
  • The second line has some tools we will use for visualizing spatial data (e.g. making maps!).

install.packages(c('sp', 'sf', 'raster', 'RColorBrewer', 'OpenStreetMap'))
install.packages(c('tmap', 'tmaptools'))

BEWARE

There are many large shifts currently underway in R architecture for spatial analysis. As we will learn, for years we have been shifting away from an older data class defined in the package sp to a newer one called sf.

In addition to that shift, at the end of 2023 several packages that helped link older R functions to standard GIS libraries outside of R were retired. These include maptools, rgeos, and rgdal. Newer packages do not rely on them, but some older packages have not been updated. Note that of the older packages mentioned here, only sp is included in the list above; maptools, rgeos, and rgdal are not. Each year we uncover new opportunities and new bugs related to unanticipated package dependencies caused by some packages aging out. We’ll see what bumps we run into this year!
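To see where your own installation stands, the following sketch checks whether any of the retired packages are still in your library and, if sf is installed, prints the versions of the external GIS libraries it is linked to. It relies only on installed.packages() from base R and sf::sf_extSoftVersion().

```r
# Retired packages named in the text above
retired <- c("rgdal", "rgeos", "maptools")

# Names of every package currently in your library
installed <- rownames(installed.packages())

# Which retired packages are still present (empty if none)
still_present <- retired[retired %in% installed]
still_present

# If sf is installed, show the external library versions it links to
if ("sf" %in% installed) {
  print(sf::sf_extSoftVersion())
}
```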

Installing packages used for spatial analysis

Finally, these are packages specifically for the spatial analysis tasks we will carry out.

install.packages(c('spdep', 'CARBayes', 'sparr', 'spatialreg', 'DCluster', 'SpatialEpi', 'smerc'))
install.packages(c('GWmodel', 'spgwr'))
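After running all of the install commands on this page, a quick way to confirm that nothing was skipped is to compare the package names against your library. This sketch simply lists the names used in the commands above; an empty result means everything installed.

```r
# Every package named in the install commands on this page
course_pkgs <- c(
  "tidyverse", "pacman", "here", "tinytex", "rmarkdown", "knitr",
  "tidycensus", "tigris",
  "sp", "sf", "raster", "RColorBrewer", "OpenStreetMap",
  "tmap", "tmaptools",
  "spdep", "CARBayes", "sparr", "spatialreg", "DCluster",
  "SpatialEpi", "smerc", "GWmodel", "spgwr"
)

# Names that do not appear in your library
missing_pkgs <- setdiff(course_pkgs, rownames(installed.packages()))
missing_pkgs
```

Any names that are printed can be passed straight to install.packages(missing_pkgs) to try again.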