Installing R/RStudio
R is an open-source language for statistical computing and graphics, and it is free to download and use. It is a powerful programming language that can be used for statistics, machine learning, data visualization, and report generation, and it interfaces with many other languages and tools (HTML, Python, JavaScript, LaTeX, etc.). The #rstats community maintains existing packages and creates new ones that extend the capabilities of the language. With its focus on reproducible and repeatable research, R is a great complement to the material you will learn in PL361.
To download R, you will go to CRAN (the Comprehensive R Archive Network). CRAN maintains various servers around the world and is used to distribute R and packages for R.
R runs from a terminal window. However, to take full advantage of the power of R, it is best to work through an Integrated Development Environment (IDE). RStudio is a popular IDE that lets you interact with code and output within one interface. We will rely on RStudio during this course; RStudio Desktop can be downloaded from the RStudio website.
I will provide a brief overview of R and RStudio during our first few statistics lessons. If you run into any issues downloading these two programs (R and RStudio), or if you want to become more familiar with how R is used, take a look at the book R for Data Science by Hadley Wickham and Garrett Grolemund.
We will extensively use the {tidyverse} package to visualize and manipulate data. You can learn more about the underlying philosophy of the tidyverse on the associated website.
To install (install.packages()) and load (library()) the {tidyverse} package, use the following functions:
install.packages("tidyverse") # This installs the package from CRAN
library(tidyverse) # This loads the package into the current environment for use.
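Because install.packages() downloads the package every time it is called, it can be convenient to install only when the package is missing. The following is a minimal sketch of that idea; install_if_missing() is a hypothetical helper name introduced here for illustration and is not part of R or the tidyverse.

```r
# Hypothetical helper (an assumption for illustration, not a built-in):
# install a CRAN package only when it is not already available.
install_if_missing <- function(pkg) {
  # requireNamespace() returns TRUE if the package can be loaded
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed; nothing was done
  }
  install.packages(pkg)       # download and install from CRAN
  invisible(TRUE)             # signal that an install was attempted
}
```

With this helper defined, a script could call install_if_missing("tidyverse") before library(tidyverse) without reinstalling the package on every run.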
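Finally, for these courses, we will use quite a few toy datasets. These datasets are consolidated in the {pl361} and {pl462} R packages hosted on GitHub. To install the {pl361} package, run the following code in the RStudio console (install_github() comes from the {devtools} package, so install {devtools} from CRAN first if you do not already have it):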
devtools::install_github("A-Farina/pl361")
To install the {pl462} package, run the following code in the RStudio console:
devtools::install_github("A-Farina/pl462")
Once installed, all of the datasets used during the courses can be accessed through the {pl361} or {pl462} package using a simple data() function call. For example, if you wanted to load the helicopter dataset, which compares the number of target acquisitions in simulations between pilots in an AH-64 Apache and pilots in an AH-1 Viper aircraft, you would start your script with the following code:
data(helicopter, package = "pl361")
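If you are not sure which datasets a package provides, calling data() with only the package argument lists them. The sketch below wraps that call in a small function; list_datasets() is a hypothetical name used here for illustration, and the example runs against the {datasets} package that ships with R so that it works even before {pl361} is installed.

```r
# Hypothetical helper (an assumption for illustration): return the names
# of the datasets that an installed package makes available via data().
list_datasets <- function(pkg) {
  # data(package = ...) returns an object whose $results matrix has an
  # "Item" column holding the dataset names
  unname(data(package = pkg)$results[, "Item"])
}

# The {datasets} package is installed with every copy of R
head(list_datasets("datasets"))
```

Once the course package is installed, list_datasets("pl361") should list its datasets in the same way.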
Once the dataset is loaded, we can explore these data with a call to the glimpse() function from the {dplyr} package.
dplyr::glimpse(helicopter)
## Rows: 20
## Columns: 2
## $ apache <dbl> 35, 39, 46, 38, 41, 31, 32, 32, 36, 39, 31, 47, 44, 34, 31, 40,…
## $ viper <dbl> 36, 29, 36, 30, 28, 30, 38, 32, 34, 34, 33, 32, 31, 34, 28, 31,…
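Alongside glimpse(), a quick numeric summary can help before plotting. The following is a minimal base R sketch; column_summary() is a hypothetical helper introduced here for illustration, and the call on the helicopter data is guarded so the code still runs if {pl361} has not been installed yet.

```r
# Hypothetical helper (an assumption for illustration): mean and standard
# deviation of every column in a data frame of numeric columns.
column_summary <- function(df) {
  sapply(df, function(x) c(mean = mean(x), sd = sd(x)))
}

# Only touch the course data if the {pl361} package is available
if (requireNamespace("pl361", quietly = TRUE)) {
  data(helicopter, package = "pl361")
  print(column_summary(helicopter))
}
```

The result is a small matrix with one column per airframe and one row per statistic.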
We can also do a quick visualization to better understand these data.
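# Requires library(tidyverse) for %>%, pivot_longer(), and ggplot()
helicopter %>%
  pivot_longer(cols = everything()) %>%  # reshape to long form: one row per observation
  ggplot(aes(x = name, y = value)) +
  geom_boxplot() +
  geom_point(
    size = 1.5,
    alpha = .3,
    position = position_jitter(seed = 1, width = .1)  # jitter points so they do not overlap
  ) +
  labs(title = "Target Acquisition According to Airframe",
       x = "Airframe",
       y = "Number of Targets Acquired")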
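The pivot_longer() step in the plot above is what creates the name and value columns that ggplot() maps to the axes. The sketch below shows that reshaping on a tiny data frame; the column names and values are made up for illustration only and are not taken from the helicopter data.

```r
library(tidyr)

# Made-up values for illustration only; NOT the helicopter data
wide <- data.frame(group_a = c(1, 2), group_b = c(3, 4))

# Stack every column: the old column names go into `name`,
# the cell contents go into `value`
long <- pivot_longer(wide, cols = everything())
long
```

Each of the two columns in the wide data frame contributes two rows, so the long version has four rows, which is the one-row-per-observation layout that geom_boxplot() and geom_point() expect.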