Welcome to R

R is a programming language, first developed in the 1990s, that was designed with statistics in mind. R is great for statistical analysis and visualizations. More recently, extensions such as R Markdown and Shiny have been developed. We will have a brief intro now!

The Basics

We are (hopefully) working in RStudio, an editor with useful features that is especially great for beginners.

First download this zipped folder, unzip it, and open up RStudio.

There are four windows in RStudio. We will mostly be focusing on the two left ones. The top left window is our editor, which is where we can write and save our code. The bottom left is the console, where R runs commands and prints the results. We can also type in the console directly. Try typing the following commands in the console, pressing Enter after each line.

print("Hello World")
2 + 2
1:10
plot(1:10, 2 * 1:10)

This is it! This is R. Everything we do is based on short simple commands.
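To see how short commands build on each other, here is a small sketch that stores results under names and reuses them. The names x, y, and total are made up for illustration; you can pick any names you like.

```r
# Store the numbers 1 to 10 under the name x
x <- 1:10

# Double every number in x and store the result as y
y <- 2 * x

# Add up all the values in y
total <- sum(y)
print(total)  # prints 110

# Draw the same picture as plot(1:10, 2 * 1:10)
plot(x, y)
```

Typing a name by itself in the console, such as x, prints whatever is stored under it.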

R Markdown and the Editor

First we need to change our working directory. Go to Session -> Set Working Directory -> Choose Directory. Find and select the folder called wt that you just downloaded and unzipped. Select File -> Open File and select cog.Rmd. Then press Knit HTML, which will cause this document to pop up. Wild.
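If you prefer typing to clicking, the working directory can also be set from the console. This is only a sketch: the path below is an assumption (a wt folder inside Downloads), so change it to wherever you actually unzipped the folder.

```r
# Hypothetical location of the unzipped folder; edit this to match your computer
wt_path <- file.path("~", "Downloads", "wt")

if (dir.exists(wt_path)) {
  # Make wt the working directory
  setwd(wt_path)
} else {
  message("Folder not found: ", wt_path)
}

# Show the current working directory and the files inside it
getwd()
list.files()
```

If the folder was found, list.files() should show the files from the wt folder, including cog.Rmd.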

This document is made with R Markdown. R Markdown lets us put code, and the results of that code, straight into the final write-up!

In between the backticks (the accent marks) below, type print("Hello World"). Then Knit the document. What happens? Then repeat with plot(rnorm(10), rnorm(10)). Do you get the same plot as your neighbor?

# Type code below here
plot(rnorm(10), rnorm(10))
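Once you have compared plots with your neighbor, here is a sketch of what is going on. rnorm(10) draws 10 random numbers, so each run gives different values unless we fix the starting point with set.seed(). The seed value 42 is an arbitrary choice for illustration.

```r
# Fix the random number generator so the draws can be repeated
set.seed(42)
first_draw <- rnorm(10)

# Resetting the same seed gives exactly the same 10 numbers
set.seed(42)
second_draw <- rnorm(10)

identical(first_draw, second_draw)  # TRUE

# Without resetting the seed, the next draw gives new numbers
third_draw <- rnorm(10)
plot(first_draw, third_draw)
```

If you and your neighbor both run set.seed() with the same number before plotting, you should both see the same plot.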

This is just the tip of the iceberg with R. I’ve included some resources for R at the bottom. Some of these resources may be too esoteric at the moment, but don’t fret! Right now, just do the things that make sense and are fun to do!

Note: Sometimes, the hardest part about R is installing it. Unfortunately, it’s not easy! Please do not hesitate to contact me at sgallagh@stat.cmu.edu for help in downloading and installing R and RStudio.

A day in the life of a statistician

Background

Statisticians do a lot of coding (all day erry day) but usually with a set purpose of answering questions such as:

  1. When do people buy the most ice cream?

  2. How much dark matter is there in the universe?

  3. How bad is the flu going to be this year in Pittsburgh?

The statistician has to develop a plan which usually consists of:

  1. Work with scientists to collect data

  2. “Clean” the data

  3. Manipulate data using tools such as R and Shiny

  4. Analyze the results

  5. Present results in a clear and interesting manner

Sadly, we usually don’t get to be involved in the data collection as much as we would like. You filled out a survey so we will be looking at that. I also sent the same survey to some of my peers. We will try to answer some questions about the data. I already went ahead and did most of steps 2-3 so we can now work on steps 4-5. To do this, we will be using a Shiny App.

Shiny and You

Shiny is an extension of R that allows us to make an interactive website with minimal knowledge of how websites actually work.

I first need you to run the following command in the console:

Note: You must have your working directory set to the wt folder!

source("prelim.R")

This should install all the packages we need.
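The contents of prelim.R are not shown here, but a setup script like it often just checks which packages are missing. Below is a minimal sketch of that idea. The package list is an assumption (only shiny is named in this document), and the install line is commented out so nothing is downloaded by accident.

```r
# Assumed list of packages; the real prelim.R may need others
needed_packages <- c("shiny")

# Names of all packages already installed on this computer
installed <- rownames(installed.packages())

# Which of the needed packages are not installed yet?
missing_packages <- setdiff(needed_packages, installed)

if (length(missing_packages) > 0) {
  message("Missing packages: ", paste(missing_packages, collapse = ", "))
  # Uncomment the next line to install them
  # install.packages(missing_packages)
} else {
  message("All needed packages are installed.")
}
```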

Now select File -> Open File and select ui.R. It should now appear as a tab in the top left window of RStudio. This is the user interface of the Shiny App, meaning the graphical appearance and things we interact with. Another important file is server.R which does all the computations. Click on the tab ui.R and press Run App.
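To give a feel for how the user interface and the server fit together, here is a tiny stand-alone Shiny sketch. It is not the app in the wt folder: the title, the slider name n_points, and the plot name scatter are all made up for illustration, and both parts live in one file instead of separate ui.R and server.R files.

```r
library(shiny)

# User interface: what we see and interact with
ui <- fluidPage(
  titlePanel("Random points"),
  sliderInput("n_points", "Number of points:", min = 5, max = 100, value = 10),
  plotOutput("scatter")
)

# Server: the computations behind the scenes
server <- function(input, output) {
  output$scatter <- renderPlot({
    # Redraws whenever the slider is moved
    plot(rnorm(input$n_points), rnorm(input$n_points))
  })
}

# Combine the two parts into one app
app <- shinyApp(ui = ui, server = server)

# Only launch the app when running R interactively (for example, in RStudio)
if (interactive()) {
  runApp(app)
}
```

Moving the slider changes input$n_points, which tells the server to draw a new plot.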

A working Shiny App should pop up! Alternatively, you can find the app at this link.

Try playing around with it and then using the tabs to answer the following questions. You can insert your answers right in this document! Knit this file at the end to make your final report!

Height and Shoe Size

  1. Select the “Millennials” data set. Describe in 1-2 sentences the trend of height vs shoe size.

ANSWER:

  2. Do there seem to be any points that are “out of place?” In statistics, we call these “outliers.”

ANSWER:

  3. Click on “Gender” under “Split on Group.” Which group is taller?

ANSWER:

  4. Click on “Smoother” for the Trend Line. Move the slider. What happens to the trend line?

ANSWER:

  5. Repeat with the “Coding for Girls” data set.

ANSWER:

  6. Do you think height and shoe size are related? Why or why not?

ANSWER:

Age, TV, and Color

  1. What is the most popular TV show for Millennials? For Coding for Girls?

ANSWER:

Word Cloud

The word cloud displays the most common responses about people’s activities or descriptions. What are the most common activities for Millennials? For Coding for Girls?

ANSWER:

General

  1. What kind of statistical questions would you like to answer?

ANSWER:

Bonus Shiny

Go to File -> Open File and select control.R.

Shiny Apps are usually easy to break, but in the control.R file I’ve isolated a few parameters you can change. Try changing the text within the quotes! Try changing the colors for the first graph. Here is a list of R color names. You can also put a different picture in. Put a picture in the www subfolder of wt and change bailee.jpg to the picture name.
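Before changing a color in control.R, you can check in the console whether R knows the name. This is a small sketch; the variable name point_color is made up and is not necessarily what control.R uses.

```r
# Peek at the first few color names R knows about
head(colors())

# Check whether a particular name is valid
"steelblue" %in% colors()  # TRUE
"blurple" %in% colors()    # FALSE

# Try a color out on a quick plot (point_color is a hypothetical name)
point_color <- "steelblue"
plot(1:10, 2 * 1:10, col = point_color, pch = 19)
```

If a name is not in colors(), R will stop with an error when you try to plot with it, so checking first can save some head-scratching.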

References

History of R