---
title: "Jan 20"
output: html_notebook
---
Load the Degree of Reading Power (DRP) data for third graders:
```{r}
drp<-read.csv("DRP.csv")
```
Display the first 10 rows of the data:
```{r}
head(drp,10)
```
These are test scores. The "Treatment" group received some special training, while the "Control" group did not.
Let's create vectors containing the control and treatment data:
```{r}
control<-drp$Score[drp$Group=="Control"]
treatment<-drp$Score[drp$Group=="Treatment"]
```
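To see what the logical indexing above is doing, here is a small sketch on a made-up data frame. The name "toy" and its values are hypothetical and are not taken from DRP.csv; "split()" is shown as one alternative that returns both groups at once.

```{r}
# Hypothetical toy data, not the real DRP scores
toy <- data.frame(
  Group = c("Control", "Treatment", "Control", "Treatment"),
  Score = c(40, 55, 42, 60)
)
# Logical indexing: keep only the scores whose Group is "Control"
toy_control <- toy$Score[toy$Group == "Control"]
toy_control
# split() builds a named list with one score vector per group
toy_groups <- split(toy$Score, toy$Group)
toy_groups$Treatment
```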
Let's find the range of these two groups:
```{r}
range(treatment)
```
```{r}
range(control)
```
The "range()" function returns a vector containing the minimum and the maximum of its argument, in that order. We can compute the range as defined in class (maximum minus minimum) like this:
```{r}
range_treatment<-max(treatment)-min(treatment)
range_treatment
```
```{r}
range_control<-max(control)-min(control)
range_control
```
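Because "range()" returns the minimum followed by the maximum, the same subtraction can also be written with "diff()". A minimal sketch on a made-up vector (the name "x_demo" and its values are hypothetical):

```{r}
# Hypothetical values, used only to illustrate the idea
x_demo <- c(12, 7, 30, 18)
range(x_demo)                       # minimum first, then maximum
# diff() subtracts the first element from the second: max - min
range_demo <- diff(range(x_demo))
range_demo
```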
So we see there is quite a difference in the spread of the two data sets as measured by the range. Let's look at the data visually with stem-and-leaf plots:
```{r}
stem(treatment, scale=2)
```
```{r}
stem(control, scale=2)
```
Let's compute the interquartile ranges:
```{r}
IQR(treatment)
```
```{r}
IQR(control)
```
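The interquartile range is the third quartile minus the first quartile. The sketch below checks this on a made-up vector (the names "iqr_demo" and "quartiles" are hypothetical). Note that "IQR()" uses R's default quantile method, which may give slightly different quartiles than a by-hand method.

```{r}
# Hypothetical values, used only to illustrate the idea
iqr_demo <- c(1, 3, 5, 7, 9, 11, 13)
# First and third quartiles, using R's default quantile method
quartiles <- quantile(iqr_demo, c(0.25, 0.75), names = FALSE)
quartiles[2] - quartiles[1]
# IQR() should give the same value
IQR(iqr_demo)
```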
Exercise: Write a function that takes a vector and returns a vector containing its outliers, or NULL if there are none.
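One possible approach is sketched below. It assumes the 1.5 × IQR rule (a value is an outlier if it lies more than 1.5 IQRs below the first quartile or above the third quartile) and uses R's default quartiles; the function name "find_outliers" is hypothetical, and a different outlier definition would need different fences.

```{r}
# A sketch of one solution, assuming the 1.5 * IQR rule
find_outliers <- function(x) {
  quartiles <- quantile(x, c(0.25, 0.75), names = FALSE)
  fence <- 1.5 * (quartiles[2] - quartiles[1])
  # Keep values below the lower fence or above the upper fence
  out <- x[x < quartiles[1] - fence | x > quartiles[2] + fence]
  if (length(out) == 0) NULL else out
}
find_outliers(c(10, 12, 11, 13, 12, 50))   # made-up data with one extreme value
find_outliers(c(10, 12, 11, 13, 12))       # made-up data with no outliers
```

Once written, the function can be applied to "treatment" and "control" directly.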
Let's create side-by-side boxplots of our two data sets:
```{r}
# Labels for the two boxes, in the same order as the data arguments
type <- c("Treatment", "Control")
boxplot(treatment, control,
        names = type,
        horizontal = FALSE,
        main = "Degree of Reading Power Test, Third Grade",
        ylab = "Score")
```
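The same kind of plot can also be drawn straight from a data frame with the formula interface of "boxplot()", without first building separate vectors. This is a sketch on a made-up data frame (the name "toy_scores" and its values are hypothetical); with the formula interface the boxes are ordered by group name rather than by argument order.

```{r}
# Hypothetical toy data, not the real DRP scores
toy_scores <- data.frame(
  Group = c("Control", "Control", "Control",
            "Treatment", "Treatment", "Treatment"),
  Score = c(40, 42, 47, 55, 60, 52)
)
# Score ~ Group means "plot Score separately for each Group"
bp <- boxplot(Score ~ Group, data = toy_scores,
              main = "Toy example", ylab = "Score")
bp$names   # group labels in the order they were drawn
```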