---
title: "Jan 20"
output: html_notebook
---


Degree of Reading Power (DRP) data for third graders:

```{r}
drp<-read.csv("DRP.csv")
```

Display the first 10 rows of the data:

```{r}
head(drp,10)
```

These are test scores. The "Treatment" group received some special training, while the "Control" group did not. 

Let's create vectors containing the control and treatment data:

```{r}
control<-drp$Score[drp$Group=="Control"]
treatment<-drp$Score[drp$Group=="Treatment"]
```
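As an aside, base R's `split()` can build both vectors in one step. The sketch below uses a small made-up data frame called `toy` (hypothetical values, not taken from `DRP.csv`) that only assumes the same `Score` and `Group` column names:

```{r}
# Hypothetical data for illustration only; not the real DRP scores
toy <- data.frame(
  Score = c(40, 55, 61, 38, 47, 52),
  Group = c("Treatment", "Treatment", "Treatment",
            "Control", "Control", "Control")
)

# split() returns a named list with one vector of scores per group
toy_groups <- split(toy$Score, toy$Group)
toy_groups$Control
toy_groups$Treatment
```

The logical-indexing approach above and `split()` give the same vectors; `split()` is mainly convenient when there are more than two groups.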


Let's find the range of these two groups: 

```{r}
range(treatment)
```

```{r}
range(control)
```

 
The `range()` function returns a vector containing the minimum and maximum of its argument, in that order. We can compute the range as defined in class (maximum minus minimum) like this:

```{r}
range_treatment<-max(treatment)-min(treatment)
range_treatment
```

```{r}
range_control<-max(control)-min(control)
range_control
```
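Since `range()` returns the minimum followed by the maximum, the same number can also be obtained with `diff(range(x))`. Here is a minimal sketch on a made-up vector `x_demo` (hypothetical values, not the DRP scores):

```{r}
# Hypothetical scores for illustration only
x_demo <- c(10, 25, 31, 47, 52)

r <- range(x_demo)        # c(min, max)
r[2] - r[1]               # max minus min, using the two entries of range()
diff(range(x_demo))       # the same value in one step
max(x_demo) - min(x_demo) # the version used above
```

All three expressions give the same value, so which one to use is a matter of taste.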


So, as measured by the range, the spread of the two data sets differs quite a bit. Let's look at the data visually with stem-and-leaf plots:

```{r}
stem(treatment, scale=2)
```


```{r}
stem(control, scale=2)
```

Let's compute the interquartile ranges: 

```{r}
IQR(treatment)
```

```{r}
IQR(control)
```
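To see what `IQR()` is doing, we can compute the quartiles ourselves with `quantile()` and subtract. The sketch below uses a made-up vector `x_demo` (hypothetical values, not the DRP scores). Note that `quantile()` uses R's default quantile rule (`type = 7`), which may not match the hand method from class on small data sets.

```{r}
# Hypothetical scores for illustration only
x_demo <- c(10, 25, 31, 47, 52)

# First and third quartiles under R's default quantile rule (type = 7)
q <- quantile(x_demo, probs = c(0.25, 0.75))
q

# Third quartile minus first quartile; unname() drops the "75%" label
iqr_by_hand <- unname(q[2] - q[1])
iqr_by_hand

# Built-in version for comparison
IQR(x_demo)
```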

Exercise: Write a function that takes a numeric vector and returns a vector containing its outliers, or `NULL` if there are none.
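One possible sketch of a solution, in case you want something to compare against. It assumes the common 1.5 × IQR rule (a value is an outlier if it lies more than 1.5 IQRs below the first quartile or above the third quartile); that rule is an assumption here, so substitute the definition from class if it differs. The name `find_outliers` and the test vectors are made up for illustration.

```{r}
# Sketch only: assumes the 1.5 * IQR rule and R's default quartiles (type = 7)
find_outliers <- function(x) {
  q <- unname(quantile(x, probs = c(0.25, 0.75), na.rm = TRUE))
  fence <- 1.5 * (q[2] - q[1])           # 1.5 times the interquartile range
  lower <- q[1] - fence                  # lower cutoff
  upper <- q[2] + fence                  # upper cutoff
  out <- x[!is.na(x) & (x < lower | x > upper)]
  if (length(out) == 0) NULL else out    # NULL when there are no outliers
}

# Hypothetical vectors for illustration only
find_outliers(c(10, 25, 31, 47, 52))       # no value is outside the cutoffs
find_outliers(c(10, 25, 31, 47, 52, 200))  # 200 is far above the upper cutoff
```

Base R's `boxplot.stats(x)$out` reports outliers in a similar way, but it computes the quartiles as hinges, so the two can disagree on borderline values.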

Let's create side-by-side boxplots of our two data sets: 

```{r}
type<-c("Treatment", "Control")
boxplot(treatment, control,
        names = type,
        horizontal = FALSE,
        main = "Degree of Reading Power Test, Third Grade",
        ylab = "Score")
```
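The same kind of plot can be drawn straight from a data frame with the formula interface, `boxplot(Score ~ Group, data = ...)`, without building separate vectors first. The sketch below uses a made-up data frame `toy` (hypothetical values, not taken from `DRP.csv`) that only assumes the `Score` and `Group` column names.

```{r}
# Hypothetical data for illustration only; not the real DRP scores
toy <- data.frame(
  Score = c(40, 55, 61, 38, 47, 52),
  Group = c("Treatment", "Treatment", "Treatment",
            "Control", "Control", "Control")
)

# One box per level of Group; boxplot() also returns the plotted statistics
bp <- boxplot(Score ~ Group, data = toy,
              main = "Toy example: scores by group",
              ylab = "Score")

# Order in which the groups were drawn
bp$names
```

With the formula interface the groups are drawn in alphabetical order of their labels, so "Control" appears to the left of "Treatment", the reverse of the plot above.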