5. Two sample t-tests in R with bill length data-set

Lindsey Gray
Aug 25, 2018

Two sample t-test (also called Student’s two-sample t-test)

You use a t-test to compare the values of a continuous variable between two groups (or "levels") of a categorical variable. For example, you can use this test to answer the question, “Do female and male five-day-old chicks (categorical/factor variable) differ significantly in their bill length (continuous/scale/vector variable)?”

You have already imported your kiwi chick bill size dataframe into R and it is ready for analysis. You called it “data1”. You also checked that the “stats” package is installed and loaded (it ships with R and loads automatically), so you are ready! To conduct the t-test, enter the following command exactly:

ttest1<-t.test(bill~sex, data=data1)

You have asked R to conduct a t-test on your dataframe "data1", testing whether “bill” differs significantly between the levels of “sex”. You typed “bill” and “sex” exactly as they appear in your text file, because R is case sensitive. The little “~” basically means “is explained by” or “is statistically associated with”, so you are asking R whether bill is explained by sex. The “data=data1” part tells R which loaded dataframe contains the two variables “bill” and “sex”. The object you have created, “ttest1”, holds the results of the t-test. (You could have called it anything, though.)
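If you want to try the command before loading your own file, here is a minimal sketch that builds a small dataframe by hand. The bill lengths are invented for illustration (they are not the kiwi chick measurements), and the names “demo_data”, “demo_test” and “student_test” are hypothetical. Only t.test() and the “~” formula come from the text above.

```r
# Hypothetical bill lengths, made up purely for illustration
demo_data <- data.frame(
  sex  = factor(rep(c("female", "male"), each = 6)),
  bill = c(41, 38, 44, 36, 40, 39,   # females
           27, 22, 25, 30, 21, 24)   # males
)

# Same command shape as in the post: bill "is explained by" sex
demo_test <- t.test(bill ~ sex, data = demo_data)

# Typing the object name prints the full results
demo_test

# Optional: var.equal = TRUE asks for the classic Student's version,
# which assumes both groups have the same variance
student_test <- t.test(bill ~ sex, data = demo_data, var.equal = TRUE)
```

Because R is case sensitive, typing “Bill ~ Sex” here would fail: the column names must match the dataframe exactly.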

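To see the results of the t-test, simply type the name of the t-test object, “ttest1”, into the console. R's t.test() runs Welch's version of the two-sample test by default (it does not assume that the two groups have equal variances), which is why the output is headed “Welch Two Sample t-test”. You will get the following output: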

> ttest1

        Welch Two Sample t-test

data:  bill by sex
t = 2.9375, df = 11.384, p-value = 0.01307
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  3.891452 26.775215
sample estimates:
mean in group female   mean in group male
            39.50000             24.16667
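The printed output is a summary of values stored inside the t-test object, and you can pull each one out with “$”. The sketch below uses invented bill lengths (the names “demo_data” and “demo_test” are hypothetical, and the values are not the kiwi data), so the numbers it prints will not match the output above.

```r
# Hypothetical data, made up for illustration only
demo_data <- data.frame(
  sex  = factor(rep(c("female", "male"), each = 6)),
  bill = c(41, 38, 44, 36, 40, 39, 27, 22, 25, 30, 21, 24)
)
demo_test <- t.test(bill ~ sex, data = demo_data)

demo_test$statistic   # the t value
demo_test$parameter   # the degrees of freedom (df)
demo_test$p.value     # the p-value
demo_test$conf.int    # 95% confidence interval for the difference in means
demo_test$estimate    # the mean of each group
```

With the object from this post you would type, for example, “ttest1$p.value” in the same way.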

If you wanted to write up the results of this t-test in a paper or report, you would write the following: “Female and male five-day-old chicks differed significantly in their bill length (t = 2.94, df = 11.38, P < 0.05)”. You could also make a figure reporting the means with confidence intervals, standard deviations, or standard errors of the mean, or you could just report these values in the text. (More on standard deviation and error soon.)
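If you would rather not copy the numbers by hand, the bracketed statistics can be assembled from the test object, and the group means, standard deviations and standard errors can be computed directly. This is only a sketch under stated assumptions: the data are invented, and “demo_data”, “demo_test”, “report” and “group_summary” are hypothetical names.

```r
# Hypothetical data, made up for illustration only
demo_data <- data.frame(
  sex  = factor(rep(c("female", "male"), each = 6)),
  bill = c(41, 38, 44, 36, 40, 39, 27, 22, 25, 30, 21, 24)
)
demo_test <- t.test(bill ~ sex, data = demo_data)

# Build the bracketed statistics, rounding t and df to two decimal places
report <- sprintf(
  "t = %.2f, df = %.2f, P = %s",
  demo_test$statistic,
  demo_test$parameter,
  format.pval(demo_test$p.value, digits = 2)
)
report

# Mean, standard deviation and standard error of the mean for each sex
group_summary <- sapply(
  split(demo_data$bill, demo_data$sex),
  function(x) c(mean = mean(x), sd = sd(x), se = sd(x) / sqrt(length(x)))
)
group_summary
```

The standard error here is the standard deviation divided by the square root of the group size, which is the usual definition for a sample mean.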


EASY-R

Step-by-step instructions on how to use the free statistics program R for absolute beginners, by Biologists Lindsey Gray and Brittany Mitchell.

Download R: https://cran.r-project.org
