5. Two-sample t-tests in R with the bill length dataset
Two sample t-test (also called Student’s two-sample t-test)
You use a two-sample t-test when you compare the mean of a continuous variable between two groups (or "levels") of a categorical variable. For example, you can use this test to answer the question, “do female and male five-day-old chicks differ significantly in their bill length?”. Here sex is the categorical (factor) variable and bill length is the continuous (numeric) variable.
You have already imported your kiwi chick bill size dataframe into R and it is ready for analysis. You called it “data1”. You also checked and made sure the “stats” package was installed and loaded (it comes with R and is loaded by default) – you are ready! To conduct the t-test you enter the following command exactly:
ttest1<-t.test(bill~sex, data=data1)
You have asked R to conduct a t-test on your data frame "data1". You have asked R to see whether “bill” significantly differs ("~") due to “sex”. You typed “bill” and “sex” in exactly as they appeared in your text file (R is case sensitive). The little “~” basically means “is explained by”, or “is statistically associated with”. So you are asking R to see whether bill is explained by sex. The “data=data1” part is you telling R which loaded dataframe contains the two variables “bill” and “sex”. The object you have created is “ttest1”. This object “ttest1” is the t-test! (You could have called it anything though).
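To see the formula notation in action without the kiwi file, here is a minimal sketch using a small invented data frame. The name “toy” and all of its values are made up for illustration and are not the chick data. It also shows that the formula call should give the same result as passing the two groups to t.test() as separate vectors.

```r
# Invented bill lengths for illustration only (NOT the kiwi chick data)
toy <- data.frame(
  sex  = factor(rep(c("female", "male"), each = 6)),
  bill = c(41, 38, 44, 36, 40, 39, 25, 22, 27, 21, 26, 24)
)

# Formula interface: continuous response ~ two-level grouping factor
by_formula <- t.test(bill ~ sex, data = toy)

# The same test, passing each group's values as a separate vector
female_bill <- toy$bill[toy$sex == "female"]
male_bill   <- toy$bill[toy$sex == "male"]
by_vectors  <- t.test(female_bill, male_bill)

# Both calls should report the same t statistic
by_formula$statistic
by_vectors$statistic
```

The formula version is usually easier to read because the variable names stay attached to the data frame they come from.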
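To see the results of the t-test, you now simply type the name of the t-test object, “ttest1”, into the console. By default, t.test() runs the Welch version of the test, which does not assume that the two groups have equal variances; this is why the output is headed “Welch” and why df is not a whole number. You will get the following output: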
>ttest1
Welch Two Sample t-test
data: bill by sex
t = 2.9375, df = 11.384, p-value = 0.01307
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
3.891452 26.775215
sample estimates:
mean in group female   mean in group male
            39.50000             24.16667
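Each number in a printout like this can also be pulled out of the t-test object by name, which is handy when you want to reuse a value instead of retyping it. The sketch below uses an invented data frame called “toy” (made-up values, not the kiwi data) and also shows how setting var.equal = TRUE requests the classic Student's test with pooled variance.

```r
# Invented bill lengths for illustration only (NOT the kiwi chick data)
toy <- data.frame(
  sex  = factor(rep(c("female", "male"), each = 6)),
  bill = c(41, 38, 44, 36, 40, 39, 25, 22, 27, 21, 26, 24)
)

# Default call: Welch test (var.equal = FALSE)
welch <- t.test(bill ~ sex, data = toy)

welch$statistic   # the t value
welch$parameter   # the degrees of freedom
welch$p.value     # the p-value
welch$conf.int    # 95% CI for the difference in means (female minus male)
welch$estimate    # the two group means

# Student's version: assumes equal variances in the two groups
student <- t.test(bill ~ sex, data = toy, var.equal = TRUE)
student$parameter # whole-number df: 6 + 6 - 2 = 10 for this toy data
```

The difference is “female minus male” because R orders factor levels alphabetically unless you tell it otherwise.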
If you wanted to write up the results of this t-test in a paper/report you would write something like the following: “female and male five-day-old chicks differed significantly in their bill length (Welch’s t-test: t = 2.94, df = 11.38, P = 0.013)”. Take the values straight from the output so that the reported numbers match it. You could make a figure reporting the means and confidence intervals or standard deviation/error of the mean as well, or you could just report these values in the text. (More on standard deviation and error soon.)
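One way to avoid copying errors in the write-up is to let R build the text and the group summaries for you. This is only a sketch under assumptions: the data frame “toy” and its values are invented for illustration, and the rounding shown is one common choice, not a required format.

```r
# Invented bill lengths for illustration only (NOT the kiwi chick data)
toy <- data.frame(
  sex  = factor(rep(c("female", "male"), each = 6)),
  bill = c(41, 38, 44, 36, 40, 39, 25, 22, 27, 21, 26, 24)
)

res <- t.test(bill ~ sex, data = toy)

# Build the text that goes inside the brackets of the write-up
report <- sprintf("t = %.2f, df = %.2f, P = %.3g",
                  res$statistic, res$parameter, res$p.value)
report

# Per-group mean, standard deviation and standard error of the mean
group_mean <- tapply(toy$bill, toy$sex, mean)
group_sd   <- tapply(toy$bill, toy$sex, sd)
group_n    <- tapply(toy$bill, toy$sex, length)
group_se   <- group_sd / sqrt(group_n)

# One row per group, ready to report in the text or plot
data.frame(mean = group_mean, sd = group_sd, se = group_se)
```

With your own data you would replace “toy” with “data1” and keep the variable names “bill” and “sex” exactly as they appear in your file.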