
8. Regression in R using the growth dataset


Eastern Banjo Frog

Regression

Regression is the analysis you use when your independent (predictor) variable is continuous/scale (“vector”) and your dependent (response) variable is also continuous/scale (“vector”).

Regression is the test you use when you have a solid a priori notion that the value of the independent variable is driving the value of the dependent variable. If you are just testing whether there is a relationship “in general” between two continuous variables, and you have no notion of the direction of this relationship, then you should be doing a correlation analysis.
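The difference between the two analyses can be sketched in R. The data below are made up for illustration (24 simulated values with arbitrary means and spreads) and are not the chick dataset. The sketch shows that correlation gives the same answer whichever way round the variables go, while the regression slope depends on which variable you treat as the predictor.

```r
# Made-up stand-in data (NOT the chick dataset): 24 simulated observations
set.seed(1)
weight <- rnorm(24, mean = 330, sd = 40)
growth <- rnorm(24, mean = 37, sd = 17)

# Correlation is symmetric: there is no predictor/response distinction
ct_xy <- cor.test(weight, growth)
ct_yx <- cor.test(growth, weight)
ct_xy$estimate   # Pearson's r
ct_yx$estimate   # same r with the variables swapped

# Regression is directional: the slope changes when predictor and response swap
slope_growth_on_weight <- coef(lm(growth ~ weight))[2]
slope_weight_on_growth <- coef(lm(weight ~ growth))[2]
slope_growth_on_weight
slope_weight_on_growth

# With a single predictor, the P value for the slope matches the correlation P value
p_slope <- summary(lm(growth ~ weight))$coefficients[2, 4]
p_slope
ct_xy$p.value
```

So the choice between the two is about the question you are asking (is one variable driving the other?), not about getting a different significance test.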

You may be interested in whether there is a predictive relationship between chick growth rate and hatch weight. At this stage it might help to visualise the graph of these data. If you are predicting that growth rate is driven by (significantly influenced by) hatch weight, then you would put hatch weight (independent variable) on the x axis and growth rate (dependent variable) on the y axis. I have made a dataset called “regression.txt” (made from the “wtvsgrowth” Excel file); this has two columns with the headings “weight” and “growth”. To run the test you need to get the dataset into R. We will call the dataframe “grow”:

grow<-read.table(file.choose(), header=TRUE)
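The command above opens a file browser, so it only works interactively. If you would rather give the file location in a script, a minimal sketch is shown below. To keep the example self-contained it first writes a tiny made-up file to a temporary location; the three rows of numbers are invented and the temporary path is only a placeholder for wherever your own copy of the file lives.

```r
# Write a tiny made-up file so this example runs on its own
# (the values are invented, not taken from the real dataset)
path <- tempfile(fileext = ".txt")
writeLines(c("weight growth",
             "310 42.1",
             "345 30.5",
             "298 55.0"), path)

# Read the file by path instead of choosing it interactively;
# header = TRUE tells R the first line holds the column names
grow <- read.table(path, header = TRUE)

names(grow)   # column headings taken from the first line
str(grow)     # both columns should be numeric
```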

If you type “grow” straight into the console, your dataframe will show with the two columns and their headings and all the data points.
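Before running the test it is worth checking the dataframe and drawing the scatterplot described earlier, with the predictor on the x axis and the response on the y axis. The sketch below uses a simulated stand-in for “grow” (same column names, made-up values), so the plot it draws will not match the real data; the axis labels are also just suggestions.

```r
# Simulated stand-in with the same column names as the tutorial's dataframe
# (made-up values, NOT the chick data)
set.seed(1)
grow <- data.frame(weight = rnorm(24, mean = 330, sd = 40),
                   growth = rnorm(24, mean = 37, sd = 17))

head(grow)      # first six rows
str(grow)       # column types: both should be numeric
summary(grow)   # range and quartiles of each column

# Scatterplot: response ~ predictor puts weight on x and growth on y
plot(growth ~ weight, data = grow,
     xlab = "Hatch weight", ylab = "Growth rate")
```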

To run the test, use the “lm” function with the following command (I have called our regression object “reg”):

reg<-lm(grow$growth~grow$weight)

This command asks the “lm” function to test whether there is a statistical relationship between growth and weight, with growth being “predicted” by (“~”) weight.
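The same model can also be written with the column names in the formula and the dataframe passed through the “data” argument of “lm”. This is an alternative form, not the one used in this tutorial, and the sketch below runs on a simulated stand-in for “grow” (made-up values). It fits the same line, and it has the practical advantage that “predict” can then be given new weights.

```r
# Simulated stand-in for the tutorial's dataframe (made-up values)
set.seed(1)
grow <- data.frame(weight = rnorm(24, mean = 330, sd = 40),
                   growth = rnorm(24, mean = 37, sd = 17))

reg  <- lm(grow$growth ~ grow$weight)      # form used in this tutorial
reg2 <- lm(growth ~ weight, data = grow)   # same model, columns looked up in grow

coef(reg)    # intercept and slope, labelled "grow$weight"
coef(reg2)   # same numbers, labelled "weight"

# Predicted growth for two hypothetical hatch weights (illustrative values)
new_chicks <- data.frame(weight = c(300, 350))
pred <- predict(reg2, newdata = new_chicks)
pred
```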

To get the results of this test, use the “summary” function, which gives the following:

summary(reg)

Call:
lm(formula = grow$growth ~ grow$weight)

Residuals:
    Min      1Q  Median      3Q     Max
-18.773 -15.582  -1.552  12.001  30.236

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 50.55414   31.66839   1.596    0.125
grow$weight -0.03940    0.09621  -0.410    0.686

Residual standard error: 17.4 on 22 degrees of freedom
Multiple R-squared:  0.007566,  Adjusted R-squared:  -0.03754
F-statistic: 0.1677 on 1 and 22 DF,  p-value: 0.6861
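The numbers in this printout can also be pulled out of the summary object rather than copied by hand. The sketch below does this on a simulated stand-in for “grow” (made-up values), so its numbers will differ from the output above. It also shows that, with a single predictor, the overall F test and the t test on the slope are the same test: F is the square of t. You can see this in the output above, where (-0.410)² is about 0.168, matching the F-statistic of 0.1677 up to rounding.

```r
# Simulated stand-in for the tutorial's dataframe (made-up values)
set.seed(1)
grow <- data.frame(weight = rnorm(24, mean = 330, sd = 40),
                   growth = rnorm(24, mean = 37, sd = 17))
reg <- lm(growth ~ weight, data = grow)
s <- summary(reg)

coef(s)         # the Coefficients table as a matrix
s$r.squared     # Multiple R-squared
s$fstatistic    # F value with its numerator and denominator df

# The overall P value is not stored directly; compute it from the F distribution
f <- s$fstatistic
p_overall <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
p_overall

# With one predictor: F equals t squared, and the two P values agree
t_slope <- coef(s)["weight", "t value"]
p_slope <- coef(s)["weight", "Pr(>|t|)"]
t_slope^2
p_slope
```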

To report these results in a paper or report you would write something like, “Linear regression analysis found that hatch weight did not significantly predict growth rate in chicks (F(1,22) = 0.17, P = 0.69, R² = 0.008)”.
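If you want R to do the rounding for the report, one possible approach is “sprintf”. The sketch below types in the values from the summary output above; the variable names are only for illustration, and “R2” is written in plain text in the string to avoid relying on a superscript character.

```r
# Values copied from the summary output above
f_value   <- 0.1677
df1       <- 1L
df2       <- 22L
p_value   <- 0.6861
r_squared <- 0.007566

# Two decimals for F and P, three for R-squared
report <- sprintf("F(%d,%d) = %.2f, P = %.2f, R2 = %.3f",
                  df1, df2, f_value, p_value, r_squared)
report
# [1] "F(1,22) = 0.17, P = 0.69, R2 = 0.008"
```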

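The R-squared value is the proportion of the variation in the dependent variable that is explained by the independent variable. It will be 1 if there is a “certain” (perfect) predictive relationship between the two variables. With real-world data you are not likely to get an R-squared of 1, although you might get 0.90 or higher. A high R-squared value such as 0.90 (along with P < 0.05) indicates a strong relationship between the variables. In the chick example above, the R-squared of 0.008 means hatch weight explains less than 1% of the variation in growth rate.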
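To see where R-squared comes from, it can be computed by hand as 1 minus the ratio of the leftover (residual) variation to the total variation in the dependent variable. The sketch below does this on a simulated stand-in for “grow” (made-up values, so the result will not be 0.008) and checks it against the value reported by “summary”. With a single predictor it also equals the square of the Pearson correlation.

```r
# Simulated stand-in for the tutorial's dataframe (made-up values)
set.seed(1)
grow <- data.frame(weight = rnorm(24, mean = 330, sd = 40),
                   growth = rnorm(24, mean = 37, sd = 17))
reg <- lm(growth ~ weight, data = grow)

# Variation left over after fitting the line
ss_res <- sum(residuals(reg)^2)
# Total variation in growth around its mean
ss_tot <- sum((grow$growth - mean(grow$growth))^2)

r2_by_hand <- 1 - ss_res / ss_tot
r2_by_hand
summary(reg)$r.squared               # same value from summary()
cor(grow$weight, grow$growth)^2      # same again: squared Pearson r
```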

