# R - Analysis of Covariance

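**Analysis of Covariance**, also called **ANCOVA**, studies the effect of a categorical variable on the regression between a response variable and a predictor variable.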

## Example

Consider the built-in R data set mtcars. In it we observe that the field "am" represents the type of transmission (0 = automatic, 1 = manual). It is a categorical variable with values 0 and 1. The miles per gallon value ("mpg") of a car can depend on it as well as on the horsepower ("hp"). We study the effect of the value of "am" on the regression between "mpg" and "hp". This is done by using the **aov()** function followed by the **anova()** function to compare the multiple regressions.

## Input Data

Create a data frame containing the fields "mpg", "hp" and "am" from the data set mtcars. Here we take "mpg" as the response variable, "hp" as the predictor variable and "am" as the categorical variable.

    input <- mtcars[,c("am","mpg","hp")]
    print(head(input))

When we execute the above code, it produces the following result:

                      am  mpg  hp
    Mazda RX4          1 21.0 110
    Mazda RX4 Wag      1 21.0 110
    Datsun 710         1 22.8  93
    Hornet 4 Drive     0 21.4 110
    Hornet Sportabout  0 18.7 175
    Valiant            0 18.1 105
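Because "am" is stored as the numbers 0 and 1, it can help to label the two groups before modelling. The following is a minimal sketch, not part of the original example; the column name `transmission` is an assumption introduced here for illustration.

```r
# Copy the three columns used in this chapter.
input <- mtcars[, c("am", "mpg", "hp")]

# Label the 0/1 codes; mtcars documents am as 0 = automatic, 1 = manual.
# The column name "transmission" is hypothetical, chosen for this sketch.
input$transmission <- factor(input$am, levels = c(0, 1),
                             labels = c("automatic", "manual"))

# Count the cars in each transmission group.
print(table(input$transmission))

# Mean mpg and hp per group, as a first look before any modelling.
print(aggregate(cbind(mpg, hp) ~ transmission, data = input, FUN = mean))
```

The group means alone do not separate the effect of transmission from the effect of horsepower, which is why the regression models are needed.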

## ANCOVA Analysis

We create a regression model taking "hp" as the predictor variable and "mpg" as the response variable, taking into account the interaction between "am" and "hp".

### Model with interaction between categorical variable and predictor variable

    # Get the dataset.
    input <- mtcars

    # Create the regression model.
    result <- aov(mpg~hp*am,data = input)
    print(summary(result))

When we execute the above code, it produces the following result:

                Df Sum Sq Mean Sq F value   Pr(>F)    
    hp           1  678.4   678.4  77.391 1.50e-09 ***
    am           1  202.2   202.2  23.072 4.75e-05 ***
    hp:am        1    0.0     0.0   0.001    0.981    
    Residuals   28  245.4     8.8                     
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

This result shows that both horsepower and transmission type have a significant effect on miles per gallon, as the p-value in both cases is less than 0.05. But the interaction between these two variables is not significant, as its p-value is more than 0.05.
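To see what the interaction term measures, the same formula can be fitted with lm() and the slope of "hp" read off for each transmission group. This is a sketch under the assumption that "am" stays coded as 0/1; the variable names (`fit`, `slope_am0`, `slope_am1`, `sep0`, `sep1`) are illustrative and not part of the original example.

```r
# Fit the interaction model with lm() to read the coefficients directly.
fit <- lm(mpg ~ hp * am, data = mtcars)
b <- coef(fit)

# am is coded 0/1, so the hp slope is b["hp"] when am = 0
# and b["hp"] + b["hp:am"] when am = 1.
slope_am0 <- unname(b["hp"])
slope_am1 <- unname(b["hp"] + b["hp:am"])
print(c(am0 = slope_am0, am1 = slope_am1))

# Cross-check: separate regressions on each subset give the same slopes.
sep0 <- coef(lm(mpg ~ hp, data = subset(mtcars, am == 0)))["hp"]
sep1 <- coef(lm(mpg ~ hp, data = subset(mtcars, am == 1)))["hp"]
print(c(am0 = unname(sep0), am1 = unname(sep1)))
```

The "hp:am" coefficient is the difference between the two slopes, so a non-significant interaction means the data do not show that the slopes differ.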

### Model without interaction between categorical variable and predictor variable

    # Get the dataset.
    input <- mtcars

    # Create the regression model.
    result <- aov(mpg~hp+am,data = input)
    print(summary(result))

When we execute the above code, it produces the following result:

                Df Sum Sq Mean Sq F value   Pr(>F)    
    hp           1  678.4   678.4   80.15 7.63e-10 ***
    am           1  202.2   202.2   23.89 3.46e-05 ***
    Residuals   29  245.4     8.5                     
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

This result shows that both horsepower and transmission type have a significant effect on miles per gallon, as the p-value in both cases is less than 0.05.
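Without the interaction term, the model describes two parallel lines: one shared slope for "hp" and a constant shift for "am". The sketch below illustrates this with predict(); the 150 hp value and the names `fit_add`, `new_cars`, `pred` and `gap` are assumptions made up for the illustration.

```r
# Fit the additive model with lm() so the coefficients are easy to inspect.
fit_add <- lm(mpg ~ hp + am, data = mtcars)
print(coef(fit_add))

# Predict mpg for a hypothetical 150 hp car under each transmission code.
new_cars <- data.frame(hp = c(150, 150), am = c(0, 1))
pred <- predict(fit_add, newdata = new_cars)
print(pred)

# The gap between the two predictions equals the am coefficient,
# and it would be the same for any other hp value.
gap <- unname(pred[2] - pred[1])
print(gap)
```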

## Comparing Two Models

Now we can compare the two models to conclude whether the interaction of the variables is truly insignificant. For this we use the **anova()** function.

    # Get the dataset.
    input <- mtcars

    # Create the regression models.
    result1 <- aov(mpg~hp*am,data = input)
    result2 <- aov(mpg~hp+am,data = input)

    # Compare the two models.
    print(anova(result1,result2))

When we execute the above code, it produces the following result:

    Model 1: mpg ~ hp * am
    Model 2: mpg ~ hp + am
      Res.Df    RSS Df  Sum of Sq     F Pr(>F)
    1     28 245.43                           
    2     29 245.44 -1 -0.0052515 6e-04 0.9806

As the p-value is greater than 0.05, we conclude that the interaction between horsepower and transmission type is not significant. So miles per gallon depends on horsepower in a similar way for both automatic and manual transmissions.
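The F value in this comparison can be reproduced by hand from the residual sums of squares (RSS) of the two models. The following is a sketch of that arithmetic, not part of the original example; the names `full`, `reduced`, `f_stat` and `p_value` are illustrative.

```r
# Fit the model with interaction (full) and without it (reduced).
full <- aov(mpg ~ hp * am, data = mtcars)
reduced <- aov(mpg ~ hp + am, data = mtcars)

# Residual sums of squares and residual degrees of freedom.
rss_full <- sum(residuals(full)^2)
rss_reduced <- sum(residuals(reduced)^2)
df_full <- df.residual(full)
df_reduced <- df.residual(reduced)

# F = (drop in RSS per extra parameter) / (residual variance of the full model).
f_stat <- ((rss_reduced - rss_full) / (df_reduced - df_full)) /
  (rss_full / df_full)

# Upper-tail probability of the F distribution gives the p-value.
p_value <- pf(f_stat, df_reduced - df_full, df_full, lower.tail = FALSE)
print(c(F = f_stat, p = p_value))
```

The reduced model can never fit better than the full one, so the question the test answers is whether the extra fit gained from the interaction term is larger than noise would explain.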
