This section is devoted to all things bar graphing. It is mostly aimed at graphing basic ANOVA results: displaying error bars, showing statistical significance from a Tukey’s HSD test, and the difference between “stacked” and “dodged” plots.
For this section, we will be using the weeds dataset, on which we performed a two-factor ANOVA. As a quick reminder:

    weeds.aov2 <- aov(flowers ~ species * soil, data = weeds)
    anova(weeds.aov2)
    ## Analysis of Variance Table
    ##
    ## Response: flowers
    ##              Df Sum Sq Mean Sq F value    Pr(>F)
    ## species       2 2368.6 1184.31  9.1016 0.0005203 ***
    ## soil          1  238.5  238.52  1.8331 0.1830080
    ## species:soil  2  155.0   77.52  0.
Error bars are a simple addition to your graph, using their own geom, geom_errorbar(). To add the error bars, we use the following command:

    ggplot(weeds.summarise, aes(x=species, y=mean)) +
      geom_bar(stat="identity") +
      geom_errorbar(aes(ymin = mean-se, ymax = mean+se))

This is surprisingly simple. All we do is specify the aesthetic (aes), in which we compute the minimum and maximum y values for our error bars as our mean column minus and plus our standard error column. We can further customise our error bars through the use of a few arguments.
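As a minimal sketch of that kind of customisation, the example below sets the cap width and line colour of the error bars. It assumes the ggplot2 package is installed. The data frame demo.summarise and its values are made up for illustration and only stand in for weeds.summarise; the species names A, B and C are placeholders.

```r
library(ggplot2)

# Made-up stand-in for weeds.summarise (illustrative values, not the real data)
demo.summarise <- data.frame(
  species = c("A", "B", "C"),
  mean    = c(20, 30, 38),
  se      = c(2.5, 3.0, 2.8)
)

demo.plot <- ggplot(demo.summarise, aes(x = species, y = mean)) +
  geom_bar(stat = "identity") +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),
                width = 0.2,       # width of the horizontal caps
                colour = "black")  # colour of the error bar lines

demo.plot
```

Here width controls how wide the caps are relative to the bar, and colour sets the line colour. Both are fixed values, so they sit outside aes().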
When presenting our results to an audience (in a paper or presentation), it is important to communicate them clearly, in a manner that a wider audience can understand. The main way to do so with an Analysis of Variance is to use a post-hoc test such as Tukey’s Honest Significant Difference (Tukey’s HSD). This compares the levels within a factor pairwise to determine which levels are significantly different from one another.
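A rough sketch of running such a test with base R's TukeyHSD() is shown below. Because the weeds data is not reproduced here, the example simulates a stand-in data frame, demo.weeds. Its species names, soil names and flower counts are all hypothetical, so the resulting numbers say nothing about the real dataset.

```r
# Simulated stand-in for the weeds data (hypothetical values)
set.seed(1)
demo.weeds <- data.frame(
  species = factor(rep(c("A", "B", "C"), each = 16)),
  soil    = factor(rep(c("clay", "sand"), times = 24)),
  flowers = rnorm(48, mean = rep(c(20, 30, 38), each = 16), sd = 8)
)

# Same model structure as weeds.aov2
demo.aov <- aov(flowers ~ species * soil, data = demo.weeds)

# Pairwise comparisons between the levels of species only
demo.tukey <- TukeyHSD(demo.aov, which = "species")
demo.tukey
```

Each row of demo.tukey$species is one pairwise comparison (for example B-A), giving the difference in means, its confidence limits, and an adjusted p-value in the "p adj" column.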
In the last examples, we plotted a single-factor column graph with error bars and significance notation. Plotting multiple columns, for example for a soil by species interaction, is quite simple. Firstly, we will run our summarise command, adding the soil column to our group_by() command to generate the means and standard errors for each soil and species combination.

    weeds.summarise2 <- weeds %>%
      group_by(species, soil) %>%
      summarise(mean = mean(flowers), se = sd(flowers)/sqrt(n()))

We plot multiple columns by specifying one column in our x axis, and filling/colouring by another.
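The sketch below illustrates the difference between a stacked and a dodged plot under that approach. It assumes ggplot2 is installed. The data frame demo.summarise2 is a made-up stand-in for weeds.summarise2 with placeholder species and soil names, and the dodge width of 0.9 is one common choice rather than a required value.

```r
library(ggplot2)

# Made-up stand-in for weeds.summarise2 (illustrative values only)
demo.summarise2 <- data.frame(
  species = rep(c("A", "B", "C"), each = 2),
  soil    = rep(c("clay", "sand"), times = 3),
  mean    = c(18, 22, 28, 32, 36, 40),
  se      = c(2.0, 2.4, 2.8, 3.1, 2.6, 2.9)
)

# Stacked (the default position for geom_bar): soil levels sit on top of each other
demo.stacked <- ggplot(demo.summarise2, aes(x = species, y = mean, fill = soil)) +
  geom_bar(stat = "identity")

# Dodged: soil levels sit side by side, and the error bars use the
# same dodge width so that they line up with their bars
demo.dodged <- ggplot(demo.summarise2, aes(x = species, y = mean, fill = soil)) +
  geom_bar(stat = "identity", position = position_dodge(width = 0.9)) +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),
                position = position_dodge(width = 0.9), width = 0.2)

demo.stacked
demo.dodged
```

For group means, the dodged form is usually the easier one to read, since stacking adds the means together and leaves no sensible place for each group's error bar.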
That’s the general process for setting up a column graph for ANOVA data. It can take some time, but we get a lot of freedom in how we present it. Let’s spruce up our graph into a finalised form before we save it to an image file.

    weeds.bar <- ggplot(weeds.summarise, aes(x=species, y=mean, fill=species))+
      geom_bar(stat="identity", show.legend=FALSE, colour="black")+
      labs(x="Weed Species", y= expression(Flowers~(m^3)))+
      theme(panel.background = element_blank(),
            panel.grid = element_blank(),
            axis.line = element_line(colour = "black", size=1),
            axis.