Overlapping Error Bars Means
In a publication or presentation, you may be tempted to draw conclusions about the statistical significance of differences between group means by looking at whether the error bars overlap. Let's look at two contrasting examples.

What can you conclude when standard error bars do not overlap?
When standard error (SE) bars do not overlap, you cannot be sure that the difference between two means is statistically significant. Even though the error bars do not overlap in experiment 1, the difference is not statistically significant (P=0.09 by unpaired t test). This is also true when you compare proportions with a chi-square test.

What can you conclude when standard error bars do overlap?
No surprises here. When SE bars overlap (as in experiment 2), you can be sure the difference between the two means is not statistically significant (P>0.05).

What if you are comparing more than two groups?
Post tests following one-way ANOVA account for multiple comparisons, so they yield higher P values than t tests comparing just two groups. So the same rules apply. If two SE bars overlap, you can be sure that a post test comparing those two groups will find no statistical significance. However, if two SE bars do not overlap, you can't tell whether a post test will, or will not, find a statistically significant difference.

What if the error bars do not represent the SEM?
Error bars that represent the 95% confidence interval (CI) of a mean are wider than SE bars: about twice as wide with large sample sizes and even wider with small sample sizes. If 95% CI error bars do not overlap, you can be sure the difference is statistically significant (P < 0.05). However, the converse is not true: you may or may not have statistical significance when the 95% confidence intervals overlap.

Some graphs and tables show the mean with the standard deviation (SD) rather than the SEM. The SD quantifies variability, but does not account for sample size. To assess statistical significance, you must take into account sample size as well as variability. Therefore, observing whether SD error bars overlap or not tells you nothing about whether the difference is, or is not, statistically significant.

What if the groups were matched and analyzed with a paired t test?
All the comments above assume you are performing an unpaired t test. When you analyze matched data with a paired t test, it doesn't matter how much scatter each group has. What matters is the consistency of the changes or differences. Whether or not the error bars for each group overlap tells you nothing about the P value of a paired t test.
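The rule for overlapping SE bars can be checked with a little arithmetic. The sketch below assumes an unpaired t statistic in its unequal-variance (Welch) form, computed directly from the two means and their standard errors; the function names and the summary numbers are hypothetical and only serve as an illustration. If the bars overlap, the difference between the means is at most SE_a + SE_b, which is never more than sqrt(2) times sqrt(SE_a^2 + SE_b^2), so |t| cannot exceed sqrt(2), roughly 1.41. The two-sided 5% critical value of t is at least 1.96 for any degrees of freedom, so such a difference cannot reach P < 0.05.

```python
import math


def se_bars_overlap(mean_a, se_a, mean_b, se_b):
    """Return True if the mean +/- SE intervals of the two groups overlap or touch."""
    return abs(mean_a - mean_b) <= se_a + se_b


def welch_t(mean_a, se_a, mean_b, se_b):
    """Unpaired (Welch) t statistic from group means and standard errors."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Hypothetical summary data: (mean_a, se_a, mean_b, se_b)
examples = {
    "bars overlap": (10.0, 1.0, 11.5, 1.0),
    "bars just touch": (10.0, 1.0, 12.0, 1.0),
    "clear gap": (10.0, 1.0, 14.0, 1.0),
}

for label, (ma, sa, mb, sb) in examples.items():
    t = abs(welch_t(ma, sa, mb, sb))
    overlap = se_bars_overlap(ma, sa, mb, sb)
    print(f"{label}: overlap={overlap}, |t|={t:.2f}, sqrt(2)={math.sqrt(2):.2f}")
```

A gap between the bars only tells you that |t| is larger than some value that can be as low as 1, which is why non-overlapping SE bars do not settle the question either way.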
https://egret.psychol.cam.ac.uk/statistics/local_copies_of_sources_Cardinal_and_Aitken_ANOVA/errorbars.htm

Nature Methods | This Month
Points of Significance: Error bars
Martin Krzywinski, Naomi Altman
Journal name: Nature Methods. Volume: 10. Pages: 921–922. Year published: 2013. DOI: doi:10.1038/nmeth.2659. Published online 27 September 2013.
http://www.nature.com/nmeth/journal/v10/n10/full/nmeth.2659.html

The meaning of error bars is often misinterpreted, as is the statistical significance of their overlap.

Subject terms: Publishing, Research data, Statistical methods

Figure 1: Error bar width and interpretation of spacing depends on the error bar type. (a,b) Example graphs are based on sample means of 0 and 1 (n = 10). (a) When bars are scaled to the same size and abut, P values span a wide range. When s.e.m. bars touch, P is large (P = 0.17). (b) Bar size and relative position vary greatly at the conventional P value significance cutoff of 0.05, at which bars may overlap or have a gap.

Figure 2: The size and position of confidence intervals depend on the sample. On average, CI% of interva
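The P = 0.17 quoted in the Figure 1 caption for touching s.e.m. bars can be approximately reproduced. The sketch below assumes two groups of n = 10 with means 0 and 1, equal s.e.m. in both groups, and a two-sample t test with 2n - 2 degrees of freedom; these assumptions are read off the caption rather than taken from the article's own calculation. To stay within the standard library, the P value is obtained by integrating the Student's t density numerically with Simpson's rule; `t_pdf` and `two_sided_p` are helper names invented for this example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided P value: 1 minus twice the area under the density on [0, |t|].

    The area is computed with Simpson's rule, so steps must be even.
    """
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


n = 10
mean_a, mean_b = 0.0, 1.0
sem = 0.5  # assumed: equal s.e.m. bars touch when each equals half the gap

t_stat = (mean_b - mean_a) / math.sqrt(sem ** 2 + sem ** 2)
df = 2 * n - 2
p_value = two_sided_p(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df}, P = {p_value:.2f}")
```

With equal bars that just touch, the t statistic is always sqrt(2) regardless of the data, so only the degrees of freedom affect the resulting P value.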
http://stats.stackexchange.com/questions/114701/standard-error-bars-overlap-but-significance-estimated-marginal-means-versus-o

Cross Validated
Standard error bars overlap but significance - estimated marginal means versus observed means

I'm running a mixed effects model ANOVA with two fixed factors (condition, repetition) and one random factor (subject). Subsequently, a Tukey multiple comparisons test is performed. Now I'd like to plot the means and standard errors (SEMs) of the single conditions in a single error bar plot, and report the p values between the conditions.

The problem: while in the Tukey test I got significant differences and non-overlapping SEMs between certain means, for my plotted real/observed data the SEM bars overlap. This is counterintuitive, since commonly you would assume that in the case of overlapping, the means are not significantly different.

My question is: is the difference between estimated marginal means and observed means due to having a random factor in my model, or what is the reason for the discrepancy? How would you report the data? Would you still plot observed data with the p values and state that the p values are derived from the estimated model? Or would you plot estimated means and standard errors? Thank you!

EDIT: I'm adding the multiple comparisons result for a sample case as well as the observed means and standard error plot in case this helps.

Tags: anova, mean, standard-error, post-hoc
Asked Sep 8 '14 at 13:38 by user54643, edited Sep 8 '14 at 19:13

1 Answer

Statistical significance is not transitive. If you want to say how much error there is in estimating the means, show error bars around the means. If you want to compare the means, show results of multiple comparisons. Don't mix those two ideas together. It is quite possible, especially in mixed models, that means can have similar standard errors, but comparisons among the means have radically different standard errors.

http://mathbench.umd.edu/modules/prob-stat_bargraph/page08.htm
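The answer's last point, that the standard error of a comparison can be very different from the standard errors of the individual means, can be illustrated with a simplified repeated-measures case. The sketch below is not the mixed model or the Tukey test from the question; it assumes just two conditions measured on the same six subjects, with made-up scores, and compares the per-condition SEMs with the standard error of the within-subject differences.

```python
import math
import statistics


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical scores for the same 6 subjects under two conditions.
# Subjects differ a lot from each other, but each one changes by a similar amount.
cond_a = [10.0, 14.0, 18.0, 22.0, 26.0, 30.0]
cond_b = [11.2, 14.9, 19.1, 23.0, 27.2, 30.8]

# Within-subject differences remove the between-subject scatter.
diffs = [b - a for a, b in zip(cond_a, cond_b)]

mean_a, mean_b = statistics.mean(cond_a), statistics.mean(cond_b)
sem_a, sem_b = sem(cond_a), sem(cond_b)

bars_overlap = abs(mean_b - mean_a) <= sem_a + sem_b
se_diff = sem(diffs)
t_paired = statistics.mean(diffs) / se_diff

print(f"SEM of condition A: {sem_a:.2f}, SEM of condition B: {sem_b:.2f}")
print(f"SEM bars overlap: {bars_overlap}")
print(f"SE of the paired difference: {se_diff:.3f}, paired t: {t_paired:.1f}")
```

Here the per-condition bars overlap heavily because they reflect differences between subjects, while the comparison itself is estimated precisely because every subject shifts by nearly the same amount.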
Another way to add info: the standard error

Graphs using standard deviation (SD) tell you what a big population of fish would look like: whether their sizes would be all uniform, or somewhat raggedy, or totally raggedy. Sometimes, though, you don't really care what a population looks like. You just want to know whether a treatment (like Fish2Whale instead of other competing brands) made a difference on average. In that case you measure a bunch of fish because you're trying to get a really good estimate of the average effect, despite whatever raggediness might be present in the populations.

Let's say your company decides to go all out to prove that Fish2Whale really is better than the competition. They convert a supply closet into an aquarium, hatch 400 fish, and tell you to do a HUGE experiment. The whole idea of the HUGE experiment is to get a really accurate measurement of the effect of Fish2Whale, despite the natural differences such as temperature, light, initial size of fish, solar flares, and ESP phenomena. The return on their investment? Really small error bars.

But how do you get small error bars? Just using 400 fish WON'T give you a smaller SD. A huge population will be just as "ragged" as a small population. Instead, you need to use a quantity called the "standard error", or SE, which is the standard deviation DIVIDED BY the square root of the sample size. Since you fed 100 fish with Fish2Whale, you get to divide the standard deviation of that group's results by 10 (i.e., the square root of 100). Likewise with each of the other 3 brands. So your reward for all that work is that your error bars are much smaller.

Why should you care about small error bars? Well, as a rough rule of thumb, if the SE error bars for the 2 treatments do not overlap, then the treatment probably made a difference.
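The standard error calculation described above can be sketched in a few lines. The fish lengths below are made up, and the large sample is built by simply repeating the small one ten times, an artificial choice that keeps the spread of the data essentially the same so that only the sample size changes. The function name `standard_error` is likewise just an illustration.

```python
import math
import statistics


def standard_error(values):
    """Standard error: sample standard deviation divided by sqrt(sample size)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical fish lengths in cm for one treatment group.
small = [15.0, 17.0, 18.5, 19.0, 20.0, 20.5, 21.0, 22.0, 23.5, 25.0]

# Artificial "big experiment": the same spread of sizes, but 100 fish.
large = small * 10

for name, sample in [("10 fish", small), ("100 fish", large)]:
    sd = statistics.stdev(sample)
    se = standard_error(sample)
    print(f"{name}: SD = {sd:.2f}, SE = {se:.2f}")
```

The SD stays about the same in both samples because the fish are just as "ragged", while the SE shrinks by roughly the square root of 10 when the sample grows from 10 to 100 fish.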
(This is not a statistical test, but simply a way to visualize what your results mean. Many statistical tests are actually based on the exact amount of overlap of the SE bars, but they can get quite technical. For now, we'll just assume that no overlap = a true difference between the treatments.) So, in order to show that Fish2Whale really i