Overlapping Error Bars and Statistical Significance
In a publication or presentation, you may be tempted to draw conclusions about the statistical significance of differences between group means by looking at whether the error bars overlap. Let's look at two contrasting examples.

What can you conclude when standard error bars do not overlap?

When standard error (SE) bars do not overlap, you cannot be sure that the difference between the two means is statistically significant. Even though the error bars do not overlap in experiment 1, the difference is not statistically significant (P = 0.09 by unpaired t test). This is also true when you compare proportions
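To make this concrete, here is a minimal R sketch with hypothetical numbers (two groups of n = 10, chosen so the result lands near P = 0.09, like experiment 1): the SE bars do not overlap, yet the unpaired t test is not significant.

n  <- 10                               # per-group sample size (assumed)
m  <- c(10, 12.5)                      # group means (assumed)
s  <- c(3.15, 3.15)                    # group SDs (assumed)
se <- s / sqrt(n)                      # standard errors
gap <- abs(diff(m)) - sum(se)          # positive means the SE bars do not overlap
t   <- abs(diff(m)) / sqrt(sum(se^2))  # unpaired t statistic (equal n)
p   <- 2 * pt(-t, df = 2 * n - 2)      # two-sided p value
c(gap = gap, t = t, p = p)             # gap ~0.51 > 0, yet p ~0.09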
with a chi-square test.

What can you conclude when standard error bars do overlap?

No surprises here. When SE bars overlap (as in experiment 2), you can be sure the difference between the two means is not statistically significant (P > 0.05).

What if you are comparing more than two groups?

Post tests following one-way ANOVA account for multiple comparisons, so they yield higher P values than t tests comparing just two groups. The same rules therefore apply. If two SE error bars overlap, you can be sure that a post test comparing those two groups will find no statistical significance. However, if two SE error bars do not overlap, you can't tell whether a post test will, or will not, find a statistically significant difference.

What if the error bars do not represent the SEM?

Error bars that represent the 95% confidence interval (CI) of a mean are wider than SE error bars -- about twice as wide with large sample sizes and even wider with small sample sizes. If 95% CI error bars do not overlap, you can be sure the difference is statistically significant (P < 0.05). However, the converse is not true -- you may or may not have statistical significance when the 95% confidence intervals overlap.

Some graphs and tables show the mean with the standard deviation (SD) rather than the SEM. The SD quantifies variability, but does not account for sample size. To assess statistical significance, you must take into account sample size as well as variability. Therefore, observing whether SD error bars overlap or not tells you nothing about whether the difference is, or is not, statistically significant.
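The "about twice as wide" claim is easy to check in R: the half-width of a 95% CI is qt(0.975, n - 1) times the SE, so the ratio of a CI bar to an SE bar is just that t quantile.

# Ratio of 95% CI half-width to SE for several sample sizes
sapply(c(3, 5, 10, 30, 100), function(n) qt(0.975, df = n - 1))
# ~4.30  2.78  2.26  2.05  1.98 -- roughly 2 x SE for large n, much wider for small n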
What if the groups were matched and analyzed with a paired t test?

All the comments above assume you are performing an unpaired t test. When you analyze matched data with a paired t test, what matters is the consistency of the differences within each pair, not the scatter among the group means, so the overlap (or lack of overlap) of error bars tells you nothing about statistical significance.
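A short simulated example (with assumed numbers) shows why: the two sets of measurements below have heavily overlapping error bars of any kind, yet the paired test is highly significant because every subject shifts by a similar amount.

set.seed(1)                                          # arbitrary seed
baseline <- rnorm(15, mean = 100, sd = 20)           # large between-subject scatter
followup <- baseline + rnorm(15, mean = 2, sd = 1)   # small, consistent shift
t.test(baseline, followup)$p.value                   # unpaired: almost certainly not significant
t.test(baseline, followup, paired = TRUE)$p.value    # paired: very small p value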
Introduction

Belia, Fidler, Williams, and Cumming (2005) found that researchers in psychology, behavioral neuroscience, and medicine are really bad at interpreting when error bars signify that two means are significantly different (p = 0.05). What they did was email a large number of researchers and invite them to take a web-based test, and they got 473 usable responses. The test consisted of an interactive plot with error bars for two independent groups; the participants were asked to move the error bars to a position they believed would represent a significant t test at p = 0.05. They did this both for error bars based on the 95% CI and for bars based on the groups' standard errors. On average, the participants set the 95% CI bars too far apart, with their mean placement corresponding to a p value of .009. They did the opposite with the SE error bars, which they put too close together, yielding placements corresponding to p = .109. And if you're wondering, they found no difference between the three disciplines.

Plots

I wanted to pull my weight, so I created some plots in R that show error bars corresponding to significant differences at various p values.

Figure 1. Error bars corresponding to a significant difference at p = .05 (equal group sizes and equal variances)

Figure 2. Error bars corresponding to a significant difference at p = .01 (equal group sizes and equal variances)

Figure 3. Error bars corresponding to a significant difference at p = .001 (equal group sizes and equal variances)

Based on the first plot, we see that an overlap of about one third of the 95% CIs corresponds to p = 0.05. For the SE error bars, we see that they are about 1 SE apart when p = 0.05.

R Code

Here's the R code used to produce these plots:

library(ggplot2)
library(plyr)

m1  <- 100   # mean, group 1
m2  <- 100   # initial mean, group 2 (starts equal to m1)
p   <- 1     # starting p value
sd1 <- 10    # sd, group 1
sd2 <- 10    # sd, group 2
n   <- 20    # n per group
s   <- sqrt(0.5 * (sd1^2 + sd2^2))   # pooled sd

while (p > 0.05) {                    # lower m2 until p reaches 0.05
  t  <- (min(c(m1, m2)) - max(c(m1, m2))) / (s * sqrt(2/n))  # t statistic
  df <- (n * 2) - 2                   # degrees of freedom
  p  <- pt(t, df) * 2                 # two-sided p value
  m2 <- m2 - (m2 / 10000)             # nudge group 2's mean downward
}

get_CI <- function(x, sd, CI) {       # error-bar limits for one group
  se  <- sd / sqrt(n)                 # standard error
  lwr <- c(x - qt((1 + CI)/2, n - 1) * se, x - se)  # 95% CI and SE lower limits
  upr <- c(x + qt((1 + CI)/2, n - 1) * se, x + se)  # 95% CI and SE upper limits
  data.frame(mean = x, lwr = lwr, upr = upr, type = c("95% CI", "SE"))
}

# The original listing was cut off at this point; what follows is a minimal
# reconstruction of the remaining steps (assembling the limits and drawing
# the bars), not necessarily the author's exact code.
limits <- rbind(cbind(group = "Group 1", get_CI(m1, sd1, 0.95)),
                cbind(group = "Group 2", get_CI(m2, sd2, 0.95)))
ggplot(limits, aes(group, mean, ymin = lwr, ymax = upr)) +
  geom_point() +
  geom_errorbar(width = 0.2) +
  facet_wrap(~ type) +
  ggtitle(sprintf("p = %.3f", p))
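A quick numeric check of those two rules of thumb, using the same setup as the code above (n = 20 per group, pooled SD = 10):

diff <- qt(0.975, df = 38) * 10 * sqrt(2/20)  # mean difference at p = 0.05, ~6.40
se   <- 10 / sqrt(20)                         # standard error, ~2.24
ci   <- qt(0.975, df = 19) * se               # 95% CI half-width, ~4.68
(2 * ci - diff) / (2 * ci)  # CI overlap as a share of CI length: ~0.32, about one third
(diff - 2 * se) / se        # gap between SE bars, in SE units: ~0.86, about 1 SE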