The true mean reaction time for all women is unknowable, but when we speak of a 95 percent confidence interval around our mean for the 50 women we happened to test, we are saying that if we were to repeat this measurement many times, 95 percent of the time the confidence interval would include the true mean.
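
The "repeat this measurement many times" reading can be checked with a small simulation. The sketch below is only an illustration: the population values (a true mean of 250 ms and an SD of 40 ms), the function names, and the use of the normal-approximation multiplier 1.96 are all assumptions, not figures from any study.

```python
import random
import statistics

def ci95(sample):
    """Normal-approximation 95% CI for the mean: mean +/- 1.96 * SEM."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / n ** 0.5
    return mean - 1.96 * sem, mean + 1.96 * sem

def coverage(true_mean=250.0, true_sd=40.0, n=50, repeats=2000, seed=0):
    """Fraction of repeated experiments whose CI contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # One hypothetical experiment: n reaction times from the assumed population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = ci95(sample)
        if low <= true_mean <= high:
            hits += 1
    return hits / repeats

print(coverage())  # should land close to 0.95
```

In a real experiment the true mean is not available, so this check is only possible in a simulation where we choose it ourselves.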

Scientific papers in the experimental sciences are expected to include error bars on all graphs, though the practice differs somewhat between sciences, and each journal will have its own house style.

What if the error bars do not represent the SEM?

Standard Errors

But perhaps the study participants were simply confusing the concept of confidence interval with standard error. The error bars shown in the line graph represent a description of how confident you are that the mean represents the true impact

Though no one of these measurements is likely to be more precise than any other, this group of values, it is hoped, will cluster about the true value you are trying to measure. They give a general idea of how precise a measurement is, or conversely, how far from the reported value the true (error free) value might be. The (frequentist) interpretation is that the given proportion of such intervals will include the "true" parameter value (for instance the mean).

Notice the range of energy values recorded at each of the temperatures. A huge population will be just as "ragged" as a small population. We might measure reaction times of 50 women in order to make generalizations about reaction times of all the women in the world.

What can I do? The SD, in contrast, has a different meaning.

Just 35 percent were even in the ballpark -- within 25 percent of the correct gap between the means.

In contrast, since the SE depends on your sample size n (SE = SD/sqrt(n)), a larger sample will reduce the SE of the estimate. Graphing the mean with SEM error bars is a commonly used method to show how well you know the mean. The only advantage of SEM error bars is that they are shorter, which can help avoid overlap. Almost always, I'm not looking for that precise answer: I just want to know very roughly whether two classes are distinguishable. I suppose the question is about which "meaning" should be presented.
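
To see how sample size affects the two quantities, here is a minimal sketch that draws samples of increasing size from one assumed population (mean 250, SD 40, both made up for illustration) and prints the SD and the SEM. The helper name `sd_and_sem` is hypothetical.

```python
import random
import statistics

def sd_and_sem(sample):
    """Return the sample SD and the standard error of the mean, SD / sqrt(n)."""
    sd = statistics.stdev(sample)
    return sd, sd / len(sample) ** 0.5

rng = random.Random(1)
for n in (10, 100, 1000, 10000):
    # Same assumed population every time; only the sample size changes.
    sample = [rng.gauss(250.0, 40.0) for _ in range(n)]
    sd, sem = sd_and_sem(sample)
    print(f"n={n:>5}  SD={sd:6.2f}  SEM={sem:6.2f}")
```

The SD should stay near the population value however large the sample gets, while the SEM keeps shrinking as n grows.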

The CI is absolutely preferable to the SE; however, both have the same basic meaning: the SE is roughly a 68% CI.
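
As a rough check on how much coverage a bar of one SE on either side of the mean gives, the sketch below asks a standard normal distribution for the probability of falling within k standard errors. It assumes the sampling distribution of the mean is approximately normal, which is reasonable for large samples but optimistic for very small ones. The function name `normal_coverage` is hypothetical.

```python
from statistics import NormalDist

def normal_coverage(k):
    """Probability that a normal variable lies within k standard errors of its mean."""
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)

print(round(normal_coverage(1.0), 3))   # 0.683
print(round(normal_coverage(1.96), 3))  # 0.95
```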

The 95% confidence interval in experiment B includes zero, so the P value must be greater than 0.05, and you can conclude that the difference is not statistically significant. The SEM bars often do tell you when it's not significant. In this case, 5 measurements were made (N = 5) so the standard deviation is divided by the square root of 5.
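
The link between "the 95% interval for the difference includes zero" and "P is greater than 0.05" can be sketched as follows. This is an illustration under stated assumptions: it uses a normal approximation because the Python standard library has no t distribution, whereas with N = 5 per group a t-based interval would be the usual choice and would be somewhat wider. The data and the function name `compare_means` are made up.

```python
from statistics import NormalDist, fmean, stdev

def compare_means(a, b):
    """Normal-approximation 95% CI for mean(a) - mean(b) and the two-sided P value."""
    diff = fmean(a) - fmean(b)
    # Standard error of the difference between two independent means.
    se_diff = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    z = NormalDist()
    crit = z.inv_cdf(0.975)  # about 1.96
    ci = (diff - crit * se_diff, diff + crit * se_diff)
    p = 2 * (1 - z.cdf(abs(diff / se_diff)))
    return ci, p

# Hypothetical measurements, five per group.
group_a = [10.1, 9.8, 10.4, 10.0, 9.9]
group_b = [10.0, 10.3, 9.7, 10.2, 9.9]
ci, p = compare_means(group_a, group_b)
print(ci, p)
```

Because the interval and the P value are built from the same standard error and the same critical value, the interval contains zero exactly when P exceeds 0.05.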

Is there a better way that we could give our uncertainty in group means, without assuming that things are normally distributed? When I add my std deviation bars, it selects all the men and all the women bars.
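
One common answer to the question about avoiding the normality assumption is the percentile bootstrap, which resamples the observed data with replacement and reads the interval off the resampled means. The sketch below is one possible version, not a method taken from the text above; the reaction times and the function name `bootstrap_ci` are hypothetical.

```python
import random
import statistics

def bootstrap_ci(sample, level=0.95, resamples=5000, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Mean of each resample, drawn with replacement from the observed data.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(resamples)
    )
    tail = (1 - level) / 2
    low = means[int(tail * resamples)]
    high = means[int((1 - tail) * resamples) - 1]
    return low, high

# Hypothetical, right-skewed reaction times in milliseconds.
reaction_times = [212, 230, 245, 251, 260, 268, 275, 290, 310, 420]
print(bootstrap_ci(reaction_times))
```

The interval need not be symmetric about the mean, which is the point when the data are skewed. With very small samples the bootstrap is still only approximate.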

Just use the SE instead of SD and you're good. With fewer than 100 or so values, create a scatter plot that shows every value. These ranges in values represent the uncertainty in our measurement.

However, we are much less confident that there is a significant difference between 20 and 0 degrees or between 20 and 100 degrees.

So how many of the researchers Belia's team studied came up with the correct answer?

