New rules for the presentation of statistics in the Nature journals are described in the June Editorial of Nature Cell Biology (11, 667; 2009). From the Editorial:
Thanks to advanced imaging technologies and better integration with molecular and systems approaches, cell biology is undergoing something of a renaissance as a quantitative science. Robust conclusions from quantitative data require a measure of their variability. Cell biology experiments are often intricate and measure complex processes. Consequently, the number of independent repeats of a measurement can be limited for practical reasons, yet the variability of the measurements can be rather high. Cell biologists have developed good intuition to guide their analysis of such constrained datasets. Biological complexity and the reliance on intuition can cause culture shock to physical scientists crossing over into cell biology (a kind of extension of the celebrated ‘two cultures’ concept of C. P. Snow).
With the arrival of quantitative information and ‘-omic’ datasets, statistical analysis becomes a necessity to complement instinct. The problem is that statistical tools are built on basic assumptions, such as the independence of replicate measurements and the normality of the data distribution. Usually, sizeable datasets are a prerequisite for statistical analysis. Alas, these can be as hard to come by as a biostatistician (n is typically well below 5). The result is that all too often statistics (frequently undefined ‘error bars’) are applied to data where they are simply not warranted.
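To see why statistics on very small n can mislead, consider a minimal simulation (our illustration, not part of the Editorial): with only three replicates, the sample standard deviation, and hence any error bar derived from it, is itself wildly variable even when every ‘experiment’ samples the same underlying distribution.

```python
# Sketch (hypothetical example): how variable is an error bar built from
# a triplicate measurement? We simulate many triplicate "experiments"
# drawn from the same normal distribution (true SD = 1.0) and look at
# the spread of the estimated standard deviations.
import random
import statistics

random.seed(42)  # fixed seed so the illustration is reproducible

TRUE_SD = 1.0
est_sds = []
for _ in range(1000):
    triplicate = [random.gauss(0.0, TRUE_SD) for _ in range(3)]
    est_sds.append(statistics.stdev(triplicate))  # sample SD, n = 3

print(f"true SD: {TRUE_SD}")
print(f"estimated SDs range from {min(est_sds):.2f} to {max(est_sds):.2f}")
```

Identical underlying biology, yet the n = 3 error bars differ from one another many-fold: the bar reflects sampling noise at least as much as real variability.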
There are no easy solutions to rectify the prevalence of poor statistics in cell biology studies. However, an obvious recommendation is to consult a statistician when planning quantitative experiments. Consider whether n represents independent experiments (you may actually be publishing a measure of the quality of your pipette!) and whether it is large enough for the test applied. Avoid showing statistics when they are not justified; instead, show ‘typical’ data or, better still, all the measurements. Importantly, displaying unwarranted statistics attributes a misleading level of significance to the data. Always describe and justify any statistical analysis applied. We have updated our guidelines to reflect these recommendations. One key rule: if the number of independent repeats is fewer than the fingers of one hand, show the actual measurements rather than error bars. If you wish to present error bars, include the actual measurements alongside them.
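The ‘fingers of one hand’ rule is simple enough to encode. A minimal sketch (the function name and formatting are ours, not the journal's): with fewer than five independent repeats, report the raw values; only with more does a mean with a defined error estimate become defensible.

```python
# Sketch of the n < 5 rule (hypothetical helper, not the journal's code):
# too few independent repeats -> show the raw measurements;
# otherwise a mean with a *defined* error estimate may be justified.
import statistics

def summarise(measurements):
    """Return a display string following the n < 5 rule."""
    n = len(measurements)
    if n < 5:
        # Too few repeats for meaningful error bars: show the data.
        return "n={}; values: {}".format(
            n, ", ".join(f"{m:.2f}" for m in measurements))
    mean = statistics.mean(measurements)
    sd = statistics.stdev(measurements)  # sample SD; say so in the legend
    return f"n={n}; mean={mean:.2f}, s.d.={sd:.2f}"

print(summarise([1.2, 1.5, 0.9]))            # few repeats -> raw values
print(summarise([1.2, 1.5, 0.9, 1.1, 1.3]))  # enough for a summary
```

Note that even in the second branch the statistic shown must be named explicitly (s.d., s.e.m., confidence interval), per the updated guidelines.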
Finally, please remember that you are interrogating a complex system: be careful not to discard ‘outlier’ data points on a whim, as they may well be as relevant as clustered measurements. One is naturally inclined to ignore data that do not match the hypothesis tested, but biology is rarely as black and white as we would like. Do not let ‘hypothesis-driven’ research become ‘hypothesis-forced’!
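A small worked example (ours, with made-up numbers) shows why casual outlier removal is dangerous: dropping one point can change the reported result, whereas a robust summary such as the median keeps the point in the dataset while limiting its influence.

```python
# Sketch (hypothetical data): one point looks like an outlier. Dropping
# it shifts the mean; a robust summary (the median) retains it while
# limiting its leverage.
import statistics

measurements = [0.9, 1.0, 1.1, 1.0, 3.2]  # last point looks anomalous

mean_all = statistics.mean(measurements)          # pulled up by the outlier
mean_trimmed = statistics.mean(measurements[:-1]) # result if we discard it
median_all = statistics.median(measurements)      # barely moved by it

print(f"mean (all): {mean_all:.2f}")        # 1.44
print(f"mean (trimmed): {mean_trimmed:.2f}")# 1.00
print(f"median (all): {median_all:.2f}")    # 1.00
```

Whether 3.2 is noise or the most interesting measurement in the set is a biological question, not a statistical one; the robust summary defers that judgement instead of silently pre-empting it.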