I’m going to avoid diving into statistical analysis here, although I will write a blog post about that shortly. I also cover it briefly in my book.
Whenever you take a measurement there is always a margin of error. There are two types of error: systematic errors, which are part of the measurement method, and random errors, which, as the name suggests, come from things that vary randomly outside the measurement system.
Systematic errors are things like out-of-calibration equipment. They reduce accuracy but tend to be repeatable. If you are using the same measurement system to compare two setups, for example, the results for each might not be accurate, but in the absence of random errors the differential, if any, will be valid.
Random errors are things like environmental changes or human error: your ability to reproduce an exact position on the bike, for example. Random errors affect the precision (repeatability) of your measurements, causing your results to be scattered around a mean value.
When aero testing, the biggest component of drag (your body) is also the biggest potential source of random error.
Before you start testing alternatives to see which is better, you first need to establish the repeatability (the precision) of your measurement system.
You can do this with a series of test runs, at least seven, using identical equipment and what we are assuming is an identical riding position. What you are evaluating is your ability to repeat your position for each test run, having dismounted and remounted between runs.
You will get a set of results. The difference between the highest and the lowest result is the range, or spread. The average value of the results is the mean value. The magnitude of the range is an indication of the degree of precision of your measurement system, including the influence of any random errors from your body position and clothing. You might get a range of 0.008 and a mean CdA of 0.205; taking half the range either side of the mean, you could say that your CdA is 0.205 plus or minus 0.004.
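Here is that arithmetic as a quick Python sketch. The seven run values are made up for illustration, chosen to reproduce the 0.205 plus or minus 0.004 example above.

```python
# Hypothetical CdA results from seven repeatability runs (made-up numbers).
runs = [0.201, 0.205, 0.203, 0.207, 0.209, 0.204, 0.206]

mean_cda = sum(runs) / len(runs)      # mean value
spread = max(runs) - min(runs)        # range: highest minus lowest
half_range = spread / 2               # the "plus or minus" figure

print(f"CdA = {mean_cda:.3f} +/- {half_range:.3f} (range {spread:.3f})")
# CdA = 0.205 +/- 0.004 (range 0.008)
```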
Let’s suppose that you now want to test two helmets. You do alternating A-B test runs, seven runs of each, so fourteen runs in total. You now have a range and a mean value for each helmet. If the ranges overlap significantly, you can’t say with any degree of confidence that there is a difference between the two helmets, even if their mean values do show a difference.
In statistical terms, the real value for each helmet is likely to be somewhere within its range: ninety-five percent likely if you work it up into a proper 95 percent confidence interval, the sort of statistical heavy lifting you can hand off to AI.
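For completeness, here is a minimal sketch of that heavy lifting in Python, assuming scipy is available; the run values are the same made-up numbers as above.

```python
from statistics import mean, stdev
from scipy.stats import t

# Hypothetical CdA results from seven runs (made-up numbers).
runs = [0.201, 0.205, 0.203, 0.207, 0.209, 0.204, 0.206]

n = len(runs)
m = mean(runs)
se = stdev(runs) / n ** 0.5        # standard error of the mean
t_crit = t.ppf(0.975, df=n - 1)    # two-sided 95% critical value, 6 degrees of freedom
margin = t_crit * se

print(f"95% confidence interval: {m:.3f} +/- {margin:.4f}")
```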
If you have one result that is 0.205 plus or minus 0.004 and one that is 0.212 plus or minus 0.004, the ranges overlap (0.201 to 0.209 against 0.208 to 0.216), so you don’t have a statistically significant difference between the two CdA numbers.
If you have one result that is 0.205 plus or minus 0.002 and one that is 0.212 plus or minus 0.002, the ranges don’t overlap (0.203 to 0.207 against 0.210 to 0.214), so you do have a statistically significant difference between the two CdA numbers.
If you have one result that is 0.205 plus or minus 0.003 and one that is 0.212 plus or minus 0.003, the ranges again fail to overlap, if only just (0.202 to 0.208 against 0.209 to 0.215), so you again have a statistically significant difference, albeit by the narrowest of margins.
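Checking those three cases in code is a one-liner per pair; here is a sketch (the intervals_overlap helper is hypothetical, written for this example).

```python
def intervals_overlap(mean_a, pm_a, mean_b, pm_b):
    """True if the ranges (mean +/- pm) of two results overlap."""
    return max(mean_a - pm_a, mean_b - pm_b) <= min(mean_a + pm_a, mean_b + pm_b)

# The three worked examples from the text: means 0.205 and 0.212.
for pm in (0.004, 0.002, 0.003):
    verdict = "no confident difference" if intervals_overlap(0.205, pm, 0.212, pm) else "real difference"
    print(f"+/- {pm}: {verdict}")
# +/- 0.004: no confident difference
# +/- 0.002: real difference
# +/- 0.003: real difference
```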









