How Long Should You Run a Split Test For?
In SEO and internet marketing, split testing is one of the most powerful techniques available to marketers. A split test essentially means taking two slightly different versions of a webpage or website, and then seeing which one performs best. This is so effective because it lets you make sure that a strategy or a change will be beneficial before you employ it across your entire site.
That means you can make smarter, data-driven decisions that are more likely to improve your ranking, conversions and profits.
The problem is that many people don’t fully understand how a split test works, and they cut the experiment short prematurely. This ultimately results in skewed findings which can mislead you on the best way to act!
So how long should you run your split test, and why?
Why Length Matters for Split Tests
The ideal answer to the question of how long you should run a split test for is: indefinitely. The longer a split test runs, the more data you can collect, and therefore the more likely it is to be accurate.
The reason for this has to do with what are known as confounding variables. A confounding variable is any factor you don’t control for that differs between the two groups and influences the outcome, so the difference you measure may not be caused by the change you made.
Let’s say, for example, that you have two versions of a website. On version A you use one font, and on version B you use another. Your hope is to see whether one of the fonts increases the amount of time people stay on your website, which should in turn also improve the SEO.
Half of the visitors are sent to version A, and half are sent to version B. You run the test for 48 hours.
But by sheer coincidence, a large proportion of the people who go to version A of the website happen to be from the UK. And let’s say for argument’s sake, that people from the UK prefer a particular font.
You have no way of testing for this, and therefore the results you get aren’t accurate. You assume that the new font is better and put it on every page of your website, even though most of your visitors actually prefer the old one, thereby hurting your ranking significantly.
Had you run the test for five weeks, though, the likelihood of this being a sheer coincidence would have shrunk remarkably. Throwing heads ten times in a row is unlikely, but throwing heads 100 times in a row is so unlikely that it is safe to discount.
This is what you are aiming for with your split tests. You can never be 100% sure, but the longer the test runs, the more sure you can be.
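The coin-toss comparison can be checked with a few lines of arithmetic. This is only a minimal sketch, assuming a fair coin and independent throws; the function name is made up for illustration.

```python
# Probability that a fair coin lands heads n times in a row.
# Assumes each throw is independent with a 50% chance of heads.
def prob_all_heads(n: int) -> float:
    return 0.5 ** n

p10 = prob_all_heads(10)    # exactly 1 in 1,024
p100 = prob_all_heads(100)  # vanishingly small

print(f"10 heads in a row:  {p10:.6f}")
print(f"100 heads in a row: {p100:.3e}")
```

Ten heads in a row will happen now and then across many short experiments, while 100 in a row effectively never will. A longer test makes a lucky streak less plausible in the same way.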
The good news is that there are ways to test for significance, using, for example, a chi-squared test. This way, you get a number (called the p-value, or P) that tells you how likely you would be to see a difference at least this large purely by chance if the two versions really performed the same. By convention, as long as it’s lower than 0.05, the result is considered significant enough to act on.
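As a rough illustration, here is one way a chi-squared test on a split test might be computed by hand in Python. A chi-squared test works on counts, so this sketch assumes you are comparing how many visitors converted on each version rather than time on site. The function name and the visitor numbers are invented for the example, and no continuity correction is applied.

```python
import math

def chi_squared_2x2(conv_a, total_a, conv_b, total_b):
    """Pearson chi-squared test on a 2x2 table (no continuity correction).

    Returns (chi2, p_value) for converted vs. not converted on A and B.
    """
    observed = [
        [conv_a, total_a - conv_a],
        [conv_b, total_b - conv_b],
    ]
    grand_total = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [conv_a + conv_b, grand_total - conv_a - conv_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count we would expect in this cell if A and B performed the same.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, for which the chi-squared
    # tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical data: 100 of 1,000 visitors converted on A, 140 of 1,000 on B.
chi2, p = chi_squared_2x2(conv_a=100, total_a=1000, conv_b=140, total_b=1000)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```

With these made-up numbers the p-value comes out well below 0.05, so the difference would be treated as significant. With far fewer visitors, the same conversion rates could easily fail the test.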