This is a pretty hot topic in CRO at the moment, and it ties in with an existential crisis in the industry: the value and impact of CRO are being limited by too much focus on the direct, short-term ROI it can provide. Some people are tackling this by pushing to call it something else, in order to drive a wedge between the hacky practitioners who sell CRO programs by promising huge increases in conversion rate and revenue, and those who take a more responsible approach and focus on the value of building a program and shifting the company culture towards data-driven experimentation rather than top-down, opinion-based changes.

And the thing is, they're both right, in a way. CRO absolutely can deliver some impressive boosts to revenue in a fairly short timeframe. But if this is all you're interested in and you treat CRO as just another channel, you are missing most of the value it can bring.

But whichever camp you're in, and no matter how responsible you are with setting the right expectations, at some point somebody important is going to want to know how their investment is contributing to the bottom line. If you're an agency or freelancer this would be your client, and if you're an in-house CRO this would be whoever is ultimately responsible for the budget that pays for what you're doing. The good news is that since CRO happens on your (or your client's) website where you (hopefully) have a solid tracking and reporting setup, it's one of the easiest activities to measure. The usual pitfalls of attribution tracking don't really apply to CRO, since the same pool of traffic is split randomly between variants and you use statistical methods to gain confidence that any effect you see is caused by the change you are testing.

When it comes to the metrics and methods you can use to measure the impact of your optimization activities, there are 3 main layers involved:

  1. Test-level metrics used to measure the performance of each test
  2. Business impact metrics to measure how all of the tests are affecting the bottom line
  3. Project- or program-level metrics to measure the non-financial aspects of your CRO program

Test-Level Metrics

This is what people are most familiar with. CRO is about optimizing the conversion rate after all, right? Well, sorta, but that's really not the full story. Some people take a pretty narrow definition of what a "conversion" is and say it applies only to purchases (for an ecom store) or leads, but I like to open this up a bit to more of a dictionary definition and say that a conversion is any time you convert something from one state to another. So it could be converting store visitors into customers. But it could also mean converting someone who views a page to someone who takes an action on a page or progresses to the next funnel step (often called a micro-conversion). Or converting a one-time buyer into a repeat customer. You get the idea. But even with this more generous definition, conversions are just one way to measure the performance of an experimental treatment.

In the context of an e-commerce store, of course final conversions and the micro-conversions that lead to them are important, but what if your test results in more people purchasing, but they are all buying cheaper products? How would you decide whether it's beneficial to implement that change? This is where a metric like RPU (revenue per user) comes into play. It's basically conversion rate multiplied by AOV (average order value), and it tells you how well you are monetizing every visitor who comes to your store.
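To make that trade-off concrete, here is a minimal Python sketch. The `VariantResult` class and all of the numbers are made up for illustration; they don't come from a real test.

```python
from dataclasses import dataclass


@dataclass
class VariantResult:
    users: int      # visitors bucketed into this variant
    orders: int     # purchases made by those visitors
    revenue: float  # total revenue from those purchases

    @property
    def conversion_rate(self) -> float:
        return self.orders / self.users

    @property
    def aov(self) -> float:
        return self.revenue / self.orders

    @property
    def rpu(self) -> float:
        # Same thing as conversion_rate * aov
        return self.revenue / self.users


# Illustrative numbers only: the variant converts better but sells cheaper items.
control = VariantResult(users=10_000, orders=300, revenue=24_000.0)
variant = VariantResult(users=10_000, orders=330, revenue=23_100.0)

for name, r in (("control", control), ("variant", variant)):
    print(f"{name}: CR={r.conversion_rate:.2%}  AOV=${r.aov:.2f}  RPU=${r.rpu:.2f}")
```

In this made-up case the variant wins on conversion rate but loses on RPU, which is exactly the situation where judging by conversion rate alone would lead you to implement a change that costs you money.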

Now I should caution you that while basing tests on RPU is a great idea in theory, in practice it's much more difficult to use in statistical significance calculations. This is because it's a "non-binomial metric", which means that it can take a range of values, whereas something like conversion rate is based on a simple yes or no question: did the user make a purchase, or not? What this means is that you'll need to prepare your data before even starting the test, and you'll have to use a special stats calculator where you upload the raw test data, rather than just the traffic and conversion numbers like you do with a regular AB test calculator. I won't get into the weeds too much on this topic, but if you're interested then feel free to reach out.
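If you're curious what working with the raw data can look like, one common option (an assumption on my part, not the only valid method) is a percentile bootstrap on per-user revenue. The sketch below uses only the Python standard library. The function name `bootstrap_rpu_diff` and the sample data are hypothetical.

```python
import random
from statistics import fmean


def bootstrap_rpu_diff(control, variant, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(variant) - mean(control).

    Each input is a list with one revenue value per user, including 0.0
    for every user who did not purchase.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample users with replacement within each arm
        c = rng.choices(control, k=len(control))
        v = rng.choices(variant, k=len(variant))
        diffs.append(fmean(v) - fmean(c))
    diffs.sort()
    lo = diffs[int(n_resamples * alpha / 2)]
    hi = diffs[int(n_resamples * (1 - alpha / 2)) - 1]
    return fmean(variant) - fmean(control), lo, hi


# Made-up raw data: one entry per user, mostly zeros.
control_revenue = [0.0] * 970 + [80.0] * 30
variant_revenue = [0.0] * 960 + [75.0] * 40

point, lo, hi = bootstrap_rpu_diff(control_revenue, variant_revenue)
print(f"RPU difference: {point:.2f} (95% interval {lo:.2f} to {hi:.2f})")
```

If the interval includes zero, the data doesn't let you rule out "no difference in RPU". Real revenue data tends to be heavily skewed by a few large orders, so treat this as a starting point rather than a full analysis.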

All of the metrics I've mentioned so far are related to what happens while the user is on the website, but what about what happens after that? This is where downstream metrics come in, and they are a way to measure the quality, rather than the quantity, of conversions. For leads this would usually be the lead-to-opportunity rate, or the lead-to-close rate. There's not much value in pushing a whole bunch more leads to your sales team if they have a really low chance of converting into sales. Not only will the sales team be unhappy, but following up on an influx of low-quality leads will waste a lot of time and money. The same idea applies to ecom: you could run a test that pushes more people to make a purchase, but then they disappear and never buy from you again. So it's always a good idea to at least keep an eye on how your tests are affecting these downstream metrics. It's a bit more challenging to measure since your analytics setup likely won't extend very far beyond the website, but it is doable. It usually involves pushing experiment IDs and variant details to your CRM so that you can run some analysis later.
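As a rough sketch of that later analysis, assume your CRM export gives you one record per lead, tagged with the experiment ID and variant. The field names (`experiment_id`, `variant`, `closed_won`) and the data are hypothetical, so adjust them to whatever your CRM actually exports.

```python
from collections import defaultdict

# Hypothetical CRM export: one record per lead.
leads = [
    {"lead_id": 1, "experiment_id": "exp-042", "variant": "control", "closed_won": True},
    {"lead_id": 2, "experiment_id": "exp-042", "variant": "control", "closed_won": False},
    {"lead_id": 3, "experiment_id": "exp-042", "variant": "control", "closed_won": True},
    {"lead_id": 4, "experiment_id": "exp-042", "variant": "control", "closed_won": False},
    {"lead_id": 5, "experiment_id": "exp-042", "variant": "variant", "closed_won": True},
    {"lead_id": 6, "experiment_id": "exp-042", "variant": "variant", "closed_won": False},
    {"lead_id": 7, "experiment_id": "exp-042", "variant": "variant", "closed_won": False},
    {"lead_id": 8, "experiment_id": "exp-042", "variant": "variant", "closed_won": False},
    {"lead_id": 9, "experiment_id": "exp-042", "variant": "variant", "closed_won": False},
    {"lead_id": 10, "experiment_id": "exp-042", "variant": "variant", "closed_won": False},
    {"lead_id": 11, "experiment_id": "exp-077", "variant": "control", "closed_won": True},
]


def close_rate_by_variant(leads, experiment_id):
    """Lead-to-close rate per variant for a single experiment."""
    totals = defaultdict(int)
    won = defaultdict(int)
    for lead in leads:
        if lead["experiment_id"] != experiment_id:
            continue  # ignore leads from other experiments
        totals[lead["variant"]] += 1
        won[lead["variant"]] += int(lead["closed_won"])
    return {variant: won[variant] / totals[variant] for variant in totals}


rates = close_rate_by_variant(leads, "exp-042")
for variant, rate in rates.items():
    print(f"{variant}: lead-to-close rate {rate:.1%}")
```

In this toy data the variant produces more leads (6 vs. 4) but fewer of them close, which is the "more leads, lower quality" pattern described above. With real data you would want far more leads per variant before reading anything into the difference.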

Business Impact Metrics

Usually when people are talking about measuring the impact of a CRO program, they are talking about one thing: the increase in revenue created by implementing winning tests. This is typically calculated by taking the lift of the test, say 10%, and applying that to the total number of users that will be exposed to the change. So if you ran a test on the homepage, and you currently have $10k worth of revenue flowing through the homepage per week, that test would theoretically bring you an extra $1000 of revenue every week, assuming the AOV is unchanged. If you crunch the same sort of numbers for every winning test (because obviously you don't implement the losing tests), that gives you an idea of the extra revenue those tests are generating.
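The arithmetic for this naive calculation is simple enough to sketch in a few lines of Python. The first entry mirrors the homepage example above. The second test and its numbers are invented for illustration.

```python
def naive_weekly_gain(weekly_revenue: float, relative_lift: float) -> float:
    """Extra weekly revenue, assuming the test lift carries over unchanged."""
    return weekly_revenue * relative_lift


# Winning tests only. The second entry is made up.
winning_tests = [
    {"name": "homepage test", "weekly_revenue": 10_000.0, "lift": 0.10},
    {"name": "cart page test", "weekly_revenue": 4_000.0, "lift": 0.05},
]

total_gain = 0.0
for test in winning_tests:
    gain = naive_weekly_gain(test["weekly_revenue"], test["lift"])
    total_gain += gain
    print(f"{test['name']}: +${gain:,.0f} per week")

print(f"Naive total: +${total_gain:,.0f} per week")
```

Note that simply adding the gains together assumes the tests don't overlap or interact, which is one more simplification baked into this approach.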

But, as I'll explain, this is an overly simplistic view. There are 3 main reasons for this:

  1. You don't actually know what the true effect will be - if during the experiment your variant performs 10% better than control, and you run the stats and the result is significant at the 95% confidence level, you might be tempted to think that you can expect a 10% increase in performance. But it doesn't necessarily work that way. The 10% lift is what helps get you over the significance threshold, but there's a confidence interval with a lower and upper limit of what the true effect is likely to be. If you have to estimate it, then the effect you see in the test is probably the best bet, but it's not guaranteed.
  2. The effect will likely fade over time - there are a lot of reasons for this, like tactic fatigue for example, but the reality is that a web store is not a "set it and forget it" type of thing. You need to keep innovating and keep things fresh even to maintain the same baseline. If you stop improving, performance will slowly degrade. For this reason most CRO ROI calculations have some sort of ageing built in, where the test is assumed to increase revenue by a certain % for the first few months, and then the effect is reduced by 10-20% per month until it goes to zero. There's no way to estimate this fading effect accurately, so some sort of model and assumptions need to be used.
  3. It doesn't take into account the losses you prevent by not implementing losing tests - sure, coming up with an awesome idea that boosts your conversions is the fun part of CRO, and that's what all the case studies are about. But there's also the other side to it, where if you weren't testing your website changes, a lot of the stuff you launch would actually hurt your website performance. So when you run a test that loses big time, you've actually done something really valuable because now you know not to implement that change. If you're just communicating the positive effect on revenue from winning tests, then you are missing the risk management part of the equation, so it's always a good idea to illustrate the revenue loss that the CRO program has prevented.
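Here is a minimal Python sketch of how these three caveats might show up in a model. Everything in it is an assumption for illustration: the normal-approximation interval, the 3 months at full effect, the linear 20% monthly reduction, and all of the numbers. Your own model will likely use different choices.

```python
import math


def diff_confidence_interval(conv_c, n_c, conv_v, n_v, z=1.96):
    """Normal-approximation 95% interval for the absolute difference
    in conversion rate (variant minus control)."""
    p_c, p_v = conv_c / n_c, conv_v / n_v
    se = math.sqrt(p_c * (1 - p_c) / n_c + p_v * (1 - p_v) / n_v)
    diff = p_v - p_c
    return diff - z * se, diff + z * se


def decayed_total(monthly_effect, months_at_full=3, monthly_decay=0.20, horizon=12):
    """Total effect over `horizon` months: full effect for the first
    `months_at_full` months, then reduced linearly by `monthly_decay`
    (as a share of the original effect) each month until it hits zero."""
    total = 0.0
    for month in range(horizon):
        months_decaying = max(0, month - months_at_full + 1)
        factor = max(0.0, 1.0 - monthly_decay * months_decaying)
        total += monthly_effect * factor
    return total


# Caveat 1: an observed 10% relative lift (3.0% -> 3.3%) has a range around it.
lo, hi = diff_confidence_interval(3_000, 100_000, 3_300, 100_000)
baseline = 3_000 / 100_000
# Dividing by the baseline is a rough shortcut that ignores the
# uncertainty in the baseline rate itself.
print(f"Plausible relative lift: {lo / baseline:.1%} to {hi / baseline:.1%}")

# Caveat 2: a winner worth $4,000/month at first, with the effect fading out.
winner_value = decayed_total(4_000.0)

# Caveat 3: a loser that would have cost $2,000/month had it shipped untested.
# Assumes the damage would fade on the same schedule, which is a guess.
loss_prevented = decayed_total(2_000.0)

print(f"Winner value over 12 months: ${winner_value:,.0f}")
print(f"Loss prevented over 12 months: ${loss_prevented:,.0f}")
```

Reporting the prevented loss alongside the winner value is one simple way to make the risk management side of the program visible to stakeholders.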

With all this uncertainty, should you even bother reporting on the ROI at all? In most cases you probably won't have a choice, but as long as you are clear with the stakeholders that the ROI is just an estimate based on a model with a lot of assumptions, and you educate them about the value the program is bringing over and above the boost in revenue, then there's no need to shy away from this. And usually, even if you are really conservative with your assumptions, the ROI calculation is going to show some pretty impressive numbers.

Ok, so what about the impact on the site's overall conversion rate? This is another angle for measuring the impact of your experiments, and the good thing about it is that it's not a forecast or estimate, it's something you can actually just measure directly. But some of the factors mentioned above, like effects fading over time, come into play here as well. You quite simply can't assume that your conversion rate would stay the same if you weren't running any AB tests, so again make sure to set expectations accordingly, and don't be too disappointed if your CR isn't going up over time as much as your test results suggest it should.

I've seen people discuss holding a certain percentage of website traffic to an original control state, where no test treatments are implemented at all. Theoretically this would give you a way to see the actual effect of your optimization efforts (a sort of AB test of the CRO process itself), but I'm skeptical about how this would work in practice. For one, there would be a cost associated with it, since you would be forgoing the revenue increase for that holdout group. And then there are issues with cookie deletion / expiration and people returning on different devices, so you could never be sure that any particular person was consistently seeing that holdout control experience without being exposed to the "real" version of the site.
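For what it's worth, the assignment part of a holdout is usually the easy bit. A minimal sketch in Python, assuming you have some stable user identifier to hash (the function name, salt and 5% default are all hypothetical):

```python
import hashlib


def in_holdout(user_id: str, holdout_pct: float = 5.0, salt: str = "cro-holdout-v1") -> bool:
    """Deterministically place roughly `holdout_pct` percent of users in the holdout."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 10_000  # bucket in the range 0..9999
    return bucket < holdout_pct * 100


# The same ID always gets the same answer, on any device or session.
print(in_holdout("user-123"), in_holdout("user-123"))
```

The catch is the "stable identifier" assumption. For logged-in users this works fine, but for anonymous visitors the ID typically lives in a cookie, which brings you right back to the deletion and cross-device problems described above.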

Program-Level Metrics

This is a bit outside the scope of this article, but I wanted to at least mention the metrics you can use to measure the testing program itself. Typically this would be things like test velocity (number of tests run within a certain time period), win rate (percentage of tests that are "winners"), and average uplift. Which ones to use depends a lot on how your CRO team is organized and what the program goals are at any particular time. I'd also argue that win rate and average uplift are not as important as they may seem, since they place the focus too much on winning tests, and as I've discussed this is just a small part of the value of a CRO program.

A better metric, especially when the AB testing program is just getting started, would be the percentage of website changes that are tested before being published. Making the cultural shift from a release cycle where changes are just rolled out when they are ready, to launching every change as an experiment, can be a huge challenge, and it takes a lot of time, effort and education. You'll run into a lot of friction at first even if people mostly understand that it's a positive shift. So measuring and reporting on the progress of this shift can be a really valuable way to provide some visibility into how things are going.

Conclusion

As you can see, measuring CRO seems pretty simple on the surface (it's just about the conversion rate, right?), but there's a lot of complexity under the hood, and everything is dependent on your specific business context and optimization strategy. It would take many articles to do this topic justice, but hopefully this quick rundown has shed some light on some of the different approaches you can take when it comes to measuring not just experiments, but the experimentation program as a whole. Happy optimizing!