Rocking vs Galloping
Facebook is known for many things, one of which is the posters plastered on its walls. "Move Fast and Break Things" may capture the headlines at F8, but our favorite is the one that warns against mistaking motion for progress.

How does this relate to measurement?
Many teams set up a testing process that creates a lot of motion but not much progress. A fantastic example is Mobile Dev Memo's recent argument about A/B testing creative on Facebook: it simply shouldn't be done.
"Split Testing is probably the most expensive way of vetting creatives...Split Testing violates the dynamics of Facebook’s ad serving algorithm; it’s a concept that isn’t appropriate for the platform and does little more than satiate a desire to put a checkmark next to a “Tested?” line item on a process checklist."
Theory vs Practice
"In theory, there is no difference between theory and practice. But in practice, there is." - Benjamin Brewster
It is always important that a statistical test model reality (i.e., what you're actually doing on a day-to-day basis). The reason A/B testing creative on Facebook "violates the dynamics of Facebook’s ad serving algorithm" is that the test does not model the realities of the Facebook algorithm.
"Facebook initially distributes traffic to the ads equally but recognizes that the audiences for each are different and thus defines new, creative-specific audiences in real time to optimize performance. A more efficient way of testing ad creatives on Facebook than Split Testing is to simply put the creatives in an ad set [and] let Facebook manage traffic distribution to them. And if audiences need to be segmented or geo-fenced, then that can be accomplished with different ad sets — there’s almost no reason to Split Test creatives, since the audience used in a Split Test won’t be the one exposed to any given creative at scale."
Let's Go Galloping
How many agencies have you heard discuss their creative split testing framework? How many meetings have you sat in that showcased their latest round of creative split testing? Our guess is at least several, as this type of work is a constant in advertising. Teams believe that all motion is progress, while in reality much of it is wasted effort and wasted client money.
Why put resources toward setting up and analyzing an A/B test that doesn't model real-world conditions? A/B testing is an important statistical tool; however, it needs to be used in situations where the testing conditions model real-life conditions. Ads in a newspaper fit this condition. Ads in a news feed do not.
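For reference, the inference a classic A/B test leans on looks something like the sketch below: a two-proportion z-test whose math assumes traffic was randomly assigned and the split held fixed for the whole test (the counts here are hypothetical). That assumption holds for a newspaper placement or a true holdout; it does not hold once the platform re-allocates delivery mid-flight.

```python
# Minimal sketch of the statistics behind a standard A/B readout.
# The counts are hypothetical; the test is only valid if both cells
# received comparable, randomly assigned audiences for the full test.
from math import sqrt, erfc

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: the two conversion rates are equal."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail probability
    return z, p_value

z, p = two_proportion_z_test(conv_a=300, n_a=10_000, conv_b=245, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```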
When you can match test conditions to real life, measurement agendas begin to trot. And when the C-suite can all explain what occurred in the study, that is when they gallop. (More on how measurement is actually a persuasion problem wrapped up as an analysis problem at a later date.)
If you feel your team's or agency's learning agenda is creating a lot of motion but not much progress, chances are you're right. Don't fret! The best time to make a change was last year; the second-best time is today.
And for the sake of all those involved (including your checkbook), stop using the split testing tool to vet creatives on Facebook!
--
After running 257 lift studies across 30+ clients, we thought it would be helpful to write a series on common mistakes and how to avoid them. This write-up is part 5 in the series we call Adventures in Significance.
It is a subject many dread, but it is core to the direct response advertising world. We thought we could do our part with a round-up of our favorite opinions on the subject and how they apply to your business.
--
Thanks to Helena Lopes for sharing their work on Unsplash.