A/B testing, is it worth the effort?

Industry: Banking

This year we've run a few A/B tests on subject lines alone for B2B e-mail campaigns. Our "standard" way of splitting the campaign audience was to halve the dataset: the first half of the e-mail addresses receives Subject Line A, and the second half receives Subject Line B. This got me thinking that maybe customers will read just about anything we send them, and if our content is unclear they'll hit reply and/or call their salesperson. It also occurred to me that higher campaign readership in one half of the population might simply reflect how clean that half of the dataset was.
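Cutting the list at the midpoint ties each group to whatever order the file happens to be in, which is one way a "cleaner" half could creep in. A minimal sketch of a randomized split instead, assuming the audience is a plain list of addresses. The function name `split_ab`, the seed, and the example addresses are illustrative only, not part of the actual campaign process:

```python
import random


def split_ab(addresses, seed=2024):
    """Shuffle a copy of the list, then cut it at the midpoint."""
    shuffled = list(addresses)              # leave the original order untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


# Hypothetical addresses for illustration only.
audience = [f"customer{i}@example.com" for i in range(10)]
group_a, group_b = split_ab(audience)

print(len(group_a), len(group_b))
```

Every address still lands in exactly one group. The only change is that file order no longer decides which subject line a customer sees.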

All three broadcasts used the same customer dataset.

E-mail Focus: Product

Test A: NINA/No Doc - $750K to 80% LTV
Test B: NINA/No Doc - Non-Traditional Solutions

E-mail Focus: Product

Test A: New 1-Month LIBOR for the Building Season
Test B: New Construction Solutions

E-mail Focus: Technology Enhancement

Test A: Private Label Marketing A Click Away
Test B: Let Us Market for You

The test results are as follows:

Month  Test  E-mails sent  1-wk opens  1-wk open rate  3-wk opens  3-wk open rate

March  A     21019         9809        47%             10450       50%
March  B     21022         9286        44%             9840        47%

May    A     22024         7766        35%             8152        37%
May    B     22028         9563        43%             10153       46%

July   A     22600         8698        38%             9580        42%
July   B     22604         9468        42%             10330       46%

You might infer that a 3-point difference in open rate (March) is not a meaningful one. Even in market research, where our audience size was much smaller, a difference of at least 8-10 points is noteworthy, but not 3. Not for this population size. Maybe if we did consumer marketing and broadcast to millions of addresses per month this would be more relevant, but I digress. July is about the same, with 4 points not being that significant either.
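One way to put a number on that intuition is a two-proportion z-test on the 1-week opens from the table above. This is a rough sketch, not a formal analysis: it assumes every recipient is an independent trial and that an "open" is recorded reliably. The function name `two_proportion_z` is illustrative.

```python
from math import erfc, sqrt


def two_proportion_z(opens_a, sent_a, opens_b, sent_b):
    """Return (rate difference A minus B, z statistic, two-sided p-value)."""
    rate_a = opens_a / sent_a
    rate_b = opens_b / sent_b
    # Pooled open rate under the assumption that A and B perform the same.
    pooled = (opens_a + opens_b) / (sent_a + sent_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = (rate_a - rate_b) / std_err
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return rate_a - rate_b, z, p_value


# 1-week figures from the results table: (opens A, sent A, opens B, sent B)
campaigns = {
    "March": (9809, 21019, 9286, 21022),
    "May": (7766, 22024, 9563, 22028),
    "July": (8698, 22600, 9468, 22604),
}

results = {month: two_proportion_z(*row) for month, row in campaigns.items()}
for month, (diff, z, p) in results.items():
    print(f"{month}: A minus B = {diff:+.1%}, z = {z:+.2f}, p = {p:.2g}")
```

With more than 20,000 sends per half, even a gap of a few points produces a large z value. A test like this only speaks to whether the gap is likely to be chance. Whether a gap of that size is worth acting on is a separate, business question.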

So what happened with the May campaign? Perhaps it isn't data related at all. Spring through fall is when a lot of home construction and remodeling goes on. A product that has good rates for just its first month isn't as appealing as, say, a more general umbrella of solutions offered to a customer.

One thought in the aftermath of the May campaign was that dataset B responded more favorably to campaign broadcasts. Sure, I might be inclined to believe that. However, for the July A/B campaign, the second half of the population was sent the A campaign, and the first half was sent the B campaign. As the results show, the audience didn't care one way or the other.

Analysis: Inconclusive. The only good way to measure e-mail campaign success is to tie the data to quantitative metrics: how much loan volume resulted from a broadcast campaign, how much of a media mix was used to promote a particular product, and how many calls the campaign generated. ROI isn't necessarily a good metric anymore, especially when your company broadcasts in-house. It costs us almost nothing to broadcast, except the time the business units spend putting the campaign content together and the time I spend formatting the content into HTML/AOL campaign templates and pulling data for broadcasting.

So, is it worth the effort? No. But that isn't going to keep us from testing this process with future campaigns.