The wrong word can cost you
Is the latest budget plan under 'attack' or under 'siege'? Did Serena Williams 'beat' or 'vanquish' her opponent in the third round?
It may seem like the kind of pedantic semantic discussion only a copy editor could care about, but in the split-second decision to click an article, it turns out readers care, too. One word in a headline can mean the difference between a story succeeding and sinking.
There is no exact formula for a winning headline. Situations and tastes change. Audiences can be fickle, liking bombastic, overwrought copy, then switching to understated, objective headlines, seemingly at random.
When dealing with high volumes of readers, publishers maximize their chances of success by A/B testing headlines to see what resonates. They try a headline and then change it, or they run two versions at once with a tool that splits the audience between headline A and headline B.
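One common way to split an audience is to hash a reader identifier, so each reader is consistently shown the same variant. The sketch below is a minimal illustration under that assumption, not a description of any publisher's actual tooling. The function name `assign_variant`, the reader IDs, and the test name are all hypothetical.

```python
import hashlib


def assign_variant(reader_id: str, test_name: str, variants=("A", "B")) -> str:
    """Deterministically assign a reader to one headline variant."""
    # Hash the test name together with the reader ID so that a reader
    # always sees the same headline within one test, but can land in a
    # different group in another test.
    key = f"{test_name}:{reader_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Map the hash onto the list of variants (roughly an even split).
    return variants[int(digest, 16) % len(variants)]


# Hypothetical headlines, borrowed from the 'attack' vs 'siege' example.
headlines = {
    "A": "Budget Plan Under Attack",
    "B": "Budget Plan Under Siege",
}

variant = assign_variant("reader-123", "budget-headline")
print(variant, headlines[variant])
```

Hashing rather than picking at random on every page view means a returning reader does not see the headline flip between visits, which would muddy the comparison.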
There is even a tool, developed by a Stripe employee called Tom, that scrapes the New York Times to track every headline revision and how it affects the popularity of the story.
Testing trumps instinct
Even a publication with the history and experience of the New York Times struggles to get it right the first time. The paper is also open about its A/B testing process.
In its own article on the practice, the NYT explained:
'The Times also makes a practice of running A/B tests on the digital headlines that appear on its homepage: Half of the readers will see one headline, and the other half will see an alternative headline, for about half an hour.'
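Once a half-and-half test like this has run, the two headlines have to be compared. The Times does not say how it picks a winner, so the following is only a sketch of one standard approach, a two-proportion z-test on click-through rates. The function name `compare_headlines` and the click and view counts are invented for illustration.

```python
import math


def compare_headlines(clicks_a, views_a, clicks_b, views_b):
    """Compare two headlines' click-through rates with a two-proportion z-test.

    Returns (rate_a, rate_b, z, p_value), where p_value is two-sided.
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled click-through rate under the assumption of no real difference.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if std_err == 0:
        # No variation at all (e.g. zero clicks on both): nothing to compare.
        return rate_a, rate_b, 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Invented numbers: each headline shown to 10,000 readers.
rate_a, rate_b, z, p_value = compare_headlines(400, 10_000, 480, 10_000)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}  p = {p_value:.4f}")
```

A small p-value suggests the gap between the two headlines is unlikely to be chance alone. With a short test window and modest traffic, many differences will not clear that bar.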
According to data from the scraping API, around 29% of articles have their headlines A/B tested, and the most headline changes observed on a single article was eight. Although many changes are minor, such as capitalization or an added comma, the practice is designed to find the most effective combination of words to get more people to click the story.
And it works. Articles that underwent A/B testing on the headline are 80% more likely to feature on the Times's “most popular” list. The beauty of digital publishing is that it's possible to set a marker and then improve, rather than aiming for the bullseye on every headline.
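To make a figure like "80% more likely" concrete, here is a short worked example of how such a relative likelihood is typically calculated. The counts are made up purely so the arithmetic is easy to follow. They are not the scraped New York Times data, and `relative_lift` is a hypothetical helper name.

```python
def relative_lift(tested_popular, tested_total, untested_popular, untested_total):
    """How much more likely tested articles are to become popular,
    relative to untested ones (0.8 means '80% more likely')."""
    rate_tested = tested_popular / tested_total
    rate_untested = untested_popular / untested_total
    return rate_tested / rate_untested - 1


# Made-up counts: 9% of tested articles reach the list vs 5% of untested ones.
lift = relative_lift(
    tested_popular=90, tested_total=1000,
    untested_popular=50, untested_total=1000,
)
print(f"{lift:.0%} more likely")
```

The figure is a ratio of two rates, so it says nothing on its own about how many articles reach the list in absolute terms.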
This story is a great example of how removing a two-word phrase entirely changed the profile and popularity of an article. The headline that opens with 'Jumping Jehoshaphat' was given only a tiny share of airtime, which suggests the phrase had a major impact on click-throughs.
These may seem like small, inconsequential changes made to satisfy a particularly picky editor, but A/B testing headlines can help direct major streams of traffic.
Emotionally-charged or to the point?
Online content is associated with click-bait: the more shocking and emotionally charged, the better. It would seem logical to use A/B testing to drive up the emotional impact of headlines. But how the New York Times actually uses A/B testing reveals a more nuanced picture. Let’s look at an example from a recent story on Trump’s planned leadership of the G.O.P.:
By far the most popular headline contains the phrases 'hit list' and 'warning shot': powerful, violent imagery. Compared with 'claim leadership', this paints a far more exciting picture and captures more readers' attention.
But it also completely cuts the main point that the other headlines lead with: the leadership of the G.O.P. This is a clear example of how the core message of a story can be toned down to make way for more emotional punch.
Stories that risk being seen as 'dry' or dense can reach a wider audience by finding the most appealing way in. Yet this technique can't be applied to every story.
In the analysis of the Grammys reports, the most popular headlines are the simplest ones, the leads free from story and imagery.
We can see how, as time progressed, the NYT withdrew color from both Grammy headlines. The editor pulled out the imagery of 'pandemic' and 'protests' in favor of simple, clear headlines that proved more popular.
For readers searching for quick, objective facts, such as who won the Grammys, less is more. Readers don’t need emotionally-charged editorializing at this point. They want clean information.
A dangerous practice?
If the New York Times sees such a drastic improvement in popularity when they refine their headlines, it would make sense to A/B test every headline, sub-title, and blockquote, right?
Not quite. It’s clear from how sparingly the New York Times uses A/B testing that publications can overuse the practice. When a publication prints a word, in ink or online, it has to stand behind it. Having lines that seem to shift and adapt throughout the day is not ideal for an unimpeachable source of truth.
It’s worth remembering that 62% of the NYT’s revenue comes from subscriptions, while only 27% comes from advertising that relies on high volumes of readers. Adapting everything to trends in what readers find compelling would distort the editorial mission that has kept the paper in print for over a hundred years. Being overly populist erodes trust and confidence in the brand, which in turn harms the conversion rate for longer-term products like paid subscriptions.
The scraped New York Times headline data shows that A/B testing can be a very effective tool for making content more appealing and directing readers to certain stories in the short term.
The fact that the Times limits its use suggests that the practice can have negative long-term effects, and that brands should maintain their own editorial mission rather than chase popularity at all costs.