25 of the New York Times' top 100 news stories in 2019 were incorrectly blocked as brand unsafe

The failures of blunt brand safety come as Coronavirus creates more brand safety headaches in 2020

The 100 most-read online news stories of 2019 in The New York Times provide a snapshot of life at the turn of the decade. These stories generated huge traffic to the Times, burnishing the paper of record’s reputation for quality, attracting new readers, and cementing the reputations of the news site’s top journalists.

However, in a bizarre twist, the vast majority of these online stories generated no programmatic advertising dollars. This is because of the crude business of keyword blacklists, created in the name of “brand safety.” These blacklists are lists of words deemed too dangerous for advertisers to appear beside. If an online news story contains one of these words, ad verification tools steer advertisers clear of it. One recent study by the IAB found that keyword blacklists, running to 3,000 words, are used by 94% of marketers. CHEQ and the University of Baltimore found that the unnecessary blocking of safe premium news inventory is costing online publishers $3.2bn a year in lost revenue.

More often than not, blacklists, by their own blunt logic, block entirely safe content. This overblocking is hurting publishers’ revenues and killing brands’ reach. CHEQ found that 25 of The New York Times’ 100 most-read online news stories of 2019 were incorrectly flagged by blunt ad verification keywords. Viewing the problem through a case study of The New York Times shows the bizarre brand safety blocking that occurs across the wider online news ecosystem.
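The mechanics of a keyword blacklist can be sketched in a few lines. This is a minimal illustration, not any vendor's actual implementation: the five-word `BLACKLIST` and the function names are assumptions made for the example, and the recipe headline is invented. The other two test sentences are quotes from blocked stories cited in this article.

```python
import re

# Hypothetical five-word sample; real blacklists reportedly run to thousands of terms.
BLACKLIST = {"death", "violent", "alcohol", "sex", "trump"}


def blocked_keywords(text, blacklist=BLACKLIST):
    """Return the blacklisted words that appear in the text as whole words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(words & set(blacklist))


def is_blocked(text, blacklist=BLACKLIST):
    """A story is blocked as soon as a single blacklisted word appears."""
    return bool(blocked_keywords(text, blacklist))


# Quotes taken from stories discussed in this article.
moving = ("After death and divorce, moving is supposed to be "
          "the most stressful thing you can go through")
black_hole = ("the unleashing a violent jet of energy "
              "some 5,000 light-years into space")
# Invented headline, used only as a control case.
recipe = "A quick weeknight pasta with lemon and herbs"

for story in (moving, black_hole, recipe):
    print(is_blocked(story), blocked_keywords(story))
```

The check never looks at what the story is about. A single matching word is enough to pull the ads, which is why a piece on moving home and a piece on astrophysics are treated the same way as genuinely unsafe content.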

The stories correctly and unnecessarily blocked

Based on industry-standard blacklists, “Trump” was the number one keyword denying advertising dollars for The New York Times’ top stories. This wiped out advertising on 11 stories. Sebastian Tomich, global head of advertising at The New York Times, told the Digiday podcast: “advertisers never come in and say, ‘I want to be next to Trump.’” Even accounting for genuinely brand unsafe stories among the top 100 (including those about Trump, the Epstein suicide, and multiple tragic shootings), a further 25 entirely safe stories among this premium assortment were denied advertising.

Blocked: moving home, Netflix and Game of Thrones

The 25 top stories demonetized included a story about moving home, blocked for the word “death” (“After death and divorce, moving is supposed to be the most stressful thing you can go through”). No fewer than four entertainment stories in the vaunted top 100 were blocked. The blacklist offenders were: “The 50 Best Movies on Netflix Right now” (blocked for “sexuality”: 1967’s Bonnie and Clyde “mixes sexuality, danger, restlessness and ennui”); “10 Recent Netflix Originals worth your time” (blocked for “sex”); a review of Game of Thrones, season 8 episode 3, by Jeremy Egner (blocked for “death”); and a review of The Joker (blocked for “violence”).

Also denied dollars: Black holes, mobile phones, LGBTQ and history of slavery

The iconic first image of a black hole, analyzed by cosmic affairs correspondent Dennis Overbye, was blocked for “violent” (“the unleashing a violent jet of energy some 5,000 light-years into space”). “Slavery” was flagged as brand unsafe in two stories: one referred to the Duchess of Sussex, Meghan Markle, as the “descendant of plantation slaves”, while a travel piece described Williamsburg as the arrival point for the “first African slaves to North America”. The latter article was part of a landmark series by The New York Times: The 1619 Project, an ongoing project developed by The New York Times Magazine in 2019 with the goal of re-examining the legacy of slavery in the United States. When blacklists alone are used, the project is entirely blocked from receiving ad dollars. In another top 100 story, the keyword “alcohol” demonetized a great read by Kevin Roose about detoxing from his mobile phone (“Unlike alcohol or opioids, phones aren’t an addictive substance so much as a species-level environmental shock”). In a separate category of wrong, the keyword “HIV” (present on every blacklist we have seen) blocked two of the top 100 stories. One mentioned a potential new treatment for HIV, while in a profile piece, Jonathan Van Ness, star of Queer Eye, opened up about his experiences of living with the condition. Brands were forced to stay away. CHEQ has found in a separate report that 73% of LGBTQ content is flagged by first-generation brand safety as unsafe for containing words such as “lesbian”, “same sex” and “sexual.”

This, of course, is not unique to The New York Times. The blunt brand safety crapshoot affects every top news site. However, the fact that it can affect even The New York Times (described by its own media columnist, Ben Smith, as “a digital behemoth crowding out the competition”) shows that no online news site is immune from this brand safety failure. The New York Times, like other news sites, relies on such advertising: it recently disclosed that digital advertising revenues ($103 million) surpassed print ($88 million), as it achieved $709 million in overall digital revenues.

Coronavirus: Using AI over keywords

Of course, coronavirus is proving a new brand safety headache in 2020, with “coronavirus” competing with “Trump” as the most-used keyword and demonetizing thousands of stories on The New York Times and every other online news site. Mike Zaneis, co-founder of the Brand Safety Institute, told The New York Times: “Now, the scale creates a huge challenge: There might be 10,000 stories a day about coronavirus.”

In contrast, CHEQ does not use keyword blacklists. Instead, ML experts have trained the CHEQ AI on millions of sources to understand in context what a news story is, and is not, about. In terms of coronavirus, the CHEQ AI has been trained to understand mentions of the disease contextually, in a way that is neither over-inclusive nor under-inclusive (as is the case with blunt keywords). It has been an interesting challenge: engineers have trained the AI to define the disease as a unique category, but also to understand sub-terms about the disease in context (“deaths”, “vaccines”, “advice”). So, for instance, some clients are telling us they do not want to appear against stories about death tolls from the virus, but are happy to appear next to more neutral stories, such as advice, public safety announcements, or precautions about the virus. Unlike keyword blacklists, our AI will not block a story for mentioning Corona beer, computer “viruses” or “viral” TikTok posts (unless a brand actually wants this).
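To see why keywords trip over the Corona beer, computer virus and viral TikTok cases, consider the sketch below. It shows only the keyword failure mode and says nothing about how the CHEQ AI works internally. The term list, the headlines and the function names are all hypothetical, chosen for illustration.

```python
import re

# Hypothetical, deliberately broad coronavirus-related terms.
TERMS = ["corona", "virus", "viral"]


def substring_hits(text, terms=TERMS):
    """Terms found anywhere in the text, including inside longer words."""
    lowered = text.lower()
    return sorted(t for t in terms if t in lowered)


def whole_word_hits(text, terms=TERMS):
    """Terms that match a complete word in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(t for t in terms if t in words)


# Invented headlines, none of which is about the disease.
headlines = [
    "Corona beer sales rise over the summer",
    "How to remove computer viruses from your laptop",
    "The viral TikTok dance everyone is trying",
]

for headline in headlines:
    print(headline, substring_hits(headline), whole_word_hits(headline))
```

Under substring matching, all three headlines are flagged. Tightening the rule to whole words spares only the computer viruses story, and only because of the plural. Neither variant has any notion of what the story is about, which is the gap a contextual approach is meant to close.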

Hopefully, by the time The New York Times publishes its top stories for 2020 and the coronavirus is neutralized, the industry will have arrived at a better brand safety solution. However, the case of The New York Times shows that the discredited use of blacklists is proving to be a very unscientific means of brand safety containment.