# Blacklist and whitelist

The content that appears on the article pages of a Tenant is extracted with Boilerpipe from the source code of the original website.

To deliver a clean and fluid UX on mobile, extraneous elements from the desktop version are filtered out by default.

The whitelist and blacklist direct Boilerpipe to include or omit specific elements as needed.

Both the whitelist and blacklist are defined in a tenant's definition.json inside configuration.

"whitelist": "elements_to_include",
"blacklist": "elements_to_remove"

The elements can be HTML classes or ids, listed without the class or id mark (. or #) and separated by commas without spaces.

HTML tags can also be added in capital letters (e.g. ASIDE), and attributes are accepted in this form: [attribute=anyData]
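
To make the entry syntax concrete, here is a minimal Python sketch that splits a list string and labels each entry. It is only an illustration of the rules described above, not Boilerpipe's actual parser: the names `Entry` and `parse_entries` are hypothetical, and treating any all-uppercase entry as a tag is an assumption.

```python
import re
from typing import NamedTuple


class Entry(NamedTuple):
    kind: str   # "attribute", "tag", or "class_or_id"
    value: str


# Assumption: attributes look like [name=value], as in [itemprop=datePublished].
ATTRIBUTE = re.compile(r"\[[^=\]]+=[^\]]*\]")
# Assumption: an entry written entirely in capitals (e.g. ASIDE) is an HTML tag.
TAG = re.compile(r"[A-Z][A-Z0-9]*")


def parse_entries(raw: str) -> list[Entry]:
    """Split a comma-separated whitelist/blacklist string into labeled entries."""
    entries = []
    for item in raw.split(","):
        if not item:
            continue  # ignore empty items, e.g. from an empty string
        if ATTRIBUTE.fullmatch(item):
            entries.append(Entry("attribute", item[1:-1]))  # drop the brackets
        elif TAG.fullmatch(item):
            entries.append(Entry("tag", item))
        else:
            entries.append(Entry("class_or_id", item))
    return entries


for entry in parse_entries("[href=/author/],cover-img,ASIDE"):
    print(entry.kind, entry.value)
```

Because classes and ids are written without their marks, the sketch cannot tell them apart and reports both as `class_or_id`.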

Adding == to the beginning of a class blacklists or whitelists only the elements whose sole class is that one. Elements that carry the same class alongside others are not affected.

For example, if we had these two HTML elements:

<div class="a">A</div>
<div class="a b c">A B C</div>

The following line would remove only the first element, because the second one has classes other than "a":

"blacklist":"==a"

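The difference between a plain class entry and an == entry can be sketched in Python as follows. This is an illustrative model of the behavior described above, assuming matching is done against the element's class attribute; the function `matches` is hypothetical and is not part of Boilerpipe.

```python
def matches(entry: str, class_attribute: str) -> bool:
    """Return True if a list entry applies to an element's class attribute."""
    classes = class_attribute.split()
    if entry.startswith("=="):
        # Exact form: the element must carry this class and no other.
        return classes == [entry[2:]]
    # Plain form: the class only needs to be present among the others.
    return entry in classes


print(matches("==a", "a"))      # True: "a" is the only class
print(matches("==a", "a b c"))  # False: the element has other classes too
print(matches("a", "a b c"))    # True: a plain entry matches both elements
```
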
Blacklist has priority over whitelist: if the same selector is both blacklisted and whitelisted, the element is not shown. To reverse this behavior in the specific cases that need it, add the greedyWhitelist flag.
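
The priority rule can be summarized with a short Python sketch. It assumes greedyWhitelist behaves like a boolean switch and that selectors listed in neither place are left to the default extraction; the function `resolve` and its return values are hypothetical names used only for illustration.

```python
def resolve(selector: str, whitelist: set[str], blacklist: set[str],
            greedy_whitelist: bool = False) -> str:
    """Decide what happens to a selector under the priority rule."""
    in_whitelist = selector in whitelist
    in_blacklist = selector in blacklist
    if in_whitelist and in_blacklist:
        # Blacklist wins unless greedyWhitelist reverses the priority.
        return "include" if greedy_whitelist else "remove"
    if in_blacklist:
        return "remove"
    if in_whitelist:
        return "include"
    return "default"  # assumption: left to the default extraction


whitelist = {"cover-img", "img-credits"}
blacklist = {"cover-img", "desktop-footer"}

print(resolve("cover-img", whitelist, blacklist))                         # remove
print(resolve("cover-img", whitelist, blacklist, greedy_whitelist=True))  # include
```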

## Whitelist

whitelist identifies elements or characteristics for Boilerpipe to extract and use on a Tenant.

Examples might include author name, tags, publishing date, etc.

"whitelist": "[href=/author/],slideshow-subtitle,article-full-picture,cover-img,[itemprop=datePublished],img-credits"

## Blacklist

blacklist prevents elements from being extracted.

Defined items might include a publisher's sharing bar, which is excluded because Marfeel provides a native auto-hiding social sharing bar by default, or an image that is not needed on mobile.

"blacklist": "desktop-footer,==off-phones,article-bottom-blocks-comment-layout-cell-sidebar,OUTBRAIN,term-name-descriptionH3"

WARNING

This applies to Boilerpipe, affecting only article pages. To exclude a section or a group of articles, use blacklistedUrlPatterns.

TIP

There are other extraction flags that can be added in definition.json to cover the needs of every Tenant.