# Write an OnExtraction Middleware

This article explains the steps to implement a Middleware with the OnExtraction hook, using the Poll Daddy Widget Provider as an example.

Poll Daddy, now Crowdsignal, is a popular platform for creating polls and quizzes that you can embed in your website.

When a website has articles that contain polls, like this one, we need to integrate them into the Marfeel version.

Example of Poll Daddy

# Poll Daddy Widget Provider

At Marfeel, there is already a Widget Provider for Poll Daddy available. If we take a look at its schema, we'll see that it has two properties: pollId, which is required, and mainColor.

While mainColor is a static value and can be configured through widgets.json, pollId needs to be dynamic as it will change depending on the page it's placed on. Implementing a Middleware will allow us to dynamically retrieve the pollId value.
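To make the split between static and dynamic values concrete, here is a minimal TypeScript sketch. The `PollDaddyPropsSketch` interface and the `buildProps` helper are hypothetical names used only for illustration; the real types come from the Widget Provider package, and the way Marfeel combines both sources internally is an assumption here.

```typescript
// Hypothetical shape mirroring the schema described above:
// pollId is required, mainColor is optional.
interface PollDaddyPropsSketch {
  pollId: string;
  mainColor?: string;
}

// Static values, as they could be configured in widgets.json.
const staticParameters: Partial<PollDaddyPropsSketch> = {
  mainColor: '#F56565'
};

// Assumption: values extracted per page take precedence over static ones.
function buildProps(
  staticValues: Partial<PollDaddyPropsSketch>,
  dynamicValues: Pick<PollDaddyPropsSketch, 'pollId'>
): PollDaddyPropsSketch {
  return { ...staticValues, ...dynamicValues };
}

const props = buildProps(staticParameters, { pollId: '10582522' });
console.log(props);
```

The point of the sketch is only that mainColor can stay the same for every page, while pollId has to be supplied per page, which is the job of the Middleware.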

# Find the pollId

The first step is to find where the pollId is in the tenant's page. Although this may differ from case to case, in this example you can find it within the URL of the src attribute of a <script> tag:

<script
  type="text/javascript"
  charset="utf-8"
  src="https://secure.polldaddy.com/p/10582522.js"
  >
</script>

Now that you know where to find the pollId value, we'll create a Middleware that extracts it.

# Pick a Middleware type

First, we'll need to decide which type of Middleware is required: OnExtraction or OnBrowser.

Since we need to retrieve data from the HTML of the page, OnExtraction is the right type.

# Implementation guide

The easiest way to work with Middleware is by using a test-driven approach.

# Create the test

Start by downloading a copy of the page's HTML which we can use as an input for our test.

TIP

Use the npm run create:fixtures command to automatically generate the fixtures:

npm run create:fixtures https://example.com/article-with-cute-kittens/

This command will download the HTML of the desired page into the fixtures folder, within the src folder of the tenant.

You'll also need to install the Middleware test package:

npm i --save-dev @marfeel/middlewares-test-utils

As with any test-driven development, start by creating the test file in src/middlewares/widgets/poll-daddy/ and name it on-extraction.test.ts.

TIP

If we used the OnBrowser hook, the file would be named on-browser.test.ts.

In it, load the fixture we just extracted into a document that our Middleware can work on.

The loadFixture method does this for you. Import it from the middlewares-test-utils package and call it passing the filename of your fixture file as a parameter:

import { loadFixture } from '@marfeel/middlewares-test-utils';

const document = await loadFixture('how-to-improve-ad-viewability.html');

The loadFixture method loads your fixture into a document.

Next, you need to set up the Middleware execution. The runMiddleware method, also exported by the middlewares-test-utils package, allows you to run a Middleware on a given document.

runMiddleware requires two arguments:

  • Document: The document the Middleware will be executed on, in this case, the output of the loadFixture method.
  • onExtractionMiddleware: The Middleware to execute.

Middleware

At this point you haven't created the Middleware yet. Create an empty file next to on-extraction.test.ts and name it on-extraction.ts. Then, import it in your test file.

import { onExtraction } from './on-extraction';

const result = await runMiddleware(document, onExtraction);

To finish the test, add a describe block that validates the result of the Middleware execution by comparing it to the expected result.

The whole test will look something like this:

import { loadFixture, runMiddleware } from '@marfeel/middlewares-test-utils';

import { onExtraction } from './on-extraction';

describe('Poll Daddy', () => {
  describe('OnExtraction', () => {
    test('returns the pollId', async() => {
      const document = await loadFixture('how-to-improve-ad-viewability.html');

      const result = await runMiddleware(document, onExtraction);

      expect(result).toEqual({
        pollId: '10582522'
      });
    });
  });
});

If you try to run the test now (using npm test), it will fail because we haven't implemented the Middleware hook yet. Let's do that next.

# Set up OnExtraction Middleware

Access the on-extraction.ts file you created earlier and add the following skeleton:

import { onExtractionFunction } from '@marfeel/middlewares-types';
import { PollDaddyProps } from '@marfeel/widget-providers-poll-daddy';

export const onExtraction: onExtractionFunction<PollDaddyProps> = async ({ document }): Promise<PollDaddyProps | undefined> => {
  return {
    pollId: ''
  }
};

The onExtractionFunction import provides the type for the onExtraction function, including its expected return value.

The PollDaddyProps import is required to connect the Middleware to your Widget Provider.

TIP

For these imports to work you need to install them as dependencies:

  • Poll-daddy Widget Provider:
npm i @marfeel/widget-providers-poll-daddy
  • Middleware Types package, as a development dependency.
npm i --save-dev @marfeel/middlewares-types

At this point, you can run the test. It will still fail though, as the Middleware is returning an empty value.

# Implement OnExtraction Middleware

Now you need to configure the OnExtraction Middleware to retrieve the target data.

First, you need to find the correct script tag. As the Document object of the page is an argument of the Middleware, you can use it to query its elements and filter out the one containing the ID.

const url = Array.from(document.querySelectorAll('script'))
  .map(element => element.getAttribute('src'))
  .filter((src): src is string => Boolean(src))
  .find(src => src.toLowerCase().includes('https://secure.polldaddy.com/p/'));

This example iterates over all the script elements in the document, looking for one whose src attribute contains https://secure.polldaddy.com/p/.

Now that you have located the https://secure.polldaddy.com/p/<pollID>.js URL, you have to extract the ID from it. You can achieve this using string manipulation functions.

const filename = url.split('/').pop() ?? '';
const pollId = filename.replace('.js', '');

return {
  pollId
};

Regex

You could use a regular expression to parse the string, but avoid it when a simple string method does the job: regular expressions are typically harder to read and maintain, and can be slower than string manipulation methods.
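For comparison, the following sketch shows both approaches side by side. The function names are hypothetical, and the regular expression assumes the poll ID is numeric, as it is in the example URL.

```typescript
const src = 'https://secure.polldaddy.com/p/10582522.js';

// String manipulation, as used in this guide.
function pollIdFromString(url: string): string {
  const filename = url.split('/').pop() ?? '';
  return filename.replace('.js', '');
}

// Regex alternative: captures the digits between "/p/" and ".js".
// Assumption: poll IDs are always numeric.
function pollIdFromRegex(url: string): string | undefined {
  const match = /\/p\/(\d+)\.js/.exec(url);
  return match ? match[1] : undefined;
}

console.log(pollIdFromString(src)); // 10582522
console.log(pollIdFromRegex(src)); // 10582522
```

Both return the same ID for this URL. The string version is easier to follow at a glance, which is the main reason to prefer it here.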

With all the pieces in place, your Middleware should look like this:

import { onExtractionFunction } from '@marfeel/middlewares-types';
import { PollDaddyProps } from '@marfeel/widget-providers-poll-daddy';

export const onExtraction: onExtractionFunction<PollDaddyProps> = async ({ document }): Promise<PollDaddyProps | undefined> => {
  // Collect the src of every script and keep the first one served by Poll Daddy.
  const url = Array.from(document.querySelectorAll('script'))
    .map(element => element.getAttribute('src'))
    .filter((src): src is string => Boolean(src))
    .find(src => src.toLowerCase().includes('https://secure.polldaddy.com/p/'));

  // No Poll Daddy script on this page: nothing to extract.
  if (!url) {
    return;
  }

  // The last path segment is "<pollId>.js".
  const filename = url.split('/').pop() ?? '';
  const pollId = filename.replace('.js', '');

  return {
    pollId
  };
};

Now you can run npm test and the test will pass.
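The fixture-based test covers a page that contains a poll. To also check the case where no Poll Daddy script is present, one option is to exercise the same lookup against a small fake document. The sketch below is self-contained and does not use the Marfeel packages: `ScriptLike`, `DocumentLike`, `findPollId`, and `fakeDocument` are hypothetical stand-ins written for this illustration.

```typescript
// Minimal structural stand-ins for the DOM types the Middleware touches.
interface ScriptLike {
  getAttribute(name: string): string | null;
}

interface DocumentLike {
  querySelectorAll(selector: string): ArrayLike<ScriptLike>;
}

// Same lookup as the Middleware, written against the stand-in types.
function findPollId(document: DocumentLike): string | undefined {
  const url = Array.from(document.querySelectorAll('script'))
    .map(element => element.getAttribute('src'))
    .filter((src): src is string => Boolean(src))
    .find(src => src.toLowerCase().includes('https://secure.polldaddy.com/p/'));

  if (!url) {
    return undefined;
  }

  const filename = url.split('/').pop() ?? '';

  return filename.replace('.js', '');
}

// Hypothetical helper: builds a fake document from a list of src values.
function fakeDocument(sources: Array<string | null>): DocumentLike {
  return {
    querySelectorAll: () => sources.map(src => ({ getAttribute: () => src }))
  };
}

// A script without src, followed by the Poll Daddy script.
console.log(findPollId(fakeDocument([null, 'https://secure.polldaddy.com/p/10582522.js'])));
// A page with no Poll Daddy script at all.
console.log(findPollId(fakeDocument(['https://example.com/app.js'])));
```

The first call prints the poll ID and the second prints undefined, matching the early return in the Middleware.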

We should also test that our new Middleware works correctly in production.

To do so, compile the code using npm run build (don't forget the react option if needed!) and use the middleware command to test it against the production microservice.

npm run middleware poll-daddy https://example.com/article-containing-poll-daddy/

This gives you the following output, which means the Middleware works correctly when extracting the data directly from the website.

{
  "result": {
    "pollId": "10582522"
  },
  "error": null
}
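If you want to check this output from a script rather than by eye, a small helper can do it. The sketch below assumes the output always has the `result` and `error` keys shown above; the `MiddlewareOutput` type and the `readPollId` function are hypothetical names, not part of any Marfeel package.

```typescript
// Assumed shape of the command output, inferred from the example above.
interface MiddlewareOutput<T> {
  result: T | null;
  error: unknown;
}

// Hypothetical helper: returns the pollId only when the run succeeded.
function readPollId(rawOutput: string): string | undefined {
  const output = JSON.parse(rawOutput) as MiddlewareOutput<{ pollId?: string }>;

  if (output.error || !output.result) {
    return undefined;
  }

  return output.result.pollId;
}

const raw = '{"result":{"pollId":"10582522"},"error":null}';
console.log(readPollId(raw)); // 10582522
```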

# Connect the pieces

Now that you have a working Middleware, let's hook it all up.

  1. Activate the Middleware: Add the invokeMiddleware flag with the value true to features.json.
{
  "features": {
    "invokeMiddleware": true
  }
}

  2. Configure the Widget Provider: Add the poll-daddy provider to widgets.json.
{
  "widgets": [
    {
      "type": "provider",
      "id": "poll-daddy",
      "name": "poll-daddy",
      "selector": ".mrf-polldaddy",
      "middleware": "poll-daddy",
      "parameters": {
        "pollId": "",
        "mainColor": "#F56565"
      }
    }
  ]
}

TIP

If there's no widgets.json file in your tenant, create one in the root folder of the tenant.

The value of the middleware key is the name of the folder the Middleware is in.

Example of Poll Daddy in Marfeel

Done! Your Middleware is working and fully integrated with the poll-daddy Widget Provider. It's time to send the PR.

Once the code is merged to production, you'll be able to visit the Marfeel version of the site and see the Provider in action.

TIP

As a safety check, it's recommended to test your Middleware on several articles. The tenant might use different configurations, and your Middleware should cover them all.
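One way to keep track of such variations is a table of src values with their expected results. In the sketch below, only the first URL comes from this guide; the other rows are hypothetical examples, and `pollIdFromSrc` is a stand-alone copy of the matching logic written for this illustration.

```typescript
// Hypothetical src variants; only the first one comes from this guide.
const variants: Array<{ src: string; expected: string | undefined }> = [
  { src: 'https://secure.polldaddy.com/p/10582522.js', expected: '10582522' },
  { src: 'HTTPS://SECURE.POLLDADDY.COM/p/10582522.js', expected: '10582522' },
  { src: 'https://example.com/static/app.js', expected: undefined }
];

// Same matching and parsing rules as the Middleware, applied to a single src.
function pollIdFromSrc(src: string): string | undefined {
  if (!src.toLowerCase().includes('https://secure.polldaddy.com/p/')) {
    return undefined;
  }

  const filename = src.split('/').pop() ?? '';

  return filename.replace('.js', '');
}

// Collect every row whose actual result differs from the expected one.
const failures = variants.filter(({ src, expected }) => pollIdFromSrc(src) !== expected);

console.log(failures.length === 0 ? 'all variants pass' : failures);
```

Adding a row for each new pattern you find in the tenant's articles, for example a different protocol or a query string appended to the src, is a quick way to see whether the Middleware needs adjusting.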