# Provider Extractor

In addition to its implementation, every provider must have a corresponding extractor class, which Marfeel Alfred invokes to detect that provider on a tenant page.

Marfeel Alfred facilitates scaffolding at tenant creation time, by looking for any provider supported by Marfeel.

# How it works

Marfeel Alfred runs Puppeteer against the tenant's site (either the responsive or the desktop version) and executes every extractor implementation in order to obtain the information needed to create the required JSON resources (e.g. inventory.json, analytics.json, comments.json).

Usually, an Analytics or Adserver provider triggers network requests that carry all of its configuration. This means the extractor's onRequest method can be used to get all the information for that particular provider. A second argument, which references the current Puppeteer Page, is also passed. It can be used to retrieve any information from the DOM through Puppeteer's API.
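As a rough illustration of reading configuration from a network request, the sketch below parses the query string of a matching URL. The request pattern, the `configFromRequest` helper, and the `ProviderConfig` shape are all hypothetical and are not part of Alfred; the `vars` field only mirrors the shape used in the unit test example on this page.

```typescript
// Hypothetical shape of a detected configuration; the real type comes from the project.
interface ProviderConfig {
    vars: Record<string, string>;
}

// Hypothetical pattern for the provider's tracking request (assumption for illustration).
const REQUEST_PATTERN = /^https:\/\/analytics\.example\.com\/collect\?/;

// Returns one configuration built from the query string, or an empty array
// when the URL does not belong to this provider.
function configFromRequest(url: string): ProviderConfig[] {
    if (!REQUEST_PATTERN.test(url)) {
        return [];
    }

    const vars: Record<string, string> = {};

    // Copy every query parameter into the configuration variables
    new URL(url).searchParams.forEach((value, key) => {
        vars[key] = value;
    });

    return [{ vars }];
}
```

An onRequest method could wrap the result of such a helper in `Promise.resolve(...)` to match the expected return type.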

# Implementation

The Extractor is a TypeScript class with a specific method to detect the provider.

There are two main strategies of detection:

  • onRequest: the detection occurs while the page issues its HTTP requests. Each request made during the tenant's page render is matched against the extractor-specific regular expression. In this case, the extractor class must contain the onRequest method.
  • onLoad: the detection occurs after the page has fully loaded. The extractor parses the DOM, so the class must contain the onLoad method.

Both methods, onRequest and onLoad, return a Promise that resolves with an array of provider configurations.
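The onLoad strategy could look roughly like the sketch below. This page does not show the onLoad signature, so the sketch assumes it receives the Puppeteer page. `EvalPage` and `ElementLike` are minimal stand-ins for the parts of Puppeteer's `Page.$$eval` used here, and the `data-my-provider-id` attribute, the class name, and the `ProviderConfig` shape are hypothetical.

```typescript
// Stand-in for a DOM element, limited to what the sketch needs (assumption).
interface ElementLike {
    getAttribute(name: string): string | null;
}

// Stand-in for the part of Puppeteer's Page used below (assumption, keeps the sketch self-contained).
interface EvalPage {
    $$eval<T>(selector: string, pageFunction: (elements: ElementLike[]) => T): Promise<T>;
}

// Hypothetical shape of a detected configuration.
interface ProviderConfig {
    vars: Record<string, string>;
}

class DomProviderExtractor {
    // Assumed signature: invoked once, after the page has fully loaded
    static async onLoad(page: EvalPage): Promise<ProviderConfig[]> {
        // Collect the hypothetical attribute from every matching element
        const ids = await page.$$eval('[data-my-provider-id]', (elements) =>
            elements.map((element) => element.getAttribute('data-my-provider-id')));

        // Keep the first non-empty value, if any
        const siteId = ids.find((id) => id !== null && id !== '');

        return siteId ? [{ vars: { siteId } }] : [];
    }
}
```

`$$eval` is used rather than `$eval` because it passes an empty array to the callback when nothing matches, so the "not detected" case needs no error handling.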

When to use the page argument

The page argument is used when we need to extract information from the page itself. Puppeteer's evaluation methods (such as page.evaluate or page.$eval) can be used to get any information needed.

For example:

import { Page } from 'puppeteer';

export default class MyProviderExtractor {
    // Invoked on every HTTP request made by the page
    static onRequest(url: string, page: Page): Promise<ProviderConfigs[]> {
        return this.matches(url) ?
            Promise.resolve([this.extract(url)]) :
            Promise.resolve([]); // Otherwise, the provider is not detected by this HTTP request
    }

    // Checks whether the request URL matches the provider-specific pattern
    private static matches(url: string): boolean {
        return SOME_URL_PATTERN.test(url);
    }

    // Extracts and returns the provider data
    private static extract(url: string): ProviderConfigs {
        // some logic
    }
}
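The page argument is accepted but not used in the example above. The sketch below shows one way it might complement the data taken from the request URL, here by reading the document title through Puppeteer's `page.title()`. `TitlePage` is a minimal stand-in for that single method, and the request pattern, class name, and `ProviderConfig` shape are hypothetical.

```typescript
// Stand-in for the one Puppeteer Page method used below (assumption, keeps the sketch self-contained).
interface TitlePage {
    title(): Promise<string>;
}

// Hypothetical shape of a detected configuration.
interface ProviderConfig {
    vars: Record<string, string>;
}

// Hypothetical pattern for the provider's embed script (assumption for illustration).
const WIDGET_PATTERN = /^https:\/\/comments\.example\.com\/embed\.js/;

class CommentsExtractor {
    // Combines data from the request URL with information read from the page
    static async onRequest(url: string, page: TitlePage): Promise<ProviderConfig[]> {
        if (!WIDGET_PATTERN.test(url)) {
            return []; // Not a request from this provider
        }

        // The title may still be empty if the request fires early in the page load
        const pageTitle = await page.title();

        return [{ vars: { scriptUrl: url, pageTitle } }];
    }
}
```

Because the page may not have finished loading when a request fires, values read from the DOM inside onRequest are best treated as optional.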

Please check existing providers for more examples of how to implement an extractor, and contact Berg (analytics) or Alot (ad servers) if you have any questions.

# Testing

All extractors must reach a certain unit test coverage percentage to guarantee a seamless integration into the Alfred project. The unit tests are written with Jest instead of Jasmine, and every dependency is mocked. To run the tests, use:

npm run test

For example:

import Extractor from './index';

describe('Extractor', () => {
    test('extracts CONFIG object', async () => {
        const analytics = await Extractor.onRequest('SOME_URL_HERE');

        expect(analytics.length).toEqual(1);
        expect(analytics[0].vars.SOME_CONFIG).toEqual('SOME_CONFIG');
    });
});

# Deployment

Once the provider extractor class is implemented, it must be compiled and deployed to Amazon S3, from where Alfred fetches the compiled code.

Use the provider-cli to deploy the compiled extractor:

provider-cli deploy -k ${AWS key} -s ${AWS secret} -f dist/extractor.js

To use the provider-cli command, add it as a dev dependency:

"devDependencies": {
    ...
    "@marfeel/provider-cli": "..latest provider-cli version..",
    ...
}

TIP

This step should be invisible as all the providers should be doing this during the shuttling phase.

# Migration

When migrating an extractor, we need to check whether the project is a TypeScript project. Since extractors are written in TypeScript, a project that is not must add a tsconfig.json inside the extractor directory.

# Adservers

For Adservers the tsconfig.json would be:

{
    "extends": "../node_modules/@marfeel/adserver-providers-cli/config/extractor.tsconfig.json"
}

We also need to add the following to the eslintConfig section of package.json:

"overrides": [
	{
		"files": [
			"**/*.ts"
		],
		"parserOptions": {
			"project": "./extractor/tsconfig.json"
		},
		"extends": [
			"@marfeel/eslint-config-js",
			"@marfeel/eslint-config-jest",
			"@marfeel/eslint-config-ts"
		]
	}
]

# More Information

For more examples of Extractors you can check: