Scraper

The system that scrapes data for the website.

Add a new scraper

Create a scraper

Create a new file at ./groups/<name>.js:

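export const name = '<name of group>';
export const url = '<page that lists brands>';
export const infoUrl = '<wikipedia page>';

export const scrapDetails = async (get$, getPage) => {
    // Scrape `description` and `picture` here (usually from infoUrl).
    // `slugify` must be imported from whichever slug helper the project uses.
    const details = {
        name,
        slug: slugify(name),
        url,
        infoUrl,
        description,
        picture,
    };
    return details;
};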

export const scrapBrands = async (get$, getPage) => {
    const brands = new Map();
    return brands;
};
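As an illustration of the template above, here is a minimal sketch of a slug helper and of filling the brands map. Both are assumptions: the project's real slugify may come from a library or from ./utils, and the map's shape (slug as key, `{ name, slug }` as value) is hypothetical, so check an existing group file before copying it.

```javascript
// Hypothetical helper: strips accents, lowercases, and joins words with hyphens.
const slugify = (text) =>
    text
        .normalize('NFD')                 // split accented letters into base + mark
        .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, '-')      // any run of other characters becomes one hyphen
        .replace(/^-+|-+$/g, '');         // trim leading and trailing hyphens

// Hypothetical shape: one entry per brand, keyed by slug so duplicates collapse.
const toBrandsMap = (brandNames) => {
    const brands = new Map();
    for (const rawName of brandNames) {
        const name = rawName.trim();
        if (!name) continue; // skip empty strings left over from scraping
        const slug = slugify(name);
        brands.set(slug, { name, slug });
    }
    return brands;
};
```

Keying the map by slug means a brand listed twice on the source page ends up as a single entry.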

Scrape details

Usually, details are scraped from the group's Wikipedia page.

A default scraper, getDetailsScraper, is available: given a group's URL, it scrapes the group's name, description and logo.

You can replace the scrapDetails function of your group with:

import { getDetailsScraper } from '../utils/index.js';

export const scrapDetails = getDetailsScraper(url, infoUrl);
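To make the shape of that helper concrete, here is a rough sketch of how a factory of this kind could be written. It is not the implementation in ../utils/index.js: the name makeDetailsScraper, the selectors, and the inline slug logic are all assumptions. Only the (get$, getPage) signature and the returned fields come from the template above.

```javascript
// Hypothetical factory: returns a scrapDetails function bound to one group's URLs.
const makeDetailsScraper = (url, infoUrl) => async (get$, getPage) => {
    const $ = await get$(infoUrl); // Cheerio-style document for the info page

    // Assumed selectors; a real Wikipedia page may need different ones.
    const name = $('h1').first().text().trim();
    const description = $('p').first().text().trim();
    const picture = $('.infobox img').first().attr('src');

    return {
        name,
        // Inline slug logic, standing in for the project's own slugify.
        slug: name.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, ''),
        url,
        infoUrl,
        description,
        picture,
    };
};
```

Returning a function from a function lets each group file bind its own URLs once and still expose the same scrapDetails signature as a hand-written scraper.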

Scrape the brands

In your scrapBrands function, you can use either Cheerio or Puppeteer, through get$ and getPage respectively:

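export const scrapBrands = async (get$, getPage) => {
    // Option 1: Cheerio, for pages whose HTML already contains the data
    const $ = await get$(url);
    // Option 2: Puppeteer, for pages that need a real browser
    const page = await getPage(url);
};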
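In practice a scraper normally needs only one of the two. The sketch below shows each option on its own. The `.brand` selector is a made-up placeholder, and it assumes that getPage(url) resolves to a Puppeteer page that has already navigated to url, which is not stated explicitly here.

```javascript
// Option 1: Cheerio. Suited to pages whose HTML already contains the brand list.
const scrapBrandNamesWithCheerio = async (get$, url) => {
    const $ = await get$(url);
    const names = [];
    $('.brand').each((index, element) => {
        names.push($(element).text().trim());
    });
    return names;
};

// Option 2: Puppeteer. Suited to pages that render the list with JavaScript.
const scrapBrandNamesWithPuppeteer = async (getPage, url) => {
    const page = await getPage(url);
    // $$eval passes all elements matching the selector to a callback run in the browser.
    return page.$$eval('.brand', (elements) =>
        elements.map((element) => element.textContent.trim())
    );
};
```

Cheerio only parses the HTML it is given, so it is the lighter choice when it works; Puppeteer drives a browser and is worth its cost mainly when the content is generated client-side.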

From there, you are free to use whatever library you need. For examples, see the existing scrapers in ./packages/scraper/groups/*.

Run the command

yarn scrap <name>

This adds the new group and its brands to the shared data in ./packages/website/public/data.json.
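Since scrapBrands returns a Map and the output is a JSON file, the brands have to be converted at some point: JSON.stringify on a Map yields `{}`. The sketch below shows one way to do that conversion. The layout (the details fields plus a brands array) and the serializeGroup name are assumptions for illustration; the actual structure of data.json is not described here.

```javascript
// Hypothetical serializer: turns one group's scraped data into a JSON string.
const serializeGroup = (details, brands) =>
    JSON.stringify(
        {
            ...details,
            // JSON.stringify ignores a Map's entries, so spread its values into an array.
            brands: [...brands.values()],
        },
        null,
        2 // indent by two spaces so the file stays readable in diffs
    );
```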

Usage

yarn start <group>

⚠️ New data replaces the previous data.
