elacuesta/scrapy-playwright-cloud-example

scrapy-playwright sample project for Scrapy Cloud

Running scrapy-playwright on Zyte Scrapy Cloud.

Dockerfile

A custom Docker image is provided (see the Dockerfile). To keep the resulting image small, only the Chromium browser is installed by default.

Settings

_browsers = {
    "chromium": "/ms-playwright/chromium/chrome-linux/chrome",
    # "firefox": "/ms-playwright/firefox/firefox/firefox",
    # "webkit": "/ms-playwright/webkit/pw_run.sh",
}
PLAYWRIGHT_BROWSER_TYPE = "chromium"
PLAYWRIGHT_LAUNCH_OPTIONS = {
    "executable_path": _browsers[PLAYWRIGHT_BROWSER_TYPE],
    "timeout": 10000,
}
  • PLAYWRIGHT_LAUNCH_OPTIONS: the process inside the Docker container runs as a different user from the one that built the image, so the path to the browser executable must be set explicitly.
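
Because the executable path is hard-coded, a wrong or missing path only shows up when the browser fails to launch. The sketch below shows one way to fail earlier with a clearer message. The helper name `build_launch_options` is hypothetical and not part of this project or of scrapy-playwright. The paths are the ones from the settings above and assume the layout of the provided Docker image.

```python
import os

# Paths taken from the settings above; they assume the provided Docker image.
# Only chromium is installed by default, so the other two entries will fail
# the check below unless the image is rebuilt with those browsers.
BROWSERS = {
    "chromium": "/ms-playwright/chromium/chrome-linux/chrome",
    "firefox": "/ms-playwright/firefox/firefox/firefox",
    "webkit": "/ms-playwright/webkit/pw_run.sh",
}


def build_launch_options(browser_type, browsers=BROWSERS, timeout_ms=10000):
    """Hypothetical helper: build launch options, failing early on a bad path."""
    if browser_type not in browsers:
        raise ValueError(f"Unknown browser type: {browser_type!r}")
    path = browsers[browser_type]
    # The job runs as a different user than the image builder, so check that
    # the file exists and is executable by the current user.
    if not (os.path.isfile(path) and os.access(path, os.X_OK)):
        raise FileNotFoundError(f"Browser executable not usable: {path}")
    return {"executable_path": path, "timeout": timeout_ms}


# Possible use in settings.py, inside the container:
# PLAYWRIGHT_LAUNCH_OPTIONS = build_launch_options(PLAYWRIGHT_BROWSER_TYPE)
```

The check runs when the settings module is imported, so a misconfigured image would fail at job start rather than at the first request.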

Build and deploy

  • Make sure you have shub installed
  • Replace the project id (project: <project-id>) in the scrapinghub.yml file with your own project id
  • Run shub image upload
  • Run shub schedule headers

For more information, check out the full documentation on how to build and deploy Docker images to Scrapy Cloud.
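
The last two steps can be wrapped in a small script. This is a minimal sketch, not part of the project: the function names `deploy_commands` and `deploy` are hypothetical, and the only commands it runs are the two `shub` invocations listed above. It defaults to a dry run that just prints the commands.

```python
import shutil
import subprocess


def deploy_commands(spider="headers"):
    """Return the two shub commands from the steps above as argument lists."""
    return [
        ["shub", "image", "upload"],
        ["shub", "schedule", spider],
    ]


def deploy(spider="headers", dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    if not dry_run and shutil.which("shub") is None:
        raise RuntimeError("shub was not found on PATH; install it first")
    for cmd in deploy_commands(spider):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    deploy(dry_run=True)
```

Calling `deploy(dry_run=False)` from the project directory would run the upload and then schedule the spider, assuming scrapinghub.yml already contains your project id.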
