Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move most of the Scrapy template's logic to Apify SDK #176

Open
honzajavorek opened this issue Jan 16, 2024 · 3 comments
Open

Move most of the Scrapy template's logic to Apify SDK #176

honzajavorek opened this issue Jan 16, 2024 · 3 comments
Labels
enhancement New feature or request. t-tooling Issues with this label are in the ownership of the tooling team.

Comments

@honzajavorek
Copy link
Contributor

honzajavorek commented Jan 16, 2024

I think some logic from __main__.py could be moved to the SDK. I think new_configure_logging could be a decorator which is just imported from Apify SDK. I think configure_logger and logger names could be imported.

Similarly, main.py contains _get_scrapy_settings, but it could also be just imported, if turned into this:

def apply_apify_settings(settings: Settings, proxy_config: dict | None = None):
    ...
    return settings

Then it wouldn't need to call get_project_settings() and would leave space for custom modifications before or after applying Apify-specific settings.

The main benefit of doing this is that the template contains less boilerplate and it's easier to control and maintain updates. If new Scrapy version is published and the logic of the monkey patching or anything else needs to be changed, I could just upgrade Apify SDK with updates. As of now the only way is to watch updates to the template manually and update my code by copy-pasting.

Scrapy is a library and I can upgrade it carefully or pin it to a certain version, but Apify is a SaaS platform. If something changes in Apify and it won't be compatible with the old template code, it can just break my actors out of nowhere. Only then I will be prompted to go and see if the template looks differently than last time. Not ideal.

Follow up to #132, vaguely related to apify/actor-templates#264

@honzajavorek
Copy link
Contributor Author

I did some changes in my implementation so that it's more tidied up:

Feel free to grab inspiration from what I did, or even chunks of code (MIT licensed, just mention my name). I guess I've solved this for myself now. If there are updates to the Scrapy template, I hope I'll be able to somehow keep up with it and backport changes to my highly customized project.

@vdusek vdusek added t-tooling Issues with this label are in the ownership of the tooling team. enhancement New feature or request. labels Jan 19, 2024
@vdusek
Copy link
Contributor

vdusek commented Jan 19, 2024

Hi Honza, thank you for opening this. Moving as much code as we can from the template to the SDK is definitely a good way to go. Unfortunately, adding new features to our Scrapy-Apify integration is not a priority for this quarter, so I cannot promise I'll have time to take a look at this in the near future.

@vdusek
Copy link
Contributor

vdusek commented Jan 23, 2024

The function apply_apify_settings was moved to SDK in #178. Let's solve the rest of it later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request. t-tooling Issues with this label are in the ownership of the tooling team.
Projects
None yet
Development

No branches or pull requests

2 participants