Error retrieving data from .com #32

Closed
adnanhassan23 opened this issue Feb 5, 2022 · 3 comments

Comments


adnanhassan23 commented Feb 5, 2022

Thank you for this great package! When I make more than 5 requests, I get this error: "Error retrieving data from .com". Google is pretty hard on scrapers. After some research, I found that this issue can be fixed with proxies. Can the proxy package https://www.npmjs.com/package/simple-proxies be used with google-index? The package https://github.com/christophebe/serp also uses simple-proxies, along with https://www.scraperapi.com.

Owner

rodion-arr commented Feb 7, 2022

Hi @adnanhassan23, the package does not make requests from Node.js directly; it uses a headless browser. So I assume that if you need a proxy, it should be configured at the network level, not in code.

Author

adnanhassan23 commented Feb 7, 2022

Hi @rodion-arr, thank you for your response. Following this comment, puppeteer/puppeteer#1948 (comment), I changed the request URL:

    const scraperApiUrl = 'http://api.scraperapi.com?api_key=3c0481387e80c18778d20145c095e9&url='
    const googleDomain = 'https://google.com/search?hl=en-US&rls=en&q='
    const requestUrl = `${scraperApiUrl}${googleDomain}site:${site}`

Is this good practice? Actually, I want to use multiple proxies. Could you please elaborate on what "configured on network level" means?
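
One detail worth checking in a URL built like the one above: the Google URL is passed as the value of the `url` query parameter, so its own `&` and `?` characters can be misread as belonging to the outer request unless they are percent-encoded. A minimal sketch of the same request URL with encoding applied, assuming Node.js with the global `URL` class; the helper name `buildRequestUrl` and the `SCRAPERAPI_KEY` environment variable are hypothetical and not part of google-index or ScraperAPI:

```javascript
// Hypothetical helper: wraps a Google "site:" query in a ScraperAPI request URL.
// Only the api_key and url parameters come from the thread; the rest is illustrative.
function buildRequestUrl(apiKey, site) {
  // Inner URL: the Google search the scraper should fetch.
  const target = new URL('https://google.com/search');
  target.searchParams.set('hl', 'en-US');
  target.searchParams.set('rls', 'en');
  target.searchParams.set('q', `site:${site}`);

  // Outer URL: searchParams.set() percent-encodes the inner URL,
  // so its "&" and "?" stay inside the url parameter.
  const proxied = new URL('http://api.scraperapi.com/');
  proxied.searchParams.set('api_key', apiKey);
  proxied.searchParams.set('url', target.toString());
  return proxied.toString();
}

// Usage: keep the key out of source code (SCRAPERAPI_KEY is an assumed name).
const requestUrl = buildRequestUrl(process.env.SCRAPERAPI_KEY || 'YOUR_KEY', 'example.com');
console.log(requestUrl);
```

Reading the key from the environment also avoids pasting a live API key into a public thread.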

Owner

rodion-arr commented Feb 8, 2022

@adnanhassan23, it's OK to pass the request URL as you've shown.
Regarding "network level": I meant setting up the proxy at the server level and leaving the code unchanged, e.g. by setting the http_proxy environment variable.
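
For the multiple-proxies question, one possible approach when you control the browser launch yourself is Chromium's `--proxy-server` switch, which Puppeteer accepts through the `args` option of `puppeteer.launch()`. The sketch below only builds the launch options in round-robin order. The proxy addresses and the `createProxyRotator` helper are hypothetical, and whether google-index lets callers pass launch options is an assumption not confirmed in this thread:

```javascript
// Hypothetical proxy endpoints; replace with real ones.
const PROXIES = ['http://proxy-a.example:8080', 'http://proxy-b.example:8080'];

// Returns a function that yields Puppeteer launch options,
// cycling through the given proxies one launch at a time.
function createProxyRotator(proxies) {
  if (proxies.length === 0) {
    throw new Error('at least one proxy is required');
  }
  let index = 0;
  return function nextLaunchOptions() {
    const proxy = proxies[index % proxies.length];
    index += 1;
    // --proxy-server is a Chromium command-line switch.
    return { headless: true, args: [`--proxy-server=${proxy}`] };
  };
}

const nextLaunchOptions = createProxyRotator(PROXIES);
console.log(nextLaunchOptions().args); // options for the first launch
console.log(nextLaunchOptions().args); // options for the second launch

// With Puppeteer installed, each launch would then look like:
//   const browser = await puppeteer.launch(nextLaunchOptions());
```

Since the proxy is fixed per browser process here, rotating means launching a new browser for each batch of requests.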
