Bug: reddit engine #3444

Open
edisonzf2020 opened this issue May 1, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@edisonzf2020

Version of SearXNG, commit number if you are using on master branch and stipulate if you forked SearXNG
Repository: https://github.com/searxng/searxng
Branch: master
Version: 2024.4.29+e45a7cc06

How did you install SearXNG?

What happened?

How To Reproduce

Expected behavior

Screenshots & Logs

Additional context

Technical report

@edisonzf2020 edisonzf2020 added the bug Something isn't working label May 1, 2024
@glanham-jr
Contributor

Hi @edisonzf2020! Thanks for the bug report. With this, are you able to provide any details on what error you are seeing, or some screenshots?

@andypiper

I am seeing this from the reddit engine (and I suspect but cannot confirm that this is what folks have also been raising in this issue and in #3112)

Error: searx.exceptions.SearxEngineAccessDeniedException 
Parameters: ('HTTP error 403',) 
File name: searx/search/processors/online.py:116 
Error Function: _send_http_request 
Code: response = req(params['url'], **request_args)

@JeffAlyanak
Contributor

@andypiper Have you tested this using a different IP or outbound proxy to confirm whether this is just your IP being blocked by Reddit?

@JeffAlyanak
Contributor

@glanham-jr

Just tested this, and it appears that Reddit returns a 403 for requests missing a desktop browser User-Agent.

I tested by adding Mozilla/5.0 (X11; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0 and changing to another IP and that seems to have worked.

@andypiper Do you have the ability to test from another IP? If so, try adding the following to line 33 of the reddit engine:

    params['headers']['User-Agent'] = "Mozilla/5.0 (X11; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0"

Should look something like this:

def request(query, params):
    query = urlencode({'q': query, 'limit': page_size})
    params['url'] = search_url.format(query=query)
    params['headers']['User-Agent'] = "Mozilla/5.0 (X11; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0"
    return params
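
If you want to sanity-check the change outside a full SearXNG install, a self-contained sketch of the patched function might look like the one below. The values of `search_url` and `page_size` are assumptions for illustration, not copied from the engine, so compare them with the engine file in your checkout.

```python
from urllib.parse import urlencode

# Assumed values for illustration only; the real ones are defined in the reddit engine.
search_url = "https://www.reddit.com/search.json?{query}"
page_size = 25

# The desktop browser User-Agent that was tested above.
DESKTOP_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0"


def request(query, params):
    # Build the query string and the final search URL.
    query = urlencode({'q': query, 'limit': page_size})
    params['url'] = search_url.format(query=query)
    # Override whatever User-Agent was set before with the desktop one.
    params['headers']['User-Agent'] = DESKTOP_UA
    return params


params = request("searxng", {"headers": {}})
print(params["url"])
print(params["headers"]["User-Agent"])
```

This only shows that the header ends up in `params`; whether Reddit accepts the request still depends on the outbound IP.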

@glanham-jr
Contributor

I have a branch on which you can test the changes. If they look good, the PR should be ready to go.

#3556

@glanham-jr
Contributor

@JeffAlyanak Based on @return42's message in my PR, a User-Agent should already be set. Does the default User-Agent not work for you? Specifically, could you provide the headers sent with the failing request? Do we think it's really an IP issue?

Also note that Reddit blocks VPNs to the best of its ability, so if SearXNG is running behind a VPN, that alone could cause these errors.
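
To help separate the User-Agent question from the IP question, a rough standalone probe along these lines could be run from the affected host and again from another IP. This is only a sketch: `build_request` and `probe` are hypothetical helper names, the URL in the usage comment is an assumed Reddit search endpoint, and it uses plain `urllib` rather than SearXNG's own network layer, so the headers will not match exactly what SearXNG sends.

```python
import urllib.error
import urllib.request


def build_request(url, user_agent=None):
    """Build a GET request, optionally setting an explicit User-Agent header."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    return urllib.request.Request(url, headers=headers)


def probe(url, user_agent=None, timeout=10):
    """Return the HTTP status code for url, including error statuses such as 403."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


# Example usage (performs real network requests, so it is left commented out):
# url = "https://www.reddit.com/search.json?q=searxng&limit=25"  # assumed endpoint
# print(probe(url))  # urllib's default Python-urllib User-Agent
# print(probe(url, "Mozilla/5.0 (X11; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0"))
```

If both calls return 403 from one IP but succeed from another, that would point towards an IP block rather than the User-Agent.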


4 participants