WiP: Fix looping Cloudflare challenge, Resolves #1036 #1163

Draft
wants to merge 2 commits into
base: master
from

Conversation

ilike2burnthing
Contributor

Thanks to @juanfrilla for #1036 (comment).

Unfortunately, currently this only works on Windows, and the looping challenges return if using proxies or VPNs.

@garfield69
Contributor

garfield69 commented Apr 21, 2024

FWIW
My Win10 machine is on Chrome 124, and I don't use a VPN or proxy.
I've tested this PR (running from source with Python), and it solves for trupornolabs, riperam, marinetracker, devil-torrents and 52BT, which were the indexers giving me issues previously.
I also tested most of the other Cloudflare-protected indexers that were previously working for me, and they continue to work with this PR.
Some indexers, however, continue to fail: leporno still returns the invalid cookies error, and ext-torrents now fails on ext.to but works on the other 2 alternate domains.

But after each solve there remains a Chrome subprocess that spins up to 15% CPU, and I have to kill these off manually.
Should I test using this PR's Win10 exe?
[edit] Oh wait, there isn't one.

@juanfrilla

juanfrilla commented Apr 21, 2024

Another thing that I've noticed is that in the user-agent headless replacement:

                self.execute_cdp_cmd(
                    "Network.setUserAgentOverride",
                    {
                        "userAgent": self.execute_script(
                            "return navigator.userAgent"
                        ).replace("Headless", "")
                    },
                )

I don't know why, but if I hardcode the user-agent to the exact one my computer has, like this:

user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36"
options.add_argument(f"--user-agent={user_agent}")

it bypasses Cloudflare, but if I set it automatically, the way you have it on line 533 of src/undetected_chromedriver/__init__.py, it does not work.

So an alternative could be to set up a driver only to get the user agent:

def get_user_agent(driver):
   return driver.execute_script("return navigator.userAgent;").replace("Headless", "")

And then pass the user-agent to the definitive driver
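
A minimal sketch of that two-step idea, assuming the throwaway driver exposes Selenium's `execute_script`. `clean_user_agent`, `user_agent_argument` and `StubDriver` are hypothetical names for illustration, not part of FlareSolverr or undetected_chromedriver:

```python
def clean_user_agent(raw_user_agent: str) -> str:
    """Drop the 'Headless' marker, as the replace() call above does."""
    return raw_user_agent.replace("Headless", "")


def get_user_agent(driver) -> str:
    """Read navigator.userAgent from a throwaway driver and clean it."""
    return clean_user_agent(driver.execute_script("return navigator.userAgent;"))


def user_agent_argument(user_agent: str) -> str:
    """Build the Chrome switch that pins the user agent on the definitive driver."""
    return f"--user-agent={user_agent}"


class StubDriver:
    """Hypothetical stand-in for a real WebDriver so the sketch runs without a browser."""

    def __init__(self, user_agent: str) -> None:
        self._user_agent = user_agent

    def execute_script(self, script: str) -> str:
        return self._user_agent


probe = StubDriver(
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/123.0.0.0 Safari/537.36"
)
argument = user_agent_argument(get_user_agent(probe))
```

With a real setup, the probe driver would be quit first, and `options.add_argument(argument)` would then be called on the options for the definitive driver.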

PS: I can only tell you what I've discovered, to see if we can work toward a solution, because I'm having trouble getting the project installed/set up 😅

@m33ts4k0z

m33ts4k0z commented Apr 21, 2024

Hello. I just wanted to let you know that this didn't bypass the challenge on arab-torrents.net, showing the same internal server error. I am not using a VPN or proxies, and I don't have a datacenter IP. Let me know if you need any info to troubleshoot this further.

It turns out I hadn't actually been using this branch; it worked fine after I switched to it. Thanks

@ilike2burnthing
Contributor Author

ilike2burnthing commented Apr 21, 2024

@garfield69 yeah, this seems to be an issue with Chrome v124. You can revert to v123 in the meantime if it's easier - #1161

Alternatively, build your own binaries, which will use Chromium v123:

python src/build_package.py

@ilike2burnthing
Contributor Author

@m33ts4k0z were you doing this on Windows?

@m33ts4k0z

@m33ts4k0z were you doing this on Windows?

Yes on a Windows 11 VM on Unraid but it did work in the end. I updated my first post here with the cause.

@garfield69
Contributor

Alternatively, build your own binaries, which will use Chromium v123:

python src/build_package.py

Oh cool, I did not know I could build on Windows.
Built successfully and tested. Much better, no leftover Chrome tasks chewing CPU anymore :-)

@ilike2burnthing
Contributor Author

ilike2burnthing commented Apr 22, 2024

@juanfrilla sorry for the delay in replying, been busy and only got to a few quick ones on my phone.

I'll have a look at the UA idea when I next get a chance, thanks.

Assuming you're following the run from source instructions, what issue are you having? https://github.com/FlareSolverr/FlareSolverr#from-source-code

@juanfrilla

@ilike2burnthing my main problem is that I cannot install Xvfb on macOS

@ilike2burnthing
Contributor Author

Tried XQuartz?

@juanfrilla

juanfrilla commented Apr 22, 2024

Tried XQuartz?

yessir now the project is set up, let's see what I can fix

@21hsmw
Contributor

21hsmw commented May 1, 2024

What exactly is left to do to get this merged? I tried to work it out from the comments here and in some related issues, but I can't tell what the current status is. It seems to have been stale for quite some time, so what's needed?

@ilike2burnthing
Contributor Author

Unfortunately, currently this only works on Windows, and the looping challenges return if using proxies or VPNs.

I'll have a look at the UA idea when I next get a chance, thanks.

@21hsmw
Contributor

21hsmw commented May 1, 2024

Unfortunately, currently this only works on Windows, and the looping challenges return if using proxies or VPNs.

I'll have a look at the UA idea when I next get a chance, thanks.

Well, I made my own implementation of this "new tab" idea and I was able to make it work with every website I could (ext.to, www3.yggtorrent.cool, dodi-repacks.site, hd-torrents.me/login.php, nhentai.net) on my Linux system using a VPN / socks5 proxy and also with my container image on my own remote Linux server, which was blocked by cloudflare too.
Unfortunately I can't test on Windows, so if someone can test that and report back please do.

Public image with my edits: 21hsmw/flaresolverr:fixlooping
Code here: 21hsmw@da6cc9d

@ilike2burnthing
Contributor Author

That's working 95% of the time on Windows for me, even with a proxy, but failing 95% of the time on Docker. Usual error:

Error: Error solving the challenge. 'NoneType' object has no attribute 'startswith'

Seems it's related to get_correct_window and trying to get driver.current_url. Adding some extra logging shows that the URL is returned as None. Adding some additional sleeps then shows the correct URL, but I'm still getting challenge loops or crashes.
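
One way to avoid calling `startswith` on `None` might be to poll `driver.current_url` until it has a value. A rough sketch, assuming the URL is only temporarily unavailable while the new tab loads; `wait_for_current_url` and `StubDriver` are hypothetical, and the timeout values are guesses:

```python
import time
from typing import Optional


def wait_for_current_url(driver, timeout: float = 10.0, poll: float = 0.5) -> Optional[str]:
    """Return driver.current_url once it is non-empty, or None after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        url = driver.current_url
        if url:
            return url
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


class StubDriver:
    """Hypothetical stand-in whose current_url is None for the first few reads."""

    def __init__(self, urls) -> None:
        self._urls = list(urls)

    @property
    def current_url(self):
        # Consume values in order, then keep returning the last one.
        return self._urls.pop(0) if len(self._urls) > 1 else self._urls[0]


driver = StubDriver([None, None, "https://example.com/"])
url = wait_for_current_url(driver, timeout=1.0, poll=0.0)
# Only call startswith once we know the URL is a string.
is_devtools = url is not None and url.startswith("devtools://devtools")
```

This only removes the `NoneType` error; it does not by itself explain or fix the challenge loops.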

@21hsmw
Contributor

21hsmw commented May 2, 2024

That's working 95% of the time on Windows for me, even with a proxy, but failing 95% of the time on Docker.

When you say it fails on Docker, is it still on Windows or Linux?

I got this error on Linux while doing my implementation, but have not been able to replicate it since. For the looping challenges, it seems to be a timing issue. Playing with the timer values can make it work in some cases, but it's not easy to know what works for everyone since it seems to take network latency into account. For example, if I use a proxy close to my location, it works 100% of the time with the sites I listed earlier, but if I use a proxy very far from me, it works 50% of the time.
Can you try to increase the timers to something like 6, 8 or more?

@ilike2burnthing
Contributor Author

Linux.

I'll play around with timings again (I did a bunch yesterday), see if I can get something that works both on my Docker and Windows.

@21hsmw
Contributor

21hsmw commented May 2, 2024

Linux.

Strange then. I'm able to solve the challenges of all sites I try on my Debian and Fedora systems with different VPNs/Proxies with and without Docker involved.
Can you share an example of one of your tests with debug enabled?

Here's an example with dodi-repacks.site using the docker image I shared previously:
https://pastebin.com/nBramRXq

@aevrard

This comment was marked as off-topic.

@zenderzender

Well, I made my own implementation of this "new tab" idea and I was able to make it work with every website I could (ext.to, www3.yggtorrent.cool, dodi-repacks.site, hd-torrents.me/login.php, nhentai.net) on my Linux system using a VPN / socks5 proxy and also with my container image on my own remote Linux server, which was blocked by cloudflare too. Unfortunately I can't test on Windows, so if someone can test that and report back please do.

Public image with my edits: 21hsmw/flaresolverr:fixlooping Code here: 21hsmw@da6cc9d

Thanks for your workaround @21hsmw
Here is a temporary image for anybody needing an ARM build :)
zender/flaresolverr-fixed:arm

Working with LANG=fr-FR

@aevrard the solution you provide will kill the killswitch if you're using something like gluetun...

@LoicBrison

Thanks @21hsmw !
Works for YGG with YGGCookie and YGGtorrent; LANG=en_US

@Vrozaksen

Vrozaksen commented May 3, 2024

21hsmw/flaresolverr:fixlooping

Worked for me on whatbox.ca

services:
     flaresolverr:
         image: 21hsmw/flaresolverr:fixlooping
         environment:
           - LOG_LEVEL=${LOG_LEVEL:-info}
           - LOG_HTML=${LOG_HTML:-false}
           - CAPTCHA_SOLVER=${CAPTCHA_SOLVER:-none}
           - TZ=UTC
           - PORT=25000
           - HOST=127.0.0.1
         network_mode: host
         pull_policy: always
         restart: unless-stopped
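
For anyone testing against a setup like this, a request could be built along these lines. This is only a sketch: the `/v1` path and the `cmd`, `url` and `maxTimeout` fields are the ones that show up in FlareSolverr debug logs elsewhere in this thread, the host and port come from the compose file above, and `build_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_request(target_url: str, host: str = "127.0.0.1", port: int = 25000,
                  max_timeout_ms: int = 90000) -> urllib.request.Request:
    """Build (but do not send) a POST to the /v1 endpoint."""
    payload = {"cmd": "request.get", "url": target_url, "maxTimeout": max_timeout_ms}
    return urllib.request.Request(
        f"http://{host}:{port}/v1",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("https://example.com/")
# To actually send it (needs the container to be running):
# with urllib.request.urlopen(request, timeout=120) as response:
#     print(json.load(response)["status"])
```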

@juanfrilla

juanfrilla commented May 3, 2024

Well, I made my own implementation of this "new tab" idea and I was able to make it work with every website I could (ext.to, www3.yggtorrent.cool, dodi-repacks.site, hd-torrents.me/login.php, nhentai.net) on my Linux system using a VPN / socks5 proxy and also with my container image on my own remote Linux server, which was blocked by cloudflare too. Unfortunately I can't test on Windows, so if someone can test that and report back please do.

Public image with my edits: 21hsmw/flaresolverr:fixlooping Code here: 21hsmw@da6cc9d

Replacing the base image in the Dockerfile with this:
python:3.11-slim-bullseye works perfectly locally on macOS (M2), with and without proxies. I tested with my website, "https://www.icj-cij.org/sites/default/files/case-related/187/187-20231215-ord-01-00-en.pdf", and I can get the cf_clearance cookie.

I also tested on a CentOS server with the previous image (python:3.11-slim-bookworm), and it doesn't work.

@21hsmw
Contributor

21hsmw commented May 3, 2024

I just tested several times with my image using the same link you provided, and I was able to pass the challenge after the tab switch is completed.
I still think it's a timing issue.

Here are my observations testing with your link:

  • My image using a proxy close to my real location:
    • First try: Works
    • Second try: Works
    • Third try: Works
  • My image using a proxy far from my real location:
    • First try: Fail
    • Second try: Fail
    • Third try: Fail

I was skeptical about the results, so I started tweaking the timers.

Changed this part from 2/2 to 4/4:

def switch_to_new_tab(driver: WebDriver, url: str) -> None:
    logging.debug("Opening new tab...")
    driver.execute_script(f"window.open('{url}', 'new tab')")
-   time.sleep(2)
+   time.sleep(4)
    logging.debug("Closing original tab...")
-   time.sleep(2)
+   time.sleep(4)
    driver.close()

Then tried again:

  • New timer image using a proxy close to my real location:
    • First try: Works
    • Second try: Works
    • Third try: Works
  • New timer image using a proxy far from my real location:
    • First try: Works
    • Second try: Works
    • Third try: Works

Oddly enough, when I switch back to 2/2, the far away proxy still works, which suggests that once a challenge has succeeded, later ones are easier to pass.
So I tried with the same countries but with different ISPs, and the results are the same: the far away proxy fails at 2/2, succeeds at 4/4, and continues to work even after switching back to 2/2.

Can you also try changing to 4/4 and see if that changes anything?
If it doesn't, there's something we're missing somewhere.
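
To make this easier to compare across setups, the two delays could be made configurable rather than edited in the source. A sketch under assumptions: `TAB_SWITCH_DELAY` is a hypothetical environment variable that does not exist in FlareSolverr, and `StubDriver` only records calls so the snippet runs without a browser.

```python
import logging
import os
import time


def tab_switch_delay(env=os.environ, default: float = 4.0) -> float:
    """Read the delay in seconds from a hypothetical TAB_SWITCH_DELAY variable."""
    try:
        return max(0.0, float(env.get("TAB_SWITCH_DELAY", default)))
    except ValueError:
        return default


def switch_to_new_tab(driver, url: str, delay: float, sleep=time.sleep) -> None:
    """Same steps as the function above, with both sleeps driven by one delay."""
    logging.debug("Opening new tab...")
    # Passing the URL as a script argument avoids breaking the JS on quotes in the URL.
    driver.execute_script("window.open(arguments[0], 'new tab')", url)
    sleep(delay)
    logging.debug("Closing original tab...")
    sleep(delay)
    driver.close()


class StubDriver:
    """Hypothetical stand-in that records calls instead of driving a browser."""

    def __init__(self) -> None:
        self.calls = []

    def execute_script(self, script: str, *args) -> None:
        self.calls.append(("execute_script", args))

    def close(self) -> None:
        self.calls.append(("close", ()))


driver = StubDriver()
slept = []
delay = tab_switch_delay({"TAB_SWITCH_DELAY": "6"})
switch_to_new_tab(driver, "https://example.com/", delay, sleep=slept.append)
```

Testers on distant proxies could then try 6 or 8 without rebuilding the image.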

@juanfrilla

juanfrilla commented May 3, 2024

yeeeeah it worked with 4/4 on the server, here is the image:
https://hub.docker.com/repository/docker/juanfrillaaa/flaresolverr/general

@21hsmw
Contributor

21hsmw commented May 3, 2024

yeeeeah it worked with 4/4 on the server

Nice.
I can get past most sites with 4/4 now, but I can't get this one to work: hd-torrents.me
Can you try it? I was able to before, but now I can't for some reason.
I will try to check later.

@ilike2burnthing
Contributor Author

ilike2burnthing commented May 5, 2024

Using your current image:

2024-05-05 05:19:33 INFO     ReqId 139851825376128 FlareSolverr 3.3.17
2024-05-05 05:19:33 DEBUG    ReqId 139851825376128 Debug log enabled
2024-05-05 05:19:33 INFO     ReqId 139851825376128 Testing web browser installation...
2024-05-05 05:19:33 INFO     ReqId 139851825376128 Platform: Linux-5.13.x-x86_64-with-glibc2.36
2024-05-05 05:19:33 INFO     ReqId 139851825376128 Chrome / Chromium path: /bin/chromium
2024-05-05 05:19:34 INFO     ReqId 139851825376128 Chrome / Chromium major version: 124
2024-05-05 05:19:34 INFO     ReqId 139851825376128 Launching web browser...
2024-05-05 05:19:34 DEBUG    ReqId 139851825376128 Launching web browser...
version_main cannot be converted to an integer
2024-05-05 05:19:35 DEBUG    ReqId 139851825376128 Started executable: `/app/chromedriver` in a child process with pid: 29
2024-05-05 05:19:40 INFO     ReqId 139851825376128 FlareSolverr User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36
2024-05-05 05:19:40 INFO     ReqId 139851825376128 Test successful!
2024-05-05 05:19:40 INFO     ReqId 139851825376128 Serving on http://0.0.0.0:8191
2024-05-05 05:22:09 INFO     ReqId 139851791058624 Incoming request => POST /v1 body: {'maxTimeout': 90000, 'cmd': 'request.get', 'url': 'https://idope.se/browse.html'}
2024-05-05 05:22:09 DEBUG    ReqId 139851791058624 Launching web browser...
version_main cannot be converted to an integer
2024-05-05 05:22:09 DEBUG    ReqId 139851791058624 Started executable: `/app/chromedriver` in a child process with pid: 166
2024-05-05 05:22:13 DEBUG    ReqId 139851791058624 New instance of webdriver has been created to perform the request
2024-05-05 05:22:13 DEBUG    ReqId 139851757487808 Navigating to... https://idope.se/browse.html
2024-05-05 05:22:20 INFO     ReqId 139851757487808 Challenge detected. Title found: Just a moment...
2024-05-05 05:22:20 DEBUG    ReqId 139851757487808 Waiting for title (attempt 1): Just a moment...
2024-05-05 05:22:21 DEBUG    ReqId 139851757487808 Timeout waiting for selector
2024-05-05 05:22:21 DEBUG    ReqId 139851757487808 Try to find the Cloudflare verify checkbox...
2024-05-05 05:22:25 DEBUG    ReqId 139851757487808 Cloudflare verify checkbox not found on the page.
2024-05-05 05:22:25 DEBUG    ReqId 139851757487808 Try to find the Cloudflare 'Verify you are human' button...
2024-05-05 05:22:25 DEBUG    ReqId 139851757487808 The Cloudflare 'Verify you are human' button not found on the page.
2024-05-05 05:22:31 DEBUG    ReqId 139851757487808 Waiting for title (attempt 2): Just a moment...
2024-05-05 05:22:32 DEBUG    ReqId 139851757487808 Timeout waiting for selector
2024-05-05 05:22:32 DEBUG    ReqId 139851757487808 Try to find the Cloudflare verify checkbox...
2024-05-05 05:22:35 DEBUG    ReqId 139851757487808 Cloudflare verify checkbox not found on the page.
2024-05-05 05:22:35 DEBUG    ReqId 139851757487808 Try to find the Cloudflare 'Verify you are human' button...
2024-05-05 05:22:35 DEBUG    ReqId 139851757487808 The Cloudflare 'Verify you are human' button not found on the page.
2024-05-05 05:22:37 DEBUG    ReqId 139851757487808 Waiting for title (attempt 3): Just a moment...
2024-05-05 05:22:38 DEBUG    ReqId 139851757487808 Timeout waiting for selector
2024-05-05 05:22:38 DEBUG    ReqId 139851757487808 Try to find the Cloudflare verify checkbox...
2024-05-05 05:22:39 DEBUG    ReqId 139851757487808 Cloudflare verify checkbox not found on the page.
2024-05-05 05:22:39 DEBUG    ReqId 139851757487808 Try to find the Cloudflare 'Verify you are human' button...
2024-05-05 05:22:40 DEBUG    ReqId 139851757487808 The Cloudflare 'Verify you are human' button not found on the page.
2024-05-05 05:22:42 DEBUG    ReqId 139851757487808 Opening new tab...
2024-05-05 05:22:44 DEBUG    ReqId 139851757487808 Closing original tab...
2024-05-05 05:22:47 DEBUG    ReqId 139851791058624 A used instance of webdriver has been destroyed
2024-05-05 05:22:47 ERROR    ReqId 139851791058624 Error: Error solving the challenge. 'NoneType' object has no attribute 'startswith'
2024-05-05 05:22:47 DEBUG    ReqId 139851791058624 Response => POST /v1 body: {'status': 'error', 'message': "Error: Error solving the challenge. 'NoneType' object has no attribute 'startswith'", 'startTimestamp': 1714886529365, 'endTimestamp': 1714886567775, 'version': '3.3.17'}
2024-05-05 05:22:47 INFO     ReqId 139851791058624 Response in 38.41 s
2024-05-05 05:22:47 INFO     ReqId 139851791058624 172.18.0.4 POST http://flaresolverr:8191/v1 500 Internal Server Error

With a few changes:

2024-05-05 05:29:05 INFO     ReqId 140686649551744 FlareSolverr 3.3.17
2024-05-05 05:29:05 DEBUG    ReqId 140686649551744 Debug log enabled
2024-05-05 05:29:05 INFO     ReqId 140686649551744 Testing web browser installation...
2024-05-05 05:29:05 INFO     ReqId 140686649551744 Platform: Linux-5.13.x-x86_64-with-glibc2.36
2024-05-05 05:29:05 INFO     ReqId 140686649551744 Chrome / Chromium path: /usr/bin/chromium
2024-05-05 05:29:05 INFO     ReqId 140686649551744 Chrome / Chromium major version: 124
2024-05-05 05:29:05 INFO     ReqId 140686649551744 Launching web browser...
2024-05-05 05:29:05 DEBUG    ReqId 140686649551744 Launching web browser...
version_main cannot be converted to an integer
2024-05-05 05:29:06 DEBUG    ReqId 140686649551744 Started executable: `/app/chromedriver` in a child process with pid: 31
2024-05-05 05:29:10 INFO     ReqId 140686649551744 FlareSolverr User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36
2024-05-05 05:29:10 INFO     ReqId 140686649551744 Test successful!
2024-05-05 05:29:11 INFO     ReqId 140686649551744 Serving on http://0.0.0.0:8191
2024-05-05 05:29:20 INFO     ReqId 140686615234240 Incoming request => POST /v1 body: {'maxTimeout': 90000, 'cmd': 'request.get', 'url': 'https://idope.se/browse.html'}
2024-05-05 05:29:20 DEBUG    ReqId 140686615234240 Launching web browser...
version_main cannot be converted to an integer
2024-05-05 05:29:20 DEBUG    ReqId 140686615234240 Started executable: `/app/chromedriver` in a child process with pid: 173
2024-05-05 05:29:23 DEBUG    ReqId 140686615234240 New instance of webdriver has been created to perform the request
2024-05-05 05:29:23 DEBUG    ReqId 140686581663424 Navigating to... https://idope.se/browse.html
2024-05-05 05:29:38 DEBUG    ReqId 140686581663424 Current URL... https://idope.se/browse.html
2024-05-05 05:29:38 INFO     ReqId 140686581663424 Challenge detected. Title found: Just a moment...
2024-05-05 05:29:38 DEBUG    ReqId 140686581663424 Waiting for title (attempt 1): Just a moment...
2024-05-05 05:29:39 DEBUG    ReqId 140686581663424 Timeout waiting for selector
2024-05-05 05:29:39 DEBUG    ReqId 140686581663424 Try to find the Cloudflare verify checkbox...
2024-05-05 05:29:41 DEBUG    ReqId 140686581663424 Cloudflare verify checkbox not found on the page.
2024-05-05 05:29:41 DEBUG    ReqId 140686581663424 Try to find the Cloudflare 'Verify you are human' button...
2024-05-05 05:29:41 DEBUG    ReqId 140686581663424 The Cloudflare 'Verify you are human' button not found on the page.
2024-05-05 05:29:43 DEBUG    ReqId 140686581663424 Waiting for title (attempt 2): Just a moment...
2024-05-05 05:29:44 DEBUG    ReqId 140686581663424 Timeout waiting for selector
2024-05-05 05:29:44 DEBUG    ReqId 140686581663424 Try to find the Cloudflare verify checkbox...
2024-05-05 05:29:45 DEBUG    ReqId 140686581663424 Cloudflare verify checkbox not found on the page.
2024-05-05 05:29:45 DEBUG    ReqId 140686581663424 Try to find the Cloudflare 'Verify you are human' button...
2024-05-05 05:29:45 DEBUG    ReqId 140686581663424 The Cloudflare 'Verify you are human' button not found on the page.
2024-05-05 05:29:47 DEBUG    ReqId 140686581663424 Waiting for title (attempt 3): Just a moment...
2024-05-05 05:29:48 DEBUG    ReqId 140686581663424 Timeout waiting for selector
2024-05-05 05:29:48 DEBUG    ReqId 140686581663424 Try to find the Cloudflare verify checkbox...
2024-05-05 05:29:50 DEBUG    ReqId 140686581663424 Cloudflare verify checkbox not found on the page.
2024-05-05 05:29:50 DEBUG    ReqId 140686581663424 Try to find the Cloudflare 'Verify you are human' button...
2024-05-05 05:29:50 DEBUG    ReqId 140686581663424 The Cloudflare 'Verify you are human' button not found on the page.
2024-05-05 05:29:52 DEBUG    ReqId 140686581663424 Opening new tab...
2024-05-05 05:30:01 DEBUG    ReqId 140686581663424 Closing original tab...
2024-05-05 05:30:05 DEBUG    ReqId 140686581663424 Current URL... devtools://devtools/bundled/devtools_app.html?remoteBase=https://chrome-devtools-frontend.appspot.com/serve_file/@a087f2dd364ddd58b9c016ef1bf563d2bc138711/&can_dock=true&targetType=tab&veLogging=true
2024-05-05 05:30:06 DEBUG    ReqId 140686581663424 Current URL... https://idope.se/browse.html
2024-05-05 05:30:10 DEBUG    ReqId 140686581663424 Waiting for title (attempt 4): Just a moment...
2024-05-05 05:30:12 DEBUG    ReqId 140686581663424 Timeout waiting for selector
2024-05-05 05:30:12 DEBUG    ReqId 140686581663424 Try to find the Cloudflare verify checkbox...
2024-05-05 05:30:12 DEBUG    ReqId 140686581663424 Cloudflare verify checkbox not found on the page.
2024-05-05 05:30:12 DEBUG    ReqId 140686581663424 Try to find the Cloudflare 'Verify you are human' button...
2024-05-05 05:30:12 DEBUG    ReqId 140686581663424 The Cloudflare 'Verify you are human' button not found on the page.
2024-05-05 05:30:14 DEBUG    ReqId 140686581663424 Waiting for title (attempt 5): Just a moment...
2024-05-05 05:30:16 DEBUG    ReqId 140686581663424 Timeout waiting for selector
2024-05-05 05:30:16 DEBUG    ReqId 140686581663424 Try to find the Cloudflare verify checkbox...
2024-05-05 05:30:17 DEBUG    ReqId 140686581663424 Cloudflare verify checkbox not found on the page.
2024-05-05 05:30:17 DEBUG    ReqId 140686581663424 Try to find the Cloudflare 'Verify you are human' button...
2024-05-05 05:30:17 DEBUG    ReqId 140686581663424 The Cloudflare 'Verify you are human' button not found on the page.
2024-05-05 05:30:19 DEBUG    ReqId 140686581663424 Waiting for title (attempt 6): Just a moment...
2024-05-05 05:30:20 DEBUG    ReqId 140686581663424 Timeout waiting for selector
2024-05-05 05:30:20 DEBUG    ReqId 140686581663424 Try to find the Cloudflare verify checkbox...
2024-05-05 05:30:23 DEBUG    ReqId 140686581663424 Cloudflare verify checkbox not found on the page.
2024-05-05 05:30:23 DEBUG    ReqId 140686581663424 Try to find the Cloudflare 'Verify you are human' button...
2024-05-05 05:30:23 DEBUG    ReqId 140686581663424 The Cloudflare 'Verify you are human' button not found on the page.
2024-05-05 05:30:25 DEBUG    ReqId 140686581663424 Waiting for title (attempt 7): Just a moment...
2024-05-05 05:30:26 DEBUG    ReqId 140686581663424 Timeout waiting for selector
2024-05-05 05:30:26 DEBUG    ReqId 140686581663424 Try to find the Cloudflare verify checkbox...
2024-05-05 05:30:26 DEBUG    ReqId 140686581663424 Cloudflare verify checkbox not found on the page.
2024-05-05 05:30:26 DEBUG    ReqId 140686581663424 Try to find the Cloudflare 'Verify you are human' button...
2024-05-05 05:30:26 DEBUG    ReqId 140686581663424 The Cloudflare 'Verify you are human' button not found on the page.
2024-05-05 05:30:28 DEBUG    ReqId 140686581663424 Opening new tab...
2024-05-05 05:31:04 DEBUG    ReqId 140686615234240 A used instance of webdriver has been destroyed
2024-05-05 05:31:04 ERROR    ReqId 140686615234240 Error: Error solving the challenge. Timeout after 90.0 seconds.
2024-05-05 05:31:04 DEBUG    ReqId 140686615234240 Response => POST /v1 body: {'status': 'error', 'message': 'Error: Error solving the challenge. Timeout after 90.0 seconds.', 'startTimestamp': 1714886960843, 'endTimestamp': 1714887064376, 'version': '3.3.17'}
2024-05-05 05:31:04 INFO     ReqId 140686615234240 Response in 103.533 s
2024-05-05 05:31:04 INFO     ReqId 140686615234240 172.18.0.4 POST http://flaresolverr:8191/v1 500 Internal Server Error

While everything works fine on Windows, the best I could get with Docker was an equivalent to the current release, but now with various errors instead of just timing out, e.g.:

2024-05-05 06:43:04 ERROR    ReqId 140686615234240 Error: Error solving the challenge. Message: no such window: target window already closed\nfrom unknown error: web view not found\n  (Session info: chrome=124.0.6367.78)\nStacktrace:\n#0 0x560cca239613 <unknown>\n#1 0x560cc9f14f37 <unknown>\n#2 0x560cc9ef202b <unknown>\n#3 0x560cc9f879e1 <unknown>\n#4 0x560cc9f9a7bc <unknown>\n#5 0x560cc9f7dc83 <unknown>\n#6 0x560cc9f4eb8d <unknown>\n#7 0x560cc9f4f942 <unknown>\n#8 0x560cca207c56 <unknown>\n#9 0x560cca20b01a <unknown>\n#10 0x560cca20aacf <unknown>\n#11 0x560cca20b4a5 <unknown>\n#12 0x560cca1f86df <unknown>\n#13 0x560cca20b840 <unknown>\n#14 0x560cca1e15a6 <unknown>\n#15 0x560cca22a265 <unknown>\n#16 0x560cca22a452 <unknown>\n#17 0x560cca238b3f <unknown>\n#18 0x7fee2da2e134 <unknown>\n
2024-05-05 06:43:04 DEBUG    ReqId 140686615234240 Response => POST /v1 body: {'status': 'error', 'message': 'Error: Error solving the challenge. Message: no such window: target window already closed\\nfrom unknown error: web view not found\\n  (Session info: chrome=124.0.6367.78)\\nStacktrace:\\n#0 0x560cca239613 <unknown>\\n#1 0x560cc9f14f37 <unknown>\\n#2 0x560cc9ef202b <unknown>\\n#3 0x560cc9f879e1 <unknown>\\n#4 0x560cc9f9a7bc <unknown>\\n#5 0x560cc9f7dc83 <unknown>\\n#6 0x560cc9f4eb8d <unknown>\\n#7 0x560cc9f4f942 <unknown>\\n#8 0x560cca207c56 <unknown>\\n#9 0x560cca20b01a <unknown>\\n#10 0x560cca20aacf <unknown>\\n#11 0x560cca20b4a5 <unknown>\\n#12 0x560cca1f86df <unknown>\\n#13 0x560cca20b840 <unknown>\\n#14 0x560cca1e15a6 <unknown>\\n#15 0x560cca22a265 <unknown>\\n#16 0x560cca22a452 <unknown>\\n#17 0x560cca238b3f <unknown>\\n#18 0x7fee2da2e134 <unknown>\\n', 'startTimestamp': 1714891291300, 'endTimestamp': 1714891384926, 'version': '3.3.17'}

As for what changes I got the best results with (there were many other failed variations):

def get_correct_window(driver: WebDriver) -> WebDriver:
    if len(driver.window_handles) > 1:
        for window_handle in driver.window_handles:
            driver.switch_to.window(window_handle)
            time.sleep(1)
            current_url = driver.current_url
            logging.debug(f"Current URL... {current_url}")
            if not current_url.startswith("devtools://devtools"):
                return driver
    return driver

def switch_to_new_tab(driver: WebDriver, url: str) -> None:
    logging.debug("Opening new tab...")
    driver.execute_script(f"window.open('{url}', 'new tab')")
    time.sleep(8)
    logging.debug("Closing original tab...")
    driver.close()

[...]

                if attempt in {4, 8, 12, 16, 20}:
                    switch_to_new_tab(driver, req.url)
                    time.sleep(1)
                    driver = get_correct_window(driver)
                    time.sleep(4)
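
A possible contributor to the `no such window` error is that `get_correct_window` only switches when more than one handle exists, so if the new tab is the only handle left after `driver.close()`, the driver may still be focused on the closed tab. That is a guess and is not confirmed by these logs. A sketch that always switches to a surviving, non-DevTools handle and tolerates a `None` URL (`pick_content_window` and the stub classes are hypothetical):

```python
from typing import Optional


def pick_content_window(driver) -> Optional[str]:
    """Switch to the first handle that is not a DevTools window; return it, or None."""
    for handle in driver.window_handles:
        driver.switch_to.window(handle)
        # A URL that is not available yet (None) is treated as a content window here.
        current_url = driver.current_url or ""
        if not current_url.startswith("devtools://devtools"):
            return handle
    return None


class _StubSwitchTo:
    def __init__(self, driver) -> None:
        self._driver = driver

    def window(self, handle: str) -> None:
        self._driver.focused = handle


class StubDriver:
    """Hypothetical stand-in exposing window_handles, switch_to and current_url."""

    def __init__(self, urls_by_handle: dict, focused: str) -> None:
        self._urls = urls_by_handle
        self.focused = focused
        self.switch_to = _StubSwitchTo(self)

    @property
    def window_handles(self):
        return list(self._urls)

    @property
    def current_url(self):
        return self._urls[self.focused]


# The original tab ("closed") is gone; only DevTools and the new tab remain.
driver = StubDriver(
    {"devtools": "devtools://devtools/bundled/devtools_app.html", "new": None},
    focused="closed",
)
chosen = pick_content_window(driver)
```

Whether a window with no URL yet should count as the content window is a design choice; waiting for the URL first may be safer.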

@21hsmw
Contributor

21hsmw commented May 5, 2024

I spent all afternoon trying to find a way to bypass Cloudflare on hd-torrents.me without success while I'm able to pass all others.
I still don't understand why some websites work and others don't. Interestingly, I'm able to pass the challenge without even going through the tab part with your test website... (https://idope.se/browse.html / Success on attempt 2 each time).
The only thing I've found is the fact that manually opening a new empty tab and pasting the website URL makes it work 100% of the time, but not when I do the same with the driver and then paste manually, meaning they know we open the tab by some magic (scripting) and that triggers their system. I have tried to find a way to virtualize the keybindings but without success because of the current project implementation. Unless we find a different method that works the same for everyone, I'm not sure we'll be able to fix this completely.

@ilike2burnthing
Contributor Author

The successor to UC is nodriver, but I'm not going to even try to change over to that; I would be immediately out of my depth.

@Hyperz

Hyperz commented May 6, 2024

I spent all afternoon trying to find a way to bypass Cloudflare on hd-torrents.me without success while I'm able to pass all others. I still don't understand why some websites work and others don't. Interestingly, I'm able to pass the challenge without even going through the tab part with your test website... (https://idope.se/browse.html / Success on attempt 2 each time). The only thing I've found is the fact that manually opening a new empty tab and pasting the website URL makes it work 100% of the time, but not when I do the same with the driver and then paste manually, meaning they know we open the tab by some magic (scripting) and that triggers their system. I have tried to find a way to virtualize the keybindings but without success because of the current project implementation. Unless we find a different method that works the same for everyone, I'm not sure we'll be able to fix this completely.

I've come to the same conclusions after many hours of testing, including generating a browsing history and manipulating web pages of big sites to inject an <a> and spawning tabs by clicking that (or just to get a referrer header in there). Clearly Chromedriver or Chrome itself is doing something that makes this detectable, and CF is aware of it. If I had to put my tinfoil hat on, I'd say this kind of makes sense: Google is basically an ad company, so sneaking something in there to detect/fingerprint (parts of?) automation, in addition to all the other tracking and data harvesting, would fit. Regardless though, IMHO Chrome/Chromium should be dropped entirely. When it's not a CF update, it's a Google update that breaks things. And unless someone wants to dig through the millions of lines of Chromium source code or spend a completely unreasonable amount of time reverse engineering the CF challenges, the problem is not really going to get solved at this point.

All that said, the easiest fix for this issue is to just use Firefox. I've ripped out all the uc driver related stuff, made get_webdriver() return a Firefox instance (using webdriver-manager), replaced the tab spawning workarounds with a simple driver.get() and now it works fine again. Even on a site that didn't work with any of the other tricks when using the uc driver.

@ilike2burnthing
Contributor Author

Can you provide a PR?

@Hyperz

Hyperz commented May 6, 2024

Unfortunately I can't, because what I run locally is a very old version that I've been patching myself to work around issues over the last year or so, and there is some custom messy stuff in there as well. And I used selenium-wire in case I needed to do further testing. Plus I didn't hook up the proxy config stuff. But switching it to Firefox was maybe 10 minutes of work, since it's just the utils.py file that needs changing (changing get_webdriver() and removing the functions that are Chrome-specific).

My get_webdriver now looks like this, but keep in mind this is using selenium-wire, which again isn't needed (and actually no longer maintained):

import logging
import os
from typing import Optional

from selenium.webdriver import FirefoxService
from webdriver_manager.firefox import GeckoDriverManager
from seleniumwire import webdriver

...

def get_webdriver(proxy: Optional[dict] = None) -> webdriver.Firefox:
    # proxy is accepted but not used yet: the proxy config is not hooked up.
    logging.debug('Launching web browser...')

    options = webdriver.FirefoxOptions()
    # options.set_preference("network.proxy.type", 1) # Direct = 0, Manual = 1, PAC = 2, AUTODETECT = 4, SYSTEM = 5
    # options.set_preference("network.proxy.http", "127.0.0.1")
    # options.set_preference("network.proxy.http_port", 2020)
    # options.set_preference("network.proxy.share_proxy_settings", True)

    if get_config_headless():
        if os.name == 'nt':
            options.add_argument('-headless')
        else:
            start_xvfb_display()

    driver = webdriver.Firefox(
        options=options,
        service=FirefoxService(
            executable_path=GeckoDriverManager().install(),
        ),
        seleniumwire_options={'disable_encoding': True},
    )
    driver.maximize_window()

    return driver

webdriver.Firefox subclasses selenium.webdriver.Firefox and webdriver.FirefoxOptions is just selenium.webdriver.FirefoxOptions, so IIRC you'd use those instead of the ones from selenium-wire.
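
The commented-out proxy preferences in the snippet above could be derived from a proxy URL along these lines. This is a rough sketch under assumptions: `firefox_proxy_prefs` is a hypothetical helper, the `network.proxy.ssl*` entries are added on top of the preferences shown in the comment, and proxy authentication is not handled.

```python
from urllib.parse import urlsplit


def firefox_proxy_prefs(proxy_url: str) -> dict:
    """Map a proxy URL such as 'http://127.0.0.1:2020' to Firefox preferences."""
    parts = urlsplit(proxy_url)
    if not parts.hostname or parts.port is None:
        raise ValueError(f"proxy URL needs a scheme, host and port: {proxy_url!r}")
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": parts.hostname,
        "network.proxy.http_port": parts.port,
        "network.proxy.ssl": parts.hostname,
        "network.proxy.ssl_port": parts.port,
        "network.proxy.share_proxy_settings": True,
    }
```

Inside get_webdriver() the result would then be applied with `options.set_preference(name, value)` for each item, before the driver is created.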

@21hsmw
Contributor

21hsmw commented May 6, 2024

The successor to UC is nodriver

I've been playing with it for an hour and it passes everything. It seems to support most of what we need for flaresolverr, so it might be an option. The only thing I don't know is if it will run correctly inside a container. If it does, I will probably start moving flaresolverr to nodriver, but that will take some time.

I noticed this part in the nodriver readme (code)

utility function to convert a running undetected_chromedriver.Chrome instance to a nodriver.Browser instance and continue from there

It could be a relatively quick way to support it without too much change until the decision is made whether or not to use it fully.

@tenettow

tenettow commented May 9, 2024

I can access all trackers in Jackett with the latest Flaresolverr, but not with these changes.

@ilike2burnthing
Contributor Author

'latest' being v3.3.17?

What OS?

@tenettow

tenettow commented May 9, 2024

Flaresolverr v3.3.17, Linux x86_64 running on DigitalOcean. Tested on my macOS and that version also works fine with my normal IP. Any tracker I should try that you have issues with?

@ilike2burnthing
Contributor Author

ilike2burnthing commented May 9, 2024

https://github.com/search?q=repo%3AJackett%2FJackett+configuring-flaresolverr&type=code

Some may not currently be using CF, some may be missing from the results, and some only use it for login or keyword searches, but that should give you an idea.

See also those sites mentioned in #1036.

@researchersec

juanfrillaaa/flaresolverr:latest solves the challenges very well on my Ubuntu machine with Docker! Thanks @juanfrilla!

@21hsmw
Contributor

21hsmw commented May 24, 2024

The only thing I don't know is if it will run correctly inside a container.

Update: Just tested, it works with an xvfb display inside a container. I'll see if I can find some time to implement part of nodriver for flaresolverr, but I'm not sure if I should put both undetected-chromedriver and nodriver in the same files, or if it would be better to create separate files like flaresolverr_service_nd.py to make the code cleaner. Any thoughts? @ilike2burnthing

@ilike2burnthing
Contributor Author

Separate sounds good.
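
One way the separate-files layout could be wired up is to pick the service module by name at startup. A minimal sketch, assuming a hypothetical `DRIVER_BACKEND` environment variable and that the existing undetected-chromedriver code stays in `flaresolverr_service`:

```python
import os
from typing import Mapping

# Backend key -> module name. flaresolverr_service_nd is the file proposed above;
# the keys and the DRIVER_BACKEND variable are assumptions for illustration.
BACKEND_MODULES = {
    "uc": "flaresolverr_service",  # undetected-chromedriver
    "nodriver": "flaresolverr_service_nd",
}


def backend_module_name(env: Mapping[str, str] = os.environ) -> str:
    """Return the service module to import, defaulting to the uc backend."""
    backend = env.get("DRIVER_BACKEND", "uc").strip().lower()
    try:
        return BACKEND_MODULES[backend]
    except KeyError:
        raise ValueError(f"unknown DRIVER_BACKEND: {backend!r}") from None
```

The caller could then load it with `importlib.import_module(backend_module_name())`, so neither file needs to know about the other.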

Labels: help wanted, needs investigation