Creepjs detects it... #64

Open
searchingforcode opened this issue Aug 29, 2022 · 18 comments
Labels: antibot, t-tooling

Comments

@searchingforcode

Screenshot_40

Hi, CreepJS can detect the real values by using workers. Is there any solution for that?

@searchingforcode added the bug label Aug 29, 2022
@barjin
Collaborator

barjin commented Aug 30, 2022

Thank you for submitting this issue!

Indeed, support for worker injection is in our scope right now. Unfortunately, injecting code in the (service/shared) worker environment is not as simple as doing it in the "plain browser" environment itself. We are closely monitoring the progress on both automation libraries and will add worker support as soon as it is possible.

@B4nan added the antibot label and removed the bug label Aug 30, 2022
@ja3abuser

Isn't that what https://playwright.dev/docs/api/class-worker#worker-evaluate is for?

@searchingforcode
Author

I think that is why many paid multilogin type service providers use their own modified chromium.

@ja3abuser

Shitty idea :)

@piercefreeman

piercefreeman commented Sep 7, 2022

@barjin @wireguard-dev It seems like the biggest issue here is the lack of init script support in service workers. worker-evaluate isn't a good fit because it can only be called after a worker has been spawned, which is too late to override the object prototypes before the worker's scripts get a chance to execute their fingerprinting logic. At best we end up in a race condition, trying to inject custom scripts while the web worker is waiting.
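
To make the timing problem concrete, here is a minimal sketch of the post-spawn approach, using Playwright's `worker` page event together with `worker.evaluate`. The helper name `patchWorkersAfterSpawn` and the `patch` callback are hypothetical and only serve the illustration. The patch can only be sent once Playwright has reported the worker, at which point the worker's own script may already have read the real values.

```javascript
// Racy approach: patch each worker only after Playwright reports it.
// `page` is a Playwright Page; `patch` is a function that Playwright
// serializes and runs inside the worker.
function patchWorkersAfterSpawn(page, patch) {
  page.on('worker', (worker) => {
    // By now the worker's own script may already be running, so any
    // fingerprinting code in it can win the race against this call.
    worker.evaluate(patch).catch(() => {
      // The worker may have exited before the patch arrived; ignore that.
    });
  });
}
```

Nothing in this sketch can guarantee ordering, which is why the rest of this comment looks at rewriting the worker script before it runs.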

In the past I've solved this problem by intercepting the egress requests that fetch the JavaScript backing the web worker. The browser / page context needs to fetch the script before spawning a service worker, so these requests should still be routed through browserContext.route(url, handler[, options]). If we fetch the script from the desired remote and prepend the fingerprinting init script to it, we should get the desired object primitives in place before the service worker has a chance to run.

Since fingerprint-injector already renders this init script as a single string payload (specifically here), it seems like all we need here would be:

  • Ensure the object primitives are compatible with service workers. Some APIs, such as DOM access or synchronous events, are disabled there, so we would have to make sure we aren't trying to patch these inappropriately. Otherwise fingerprinting services might be able to detect the presence of these monkeypatched objects, or the patch itself might throw an error.
  • Add a handler for page.route within attachFingerprintToPlaywright that filters for request.resourceType() or request.serviceWorker(). I'll have to check which one is present in these service worker loading requests.
  • In this router, continue to conduct the fetch through the browser context.
  • Prepend the init script from the injector to this script and return the result to the browser.

Does this sound like a reasonable route forward?
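
The steps above could be sketched roughly as follows. This is not the fingerprint-injector implementation: `injectIntoWorkerScripts`, `prependInitScript` and the `isWorkerScript` predicate are hypothetical names, and the predicate is left to the caller because the thread has not yet settled how worker-loading requests can be recognized. It assumes a Playwright version that provides `route.fetch()`.

```javascript
// Pure helper: place the init payload in front of the original worker source.
function prependInitScript(initScript, workerSource) {
  // The semicolon and newline keep the payload from fusing with the
  // first statement of the worker script.
  return `${initScript};\n${workerSource}`;
}

// `context` is a Playwright BrowserContext. `isWorkerScript` is a hypothetical
// caller-supplied predicate of the form (request) => boolean.
async function injectIntoWorkerScripts(context, initScript, isWorkerScript) {
  await context.route('**/*', async (route) => {
    if (!isWorkerScript(route.request())) {
      await route.continue();
      return;
    }
    // Fetch the real script through the browser context, then rewrite the body.
    const response = await route.fetch();
    const source = await response.text();
    await route.fulfill({ response, body: prependInitScript(initScript, source) });
  });
}
```

Keeping the string rewrite in its own function means it can be checked without a browser, and the same helper could be reused if the interception later moves to a proxy.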

@barjin
Collaborator

barjin commented Sep 7, 2022

Thank you @piercefreeman (and sorry @wireguard-dev for taking this long :) )! Exactly as you say, the injection needs to take place before all other JS execution.

In fact, there is a (somewhat stale) branch here (see the last commit) following the exact approach you described. Feel free to merge it and try it out locally. This does perform "worker injection", but unfortunately doesn't capture ServiceWorker or SharedWorker requests (most likely due to Playwright limitations).

Without stable support from PW/PP, I cannot really give any estimates on this as of now. The partial support (for regular Workers only) still needs some testing (mainly performance-wise), but it could make it to npm somewhat soon :)

@piercefreeman

@barjin I wrote a MITM proxying library to work around chromium issues where it won't intercept some traffic. It should support that ServiceWorker/SharedWorker use case over both http and https. If you're interested in giving it a spin:
https://github.com/piercefreeman/grooveproxy

@ja3abuser

And what will you do for services that use a blob: URL? Those are no longer network requests.

@piercefreeman

@wireguard-dev Can you elaborate a bit on what service blobs you're referring to?

The proxy server should capture all packets that are going out the network, though. Just like providing a third party proxy server in chromium or requests - it'll route all packets through the middle layer. So in theory it should be possible to manipulate them and mock them with fingerprint injection logic. If you have more info on the context I'll be able to be more specific.

@ja3abuser

Cloudflare uses a blob: URL with SharedWorker. The blob content is effectively hard-coded in the main JS file that creates the SharedWorker.
image
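
For readers unfamiliar with the pattern, here is a minimal sketch of how a page can start a worker from an in-memory blob. It is not Cloudflare's actual code, and `createBlobWorkerUrl` is a hypothetical helper name. The worker source is a string inside the page's own script, so no separate request is made for it.

```javascript
// Build a worker script in memory and return a blob: URL for it.
// Blob and URL.createObjectURL exist in browsers and in recent Node.js.
function createBlobWorkerUrl(source) {
  const blob = new Blob([source], { type: 'text/javascript' });
  return URL.createObjectURL(blob);
}

// In a page, the URL would then be passed straight to the constructor, e.g.
//   new SharedWorker(createBlobWorkerUrl('onconnect = (e) => e.ports[0].postMessage("hi")'));
// No HTTP request is made for the worker script, so route handlers and
// proxies never see it.
```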

@piercefreeman

piercefreeman commented Oct 18, 2022

@wireguard-dev This case gets us a bit outside the scope of the original thread, which was doing request interception over the network from shared workers.

For this particular case I think the answer is to pre-inject an override to the SharedWorker constructor, via the parent code that launches the shared worker in the first place. If a bytes payload is provided then it will prepend the custom logic to the base64 encoded payload. Either that or some future chromium or playwright support that allows you to inspect the script of the shared worker before processing begins.
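
A rough sketch of such a constructor override is below, under several assumptions. All three function names are hypothetical. The base64 branch follows the idea described above for data: URLs. For blob: URLs the sketch swaps in a different technique, a small bootstrap blob that runs the init payload and then loads the original through `importScripts`, which only works for classic (non-module) workers. The wrapper would have to be installed before any page script runs, for example through Playwright's `addInitScript`.

```javascript
// Decode a base64 data: URL, prepend the init payload, and return a new
// data: URL. Returns null when `url` is not a base64 data: URL.
function prependToDataUrl(initScript, url) {
  const match = /^data:[^,]*;base64,(.*)$/s.exec(url);
  if (!match) return null;
  const bytes = Uint8Array.from(atob(match[1]), (ch) => ch.charCodeAt(0));
  const source = new TextDecoder().decode(bytes);
  return `data:text/javascript;charset=utf-8,${encodeURIComponent(`${initScript};\n${source}`)}`;
}

// For blob: URLs, load the original through a small bootstrap blob instead.
// importScripts only exists in classic workers, so module workers are not covered.
function bootstrapBlobUrl(scope, initScript, url) {
  const code = `${initScript};\nimportScripts(${JSON.stringify(url)});`;
  return scope.URL.createObjectURL(new scope.Blob([code], { type: 'text/javascript' }));
}

// Replace scope.SharedWorker with a wrapper that rewrites the script URL first.
// `scope` is the page's global object (window) when used in a browser.
function wrapSharedWorker(scope, initScript) {
  const Original = scope.SharedWorker;
  function SharedWorker(url, options) {
    const href = String(url);
    const rewritten = href.startsWith('blob:')
      ? bootstrapBlobUrl(scope, initScript, href)
      : prependToDataUrl(initScript, href) ?? href;
    // Returning an object from a constructor makes `new SharedWorker(...)`
    // evaluate to the original worker instance.
    return new Original(rewritten, options);
  }
  SharedWorker.prototype = Original.prototype;
  scope.SharedWorker = SharedWorker;
}
```

Note that the wrapper is itself detectable, for instance by inspecting the constructor's source text, so it trades one fingerprinting signal for another unless it is disguised further.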

Do you have a link to this cloudflare javascript?

@ja3abuser

Just check any website with the Cloudflare JS challenge enabled and you will see.

@ja3abuser

@barjin any updates?

@tzbo

tzbo commented Oct 28, 2022

Any update?

@barjin
Collaborator

barjin commented Oct 29, 2022

Thank you everyone for the great ideas! (and sorry for taking so long, again). Long story short - injecting the browser data into workers using the base64 encoded payloads seems more than doable. I'll do some research and will get back to you all with the results.

As for the 'regular' workers, i.e. the ones initiating network requests - every way of injecting those (that I can come up with) results in a performance degradation. @piercefreeman's proxy library looks great - make sure to check it out! - but unfortunately, we cannot use it in fingerprint-suite just yet - we have to keep our dependencies slim. Our main downstream library crawlee handles advanced networking by itself - and introducing another proxy manipulating the requests might easily result in some (potentially fatal) Mexican standoff situation :)

@tzbo

tzbo commented Nov 9, 2022

Any update?

@meotimdihia

meotimdihia commented Apr 21, 2023

CreepJS is giving a rating of F- (the lowest). Could anyone share how to improve the rating?

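  // Assumes the playwright and fingerprint-injector packages are installed
  const { chromium } = require("playwright")
  const { newInjectedContext } = require("fingerprint-injector")

  async function main() {
    // Browser launch added so the snippet runs on its own (default options assumed)
    const browser = await chromium.launch()
    const context = await newInjectedContext(browser, {
      // Constraints for the generated fingerprint (optional)

      // Playwright's newContext() options (optional, random example for illustration)
      newContextOptions: {
        geolocation: {
          latitude: 51.50853,
          longitude: -0.12574
        }
      }
    })

    const page = await context.newPage()
    await page.goto("https://abrahamjuliot.github.io/creepjs/")
  }

  main()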

@barjin self-assigned this Jul 21, 2023
@barjin added the t-tooling label Jul 21, 2023
@pedroota

> I think that is why many paid multilogin type service providers use their own modified chromium.

Hi! Do you have any resources on how to modify Chromium? I'm trying to build software exactly like Multilogin.
