Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The time allocated for FrameManager to initialize is not enough (1 second) #2622

Closed
AleksandrsJakovlevsVisma opened this issue May 3, 2024 · 8 comments

Comments

@AleksandrsJakovlevsVisma
Copy link

AleksandrsJakovlevsVisma commented May 3, 2024

Description

We are trying to use Puppeteer-Shart to convert HTML to PDF. Sometimes we could get 200 conversion requests simultaneously. At first, we tried to create a browser and a page for each request but quickly realized that this was a bad idea performance-wise. Our new strategy is to use a pool of pages which means we always use one browser and allow a maximum of 10 pages to be opened.

Still, it does not work. At some point, when a new page is created, we get this exception:

PuppeteerSharp.TargetClosedException: Protocol error (Performance.enable): Session closed. Most likely the Page has been closed.Close reason: Connection failed to process Page.frameStoppedLoading. Timeout of 1000 ms exceeded.

Coupled with stuff like this:

PuppeteerSharp.TargetClosedException: Protocol error(Target.createTarget): Target closed. (Connection failed to process {\""id\"":14,\""result\"":{},\""sessionId\"":\""BED668EF013F206DE76E55157C4B06BC\""}. Object reference not set to an instance of an object.

Reading the code I guess the following is happening:

  1. With such a high number of requests being done at the same time, the server performance is slowed down
  2. At some point, when creating a new page, FrameManager.InitializeAsync is not able to be completed within one second
  3. The connection is then closed with the "Timeout of 1000 ms exceeded" error message
  4. The rest of the requests fail since the browser they were using has a closed connection

Complete minimal example reproducing the issue

This is an approximation of what the pool looks like. Calling GetPageAsync in multiple threads will eventually lead to the above-mentioned exceptions.

using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using PuppeteerSharp;

namespace Test;

public class ChromiumPagePool() : IDisposable
{
    private readonly SemaphoreSlim _destroyPageSemaphore = new(1, 1);
    private readonly ConcurrentQueue<IPage> _pages = [];
    private IBrowser? _browser;
    private SemaphoreSlim? _currentPageCount;
    private int _destroyPageTasks;
    private bool _disposing;

    public async Task InitAsync()
    {
        _browser = await Puppeteer.LaunchAsync(new LaunchOptions
        {
            ExecutablePath = @"C:\Users\aleksandrs.jakovlevs\Downloads\chrome-win\chrome-win\chrome.exe",
            Browser = SupportedBrowser.Chromium,
            Args =
            [
                "--no-sandbox"
            ]
        });

        for (var i = 0; i < 3; i++)
        {
            var page = await _browser.NewPageAsync();
            _pages.Enqueue(page);
        }

        _currentPageCount = new SemaphoreSlim(7, 10);
    }

    public async Task<IPage?> GetPageAsync(CancellationToken token)
    {
        var res = _pages.TryDequeue(out var page);

        if (res && page != null)
        {
            return page;
        }

        await _currentPageCount.WaitAsync(token);

        page = await _browser.NewPageAsync();

        return page;
    }

    public void QueuePageDestruction(IPage page, CancellationToken token)
    {
        var pageToDestroy = page;
        var cancelToken = token;
        Task.Run(() => DestroyPage(pageToDestroy, cancelToken), token);
    }

    private async Task DestroyPage(IPage page, CancellationToken cancelToken)
    {
        Interlocked.Increment(ref _destroyPageTasks);

        if (_disposing)
        {
            page.Dispose();
            return;
        }

        try
        {
            page.Dispose();

            await _destroyPageSemaphore.WaitAsync(cancelToken);

            try
            {
                if (10 - _currentPageCount.CurrentCount <= 3)
                {
                    page = await _browser.NewPageAsync();
                    _pages.Enqueue(page);
                }
                else
                {
                    _currentPageCount.Release();
                }
            }
            finally
            {
                _destroyPageSemaphore.Release();
            }
        }
        finally
        {
            Interlocked.Decrement(ref _destroyPageTasks);
        }
    }

    public void Dispose()
    {
        _disposing = true;

        while (_destroyPageTasks > 0)
        {
            Thread.Sleep(50);
        }

        foreach (var page in _pages.Where(p => !p.IsClosed))
        {
            page.Dispose();
        }

        if (_browser?.IsConnected == true)
        {
            _browser.Dispose();
        }

        _destroyPageSemaphore.Dispose();
        _currentPageCount?.Dispose();

        GC.SuppressFinalize(this);
    }

    ~ChromiumPagePool()
    {
        Dispose();
    }
}

Expected behavior:

All 200 requests successfully convert HTML to PDF

Actual behavior:

At some point conversion fails

Versions

  • 16.0.0
  • .NET 8.0
@kblok
Copy link
Member

kblok commented May 6, 2024

@AleksandrsJakovlevsVisma Are you willing to test some custom build to confirm the fix?

@AleksandrsJakovlevsVisma
Copy link
Author

@kblok Sorry for the delay, I was on vacation.

Yes, I don't mind trying a custom build

@kblok
Copy link
Member

kblok commented May 13, 2024

@AleksandrsJakovlevsVisma I'm exposing the TaskHelper timeout here. Can you see if increasing the timeout there helps?

That's an static class that will affect the entire library.

@AleksandrsJakovlevsVisma
Copy link
Author

@kblok I did the following:

  1. Downloaded that branch as zip and referenced the csproj in my proj
  2. I also did one additional change seen below
    image
  3. In my code I am setting DefaultTimeout to 30 secs
  4. During testing I am no longer seeing TargetClosedException, but now I am getting 504 gate-way timeouts from the API that is using Puppeteer

@kblok
Copy link
Member

kblok commented May 15, 2024

@kblok I did the following:

  1. I also did one additional change seen below

Good catch

  1. During testing I am no longer seeing TargetClosedException, but now I am getting 504 gate-way timeouts from the API that is using Puppeteer

That would be something on your side right?

@AleksandrsJakovlevsVisma
Copy link
Author

AleksandrsJakovlevsVisma commented May 20, 2024

@kblok

Thank you for helping out with this one.

It is the problem on our side, yes. The PDF conversion on average takes 6 seconds, but the request times out after a minute. 200 requests with a max count of simultaneously opened pages = 10 is not enough to make sure each of 200 finishes in a minute. Do you have any advice on how to speed up a simple HTML to PDF conversion? Perhaps there are some magic LaunchOptions args?

In any case, can we expect the configurable default timeout released in the near future?

@kblok
Copy link
Member

kblok commented May 20, 2024

@kblok

Thank you for helping out with this one.

It is the problem on our side, yes. The PDF conversion on average takes 6 seconds, but the request times out after a minute. 200 requests with a max count of simultaneously opened pages = 10 is not enough to make sure each of 200 finishes in a minute. Do you have any advice on how to speed up a simple HTML to PDF conversion? Perhaps there are some magic LaunchOptions args?

PDF generate is quite resource intensive. If you look at the code, most of the time is on the browser side.

In any case, can we expect the configurable default timeout released in the near future?

I can ship it today :)

@AleksandrsJakovlevsVisma
Copy link
Author

@kblok

I can ship it today :)

Nice, we will look forward to it, whenever you'll have the time :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants