Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition that caused LaunchAsync to never resolve for chrome #2214

Merged
merged 15 commits into from Jun 13, 2023

Conversation

leonardo-fernandes
Copy link
Contributor

@leonardo-fernandes leonardo-fernandes commented May 26, 2023

Fixes #2202

Only process received messages after the discovered targets have been stored.

@leonardo-fernandes leonardo-fernandes changed the title Fix https://github.com/hardkoded/puppeteer-sharp/issues/2202 Fix #2202 May 26, 2023
@leonardo-fernandes leonardo-fernandes changed the title Fix #2202 Fix race condition that caused LaunchAsync to never resolve for chrome May 26, 2023
@leonardo-fernandes
Copy link
Contributor Author

From my tests, FirefoxTargetManager doesn't seem affected by the same race condition.

@kblok
Copy link
Member

kblok commented May 26, 2023

I'm taking a few days off. I'll take a look at this on Monday. Thanks!

@leonardo-fernandes
Copy link
Contributor Author

leonardo-fernandes commented May 27, 2023

I spent some hours trying to understand why some unit tests were failing with my previous commit.

It looks like the await inside of the OnMessageReceived was continuing the flow of execution in ProcessIncomingMessage and eventually releasing the semaphore from the TaskQueue which would continue to process the next messages, even though the previous message was still not processed.

I think this is caused by the OnMessageReceived method returning void instead of Task, so if it encounters an asynchronous call the continuation is queued but not returned to the caller, and the caller cannot await from it. Sorry if I can't explain it any better than this :)

I tried making OnMessageReceived synchronous but hit a deadlock. The OnAttachedToTarget terminates with an await Task.WhenAll and two SendAsync tasks. These tasks will only terminate when the return messages are processed. But the thread executing OnAttachedToTarget is waiting for the SendAsync inside of a semaphore, which prevents the processing of more messages.

In my opinion, the code in OnAttachedToTarget shouldn't await for these, and could simply run the tasks with Task.Run. There's nothing being gained from the await because (due to OnMessageReceived returning void) the caller will resume the control flow. There's a few more logic using await inside OnAttachedToTarget which I wouldn't dare to touch, so this is where I stopped the analysis. If we can make OnAttachedToTarget synchronous, then the incoming messages will be processed in sequence and would greatly reduce the chances of race conditions.

@kblok
Copy link
Member

kblok commented May 29, 2023

I'm looking at your findings in #2202. But honestly. This looks more like a hard patch than a solution. Let's see what we can find :) I think it makes sense.

In my opinion, the code in OnAttachedToTarget shouldn't await for these, and could simply run the tasks with Task.Run. There's nothing being gained from the await because (due to OnMessageReceived returning void) the caller will resume the control flow.

The problem here is that I think it's not the only place where we send a message inside an event.
Event handling is tricky in Puppeteer-Sharp. I wanted to get rid of them and use delegates (with tasks). But having multiple listeners is challenging. That's why we need to find alternatives like the AsyncDictionaryHelper.
It's pretty much like you did here. Find the race and try to solve it.

Thank you for your time!

@leonardo-fernandes
Copy link
Contributor Author

The problem here is that I think it's not the only place where we send a message inside an event. Event handling is tricky in Puppeteer-Sharp. I wanted to get rid of them and use delegates (with tasks). But having multiple listeners is challenging. That's why we need to find alternatives like the AsyncDictionaryHelper. It's pretty much like you did here. Find the race and try to solve it.

I know it's tricky, I tried changing the code to be synchronous and hit a few deadlocks. But today I was able to change OnAttachedToTarget to synchronous. I'll commit it to another branch, and I am keen on your feedback because it made all tests pass reliably (except PublishSingleFileTests, for some reason it has never passed in my machine).

@leonardo-fernandes
Copy link
Contributor Author

Ok, I compared the code with the upstream puppeteer, and I believe the upstream code does not await on the OnAttachedToTarget. See https://github.com/puppeteer/puppeteer/blob/f342c26d953e51bf262123e2e97276eff8ee2841/packages/puppeteer-core/src/common/ChromeTargetManager.ts#L174

The call to #onAttachedToTarget (which is an async function) does not await on it.

I also have a clearer understanding of why this might not be an issue in upstream. There are some difference in semantics between async in js and .net. One key difference is that js is single-threaded, so it's not possible in js to have a thread switch between the execution of _targetsIdsForInit.Remove(target.TargetId); and FinishInitializationIfReady();. But in .NET, it's possible that a second thread adds a new element into _targetsIdsForInit after the last element had been removed, and so the check inside FinishInitializationIfReady would fail.

As I'm feeling more confident, and seeing that it improves the test results, I've included the changes for not awaiting OnAttachedToTarget for your consideration. I've also included a missing return, see https://github.com/puppeteer/puppeteer/blob/f342c26d953e51bf262123e2e97276eff8ee2841/packages/puppeteer-core/src/common/ChromeTargetManager.ts#L335, and without which some tests would fail randomly.

@kblok
Copy link
Member

kblok commented May 30, 2023

I'm not being able to get a full green in AppVeyor :(

@leonardo-fernandes
Copy link
Contributor Author

I know. I'm trying to reproduce the failures in my local machine, but so far haven't been able to.
I'll need more time then to understand what's happening.

@leonardo-fernandes
Copy link
Contributor Author

leonardo-fernandes commented Jun 1, 2023

I've been running test cases over and over to detect race conditions / deadlocks. I'll document what I find here. Note that I've added logging in multiple places, and line numbers might not match exactly what's committed.

Today I found the following exception led to a deadlock when executing PuppeteerSharp.Tests.LauncherTests.PuppeteerConnectTests.ShouldBeAbleToCloseRemoteBrowser:

Value cannot be null.
Parameter name: logger
   at Microsoft.Extensions.Logging.LoggerExtensions.Log(ILogger logger, LogLevel logLevel, EventId eventId, Exception exception, String message, Object[] args)
   at Microsoft.Extensions.Logging.LoggerExtensions.LogTrace(ILogger logger, String message, Object[] args)
   at PuppeteerSharp.Browser.CreateTarget(TargetInfo targetInfo, CDPSession session) in C:\src\puppeteer-sharp\lib\PuppeteerSharp\Browser.cs:line 516
   at PuppeteerSharp.ChromeTargetManager.OnTargetCreated(TargetCreatedResponse e) in C:\src\puppeteer-sharp\lib\PuppeteerSharp\ChromeTargetManager.cs:line 195
   at PuppeteerSharp.ChromeTargetManager.OnMessageReceived(Object sender, MessageEventArgs e) in C:\src\puppeteer-sharp\lib\PuppeteerSharp\ChromeTargetManager.cs:line 156

This is what seems to have happened.

  1. ChromeTargetManager is created in Browser.cs before the Browser._logger field is populated.
  2. The ChromeTargetManager constructor attaches the OnMessageReceived listener (again, this is before Browser._logger is populated).
  3. If a message is received in the connection, there's a chance that it is processed before the ChromeTargetManager constructor finishes executing (and before Browser._logger is populated).
  4. If that happens, accessing Browser._logger will throw an exception. I had instrumented Browser.CreateTarget with logging, which caused an exception.
  5. Exception is caught in OnMessageReceived and it calls _connection.Close(). This eventually calls Connection.Dispose() which calls TaskQueue.Dispose(), which waits for the semaphore.
  6. However, the semaphore is held by the same thread (which was processing a message), which causes the deadlock.

Stack of the deadlocked thread (TaskQueue appearing twice):

 	PuppeteerSharp.dll!PuppeteerSharp.Helpers.TaskQueue.Dispose() Line 21	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Connection.Dispose(bool disposing) Line 240	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Connection.Dispose() Line 102	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Browser.Disconnect() Line 191	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Browser.CloseCoreAsync() Line 457	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Browser.CloseAsync() Line 196	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Browser.Connection_Disconnected(object sender, System.EventArgs e) Line 477	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Connection.Close(string closeReason) Line 200	C#
 	PuppeteerSharp.dll!PuppeteerSharp.ChromeTargetManager.OnMessageReceived(object sender, PuppeteerSharp.MessageEventArgs e) Line 172	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Connection.ProcessIncomingMessage(PuppeteerSharp.Messaging.ConnectionResponse obj) Line 323	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Connection.ProcessMessage(PuppeteerSharp.Transport.MessageReceivedEventArgs e) Line 269	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Connection.Transport_MessageReceived.AnonymousMethod__0() Line 244	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Helpers.TaskQueue.Enqueue(System.Func<System.Threading.Tasks.Task> taskGenerator) Line 55	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Connection.Transport_MessageReceived(object sender, PuppeteerSharp.Transport.MessageReceivedEventArgs e) Line 244	C#
 	PuppeteerSharp.dll!PuppeteerSharp.Transport.WebSocketTransport.GetResponseAsync(System.Threading.CancellationToken cancellationToken) Line 190	C#

My thoughts on this:

  • Logger could be initialized sooner in Browser.cs. This seems like a low-risk change.
  • Sending the message "Target.setDiscoverTargets" before the Browser and ChromeTargetManager objects are fully constructed is risky. Maybe consider moving the send to a separate method, that must be called after the constructor?
  • Any exception caught by OnMessageReceived may lead to this deadlock - EDIT: the exception will lead to a deadlock only if Browser.CloseCoreAsync does not await, otherwise the TaskQueue.Dispose() will execute on a separate thread; this in turn seems dependent on the CurrentState, with some implementations such as InitialState triggering the deadlock. How can we avoid the deadlock in this case? Maybe use a ThreadLocal AsyncLocal to signal that the semaphore is held on the same thread async context?

@leonardo-fernandes
Copy link
Contributor Author

Another race condition.

  1. The code in WaitForTargetAsync and TargetManager_TargetAvailable can run in parallel in two distinct threads.
  2. The thread executing WaitForTargetAsync checks for the target in Targets(). If it's not found, then a targetCompletionSource is created to await for the creation of the target.
  3. The code checks again in Targets() for some reason.
  4. At this point (after the second check against Targets(), there's a chance that a thread switch continues executing TargetManager_TargetAvailable.
  5. The code in TargetManager_TargetAvailable calls the TargetCreated event. However, this call happens before the event handlers in the first thread had been added.
  6. When the thread switches back to WaitForTargetAsync, the TargetCreated event handler is added, but it's too late and it won't be invoked. The targetCompletionSource will never resolve, and the code will terminate with a timeout.

Thoughts:

  • I think we should re-check Targets() after attaching the event handlers. If the TargetCreated event is invoked before the event handlers are attached, we will find them by re-checking Targets(). If the event is invoked after, then the targetCompletionSource will do its job.

@jn-foreflight
Copy link

Logger could be initialized sooner in Browser.cs. This seems like a low-risk change.

The race-condition of attaching event handler(s) before populating _logger seems to be present in other classes as well.
Do you mind if I handle all those cases in a separate PR?

Copy link
Member

@kblok kblok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@kblok
Copy link
Member

kblok commented Jun 1, 2023

Do you mind if I handle all those cases in a separate PR?

Sure!

@jnyrup
Copy link
Contributor

jnyrup commented Jun 1, 2023

As I'm feeling more confident, and seeing that it improves the test results, I've included the changes for not awaiting OnAttachedToTarget for your consideration. I've also included a missing return, see puppeteer/puppeteer@f342c26/packages/puppeteer-core/src/common/ChromeTargetManager.ts#L335, and without which some tests would fail randomly.

Great finding 🚀

Consider extracting this a separate PR.
That'll make it easier to track how AppVeyor behaves.

Only process received messages after the discovered targets have been stored.
…sages unchanged.

This fixes some of the unit tests, which were failing due to changes in the order of execution of initialization messages.
…_logger field is initialized.

* Fixed a deadlock that could happen if the connection is closed on the thread that is processing received messages. TaskQueue could not be disposed on the same thread that held the semaphore.
* Fixed a race condition if targets are created/changed concurrently before the TargetHandler is registered as an event handler.
… thread A could invoke `TaskQueue.Dispose()` and set `_isDisposed = 1`, which would then allow thread B to finish work setting `_held = false` but without releasing the semaphore, and then thread A would attempt `_semaphore.Wait()` entering a deadlock.
…t having discovered all targets.

This was causing unit tests such as PuppeteerConnectTests.ShouldSupportTargetFilter to fail because the test executed faster than the target discovery.
@kblok
Copy link
Member

kblok commented Jun 2, 2023

@jnyrup I'll wait for your approval here. AppVeyor seems to go in the green direction.

Copy link
Contributor

@jnyrup jnyrup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look well-argued to me.
I just have some questions to ensure your fixes don't regress anything else.

lib/PuppeteerSharp/ChromeTargetManager.cs Outdated Show resolved Hide resolved
lib/PuppeteerSharp/Browser.cs Show resolved Hide resolved
public ConcurrentDictionary<string, Target> GetAvailableTargets() => _attachedTargetsByTargetId;

public async Task InitializeAsync()
{
_ = _connection.SendAsync("Target.setDiscoverTargets", new TargetSetDiscoverTargetsRequest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a benefit of moving the call to Target.setDiscoverTargets in InitializeAsync instead of having it in the constructor as it is in puppeteer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't fully understand this, so I'll have to run more tests to get a better grip at it. But the test PuppeteerConnectTests.ShouldSupportTargetFilter was failing randomly. This test connects to a browser that already has 3 tabs open.

What I found was that the messages of the targets were being processed concurrently, even before the event handlers in AttachAsync were added. Again, ultimately this is due to the multi-threaded .NET vs single-threaded JS difference, as in JS the constructor and the event handlers would be added without any concurrent code.

I will get back to you with a more thorough explanation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the addition of _targetDiscoveryCompletionSource to ensure it has completed before anything else tries to consume _targetsIdsForInit which StoreExistingTargetsForInit populates.

Copy link
Contributor

@jnyrup jnyrup Jun 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea, if we move Target.setDiscoverTargets to InitializeAsync anyways, can we use modern async/await syntax and avoid _targetDiscoveryCompletionSource or am I missing some concurrency again?

public async Task InitializeAsync()
{
    var setDiscoverTargetsTask = _connection.SendAsync("Target.setDiscoverTargets", new TargetSetDiscoverTargetsRequest
    {
        Discover = true,
        Filter = new[]
        {
            new TargetSetDiscoverTargetsRequest.DiscoverFilter()
            {
                Type = "tab",
                Exclude = true,
            },
            new TargetSetDiscoverTargetsRequest.DiscoverFilter(),
        },
    });

    await _connection.SendAsync("Target.setAutoAttach", new TargetSetAutoAttachRequest()
    {
        WaitForDebuggerOnStart = true,
        Flatten = true,
        AutoAttach = true,
    }).ConfigureAwait(false);

    try
    {
        await setDiscoverTargetsTask.ConfigureAwait(false);
        StoreExistingTargetsForInit();
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Target.setDiscoverTargets failed");
    }

    FinishInitializationIfReady();

    await _initializeCompletionSource.Task.ConfigureAwait(false);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnyrup I have reverted these changes and couldn't reproduce any failure in 100 executions of PuppeteerConnectTests.ShouldSupportTargetFilter. Let's see how it behaves in AppVeyor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out awaiting for _targetDiscoveryCompletionSource is needed before calling FinishInitializationIfReady(), otherwise FinishInitializationIfReady could execute before any targets have been added to the _targetsIdsForInit collection and the initialization would be considered completed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with how it's done now 👍

... but if we moved _connection.SendAsync("Target.setDiscoverTargets" to InitializeAsync again and awaited it, like in my example above, I'm failing to see how we can reach FinishInitializationIfReady before setDiscoverTargetsTask has completed, which should remove the need for _targetDiscoveryCompletionSource.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnyrup had a closer look at your code above.

The change seems fine, and it simplifies a few things - for example it uses a try/catch which is more natural than checking t.IsFaulted. It's also closer to how FirefoxTargetManager works.

However, the _targetDiscoveryCompletionSource is still needed, to signal other threads that may be processing a message and executing OnAttachedToTarget.

It also has the downside of diverging from upstream.

I can commit these changes if you agree it's a better solution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, it's those events messing with my brain again...
I now (again) see why _targetDiscoveryCompletionSource still is necessary.

I'll let kblok decide whether to call _connection.SendAsync("Target.setDiscoverTargets" in the constructor of InitializeAsync.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that in this PR, _targetDiscoveryCompletionSource is more important than the location of Target.setDiscoverTargets. I would leave Target.setDiscoverTargets in the constructor as we have in upstream.

We are making some important changes (good changes). But it would be nice to see if we do need to move it to InitializeAsync in isolation. If we see that we need it, we can move it in a new PR.

This PR is looking great. I would merge it after the new WithTimeouts. Thank you @leonardo-fernandes, @jnyrup for the hard work here.

…hat new targets are added to `_attachedTargetsByTargetId` inside of the semaphore.

Also fixes `Page.CloseAsync()` which was returning before `Target.CloseTask` resolved. This affected BrowserContextTests.ShouldFireTargetEvents on which it was possible for the test to finish before the `TargetDestroy` event.
It was possible for the InitializeAsync to finish without all targets being initialized, and consequently the test would read an empty list of targets.

The _targetDiscoveryCompletionSource should be awaited before logic that depends on _targetsIdsForInit inside of message processing, to make sure this collection was already initialized during the browser launch.
…stingPage.

Disposing the `browser1` was closing the page, which then caused the `Page.CloseAsync()` in `PuppeteerPageBaseTest` to fail.

The test code now matches upstream puppeteer.
@leonardo-fernandes
Copy link
Contributor Author

leonardo-fernandes commented Jun 12, 2023

Anyone has any idea why FrameWaitForFunctionTests.ShouldPollOnMutation is failing in ubuntu? I can't find any explanation for that failure, unless something is really broken with Page.WaitForFunctionAsync() and it's not waiting for the poller to return a result.

I tried reproducing it on an ubuntu VM and was not able to get it to fail.

EDIT: what were the chances? after adding some logging, this test failed in my local machine for the first time.

What is happening is that the first WaitForFunctionAsync() in the test is not awaited. So we have no guarantee of when it's executed. WaitForFunctionAsync is itself broken down in multiple messages - Runtime.callFunctionOn to define the poller, and a second Runtime.callFunctionOn that calls poller.start().

When the test fails, it is because the first Page.EvaluateExpressionAsync() is executed between the two Runtime.callFunctionOn messages of the mutation poller. When poller.start() executes, the window.__FOO variable has already been set, so the mutation poller immediately resolves the promise without any mutation.

This looks like it's mostly an issue with the test itself, and not related to the changes I've done so far. So I'll think on how it can be fixed, and create a separate PR if I find a good solution that doesn't require diverging from upstream so much.

Comment on lines +686 to +691
await Client.Connection.SendAsync("Target.closeTarget", new TargetCloseTargetRequest
{
TargetId = Target.TargetId,
}).ContinueWith(task => ((Target)Target).CloseTask, TaskScheduler.Default);
}
}).ConfigureAwait(false);

_logger.LogWarning("Protocol error: Connection closed. Most likely the page has been closed.");
return Task.CompletedTask;
await ((Target)Target).CloseTask.ConfigureAwait(false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm noticing that we now only execute ((Target)Target).CloseTask if Client.Connection.SendAsync("Target.closeTarget" completes successfully.
Should we still execute ((Target)Target).CloseTask if Client.Connection.SendAsync("Target.closeTarget" throws an exception?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnyrup this matches upstream

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct, I based it on upstream.

If Target.closeTarget fails, it's very likely that TargetManager_TargetGone will not be invoked, and the await of CloseTask would block the code indefinitely.

public ConcurrentDictionary<string, Target> GetAvailableTargets() => _attachedTargetsByTargetId;

public async Task InitializeAsync()
{
_ = _connection.SendAsync("Target.setDiscoverTargets", new TargetSetDiscoverTargetsRequest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with how it's done now 👍

... but if we moved _connection.SendAsync("Target.setDiscoverTargets" to InitializeAsync again and awaited it, like in my example above, I'm failing to see how we can reach FinishInitializationIfReady before setDiscoverTargetsTask has completed, which should remove the need for _targetDiscoveryCompletionSource.

{
_discoveredTargetsByTargetId.TryRemove(e.TargetId, out var targetInfo);
await _targetDiscoveryCompletionSource.Task.ConfigureAwait(false);
Copy link
Member

@kblok kblok Jun 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although we are trying to enforce this. What do you think about using WithTimeout, and throwing an exception? If we fail to complete this task somehow, the code will get stuck here.

The same for the other places

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that _targetDiscoveryCompletionSource will be set if and only if the message Target.setDiscoverTargets completes. I can't think of any scenario where it wouldn't complete.

There's no code that uses a timeout value in ChromeTargetManager. Where should I derive the timeout value from?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use Puppeteer.DefaultTimeout. It's quite a long timeout. We want to prevent the code from hanging.

@kblok
Copy link
Member

kblok commented Jun 12, 2023

image

I don't remember the last time I saw this in green! <3

@jnyrup
Copy link
Contributor

jnyrup commented Jun 12, 2023

@leonardo-fernandes Since you seem to excel at this concurrency stuff, you might also want to give FrameManager a similar look (after this PR is merged), to see if it suffers a similar race-condition as the one you found in ChromeTargetManager.

…nSource`, as a defensive measure against deadlocks.
Comment on lines 316 to 332
await _targetDiscoveryCompletionSource.Task.ConfigureAwait(false);
await EnsureTargetsIdsForInit().ConfigureAwait(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is way better than looking for the variable every time. Thanks! Let's wait for the green!

@kblok
Copy link
Member

kblok commented Jun 12, 2023

@leonardo-fernandes @jnyrup I consider this PR approved. Let's create a follow-up PR if you find more things to improve.

Side Note: You can't image, well, I bet you @jnyrup can imagine what it means for a community project to have more developers looking at this code. Thank you!

@kblok kblok merged commit e0df7f7 into hardkoded:master Jun 13, 2023
3 of 5 checks passed
leonardo-fernandes added a commit to leonardo-fernandes/puppeteer-sharp that referenced this pull request Jun 13, 2023
hardkoded#2214)

* Fix hardkoded#2202
Only process received messages after the discovered targets have been stored.

* Change the await to only apply to attachedToTarget, leaving other messages unchanged.
This fixes some of the unit tests, which were failing due to changes in the order of execution of initialization messages.

* Remove the await for OnAttachedToTarget call, and also included a missing return when ignoring a target.

* * Fixed a race condition if a message is received before the Browser._logger field is initialized.
* Fixed a deadlock that could happen if the connection is closed on the thread that is processing received messages. TaskQueue could not be disposed on the same thread that held the semaphore.
* Fixed a race condition if targets are created/changed concurrently before the TargetHandler is registered as an event handler.

* Previous commit introduced a new race condition. It was possible that thread A could invoke `TaskQueue.Dispose()` and set `_isDisposed = 1`, which would then allow thread B to finish work setting `_held = false` but without releasing the semaphore, and then thread A would attempt `_semaphore.Wait()` entering a deadlock.

* It was possible for the TargetManager initialization to finish without having discovered all targets.
This was causing unit tests such as PuppeteerConnectTests.ShouldSupportTargetFilter to fail because the test executed faster than the target discovery.

* PR review

* Rolling back Target.setDiscoverTargets to be sent from the constructor

* Handle exceptions in OnAttachedToTarget

* OnAttachedToTarget should be executed synchronously if possible, so that new targets are added to `_attachedTargetsByTargetId` inside of the semaphore.

Also fixes `Page.CloseAsync()` which was returning before `Target.CloseTask` resolved. This affected BrowserContextTests.ShouldFireTargetEvents on which it was possible for the test to finish before the `TargetDestroy` event.

* Fix PuppeteerConnectTests.ShouldSupportTargetFilter.

It was possible for the InitializeAsync to finish without all targets being initialized, and consequently the test would read an empty list of targets.

The _targetDiscoveryCompletionSource should be awaited before logic that depends on _targetsIdsForInit inside of message processing, to make sure this collection was already initialized during the browser launch.

* Fix OOPIFTests.ShouldDetectExistingOopifsWhenPuppeteerConnectsToAnExistingPage.

Disposing the `browser1` was closing the page, which then caused the `Page.CloseAsync()` in `PuppeteerPageBaseTest` to fail.

The test code now matches upstream puppeteer.

* Revert unintentional line ending changes.

* Use the launcher timeout when awaiting for `_targetDiscoveryCompletionSource`, as a defensive measure against deadlocks.
@danports
Copy link

Wow @leonardo-fernandes et al. - thanks for all of the effort here! Any idea when a 10.x release might land with this fix? I'm excited about not having to manually kill hung tasks!

@kblok
Copy link
Member

kblok commented Jun 19, 2023

@danports I'm shipping this tomorrow!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Puppeteer.LaunchAsync hangs every once in a while waiting for a page to open
5 participants