
fix: change ASAR archive cache to per-process to fix leak #29293

Merged
merged 5 commits into main on Jun 4, 2021

Conversation

dsanders11
Member

@dsanders11 dsanders11 commented May 23, 2021

Description of Change

Close #29292.

This PR changes the ASAR archive cache in shell/common/asar/asar_util.cc to be process-wide instead of thread-local, to prevent asar::Archive instances from being leaked on worker thread exit. See issue #29292 for the technical details of the bug.

Changing the cache to be process-wide means thread-safety had to be added to asar::Archive and asar::GetOrCreateAsarArchive. This is (unfortunately?) accomplished with locks, since asar::GetOrCreateAsarArchive is used in both async (AsarURLLoader) and sync (NativeImage) contexts. I know locks are frowned upon in Chromium, but using them here allows the code to keep being used from the existing synchronous call sites.

Other changes include making asar::IsDirectoryCached thread-safe (it wasn't before), and removing all current usages of asar::ClearArchives, since they were in spots that weren't really accomplishing anything (the processes are being torn down anyway). asar::ClearArchives was left in the code base, though, for potential future use in manually clearing the cache. The asar::Archive::header method was also removed since it is not currently used, and it could be used to modify the parsed header, which would make asar::Archive not thread-safe.
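
For illustration, here is a minimal sketch of the process-wide cache idea. It is not the exact code in this PR; the GetArchiveCacheLock()/GetArchiveCache() helper names and the Archive::Init() usage are assumptions made for the sketch.

```cpp
// Minimal sketch (not the code that landed): one process-wide ArchiveMap
// guarded by a base::Lock, replacing the previous thread-local map.
#include <map>
#include <memory>

#include "base/files/file_path.h"
#include "base/no_destructor.h"
#include "base/synchronization/lock.h"
#include "shell/common/asar/archive.h"

namespace asar {

namespace {

using ArchiveMap = std::map<base::FilePath, std::shared_ptr<Archive>>;

base::Lock& GetArchiveCacheLock() {
  static base::NoDestructor<base::Lock> lock;
  return *lock;
}

ArchiveMap& GetArchiveCache() {
  static base::NoDestructor<ArchiveMap> cache;
  return *cache;
}

}  // namespace

std::shared_ptr<Archive> GetOrCreateAsarArchive(const base::FilePath& path) {
  // Both the async (AsarURLLoader) and sync (NativeImage) callers take this
  // lock, so the single map can be shared safely across threads.
  base::AutoLock guard(GetArchiveCacheLock());
  ArchiveMap& cache = GetArchiveCache();

  auto iter = cache.find(path);
  if (iter != cache.end())
    return iter->second;

  auto archive = std::make_shared<Archive>(path);
  if (!archive->Init())  // assumed to open the file and parse the ASAR header
    return nullptr;

  cache[path] = archive;
  return archive;
}

}  // namespace asar
```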

Alternatives Considered

Quite a few alternatives were considered, in an attempt to avoid locking for thread-safety, in no particular order:

Use base::ThreadLocalOwnedPointer To Fix Leak

This was initially attempted as the most concise fix for the leak, as it destructs the cache on thread exit. However, the change led to a crash on thread exit, through the File::Close code path. This is due to another piece of thread-local storage in the Chromium code base, which appears to have already been destructed by the time the archive cache is being destructed, yet the code in File::Close still needs it. Ensuring the order of destruction of disconnected thread-local storage seems fragile and bug-prone. Per-thread caches are also redundant, and they make it very difficult to clear the cache.

Eliminate Archive Cache Altogether

The bug could be fixed by simply eliminating the archive cache and letting each interaction with an ASAR archive load and parse the header. However, parsing the header can be rather slow; in my testing I was seeing ~40ms with an ASAR archive header of about 850KB. Eliminating the cache would lead to a lot of redundant processing and more resource thrash, since each request would temporarily store the parsed header in memory and open a new file handle.

Don't Use Archive Cache for Sync Code Paths, Use A Sequence Instead of Locks

This has the same shortcomings as eliminating the cache, but isolated to the sync code (like NativeImage). The sync code path is accessed from JavaScript, so it likely benefits from the cache consistently today, since all access happens on the same thread. As such, not using the archive cache would be a performance regression for those code paths.

Use base::SequenceLocalStorageSlot Instead of Thread-Local Storage

base::SequenceLocalStorageSlot is like thread-local storage, but per-sequence. That means all worker threads running the same sequence would share the same archive cache, and it would be destructed when the sequence is destructed. This approach isn't viable because AsarURLLoader runs on a new sequence for each request, and running all the loaders on the same sequence could be a bottleneck.

cc @nornagon, @zcbenz

Checklist

Release Notes

Notes: Fixed memory leak when requesting files in ASAR archive from renderer

@electron-cation electron-cation bot added the new-pr 🌱 PR opened in the last 24 hours label May 23, 2021
@zcbenz zcbenz added semver/patch backwards-compatible bug fixes target/12-x-y labels May 24, 2021
@electron-cation electron-cation bot removed the new-pr 🌱 PR opened in the last 24 hours label May 24, 2021
@deepak1556
Member

Nice analysis of the issue!

Putting aside the process-wide archive cache solution for a moment, regarding the ThreadLocalOwnedPointer crash: isn't it a bug in the Chromium code to use the TLS value that has already been freed?

The call to File::Close runs a ScopedBlockingCall, which relies on some TLS keys; the one we are interested in for the crash is the BlockingObserver. This observer slot value is basically a delegate to the worker thread, which is filled in when the worker thread is run and is also owned by the worker thread; however, the slot value is not cleared when the worker thread exits. I think ThreadGroupImpl::WorkerThreadDelegateImpl::OnMainExit should make a call to ClearBlockingObserverForCurrentThread() in order to avoid the bad access.
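
A rough sketch of where that call could go (the file, signature, and surrounding code are assumptions, not a verbatim Chromium patch):

```cpp
// Sketch only: clear the BlockingObserver TLS slot when a thread pool
// worker's main function exits, so TLS destructors that run afterwards and
// end up in a ScopedBlockingCall (e.g. via File::Close) don't dereference a
// delegate the worker has already released.
void ThreadGroupImpl::WorkerThreadDelegateImpl::OnMainExit(
    WorkerThread* worker) {
  // ... existing OnMainExit cleanup ...

  // Clear the slot that was filled in when this worker started running.
  ClearBlockingObserverForCurrentThread();
}
```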

@dsanders11
Member Author

Putting aside the process-wide archive cache solution for a moment, regarding the ThreadLocalOwnedPointer crash: isn't it a bug in the Chromium code to use the TLS value that has already been freed?

It does seem like one since it's a bad access crash, yea.

I think ThreadGroupImpl::WorkerThreadDelegateImpl::OnMainExit should make a call to ClearBlockingObserverForCurrentThread() in order to avoid the bad access.

I've added that change locally, and it does indeed fix the bad access crash on thread exit. (Maybe submit that fix to Chromium?)

Interestingly enough, while using base::ThreadLocalOwnedPointer along with that Chromium patch does fix the leak, the memory usage remains at a higher level in my testing than it does with the change in this PR. I would have expected a little more memory usage since there could be multiple copies of the cache, but given that the worker pool dwindles down to two workers when idle, I wouldn't have thought the extra memory usage would be much more. So, initially unclear why the memory usage differs noticeably between the two after running requests. (Even with either fix there's still a slow memory usage growth when running batches of 1500 requests, but it's very small compared to this leak).

Here's some data on memory usage: the first table is the fix in this PR, the second table is base::ThreadLocalOwnedPointer with the Chromium patch. Each iteration was recorded once the total number of threads hit 27-29. Iteration zero is with dev tools open to the console.

| Iteration | Memory | Real Memory |
| --- | --- | --- |
| 0 | 43.3 MB | 146.6 MB |
| 1 | 60.8 MB | 172.3 MB |
| 2 | 60.9 MB | 176.5 MB |
| 3 | 61.6 MB | 179.5 MB |
| 4 | 61.6 MB | 179.6 MB |
| 5 | 61.7 MB | 180.1 MB |

| Iteration | Memory | Real Memory | Open File Handles to ASAR |
| --- | --- | --- | --- |
| 0 | 44.4 MB | 145.4 MB | 0 |
| 1 | 89.9 MB | 240.6 MB | 2 |
| 2 | 86.3 MB | 236.5 MB | 1 |
| 3 | 91.8 MB | 251.7 MB | 1 |
| 4 | 94.0 MB | 267.6 MB | 1 |
| 5 | 89.7 MB | 274.8 MB | 1 |

Here's the `base::ThreadLocalOwnedPointer` patch for completeness.
```diff
diff --git a/shell/common/asar/asar_util.cc b/shell/common/asar/asar_util.cc
index 65ddbf284..f43d9340d 100644
--- a/shell/common/asar/asar_util.cc
+++ b/shell/common/asar/asar_util.cc
@@ -10,6 +10,7 @@
 #include "base/files/file_path.h"
 #include "base/files/file_util.h"
 #include "base/lazy_instance.h"
+#include "base/memory/ptr_util.h"
 #include "base/stl_util.h"
 #include "base/threading/thread_local.h"
 #include "base/threading/thread_restrictions.h"
@@ -21,7 +22,7 @@ namespace {
 
 // The global instance of ArchiveMap, will be destroyed on exit.
 typedef std::map<base::FilePath, std::shared_ptr<Archive>> ArchiveMap;
-base::LazyInstance<base::ThreadLocalPointer<ArchiveMap>>::Leaky
+base::LazyInstance<base::ThreadLocalOwnedPointer<ArchiveMap>>::Leaky
     g_archive_map_tls = LAZY_INSTANCE_INITIALIZER;
 
 const base::FilePath::CharType kAsarExtension[] = FILE_PATH_LITERAL(".asar");
@@ -41,7 +42,7 @@ bool IsDirectoryCached(const base::FilePath& path) {
 
 std::shared_ptr<Archive> GetOrCreateAsarArchive(const base::FilePath& path) {
   if (!g_archive_map_tls.Pointer()->Get())
-    g_archive_map_tls.Pointer()->Set(new ArchiveMap);
+    g_archive_map_tls.Pointer()->Set(base::WrapUnique(new ArchiveMap));
   ArchiveMap& map = *g_archive_map_tls.Pointer()->Get();
 
   // if we have it, return it
@@ -61,8 +62,9 @@ std::shared_ptr<Archive> GetOrCreateAsarArchive(const base::FilePath& path) {
 }
 
 void ClearArchives() {
-  if (g_archive_map_tls.Pointer()->Get())
-    delete g_archive_map_tls.Pointer()->Get();
+  if (g_archive_map_tls.Pointer()->Get()) {
+    g_archive_map_tls.Pointer()->Set(nullptr);
+  }
 }
 
 bool GetAsarArchivePath(const base::FilePath& full_path,
```

@deepak1556
Member

I've added that change locally, and it does indeed fix the bad access crash on thread exit. (Maybe submit that fix to Chromium?)

Yup, it is a good candidate. Please give upstreaming it a try.

So, initially unclear why the memory usage differs noticeably between the two after running requests.

The Heap Profiler from Chromium should help identify the difference here. You would need this change to start the profiler.

I do like the thread-scoped archive map for its simple lifetime and lock-free code in I/O sections. But the process-wide map does show its advantage in memory usage. I am open to either solution; will wait for other maintainers to chime in.

@dsanders11
Member Author

dsanders11 commented May 24, 2021

Yup, it is a good candidate. Please give upstreaming it a try.

I was suggesting you be the one to upstream it. 😄 I just randomly stuck the call into ThreadGroupImpl::WorkerThreadDelegateImpl::OnMainExit on your advice to see if it worked; I think you've got a better handle on that part of the Chromium code base, I just know enough to hurt myself.

The Heap Profiler from Chromium should help identify the difference here. You would need this change to start the profiler.

Thanks for the info. I'm not inclined to spend any more time on this for now; I've already sunk a good amount of time into debugging the issue and making sure I fully understood it for the bug report write-up. Anyone else, feel free; all the info needed and the patches are here.

I do like the thread-scoped archive map for its simple lifetime and lock-free code in I/O sections. But the process-wide map does show its advantage in memory usage. I am open to either solution; will wait for other maintainers to chime in.

I like the lock-free nature of the thread-scoped map, which is why I examined quite a few alternatives before settling on this PR's final form; I would have preferred to avoid the explicit locking.

However, as mentioned in the original comment, the lifetime of the threads where the cache would be most useful (thread pool workers) significantly diminishes cache hits. Those threads are constantly destroyed, so cache misses happen often and the ASAR needs to be parsed again on a request, meaning you don't gain anything from the cache in those situations. As the second table in my previous comment shows (open file handles are a proxy for how many per-thread caches exist), after 30 or so seconds there's only one cache remaining (I even saw it drop back to 0 at one point in testing). A single request might benefit from the cache, but if a handful of requests are sent at once (fairly common usage), only one would benefit; the others will all parse the ASAR, using memory and file handles in the meantime.

It's also important to consider the code paths which use this cache. The JavaScript-land fs support for ASARs does use asar::Archive, but it doesn't use asar::GetOrCreateAsarArchive, so it doesn't use this cache; instead it caches archives in the same style (a map keyed by path) in JavaScript land. The code paths which do use this cache are AsarURLLoader and some sync code, namely nativeImage.createFromPath. I believe that last case executes on the main thread (correct me if I'm wrong), so it wouldn't be accessing the thread-scoped caches created by AsarURLLoader, and AsarURLLoader can't access the thread-scoped cache nativeImage.createFromPath creates.

Finally, the distributed nature of the thread-scoped cache makes it difficult (if not impossible?) to ever provide manual control over the caching. This appears to be a use case some developers want (see #28368 and node-fs/#876), because the open file handles to the ASAR file prevent them from doing certain things. With a centralized cache there's at least hope of providing a way to free those file handles.
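
For example, with a process-wide cache a manual ClearArchives() could look roughly like the following; this reuses the hypothetical GetArchiveCacheLock()/GetArchiveCache() helpers from the sketch in the PR description and is not the code in this PR.

```cpp
// Hypothetical sketch: clearing the process-wide cache on demand releases the
// cached Archive objects, and with them the open ASAR file handles, once no
// in-flight request still holds a shared_ptr to them.
void ClearArchives() {
  base::AutoLock guard(GetArchiveCacheLock());
  GetArchiveCache().clear();
}
```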

Member

@zcbenz zcbenz left a comment

Since it does not affect the fs module, there should be no performance penalty; I'm good with making the cache per-process.

Member

@deepak1556 deepak1556 left a comment

LGTM 👍

Tested AsarURLLoader thread ticks before and after this change with a batch of requests from a relatively large ASAR file; the PR didn't regress either the sync or async scenario, and timing is better with this change.

Trace code
```diff
diff --git a/shell/browser/net/asar/asar_url_loader.cc b/shell/browser/net/asar/asar_url_loader.cc
index 394d0f9db..3aca3e6ae 100644
--- a/shell/browser/net/asar/asar_url_loader.cc
+++ b/shell/browser/net/asar/asar_url_loader.cc
@@ -9,6 +9,7 @@
 #include <utility>
 #include <vector>
 
+#include "base/trace_event/typed_macros.h"
 #include "base/strings/stringprintf.h"
 #include "base/task/post_task.h"
 #include "base/task/thread_pool.h"
@@ -90,6 +91,7 @@ class AsarURLLoader : public network::mojom::URLLoader {
            mojo::PendingReceiver<network::mojom::URLLoader> loader,
            mojo::PendingRemote<network::mojom::URLLoaderClient> client,
            scoped_refptr<net::HttpResponseHeaders> extra_response_headers) {
+    TRACE_EVENT_BEGIN("electron", "AsarURLLoader::Start");
    auto head = network::mojom::URLResponseHead::New();
    head->request_start = base::TimeTicks::Now();
    head->response_start = base::TimeTicks::Now();
@@ -264,6 +266,7 @@ class AsarURLLoader : public network::mojom::URLLoader {
  }
 
  void MaybeDeleteSelf() {
+    TRACE_EVENT_END("electron");
    if (!receiver_.is_bound() && !client_.is_bound())
      delete this;
  }
```

trace_logs.zip

@dsanders11
Member Author

Great, thanks @deepak1556 for the extra testing with tracing. I'd like @nornagon to get a chance to comment, since he was recently in this part of the code base with the ASAR memory-mapped file work.

@dsanders11
Member Author

PR rebased and deleted some #include "shell/common/asar/asar_util.h" lines which are now unused.

@zcbenz zcbenz changed the title fix: change ASAR archive cache to per-process to fix leak (#29292) fix: change ASAR archive cache to per-process to fix leak Jun 1, 2021
Resolved review threads: shell/common/asar/archive.h, shell/common/asar/archive.cc, shell/common/asar/asar_util.cc
@dsanders11
Member Author

@nornagon, ready for re-review. 👍

@zcbenz zcbenz merged commit b1d1ac6 into electron:main Jun 4, 2021
@release-clerk

release-clerk bot commented Jun 4, 2021

Release Notes Persisted

Fixed memory leak when requesting files in ASAR archive from renderer

@trop
Contributor

trop bot commented Jun 4, 2021

I was unable to backport this PR to "12-x-y" cleanly;
you will need to perform this backport manually.

@trop
Contributor

trop bot commented Jun 4, 2021

I have automatically backported this PR to "13-x-y", please check out #29535

@trop trop bot removed the target/13-x-y label Jun 4, 2021
@trop
Contributor

trop bot commented Jun 4, 2021

I have automatically backported this PR to "14-x-y", please check out #29536

dsanders11 added a commit to dsanders11/electron that referenced this pull request Jun 5, 2021
…9293)

* fix: change ASAR archive cache to per-process to fix leak (electron#29292)

* chore: address code review comments

* chore: tighten up thread-safety

* chore: better address code review comments

* chore: more code review changes
@trop
Contributor

trop bot commented Jun 5, 2021

@dsanders11 has manually backported this PR to "12-x-y", please check out #29548

Successfully merging this pull request may close these issues.

[Bug]: Main Process Can Leak Memory/File Handles/Temp Files When Renderer Requests Files From ASAR