
Roadmap: cache and sweeping improvements #6539

Open · kyranet opened this issue Aug 27, 2021 · 3 comments

@kyranet (Member) commented Aug 27, 2021

Starting with Discord.js v13.0.0, we removed a lot of add-ons, such as Channel#_typing (#6114) and User#lastMessageId/User#lastMessageChannelId (#6046). Those changes have greatly reduced Discord.js' memory footprint.

With Discord adding archivable thread channels, we saw the need for a second sweeper alongside the message sweeper. In an attempt to generalize cache sweeping, we added custom caching (#6013) and later sweeping caches (#6110). The latter (SweptCollection, now LimitedCollection) is the new alternative for sweeping; its biggest improvement was letting us sweep even deeply nested collections (such as client -> channel -> message -> reactions). But as much as it helped, it has a rebound effect: it actually increases memory usage when used.

A LimitedCollection differs from Collection by several properties:

  • maxSize, which is usually a small integer.
  • keepOverLimit, which is a function, often a reference.
  • sweepFilter, which is also a function, often a reference.
  • interval, which is either a timer (with a lot of tracking properties) or null.
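
For reference, this is roughly what instantiating one looks like in v13 (the option values below are illustrative, not recommendations):

const { LimitedCollection } = require('discord.js');

const cache = new LimitedCollection({
	maxSize: 200, // stored on every instance
	keepOverLimit: (message) => message.pinned, // a function reference per instance
	sweepFilter: LimitedCollection.filterByLifetime({ lifetime: 300 }), // another reference
	sweepInterval: 120, // spawns a timer per instance when set
});

Every one of those fields lives on every single instance, so a bot holding thousands of these collections pays that overhead thousands of times.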

If you were to use LimitedCollection in a heavily duplicated manager such as ReactionManager, which also stays empty most of the time, it would use far more memory than it should, despite the collection's very purpose being to reduce it.

Additionally, if a LimitedCollection is used, a lot of extra memory goes to the FinalizationRegistry, which hooks into the runtime's garbage collector to clear the timers in said collections so that they don't leak memory. However, this brings a serious issue: the cleanup is bound in every single instance, meaning a full function (which has a large memory overhead) is created per instance, then stored in both a Set and in the finalizer itself, alongside a message and a name:

let cleanup = this._cache[_cleanupSymbol]?.();
if (cleanup) {
  cleanup = cleanup.bind(this._cache);
  client._cleanups.add(cleanup);
  client._finalizers.register(this, {
    cleanup,
    message:
      `Garbage collection completed on ${this.constructor.name}, ` +
      `which had a ${this._cache.constructor.name} of ${this.holds.name}.`,
    name: this.constructor.name,
  });
}

As a side note, sweep intervals are cancelled a bit late: if you set the interval to 2 minutes, chances are that once its manager is no longer accessible, it will run at least twice more, as the GC usually runs every 5 or 10 minutes, and of course we don't want to be sweeping inaccessible memory.

With this, LimitedCollection became what it swore to destroy.

Following the discussion in the Koyamies caching hell (874411732527886336) thread (from #library-discussion), @ckohen and I discussed a few alternatives to this caching hell, hoping to bring v13's memory usage down.


1. Global caches

Everything would be easily accessible in the first layer, with no nested iteration; everything would be accessed as client.caches.channels, client.caches.messages, client.caches.reactions, and so on. This is the most memory-efficient approach, and it works especially well with the planned Structure#data (more information about this soon!), as we'd access structures directly without needing to go through their parents (such as accessing a Message without its parent Channel); its ECS-like approach of removing the relationships (and thus promoting statelessness) makes everything far more efficient.
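
To make the access-pattern difference concrete (the client.caches properties and the composite reaction key are assumed shapes for this proposal, not a final API):

// Today: three nested cache hops to reach a reaction (assuming a text channel).
const nested = client.channels.cache.get(channelId)
	?.messages.cache.get(messageId)
	?.reactions.cache.get(emojiId);

// With global caches: one flat lookup (key shape hypothetical).
const flat = client.caches.reactions.get(`${messageId}:${emojiId}`);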

However, it has its drawbacks, all of which relate to supporting scoped caches:

  • Scoped caches (such as channel.messages.cache) would need to be backed by either an Array or a Set of the structures' IDs, so when we do an operation, we need to check whether the key is stored in the backing cache before accessing it, in this fashion (a fuller sketch follows this list):

    get(key) {
      return this.keys.has(key) ? client.caches.messages.get(key) : undefined;
    }
  • Deleting structures should delete all of their dependencies, so if a channel is deleted, all of its messages, permission overwrites, etc. should also be deleted. We can achieve something similar with a destroy() method (akin to a destructor in other languages):

    destroy() {
      this.messages.destroy();
      this.permissionOverwrites.destroy();
    }

    With CachedManager#destroy() being, for simplicity:

    destroy() {
      for (const value of this.values()) {
        value.destroy();
      }
    }

    ⚠️ However, the dependencies make it sort of stateful. We might be able to emit events with the cache changes regardless of state, but we'd have to be careful not to cache structures if their dependents aren't available. How do we handle this?

  • (Key) cache duplication: since we'd store both global and scoped keys, there's a slight increase in memory, especially for large caches.
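
As a fuller sketch of the scoped-cache idea from the first bullet (ScopedCache is a hypothetical name; it stores only keys and delegates all value storage to the global cache):

class ScopedCache {
	constructor(globalCache) {
		this.keys = new Set(); // only IDs are duplicated, never the values
		this.global = globalCache; // e.g. client.caches.messages
	}

	get(key) {
		return this.keys.has(key) ? this.global.get(key) : undefined;
	}

	set(key, value) {
		this.keys.add(key);
		return this.global.set(key, value);
	}

	destroy() {
		// Removing the scope also removes its entries from the global cache.
		for (const key of this.keys) this.global.delete(key);
		this.keys.clear();
	}
}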

With the Structure#data proposal, we could generate managers (and their caches) on the fly rather than requiring them to be always instantiated. This can lower memory usage and put us on par with other libraries, as discord.js would use nearly as little memory as required to store the data Discord sends us. However, this comes at the expense of CPU time, and we'd ideally want to avoid issues similar to the one described in #6248, so it needs further investigation.
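
Generating a manager on the fly could be as simple as a getter (a sketch under the Structure#data assumption; the MessageManager constructor arguments shown are illustrative):

class TextChannel extends Channel {
	get messages() {
		// Nothing is retained on the channel: the manager is created on
		// access and becomes garbage right after use.
		return new MessageManager(this.client, this.id);
	}
}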

One could argue that the performance cost of scoped caches is amortized, as a global sweeper wouldn't need to traverse many structures (see below), and several Discord.js internals benefit from this (no need to map arrays into collections when building the structures, yay!), especially since v14 aims to be less dependent on state. So unless the user requires scoped values very frequently, the performance benefits outweigh the long-term drawbacks.

2. Global sweepers

messageSweepInterval and messageCacheLifetime were very efficient: a single interval was used for everything, no FinalizationRegistry was required, no clean-ups, and no binding of functions N times (N being the number of CachedManager instances of one type). However, we cannot afford having that many properties in ClientOptions, so we could do ClientOptions.sweepers instead, similar to:

new Client({
	intents: [/* ... */],
	sweepers: {
		messages: {
			maximumSize: 200,
			interval: minutes(2),
			lifetime: minutes(5)
		},
		threads: {
			interval: minutes(5),
			keepOverLimit: (/* ... */) => { /* ... */ }
		}
	}
});

This interface is very similar to the current one, and could perhaps be re-used, although it does not need to use the manager names, as the sweepers are global and managers do not need to read them.

This system comes with two major problems:

  • We need to somehow tell the collections about that maximum size.
  • We need to actively add code for each manager we add so it knows how to traverse the very nested sea of managers.

The former problem can be solved with one of the two following approaches:

  • Internal helpers that read a collection and the cache properties, such as CachingUtils.set(channel.messages, client.sweepers.messages) (sketched after this list).

    Pros:

    • Lower memory usage, as the property is stored only once.
    • Cache changes are retroactive: updating the cache's limit takes effect everywhere else.
    • We can override the maximum cache size by setting it directly.

    Cons:

    • Setting new values becomes a bit more cluttered.
  • Keep LimitedCollection with all of its properties (except interval); we'd probably also want to keep the manager names so it's easy to build from internal code.

    Pros:

    • No internal helpers needed; the code stays simple and we don't need to change much.
    • Local limits (such as a specific channel having a lower message limit; do we even want this, though?).

    Cons:

    • Needs maximumSize, keepOverLimit, and sweepFilter to be stored in each one of those collections, which means only a small reduction of the memory overhead they currently have.
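
To make the helper approach from the first bullet concrete, a hypothetical CachingUtils.set could enforce the shared limit on insert (the signature and the eviction policy are illustrative):

const CachingUtils = {
	set(cache, sweeperOptions, key, value) {
		const { maximumSize = Infinity, keepOverLimit } = sweeperOptions;
		if (cache.size >= maximumSize && !cache.has(key)) {
			// Evict the oldest entry the user doesn't explicitly want kept.
			for (const [oldKey, oldValue] of cache) {
				if (keepOverLimit?.(oldValue, oldKey, cache)) continue;
				cache.delete(oldKey);
				break;
			}
		}
		return cache.set(key, value);
	},
};

// Usage: CachingUtils.set(channel.messages.cache, client.sweepers.messages, message.id, message);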

We can alleviate the latter problem with helpers that generate the traversal code; e.g. for reactions it'd be traverse('channels', 'messages', 'reactions'), which would iterate over client.channels.cache, then over each channel.messages.cache, and then over each message.reactions.cache.
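
A minimal sketch of what such a generated traversal could look like, assuming every hop is a manager exposing a .cache collection:

function* traverse(client, ...path) {
	let current = [client];
	for (const segment of path) {
		const next = [];
		for (const holder of current) {
			const manager = holder[segment];
			if (!manager?.cache) continue; // e.g. voice channels have no messages
			next.push(...manager.cache.values());
		}
		current = next;
	}
	yield* current;
}

// Yields every cached reaction in the client:
for (const reaction of traverse(client, 'channels', 'messages', 'reactions')) {
	// run the sweep filter here
}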

And because we need to write this code manually, we can hard-forbid specific managers from being swept, such as guilds and permission overwrites. It also provides additional flexibility, letting us handle channels significantly better in two ways.

  1. Make it much easier for users to set up channel sweeping by providing one channel sweeper (or one per type), instead of making users remember to sweep up to 3 levels of caching (i.e., no more duplication between ChannelManager, GuildChannelManager, and ThreadManager). We can do this because every channel is stored in the client's channel manager, which already has a way to traverse guild channel managers and thread managers to remove the desired channel.
  2. We can hard-code a filter that runs before sweeping channels, so that only channels that can safely be swept are swept (see the sketch below).
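
The hard-coded pre-filter from point 2 could look like the following (the set of unsweepable channel types is purely illustrative):

const UNSWEEPABLE_TYPES = new Set(['DM', 'GUILD_CATEGORY']); // illustrative
function channelSweepFilter(userFilter) {
	return (channel, id, cache) =>
		!UNSWEEPABLE_TYPES.has(channel.type) && userFilter(channel, id, cache);
}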

3. Mixed

  • Global caches make memory access and management easy.
  • Global sweepers reduce memory duplication by storing the metadata in a single place rather than in potentially millions of clones.

One of the global sweepers' problems is that we need to somehow tell each sweeper how to traverse the caches, and that is a manual process. With the global caches proposal, each sweeper would simply map to the matching property in client.caches, much like managers do, so client.sweepers.messages reads client.caches.messages.
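
Under the mixed model, a sweeper then boils down to one shared interval reading its global cache by name (a sketch; sweepFilter is assumed to be a factory returning the actual filter function, as in v13):

function startSweeper(client, name, { interval, sweepFilter }) {
	const timer = setInterval(() => {
		const cache = client.caches[name]; // e.g. client.caches.messages
		cache.sweep(sweepFilter(cache)); // Collection#sweep deletes matching entries
	}, interval * 1000);
	timer.unref(); // never keep the process alive just to sweep
	return timer;
}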


In a nutshell, the first approach supports the near-stateless design we want in the near future, and the second approach reduces memory usage to the point where the next version of Discord.js could theoretically use less memory than v11 and v12. And of course, both ideas work very well with each other, so combining them is most likely where we'd like to end up.

Note: As for implementation, the second point can be done within semver: minor if we adopt the utility approach, as we'd use Collection instead of removing properties from LimitedCollection. The first point can be investigated and considered for v14, as it's very likely semver: major.

@MattIPv4 (Contributor) commented:
Shared my opinion in the thread where this was discussed, but reposting here for visibility:

I vote for option 2 (global sweepers) first, as it is likely achievable as a semver: minor change, and then work to also do option 1 (global caches) if folks feel it's viable down the road.

@JMTK (Contributor) commented Aug 28, 2021

Any plan for async cache methods eventually, so we can offload the cache out of process memory to something like Redis? Or does the structure aspect of the data negate the ability to serialize it for something like Redis?

This would be good for sharded data that has duplicates across processes (reactions, users, etc.).

Thanks for the thorough explanation of everything. The memory issue is probably the single most limiting factor for larger bots. Looking forward to this!

@kyranet (Member, Author) commented Aug 29, 2021

Any plan for async cache methods eventually so we can offload cache out of the process memory to something like Redis?

We haven't discussed anything about that in the last few months, but fortunately, we can go two ways around it, @JMTK:

  1. We will keep releasing new micro-packages (@discordjs/rest, @discordjs/ws, @discordjs/builders...) that you can use to build your own Discord wrapper; this way, everything will be up to you.
    (As a side note, @discordjs/rest will have strategy support, so we can use Redis for shared ratelimit handling.)
  2. We'll eventually merge our WS handlers and actions into a single structure. If correctly designed, this could allow us to build an asynchronous system able to deal with an async cache (such as Redis). However, this could also complicate the code a lot, since we wouldn't be able to use manager.resolve[Id], for example. And that's to say nothing of the nightmares end users would face for their simplest operations, such as iterating over a channel's messages.

If you ask me, the former will most likely happen before the latter. Besides, users who want to use Redis also prefer having more control over what they cache and how the library interacts with it.
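
To give a rough idea of why the second route gets painful for end users, every cache read would become asynchronous (everything below is hypothetical; no such API exists today):

// A simple lookup now costs a round-trip:
const message = await channel.messages.cache.get(messageId);

// And iteration becomes an async stream, one step per Redis hit:
for await (const msg of channel.messages.cache.values()) {
	// ...
}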

Or does the structure aspect of the data negate the ability to serialize it for something like Redis?

Structure#data (#6567) allows us to serialize and deserialize (as well as clone) structures with little to no performance impact, as the data is the JSON-deserialized payload from Discord. I will be writing this roadmap proposal later.
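
Since Structure#data would hold the raw payload from Discord, persisting a structure is plain JSON round-tripping (a sketch against the #6567 proposal; the redis client and the Message constructor shape are assumptions):

// Serialize: store the raw payload.
await redis.set(`message:${message.id}`, JSON.stringify(message.data));

// Deserialize: rebuild the structure from the stored payload.
const data = JSON.parse(await redis.get(`message:${message.id}`));
const restored = new Message(client, data);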

Edit: Added reference to new issue.
