Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Roadmap: global user cache removal #7474

Open
kyranet opened this issue Feb 16, 2022 · 0 comments
Open

Roadmap: global user cache removal #7474

kyranet opened this issue Feb 16, 2022 · 0 comments

Comments

@kyranet
Copy link
Member

kyranet commented Feb 16, 2022

Initially, we made a global user cache because they exist only once in Discord, however, that has brought many caching issues that eventually required fixes that damage the overall performance, see #6782.

With the release of intents, we can really limit the amount of GuildMember instances we have, and with it, also the amount of User instances.

There's also a huge memory leak that has been there since caching has been added to the library: if a bot leaves a guild, all the users that shared only said guild with the bot would sit in memory, forever, and eventually become stale or invalid were we to read their properties when they update something while we don't share a guild with them.

To mitigate this, #6013 added a cache system that allowed us to limit how many entries we store in our CachedManagers, and later #6825 improved the sweeping performance. However, by limiting or sweeping Users in client.users.cache, you're only removing one of the many references, in fact, they stay alive at the GuildMember instance:

_patch(data) {
if ('user' in data) {
/**
* The user that this guild member instance represents
* @type {?User}
*/
this.user = this.client.users._add(data.user, true);
}
And if it's swept and the user joins another guild, they'll just get a second User instance, completely independent of the first one.

In fact, because GuildMember has a transient relationship with User, some funny things happen. For example when discord.js fetches guild members and the developer decides to specify cache: false, the GuildMembers are not cached but their respective Users are instantiated and cached regardless (see code above, thanks @Jiralite).

We could just make GuildMember#user a getter so we don't duplicate Users, but, not only that would worsen the developer experience (since it becomes nullable), but it also doesn't solve the memory leak at all.

Alternatively, we could also sweep all entries from client.users.cache when the bot leaves a guild, but for large internal sharded bots, that's going to be a huge performance hit, and definitely not something we should do.

And last but not least, what's the use case for client.users.cache to even exist? Most users use client.users.fetch(UserResolvable) but very often:

  • Leave cache default, which means it'll always cache (as it defaults to true). This has a pitfall, if the user doesn't share a guild, it'll eventually be stale and incorrect, plus it'll be contributing to the memory leak.
  • Have to use force due to the aforementioned issue. If you cache a user that doesn't share a guild with the bot (e.g. for a whois command), and they change their avatar, the library will, by default, retrieve the entry from the cache, which has a stale avatar, and as such, the command will fail to display correctly. Using force disregards the cache, and ends up making the cache useless again.

A solution

Starting with #6013, Discord.js has a manager called DataManager, it is similar to CachedManager, but doesn't have a cache. We can use this so we're still able to do client.users.resolve[Id](UserResolvable). We may also move client.user to client.users.me or client.users.self, so we don't have two different properties.

With this approach, we'll stop caching Users globally. Now we need to store them somewhere: inside GuildMember. This approach makes sense, because if the guild member leaves a guild, both the GuildMember and the User instance will be disposed from memory. Memory leak solved!

And because data duplication may be controversial (and some people may object to this proposal because of it), let me sum up some points:

  • Sweeping users can and will result on data duplication due to hard references inside GuildMember and new ones not finding the previously created User instances.
  • The issue from feat(GuildMemberUpdate): emit on user updates #6782 exists, and the approach it takes duplicates User instances anyways.
  • As @Jiralite pointed out, if you specify cache: false in GuildMemberManager#fetch, members won't be cached, but users will.

There may be possible problems with the removal of the global cache, specially for guild data, since we won't be able to get the user data if the member leaves the guild. One place that would be very notable would be Message#author, however, it turns out that property holds a hard reference just like GuildMember#user does.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant