Sharding on multiple servers #204

Open
MrErikCodes opened this issue Jul 22, 2019 · 27 comments
Labels
enhancement New feature or request

Comments

@MrErikCodes

Is it possible to shard the bot on multiple servers? As of now it is overlapping itself. Any code examples?

@kyranet
Contributor

kyranet commented Jul 22, 2019

Yeah, it is, though I don't know whether Kurasuta's docs cover it. The underlying message system (powered by veza) allows this: .connectTo({ host: '255.255.255.255', port: 9999 }) would establish a connection with 255.255.255.255:9999.

@MrErikCodes
Author

Any code examples or a link to docs regarding the comment above? I can't seem to find it anywhere. I know that in discord.js (master) you can use shardList to give it an array of shards to spawn.

@kyranet
Contributor

kyranet commented Jul 22, 2019

Actually, it's quite tricky... the processes are more than capable of connecting to other servers, but I don't know about spawning them... I invoke you, @DevYukine

@PLASMAchicken

This would actually be very useful, especially for large music bots.

@DevYukine added the enhancement label on Aug 31, 2019
@MrErikCodes
Author

Will this be added? If so when? Any ETA?

@kyranet
Contributor

kyranet commented Oct 14, 2019

Yes, this will be added. No ETA at the moment.

@owenselles

Nice!

@MrErikCodes
Author

Any update here? I could really use it.

@PLASMAchicken

> Any update here? I could really use it.

It's an open source project, so you can make a PR if you want progress.

@Roki100

Roki100 commented Feb 27, 2020

Supporting this idea!

@anna-rmrf

Any update on this one?

@PLASMAchicken

It's an open source project, so you can make a PR if you want progress.

@muraatydn

I need this :)

@anna-rmrf

> It's an open source project, so you can make a PR if you want progress.

Why would I comment here if I was able to do it myself? Like, can you think?

@DevYukine
Owner

Hey, as already said, this is an open source project. I currently have no time to work on this feature and do not plan on doing so soon. If anyone wants it, they can open a PR.

P.S. If you aren't able to do it yourself, you can always take this as a learning experience and build up some more knowledge ;)

@zihadmahiuddin

> Hey, as already said, this is an open source project. I currently have no time to work on this feature and do not plan on doing so soon. If anyone wants it, they can open a PR.
>
> P.S. If you aren't able to do it yourself, you can always take this as a learning experience and build up some more knowledge ;)

Hey there!
I was thinking about giving it a try. I think I can make veza use TCP for communication, but I'm not sure how I'd do the spawning. Do you have any suggestions?
Thanks.

@PLASMAchicken

Possibly you need a master server and slaves, where each slave registers with the master and gets assigned its shard IDs and the number of shards it handles.

@zihadmahiuddin

Right, I thought about that too, but I'm not sure how to structure it. I was thinking ShardingManager could take an optional mode option with three modes:

1. A default, local mode that does the same thing Kurasuta does now.
2. A gateway mode that takes care of things like how many shards there are, how many slaves there are, and which slaves are running which shards. There would be only one instance in this mode.
3. A third mode that connects to the gateway to find out how many more shards are needed, and then runs only that many shards.

In the third mode, broadcastEval etc. would not eval only in the shards of the current machine. Instead, it would send the eval to the gateway instance (the second mode), which would forward it to all the connected machines and return the values from all of them, and so on.

I am not very confident in this method, though. It could even have huge flaws. Let me know what you think. Also, should we keep the discussion here, or continue on Discord if you prefer?
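
For readers trying to picture the proposal, here is a minimal TypeScript sketch of what the single gateway instance might keep in memory. Every name in it (ManagerMode, resolveMode, ShardCoordinator, register, ownerOf) is hypothetical and does not exist in Kurasuta; it only assumes that each machine announces how many shards it can run.

```typescript
// Hypothetical names throughout: none of these exist in Kurasuta today.
type ManagerMode = "local" | "gateway" | "worker";

/** Falls back to the current single-machine behaviour when no mode is given. */
function resolveMode(mode?: ManagerMode): ManagerMode {
  return mode === undefined ? "local" : mode;
}

interface WorkerRecord {
  id: string;
  shards: number[];
}

/** State the single "gateway" instance might hold (an assumption, not a design). */
class ShardCoordinator {
  private readonly totalShards: number;
  private readonly workers: WorkerRecord[] = [];
  private nextShard = 0;

  constructor(totalShards: number) {
    this.totalShards = totalShards;
  }

  /** Registers a machine and hands it up to `capacity` unclaimed shard IDs. */
  register(id: string, capacity: number): number[] {
    const shards: number[] = [];
    while (shards.length < capacity && this.nextShard < this.totalShards) {
      shards.push(this.nextShard);
      this.nextShard += 1;
    }
    this.workers.push({ id: id, shards: shards });
    return shards;
  }

  /** How many shards no machine has claimed yet. */
  unassignedCount(): number {
    return this.totalShards - this.nextShard;
  }

  /** Which machine runs a given shard, or undefined if nobody does. */
  ownerOf(shardId: number): string | undefined {
    for (const worker of this.workers) {
      if (worker.shards.indexOf(shardId) !== -1) {
        return worker.id;
      }
    }
    return undefined;
  }
}

// Two machines sharing five shards:
const coordinator = new ShardCoordinator(5);
const shardsForA = coordinator.register("machine-a", 3); // [0, 1, 2]
const shardsForB = coordinator.register("machine-b", 3); // [3, 4]
```

The sketch deliberately leaves out what happens when a machine disconnects; a real gateway would need to hand that machine's shards to someone else.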

@arpanr

arpanr commented Jun 16, 2021

Any updates on this? This would be really cool.

@edazpotato

edazpotato commented Nov 5, 2021

With the default discord.js sharding manager you can pass it an array of shard IDs that you want it to manage. This means that you can get it to manage a different chunk of a bot's shards on different machines. Maybe a similar solution could be implemented here?

(I saw that this had been mentioned in passing but I thought I'd state it explicitly)
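
As a rough illustration of that idea, the helper below splits the shard IDs into contiguous chunks, one per machine. The function name shardListFor and the "machine index" convention are assumptions made for this sketch; only the totalShards and shardList option names come from discord.js's ShardingManager.

```typescript
/**
 * Returns the shard IDs that machine `machineIndex` (0-based) should run when
 * `totalShards` shards are split as evenly as possible across `machineCount` machines.
 * Hypothetical helper, not part of discord.js or Kurasuta.
 */
function shardListFor(totalShards: number, machineCount: number, machineIndex: number): number[] {
  if (machineCount < 1 || machineIndex < 0 || machineIndex >= machineCount) {
    throw new RangeError("machineIndex must be in [0, machineCount)");
  }
  const base = Math.floor(totalShards / machineCount);
  const extra = totalShards % machineCount; // the first `extra` machines take one more shard
  const start = machineIndex * base + Math.min(machineIndex, extra);
  const size = base + (machineIndex < extra ? 1 : 0);
  const list: number[] = [];
  for (let id = start; id < start + size; id++) {
    list.push(id);
  }
  return list;
}

// Ten shards over three machines:
const machine0 = shardListFor(10, 3, 0); // [0, 1, 2, 3]
const machine1 = shardListFor(10, 3, 1); // [4, 5, 6]
const machine2 = shardListFor(10, 3, 2); // [7, 8, 9]

// On each machine the result would then be passed along the lines of:
//   new ShardingManager("./bot.js", { totalShards: 10, shardList: machine0 })
```

Every machine has to agree on totalShards and machineCount for the chunks not to overlap, which is the overlap problem described at the top of this issue.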

@zihadmahiuddin

> With the default discord.js sharding manager you can pass it an array of shard IDs that you want it to manage. This means that you can get it to manage a different chunk of a bot's shards on different machines. Maybe a similar solution could be implemented here?
>
> (I saw that this had been mentioned in passing but I thought I'd state it explicitly)

Yes, but you would need a "manager" of some sort that keeps track of which shards are currently connected, which are not, how many shards should be connected, etc.
I have a kinda working "demo" here, but it's far from stable. It's just a PoC of some sort right now.
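
To make the bookkeeping concrete, here is a small sketch of the state such a manager would have to track. It is not taken from the linked demo; ShardStatusBoard and its methods are made-up names, and how machines actually report state changes (TCP, IPC, heartbeats) is left open.

```typescript
// Hypothetical helper; the names are invented for this sketch.
type ShardState = "connected" | "disconnected";

class ShardStatusBoard {
  private readonly states: ShardState[] = [];

  /** `expectedShards` is how many shards should be connected in total. */
  constructor(expectedShards: number) {
    for (let id = 0; id < expectedShards; id++) {
      this.states.push("disconnected");
    }
  }

  /** Called whenever a machine reports that one of its shards changed state. */
  report(shardId: number, state: ShardState): void {
    const known = Math.floor(shardId) === shardId && shardId >= 0 && shardId < this.states.length;
    if (!known) {
      throw new RangeError("Unknown shard " + shardId);
    }
    this.states[shardId] = state;
  }

  /** IDs of all shards currently in the given state. */
  idsIn(state: ShardState): number[] {
    const ids: number[] = [];
    this.states.forEach(function (current, id) {
      if (current === state) {
        ids.push(id);
      }
    });
    return ids;
  }

  /** How many shards should be connected overall. */
  expectedCount(): number {
    return this.states.length;
  }
}

const board = new ShardStatusBoard(4);
board.report(0, "connected");
board.report(2, "connected");
const stillMissing = board.idsIn("disconnected"); // [1, 3]
```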

@maxschnee-dev

maxschnee-dev commented Mar 1, 2022

Any new progress on this? I've been looking around and there doesn't seem to be a user-friendly way to shard across multiple machines. I've set up my own TCP server that works relatively well but it's tedious. Some solution like this would be awesome! (:
edit: this repo works quite well for anyone looking to do this

@DevYukine
Owner

@Rebble69 it's still the same as what I said in #204 (comment): I personally do not have time to work on this, but if someone makes a PR I'm free to merge it, assuming it still works and follows my general code style.

@kyranet
Contributor

kyranet commented Jun 13, 2022

With permission from the maintainer, I'll be sharing this issue: discordjs/discord.js#8084

It's basically a library-agnostic sharder system (it needs to be, in order to support bots that are made with discord.js's component libraries without the main library) powered by composable strategies for maximum control over every component of it.

Compared to Kurasuta (and the current discord.js sharder), it's a low-level, highly customizable, strategy-based sharder system. It may not be very easy to use, especially since it's not tightly integrated for spawning websocket shards, but rather focuses on process/worker sharding (leaving gateway ones to @discordjs/ws or similar).

The linked RFC also features proxy managers with load-balancing, and we're looking for developers who have worked on similar things to provide us with some insight or advice 🙏🏼

@meister03

I could give you some detailed insight, since I have tweaked my packages (mentioned by @Rebble69) a lot. From what I can say, it will not be easy.

Firstly, the broadcastEval/Eval part should be removed. It's quite a bad practice. Users end up executing scripts, which can end in a security breach. The message handler approach would be better: you send an op code with the message type stats, and the user can then call the stats function themselves in the message event.
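
A minimal sketch of that message handler approach, assuming made-up op codes ("stats", "ping") and payload shapes that are not taken from any existing package. The point it tries to show is that the receiving process only ever runs handlers it defined itself, never code sent over the wire.

```typescript
// Hypothetical op codes and payloads for this sketch.
type ShardMessage =
  | { op: "stats" }
  | { op: "ping"; sentAt: number };

type ShardReply =
  | { op: "stats"; guilds: number }
  | { op: "pong"; sentAt: number };

/**
 * Dispatches on the op code. `getGuildCount` stands in for whatever the bot
 * would use to read its own state (for example, the size of its guild cache).
 */
function handleMessage(message: ShardMessage, getGuildCount: () => number): ShardReply {
  switch (message.op) {
    case "stats":
      return { op: "stats", guilds: getGuildCount() };
    case "ping":
      return { op: "pong", sentAt: message.sentAt };
    default:
      // Anything that is not a known op code is rejected instead of evaluated.
      throw new Error("Unsupported op code");
  }
}

const statsReply = handleMessage({ op: "stats" }, () => 42);
```

Because the set of op codes is a closed union, adding a new capability means adding a new handler on purpose, which is the security property broadcastEval lacks.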

Cross communication should be a major part of the proxies. The easiest approach would be making a master process called a bridge, which then distributes the messages to the sharding managers and coordinates them. It would be like a p2p connection.

If the above is helpful, I can elaborate on these points and add some new ones.
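
The bridge idea above could be sketched roughly as follows. This is not the API of any existing package: Bridge, connect, disconnect, and broadcast are invented names, and plain synchronous callbacks stand in for the network connections a real bridge would hold to each sharding manager.

```typescript
// Hypothetical sketch; callbacks stand in for real network connections.
type ManagerHandler<Req, Res> = (request: Req) => Res;

class Bridge<Req, Res> {
  private readonly managers: { [id: string]: ManagerHandler<Req, Res> } = {};

  /** Registers a sharding manager under a unique ID. */
  connect(managerId: string, handler: ManagerHandler<Req, Res>): void {
    this.managers[managerId] = handler;
  }

  /** Forgets a sharding manager, for example after its connection drops. */
  disconnect(managerId: string): void {
    delete this.managers[managerId];
  }

  /** Forwards one request to every connected manager and collects the replies by manager ID. */
  broadcast(request: Req): { [id: string]: Res } {
    const replies: { [id: string]: Res } = {};
    for (const managerId of Object.keys(this.managers)) {
      const handler = this.managers[managerId];
      if (handler !== undefined) {
        replies[managerId] = handler(request);
      }
    }
    return replies;
  }
}

// Asking two machines for their guild counts:
const bridge = new Bridge<"guildCount", number>();
bridge.connect("manager-eu", () => 1200);
bridge.connect("manager-us", () => 800);
const guildCounts = bridge.broadcast("guildCount");
```

A real implementation would be asynchronous and would need timeouts for managers that never answer; the sketch skips both to keep the fan-out and collect steps visible.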

@JMTK

JMTK commented Jun 13, 2022

(I didn't want to comment on the RFC Sharder issue with this as it is not directly helpful)

I implemented d.js sharding across multiple machines, using a VPC so the websocket ports aren't exposed to the internet. I wrote a guide but it's slightly outdated and probably has some pitfalls that aren't desirable: https://jmtk.co/blog/24. I've been using this method for 8 months now without any major issue and I saved some money as a result

The new ShardProxy concept does look like it would be a great replacement going forward and I would look to use that assuming it fulfilled all my needs.

@kyranet
Contributor

kyranet commented Jun 13, 2022

Replying to each of you... first, @meister03:

> Firstly, the broadcastEval/Eval part should be removed. Its quite a bad practice doing it. Users end up executing scripts, which can end up in a security breach. The message handler approach would be better. You can send a op code with the message type stats and now the user themselves can call the stats function on the message event.

Yes, indeed. It's one of the first things I wanted to remove from discord.js's sharder, mostly due to security concerns, but also because it limits how we can format/structure the payloads.

> Cross communication should be of the major part of the Proxies. The easiest approach would be making a master process called bridge, which will then distrubute the messages to the sharding managers and coordinate them. It will be like a p2p connection.

I disagree; the major part of the proxies is to load-balance their own cluster of shards. Communication should still be done by dedicated systems, but needless to say, the sharder will feature a very complete and powerful system that can be used for any priority task. The less the proxies try to do, the better.


And now to you, @JMTK:

> (I didnt want to comment on the RFC Sharder issue with this as it is not directly helpful)

Sure, I guess.

> I implemented d.js sharding across multiple machines, using a VPC so the websocket ports aren't exposed to the internet. I wrote a guide but it's slightly outdated and probably has some pitfalls that aren't desirable: jmtk.co/blog/24. I've been using this method for 8 months now without any major issue and I saved some money as a result
>
> The new ShardProxy concept does look like it would be a great replacement going forward and I would look to use that assuming it fulfilled all my needs.

The VPC bit is maybe covered under "There might also be a need to support SSH tunnels to bypass firewalls for greater security" which is at ShardManagerProxy's first paragraph. If there are other ways of making tunnels, I suppose we can look into it. DigitalOcean seems to have a lot of custom stuff too, and since I'm not a DO customer, I'm unaware of many of their technologies. I'm open to explore it in the future, although chances are that it will have to be done outside of the main package, I don't know, time will tell.

Similar to VPC, there's also VPN, which allows you to connect to a protected/unexposed network from the Internet in a secure way.


I have also edited the issue to add a few points addressing some questions about the reliability and distributability of the network system, including a mention of fallback mirror managers for higher resilience to downtime.
