CL/HIER: check global team status #717

Open
wants to merge 1 commit into
base: master

Sergei-Lebedev
Contributor

What

CL HIER should report the global team status on team create.

Why ?

The selection table may differ across ranks if each rank considers only its local status.
Internal issue: https://redmine.mellanox.com/issues/3336577

How ?

Perform a service team allreduce at the end of CL HIER team create so that every rank learns the global team status.
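To illustrate the idea, here is a minimal sketch that simulates the status agreement without any UCC calls. It assumes that success is encoded as 0 and failures as negative values, so a MIN reduction yields "success only if every rank succeeded". The names `demo_status` and `demo_allreduce_min` are hypothetical and are not part of the UCC API; the loop over an array stands in for the real service team allreduce.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for illustration only:
 * 0 means success, negative values mean failure. */
typedef enum {
    DEMO_OK                = 0,
    DEMO_ERR_NOT_SUPPORTED = -1,
    DEMO_ERR_NO_RESOURCE   = -2
} demo_status;

/* Simulated allreduce(MIN) over per-rank local statuses.
 * local[i] is the status rank i computed on its own;
 * global[i] is what rank i sees after the exchange.
 * Every rank ends up with the same value, so every rank
 * makes the same decision afterwards. */
static void demo_allreduce_min(const int *local, int *global, size_t n_ranks)
{
    assert(n_ranks > 0);

    int min = local[0];
    for (size_t i = 1; i < n_ranks; i++) {
        if (local[i] < min) {
            min = local[i];
        }
    }
    for (size_t i = 0; i < n_ranks; i++) {
        global[i] = min;
    }
}
```

If a single rank fails locally, all ranks observe the failure after the exchange, which is what keeps per-rank decisions such as the selection table consistent.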

@vspetrov
Collaborator

It is probably worth moving the team-level allreduce logic to the core: to the very end of ucc_team_create_test, after all CLs are created. That way there is always exactly one "status exchange allreduce" at the end of team creation, and the CL statuses are part of it. If at some point we add more info to exchange (synchronize) upon team creation, we can piggy-back it there as well. Currently, for example, CL/BASIC may also need to sync which TLs are created. Then both CLs could do it in a single allreduce.

makes sense?
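A rough sketch of what such a combined exchange could look like, again simulated without UCC calls. It assumes each rank fills one small vector with a slot per item to synchronize: CL statuses (0 for success, negative for failure) and per-TL "created" flags (1 or 0). With that encoding, a single element-wise MIN covers both, since MIN over 0/1 flags behaves like a logical AND. The slot names and `demo_team_sync_exchange` are hypothetical, and the real layout in the core could differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout of the per-rank vector sent in the single exchange. */
enum {
    DEMO_SLOT_CL_BASIC, /* CL/BASIC create status: 0 ok, <0 error */
    DEMO_SLOT_CL_HIER,  /* CL/HIER create status:  0 ok, <0 error */
    DEMO_SLOT_TL_A,     /* 1 if the first TL was created on this rank */
    DEMO_SLOT_TL_B,     /* 1 if the second TL was created on this rank */
    DEMO_SLOT_COUNT
};

/* Simulated in-place element-wise allreduce(MIN): after the call every
 * rank's vector holds the per-slot minimum over all ranks. */
static void demo_team_sync_exchange(int v[][DEMO_SLOT_COUNT], size_t n_ranks)
{
    int min[DEMO_SLOT_COUNT];

    assert(n_ranks > 0);

    for (size_t s = 0; s < DEMO_SLOT_COUNT; s++) {
        min[s] = v[0][s];
    }
    for (size_t r = 1; r < n_ranks; r++) {
        for (size_t s = 0; s < DEMO_SLOT_COUNT; s++) {
            if (v[r][s] < min[s]) {
                min[s] = v[r][s];
            }
        }
    }
    for (size_t r = 0; r < n_ranks; r++) {
        for (size_t s = 0; s < DEMO_SLOT_COUNT; s++) {
            v[r][s] = min[s];
        }
    }
}
```

Adding another item to synchronize then means adding a slot to the vector rather than adding another collective to team creation.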

3 participants