Duplicate data returned when using the num_workers param (multi-processing) in DataLoader #150

Open
Albert-Ma opened this issue Sep 26, 2020 · 0 comments
Labels
bug Something isn't working


Albert-Ma commented Sep 26, 2020

```python
import matchzoo as mz

# trainset and padding_callback are not shown in this report
trainloader = mz.dataloader.DataLoader(
    dataset=trainset,
    stage='train',
    num_workers=30,
    callback=padding_callback
)
```

Result:

image

The output shows that, within a single epoch, the model trains on the data `num_workers` times (30 in this experiment).

A possible solution: split the workload across all workers in the Dataset's `__iter__` function.
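A minimal sketch of what that split could look like, assuming the dataset is iterable-style, so that each worker process runs its own copy of `__iter__`. The names `shard_for_worker` and `ShardedBatches` are hypothetical and not part of MatchZoo. The sketch is plain Python so it runs without PyTorch. In a real dataset the `(worker_id, num_workers)` pair would come from `torch.utils.data.get_worker_info()`, which returns `None` in the main process.

```python
from typing import Iterator, List, Optional, Sequence, Tuple


def shard_for_worker(
    batch_indices: Sequence[int],
    worker: Optional[Tuple[int, int]],
) -> List[int]:
    """Return the batch indices that one worker should yield.

    `worker` is (worker_id, num_workers), or None when data loading
    runs in the main process (no worker processes).
    """
    if worker is None:
        return list(batch_indices)
    worker_id, num_workers = worker
    # Strided split: worker k takes indices k, k + n, k + 2n, ...
    return list(batch_indices)[worker_id::num_workers]


class ShardedBatches:
    """Hypothetical stand-in for an iterable dataset that yields batches."""

    def __init__(self, batches: Sequence, worker: Optional[Tuple[int, int]] = None):
        self._batches = batches
        # In a real PyTorch dataset this would be looked up inside __iter__:
        #   info = torch.utils.data.get_worker_info()
        #   worker = None if info is None else (info.id, info.num_workers)
        self._worker = worker

    def __iter__(self) -> Iterator:
        for i in shard_for_worker(range(len(self._batches)), self._worker):
            yield self._batches[i]
```

With this split, every batch is yielded by exactly one worker, so an epoch covers the data once regardless of `num_workers`. Without it, each worker iterates over the full dataset, which matches the duplication reported above.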
