Tasks

- One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
- My own task or dataset (give details below)
Reproduction
Test example `test_accelerate.py`:

```python
from accelerate import PartialState  # Can also be Accelerator or AcceleratorState

state = PartialState()
input_list = list(range(17))
with state.split_between_processes(input_list) as split_input_list:
    print(f"{state.device}, {split_input_list}")
```

Run it with:

```bash
accelerate launch --num_processes 8 test_accelerate.py
```
As we can see, the element 16 appears in the shards of three different processes. The duplication is caused by accelerate/src/accelerate/state.py, lines 440 to 442 in 7ac153f, where split_between_processes computes each process's slice indices. We should modify that code as follows to get the expected output.
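(The exact patch proposed in the issue did not survive extraction. The sketch below shows the standard remainder-distribution slicing that yields disjoint shards; the helper name `split_indices` is mine, for illustration, and inside state.py the same arithmetic would use `self.num_processes` and `self.process_index`.)

```python
# Sketch of remainder-aware slicing for split_between_processes.
def split_indices(length, num_processes, process_index):
    """Half-open [start, end) slice for one process. The first `extras`
    processes each take one extra element, so the shards partition the
    input instead of re-reading the tail."""
    per_process, extras = divmod(length, num_processes)
    start = process_index * per_process + min(process_index, extras)
    end = start + per_process + (1 if process_index < extras else 0)
    return start, end

# 17 elements on 8 processes: divmod(17, 8) == (2, 1), so process 0 gets
# 3 elements and processes 1-7 get 2 each -- 17 in total, none duplicated.
assert [split_indices(17, 8, i) for i in range(8)] == [
    (0, 3), (3, 5), (5, 7), (7, 9), (9, 11), (11, 13), (13, 15), (15, 17)
]
```

Spreading the remainder over the first processes this way guarantees the slices cover each index exactly once, so no element can show up on more than one process.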
Expected behavior

Each element of input_list should be assigned to exactly one process, so the eight shards together contain all 17 elements with no duplicates.
I also observed the same thing on my end! I would split a list of 100 elements across 8 GPUs and get a total of 113 elements or something. The fix proposed by @hkunzhe worked for me.
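To make this kind of duplication easy to spot, here is a small sanity-check script (my own illustration, not from the issue) that gathers every process's shard with `accelerate.utils.gather_object` and counts elements:

```python
# check_split.py -- sanity check for split_between_processes.
# Run with: accelerate launch --num_processes 8 check_split.py
from accelerate import PartialState
from accelerate.utils import gather_object

state = PartialState()
input_list = list(range(100))

with state.split_between_processes(input_list) as shard:
    gathered = gather_object(shard)  # concatenates every process's shard

if state.is_main_process:
    # A correct split prints "100 100"; duplicated elements inflate the
    # first count (e.g. the 113 reported above).
    print(len(gathered), len(set(gathered)))
```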