AcceleratorState object has no attribute distributed_type. #2786

Open
evelinamorim opened this issue May 16, 2024 · 2 comments

@evelinamorim

System Info

accelerate-0.30.1
Google Colab
numpy-1.25.2
torch-2.2.1+cu121

Python 3.10.12

Regarding the accelerate configuration: I am using the Transformers Trainer, which employs accelerate internally, and I do not touch the configuration.

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • My own task or dataset (give details below)

Reproduction

The args.json file employed below is available to download at: https://drive.google.com/file/d/1H2MstSq_oz7Xv7spMZCppf39fHGs5rW0/view?usp=drive_link.

The dataset specified in args.json is available at: https://drive.google.com/file/d/18OVilNSqQQogSMiepe87vtNmzYpCalCs/view?usp=drive_link

In Google Colab, I ran:

!git clone https://github.com/evelinamorim/Seq2seqCoref.git
!pip install -U transformers accelerate

import sys
sys.path.insert(1, "Seq2seqCoref")

from transformers import HfArgumentParser, set_seed
from transformers import AutoModelForSeq2SeqLM, \
    DataCollatorForSeq2Seq, AutoConfig, AutoTokenizer
from transformers.integrations import TensorBoardCallback

from arguments import DataArguments, ModelArguments, CorefTrainingArguments \
    as TrainingArguments
from constants import SPEAKER_START, SPEAKER_END, MENTION_START, MENTION_END, \
    COPY, CLUSTER_NEW, CLUSTERS, SENTENCE_START, SENTENCE_END, SPECIAL_IDS, \
    NON_INT_SPECIAL_IDS, MARK_SPECIAL_IDS, MENTION_END_NON_INT_SPECIAL_IDS, \
    MENTION_ENDS
from data import CorefDataset
from trainer import CorefTrainer
import os

parser = HfArgumentParser(
        (ModelArguments, DataArguments, TrainingArguments))
model_args, data_args, training_args = parser.parse_json_file(
        json_file=os.path.abspath("args.json"))

set_seed(training_args.seed)

# tokenizer setup
tokenizer = AutoTokenizer.from_pretrained(model_args.model_name_or_path)

num_new_tokens = tokenizer.add_tokens([SPEAKER_START, SPEAKER_END,
                                       MENTION_START, MENTION_END,
                                       COPY])
num_new_tokens += tokenizer.add_tokens([SENTENCE_START, SENTENCE_END])

# loading config and model
config = AutoConfig.from_pretrained(model_args.model_name_or_path)
model = AutoModelForSeq2SeqLM.from_pretrained(
        model_args.model_name_or_path, config=config)

# data objects
collator = DataCollatorForSeq2Seq(tokenizer, model=model)
train_set = CorefDataset(tokenizer, data_args, training_args, 'train')

tb_callback = TensorBoardCallback()
trainer = CorefTrainer(
        tokenizer=tokenizer,
        model=model,
        args=training_args,
        train_dataset=train_set,
        #        eval_dataset=dev_set,
        data_collator=collator,
        callbacks=[tb_callback]
    )

trainer.train()

The resulting traceback is:

AttributeError                            Traceback (most recent call last)
<ipython-input-16-3435b262f1ae> in <cell line: 1>()
----> 1 trainer.train()

5 frames
/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   1857                 hf_hub_utils.enable_progress_bars()
   1858         else:
-> 1859             return inner_training_loop(
   1860                 args=args,
   1861                 resume_from_checkpoint=resume_from_checkpoint,

/content/Seq2seqCoref/trainer.py in _inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
    169         self._train_batch_size = batch_size
    170         # Data loader and number of training steps
--> 171         train_dataloader = self.get_train_dataloader()
    172 
    173         # Setting up training control variables:

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in get_train_dataloader(self)
    877             dataloader_params["prefetch_factor"] = self.args.dataloader_prefetch_factor
    878 
--> 879         return self.accelerator.prepare(DataLoader(train_dataset, **dataloader_params))
    880 
    881     def _get_eval_sampler(self, eval_dataset: Dataset) -> Optional[torch.utils.data.Sampler]:

/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py in prepare(self, device_placement, *args)
   1246                 )
   1247 
-> 1248         if self.distributed_type == DistributedType.DEEPSPEED:
   1249             model_count = 0
   1250             for obj in args:

/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py in distributed_type(self)
    527     @property
    528     def distributed_type(self):
--> 529         return self.state.distributed_type
    530 
    531     @property

/usr/local/lib/python3.10/dist-packages/accelerate/state.py in __getattr__(self, name)
   1074         # so we just modify the error message
   1075         if name in self._known_attrs:
-> 1076             raise AttributeError(
   1077                 f"`AcceleratorState` object has no attribute `{name}`. "
   1078                 "This happens if `AcceleratorState._reset_state()` was called and "

AttributeError: `AcceleratorState` object has no attribute `distributed_type`. This happens if `AcceleratorState._reset_state()` was called and an `Accelerator` or `PartialState` was not reinitialized.

Expected behavior

The model trains when trainer.train() is called at the end of the code.

@muellerzr
Collaborator

What is CorefTrainer? Does it make an AcceleratorState or PartialState somewhere? As the error hints, somewhere along the line the state was reset without an Accelerator or PartialState being initialized again.
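
For illustration, a minimal toy sketch of that failure mode (not code from your repo): once AcceleratorState._reset_state() runs, the shared singleton state is empty, and any existing Accelerator raises the same AttributeError until the state is rebuilt.

from accelerate import Accelerator
from accelerate.state import AcceleratorState

accelerator = Accelerator()
print(accelerator.distributed_type)  # works, e.g. DistributedType.NO on a single machine

AcceleratorState._reset_state()      # clears the shared singleton state

try:
    accelerator.distributed_type     # now raises the AttributeError from the traceback
except AttributeError as err:
    print(err)

Accelerator()                        # creating a new Accelerator repopulates the shared state
print(accelerator.distributed_type)  # works again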

@evelinamorim
Author

evelinamorim commented May 16, 2024

I am sorry, I did not specify CorefTrainer. I am using a custom trainer (you can check it in this link).

This custom trainer is a subclass of Seq2SeqTrainer. None of the functions implemented in the custom trainer reset AcceleratorState. I went through all the code of Seq2SeqTrainer and Trainer, and the only method I could identify that creates the accelerator for a trainer object is create_accelerator_and_postprocess. I do not know if I must provide some configuration to avoid this error.
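
As a quick check on my side, I could try rebuilding the state right before training, along these lines (just a guess based on the error message, untested):

from accelerate import Accelerator

Accelerator()    # repopulates the shared AcceleratorState if something cleared it
trainer.train()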
