System Info
accelerate-0.30.1
Google Colab
numpy-1.25.2
torch-2.2.1+cu121
Python 3.10.12
Regarding the accelerate configuration: I am using Trainer, which uses accelerate internally, and I have not changed the configuration.
Information
The official example scripts
My own modified scripts
Tasks
One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
!git clone https://github.com/evelinamorim/Seq2seqCoref.git
!pip install -U transformers accelerate

import sys
sys.path.insert(1, "Seq2seqCoref")

from transformers import HfArgumentParser, set_seed
from transformers import AutoModelForSeq2SeqLM, \
    DataCollatorForSeq2Seq, AutoConfig, AutoTokenizer
from transformers.integrations import TensorBoardCallback
from arguments import DataArguments, ModelArguments, CorefTrainingArguments \
    as TrainingArguments
from constants import SPEAKER_START, SPEAKER_END, MENTION_START, MENTION_END, \
    COPY, CLUSTER_NEW, CLUSTERS, SENTENCE_START, SENTENCE_END, SPECIAL_IDS, \
    NON_INT_SPECIAL_IDS, MARK_SPECIAL_IDS, MENTION_END_NON_INT_SPECIAL_IDS, \
    MENTION_ENDS
from data import CorefDataset
from trainer import CorefTrainer
import os

parser = HfArgumentParser(
    (ModelArguments, DataArguments, TrainingArguments))
model_args, data_args, training_args = parser.parse_json_file(
    json_file=os.path.abspath("args.json"))
set_seed(training_args.seed)

# tokenizer setup
tokenizer = AutoTokenizer.from_pretrained(model_args.model_name_or_path)
num_new_tokens = tokenizer.add_tokens([SPEAKER_START, SPEAKER_END,
                                       MENTION_START, MENTION_END,
                                       COPY])
num_new_tokens += tokenizer.add_tokens([SENTENCE_START, SENTENCE_END])

# loading config and model
config = AutoConfig.from_pretrained(model_args.model_name_or_path)
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_args.model_name_or_path, config=config)

# data objects
collator = DataCollatorForSeq2Seq(tokenizer, model=model)
train_set = CorefDataset(tokenizer, data_args, training_args, 'train')
tb_callback = TensorBoardCallback()

trainer = CorefTrainer(
    tokenizer=tokenizer,
    model=model,
    args=training_args,
    train_dataset=train_set,
    # eval_dataset=dev_set,
    data_collator=collator,
    callbacks=[tb_callback]
)
trainer.train()
The traceback error is:
AttributeError Traceback (most recent call last)
<ipython-input-16-3435b262f1ae> in <cell line: 1>()
----> 1 trainer.train()
5 frames
/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
1857 hf_hub_utils.enable_progress_bars()
1858 else:
-> 1859 return inner_training_loop(
1860 args=args,
1861 resume_from_checkpoint=resume_from_checkpoint,
/content/Seq2seqCoref/trainer.py in _inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
169 self._train_batch_size = batch_size
170 # Data loader and number of training steps
--> 171 train_dataloader = self.get_train_dataloader()
172
173 # Setting up training control variables:
/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in get_train_dataloader(self)
877 dataloader_params["prefetch_factor"] = self.args.dataloader_prefetch_factor
878
--> 879 return self.accelerator.prepare(DataLoader(train_dataset, **dataloader_params))
880
881 def _get_eval_sampler(self, eval_dataset: Dataset) -> Optional[torch.utils.data.Sampler]:
/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py in prepare(self, device_placement, *args)
1246 )
1247
-> 1248 if self.distributed_type == DistributedType.DEEPSPEED:
1249 model_count = 0
1250 for obj in args:
/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py in distributed_type(self)
527 @property
528 def distributed_type(self):
--> 529 return self.state.distributed_type
530
531 @property
/usr/local/lib/python3.10/dist-packages/accelerate/state.py in __getattr__(self, name)
1074 # so we just modify the error message
1075 if name in self._known_attrs:
-> 1076 raise AttributeError(
1077 f"`AcceleratorState` object has no attribute `{name}`. "
1078 "This happens if `AcceleratorState._reset_state()` was called and "
AttributeError: `AcceleratorState` object has no attribute `distributed_type`. This happens if `AcceleratorState._reset_state()` was called and an `Accelerator` or `PartialState` was not reinitialized.
Expected behavior
Calling trainer.train() at the end of the code should train the model.
What is CorefTrainer? Does it create an AcceleratorState or PartialState somewhere? As the error hints, somewhere along the line the state was reset and an Accelerator or PartialState was not reinitialized afterwards.
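To make the failure mode described by the error message concrete, here is a minimal, standard-library-only sketch of a shared-state object that is reset while another object still holds a reference to it. `ToyState`, `reset_state`, and the value `"NO"` are hypothetical stand-ins chosen for illustration; this is not accelerate's code, only the general pattern the message points at.

```python
class ToyState:
    """Hypothetical stand-in for a shared-state singleton (not accelerate code)."""

    _shared_state = {}                      # one attribute dict shared by all instances
    _known_attrs = ["distributed_type"]

    def __init__(self):
        # Every instance uses the same dict as its attribute storage.
        self.__dict__ = self._shared_state
        if not self._shared_state:
            # First initialization (or first one after a reset) fills the state.
            self.distributed_type = "NO"

    @classmethod
    def reset_state(cls):
        # Empties the shared dict; existing instances now have no attributes.
        cls._shared_state.clear()

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name in self._known_attrs:
            raise AttributeError(
                f"state has no attribute `{name}`; it was reset and not reinitialized"
            )
        raise AttributeError(name)


# An object created early (for example inside a trainer) keeps a reference.
held_state = ToyState()
assert held_state.distributed_type == "NO"

# Something else resets the shared state afterwards.
ToyState.reset_state()
try:
    held_state.distributed_type
    stale_error = None
except AttributeError as exc:
    stale_error = str(exc)

# Reinitializing repopulates the shared dict, so the old reference works again.
ToyState()
recovered = held_state.distributed_type
```

In this sketch the old reference fails only in the window between the reset and the next initialization, which is why the order in which objects are created matters.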
I am sorry I did not specify CorefTrainer. I am using a custom trainer (you can check it at this link).
This custom trainer is a subclass of Seq2SeqTrainer. None of the functions implemented in the custom trainer resets AcceleratorState. I went through all of the code of Seq2SeqTrainer and Trainer, and the only method I could identify that creates the accelerator for a trainer object is create_accelerator_and_postprocess. I do not know whether I must provide some configuration to avoid this error.
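When reading the source does not reveal which call resets the state, one generic debugging option is to temporarily wrap the reset method so that each call records its call stack. The sketch below shows the idea on a hypothetical `Demo` class using only the standard library; `trace_calls`, `Demo`, and `some_library_code` are illustrative names. Applying it to `AcceleratorState._reset_state` (the method named in the error message) assumes that method is a static or class method in the installed version, which should be checked first.

```python
import traceback


def trace_calls(owner, name, log):
    """Wrap a static or class method on `owner` so each call appends its stack to `log`."""
    # For a staticmethod this is the plain function; for a classmethod, a bound method.
    original = getattr(owner, name)

    def wrapper(*args, **kwargs):
        log.append("".join(traceback.format_stack(limit=8)))
        return original(*args, **kwargs)

    setattr(owner, name, staticmethod(wrapper))
    return original  # keep this to restore the method later


class Demo:
    """Hypothetical class with shared state and a static reset method."""

    _shared_state = {"distributed_type": "NO"}

    @staticmethod
    def _reset_state():
        Demo._shared_state.clear()


def some_library_code():
    # Stands in for whatever code path triggers the reset.
    Demo._reset_state()


calls = []
original_reset = trace_calls(Demo, "_reset_state", calls)
some_library_code()

# Restore the unwrapped method once the caller has been identified.
Demo._reset_state = staticmethod(original_reset)
```

Each entry in `calls` is a formatted stack, so printing `calls[0]` shows the chain of functions that led to the reset.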
Reproduction
The args.json file employed below is available to download at: https://drive.google.com/file/d/1H2MstSq_oz7Xv7spMZCppf39fHGs5rW0/view?usp=drive_link.
The dataset specified in the args.json is the file: https://drive.google.com/file/d/18OVilNSqQQogSMiepe87vtNmzYpCalCs/view?usp=drive_link
The code I ran in Google Colab with these files, and the resulting traceback, are given above.