Unable to run example humaneval code #27
Comments
Same issue here.
After I removed the keyword, it raised another error. I then looked into the package (1.0.1.1) installed on my local server and found that the code in this version is out of sync with the main branch of the repo. The latest main branch seems to have fixed this issue, so I think we can fix it by reinstalling the package from the repo rather than from pip.
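To confirm which release is actually installed before reinstalling from the repo, something like the following might help. This is only a sketch: the distribution name `salesforce-codetf` is an assumption (use whatever name `pip list` reports), and `installed_version` is a helper name made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version string of a distribution, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumed distribution name; adjust it to match your environment.
print(installed_version("salesforce-codetf"))
```

If this prints 1.0.1.1, the local copy is the pip release discussed above rather than the current main branch.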
For the `TensorDataset` NameError, I found that importing `TensorDataset` (it lives in `torch.utils.data`) solves the issue.
I would recommend upgrading numpy as well.
```python
!pip install sentencepiece  # Colab shell command, run in a notebook cell

import os

from torch.utils.data import TensorDataset  # required by the TensorDataset call below

from codetf.models import load_model_pipeline
from codetf.data_utility.human_eval_dataset import HumanEvalDataset
from codetf.performance.model_evaluator import ModelEvaluator

# Allow the Hugging Face code_eval metric to execute generated code
os.environ["HF_ALLOW_CODE_EVAL"] = "1"
os.environ["TOKENIZERS_PARALLELISM"] = "true"

model_class = load_model_pipeline(model_name="causallm", task="pretrained",
                                  model_type="codegen-350M-mono", is_eval=True,
                                  load_in_8bit=True, weight_sharding=False)

# Tokenize the HumanEval prompts and collect the reference unit tests
dataset = HumanEvalDataset(tokenizer=model_class.get_tokenizer())
prompt_token_ids, prompt_attention_masks, references = dataset.load()
problems = TensorDataset(prompt_token_ids, prompt_attention_masks)

evaluator = ModelEvaluator(model_class)
avg_pass_at_k = evaluator.evaluate_pass_k(problems=problems, unit_tests=references)
print("Pass@k: ", avg_pass_at_k)
```
Above is the code I used. When I ran it in Google Colab, I received the following error:
in <cell line: 15>:15 │
│ │
│ /usr/local/lib/python3.10/dist-packages/codetf/data_utility/human_eval_dataset.py:29 in load │
│ │
│ 26 │ │ │ unit_test = re.sub(r'METADATA = {[^}]*}', '', unit_test, flags=re.MULTILINE) │
│ 27 │ │ │ references.append(unit_test) │
│ 28 │ │ │
│ ❱ 29 │ │ prompt_token_ids, prompt_attention_masks = self.process_data(prompts, use_max_le │
│ 30 │ │ │
│ 31 │ │ return prompt_token_ids, prompt_attention_masks, references │
│ 32 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: BaseDataset.process_data() got an unexpected keyword argument 'use_max_length'
After looking through the source code, I can't find this keyword argument anywhere; the only similar one is `max_length`. Would anyone mind shedding some light on the issue?
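One way to check whether the installed `process_data` accepts a given keyword is to inspect its signature at runtime instead of reading the source. The sketch below uses a stand-in class, because the real signature in the installed package is not shown in this thread. `accepts_keyword`, `FakeBaseDataset`, and the `max_length=512` default are all hypothetical.

```python
import inspect


def accepts_keyword(func, name: str) -> bool:
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if param is not None and param.kind in keyword_kinds:
        return True
    # A **kwargs parameter absorbs any keyword that is not named explicitly.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class FakeBaseDataset:
    """Stand-in for illustration only; not the real codetf class."""

    def process_data(self, prompts, max_length=512):
        return prompts


print(accepts_keyword(FakeBaseDataset.process_data, "use_max_length"))
print(accepts_keyword(FakeBaseDataset.process_data, "max_length"))
```

With the real package installed, the same check could be pointed at `HumanEvalDataset.process_data` to see whether the local version knows about `use_max_length`.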