
[Q] How to turn off only model synching in huggingface integration #7657

Open · jubueche opened this issue May 16, 2024 · 11 comments

@jubueche

Hi.
I am training large models, and they are being uploaded to wandb as artifacts. How do I turn off only this feature? I tried googling but couldn't find an answer.

@ArtsiomWB
Contributor

Hi @jubueche, could you please tell us a bit about your workflow and why you are interested in turning off the artifact logging?

@jubueche
Author

Hi,

I don't need to turn off artifact logging in general; I just don't want my model to get synced to wandb. My models are multiple GB in size, so uploading them takes considerable space and time.

@ArtsiomWB
Contributor

Gotcha, could you please try setting os.environ["WANDB_LOG_MODEL"] = "false"?

Here are our docs on it
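For reference, a minimal sketch of where that setting would go, assuming the standard Trainer integration; the variable should be set before the Trainer is created, since the wandb callback reads it when the integration is set up (the model and dataset are placeholders):

import os

# Keep metric logging but disable model upload to W&B.
# Set this before the Trainer is created, when the wandb callback reads it.
os.environ["WANDB_LOG_MODEL"] = "false"

from transformers import Trainer, TrainingArguments

args = TrainingArguments(output_dir="out", report_to=["wandb"])
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()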

@umarbutler

> Gotcha, could you please try setting os.environ["WANDB_LOG_MODEL"] = "false"?
>
> Here are our docs on it

This does not work.

@jubueche
Author

@ArtsiomWB I can confirm. This does not work.

@jubueche
Author

jubueche commented May 23, 2024

# # log the initial model and architecture to an artifact
# with tempfile.TemporaryDirectory() as temp_dir:
#     model_name = (
#         f"model-{self._wandb.run.id}"
#         if (args.run_name is None or args.run_name == args.output_dir)
#         else f"model-{self._wandb.run.name}"
#     )
#     model_artifact = self._wandb.Artifact(
#         name=model_name,
#         type="model",
#         metadata={
#             "model_config": model.config.to_dict() if hasattr(model, "config") else None,
#             "num_parameters": self._wandb.config.get("model/num_parameters"),
#             "initial_model": True,
#         },
#     )
#     model.save_pretrained(temp_dir)
#     # add the architecture to a separate text file
#     save_model_architecture_to_file(model, temp_dir)

#     for f in Path(temp_dir).glob("*"):
#         if f.is_file():
#             with model_artifact.new_file(f.name, mode="wb") as fa:
#                 fa.write(f.read_bytes())
#     self._wandb.run.log_artifact(model_artifact, aliases=["base_model"])

#     badge_markdown = (
#         f'[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge'
#         f'-28.svg" alt="Visualize in Weights & Biases" width="20'
#         f'0" height="32"/>]({self._wandb.run.get_url()})'
#     )

#     modelcard.AUTOGENERATED_TRAINER_COMMENT += f"\n{badge_markdown}"

I just commented out the above section in integration_utils.py of Hugging Face's transformers.

@ArtsiomWB
Contributor

Hey @jubueche, thank you so much for the workaround. It is strange that os.environ["WANDB_LOG_MODEL"] = "false" is not working on your side. What version of wandb are you currently on? I will try to reproduce this on my end.

@ArtsiomWB
Contributor

Hi there, I wanted to follow up on this request. Please let us know if we can be of further assistance or if your issue has been resolved.

@jubueche
Author

Hi, sorry. My wandb version is:

>>> wandb.__version__
'0.16.4'

For now, I am just using the code with the commented-out section.
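In case it helps others, a less invasive workaround than editing the installed library (a sketch, assuming you still want metrics in wandb): remove the stock WandbCallback from the Trainer and add a minimal callback that only forwards metrics and never creates model artifacts. The callback class name and project name below are hypothetical:

import wandb
from transformers import TrainerCallback
from transformers.integrations import WandbCallback

class MetricsOnlyWandbCallback(TrainerCallback):
    # Logs scalar metrics to wandb but never uploads model artifacts.
    def on_train_begin(self, args, state, control, **kwargs):
        if state.is_world_process_zero and wandb.run is None:
            wandb.init(project="my-project")  # hypothetical project name

    def on_log(self, args, state, control, logs=None, **kwargs):
        if state.is_world_process_zero and logs:
            wandb.log(logs, step=state.global_step)

trainer.remove_callback(WandbCallback)       # drop the built-in integration
trainer.add_callback(MetricsOnlyWandbCallback())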

@umarbutler

> [quotes @jubueche's workaround above: the commented-out artifact-logging block from integration_utils.py]

Do you know if this was a recent addition to transformers? Maybe the problem is on their side? This problem doesn't arise for me on a different system with an older version of transformers.

@ArtsiomWB
Contributor

Thank you so much for the follow-up, @umarbutler. Hey @jubueche, are you able to try a different version of transformers and see if that fixes it?

@umarbutler, what version currently works for you?
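For anyone comparing setups, a quick way to report both versions (mirroring the wandb snippet above):

import transformers
import wandb

# Print the installed versions of both libraries.
print("transformers:", transformers.__version__)
print("wandb:", wandb.__version__)

An older transformers release can then be installed with pip to check whether the behavior changes (the version to pin is purely illustrative, e.g. pip install "transformers==4.38.2").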
