Problem:
I no longer know whether I used UUID4 strings or integers (marked as categorical) as feature values when training my now-historical models. I tried to find out from my JSON model dumps ( model.save_model(fname=path, format="json") ), but they contain no information about the feature values, only splits, feature-combination definitions, and so on.
I know there should be a "cat_features_hash" entry that holds this information, but it does not exist in my dumps.
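One way to confirm whether a dump carries this entry is to search the parsed JSON for the key at any depth. The following is a minimal sketch that assumes only that the key is literally named "cat_features_hash", as stated above. The helper names `find_key_paths` and `dump_has_cat_hashes` and the `sample` dictionary are hypothetical, and the sample does not reproduce the real dump layout.

```python
import json


def find_key_paths(node, key, path=()):
    """Return the path of every occurrence of `key` in nested dicts and lists."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            child = path + (name,)
            if name == key:
                found.append(child)
            found.extend(find_key_paths(value, key, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.extend(find_key_paths(value, key, path + (index,)))
    return found


def dump_has_cat_hashes(dump_path):
    """Load a JSON model dump and report whether "cat_features_hash" appears anywhere."""
    with open(dump_path, encoding="utf-8") as fh:
        return bool(find_key_paths(json.load(fh), "cat_features_hash"))


# Hypothetical stand-in for a dump that was saved without a pool.
sample = {"features_info": {"categorical_features": [{"feature_index": 0}]}}
print(find_key_paths(sample, "cat_features_hash"))  # prints []
```

Searching recursively avoids depending on where exactly the entry sits in the dump, which this sketch does not assume.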
With some help, I found in the documentation that I should have passed the "pool" argument.
The documentation says the following, but the code raises no warning (or error) when this argument is omitted:
This parameter is required if the model contains categorical features and the output format is cpp, python, or JSON.
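Until the library warns on its own, a thin wrapper can emulate the check described in the quoted documentation. This is a sketch of a possible user-side workaround, not CatBoost's behavior: `save_model_checked` and `FORMATS_NEEDING_POOL` are hypothetical names, and the wrapper relies only on the model's `get_cat_feature_indices()` and `save_model(fname, format=..., pool=...)` methods.

```python
import warnings

# Formats for which the quoted documentation says `pool` is required
# when the model contains categorical features.
FORMATS_NEEDING_POOL = {"cpp", "python", "json"}


def save_model_checked(model, fname, format="cbm", pool=None):
    """Hypothetical wrapper: warn when `pool` is missing, then save as usual."""
    has_cat_features = len(model.get_cat_feature_indices()) > 0
    if has_cat_features and pool is None and format in FORMATS_NEEDING_POOL:
        warnings.warn(
            f"Model has categorical features but no pool was passed for "
            f"format={format!r}; the export will lack categorical value hashes.",
            UserWarning,
            stacklevel=2,
        )
    model.save_model(fname, format=format, pool=pool)


# Intended usage (assumes `model` is a trained CatBoost model and
# `train_pool` is the Pool it was trained on):
# save_model_checked(model, "model.json", format="json", pool=train_pool)
```

The wrapper still saves the model after warning, which matches the documentation's note that a JSON dump without a pool remains available for review.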
Bonus comment:
The documentation says the following, but I was able to run predictions with the imported model, and the results looked fine:
The model can be saved to the JSON format without a pool. In this case it is available for review but it is not applicable.
catboost version: 1.2.2
Operating System: Google Cloud Vertex AI Workbench User-Managed Notebook
CPU: ?
GPU: No GPU
NickVeld changed the title from "[Model dump: JSON] cat_features_hash does not exist in JSON dump of model" to "[Model dump: JSON] model.save_model does not warn when required "pool" argument is skipped" on May 13, 2024
NickVeld added a commit to NickVeld/catboost that referenced this issue on May 15, 2024
[Appendix]
My dump structure: