Sagemaker TSX: Error with num_samples #4648

Closed
Alex-Wenner-FHR opened this issue May 3, 2024 · 1 comment

@Alex-Wenner-FHR

Describe the bug

org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 830, in main
    process()
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 822, in process
    serializer.dump_stream(out_iter, outfile)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 274, in dump_stream
    vs = list(itertools.islice(iterator, batch))
  File "/usr/local/lib/python3.9/site-packages/analyzer/analyzers/timeseries/timeseries_asymmetric_shap_analyzer.py", line 251, in explain
    instance_explanation=explainer.explain(self._to_explainer_input(row), baseline_config=baseline_config),
  File "/usr/local/lib/python3.9/site-packages/explainers/shap/asymmetric_shap/asymmetric_shap.py", line 94, in explain
    return self._explain_time_series(input_dataset, baseline_config or TimeSeriesBaselineConfig())
  File "/usr/local/lib/python3.9/site-packages/explainers/shap/asymmetric_shap/asymmetric_shap.py", line 121, in _explain_time_series
    return self._compute_feature_attributions(input_dataset, synthetic_dataset, baseline)
  File "/usr/local/lib/python3.9/site-packages/explainers/shap/asymmetric_shap/asymmetric_shap.py", line 179, in _compute_feature_attributions
    inference_result = self._model(synthetic_dataset.dataset)
  File "/usr/local/lib/python3.9/site-packages/analyzer/predictor/predictor.py", line 63, in __call__
    return self.predict(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/analyzer/predictor/predictor.py", line 328, in predict
    return np.array(predicted_labels).reshape(
ValueError: cannot reshape array of size 7722 into shape (289,786,26)

This error appears to be directly related to the num_samples parameter in AsymmetricShapleyValueConfig.
My understanding is that, with fine_grained granularity, num_samples should be (dimension of target time series + dimension of related time series)^2. In my use case the target dimension is 1 and the related time series dimension is 16, so 17^2 = 289. That is the value I am specifying: num_samples = 289
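For reference, the arithmetic behind the error can be checked with a minimal sketch. It uses only the numbers quoted in this issue (the dimensions 1 and 16, and the sizes in the ValueError); the variable names are illustrative and are not taken from the SageMaker code. It does not call the explainer, it only reproduces the reshape failure with a dummy array.

```python
import numpy as np

# Values stated in the issue text.
target_dims = 1
related_dims = 16
num_samples = (target_dims + related_dims) ** 2  # 17^2

# Values copied from the ValueError message.
returned_size = 7722             # number of values the predictor tried to reshape
expected_shape = (289, 786, 26)  # shape the predictor tried to reshape into

# The first axis of the expected shape equals num_samples.
print(num_samples == expected_shape[0])

# Number of values the reshape would need.
expected_size = int(np.prod(expected_shape))
print(expected_size, returned_size)

# Reproduce the failure with a dummy array of the returned size.
predicted_labels = np.zeros(returned_size)
try:
    predicted_labels.reshape(expected_shape)
except ValueError as err:
    print(err)

# The returned size is a multiple of the last axis, but far smaller
# than the full product of the expected shape.
print(returned_size % expected_shape[-1], returned_size // expected_shape[-1])
```

The reshape needs 289 * 786 * 26 = 5,906,004 values, while only 7,722 (= 297 * 26) were available, so the array handed to the reshape held far fewer values than the explainer expected for the records it sent.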

To reproduce
I am unsure how to reproduce this if the same configuration works for others.

Expected behavior
I would expect the implementation to function properly.

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.218.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): N/A
  • Framework version: N/A
  • Python version: 3.9
  • CPU or GPU: Both
  • Custom Docker image (Y/N): N
rvasahu-amazon self-assigned this May 7, 2024
@Alex-Wenner-FHR

This has since been resolved. I had an issue with the way my endpoint was handling the invocations that the explainer was sending.
