Custom Job Prefixing Fails with TransformJobName Not Specified Error #4590

Open
lazem opened this issue Apr 17, 2024 · 1 comment
Labels
component: pipelines Relates to the SageMaker Pipeline Platform type: question

lazem commented Apr 17, 2024

Describe the bug
When using the use_custom_job_prefix option in PipelineDefinitionConfig to enable custom prefixes for jobs within a SageMaker pipeline, an error is thrown stating that the TransformJobName has not been specified, even though job names are being used within the pipeline.

To reproduce

  • Configure a SageMaker pipeline with multiple steps, including a TransformStep.
  • Use base_transform_job_name when instantiating the Transformer.
  • Use PipelineDefinitionConfig with use_custom_job_prefix=True.
  • Attempt to create or update the pipeline using pipeline.upsert() or pipeline.create().

Expected behavior
The pipeline should accept the dynamically generated names for each job and apply the custom prefix as specified.

Actual behavior
The pipeline creation fails, and the following error is thrown:

ValueError: Invalid input: use_custom_job_prefix flag is set but the name field [TransformJobName] has not been specified. Please refer to the AWS Docs to identify which field should be set to enable the custom-prefixing feature for jobs created via a pipeline execution. https://docs.aws.amazon.com/sagemaker/latest/dg/build-and-manage-access.html#build-and-manage-step-permissions-prefix
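
To make the error easier to reason about, here is a minimal sketch of the kind of check the message describes. It is not the SDK's actual implementation: the function name `check_custom_prefix` and the example request dictionaries are hypothetical, and the assumption is only that the check fails when the flag is set and the name field is absent from the job request.

```python
def check_custom_prefix(request: dict, name_field: str, use_custom_job_prefix: bool) -> None:
    """Hypothetical check: raise if custom prefixing is on but the name field is missing."""
    if use_custom_job_prefix and not request.get(name_field):
        raise ValueError(
            "use_custom_job_prefix flag is set but the name field "
            f"[{name_field}] has not been specified."
        )


# A request that carries no job name (illustrative values only).
request_without_name = {"ModelName": "my-model"}
try:
    check_custom_prefix(request_without_name, "TransformJobName", True)
except ValueError as err:
    message = str(err)

# A request that does carry the name field passes the check.
request_with_name = {"ModelName": "my-model", "TransformJobName": "BE-Transform"}
result = check_custom_prefix(request_with_name, "TransformJobName", True)
```

Under this reading, the error means the request generated for the TransformStep did not contain TransformJobName at the time the pipeline definition was built, even though base_transform_job_name was set on the Transformer.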

Code Snippet

transformer = Transformer(
    model_name=create_model_step.properties.ModelName,
    sagemaker_session=sagemaker_session,
    base_transform_job_name="BE-Transform",
    instance_type="ml.m5.xlarge",
    instance_count=1,
    output_path=s3_output_path,
)
transform_step = TransformStep(
    name="be-transform-step",
    transformer=transformer,
    inputs=transform_input,
)

System information

  • SageMaker Python SDK version: 2.213.0
  • Python version: 3.11
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): Y

Is there a specific configuration or step missing that is required to correctly set the TransformJobName when using custom prefixes? Any guidance or fix would be greatly appreciated.

@lazem lazem added the bug label Apr 17, 2024
@knikure knikure added the component: pipelines Relates to the SageMaker Pipeline Platform label Apr 17, 2024
@qidewenwhen
Member

Hi @lazem , sorry for the delay.

Judging from the code snippet, it seems you are using the old TransformStep interface, which takes a Transformer object directly.

transform_step = TransformStep(
    name="be-transform-step",
    transformer=transformer,  # <<<<<<<<<<<<<<<
    inputs=transform_input,
)

This old interface is obsolete and we no longer actively maintain it. All new features, including use_custom_job_prefix, are built on top of the new step_args interface.

Could you try the new interface as shown below and see if it resolves the issue?

from sagemaker.transformer import Transformer
from sagemaker.workflow.pipeline_context import PipelineSession
from sagemaker.workflow.steps import TransformStep

pipeline_session = PipelineSession(...)  # https://sagemaker.readthedocs.io/en/stable/amazon_sagemaker_model_building_pipeline.html#pipeline-session
transformer = Transformer(
    model_name=create_model_step.properties.ModelName,
    sagemaker_session=pipeline_session,  # Note: use a PipelineSession object here; otherwise you will get an error.
    base_transform_job_name="BE-Transform",
    instance_type="ml.m5.xlarge",
    instance_count=1,
    output_path=s3_output_path,
)

# With a PipelineSession, transform() returns step arguments instead of starting a job.
step_args = transformer.transform(
    data=transform_input.data,
    data_type=transform_input.data_type,
    content_type=transform_input.content_type,
    # ...
)

transform_step = TransformStep(
    name="be-transform-step",
    step_args=step_args,  # <<<<<<<<<<<<<<
)
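
For completeness, the step then needs to be attached to a pipeline that has custom prefixing enabled. The following is a minimal sketch of that wiring, assuming the `transform_step` built with `step_args` above. The helper name `build_pipeline` and the default pipeline name are hypothetical, and the imports are deferred into the function only so the sketch can be loaded without the SDK installed.

```python
def build_pipeline(transform_step, pipeline_session=None, name="be-pipeline"):
    """Hypothetical helper: wrap a step in a Pipeline with custom job prefixing on."""
    # Deferred imports: the SageMaker Python SDK is only needed when this is called.
    from sagemaker.workflow.pipeline import Pipeline
    from sagemaker.workflow.pipeline_definition_config import PipelineDefinitionConfig

    # Enable the feature discussed in this issue.
    definition_config = PipelineDefinitionConfig(use_custom_job_prefix=True)

    return Pipeline(
        name=name,
        steps=[transform_step],
        sagemaker_session=pipeline_session,
        pipeline_definition_config=definition_config,
    )
```

Calling `build_pipeline(transform_step, pipeline_session).upsert(role_arn=...)` would then create or update the pipeline as in the reproduction steps. Whether this resolves the TransformJobName error has not been confirmed in this thread.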
