Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataflow changes #384

Draft
wants to merge 12 commits into
base: main
Choose a base branch
from
Draft

Dataflow changes #384

wants to merge 12 commits into from

Conversation

jiya-zhang
Copy link
Contributor

@jiya-zhang jiya-zhang commented Mar 25, 2024

I changed a few things in this PR:

  1. Add Dockerfile entrypoint for Dataflow (this is needed for Dataflow worker to start up successfully)
  2. Mount gcloud config folder to docker container (so that container will have the credentials to access GCS, etc.)
  3. Format command to avoid long commands and strip trailing newlines

Copy link
Contributor

@markblee markblee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Some initial comments/questions.

Dockerfile Show resolved Hide resolved
axlearn/cloud/gcp/jobs/dataflow.py Outdated Show resolved Hide resolved
dataflow_word_count.sh Outdated Show resolved Hide resolved
Copy link
Contributor

@markblee markblee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! A few additional suggestions while testing this PR:

  1. On L175, let's change to cfg.command = f"{cfg.command.strip()} {dataflow_flags}" so that trailing newlines in the command do not cause dataflow_flags to be interpreted as a separate command.
  2. L249, similar change to cmd = f"{cfg.setup_command.strip()} && {cmd}".
  3. You may want to rebase onto latest main (which fixes the bundler specs) for local testing.

axlearn/cloud/gcp/jobs/dataflow.py Outdated Show resolved Hide resolved
Copy link
Contributor

@markblee markblee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, also update the PR description?

@jiya-zhang
Copy link
Contributor Author

@markblee Do you mind trigger the CI tests again? The error should disappear now that dependency issue was fixed and merged. (I don't have permission to rerun CI tests)

@markblee
Copy link
Contributor

@markblee Do you mind trigger the CI tests again? The error should disappear now that dependency issue was fixed and merged. (I don't have permission to rerun CI tests)

It failed again with the same issue. Did you rebase?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants