Prodigy annotator #2655

Merged
merged 56 commits into from
May 14, 2024

@strickvl strickvl commented May 1, 2024

This pull request adds the Prodigy Annotator integration to ZenML. Prodigy is a powerful annotation tool that allows for efficient data labeling. With this integration, users can now connect ZenML with Prodigy and leverage its annotation capabilities in their ML pipelines.


Note that this PR includes no tests: the prodigy package can only be installed from a custom wheel that is shipped when you buy a license, so it cannot be installed and used in our CI. I also made one small change to the base annotator interface to accommodate this new annotator.

I've developed and tested this on my machine (since I have a license).

Summary by CodeRabbit

  • New Features

    • Introduced Prodigy integration for ZenML, enhancing data annotation capabilities.
    • Added new documentation pages for ProdigyAnnotator and updated Table of Contents.
    • Enhanced BaseAnnotator to accept additional keyword arguments, improving flexibility.
  • Documentation

    • Updated documentation to include guides and references for Prodigy integration.
  • Chores

    • Adjusted installation scripts to include Prodigy in the list of ignored integrations during development setup.
  • Refactor

    • Modified CLI functions to handle different annotator flavors and improve error handling during dataset operations.
  • Bug Fixes

    • Fixed package installation handling in utility scripts to manage Prodigy package exceptions.


github-actions bot commented May 1, 2024

Images automagically compressed by Calibre's image-actions

Compression reduced images by 38.1%, saving 36.22 KB.

| Filename | Before | After | Improvement | Visual comparison |
| docs/book/.gitbook/assets/prodigy-annotator.png | 95.02 KB | 58.81 KB | -38.1% | View diff |

256 images did not require optimisation.

Update required: Update image-actions configuration to the latest version before 1/1/21. See README for instructions.


strickvl commented May 1, 2024

@coderabbitai review


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Out of diff range and nitpick comments (3)
scripts/install-zenml-dev.sh (2)

Line range hint 51-51: Consider enclosing $ignore_integrations_args in double quotes to prevent globbing and word splitting.

-    zenml integration export-requirements --output-file integration-requirements.txt $ignore_integrations_args
+    zenml integration export-requirements --output-file integration-requirements.txt "$ignore_integrations_args"
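One caveat on the quoting suggestion above, as a minimal sketch: if the variable holds several space-separated flags, quoting it passes them all as a single argument, which is probably not what the command expects. The contents of `ignore_integrations_args` are not shown in this thread, so the `flags` value and the `count_args` helper below are hypothetical and only illustrate the shell behavior.

```shell
#!/bin/sh
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Hypothetical value; the real variable's contents are an assumption.
flags="--ignore-integration prodigy --ignore-integration example"

# Unquoted: the shell splits the value on whitespace into four arguments
# (and would also glob-expand any wildcard characters in it).
unquoted_count=$(count_args $flags)

# Quoted: the whole string arrives as one single argument.
quoted_count=$(count_args "$flags")

# Portable alternative: keep the flags as a real argument list and expand
# it with "$@", which preserves each argument without splitting or globbing.
set -- --ignore-integration prodigy --ignore-integration example
list_count=$(count_args "$@")

echo "unquoted=$unquoted_count quoted=$quoted_count list=$list_count"
# prints: unquoted=4 quoted=1 list=4
```

If the script targets bash, an array expanded as `"${args[@]}"` gives the same result as the `"$@"` form and is usually the cleaner way to satisfy the linter without changing what the command receives.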

Line range hint 54-54: For efficiency, consider using braces for multiple commands redirecting to the same file.

-    echo "" >> integration-requirements.txt
-    echo "pyyaml>=6.0.1" >> integration-requirements.txt
-    echo "pyopenssl" >> integration-requirements.txt
-    echo "-e .[server,templates,terraform,secrets-aws,secrets-gcp,secrets-azure,secrets-hashicorp,s3fs,gcsfs,adlfs,dev,mlstacks]" >> integration-requirements.txt
+    {
+        echo ""
+        echo "pyyaml>=6.0.1"
+        echo "pyopenssl"
+        echo "-e .[server,templates,terraform,secrets-aws,secrets-gcp,secrets-azure,secrets-hashicorp,s3fs,gcsfs,adlfs,dev,mlstacks]"
+    } >> integration-requirements.txt
src/zenml/integrations/prodigy/annotators/prodigy_annotator.py (1)

67-76: The method get_url_for_dataset does not use the dataset_name parameter. Consider removing it if it's truly unnecessary, or document why it's kept.

@strickvl strickvl changed the title Add Prodigy Annotator Integration Prodigy Annotator Integration May 1, 2024

@avishniakov avishniakov left a comment


LGTM

@strickvl strickvl mentioned this pull request May 9, 2024
2 tasks
@strickvl strickvl changed the title Prodigy Annotator Integration Prodigy annotator May 13, 2024
@strickvl strickvl merged commit 75e65c8 into develop May 14, 2024
58 of 59 checks passed
@strickvl strickvl deleted the feature/prodigy-annotator branch May 14, 2024 15:19
Labels
enhancement New feature or request internal To filter out internal PRs and issues run-slow-ci

3 participants