Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(recap): Add suport to parse ACMS attachment pages #3393

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

ERosendo
Copy link
Contributor

@ERosendo ERosendo commented Nov 21, 2023

This PR introduces the logic to process attachment pages from ACMS. Here's a breakdown of the key changes:

  • New Upload Type: A new upload type named ACMS_ATTACHMENT_PAGE is added to the ProcessingQueue and pacerhtmlfiles model.

  • A new helper method process_recap_acms_appellate_attachment is introduced. This method encapsulates the logic for parsing, processing, and merging ACMS attachment pages.

  • The PacerDocIdLookUpSerializer class is updated to include the acms_document_guid field in its response. This field is crucial for matching documents within the attachment page and generating the recap icon in ACMS.

  • Factories are added to mock the dictionary returned by Juriscraper methods. Additionally, tests are included to verify the functionality of the new features.

@ERosendo ERosendo changed the title feat(recap): Add new upload type for ACMS attachment pages feat(recap): Add suport to parse ACMS attachment pages Nov 21, 2023
@ERosendo ERosendo force-pushed the feat-add-support-to-parse-attachment-pages-from-acms branch 3 times, most recently from 7fb2719 to a58920e Compare November 21, 2023 17:09
@ERosendo ERosendo force-pushed the feat-add-support-to-parse-attachment-pages-from-acms branch from a58920e to 855aaf6 Compare November 21, 2023 17:22
cl/recap/mergers.py Outdated Show resolved Hide resolved
Adds a new parameter to the merge_attachment_page_data method and tweaks  the merge mechanism to integrate ACMS attachment pages into their corresponding docket entries.
@ERosendo ERosendo force-pushed the feat-add-support-to-parse-attachment-pages-from-acms branch 2 times, most recently from 0d221ca to 4ae06ee Compare November 21, 2023 22:40
@ERosendo ERosendo force-pushed the feat-add-support-to-parse-attachment-pages-from-acms branch from 4ae06ee to 5214801 Compare November 21, 2023 22:42
@ERosendo ERosendo marked this pull request as ready for review May 22, 2024 18:32
@ERosendo ERosendo requested a review from mlissner May 22, 2024 18:32
@mlissner
Copy link
Member

@grossir, can you please review again before I take another look? It's been a month or two and it looks like we've got some changes.

Thank you!

Copy link
Contributor

@grossir grossir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look fine, I just left a couple of observations

cl/recap/api_serializers.py Outdated Show resolved Hide resolved
"docket_entry", "docket_entry__docket"
)
.filter(**params)
.afirst()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure the first will be the desired RD? Note that the RecapDocument model has this Meta attribute ordering = ("document_type", "document_number", "attachment_number")

Copy link
Contributor Author

@ERosendo ERosendo May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure the first will be the desired RD?

Yes, we use attachment #1 as the main_rd for attachment pages. Unlike district court pages, ACMS pages only have attachments. Here's a comparison:

District court ACMS
image image

Note that the RecapDocument model has this Meta attribute ordering = ("document_type", "document_number", "attachment_number")

This won't be an issue because ACMS documents within an attachment page share identical pacer_doc_id, document_type (ATTACHMENT), document_number (the entry number), this ensures the queryset is ordered with attachment #1 appearing first. Here's an example:

pacer_doc_id document_type document_number attachment_number
583c6255-3fd2-ed11-b596... 2 29 1
583c6255-3fd2-ed11-b596... 2 29 2

…S types

This commit updates the ProcessingQueueSerializer to include validation for ACMS types. This ensures data integrity and prevents potential errors when processing ACMS-related data.
@ERosendo ERosendo requested a review from grossir May 29, 2024 17:12
@mlissner
Copy link
Member

The model changes are noops. Would it be possible to rename them to something like 123_add_upload_types_noop.py/sql, so that those of us doing deployments (and those inspecting migrations later) can see that they're not changing the DB? I think this is in our dev guide as a best practice, but if not, we should probably add it.

@ERosendo
Copy link
Contributor Author

ERosendo commented May 29, 2024

Would it be possible to rename them to something like 123_add_upload_types_noop.py/sql

Sure!

I think this is in our dev guide as a best practice, but if not, we should probably add it.

You're right, it is described in the "Steps to making new migrations" section of the Database Migrations wiki page.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🔎In Review
Status: Priority
Development

Successfully merging this pull request may close these issues.

None yet

4 participants