Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MailPace: New ESP #355

Open
wants to merge 25 commits into
base: main
Choose a base branch
from
Open

Conversation

paul-oms
Copy link

@paul-oms paul-oms commented Feb 1, 2024

Hi Anymail team, a fresh PR for https://mailpace.com with verified commits, pynacl optional, and the linting issues fixed. I was unable to run everything locally, so 🤞 this passes the Github Actions this time round.

  • Backend
  • Webhooks
  • Inbound
  • Tags
  • Tests
  • Docs
  • Readme/project/tox/workflows etc.

tests/utils_mailpace.py Outdated Show resolved Hide resolved
@medmunds
Copy link
Contributor

medmunds commented Feb 3, 2024

This is looking pretty good so far. Added a couple of comments above that might help get the tests working. I'll try to get to a closer review over the weekend.

@paul-oms
Copy link
Author

paul-oms commented Feb 4, 2024

Phew, finally got the tests to pass. Let me know what else is needed.

Also, if you need an API key for integration testing, set up an account at https://app.mailpace.com and we'll add you to the free plan (presumably I can't put the API key anywhere for you).

@medmunds
Copy link
Contributor

medmunds commented Feb 6, 2024

Nice! Glad you got the tests working.

I created mailpace.anymail.dev as a testing domain. It would be great if you could set it up for free API use.

There's a decision to be made about tracking webhook verification. I think the choices are:
A. Signature verification (MAILPACE_WEBHOOK_KEY, pynacl) is optional; if not used, HTTP basic auth (WEBHOOK_SECRET) is "required" (strongly encouraged via Anymail warning message). This is how you've got it documented.
B. Signature verification is required. It's an error to use the MailPace tracking webhook without MAILPACE_WEBHOOK_KEY set and pynacl installed. This is easier to implement and test.

For option A, we need to pull in something like this code to control the basic auth warning, and then also rework the webhook tests to cover both scenarios (rather than just skipping all tests when pynacl isn't installed), similar to how the Resend webhook tests handle optional signature validation with the svix package.

Option B would also be fine—it's really up to you. (pynacl is not tied to any proprietary service, seems likely to be maintained for many years, and doesn't have any other dependencies. It wasn't immediately obvious svix fit those criteria, which is why Anymail's Resend webhooks make signature verification with svix optional.) For option B, we just need to add this line, and allow the configuration error if the webhook key is missing.

@paul-oms
Copy link
Author

paul-oms commented Feb 6, 2024

Thanks! I think Option A is the right choice - always better to make 3rd party dependencies optional.

Does this change cover it?

@paul-oms
Copy link
Author

paul-oms commented Feb 6, 2024

I created mailpace.anymail.dev as a testing domain. It would be great if you could set it up for free API use.

Added to our secret free plan ;)

Copy link
Contributor

@medmunds medmunds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good!

I have a handful of feature requests and a handful of minor suggestions in this review.

The updated webhook validation looks fine. I'll take a closer look at the tests tomorrow, but I think you've got all the cases covered.

anymail/backends/mailpace.py Outdated Show resolved Hide resolved
anymail/backends/mailpace.py Outdated Show resolved Hide resolved
)
elif status_msg == "error":
if "errors" in json_response:
for field in ["to", "cc", "bcc"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused by the logic in this section.

The goal is to provide a specific send status for each recipient, if known.

I think this code will end up setting the sent status for every recipient based on the last reported error in the response (for any recipient). So if all "to" addresses were valid, but one of the cc's was on the blocked list, all of the "to" and the "cc" recipients would get reported as "blocked".

(Are there any cases where MailPace would send the message to some recipients, but not to others? Or if there's a problem with any to/cc/bcc recipient, is the entire message always rejected?)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the latter - if there's a problem with any recipient (e.g. blocked or invalid) the email will be rejected (error http status from the endpoint) and not sent, as the entire email is technically not valid in that case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thanks, that helps. I also did a little testing. [Sorry, this comment is kind of long—there are a bunch of cases to consider.]

The goal of parse_recipient_status is: based on the API response, for each individual "to" recipient, provide the send status if we know it, or "failed" if the message definitely wasn't sent to (queued for) that recipient but we can't exactly say why, or "unknown" if we're not sure whether or not the message was sent to that particular recipient. (This is especially important with some other ESPs that don't fail the entire message, but instead send to some recipients but not others.)

If that status information is also available for "cc" and "bcc" recipients, it's great to provide it. Otherwise, it's OK to just omit cc and bcc recipients.

In my testing, it looks like MailPace's API response currently reports:

  • {"errors":{"to":["is invalid"]}} if at least one (but not necessarily all) of the "to" recipients is invalid
  • {"errors":{"to":["contains a blocked address"]}} if at least one (but not necessarily all) of the "to" recipients is on the blocked list
  • {"errors":{"to":["contains a blocked address", "is invalid"]}} if both conditions apply
  • With multiple "to" addresses, there is no way to tell from the API response which of the addresses did or didn't cause these errors
  • It seems like errors in "cc" or "bcc" addresses are not currently reported—the message is queued and delivered to the "to" recipients. [Aside: it looks like delivery is attempted for a blocked cc. An invalid cc field seems to cause the message to get stuck in the Queued status. I'm guessing these might be bugs?]

With that in mind, here are some scenarios parse_recipient_status needs to cover:

  • One or more "to" recipients, no errors
    • Current implementation: all "to" recipients are reported "queued", which is the desired result
  • Exactly one "to" recipient, invalid address
    • Current implementation: the recipient is reported as "invalid", which is the desired result
  • Exactly one "to" recipient, on the blocked list
    • Current implementation: the recipient is reported as "rejected", which is the desired result
  • Multiple "to" recipients, some invalid and/or blocked (and some not)
    • Current implementation: all "to" recipients are reported as either "invalid" or "rejected", even if that individual recipient was valid. This is misleading. (Also, when both errors apply, whichever happens to be last in the errors list determines the status for all recipients.)
    • Desired result: with multiple recipients and some invalid/blocked errors, use status "failed" for all "to" recipients. (We know the message wasn't sent to that recipient, but for each individual address we can't be sure why—or even that there was a problem with that address.)
  • Any of the above, but also with "cc" and "bcc" recipients
    • Current implementation: "cc" and "bcc" recipients are given the same status as the "to" recipient(s) ("queued" or "invalid" or "rejected"), which may be misleading (because MailPace doesn't currently report errors on cc and bcc, and the cc/bcc recipients aren't necessarily invalid/blocked just because one of the to recipients is)
    • Desired result: probably easiest to just omit "cc" and "bcc" recipients. (Could instead give them status "unknown" if the "to" is "queued", or "failed" if the "to" is invalid/blocked, but that doesn't add much value.)
  • Zero "to" recipients, one or more "cc" and/or "bcc" recipients
    • [Developers sometimes try to use cc/bcc-only messages to implement something like a mailing list]
    • Current implementation: all "cc" and "bcc" recipients are given status "invalid", which will be misleading (the cc and bcc recipients are likely valid email addresses)
    • Desired result: since MailPace doesn't support messages without "to", just let its {"errors":{"to":["undefined field"]}} response be reported as an AnymailRequestsAPIError. (Don't try to handle this through recipient status.)
  • More than 50 "to" recipients
    • Current implementation: all recipients are reported as "invalid", which is misleading
    • Desired result: just let MailPace's "number of email addresses exceeds maximum volume" response be reported as an AnymailRequestsAPIError. (Don't try to handle this through recipient status.)

Here's a suggested approach that I think achieves all of this:

  • Replace to_cc_and_bcc_emails with just to_emails
  • If response.status_code == 200, return status = "queued" for all to_emails
    • Is it necessary to check parsed_response["status"] == "queued"? (Could there ever be any other status with a 200 response?)
  • If response.status_code == 400 and the only field in parsed_response["errors"] is "to" and the errors in parsed_response["errors"]["to"] are only "is invalid" and/or "contains a blocked address":
    • If there is exactly one to_emails:
      • If "is invalid", return status = "invalid" for all to_emails
      • If "contains a blocked address", return status = "rejected" for all to_emails
      • If both, return status = "failed" for all to_emails
    • Else (more than one to_emails) return status = "failed" for all to_emails (we don't know which are invalid or rejected)
  • In all other cases, raise an AnymailRequestsAPIError for the response
    • This includes all other problems that might be reported in parsed_response["errors"]["to"]
    • This also covers the case where MailPace later starts reporting errors in cc/bcc fields
  • (Wrap all parsed_response[...] access to convert KeyError and TypeError to AnymailRequestsAPIError("Unexpected MailPace API response format"...))

Does that make sense? Have I missed anything?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @medmunds - my suggestion here is for us to fix the issues with non-reporting of errors and blocked emails to cc and bcc'd recipients first (plus the other bug reported below), then come back and fix the PR appropriately. We'll get back shortly on that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix the issues with non-reporting of errors and blocked emails to cc and bcc'd recipients first

Sounds good, thanks.

anymail/backends/mailpace.py Outdated Show resolved Hide resolved
anymail/backends/mailpace.py Show resolved Hide resolved
anymail/backends/mailpace.py Show resolved Hide resolved
anymail/webhooks/mailpace.py Outdated Show resolved Hide resolved
anymail/webhooks/mailpace.py Outdated Show resolved Hide resolved
anymail/webhooks/mailpace.py Outdated Show resolved Hide resolved
anymail/webhooks/mailpace.py Outdated Show resolved Hide resolved
@paul-oms
Copy link
Author

paul-oms commented Feb 7, 2024

All done - there's one point to check above, but I think it's correct as is.

@medmunds
Copy link
Contributor

medmunds commented Feb 7, 2024

Was working on integration tests, and noticed there seems to be a problem with commas in "to" display names. Try sending "to": "\"Example, Inc.\" <info@example.com>". I get a 400 response with "errors": { "to": [ "is invalid" ] }.

(No need to go through Anymail—same problem just using curl with json. Probably also affects cc, bcc, and reply_to, though I haven't tested. The "from" field seems OK with commas in display names.)

Trouble with various characters in email display names is pretty common, and normally Anymail implements a workaround involving RFC-2047 encoding. Which is an option here, but maybe you want to look into a fix on your end instead?

@medmunds
Copy link
Contributor

medmunds commented Feb 7, 2024

Also, I've added some integration tests. here are some integration tests: test_mailpace_integration.py

You can run them locally with:

export ANYMAIL_TEST_MAILPACE_SERVER_TOKEN=your-server-token
export ANYMAIL_TEST_MAILPACE_DOMAIN=sending.domain.for.your.server.token
python runtests.py tests.test_mailpace_integration 

(These tests won't run in this PR until it's merged, because repository secrets aren't available within PRs from forks. They send real messages, but the destination addresses are all sink emails at anymail.dev.)

(Unfortunately, I can't just add this file to your PR because of a long-standing GitHub bug with forks scoped to organizations. You might want to add me as a collaborator on your django-anymail fork.)

@paul-oms
Copy link
Author

paul-oms commented Feb 7, 2024

Was working on integration tests, and noticed there seems to be a problem with commas in "to" display names. Try sending "to": "\"Example, Inc.\" <info@example.com>". I get a 400 response with "errors": { "to": [ "is invalid" ] }.

(No need to go through Anymail—same problem just using curl with json. Probably also affects cc, bcc, and reply_to, though I haven't tested. The "from" field seems OK with commas in display names.)

Trouble with various characters in email display names is pretty common, and normally Anymail implements a workaround involving RFC-2047 encoding. Which is an option here, but maybe you want to look into a fix on your end instead?

Thanks! This is most likely a bug on our end. I'll look into it and see if we can fix it.

@paul-oms
Copy link
Author

paul-oms commented Feb 7, 2024

You might want to add me as a collaborator on your django-anymail fork.)

Done!

@medmunds
Copy link
Contributor

medmunds commented Feb 7, 2024

Thanks, pushed the integration tests.

You might want to add some tests for the features you enabled recently: list-unsubscribe header (but other headers unsupported) and overriding the api token in esp_extra. (I'm happy to add these tests, too.)

I'm still thinking about the best way to map recipient status, will follow up later. Everything else looks good.

Copy link
Contributor

@medmunds medmunds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handful of issues from testing event tracking webhooks


return AnymailTrackingEvent(
event_type=event_type,
timestamp=parse_datetime(payload["created_at"]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

timestamp is meant to be the time the event occurred. I think created_at is the time the email was originally sent, even for later events like delivered or bounced. I wonder if updated_at might be more appropriate?

return AnymailTrackingEvent(
event_type=event_type,
timestamp=parse_datetime(payload["created_at"]),
event_id=payload["id"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

event_id is meant to be a unique ID for this event, to help avoid processing the same event twice. I don't believe MailPace provides that sort of ID, so probably just pass event_id=None

event_type=event_type,
timestamp=parse_datetime(payload["created_at"]),
event_id=payload["id"],
message_id=payload["message_id"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

message_id is meant to match the message_id returned from the backend's parse_recipient_status, so that users can match up webhook events with the original sent message. So I think we want MailPace's "internal email id" here: message_id=payload["id"].

timestamp=parse_datetime(payload["created_at"]),
event_id=payload["id"],
message_id=payload["message_id"],
recipient=payload["to"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recipient must be only the addr_spec portion of the (one) relevant recipient address. It looks like payload["to"] is the entire To header, including display names and possibly multiple recipients.

You should be able to use something like parse_address_list(payload["to"])[0].addr_spec to get the first addr_spec in the list. (That function is from anymail.utils. If that expression raises a ValueError or IndexError, the to field is unparseable or empty, and it's fine to pass recipient=None.)

message_id=payload["message_id"],
recipient=payload["to"],
tags=tags,
reject_reason=reject_reason,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add esp_event=esp_event to the AnymailTrackingEvent constructor. This allows handler functions access to the complete raw event if necessary.

@medmunds
Copy link
Contributor

[Rebased to pick up main branch changes]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants