Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XIP-5: Support different payload content types #68

Merged
merged 1 commit into from
Apr 6, 2022

Conversation

mkobetic
Copy link
Contributor

Implements #54

@cloudflare-pages
Copy link

cloudflare-pages bot commented Feb 15, 2022

Deploying with  Cloudflare Pages  Cloudflare Pages

Latest commit: 774c341
Status: ✅  Deploy successful!
Preview URL: https://3313c977.xmtp-js.pages.dev

View logs

src/MessageContent.ts Outdated Show resolved Hide resolved
src/proto/messaging.proto Outdated Show resolved Hide resolved
@@ -21,52 +25,86 @@ message PublicKey {
}
}

message PrivateKey {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is moved further down as this is not part of the protocol

bytes payload = 3;
}
oneof union {
AES256GCM_HKDFSHA256 aes256GcmHkdfSha256 = 1;
}
}

message PublicKeyBundle {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is moved up, next to the PublicKey definition to tidy up the file.

test/Client.test.ts Outdated Show resolved Hide resolved
@mkobetic
Copy link
Contributor Author

mkobetic commented Mar 4, 2022

Writing the tests I grew to dislike the MessageContent type that allows bundling the content with the content type id. We need it on the sending side so that we can look up the right content encoder, but we don't need it on the receiving side as we're returning the Message anyway, so that the recipient can see the envelope as well. So on the receiving side you end up having to do something like message.content.content to get to the actual value. This is somewhat compounded by the fact that I'm trying to support both plain string and handle the type automatically and other content where it has to be specified explicitly. But I don't want to rip out the simplified string handling and it wouldn't solve this completely anyway.

So I'm inclined to dump the MessageContent in favour of introducing another optional contentType argument to the send API and adding contentType? field to Message. Then the receiving side can use msg.contentType and msg.content naturally without fuss. I think I like this better. Is this sufficiently convincing @neekolas?

The changes this change would require can be seen in this commit 8d4466b

PS: I just realized that I will also need to add an optional fallback argument to the send API as well. Two optional arguments might be pushing it, but it's still somewhat sensible, if you specify contentType, then you can optionally specify fallback as well. Not terrible I think. Alternatively we could go for optional ContentOptions that could include contentType, fallback and maybe even contentTypeParams (although I thought those would be only set by the encoder, not sure).

@mkobetic
Copy link
Contributor Author

mkobetic commented Mar 7, 2022

I've added support for fallback and expanded the test to cover those corners. Fished out few other bugs in the process. Sticking with the optional arguments so far, unless we decide otherwise.

@mkobetic mkobetic force-pushed the payload-content-types branch 2 times, most recently from 3f3675f to 128a397 Compare March 18, 2022 22:09
@mkobetic
Copy link
Contributor Author

Prototype of compression support is in 128a397. It is using the draft compression standard. Support for the draft is spotty, and Streams on Node are also different so I used this polyfill 🤷‍♂️. Examples of streaming bytes in memory seem scarce so I used a fairly naive approach to growing the stream buffer when it cannot accommodate the chunk being written to simply doubling it in size.

The protocol change is small, just adding a compression attribute to EncodedContent, with the value being from enum { deflate, gzip }. I'm inclined to drive the supported algorithms by what is available in JS, since that's our primary client environment. This finally pushed me over to stop adding parameters to the sending APIs and repackage them as SendOptions.

Compression opens up an attack vector, where a message that is relatively small in transit can be expanded to huge on the recipient side, not sure if we want to do something about that (the growth strategy could simply throw if we reach some pre-set limit)

test/Client.test.ts Show resolved Hide resolved
test/Client.test.ts Show resolved Hide resolved
src/Client.ts Outdated Show resolved Hide resolved
src/Client.ts Outdated Show resolved Hide resolved
src/Client.ts Outdated Show resolved Hide resolved
src/Client.ts Outdated Show resolved Hide resolved
// content allows attaching decoded content to the Message
// the message receiving APIs need to return a Message to provide access to the header fields like sender/recipient
contentType?: ContentTypeId
content?: any // eslint-disable-line @typescript-eslint/no-explicit-any
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that's really unfortunate is that this leads to so many any types.

It would require a black-belt in Typescript (which I do not possess), but it should be possible to make this into a discriminated union type using the ContentType field. I can figure out how to do that with a fixed set of types, but I have a lot harder of a time imagining how that could be extended by developer's adding additional types. Even if we could have a discriminated union for our own types, and then fall back to any for unknown types, that would be a win for developers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really want to get into this at this point. Should I make an issue to follow up on later?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I think that's going to be hard to resolve.

@mkobetic
Copy link
Contributor Author

Compression opens up an attack vector, where a message that is relatively small in transit can be expanded to huge on the recipient side, not sure if we want to do something about that (the growth strategy could simply throw if we reach some pre-set limit)

I've decided this is too much of a risk and added configurable maxContentSize limit (currently only enforced when decompressing), set to 100M by default

@neekolas
Copy link
Contributor

I've decided this is too much of a risk and added configurable maxContentSize limit (currently only enforced when decompressing), set to 100M by default

That seems sensible. I wonder how many valid use-cases there are for messages > 10mb even.

I believe LibP2P limits messages to ~1MB at the PubSub level, so it would have to be extremely compressible content for a valid message to get anywhere close to those numbers.

@mkobetic
Copy link
Contributor Author

mkobetic commented Mar 29, 2022

I figured that given that photos can be easily close to 10M today, that might be a bit low, but I'm just pulling a number out of thin air. Happy to be convinced otherwise, I don't feel strongly about this at all.

@mkobetic
Copy link
Contributor Author

Hrm, test:jsdom is failing in CI with ReferenceError: ReadableStream is not defined, I guess it will need a polyfill too?

@mkobetic mkobetic marked this pull request as ready for review March 31, 2022 19:05
@mkobetic
Copy link
Contributor Author

Almost there :/ jsdom/jsdom#3200

src/MessageContent.ts Outdated Show resolved Hide resolved
src/Client.ts Outdated Show resolved Hide resolved
@mkobetic
Copy link
Contributor Author

mkobetic commented Apr 1, 2022

One more thing that bothers me here is that the xmtp.org/text supports only utf-8. It seems JS has very limited support for other encodings. But that's not the same across all relevant tech stacks probably. Should we let JS dictate here or should we add an encoding parameter to open up usage of other encodings potentially?

@mkobetic
Copy link
Contributor Author

mkobetic commented Apr 4, 2022

I think I've addressed all the feedback, the main changes are

  • the major version that the codec type declares is the maximum supported version and the client will throw if higher major version is encountered
  • added "encoding" parameter to ContentTypeText, defaulting to UTF-8.

@mkobetic mkobetic requested a review from jazzz April 4, 2022 21:01
src/Client.ts Outdated
@@ -84,6 +125,12 @@ export default class Client {
const keyStore = createNetworkPrivateKeyStore(wallet, waku)
const keys = await loadOrCreateKeys(wallet, keyStore)
const client = new Client(waku, keys)
opts?.codecs?.forEach((codec) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're going to have to update this to be compatible with @jazzz's just-merged changes.

Copy link
Contributor

@jazzz jazzz Apr 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I just noticed, that #93 has impact on this, and I probably should have waited.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's OK I'll rebase and fix this up.

Copy link
Contributor Author

@mkobetic mkobetic Apr 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased and cleaned up.

src/Client.ts Outdated
opts?.codecs?.forEach((codec) => {
client.registerCodec(codec)
})
if (opts?.maxContentSize) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Pebble] all parameters should be accessed through the options configuration instead of the raw passed in opts.

src/Client.ts Outdated

// Set the maximum content size in bytes that is allowed by the Client.
// Currently only checked when decompressing compressed content.
maxContentSize?: number
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Sand] Likely want to avoid optional parameters here in the future. We can increase clarity by making it a required value and providing a sensible default parameter in defaultOptions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to do this, but TS is forcing me to set the default values in the Client constructor (given that they are declared fields on Client). Do we want to set the defaults twice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I could do this for codecs, but I'll need to merge the two lists in defaultOptions()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I pushed a fix, I think this does look better, but I wasn't able to get rid of setting the _maxContentSize in both the constructor and defaultOptions. Is this good @jazzz ?

Copy link
Contributor

@jazzz jazzz Apr 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 . It's unfortunate about having to set it twice, but imagine we'll fix that with an update to the constructor at somepoint. As we add more properties to Client we'll likely want to set the values in the constructor rather than afterwards.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keyStoreType: KeyStoreType.networkTopicStoreV1,
env: 'testnet',
waitForPeersTimeoutMs: 10000,
codecs: [new TextCodec()],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having textCodec here in DefaultOptions makes sense if we want it to be possible for someone to disable it.

If it is required to be present, then I think it was fine baked into the object, as its not really an option/tunable parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I wrote the merging logic below, they can override it, but they cannot remove it. I think that's what we want, if they have better codec, they are free to use it, but they cannot disable support for xmtp.org/text altogether.

@mkobetic mkobetic changed the title Support different payload content types XIP-5: Support different payload content types Apr 6, 2022
BREAKING CHANGE: The protocol format of messages is changing, old clients won't be able to decode new messages correctly and new clients won't be able to decode old messages correctly. The API changes are backward compatible.
@mkobetic mkobetic merged commit d55d083 into main Apr 6, 2022
@mkobetic mkobetic deleted the payload-content-types branch April 6, 2022 16:39
@github-actions
Copy link
Contributor

github-actions bot commented Apr 6, 2022

🎉 This PR is included in version 3.0.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants