Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase robustness of geojson parsing by allowing extra properties and null values #681

Merged
merged 3 commits into from
May 10, 2022

Conversation

lognaturel
Copy link
Member

@lognaturel lognaturel commented May 4, 2022

Closes issues described on forum starting here. The existing code naively assumed that the geojson top level object would have just a type followed by features in that order and with no other properties.

What has been done to verify that this works as intended?

Added tests based on forum reports. Note that the test case provided includes another nested property with name type which was a really good inclusion. One thing to think about in review is whether there are other tricky structures like that which we should test for.

Why is this the best possible solution? Were any other approaches considered?

I briefly considered using an ObjectMapper instead of the parser. But I do think it's better to be able to stream the parsing. I think this is much more robust but it's possible there are still cases I haven't thought of.

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

This does change the geojson parsing strategy quite a bit so there's risk. Our test coverage is good, though, so I don't think it's high.

Do we need any specific form for testing your changes? If so, please attach one.

See tests.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

No.

@lognaturel lognaturel requested a review from seadowg May 4, 2022 22:20
Copy link
Member

@seadowg seadowg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As well as the inline comments:

  • I do feel like the parsing feels a little "too close to the metal" here. Why not use a proper object mapper here?
  • The .geojson example files would be a lot easier to read if they were formatted. You can add .geojson as a JSON file type in IntelliJ using File type associations and then auto format the files.

@lognaturel
Copy link
Member Author

Why not use a proper object mapper here?

Streaming means we don't have to have two in-memory representations of the file. These files can get big so it seems worth it. It also better matches the CSV implementation. More on that in the original PR (#655). I'd rather keep this unless we have a really good reason not to.

The .geojson example files would be a lot easier to read if they were formatted.

I like the semantic formatting of those files. But you're right that it's annoying to have files that aren't auto formatted, fixed, thanks.

@lognaturel lognaturel requested a review from seadowg May 10, 2022 15:22
@lognaturel lognaturel merged commit b942cf8 into getodk:master May 10, 2022
@lognaturel lognaturel deleted the robust-geojson branch May 10, 2022 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants