Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for fuzzy patching #81

Open
goldsam opened this issue Nov 3, 2018 · 15 comments
Open

Allow for fuzzy patching #81

goldsam opened this issue Nov 3, 2018 · 15 comments

Comments

@goldsam
Copy link

goldsam commented Nov 3, 2018

In some applications it is useful to support "fuzzy" patching where it is acceptable for some patch operations to fail. Ideally, such an operation would be able to report all failed patch operations.

@holograph
Copy link
Collaborator

@goldsam, can you please provide an example? From what I can tell, the RFCs stipulate that erroneous operations SHOULD terminate processing (see RFC 6902 section 5), whereas the HTTP JSON Patch (see RFC 5789 section 2.2) recommends emitting 409 Conflict responses for similar situations.

While these indicate that we can support what you refer to, there's the other consideration which is: patch ops are applied in order, and are considered atomic (i.e. a failure in any of them should not modify the target document). Under what situations would "ignoring" (or just logging) errors be appropriate or desirable?

@goldsam
Copy link
Author

goldsam commented Nov 28, 2018

My specific use case is an implementation of Neil Fraser's Differential Synchronization algorithm which aims to synchronize a document among a number of remote clients. In short, the idea is take frequent diffs between an actively edited document and a recent "shadow" copy of a shared/synchronized document. Synchronization is achieved through the exchange of versioned diffs. The original work was focused on plain text, but the idea is easy to extend to JSON documents. This algorithm is reliant on a fuzzy patch algorithm.

An implementation is free to discard documents which do not "validate" after applying the fuzzy patch; that is, it is not the patch algorithm's responsibility to ensure correctness or consistency in this application. The idea is that an "fuzzy" patch would make a "best effort" at patching, where the meaning of "best effort" is up for debate.

Implementing support would required extending the core API. I think this might be achievable while maintaining backwards API compatibility. Would you be open to such changes?

@holograph
Copy link
Collaborator

Picking this up, this would essentially be providing a "fuzzy processor" instead of the default one, which suggests this could easily be done outside of the mainline codebase (which I feel should accurately reflect the RFC). That being said, any refactoring or convenience that can make this easier is definitely on the table...

@holograph
Copy link
Collaborator

@goldsam Still interested? Would appreciate a followup, otherwise I believe we can close this @vishwakarma

@goldsam
Copy link
Author

goldsam commented Feb 15, 2019

A custom processor would be a solution, but this currently can't be done outside of the main codebase because the only static method in JsonPatch that accepts a JsonPatchProcessor is private.

Ideally, it would be nice to implement this by leveraging the current behavior of class InPlaceApplyProcessor. Its seems like I simply need to overriding the void error(Operation forOp, String message) method. This would imply making the InPlaceApplyProcessor class a part of the public API of the library.

What do you think about the following proposed changes (in decreasing order of need/desire)?:

  • Change class JsonPatch by making the method static void process(JsonNode patch, JsonPatchProcessor processor, EnumSet<CompatibilityFlags> flags) public.
  • Change class InPlaceApplyProcessor to be public, or lift core behavior to abstract base class that is public.
  • Make the method void error(Operation forOp, String message) of this class (or lifted abstract base) public.

I would be happy to make a PR if you are open to such changes.

@holograph
Copy link
Collaborator

Eugh. You're absolutely correct, my only concern is that once these interfaces are made public it'll be hard to modify them (as I did in #95), and they'll need some serious documentation.

For my money, the "right" way to go is the first two out of your three wishlist items, and if you want to take a stab at it, I'm all for it!

@goldsam
Copy link
Author

goldsam commented Feb 21, 2019

Without exposing the void error(Operation forOp, String message) method publicly, there is not a great deal of benefit (that I can see) in exposing class InPlaceApplyProcessor as a part of the public API. At least, I can't think of any other use cases for adding this class to the public API.

That being said, I think it makes more sense to provide a simple extension point for customizing error handling beyond the default RFC-compliant behavior. To this end, I am proposing a simple strategy interface in #96. This maintains a smaller API surface area as compared to exposing class InPlaceApplyProcessor and the static void process(JsonNode patch, JsonPatchProcessor processor, EnumSet<CompatibilityFlags> flags) method of class JsonPatch.

I'm going back on what I had previously suggested, but I think this is more reasonable. What are your thoughts?

@holograph
Copy link
Collaborator

It's a possibility, but I'm not sure it's actually the right strategy. To start with, the API is somewhat limited in what it can offer (e.g. it's possible to write a "ignore test-op failures" strategy, but harder to only do so on specific nodes; it needs more context for error handling). Second, I'm hoping #95 gets merged, in which case I expect a lot of merge conflicts. I actually forked that branch locally and started making things public and adding JavaDocs, and it's very similar to your PR :-)

Overall exposing error handling semantics seems to necessitate even more work than previously. I'd actually prefer to just expose the processor API publicly, especially given the much cleaner API with the new JSON Pointer implementation in #95. @vishwakarma @LouizFC you have any thoughts on this?

@LouizFC
Copy link
Collaborator

LouizFC commented Feb 22, 2019

Sorry for the delay @holograph .

The changes presented by @goldsam looks good, but I also think that it is not the right way to handle it.

I am also a bit reluctant to expose the Processor API, because it is the "core" of the library. While it provides users with much more freedom and flexibility, it limit us on the kind of change we can make (for example, if exposed earlier, #95 would not be possible without some internal delegation).

I will think about it a bit more and see if I can provide a more useful answer

@goldsam
Copy link
Author

goldsam commented Feb 22, 2019

Another idea - The set of JSON Patch operations is finite and small. You could introduce an JsonPatchErrorHandler API with a method for each operation where all arguments would be elements of the current public API. This would be flexible while not exposing any existing elements of the current internal implementation. Maybe something like this:

public interface JsonPatchErrorHandler {
  void handleRemoveError(List<String> path, String errorMessage);
  void handleReplaceError(List<String> path, JsonNode value, String errorMessage);
  void handlAddError(List<String> path, JsonNode value, String errorMessage);
  void handleMoveError(List<String> fromPath, List<String> toPath, String errorMessage);
  void handleCopyError(List<String> fromPath, List<String> toPath, String errorMessage);
  void handleTestError(List<String> path, JsonNode value, String errorMessage);
}

@LouizFC
Copy link
Collaborator

LouizFC commented Feb 26, 2019

@goldsam this introduces some boilerplate to users that want to implement it (but again, this would be a "power user" feature), but seems usable.

@holograph what do you think? I think the best way to handle this would be:

@LouizFC
Copy link
Collaborator

LouizFC commented Feb 26, 2019

Another idea to the pile:

  • Maintain the first API (only one method for error handling)
  • Rename Diff.java to Patch, make it public, but maintaining everything else internal / package-private.
  • Bundle the patch information into the Patch object (only when a error occurs), and pass it to the handler, making the signature be something like handlePatchError(Operation, Patch, Message)
  • Make a read-only public API that does not expose the new JsonPointer (just yet), only Strings.

@goldsam
Copy link
Author

goldsam commented Feb 26, 2019

I've got one more wrench to throw into the mix:

When building an application that modifies a document explicitly (such as mine does), such modifications could be expressed as patch operations. This use case would reduce the complexity of building a patch to constant time without the need for searching or comparing (keep in mind my use case is a real-time collaborative editor). However, it would require API support for directly expressing those modifications. This would be an argument for exposing JsonPatchProcessor as part of the public API.

@goldsam
Copy link
Author

goldsam commented Apr 7, 2019

Any new thoughts on this?

@LouizFC
Copy link
Collaborator

LouizFC commented Apr 10, 2019

@goldsam nothing new. I am trying to squeeze some free time to make some refactorings that will make this issue more easy to implement, but until then I think there is not much I can do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants