Improve XRegExp.matchRecursive handling of unbalanced delimiters #96
Comments
Yes, I'd be totally willing to include an option that allows
Unfortunately, I don't expect to get to this for XRegExp 3.0 final, unless it's provided as a pull request. The challenge is supporting all of
Features that would need to continue working correctly after such a change is introduced include:
If this is something you're able to help with, that would be awesome and much appreciated. |
Hi, I might look at this and create a PR, but I won't be able to meet all of your requirements. I don't want to make any major changes to your code, so my PR will mostly just include the code I posted before. |
Here's the PR: #99 |
The handling I described is the handling of all regex engines that support recursive matching (Perl, PCRE, and .NET). What happens in the case of something like
Aside: It might also make sense to add another option (named something like
It seems I might have misunderstood your original message, though. Rather than supporting all aspects of recursive matching within unbalanced strings, it seems you're looking to make a narrower change that is fairly specific to your situation (ending a global match early and not throwing on unbalanced delimiters if you've already found at least one balanced match). If that's the case, it might be better to just make the change in your local copy. I'm not sure how common the case you're describing is, and partial handling might cause more confusion and problems than it solves.
More feedback is welcome, and I'll leave this issue open for the general feature of recursive matching within unbalanced strings. |
@slevithan I want to make it work, but I'm not sure we can when the imbalance starts before any match is found. But don't worry: if we find a way to handle that, I will change my PR to support it :D Best |
+1 |
Is there anyone tackling this right now? If not, I'm more than willing to put up a PR for this, as I would like to just ignore unbalanced delimiters for a project I'm working on. I'm cool with it being the default behavior or opt-in. |
I'm not aware of anyone working on it, go for it! Happy to try to help out along the way |
Yeah, that would be great!
It should start out as opt-in since changing the default behavior would require a major version change. |
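For readers skimming the thread, here is a rough sketch of what "opt-in" could mean in practice. It is a standalone toy, not XRegExp code: the function name `matchBalanced` and the option `onUnbalanced` are made up for illustration, it assumes single-character delimiters, and it does no escape or string-literal handling. The default keeps today's behavior (throw), and callers have to ask for the lenient mode explicitly.

```javascript
// Hypothetical helper for illustration only; not XRegExp's API.
// onUnbalanced: 'error' (default, throws) or 'skip' (ignore stray delimiters).
function matchBalanced(str, left, right, { onUnbalanced = 'error' } = {}) {
  const outputs = [];
  let i = 0;
  while (i < str.length) {
    const ch = str[i];
    if (ch === right) {
      // A closer with no opener before it.
      if (onUnbalanced === 'error') {
        throw new Error(`Unbalanced delimiter at index ${i}`);
      }
      i++;
      continue;
    }
    if (ch !== left) {
      i++;
      continue;
    }
    // Look ahead for the closer that balances the opener at index i.
    let depth = 0;
    let end = -1;
    for (let j = i; j < str.length; j++) {
      if (str[j] === left) {
        depth++;
      } else if (str[j] === right && --depth === 0) {
        end = j;
        break;
      }
    }
    if (end === -1) {
      if (onUnbalanced === 'error') {
        throw new Error(`Unbalanced delimiter at index ${i}`);
      }
      i++; // skip this unmatched opener and keep scanning after it
      continue;
    }
    outputs.push(str.slice(i, end + 1));
    i = end + 1;
  }
  return outputs;
}

// Opt-in: existing callers are unaffected unless they pass the option.
console.log(matchBalanced('a{b{c}d', '{', '}', { onUnbalanced: 'skip' })); // [ '{c}' ]
```

Because the default stays `'error'`, adding the option this way would not change results for anyone who doesn't use it.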
Thanks to @dvargas92495 for the sweet/clean implementation in #332. It came out even better than expected, and This will go out in the next minor release. |
Aside: There's lots of potential for additional future handling modes for unbalanced delimiters if there are strong use cases. Things like:
Or whatever else is actually useful for reasonably broad use cases. |
@dvargas92495 and others: This is now published on npm as v5.1.0. |
Hi,
First, thanks for your great library.
I am using matchRecursive to find JSON messages on a stream.
Every time I receive a message, I push it onto a string stack.
Then I use matchRecursive to try to find messages. If I find some, I slice those messages from the stack.
Now sometimes some messages get lost and I end up with an "unbalanced" stack, like this:
{"toto":"test","bd":{"toto":"test","bd":{"toto":"test","bd":{"toto":"test","bd":1,"asdfg":true}{"toto":"test","bd":{"toto":"test","bd":1,"asdfg":true}{"toto":"test","bd":1,"asdfg":{"toto":2}}
If I use your library like so, I can't get the messages out of this because it's unbalanced.
And this is correct: there are no outputs if you look at this message.
Though there is a simple way to find the messages: reverse the string and swap the left and right delimiters.
It works, and matchRecursive actually finds the message. However, it still reports unbalanced delimiters and throws at the end.
I easily fixed this locally by only throwing if outputs.length === 0, like this:
That way I can still get the matched strings even if the string is unbalanced at the end.
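To make the two ideas concrete without depending on the library internals, here is a small standalone sketch. It is not the actual patch and not XRegExp code: `matchBalancedLenient`, `reverse`, and `matchFromEnd` are hypothetical helpers, and it assumes single-character delimiters and that no braces appear inside JSON string values. It scans for top-level balanced spans, stops at the first unbalanced closer, and only throws when nothing was matched. Running it on the reversed string with the delimiters swapped recovers the complete messages at the end of the stack.

```javascript
// Hypothetical helper, not part of XRegExp: collect top-level balanced
// spans, scanning left to right, and stop at the first unbalanced closer.
function matchBalancedLenient(str, left, right) {
  const outputs = [];
  let depth = 0;
  let start = -1;
  for (let i = 0; i < str.length; i++) {
    const ch = str[i];
    if (ch === left) {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === right) {
      if (depth === 0) {
        depth = -1; // closer with no opener: mark as unbalanced and stop
        break;
      }
      depth--;
      if (depth === 0) outputs.push(str.slice(start, i + 1));
    }
  }
  // Imbalance is only fatal when nothing at all was matched.
  if (depth !== 0 && outputs.length === 0) {
    throw new Error('Unbalanced delimiter found in string');
  }
  return outputs;
}

function reverse(s) {
  return [...s].reverse().join('');
}

// Scan from the end: reverse the input, swap the delimiters,
// then reverse each match back into reading order.
function matchFromEnd(str, left, right) {
  return matchBalancedLenient(reverse(str), right, left).map(reverse);
}

const sample = '{"a":{"b":1}{"c":2}';
console.log(matchFromEnd(sample, '{', '}')); // [ '{"c":2}', '{"b":1}' ]
```

Matches come back last-to-first because the scan starts at the end of the string. A forward scan of the same `sample` still throws, since the leading `{` is never closed and nothing is matched before the end of the input.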
This is very useful.
I was wondering if you would be willing to commit that in the master branch?
I think it would be a great addition (could become an option).