expression.search yields wrong results #63

Open
NMEMine opened this issue Jun 8, 2021 · 4 comments

@NMEMine

NMEMine commented Jun 8, 2021

Hi everyone,

the company I'm working at uses the io.burt.jmespath-gson library in a Scala/Spark application to run JMESPath queries on large JSON-structured strings. However, we recently found inconsistencies between the results of the lib and what the interactive web interface at https://jmespath.org/ yields (or what, for example, the Python implementation yields).

Let me outline it with a small example:
The following snippet is a small Scala unit test that captures the bug:

import org.scalatest.{ FlatSpec, Matchers }
import io.burt.jmespath.gson.GsonRuntime

class JmesPathBug extends FlatSpec with Matchers {

  it should "return expected value" in {
    val runtime = new GsonRuntime()

    val inputJson = """
                        |{
                        |    "a": [
                        |        {
                        |            "b": 21,
                        |            "c": {
                        |                "d": 42
                        |            }
                        |        }
                        |    ]
                        |}""".stripMargin

    val queryString = "a[?b > `20`].c[?d > `41`].d[]"

    val input = runtime.parseString(inputJson)
    val query = runtime.compile(queryString)
    val result = query.search(input)

    result.toString should be("[42]")
  }
}

Running this unit test fails with the following message:

"[[]]" was not equal to "[[42]]"
org.scalatest.exceptions.TestFailedException: 
      at org.scalatest.MatchersHelper$.indicateFailure(MatchersHelper.scala:343)
      at org.scalatest.Matchers$ShouldMethodHelper$.shouldMatcher(Matchers.scala:6723)
      at org.scalatest.Matchers$AnyShouldWrapper.should(Matchers.scala:6759)
      at JmesPathBug$$anonfun$1.apply(JmesPathBug.scala:27)
      at JmesPathBug$$anonfun$1.apply(JmesPathBug.scala:6)
     [...]

The dependencies are:

scala_version=2.11.12
java_version=1.8.0_265
org.scalatest:scalatest_2.11:3.0.8
io.burt:jmespath-core:0.5.0
io.burt:jmespath-gson:0.5.0
com.google.code.gson:gson:2.8.5

However, if you run the same query on the same JSON string in the interactive web interface of https://jmespath.org/, it produces a different (valid?) result. See this screenshot:
[Screenshot from 2021-06-08 11-59-17]

This result can also be observed when running the same query on the same JSON string with the Python implementation.
This small unit test runs successfully:

import json
import jmespath


def test_jmespath_bug():
    content = json.loads("""{
        "a": [
            {
                "b": 21,
                "c": {
                    "d": 42
                }
            }
        ]
    }
    """)

    query = "a[?b > `20`].c[?d > `41`].d[]"

    assert str(jmespath.search(query, content)) == "[42]"

============================= test session starts ==============================
platform linux -- Python 3.8.5, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /home/myuser/projects/jmes_path
collected 1 item                                                               

test_jmespath_bug.py .                                                   [100%]

Dependencies for this Python test are:

jmespath==0.10.0

Expected behavior:
The Java implementation should yield the same result as the web interface and the other libraries.

If you need any further details, please let me know!

Best regards
Michael Bechtel

@iconara
Collaborator

iconara commented Jun 8, 2021

Thank you for reporting this. It's been a long time since I worked on this code and figuring out what's going on here is going to require some thinking.

Simplifying the expression to a[?b].c[?d].d, which is equivalent for the given input, we get this AST:

Sequence(
  Property(a),
  Sequence(
    Selection(Property(b)),
    Projection(
      Sequence(
        Property(c),
        Sequence(
          Selection(Property(d)),
          Projection(Property(d))
        )
      )
    )
  )
)

I don't know if the problem is in the parser or the runtime. [?…] starts a projection, which has the effect that "all subsequent expressions are projected onto the resulting list". When the AST above is executed the c[?d].d expression is executed on each value coming out of a[?b] as one unit, which sounds like it follows the spec – but to get the desired results c needs to be executed on the results, and then [?d], and then d, each on the list of output from the previous step.

Sadly, how this should work is not covered by the compliance tests, but it looks like the JavaScript, Python, and Go implementations handle this differently than this library, so this library is probably wrong.

I can't promise that I will have the time to work on this. It has been a very long time since I used this library myself. If you need a workaround, you can use a[?b].c | [?d].d.
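The difference between the two evaluation orders described above can be sketched in plain Python. This is only a toy model, not the code of this library or of any other JMESPath implementation: the helpers `prop`, `select`, and `project` are hypothetical, and Python truthiness stands in for JMESPath truthiness. It assumes, as the JMESPath specification describes, that a filter applied to a non-array yields null and that projections drop null results.

```python
# Toy model of two evaluation orders; all helper names are hypothetical.

def prop(name):
    """Field access: None for non-objects or missing keys."""
    return lambda value: value.get(name) if isinstance(value, dict) else None


def select(name):
    """Simplified filter [?name]: keep elements where `name` is truthy; None for non-lists."""
    def run(value):
        if not isinstance(value, list):
            return None
        return [e for e in value if prop(name)(e)]
    return run


def project(items, fn):
    """Apply fn to every element and drop None results; None for non-lists."""
    if not isinstance(items, list):
        return None
    results = (fn(e) for e in items)
    return [r for r in results if r is not None]


data = {"a": [{"b": 21, "c": {"d": 42}}]}
selected = select("b")(prop("a")(data))  # a[?b]

# Nested grouping, as in the AST: c[?d].d runs as one unit on each element,
# so [?d] sees the object {"d": 42} instead of a list.
nested = project(
    selected,
    lambda e: project(select("d")(prop("c")(e)), prop("d")),
)

# Step-wise grouping: each step consumes the whole list from the previous step.
cs = project(selected, prop("c"))               # [{"d": 42}]
stepwise = project(select("d")(cs), prop("d"))

print(nested)    # []
print(stepwise)  # [42]
```

In this model the pipe in a[?b].c | [?d].d corresponds to the step-wise variant: the projection stops at the pipe, so [?d] receives the whole list.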

@NMEMine
Author

NMEMine commented Jun 10, 2021

Hi Theo,

thanks for the quick reply and for looking into it! It's a pity that this corner case is not covered by the compliance tests. We've found a workaround for the concrete issue we're seeing in our context. However, I still wanted to report it so that it is documented in case others are affected by it as well.
Anyway, I'd be happy if you could work on it, but I also completely understand that this is out of scope for you right now!

Best regards
Michael

@FilippoVigani

FilippoVigani commented Feb 17, 2023

Thank you for reporting this, I am also facing this issue. Initially I thought that I was doing something wrong with my custom implementation of a runtime; however, after testing it with the Gson runtime, the issue persists.

I also resorted to using a | between the two projections, although it would be nice if the behavior was consistent between implementations.

@djaneluz

Hello,

I'm facing a similar issue using jmespath-jackson 0.6.0 in Java.

JSON example used:

{
  "foo" : {
    "bar": {
      "count":1
    },
    "boo": {
      "count": 2
    }
  }
}

I tested it against the expression: foo.*.count.length(@)

I expected the result: 2, but got a FunctionCallException instead.

When I test it on the JMESPath page, I get the expected result.

The expression foo.*.count returns an array [1,2], and the function length should accept this array and return its length, but for some reason the Java lib operates on the first element of the array and ends up calling length(1), and a number is not accepted as input for this function.

I used the recommended workaround foo.*.count | length(@) and it worked, but it would be nice if the Java lib behaved the same way as the JS implementation.
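To illustrate the two behaviors described here, the following plain-Python sketch contrasts a function call that is projected onto each element with one that receives the whole array after a pipe. It is a toy model only: the `length` helper is a hypothetical stand-in for the JMESPath built-in (assumed to accept strings, arrays, and objects, but not numbers), and the list comprehension merely imitates foo.*.count.

```python
def length(value):
    """Toy stand-in for JMESPath length(): strings, arrays, and objects only."""
    if not isinstance(value, (str, list, dict)):
        raise TypeError(f"length() does not accept {type(value).__name__}")
    return len(value)


data = {"foo": {"bar": {"count": 1}, "boo": {"count": 2}}}

# Imitates foo.*.count: collect "count" from every value under "foo".
counts = [v["count"] for v in data["foo"].values()]

# Projected call: length(@) runs once per element, so the first call is length(1).
try:
    per_element = [length(c) for c in counts]
except TypeError as err:
    per_element = f"error: {err}"

# Piped call: the projection has ended, so length(@) receives the whole list.
piped = length(counts)

print(per_element)  # error: length() does not accept int
print(piped)        # 2
```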

I also understand that the compliance tests need to be updated to cover this situation, so that we can make sure that third-party libs marked as "Fully Compliant" are in fact compliant.

Thanks
