expression.search yields wrong results #63

NMEMine · 2021-06-08T10:08:52Z

Hi everyone,

the company I'm working at uses the io.burt.jmespath-gson library in a Scala/Spark application to run JMESPath queries on large JSON-structured strings. However, we recently found some inconsistencies between the results of the library and what the interactive web interface at https://jmespath.org/ yields (or what, for example, the Python implementation yields).

Let me outline it with a small example:
The following snippet is a small Scala unit test which captures the bug:

import org.scalatest.{ FlatSpec, Matchers }
import io.burt.jmespath.gson.GsonRuntime

class JmesPathBug extends FlatSpec with Matchers {

  it should "return expected value" in {
    val runtime = new GsonRuntime()

    val inputJson = """
                        |{
                        |    "a": [
                        |        {
                        |            "b": 21,
                        |            "c": {
                        |                "d": 42
                        |            }
                        |        }
                        |    ]
                        |}""".stripMargin

    val queryString = "a[?b > `20`].c[?d > `41`].d[]"

    val input = runtime.parseString(inputJson)
    val query = runtime.compile(queryString)
    val result = query.search(input)

    result.toString should be("[42]")
  }
}

Running this unit test fails with the following message:

"[[]]" was not equal to "[[42]]"
org.scalatest.exceptions.TestFailedException: 
      at org.scalatest.MatchersHelper$.indicateFailure(MatchersHelper.scala:343)
      at org.scalatest.Matchers$ShouldMethodHelper$.shouldMatcher(Matchers.scala:6723)
      at org.scalatest.Matchers$AnyShouldWrapper.should(Matchers.scala:6759)
      at JmesPathBug$$anonfun$1.apply(JmesPathBug.scala:27)
      at JmesPathBug$$anonfun$1.apply(JmesPathBug.scala:6)
     [...]

The dependencies are:

scala_version=2.11.12
java_version=1.8.0_265
org.scalatest:scalatest_2.11:3.0.8
io.burt:jmespath-core:0.5.0
io.burt:jmespath-gson:0.5.0
com.google.code.gson:gson:2.8.5

However, if you run the same query on the same JSON string in the interactive web interface at https://jmespath.org/, it produces a different (valid?) result.

[Screenshot: result shown in the jmespath.org web interface]

This result can also be observed when running the same query on the same JSON string with the Python implementation.
This small unit test runs successfully:

import json
import jmespath


def test_jmespath_bug():
    content = json.loads("""{
        "a": [
            {
                "b": 21,
                "c": {
                    "d": 42
                }
            }
        ]
    }
    """)

    query = "a[?b > `20`].c[?d > `41`].d[]"

    assert str(jmespath.search(query, content)) == "[42]"

============================= test session starts ==============================
platform linux -- Python 3.8.5, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /home/myuser/projects/jmes_path
collected 1 item                                                               

test_jmespath_bug.py .                                                   [100%]

Dependencies for this Python test are:

jmespath==0.10.0

Expected behavior:
The Java implementation yields the same result as the one observed in the web interface and as that of other libraries.

If you need any further details, please let me know!

Best regards
Michael Bechtel

iconara · 2021-06-08T16:32:43Z

Thank you for reporting this. It's been a long time since I worked on this code and figuring out what's going on here is going to require some thinking.

Simplifying the expression to a[?b].c[?d].d, which is equivalent for the given input, we get this AST:

Sequence(
  Property(a),
  Sequence(
    Selection(Property(b)),
    Projection(
      Sequence(
        Property(c),
        Sequence(
          Selection(Property(d)),
          Projection(Property(d))
        )
      )
    )
  )
)

I don't know if the problem is in the parser or the runtime. [?…] starts a projection, which has the effect that "all subsequent expressions are projected onto the resulting list". When the AST above is executed the c[?d].d expression is executed on each value coming out of a[?b] as one unit, which sounds like it follows the spec – but to get the desired results c needs to be executed on the results, and then [?d], and then d, each on the list of output from the previous step.
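
To make the difference concrete, here is a minimal Python sketch of the two evaluation orders described above. It is a toy model, not the code of this library or of any other implementation: the helpers `prop`, `select`, `project` and `seq` are hypothetical names, the filter is simplified to "the property is present and non-null", and it assumes that filtering a non-list yields null.

```python
# Toy model of the two evaluation orders; not the library's actual code.

def prop(name):
    """Property step: look up `name` on an object, None for anything else."""
    return lambda value: value.get(name) if isinstance(value, dict) else None

def select(name):
    """Simplified filter step [?name]: keep list elements with a non-null `name`."""
    def step(value):
        if not isinstance(value, list):
            return None  # assumption: filtering a non-list yields null
        return [v for v in value if isinstance(v, dict) and v.get(name) is not None]
    return step

def project(step):
    """Apply `step` to every element of a list, dropping null results."""
    def run(value):
        if not isinstance(value, list):
            return None
        return [r for r in map(step, value) if r is not None]
    return run

def seq(*steps):
    """Run the steps left to right, feeding each result into the next step."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run

doc = {"a": [{"b": 21, "c": {"d": 42}}]}

# Shape of the AST quoted above: c[?d].d runs as one unit per element of a[?b].
nested = seq(prop("a"), select("b"),
             project(seq(prop("c"), select("d"), project(prop("d")))))

# Step-by-step reading: each step consumes the whole list from the previous one.
stepwise = seq(prop("a"), select("b"), project(prop("c")),
               select("d"), project(prop("d")))

print(nested(doc))    # []
print(stepwise(doc))  # [42]
```

In this model the nested form comes back empty because the inner filter sees the object {"d": 42} rather than a list. The exact output of the real library may differ in detail.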

Sadly, how this should work is not covered by the compliance tests, but it looks like the JavaScript, Python, and Go implementations handle this differently than this library, so this library is probably wrong.

I can't promise that I will have the time to work on this. It has been a very long time since I used this library myself. If you need a workaround you can use a[?b].c | [?d].d.

NMEMine · 2021-06-10T09:48:22Z

Hi Theo,

thanks for the quick reply and for looking into it! It's a pity that this corner case is not covered by the compliance tests. We've found a workaround for the concrete issue we're seeing in our context. However, I still wanted to report it so that it is documented in case others are affected by it too.
Anyway, I'd be happy if you could work on it, but I also completely understand that this is out of scope for you right now!

Best regards
Michael

FilippoVigani · 2023-02-17T12:19:11Z

Thank you for reporting this; I am also facing this issue. Initially I thought that I was doing something wrong with my custom implementation of a runtime. However, after testing it with the Gson runtime, the issue persists.

I also resorted to using a | between the two projections, although it would be nice if the behavior was consistent between implementations.

djaneluz · 2024-03-26T19:45:23Z

Hello,

I'm facing a similar issue using jmespath-jackson 0.6.0 in Java.

JSON example used:

{
  "foo": {
    "bar": {
      "count": 1
    },
    "boo": {
      "count": 2
    }
  }
}

I tested it against the expression: foo.*.count.length(@)

I expected the result: 2, but got a FunctionCallException instead.

When I test it on the JMESPath page, I get the expected result.

The expression foo.*.count returns an array [1,2], and the function length should accept this array and return its length. For some reason, though, the Java lib operates on the first element of the array and ends up calling length(1), and a number is not accepted as input for this function.

I used the recommended workaround foo.*.count | length(@) and it worked, but it would be nice if the Java lib behaved the same way as the JS one.
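
As a rough analogy in plain Python (not how either library is implemented), the difference between handing the whole array to length and applying it to a single element looks like this. Python's built-in `len` stands in for JMESPath's length function, and the variable names are only illustrative.

```python
# Rough analogy in plain Python; len() stands in for JMESPath's length().
doc = {"foo": {"bar": {"count": 1}, "boo": {"count": 2}}}

# foo.*.count: collect `count` from every value under foo.
counts = [v["count"] for v in doc["foo"].values()]

# foo.*.count | length(@): the pipe hands the whole list to length().
piped = len(counts)

# Applying length() to a single element, as described above: numbers are rejected.
try:
    per_element = len(counts[0])
except TypeError as err:
    per_element = err

print(counts)                      # [1, 2]
print(piped)                       # 2
print(type(per_element).__name__)  # TypeError
```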

I understand also that the Compliance Tests need to be updated to cover this situation, so we make sure that third party libs marked as "Fully Compliant" are in fact compliant.

Thanks
