Add full support for JSONPath #2070
Comments
@lemire We are interested in this functionality for Velox. Curious if you have a timeline in mind. |
@mbasmanova Work on this feature will start 'soon' (1 week or 2 weeks). JSON Path is quite rich, and it is (if nothing else) challenging to test support. However, we should have partial support in the coming weeks, at the prototype level. |
This is great. Keep us posted. |
@mbasmanova Sorry for the delay. We fully support JSON Pointer with high performance. Supporting JSON Path with high performance is... challenging. A subset of the language could be supported, but this subset has a significant overlap with JSON Pointers... |
Basically, there are engineering issues involved to do it efficiently. If you don't care about performance, then it is easy, of course, but providing slow code is not in the spirit of this project. So... it is a challenge... I do recommend people consider JSON Pointer. |
@lemire Daniel, thank you for the update. I'm wondering if you could share some more details. In particular, I'm curious what the challenges are in supporting JSON Path efficiently and which subset can be supported. I haven't looked at JSON Pointers yet, but do you happen to know whether it is possible to automatically rewrite a subset of JSON Path queries into JSON Pointer queries? |
Basically JSON Pointer provides forward queries... Given
{ "c": { "foo": { "a": [ 10, 20, 30 ] } }, "d": { "foo2": { "a": [ 10, 20, 30 ] } }, "e": 120 }
You have the following JSON Pointer queries...
The equivalent in JSON Path might be... (up to potential semantics differences)
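To make the "forward query" idea concrete: under RFC 6901, a JSON Pointer is a sequence of reference tokens separated by `/`, applied left to right from the root, with `~0` standing for `~` and `~1` standing for `/`. The sketch below only splits a pointer into its tokens. The helper name `split_json_pointer` is hypothetical (it is not part of simdjson), and the pointer `/c/foo/a/1` used in the note afterwards is an illustrative choice, not necessarily one of the queries from the original comment.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not simdjson API): split an RFC 6901 JSON Pointer
// into its unescaped reference tokens. Returns std::nullopt when the
// pointer is malformed (no leading '/', or a '~' not followed by 0 or 1).
std::optional<std::vector<std::string>> split_json_pointer(std::string_view pointer) {
  std::vector<std::string> tokens;
  if (pointer.empty()) return tokens;           // "" designates the whole document
  if (pointer[0] != '/') return std::nullopt;   // every token starts with '/'
  std::string current;
  for (std::size_t i = 1; i < pointer.size(); ++i) {
    const char c = pointer[i];
    if (c == '/') {                             // token boundary
      tokens.push_back(current);
      current.clear();
    } else if (c == '~') {                      // escape sequence
      if (i + 1 >= pointer.size()) return std::nullopt;
      const char next = pointer[++i];
      if (next == '0') current += '~';
      else if (next == '1') current += '/';
      else return std::nullopt;
    } else {
      current += c;
    }
  }
  tokens.push_back(current);                    // last token
  return tokens;
}
```

Applied to `/c/foo/a/1`, this yields the tokens `c`, `foo`, `a`, `1`; walking them in order through the document above selects the value 20. Each token only ever moves deeper into the document, which is what makes this kind of query cheap to evaluate in a single forward pass.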
JSON Pointer is a well-established standard (see https://www.rfc-editor.org/rfc/rfc6901). I should stress that JSON Pointer queries are very much still used in production and the standard is very much alive. We also support an extension whereby you can apply a JSON Pointer from the current node, as in...
auto cars_json = R"( [
{ "make": "Toyota", "model": "Camry", "year": 2018, "tire_pressure": [ 40.1, 39.9, 37.7, 40.4 ] },
{ "make": "Kia", "model": "Soul", "year": 2012, "tire_pressure": [ 30.1, 31.0, 28.6, 28.7 ] },
{ "make": "Toyota", "model": "Tercel", "year": 1999, "tire_pressure": [ 29.8, 30.0, 30.2, 30.5 ] }
] )"_padded;
ondemand::parser parser;
ondemand::document cars = parser.iterate(cars_json);
std::vector<double> measured;
for (auto car_element : cars) {
double x = (double) car_element.at_pointer("/tire_pressure/1");
measured.push_back(x);
}
// measured == {39.9, 31.0, 30.0}
We support JSON Pointer highly efficiently. There is no heap memory allocation and no need for additional dependencies. As far as I can tell, JSON Path implementations are currently not guaranteed to be efficient. The current state of the art in implementing JSON Path efficiently is JSONSki, but it provides only a partial implementation: it has no support for descendant selectors, and its wildcard selector implements only a part of the JSONPath specification, stepping into every entry of an array, but not into every field of an object.
The type of JSON Path queries that would be challenging to implement efficiently are queries such as
It is doable given enough engineering effort, and I am not closing this issue. In fact, I am marking it as 'help needed' and 'good first issue'. A couple of talented engineers could implement JSON Path on top of, say, the On Demand API. But it would take more than a few days. I would be interested in working on this, and I might still work on this, but it is not trivial. |
@lemire do you still think this is a good-first-issue? Issue looks challenging and interesting. |
@FranciscoThiesen It can be quite challenging, and maybe difficult as a starting point. However, you are welcome to give it a try, it might prove to be easier than I anticipate. Furthermore, it is not necessary to implement the full specification. |
@lemire Daniel, thank you for the detailed explanation. I think I'm getting it. It sounds like we could support a subset of JSONPath that can be rewritten into JSON Pointer. |
@mbasmanova Yes, such support could be done relatively quickly. |
Maybe @FranciscoThiesen could be interested !!! |
I'll give it a shot! @lemire can you assign it to me? |
@FranciscoThiesen Done. |
I took some time this weekend to familiarize myself with the codebase, the PRs that introduced JSON Pointer support over the past years, and some JSON Path resources such as https://goessner.net/articles/JsonPath/. @lemire @mbasmanova do you believe the strategy of (efficiently) converting JSON Path -> JSON Pointer and then just leveraging the current at_pointer functionality makes sense and adds value (at least as a starting point)? The JSON Path -> JSON Pointer conversion appears to be much simpler than implementing an at_path() method from scratch. |
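The conversion strategy proposed in the comment above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the code that was merged into simdjson: the function name `json_path_to_pointer` is hypothetical, the accepted subset (an optional leading `$`, `.key`, `[index]`, `['key']` or `["key"]`) is my own choice, and escaped quotes inside bracketed keys are not handled. Anything outside that subset (descendant `..`, wildcards, slices, filters) is rejected rather than guessed at.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Append one key as an RFC 6901 reference token: '~' -> "~0", '/' -> "~1".
static void append_escaped(std::string &out, std::string_view token) {
  for (char c : token) {
    if (c == '~') out += "~0";
    else if (c == '/') out += "~1";
    else out += c;
  }
}

// Hypothetical helper (not simdjson API): rewrite a restricted JSONPath
// expression as a JSON Pointer string, or return std::nullopt when the
// expression uses features outside the supported subset.
std::optional<std::string> json_path_to_pointer(std::string_view path) {
  std::size_t i = 0;
  if (!path.empty() && path[0] == '$') i = 1;   // optional root marker
  std::string pointer;                          // "" means the whole document
  while (i < path.size()) {
    if (path[i] == '.') {                       // dot notation: .key
      const std::size_t start = ++i;
      while (i < path.size() && path[i] != '.' && path[i] != '[') ++i;
      if (i == start) return std::nullopt;      // ".." or a trailing '.'
      const std::string_view key = path.substr(start, i - start);
      if (key == "*") return std::nullopt;      // wildcard not supported
      pointer += '/';
      append_escaped(pointer, key);
    } else if (path[i] == '[') {                // bracket notation
      ++i;
      if (i >= path.size()) return std::nullopt;
      pointer += '/';
      if (path[i] == '\'' || path[i] == '"') {  // ['key'] or ["key"]
        const char quote = path[i++];
        const std::size_t start = i;
        while (i < path.size() && path[i] != quote) ++i;
        if (i >= path.size()) return std::nullopt;  // unterminated key
        append_escaped(pointer, path.substr(start, i - start));
        ++i;                                    // skip the closing quote
      } else {                                  // [index], digits only
        const std::size_t start = i;
        while (i < path.size() &&
               std::isdigit(static_cast<unsigned char>(path[i]))) ++i;
        if (i == start) return std::nullopt;    // [*], [-1], [?()], ...
        pointer.append(path.data() + start, i - start);
      }
      if (i >= path.size() || path[i] != ']') return std::nullopt;
      ++i;
    } else {
      return std::nullopt;                      // unexpected character
    }
  }
  return pointer;
}
```

With this sketch, `$.store.book[0].title` becomes `/store/book/0/title`, `.store.1.2.3` becomes `/store/1/2/3`, and a bracketed key containing dots such as `$['store.1.2.3']` becomes the single token `/store.1.2.3`. Expressions like `$..author` or `$.store.book[*].title` return `std::nullopt`. Note that the result says nothing about whether a token such as `1` is an array index or an object key; that is decided when the pointer is resolved against the actual document.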
I feel this would be valuable. |
@FranciscoThiesen Give it a try. |
Just wanted to give an update. I am actively working on it, currently trying to solve some linker errors |
@FranciscoThiesen Thank you for the update. Super excited about this functionality becoming available soon. |
When using bracket to specify field name in json path, e.g., |
To add to @PHILO-HE's question, does simdjson support json paths with keys with dots, e.g. $['store.1.2.3'] ? |
The documentation is as follows:
It is obviously underspecified. We fully support JSON Pointer (the entire specification), and @FranciscoThiesen essentially mapped a subset of JSONPath to the equivalent JSON Pointer. Observe that the issue is still open: we provide strictly limited support for JSON Path. We are certainly inviting further contributions.
Question 1.
I believe that the expected JSON Path is
void demo1() {
auto json = R"( {
"store": {
"book": [
{
"category": "reference",
"author": "Nigel Rees",
"title": "Sayings of the Century",
"price": 8.95
},
{
"category": "fiction",
"author": "Evelyn Waugh",
"title": "Sword of Honour",
"price": 12.99
},
{
"category": "fiction",
"author": "Herman Melville"
}
]
}
})"_padded;
ondemand::parser parser;
auto doc = parser.iterate(json);
// prints "Sayings of the Century"
std::cout << doc.at_path(".store.book[0].title") << std::endl;
}
Question 2.
It works in the sense of the examples below...
void demo2() {
auto json = R"( {
"store": ["aa", ["humbug", "Montreal", ["a", "christmas", "carol", "by", "charles", "dickens"]]]
})"_padded;
ondemand::parser parser;
auto doc = parser.iterate(json);
// prints "by"
std::cout << doc.at_path(".store.1.2.3") << std::endl;
}
void demo3() {
auto json = R"( {
"store": {"1":{ "2":{ "3": "by" } } }
})"_padded;
ondemand::parser parser;
auto doc = parser.iterate(json);
// prints "by"
std::cout << doc.at_path(".store.1.2.3") << std::endl;
} |
Hi @lemire, thanks for your comment! In Jayway JsonPath, I note an example for bracket–notation is |
The issue is open. We are eagerly inviting contributions. |
We support JSON Pointer, but we should support JSON Path. It is more work, but also more useful.
Currently, a limited subset of JSONPath is supported, see #2127