Unexpected recursive context inclusion exception #2778

Open
jmfernandez opened this issue Apr 30, 2024 · 2 comments

Comments

@jmfernandez

Hi, I'm involved in the https://github.com/ResearchObject community, and I'm writing some code to parse the RO-Crate JSON-LD representation for further processing. While running some tests with RDFLib 7.0.0, I think I uncovered a corner-case bug in its embedded JSON-LD processor plugin, and I have been able to narrow down the test code and the input that trigger it.

The following code:

#!/usr/bin/env python3

import json
import rdflib
import sys

for filename in sys.argv[1:]:
    with open(filename, mode="r", encoding="utf-8") as IJD:
        print(f"Loading {filename}")
        input_jld = json.load(IJD)
        
        g = rdflib.Graph()
        parsed = g.parse(data=json.dumps(input_jld), format="json-ld")

It works as expected with the following attached toy files:

But it fails with the following one:

raising:

Loading fails1.jsonld
Traceback (most recent call last):
  File "/home/jmfernandez/projects/rdflib/load_test.py", line 13, in <module>
    parsed = g.parse(data=json.dumps(input_jld), format="json-ld")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jmfernandez/projects/rdflib/.bug/lib/python3.11/site-packages/rdflib/graph.py", line 1492, in parse
    parser.parse(source, self, **args)
  File "/home/jmfernandez/projects/rdflib/.bug/lib/python3.11/site-packages/rdflib/plugins/parsers/jsonld.py", line 119, in parse
    to_rdf(data, conj_sink, base, context_data, version, generalized_rdf)
  File "/home/jmfernandez/projects/rdflib/.bug/lib/python3.11/site-packages/rdflib/plugins/parsers/jsonld.py", line 138, in to_rdf
    return parser.parse(data, context, dataset)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jmfernandez/projects/rdflib/.bug/lib/python3.11/site-packages/rdflib/plugins/parsers/jsonld.py", line 160, in parse
    context.load(local_context, context.base)
  File "/home/jmfernandez/projects/rdflib/.bug/lib/python3.11/site-packages/rdflib/plugins/shared/jsonld/context.py", line 401, in load
    self._prep_sources(base, source, sources, referenced_contexts)
  File "/home/jmfernandez/projects/rdflib/.bug/lib/python3.11/site-packages/rdflib/plugins/shared/jsonld/context.py", line 450, in _prep_sources
    self._prep_sources(
  File "/home/jmfernandez/projects/rdflib/.bug/lib/python3.11/site-packages/rdflib/plugins/shared/jsonld/context.py", line 430, in _prep_sources
    new_ctx = self._fetch_context(
              ^^^^^^^^^^^^^^^^^^^^
  File "/home/jmfernandez/projects/rdflib/.bug/lib/python3.11/site-packages/rdflib/plugins/shared/jsonld/context.py", line 463, in _fetch_context
    raise RECURSIVE_CONTEXT_INCLUSION
rdflib.plugins.shared.jsonld.errors.JSONLDException: recursive context inclusion

Surprisingly, the following ones work (at first I thought it was an issue with the current context):

@wallberg
Contributor

The problem is that fails1.jsonld.json references both https://w3id.org/ro/crate/1.1/context and https://w3id.org/ro/terms/workflow-run, but https://w3id.org/ro/terms/workflow-run itself also references https://w3id.org/ro/crate/1.1/context.

$ curl -s -D - -L --header "Accept: application/ld+json, application/json;q=0.9, */*;q=0.1" https://w3id.org/ro/terms/workflow-run
HTTP/1.1 303 See Other
Date: Fri, 10 May 2024 13:33:26 GMT
Server: Apache/2.4.29 (Ubuntu)
Access-Control-Allow-Origin: *
Location: https://www.researchobject.org/ro-terms/workflow-run/context.json
Content-Length: 347
Content-Type: text/html; charset=iso-8859-1

HTTP/2 200
server: GitHub.com
content-type: application/json; charset=utf-8
last-modified: Tue, 07 May 2024 10:47:00 GMT
access-control-allow-origin: *
etag: "663a06a4-4fa"
expires: Fri, 10 May 2024 12:53:33 GMT
cache-control: max-age=600
x-proxy-cache: MISS
x-github-request-id: 66BE:12EDA7:4B3666:5B7E8D:663E1674
accept-ranges: bytes
age: 102
date: Fri, 10 May 2024 13:33:26 GMT
via: 1.1 varnish
x-served-by: cache-ewr18145-EWR
x-cache: HIT
x-cache-hits: 0
x-timer: S1715348006.225059,VS0,VE2
vary: Accept-Encoding
x-fastly-request-id: 1182bf956578eccfb9b7d97fcb3c094c8b2465d6
content-length: 1274

{
    "@context": [
        "https://w3id.org/ro/crate/1.1/context",
        {
            "ParameterConnection": "https://w3id.org/ro/terms/workflow-run#ParameterConnection",
            "ContainerImage": "https://w3id.org/ro/terms/workflow-run#ContainerImage",
            "DockerImage": "https://w3id.org/ro/terms/workflow-run#DockerImage",
            "SIFImage": "https://w3id.org/ro/terms/workflow-run#SIFImage",
            "connection": "https://w3id.org/ro/terms/workflow-run#connection",
            "sourceParameter": "https://w3id.org/ro/terms/workflow-run#sourceParameter",
            "targetParameter": "https://w3id.org/ro/terms/workflow-run#targetParameter",
            "md5": "https://w3id.org/ro/terms/workflow-run#md5",
            "sha1": "https://w3id.org/ro/terms/workflow-run#sha1",
            "sha256": "https://w3id.org/ro/terms/workflow-run#sha256",
            "sha512": "https://w3id.org/ro/terms/workflow-run#sha512",
            "environment": "https://w3id.org/ro/terms/workflow-run#environment",
            "registry": "https://w3id.org/ro/terms/workflow-run#registry",
            "tag": "https://w3id.org/ro/terms/workflow-run#tag",
            "containerImage": "https://w3id.org/ro/terms/workflow-run#containerImage"
        }
    ]
}
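
The inclusion pattern described in this comment can be reproduced without any network access. The following is a toy model only, not rdflib's actual code: the in-memory CONTEXTS table stands in for remote fetches, and RecursiveInclusion, load and try_load are hypothetical names. It assumes the check is driven by a single set of already-referenced URLs shared across the whole load, which would explain why a context reachable by two different paths is reported as recursive even though no cycle exists.

```python
# Toy model of the inclusion graph described above; NOT rdflib's actual code.
CRATE = "https://w3id.org/ro/crate/1.1/context"
WFRUN = "https://w3id.org/ro/terms/workflow-run"

# Stand-in for remote fetches: URL -> context document.
CONTEXTS = {
    CRATE: {"@context": {}},           # leaf: references nothing else
    WFRUN: {"@context": [CRATE, {}]},  # also references the crate context
}


class RecursiveInclusion(Exception):
    """Hypothetical stand-in for the 'recursive context inclusion' error."""


def load(sources, referenced):
    """Walk context references, raising on any URL that was seen before."""
    for source in sources:
        if not isinstance(source, str):
            continue  # inline term definitions: nothing to fetch
        if source in referenced:
            raise RecursiveInclusion(source)
        referenced.add(source)
        nested = CONTEXTS[source]["@context"]
        load(nested if isinstance(nested, list) else [nested], referenced)


def try_load(document_contexts):
    """Return 'ok' or the error message for a document's @context list."""
    try:
        load(document_contexts, set())
        return "ok"
    except RecursiveInclusion as exc:
        return f"recursive context inclusion: {exc}"


print(try_load([CRATE]))         # a single context loads fine
print(try_load([WFRUN]))         # the nested reference alone loads fine
print(try_load([CRATE, WFRUN]))  # the same URL reached twice is rejected
```

In this model each context loads on its own, and only the combination fails, because the crate context is reached once directly and once through the workflow-run context.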

@wallberg
Contributor

What I wonder is why encountering the same context a second time needs to raise an exception at all. Why can't we simply skip it?

I tried replacing the exception with a skip (return None) at https://github.com/RDFLib/rdflib/blob/main/rdflib/plugins/shared/jsonld/context.py#L474-L475, and all tests pass except the one that expects the exception.
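
To make the trade-off concrete, here is a minimal sketch of two ways to handle a repeated context, again as a toy model rather than rdflib's implementation. The function names and the https://example.org/ URLs are hypothetical, and the CONTEXTS table replaces real fetches. The first function skips anything already loaded, as proposed above; the second raises only when a context includes one of its own ancestors, so it still reports genuine cycles.

```python
# Toy comparison of two policies for repeated contexts; NOT rdflib's code.
CRATE = "https://w3id.org/ro/crate/1.1/context"
WFRUN = "https://w3id.org/ro/terms/workflow-run"
LOOP_A = "https://example.org/a"  # hypothetical contexts that
LOOP_B = "https://example.org/b"  # include each other

# Stand-in for remote fetches: URL -> list of contexts it references.
CONTEXTS = {
    CRATE: [],
    WFRUN: [CRATE],
    LOOP_A: [LOOP_B],
    LOOP_B: [LOOP_A],
}


def load_skip_repeats(sources, seen=None, order=None):
    """Load each context once; silently skip any URL already loaded."""
    seen = set() if seen is None else seen
    order = [] if order is None else order
    for url in sources:
        if url in seen:
            continue  # the proposed "skip" instead of raising
        seen.add(url)
        order.append(url)
        load_skip_repeats(CONTEXTS[url], seen, order)
    return order


def load_detect_cycles(sources, active=()):
    """Raise only when a context includes one of its own ancestors."""
    order = []
    for url in sources:
        if url in active:
            raise ValueError(f"recursive context inclusion: {url}")
        order.append(url)
        order.extend(load_detect_cycles(CONTEXTS[url], active + (url,)))
    return order


print(load_skip_repeats([CRATE, WFRUN]))   # crate context loaded once
print(load_skip_repeats([LOOP_A]))         # terminates, cycle goes unreported
print(load_detect_cycles([CRATE, WFRUN]))  # crate context loaded twice
```

In this model, skipping also terminates on a true cycle but no longer reports it, while the ancestor check accepts the diamond and rejects the loop at the cost of processing the shared context twice. Whether skipping a repeat is always safe may depend on whether re-applying a context later in the list is meant to override earlier term definitions.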
