Invalid queries if variable name is substring of another variable name #361

Open
jaw111 opened this issue Aug 31, 2021 · 2 comments

jaw111 commented Aug 31, 2021

Given a query in which one variable name is a substring of another variable name:

select *
where {
  [] rdfs:label ?__label ;
    skos:prefLabel ?__label2 .
}

If the request URL includes the label parameter, e.g. ?label=foo, then the resulting query is invalid, because every occurrence of the string ?__label is replaced by "foo", including the one inside ?__label2:

select *
where {
  [] rdfs:label "foo" ;
    skos:prefLabel "foo"2 .
}

The logic for rewriting the queries should be more robust than simply replacing strings in the query text to account for this.

@c-martinez
Collaborator

Hi @jaw111! Interesting issue, I don't think we've ever come across this sort of use case before. We've been thinking for a while that the variable replacement code should be upgraded, to overcome issues such as #230, so maybe this is something to be taken into account as well.

The only thing I can think of is to do the string replacement starting from the longest variable name (?__label2 in your example above). Something along these lines should do the trick:

def doReplace(s, vals):
    # Start replacing longest variable names
    for key in sorted(vals.keys(), key=len, reverse=True):
        s = s.replace(key, vals[key])
    return s

This is not very sophisticated, but if you are aware of any more elegant algorithm to address this issue, we are open to suggestions :-)
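
One possible refinement, offered as a sketch rather than as how grlc actually does it: the longest-first ordering only helps when both ?__label and ?__label2 are present in vals. If the request supplies only label, ?__label2 would still be mangled. A regular expression that refuses to match when the variable name is followed by another name character avoids that case. The function name replace_variables is hypothetical, and the sketch assumes, like doReplace above, that the dictionary keys are the full variable tokens (e.g. "?__label") and the values are already serialized RDF terms. It uses Python's \w as an approximation of the characters allowed in a SPARQL variable name, and it does not skip occurrences inside string literals or comments.

```python
import re


def replace_variables(query, values):
    """Replace whole variable tokens only (hypothetical helper, not grlc's API)."""
    if not values:
        return query
    # Longest names first, so the alternation tries "?__label2" before "?__label".
    names = sorted(values, key=len, reverse=True)
    # The negative lookahead rejects a match that is followed by another
    # name character, so "?__label" never matches inside "?__label2".
    pattern = re.compile(
        "(?:" + "|".join(re.escape(name) for name in names) + r")(?!\w)"
    )
    # A function is used as the replacement so that backslashes in the
    # substituted value are inserted literally.
    return pattern.sub(lambda match: values[match.group(0)], query)


query = """select *
where {
  [] rdfs:label ?__label ;
    skos:prefLabel ?__label2 .
}"""

# Only ?__label is bound; ?__label2 is left untouched.
print(replace_variables(query, {"?__label": '"foo"'}))
```

With only ?__label bound, the second triple pattern keeps ?__label2 as a variable instead of becoming "foo"2.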


jaw111 commented Sep 29, 2021

@c-martinez I like your suggestion, it's nice and simple :)

My other thought was that, as the query is being translated into the SPARQL Algebra Expression with rdflib, it should be possible to programmatically manipulate that expression to replace the variable by the RDF term and reserialize back to text. That might be more complex than manipulating the query text as a string, but should be a more robust approach.

Another approach would be to construct a VALUES clause with the bindings for the relevant variables and simply append that to the query text as suggested in #332.
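
A minimal sketch of the VALUES idea, under several assumptions: the helper names sparql_string_literal and append_values_clause are hypothetical, every parameter value is treated as a plain string literal (IRIs, typed literals and language tags are out of scope), and the query does not already end with its own VALUES clause. SPARQL 1.1 allows a VALUES block after the query's solution modifiers, so the query text itself is left unchanged and the substring problem does not arise.

```python
def sparql_string_literal(value):
    """Serialize a Python string as a double-quoted SPARQL string literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def append_values_clause(query, bindings):
    """Append one VALUES row binding each variable to its string value.

    `bindings` maps full variable tokens (e.g. "?__label") to raw strings.
    """
    if not bindings:
        return query
    names = sorted(bindings)  # fixed order so the output is deterministic
    variables = " ".join(names)
    row = " ".join(sparql_string_literal(bindings[name]) for name in names)
    return f"{query.rstrip()}\nVALUES ({variables}) {{ ({row}) }}"


query = """select *
where {
  [] rdfs:label ?__label ;
    skos:prefLabel ?__label2 .
}"""

print(append_values_clause(query, {"?__label": "foo"}))
```

The printed query keeps both variables in the WHERE clause and ends with a line binding ?__label to "foo". One difference from text substitution is that the bound variable still appears in the result set.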
