fixing response headers of requests #367

Open
rapw3k opened this issue Jan 20, 2022 · 2 comments
@rapw3k

rapw3k commented Jan 20, 2022

Hi,
I have a few questions regarding headers of requests.
I am testing with this endpoint: http://grlc.io/api-git/rapw3k/cybele/#/json/get_allDatasets

However, page=1 gives the same result as a request without a page number, so shouldn't the next page be 2 if I don't send a page number?

Moreover, the links are actually incorrect: they contain grlc.io twice in the domain, i.e., http://grlc.io,grlc.io/. Also, why 10.0 instead of just 10?

thanks!
Raul

@c-martinez
Collaborator

Hi @rapw3k,

Thanks for your comments! I don't think this particular functionality has been extensively used, so it is not very polished: there is definitely room for improvement :-)

shouldn't be next page = 2 if i dont send page number ?
Yes, you are right. The "page" variable should probably be set to 1 if it is not present in the request.

links are actually incorrect, they have two times grlc.io in domain, i.e., http://grlc.io,grlc.io/ , and also why 10.0 instead of just 10
Not sure why the links are being generated like that, but they do indeed look incorrect.
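
To make the two points above concrete, here is a minimal sketch of how a pagination Link header could be built so that a missing page defaults to 1 and page numbers stay integers. It is only an illustration: `build_link_header`, its parameters, the example URL and the page size of 100 are assumptions, not grlc's actual code.

```python
from math import ceil
from urllib.parse import urlencode


def build_link_header(base_url, params, total_results, per_page=100):
    """Build a Link header value for a paginated response.

    Hypothetical helper for illustration; not grlc's implementation.
    """
    # Treat a missing "page" parameter as page 1.
    page = int(params.get("page", 1))
    # ceil() returns an int, so the last page is 10 rather than 10.0.
    last_page = max(1, ceil(total_results / per_page))

    def url_for(n):
        # base_url is used exactly once, so the host cannot be repeated here.
        query = dict(params, page=n)
        return f"{base_url}?{urlencode(query)}"

    links = []
    if page < last_page:
        links.append(f'<{url_for(page + 1)}>; rel="next"')
    if page > 1:
        links.append(f'<{url_for(page - 1)}>; rel="prev"')
    links.append(f'<{url_for(1)}>; rel="first"')
    links.append(f'<{url_for(last_page)}>; rel="last"')
    return ", ".join(links)


# Example with an assumed URL and the dummy count of 1000 results.
print(build_link_header("http://example.org/api/allDatasets", {}, 1000))
```

With no page parameter the "next" link points at page 2, and with a count of 1000 and 100 results per page the "last" link points at page 10.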

why the response headers says the last page is page 10
page 10 does not give link to next page
what will happen if i have more than 100 (x10 pages) =1000 results

These are all related -- the issue comes from the fact that, because counting results before executing the query is expensive, at the moment grlc just 'guesses' (or "Provides a dummy count for now") there will be 1000 results (ugly hack):

grlc/src/gquery.py

Lines 68 to 73 in d4ddb15

def count_query_results(query, endpoint):
    """
    Returns the total number of results that query 'query' will generate
    WARNING: This is too expensive just for providing a number of result pages
    Providing a dummy count for now
    """

Until now, we didn't have a good use case to justify the additional load of querying to pre-calculate the number of results. But if this is functionality that would be useful to you, maybe we've finally got a reason to implement this properly.
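
For reference, one common way to pre-calculate the number of results is to wrap the original SELECT query in a COUNT subquery. The sketch below only builds the query string; `make_count_query` is a hypothetical helper, not part of grlc, and it assumes the query starts directly with its PREFIX/BASE declarations (any leading comment lines would have to be stripped first). Running the count still costs one extra request to the endpoint per query.

```python
import re

# Matches a leading run of PREFIX / BASE declarations (the SPARQL prologue).
_PROLOGUE = re.compile(
    r"^\s*((?:(?:PREFIX\s+[^\s:]*:\s*<[^>]*>|BASE\s+<[^>]*>)\s*)*)",
    re.IGNORECASE,
)


def make_count_query(select_query):
    """Wrap a SELECT query in a subquery that counts its result rows.

    Hypothetical helper for illustration; not grlc's implementation.
    """
    match = _PROLOGUE.match(select_query)
    # The prologue must stay at the top level; it is not allowed in a subquery.
    prologue = match.group(1).strip()
    body = select_query[match.end():].strip()
    count = f"SELECT (COUNT(*) AS ?total) WHERE {{ {{ {body} }} }}"
    return f"{prologue}\n{count}" if prologue else count


query = "PREFIX dct: <http://purl.org/dc/terms/>\nSELECT ?d WHERE { ?d a dct:Dataset }"
print(make_count_query(query))
```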

Is the paging functionality something you would need for your use case? Are there any particular considerations you think should be taken into account?

@albertmeronyo -- what do you think? Do you know if there are other use cases which would benefit from this functionality?

@rapw3k
Author

rapw3k commented Jan 28, 2022

thanks for the reply @c-martinez
Indeed, we are having some cases where we are returning tens of results, and the paging becomes quite relevant in order to get them.

Some problematic points I see, apart from the ones mentioned above:

  • As far as I can see, one result may be spread across two pages.
  • The number of results per page is somewhat inconsistent: some pages return 1 result, others return 5, 6, or more than 10, e.g.:

https://grlc.io/api-git/cybele-project/metadata/allDatasets_testbed?testbed=https://w3id.org/cybele/datasets/PSNC&page=70

https://grlc.io/api-git/cybele-project/metadata/allDatasets_testbed?testbed=https://w3id.org/cybele/datasets/PSNC&page=72
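
One possible explanation for both symptoms, not confirmed in this thread, is that pages are cut by raw result rows while each dataset spans several rows (one per property value). The sketch below uses made-up rows and hypothetical helper names to show how row-based paging would then split one dataset across pages and give a varying number of datasets per page.

```python
from itertools import groupby


def paginate_rows(rows, page, per_page):
    """Return the slice of raw rows that belongs to a 1-based page."""
    start = (page - 1) * per_page
    return rows[start:start + per_page]


def datasets_on_page(rows, page, per_page):
    """List the distinct datasets that appear on a page, in row order."""
    page_rows = paginate_rows(rows, page, per_page)
    return [key for key, _ in groupby(page_rows, key=lambda r: r["dataset"])]


# Made-up data: dataset A has three rows, B has one, C has two.
rows = (
    [{"dataset": "A", "keyword": k} for k in ("k1", "k2", "k3")]
    + [{"dataset": "B", "keyword": "k1"}]
    + [{"dataset": "C", "keyword": k} for k in ("k1", "k2")]
)

for page in (1, 2, 3):
    print(page, datasets_on_page(rows, page, per_page=2))
```

Here dataset A appears on both page 1 and page 2, and the pages contain one, two and one datasets respectively, even though every page holds exactly two rows.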
