Skip to content
This repository has been archived by the owner on Feb 25, 2022. It is now read-only.

Pass offset and limit query parameters to MS Graph API #194

Open
dominique-pfister opened this issue Oct 15, 2020 · 3 comments
Open

Pass offset and limit query parameters to MS Graph API #194

dominique-pfister opened this issue Oct 15, 2020 · 3 comments
Assignees

Comments

@dominique-pfister
Copy link
Contributor

dominique-pfister commented Oct 15, 2020

The current implementation of data-embed for excel sheets fetches the usedRange of a sheet, which includes not only all values, but also:

  • all cell formulas
  • all cell values as text
  • all cell value types

and more (see Range for a list of properties implicitly returned as result). The complete response size for the current index definition amounts to 7MB.

It then converts the values into JSON and finally filters them for the rows requested by offset and limit.

Instead, one could compute the requested range and limit the result returned as follows:

  1. Get only the address of the usedRange, e.g. A1-I6200
  2. Compute the address of the requested rows, e.g. A3073-I3328 (for offset=3072&limit=256)
  3. Get the column names (using syntax /worksheets/('sheet')/range(address='A1:I1')?$select=values)
  4. Get the column values (using syntax /worksheets/('sheet')/range(address='A3073:I3328')?$select=values)
  5. Create the expected JSON output
@dominique-pfister dominique-pfister self-assigned this Oct 15, 2020
@dominique-pfister
Copy link
Contributor Author

dominique-pfister commented Oct 15, 2020

Testing shows that the new behaviour is actually slower because it consists of 2 more requests to the Graph API, which seems to outweigh the smaller responses being returned. On average, the new behaviour will cause data-embed to take 8s as opposed to 6s for the old behaviour.

A small optimization where only the values of a range are returned (not all the other information listed above), shows a slight performance gain in the range of 1-2 seconds.

@tripodsan
Copy link
Contributor

Testing shows that the new behaviour is actually slower because it consists of 2 more requests to the Graph API, which seems to outweigh the smaller responses being returned.

  • why are there 2 requests more ? isn't it only one more for the column names?
  • can you use $skip and $top instead of the address?
  • when a table is used, the header names can be retrieved.

@dominique-pfister
Copy link
Contributor Author

dominique-pfister commented Oct 16, 2020

why are there 2 requests more ? isn't it only one more for the column names?

1 request for the used range address, 1 request for the column names, 1 request for the values (which are not adjacent to the column names when offset is not zero), as opposed to 1 request for all values in the used range, which include the header row.

can you use $skip and $top instead of the address?

already tried, has no effect, unfortunately.

when a table is used, the header names can be retrieved.

right, but this would still use one more request

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants