DONE% value when downloading a Zarr asset is not correct #1407

Closed
kabilar opened this issue Feb 15, 2024 · 0 comments · Fixed by #1443


kabilar commented Feb 15, 2024

Description

When downloading a single Zarr asset, or an entire Dandiset that contains a Zarr asset, the DONE% value for the Zarr is constantly shown as 98% or 99%, whereas the Summary displays a seemingly more accurate progress value (e.g. 1.95% in the screenshot below).

[Screenshot of the dandi download progress output]

Reproducibility

  1. Version: dandi==0.59.1
  2. Command to download Zarr asset from Dandiset 000108:
    dandi download https://api.dandiarchive.org/api/assets/a25835b8-2293-4bf3-81f0-fbb3b22196fd/download/
    

Thank you

yarikoptic added a commit that referenced this issue May 14, 2024
Since maxsize is dynamically computed as we go through the files.
The idea, I guess, was that it would grow rapidly before actual
downloads commence, but that is not the case, so we end up with done%
always being close to 100%, since we get those reports on completed
downloads close to when the individual files are downloaded.

So this should close #1407 .
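
To illustrate the effect described above, here is a minimal sketch, not the actual dandi code. The function names and the `lag` parameter (the number of files assumed to be in flight) are hypothetical. It compares a denominator that only counts files started so far with one fixed to the total size known up front.

```python
def running_done_percent(sizes, lag=2):
    """done% after each completed file when the denominator only counts
    the files started so far (hypothetical model of the old behaviour)."""
    percents = []
    downloaded = 0
    for i, size in enumerate(sizes):
        downloaded += size
        # Files already finished plus `lag` files assumed to be in flight.
        started = sum(sizes[: i + 1 + lag])
        percents.append(100 * downloaded / started)
    return percents


def fixed_done_percent(sizes, total_size):
    """done% after each completed file against a total known up front."""
    percents = []
    downloaded = 0
    for size in sizes:
        downloaded += size
        percents.append(100 * downloaded / total_size)
    return percents


sizes = [1] * 1000  # 1000 equally sized files, purely illustrative
running = running_done_percent(sizes)
fixed = fixed_done_percent(sizes, total_size=sum(sizes))
print(round(running[99], 1), round(fixed[99], 1))  # 98.0 10.0
```

After 100 of 1000 files, the running denominator already reports about 98%, while the fixed total reports 10%.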

But for the total zarr size to be used, we also needed to account for
skipped files.  I added reporting of sizes for skipped files as well.
It seems there is no negative side effect on regular file downloads.
So now the %done of a zarr might reach 100% of the original size
with nothing having been downloaded.  But IMHO it is ok, since the user
does not care as much about how many "subparts" are downloaded, but
rather about getting an adequate progress report back.

There could also be side effects if  -e skip  is used and we skip
the download of some updated files which are smaller than the local
ones, so altogether we would get over 100% total at the end.
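
The skipped-file accounting and the over-100% corner case can be sketched as follows. This is a hypothetical model rather than the dandi implementation: the function name, the event tuples, and the assumption that a skipped file reports its local size are all illustrative.

```python
def zarr_done_percent(events, total_size):
    """Return done% for a zarr given (status, reported_size) events.

    status is "downloaded" or "skipped"; both kinds count towards
    progress, otherwise the total could never be reached when files
    are skipped (hypothetical model).
    """
    done = 0
    for status, reported_size in events:
        if status in ("downloaded", "skipped"):
            done += reported_size
    return 100 * done / total_size


# Remote zarr: two files of 40 and 60 bytes, so total_size is 100.
total_size = 100

# Everything already present locally: 100% with nothing downloaded.
all_skipped = zarr_done_percent([("skipped", 40), ("skipped", 60)], total_size)

# Assumed corner case: the local copy of the first file is an older,
# larger version (60 bytes) and is skipped, so its local size is reported.
stale_skip = zarr_done_percent([("skipped", 60), ("downloaded", 60)], total_size)

print(all_skipped, stale_skip)  # 100.0 120.0
```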
yarikoptic added a commit that referenced this issue May 15, 2024