Move attached files between assetstores #3467
To quote my comment from the import-tracker plugin PR (DigitalSlideArchive/import-tracker#19 (comment)):
I believe this version addresses the chunking issue you mentioned, correct? Or am I misinterpreting the bottleneck at hand here?
The chunking issue is the add in this loop:
Specifically, the download iterator gets something on the order of 64 KB at a time. We don't want to get the whole file before uploading (since it could be arbitrarily large). But in the girder code we default to an upload chunk size of 32 MB, which means we might add chunks 512 times per upload chunk. Python creates a new memory object each and every time that occurs and does a memory copy (and is slow about it). Better would be to collect the chunks in a list:
Though that repeated code bothers me, and maybe it'd be better to keep a tally of lengths rather than call sum repeatedly.
Got it, I made a PR with a modified version of your code for the speedup. It seems to perform quite well in comparison. I'm working on sorting out attached files on top of the listed changes.
We have a function to move files between assetstores. This doesn't support moving attached files, only files that are actually owned directly by an item.
Probably the correct behavior is to modify the upload record in `moveFileToAssetstore` to reflect the attached status.

Further, there is a major performance penalty in the movement of files because we build up the chunk to upload by repeatedly adding data blocks from the small downloaded chunks. We should, instead, keep a list of these chunks and join them once for upload rather than iteratively adding binary strings together.
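
For illustration, here is a minimal sketch of the list-and-join approach described above, with a running length tally instead of repeated `sum` calls. It is not the code from the PR: the `rechunk` helper, the `UPLOAD_CHUNK_SIZE` constant, and the generator shape are all hypothetical names chosen for this example, and the download iterator is assumed to yield `bytes` blocks.

```python
# Hypothetical constant: mirrors the 32 MB upload chunk size mentioned above.
UPLOAD_CHUNK_SIZE = 32 * 1024 * 1024


def rechunk(download_iter, chunk_size=UPLOAD_CHUNK_SIZE):
    """Yield upload-sized bytes objects from an iterator of small blocks.

    Blocks are collected in a list and joined once per upload chunk,
    rather than concatenated to a growing bytes object block by block.
    """
    pending = []     # downloaded blocks not yet emitted
    pending_len = 0  # running tally of the bytes held in `pending`
    for block in download_iter:
        if not block:
            continue
        pending.append(block)
        pending_len += len(block)
        if pending_len >= chunk_size:
            # One join per upload chunk instead of one copy per block.
            data = b''.join(pending)
            offset = 0
            while pending_len - offset >= chunk_size:
                yield data[offset:offset + chunk_size]
                offset += chunk_size
            # Carry any leftover bytes into the next upload chunk.
            remainder = data[offset:]
            pending = [remainder] if remainder else []
            pending_len = len(remainder)
    if pending:
        # Final, possibly short, chunk.
        yield b''.join(pending)
```

Each yielded chunk would then be passed to whatever upload call the move operation uses. With 64 KB blocks and a 32 MB chunk size, this does one join per upload chunk rather than roughly 512 incremental concatenations.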