Add benchmarking of various mounting strategies #67
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #67 +/- ##
==========================================
+ Coverage 60.43% 61.30% +0.86%
==========================================
Files 9 10 +1
Lines 685 876 +191
Branches 169 209 +40
==========================================
+ Hits 414 537 +123
- Misses 251 319 +68
Partials 20 20
Since S3 can have different latencies at different times of the day, let's also make sure we have some estimate of S3 latency during each benchmark if these benchmarks take some time to run. If they run in minutes, I would be less worried about latencies, and in that case we should just get multiple estimates to create some error bars.
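One way to get "multiple estimates to create some error bars" is to repeat a probe several times and report the mean with its standard error. The sketch below is only an illustration under assumptions: `time_repeated` and the `probe` callable are hypothetical names that are not part of this PR, and a real probe might fetch a small S3 object instead of sleeping.

```python
import statistics
import time
from typing import Callable


def time_repeated(probe: Callable[[], object], repeats: int = 5) -> tuple[float, float]:
    """Run `probe` `repeats` times; return (mean, standard error) in seconds."""
    if repeats < 2:
        raise ValueError("need at least two repeats to estimate spread")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        probe()
        samples.append(time.perf_counter() - start)
    mean = statistics.mean(samples)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    stderr = statistics.stdev(samples) / len(samples) ** 0.5
    return mean, stderr


if __name__ == "__main__":
    # Stand-in probe (hypothetical); replace with a small S3 request.
    mean, err = time_repeated(lambda: time.sleep(0.01), repeats=5)
    print(f"{mean:.4f}s ± {err:.4f}s")
```

Running the probe once before and once after a long benchmark would give a rough idea of whether latency drifted during the run.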
How?
This is old, but something like this: https://github.com/dvassallo/s3-benchmark
@satra This seems like something that should be done separately from |
c237d13 to 2c7bebe
@yarikoptic Problem: |
Since it's all in motion, I think it would be OK to point to that branch you have for dandi-cli with pydantic 2.0 compat.
@@ -3,7 +3,7 @@ set -ex

 PYTHON="$HOME"/miniconda3/bin/python
 DANDISETS_PATH=/mnt/backup/dandi/dandisets-healthstatus/dandisets
-MOUNT_PATH=/mnt/backup/dandi/dandisets-healthstatus/dandisets-fuse
+MOUNT_PATH=/tmp/dandisets-fuse
Let's do it under some safer user-specific location, e.g. /var/run/user/$UID/.
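A per-user mount point along the lines suggested here could be derived as in the following sketch. It is an assumption-laden illustration, not code from this PR: `user_mount_path` is a hypothetical helper, and preferring `XDG_RUNTIME_DIR` (which on systemd-based systems usually points at the per-user runtime directory) before falling back to `/var/run/user/<uid>` is a design choice of the sketch.

```python
import os
from pathlib import Path


def user_mount_path(name: str = "dandisets-fuse") -> Path:
    """Return a per-user mount point path (the directory is not created)."""
    # Prefer XDG_RUNTIME_DIR when set; otherwise fall back to the
    # /var/run/user/<uid> location mentioned in the review comment.
    base = os.environ.get("XDG_RUNTIME_DIR") or f"/var/run/user/{os.getuid()}"
    return Path(base) / name


if __name__ == "__main__":
    print(user_mount_path())
```

Unlike a fixed path under /tmp, such a directory is normally owned by and writable only for the current user.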
Why? /mnt/backup/dandi/dandisets-healthstatus/dandisets-fuse is the path we've been using on drogon for FUSE-mounting the Dandisets. Also, tools/run.sh is just for generating the healthstatus reports; the shell script for generating the benchmarks is unfinished and hasn't been committed yet.
Please provide results of running this benchmarking across the possible solutions, e.g. on typhon (it should be less busy ATM).
Scratch that about typhon; I forgot that we rely on having
@yarikoptic You initially said here to run the benchmarks on smaug.
if benchmarks rely on full clone of |
@yarikoptic I need permission to
Note that the colons in the URLs need to be escaped when adding them to the sudoers file. Also, |
Done.
@yarikoptic matlab needs to be installed on smaug so that I can benchmark the associated test.
@yarikoptic Ping.
Done now -- the same 2022b version is installed systemwide.
@yarikoptic When I try to run a matlab test on smaug, it fails with:
Note that there is no |
Could you please give me the full matlab invocation so I can make sure it works correctly? On smaug, do you run it under your account or some other one (like datalad, etc.)?
where |
command didn't run under my account on drogon, but worked (errored but past the license check) under |
@yarikoptic The benchmarking is failing because the matlab test on FUSE is exceeding the 1-hour timeout. I tried increasing the timeout to 2 hours, but it exceeded that as well. Should I try increasing the timeout to something incredibly high or take another approach?
How long would it run on that file if downloaded in full? If it is just generally very slow (half an hour), it might be something to relay to matnwb.
@yarikoptic 42 seconds
Hm. Any ideas on why the FUSE solution takes that long? How long does it take with datalad-fuse?
@yarikoptic I don't know why it's so slow with FUSE, and I don't know how long it would take with FUSE, as the benchmark code kills the process at the 2-hour timeout.
Please make the timeout 5 hours and run against both FUSE solutions: datalad-fuse and dandidav + davfs2.
Ideally, profile datalad-fuse while running the test to see where it spends time.
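For reference, a timeout of this kind can be enforced with the standard library alone. The following is a minimal sketch, not the benchmark code in this PR: `run_with_timeout` is a hypothetical helper, and it relies on `subprocess.run` killing the child process and raising `TimeoutExpired` once `timeout` seconds have passed.

```python
import subprocess
import sys
import time
from typing import Optional, Sequence


def run_with_timeout(cmd: Sequence[str], timeout_s: float) -> Optional[float]:
    """Run `cmd`; return elapsed seconds, or None if it exceeded `timeout_s`."""
    start = time.perf_counter()
    try:
        # On timeout, subprocess.run kills the child and raises TimeoutExpired.
        subprocess.run(cmd, check=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in command (hypothetical); a five-hour limit would be 5 * 3600.
    elapsed = run_with_timeout([sys.executable, "-c", "pass"], timeout_s=5 * 3600)
    print("timed out" if elapsed is None else f"{elapsed:.2f}s")
```

Note that only the direct child is killed this way; a test that spawns its own subprocesses may need process-group handling on top of this.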
@yarikoptic The matnwb test on datalad-fuse exceeded the five-hour time limit as well. How exactly should I profile it? Just use py-spy?
First, py-spy would not hurt indeed. Then, if py-spy was not conclusive, I would probably add log lines at DEBUG level within datalad-fuse to see what is actually taking time there.
@yarikoptic Is there a way to get datalad's logs to include timestamps?
Yes. There are also a number of other possibly helpful options (available through env vars or even git config, since they are defined in common_cfg) for augmenting logging behavior:
Disregard
@yarikoptic When I try running
I don't know why the
EDIT: I realized that I wasn't setting the environment for the
@yarikoptic I believe |
Let's try on |
Closes #66.
To do:
pynwb_open_load_ns
matnwb_nwbRead
DANDI_CACHE=ignore dandi ls (to load metadata) on a single local asset; sub-mouse1-fni16/sub-mouse1-fni16_ses-161228151100.nwb in 000016 is suggested as a possible candidate.