Compare performance of webdav+davfs2, webdav+webdavfs, and datalad-fuse #66
Comments
@yarikoptic Please answer my questions above. |
in general I wouldn't mind you choosing the way, but let me make the decision on the first way:
yes, let's call it
yes, I think so. Should take a path to operate from. this way we could test against some custom mounted filesystems etc
I guess it might come handy to troubleshoot etc, but I didn't envision it... So something like
could be ( |
@yarikoptic I still can't decide whether this should be subcommands of
Other concerns:
|
only for some extra_depends, e.g.
I think that is ok for now, but could as well just be listed in overall -- I do not see major cons stated for this one. As for an independent script, I would imagine (if coded exposing some general interface) the pros would be: it could be used by others to test some other operations, not necessarily healthchecks but maybe some "real" analysis functions on data from the archive, e.g.
can't we just rely on them to be in the PATH, and then |
|
added symlink to it now under /usr/local/bin
yeah -- unfortunately, it seems to me we might need a custom unmount operation per setup/filesystem type. |
@yarikoptic I've decided to implement this as |
@yarikoptic How exactly should mounting & unmounting with webdavfs work? Based on its README, the recommended way to use webdavfs is to install it at |
Just for consistency, I did symlink mount it as
$> mount -t webdav http://localhost:8080 /tmp/dandiarchive-fuse
mount: /tmp/dandiarchive-fuse: must be superuser to use mount.
$> webdavfs http://localhost:8080 /tmp/dandiarchive-fuse
http://localhost:8080: no PUT Range support, mounting read-only |
@yarikoptic You didn't answer my question: Which commands should I use for mounting & unmounting webdavfs?
|
why? FWIW it is exactly the same binary used.
It would need to be |
I even immediately came up with a recipe for disaster:
didn't try, but I wonder if something like that could happen from a FUSE filesystem - i.e. could there be root suid'ed content |
@yarikoptic When I said "you can give me permission to run |
via wrapper scripts I guess -- yeah, we could do that similarly to that unmount command, no problem. |
@yarikoptic No, not via wrapper scripts (That would just obscure what's going on to readers of the
and then I, via |
cool, I didn't know I could specify full command invocations in sudoers. Verified that it works with /bin/ls locally:
❯ sudo grep bin/ls /etc/sudoers
[sudo] password for yoh:
yoh ALL=(ALL:ALL) NOPASSWD: /bin/ls --color=auto
❯ sudo -k
❯ sudo /bin/ls
[sudo] password for yoh:
sudo: a password is required
❯ sudo /bin/ls --color=auto
ab Maps
...
❯ sudo /bin/ls --color=auto -l
[sudo] password for yoh:
sudo: a password is required
added now those two to try out. |
|
|
The benchmarking is invoked as a CLI command, not a Python function. Commands can't return lists.
What sort of visualization? |
I meant internally.
I mean a text summary display in that CLI command at the end. Overall on above two points - just follow the classical MVC design pattern and have that model (structure of results) and view (CLI summary) with controller (benchmarking code). This way later on we can more easily change rendering or add another usage/visualization (e.g. store + summary over different runs etc). |
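A minimal sketch of the model/view/controller split suggested above, assuming Python (the language of dandisets-healthstatus). The names `BenchmarkResult`, `run_benchmarks`, `render_summary`, and the `measure` callable are hypothetical and do not come from any existing code; the point is only that the results structure is kept separate from how it is rendered.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class BenchmarkResult:
    """Model: one timed run of one test against one mount type."""

    mount: str  # e.g. "davfs2", "webdavfs", "datalad-fuse"
    test: str  # e.g. "pynwb_open_load_ns"
    seconds: float  # wall-clock duration of the run


def run_benchmarks(
    mounts: Iterable[str],
    tests: Iterable[str],
    measure: Callable[[str, str], float],
) -> list[BenchmarkResult]:
    """Controller: collect one result per (mount, test) pair.

    `measure` is a hypothetical callable that performs a single timed run
    and returns the elapsed seconds.
    """
    tests = list(tests)
    return [BenchmarkResult(m, t, measure(m, t)) for m in mounts for t in tests]


def render_summary(results: Iterable[BenchmarkResult]) -> str:
    """View: render the model as a plain-text table for the CLI."""
    header = f"{'mount':<14}{'test':<22}{'seconds':>10}"
    rows = [f"{r.mount:<14}{r.test:<22}{r.seconds:>10.2f}" for r in results]
    return "\n".join([header, *rows])


if __name__ == "__main__":
    # Stand-in for a real timed run, for demonstration only.
    demo = run_benchmarks(
        ["davfs2", "datalad-fuse"], ["pynwb_open_load_ns"], lambda m, t: 1.5
    )
    print(render_summary(demo))
```

With this split, a later "store + summary over different runs" view would only need another function that consumes the same list of `BenchmarkResult` objects.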
@yarikoptic If a test fails, should it be included in the visualization? What if a test is killed due to exceeding the one-hour timeout? |
hm... I think any failure should be treated as an error in the case of benchmarking, and we would need to resolve it first. |
@yarikoptic What about timeouts? |
error out if timeout happens I think |
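One way the "fail or timeout means error" rule agreed on above might look in code, as a sketch only. `BenchmarkError` and `timed_run` are hypothetical names; the one-hour default is taken from the timeout mentioned earlier in this thread.

```python
import subprocess
import sys
import time


class BenchmarkError(Exception):
    """Raised when a benchmarked command fails or exceeds its timeout."""


def timed_run(argv, timeout=3600.0):
    """Run argv and return the elapsed seconds.

    Any non-zero exit status or timeout aborts the benchmark by raising
    BenchmarkError, so no partial timing is ever reported.
    """
    start = time.perf_counter()
    try:
        # check=True raises on non-zero exit; timeout kills the child.
        subprocess.run(list(argv), check=True, timeout=timeout)
    except subprocess.TimeoutExpired as e:
        raise BenchmarkError(f"{argv[0]} exceeded {timeout}s timeout") from e
    except subprocess.CalledProcessError as e:
        raise BenchmarkError(f"{argv[0]} exited with status {e.returncode}") from e
    return time.perf_counter() - start


if __name__ == "__main__":
    # Trivial command that succeeds, for demonstration only.
    print(f"{timed_run([sys.executable, '-c', 'pass']):.3f}s")
```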
@yarikoptic Is davfs2 currently set up so that I can do |
now you are -- but it is `-t davfs` apparently
(base) smaug:~$ sudo /usr/bin/mount -t davfs http\://127.0.0.1\:8080 /tmp/dandisets-fuse
Please enter the username to authenticate with server
http://127.0.0.1:8080 or hit enter for none.
Username:
Please enter the password to authenticate user with server
http://127.0.0.1:8080 or hit enter for none.
Password:
/sbin/mount.davfs: connection timed out two times;
trying one last time
/sbin/mount.davfs: server temporarily unreachable;
mounting anyway
(base) smaug:~$ sudo umount /tmp/dandisets-fuse
/sbin/umount.davfs: waiting for mount.davfs (pid 535461) to terminate gracefully .. OK
we have 1.6.1-1 installed; upstream has 1.7.0. I filed a request for an update: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1060078 but I might just NMU it later, although judging from the changelog I do not expect any performance changes there. |
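The mount/unmount pair from the transcript above could be wrapped in a context manager so that the unmount always runs, even when a test errors out. This is a sketch under assumptions: `davfs2_mount` and the injectable `run` parameter are hypothetical, `sudo -n` is used so that a missing sudoers entry fails instead of prompting for a password, and the davfs2 username/password prompts seen in the transcript are not handled here.

```python
import subprocess
from contextlib import contextmanager


def _run(argv):
    """Default runner: execute the command and raise on failure."""
    subprocess.run(list(argv), check=True)


@contextmanager
def davfs2_mount(url, mountpoint, run=_run):
    """Mount `url` at `mountpoint` via davfs2, unmounting on exit.

    The argument order mirrors the transcript. A sudoers entry that pins a
    full command line only matches when the arguments are identical, so the
    lists below must stay in sync with whatever is configured in sudoers.
    """
    run(["sudo", "-n", "/usr/bin/mount", "-t", "davfs", url, mountpoint])
    try:
        yield mountpoint
    finally:
        run(["sudo", "-n", "umount", mountpoint])


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    with davfs2_mount("http://127.0.0.1:8080", "/tmp/dandisets-fuse", run=print):
        pass
```

Passing a different `run` callable keeps the exact command lines inspectable without needing root, which also makes the sudoers entries easy to cross-check.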
@yarikoptic Please append the following lines to
|
@yarikoptic Also, did you install webdavfs as
EDIT: According to the |
uncommented existing one and changed to 0, but didn't add that path limiter... try
it is there
smaug:/mnt/btrfs/scrap
$> ls -l /usr/local/sbin/
total 20
-rwx------ 1 root staff 1647 Jul 2 2015 btrfsQuota*
-rwxr-xr-- 1 root adm 86 Jan 31 2015 flush-caches*
-rwx------ 1 root root 88 Dec 12 2018 flush_caches_kyle*
lrwxrwxrwx 1 yoh staff 23 Jan 17 13:16 mount.webdavfs -> /usr/local/bin/webdavfs*
-rwsr-xr-x 1 root root 43 Jan 5 12:11 unmount-tmp-fuse*
-rwxr-xr-x 1 root root 3636 Dec 18 2014 zfs-monitor.pl*
$> ls -l /usr/local/bin/webdavfs
lrwxrwxrwx 1 yoh staff 22 Jan 10 09:15 /usr/local/bin/webdavfs -> /opt/webdavfs/webdavfs*
$> ls -l /opt/webdavfs/webdavfs
-rwxr-xr-x 1 yoh yoh 8021561 Jan 5 12:02 /opt/webdavfs/webdavfs*
and indeed odd since shell does find it
smaug:/mnt/btrfs/scrap
$> sudo mount -t webdavfs http://127.0.0.1:8080 /tmp/dandisets-fuse2
mount: /tmp/dandisets-fuse2: unknown filesystem type 'webdavfs'.
dmesg(1) may have more information after failed mount system call.
$> sudo which mount.webdavfs
/usr/local/sbin/mount.webdavfs
dunno... try to figure it out, if not -- there is |
oh... ok... hate to do that for local installs, but will do for uniformity (anyways will need to package the damn thing if it ends up to be the winner ;-) ) |
@yarikoptic I can get |
we have now
|
FWIW -- tried webdavfs on drogon but failed to get the content of .zattrs:
so not sure if that works at all now :-/ |
the same for davfs2... maybe none of those supports redirects? since it seems to be ok for dandiset.yaml
dandi@drogon:/mnt/backup/dandi$ cat /mnt/backup/dandi/dandidav-davfs2/dandisets/000108/draft/samples.tsv
cat: /mnt/backup/dandi/dandidav-davfs2/dandisets/000108/draft/samples.tsv: Input/output error
dandi@drogon:/mnt/backup/dandi$ head /mnt/backup/dandi/dandidav-davfs2/dandisets/000108/draft/dandiset.yaml
id: DANDI:000108/draft
doi: 10.80507/dandi.123456/0.123456.1234
url: https://dandiarchive.org/dandiset/000108/draft
name: Light sheet imaging of the human brain
dang.. |
@yarikoptic davfs2 does support redirects, but it has to be enabled by adding
Also, I found this davfs2 issue that may be relevant to what we're doing: "Version 1.7.0 much slower than 1.6.1 (a hundred times slower)" |
@yarikoptic I filed a bug report with davfs2 about lack of double-redirect support, but it doesn't look like the maintainers are actively handling bugs lately. |
@yarikoptic webdavfs doesn't support redirects at all; I filed an issue with it requesting support: miquels/webdavfs#30 |
Per brief discussion during our CON meetup today, just a note here (ping @jwodder) that we need to include in comparison our datalad-fuse solution (described in original post), so we make an informed decision on what backend to use for the healthstatus (currently datalad-fuse is used). |
(Copied from dandi/dandi-infrastructure#164 (comment) et sequentes)
A script should be written to run & time the following tests:
- `pynwb_open_load_ns` from dandisets-healthstatus
- `matnwb_nwbRead` from dandisets-healthstatus
- `dandi ls` (to load metadata) on a single local asset

These should be run with `DANDI_CACHE=ignore` set in order to avoid any possible caching side effects from fscacher.

The tests should be run on assets mounted using each of the following methods:
- datalad-fuse

The assets to test should be one (or more?) sample assets of some "typical" size (a few GBs). `sub-mouse1-fni16/sub-mouse1-fni16_ses-161228151100.nwb` in 000016 is suggested as a possible candidate.

Testing should be run on smaug.
- Webdavfs has been installed on smaug at `/opt/webdavfs/webdavfs`.
- `umount` as root.
- `sudo /usr/local/sbin/unmount-tmp-fuse` can be run to forcibly unmount `/tmp/dandisets-fuse`.
- davfs2 is currently installed on smaug both system wide and (for a more recent version) at `/opt/davfs2/DESTDIR/usr/local/sbin/` (?), but @yarikoptic reports issues with getting it to work.

@yarikoptic Question: Should the script be standalone or implemented as one or more subcommands of `dandisets-healthstatus`?
- If implementing as `dandisets-healthstatus` subcommands:
  - What subcommands? Should there just be one subcommand that does all the benchmarking at once (mount mounts, run & time tests)? Do we need (as suggested in the original issue) a `run_benchmarks` command that just runs & times the tests? Should there be dedicated subcommands for mounting each of the three mount types and unmounting once the user hits Ctrl-C? Perhaps one subcommand that mounts a single mount type specified on the command line, runs & times the tests, and then unmounts?
  - If the benchmarking is to be implemented as part of `dandisets-healthstatus`, this issue should be moved to that repository.
- If implementing as a separate script, the script will need to either use `dandisets-healthstatus` as a dependency or else copy essential parts of its code.
  - If `dandisets-healthstatus` is used as a dependency, then since the benchmarking script will be separate from it, this comes with the risk that any future change to `dandisets-healthstatus` will break the script. One option to address this would be to include a Git commit hash in the benchmarking script's requirements specifier for `dandisets-healthstatus`, but then the benchmarking script won't get any benefits that may come from future updates to `dandisets-healthstatus`.
  - If we do this, I assume the script should be saved in this repository?
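As a rough illustration of the `dandi ls` timing requirement in this description, the sketch below runs the command on one asset with `DANDI_CACHE=ignore` set in the child's environment. The function names and the overridable `dandi_argv` parameter are hypothetical; where the asset actually lives under each mount is not specified here, so the path is left as a parameter.

```python
from __future__ import annotations

import os
import subprocess
import sys
import time


def benchmark_env() -> dict[str, str]:
    """Copy the current environment with DANDI_CACHE=ignore set,
    to avoid caching side effects from fscacher."""
    env = dict(os.environ)
    env["DANDI_CACHE"] = "ignore"
    return env


def time_dandi_ls(asset_path: str, dandi_argv=("dandi", "ls")) -> float:
    """Time `dandi ls` on a single asset and return the elapsed seconds.

    `dandi_argv` is overridable only so the function can be exercised
    without dandi installed; the default is the command named above.
    """
    start = time.perf_counter()
    subprocess.run(
        [*dandi_argv, asset_path],
        check=True,  # any failure is treated as an error
        env=benchmark_env(),
        stdout=subprocess.DEVNULL,  # the metadata output itself is not needed
    )
    return time.perf_counter() - start
```

Calling `time_dandi_ls` once per mount, with the asset's path under that mount, would give one comparable number per mounting method.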