
--recursive option not behaving as expected (likely PEBKAC) #857

Open
mddeff opened this issue Oct 28, 2023 · 2 comments

Comments


mddeff commented Oct 28, 2023

As the title says, this is likely a PEBKAC/ID10T issue, but I haven't been able to figure it out, so I'm sending up a flare. Redirect me as appropriate.

Source:

[mike@fs01]~% uname -a
Linux fs01.svr.zeroent.net 4.18.0-147.5.1.el8_1.x86_64 #1 SMP Wed Feb 5 02:00:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
[mike@fs01]~% zfs --version
zfs-0.8.3-1
zfs-kmod-0.8.3-1
[mike@fs01]~% zfs list dozer1 -o name -r
NAME
dozer1
dozer1/fast1
dozer1/fast1/k8s
dozer1/fast1/k8s/dmz
dozer1/fast1/k8s/internal
dozer1/fast1/spartek
dozer1/fast1/spartek/k8s1
dozer1/fast1/spartek/vms1
dozer1/fast1/vms3
dozer1/tank0
dozer1/tank0/data
dozer1/tank0/spartek
dozer1/tank0/spartek/vms1
dozer1/tank2
dozer1/tank2/docker
dozer1/tank2/docker/docker0
dozer1/tank2/iot
dozer1/tank2/nextcloud
dozer1/tank2/vms

(Inb4 CentOS 8 is dead and I'm running it on a storage array; it's on the to-do list. And I'm sure my drastically different ZFS versions aren't ideal either.)

All of those datasets are created, populated, and managed by syncoid running locally on the source, via my autosyncoid script.

Dest:

[root@backup01 ~]# cat /etc/redhat-release 
Rocky Linux release 9.2 (Blue Onyx)
[root@backup01 ~]# uname -a
Linux backup01.svr.zeroent.net 5.14.0-284.30.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Sep 16 09:55:41 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
[root@backup01 ~]# zfs --version
zfs-2.1.13-1
zfs-kmod-2.1.13-1

On dest, I pre-created the dozer0/ze-fs01/dozer1 dataset and then ran:

syncoid --no-sync-snap --force-delete --recursive --no-privilege-elevation \
  zfsbackup@fs01.svr.zeroent.net:dozer1 \
  dozer0/ze-fs01/dozer1

It successfully creates dozer0/ze-fs01/dozer1/fast1 and then syncs dozer1/fast1/* to dozer0/ze-fs01/dozer1/fast1/*, recursively creating all the necessary child datasets. Then, when it gets to dozer1/tank2, it barfs:

INFO: Sending oldest full snapshot dozer1/tank2/vms@autosnap_2023-09-01_00:00:07_monthly (~ 195.6 GB) to new target filesystem:                                                                                                               
cannot open 'dozer0/ze-fs01/dozer1/tank2': dataset does not exist                                                                                                                                                                             
cannot receive new filesystem stream: unable to restore to destination                                                 
CRITICAL ERROR: ssh      -S /tmp/syncoid-zfsbackup@fs01.svr.zeroent.net-1697895831-1968 zfsbackup@fs01.svr.zeroent.net ' zfs send  '"'"'dozer1/tank2/vms'"'"'@'"'"'autosnap_2023-09-01_00:00:07_monthly'"'"' | mbuffer  -q -s 128k -m 16M' |  
zfs receive  -s -F 'dozer0/ze-fs01/dozer1/tank2/vms' failed: 256 at /usr/local/sbin/syncoid line 549.

So I manually created all of the child datasets on dest and then re-ran the same command, and now it seems to be working (ish):

CRITICAL: no snapshots exist on source dozer1/tank0, and you asked for --no-sync-snap.
NEWEST SNAPSHOT: autosnap_2023-10-28_17:00:00_hourly
Removing dozer0/ze-fs01/dozer1/tank0/data because no matching snapshots were found
NEWEST SNAPSHOT: autosnap_2023-10-28_17:00:00_hourly
INFO: Sending oldest full snapshot dozer1/tank0/data@syncoid_fs01.svr.zeroent.net_2022-10-12:07:17:08 (~ 3450.1 GB) to new target filesystem:

While fast1 has snapshots (everything under there has the same retention policy, so I have sanoid snap the whole dataset recursively), the tank* datasets do not, as they have mixed usage and snapshots are only taken on their child datasets.
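For context, my sanoid.conf is laid out roughly like this (the template names and retention numbers here are illustrative placeholders, not my real values):

[dozer1/fast1]
        use_template = production
        recursive = yes

[dozer1/tank2/vms]
        use_template = production

[dozer1/tank2/iot]
        use_template = frequent

# no section for dozer1/tank2 itself, so the parent never gets snapshotted

[template_production]
        hourly = 36
        daily = 30
        monthly = 3
        autosnap = yes
        autoprune = yes

[template_frequent]
        frequently = 8
        hourly = 48
        daily = 14
        autosnap = yes
        autoprune = yes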

It looks like when a dataset has no snapshots (and I'm using --no-sync-snap), syncoid skips it, but then it also never gets created on the target, so there's no parent to hold the child datasets that do have snapshots.
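For reference, the manual fix on dest boiled down to plain zfs create calls for the snapshot-less intermediate datasets, something like:

[root@backup01 ~]# zfs create -p dozer0/ze-fs01/dozer1/tank0
[root@backup01 ~]# zfs create -p dozer0/ze-fs01/dozer1/tank2

(-p creates any missing parents along the way, mkdir -p style), and so on for any other intermediate dataset that has no snapshots of its own.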

Is this behavior expected or have I found an edge case?

As always, thank you to Jim and the {san,sync,find}oid contributors that enable enterprise-grade storage/backup for the FOSS community!

jimsalterjrs (Owner) commented Oct 28, 2023 via email

mddeff (Author) commented Oct 28, 2023

Yep, totally understood. It's not so much a space thing; rather, I'm creating the snapshots at the child dataset level because the children have different uses (and consequently different retention policies).

For instance, tank2/vms and tank2/iot have different needs for retention, so the policies are different.

Would it be better to:

A) Set a general Sanoid snapshot policy for tank2 that is the union of the policies for vms and iot, and then keep the per-child delta policies on each of those child datasets? (This feels error-prone, with two different policies producing the snapshot set for a single dataset, but I could be overthinking it.)

B) Point Syncoid at tank2/vms and tank2/iot separately?

C) Just do what I did: create blank datasets on the target and let Syncoid take it from there?

When I originally read about the --recursive option, my brain auto-completed a mkdir -p style behavior. Is there any reason that behavior would be unwanted? Effectively, it would just create blank datasets on the target to fill in the tree structure above the datasets that actually have snapshots to transfer. Maybe a new flag?
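To sketch what I'm picturing (just shell-level pseudocode for the idea, not a claim about how syncoid is structured internally): before each receive of a child like dozer1/tank2/vms, do roughly

TARGET=dozer0/ze-fs01/dozer1/tank2/vms
PARENT="${TARGET%/*}"
zfs list -H -o name "$PARENT" >/dev/null 2>&1 || zfs create -p "$PARENT"

i.e. the same thing I ended up doing by hand, just automated per target dataset.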
