Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Question] What is the meaning of NFS v4 grace period EVENT: EVENT_RELEASE_IP, EVENT_TAKE_IP? #1108

Closed
zhitaoli-6 opened this issue Mar 29, 2024 · 17 comments

Comments

@zhitaoli-6
Copy link
Contributor

zhitaoli-6 commented Mar 29, 2024

After seeing the implementation about state recovery, I understand as follows.

  1. EVENT_TAKE_IP means that if some new IP is added to the nfs-ganesha node, then nfs-ganesha enters into grace period to wait state recovery of the old NFS clients. This IP is where NFS server runs.

  2. EVENT_RELEASE_IP means that NFS clients from that IP are marked as expired, and the OP_RECLAIM_COMPLETE will be cancelled. Ganesha enters into grace period and wait for their completion of state reclaim. This IP is where NFS clients runs.
    I see no caller of this event. What is the background of this event?Is it necessary for nfs-ganesha to enter into grace?

These two events are asymmetrical and quite confusing. Looking forward to some introduction about these events.

@zhitaoli-6
Copy link
Contributor Author

Close this issue because it maybe duplicated with issue. It is unclear or not maintained now.

@zhitaoli-6
Copy link
Contributor Author

Reopen to wait for confirm from our community :)

@zhitaoli-6 zhitaoli-6 reopened this Apr 1, 2024
@ffilz ffilz added the question label Apr 1, 2024
@dang
Copy link
Contributor

dang commented Apr 2, 2024

I don't think these events are related at all. EVENT_RELEASE_IP appears to be used to force-expire a client. EVENT_TAKE_IP is used for server failover.

@zhitaoli-6
Copy link
Contributor Author

There is no description of force-expiring a NFS client according NFS v4.x RFC.

And EVENT_RELEASE_IP that forces to expire a client can not work at all according to this issue. No NFS client records will match with the IP released.

@zhitaoli-6
Copy link
Contributor Author

zhitaoli-6 commented Apr 8, 2024

When a nfs-ganesha node named NodeA crashes, the VIP(virtual IP) will be assigned to another node named NodeB. EVENT_TAKE_IP is used to tell the destination NodeB that it will serve with a new VIP(virtual IP), which will enter into grace period and wait for reclaim from old NFS clients connected with that VIP.

When NodeA comes back, the VIP is assigned to NodeA again. We need an event to tell NodeB to release client_records and states from memory , and still keeps the states in FSAL layer. Then NFS clients connected with that VIP will reclaim their states with NodeA.

The implementation of EVENT_RELEASE_IP is far away from our need. In fact, EVENT_RELEASE_VIP will force-expire clients from the IP, and it will also release all related states(open, byte-range lock, delegation and layout) from FSAL.

So I think we need an opposite event with EVENT_TAKE_IP to achieve high availability of VIP. If I misunderstand something, please point it out :)

Looking forward to help from our community.

@ffilz
Copy link
Member

ffilz commented Apr 8, 2024

OK, so your question is how to implement fail back,

EVENT_RELEASE_IP is for client assisted fail back, the cluster is put into grace period, the release IP is sent to the node that has the state which releases it from the FSAL. Take IP is sent to the recovered node to let it know to accept reclaims on that IP address. The clients will discover their clientid is failed and will reclaim. So basically, for those clients attached to NodeA's virtual IP, it will appear that their server crashed.

We have floated the idea of state transfer without this disruption but no one has stepped up with development resources for that yet.

@zhitaoli-6
Copy link
Contributor Author

zhitaoli-6 commented Apr 15, 2024

In our case, lock_state are persisted into FSAL layer. When lock is reclaimed, we check the existence of this lock state from underlying storage. And if it doesn't exist, the lock reclaim request from NFS client will fail. So in our usage, EVENT_RELEASE_IP should release states from nfs-ganesha memory but can NOT release lock state from FSAL layer(underlying storage). I want to do some adaption according to our case.

Before adaption, the call path which releases lock from FSAL is as follows:

1. nfs_release_v4_clients(ip) 
2. nfs_client_id_expire(cp, true)
3. state_nfs4_owner_unlock_all(lock_owner)
4. state_unlock()
5. do_lock_op()
6. lock_op2()

I want to add an argument bool fsal_unlock into state_unlock(), which controls whether to unlock from FSAL. After adaption:

1. nfs_release_v4_clients(ip) 
2. nfs_client_id_expire(cp, true) 
3. state_nfs4_owner_unlock_all(lock_owner, fsal_unlock=false) 
4. state_unlock(fsal_unlock=false). // this will skip to call do_lock_op()

Is this change acceptable to other cases? And are there any potential problems that I miss? Please help review this draft change :)

@ffilz
Copy link
Member

ffilz commented Apr 15, 2024

Is your persistent lock state enough to be able to reclaim locks without the client assistance? That is something folks are interested in.

But to your question, I think the best route might be to add a support bit to the FSAL supported things and then add a parameter set by RELEASE_IP that tests that support bit down in do_lock_op.

@zhitaoli-6
Copy link
Contributor Author

Our NFS v4.x lock implementation is as follows in general.

When lock is acquired, we persist the lock record into FSAL(our backend) with its clientid4. We strongly rely on the lifetime of clientid4 to validate the lock by checking whether the clientid4 field of lock record exists in recovery_backend.

  1. Non-reclaim lock can be granted if there is no such valid lock record in our backend.
  2. Lock can be reclaimed if there is such lock record.

On failures:

  1. If the client expires in nfs-ganesha, its clientid4 will be removed from recovery_backend. Lock will be non-reclaimable for NFS client.
  2. If some nfs-ganesha node named node-A fails, its VIP will be assigned to another node-B. Node-A need release states from its memory without entering into grace period. Node-B will enter into grace period, waiting for lock reclaim from NFS client. After grace period ends, clientid4 in recovery_backend will be reset according to the clientid4 list re-established during grace period. If lock reclaim does not succeed during grace period, the lock will be non-reclaimable for the old NFS client.

So in our case, EVENT_RELEASE_IP is used to mark those NFS clients that access the moving VIP stale and release their states from nfs-ganesha memory, but still keep the states into FSAL. Those NFS client will receive NFS4ERR_STALE_CLIENTID and try to reclaim their states with the destination nfs-ganesha node.

Are there any issues about correctness in our NFS v4.x lock design? If it is ok, I will move ahead with this suggestion and submit a patch to our mainstream.

But to your question, I think the best route might be to add a support bit to the FSAL supported things and then add a parameter set by RELEASE_IP that tests that support bit down in do_lock_op.

@ffilz
Copy link
Member

ffilz commented Apr 16, 2024

That looks good, please do submit a patch via Gerrithub.

@zhitaoli-6
Copy link
Contributor Author

We also change ip_match() implementation (issue #985) to match those NFS clients that access the VIP. Should these changes be merged into one patch?

@ffilz
Copy link
Member

ffilz commented Apr 17, 2024

Please submit the other change as a separate patch.

ffilz pushed a commit to ffilz/nfs-ganesha that referenced this issue Apr 20, 2024
Some FSAL persists enough information about lock. Itself can
control whether to reclaim lock, not completely trusting NFS
client. If VIP is detached, the FSAL expects lock record keep in
the backend storage, which is used to check following reclaim.

See nfs-ganesha#1108

Change-Id: Idb223b115fe846968644c86ddc7f9797fca32957
Signed-off-by: zhitaoli <zhitao.li@iomesh.com>
ffilz pushed a commit to ffilz/nfs-ganesha that referenced this issue Apr 20, 2024
Some FSAL persists enough information about lock. Itself can
control whether to reclaim lock, not completely trusting NFS
client. If VIP is detached, the FSAL expects lock record keep in
the backend storage, which is used to check following reclaim.

See nfs-ganesha#1108

Change-Id: Idb223b115fe846968644c86ddc7f9797fca32957
Signed-off-by: zhitaoli <zhitao.li@iomesh.com>
ffilz pushed a commit to ffilz/nfs-ganesha that referenced this issue Apr 20, 2024
Some FSAL persists enough information about lock. Itself can
control whether to reclaim lock, not completely trusting NFS
client. If VIP is detached, the FSAL expects lock record keep in
the backend storage, which is used to check following reclaim.

See nfs-ganesha#1108

Change-Id: Idb223b115fe846968644c86ddc7f9797fca32957
Signed-off-by: zhitaoli <zhitao.li@iomesh.com>
@zhitaoli-6
Copy link
Contributor Author

Please review this patch on Gerrithub :)

ffilz pushed a commit to ffilz/nfs-ganesha that referenced this issue Apr 20, 2024
Some FSAL persists enough information about lock. Itself can
control whether to reclaim lock, not completely trusting NFS
client. If VIP is detached, the FSAL expects lock record keep in
the backend storage, which is used to check following reclaim.

See nfs-ganesha#1108

Change-Id: Idb223b115fe846968644c86ddc7f9797fca32957
Signed-off-by: zhitaoli <zhitao.li@iomesh.com>
ffilz pushed a commit to ffilz/nfs-ganesha that referenced this issue Apr 22, 2024
Some FSAL persists enough information about lock. Itself can
control whether to reclaim lock, not completely trusting NFS
client. If VIP is detached, the FSAL expects lock record keep in
the backend storage, which is used to check following reclaim.

See nfs-ganesha#1108

Change-Id: Idb223b115fe846968644c86ddc7f9797fca32957
Signed-off-by: zhitaoli <zhitao.li@iomesh.com>
ffilz pushed a commit to ffilz/nfs-ganesha that referenced this issue Apr 22, 2024
Some FSAL persists enough information about lock. Itself can
control whether to reclaim lock, not completely trusting NFS
client. If VIP is detached, the FSAL expects lock record keep in
the backend storage, which is used to check following reclaim.

See nfs-ganesha#1108

Change-Id: Idb223b115fe846968644c86ddc7f9797fca32957
Signed-off-by: zhitaoli <zhitao.li@iomesh.com>
ffilz pushed a commit to ffilz/nfs-ganesha that referenced this issue Apr 22, 2024
Some FSAL persists enough information about lock. Itself can
control whether to reclaim lock, not completely trusting NFS
client. If VIP is detached, the FSAL expects lock record keep in
the backend storage, which is used to check following reclaim.

See nfs-ganesha#1108

Change-Id: Idb223b115fe846968644c86ddc7f9797fca32957
Signed-off-by: zhitaoli <zhitao.li@iomesh.com>
ffilz pushed a commit to ffilz/nfs-ganesha that referenced this issue Apr 23, 2024
Some FSAL persists enough information about lock. Itself can
control whether to reclaim lock, not completely trusting NFS
client. If VIP is detached, the FSAL expects lock record keep in
the backend storage, which is used to check following reclaim.

See nfs-ganesha#1108

Change-Id: Idb223b115fe846968644c86ddc7f9797fca32957
Signed-off-by: zhitaoli <zhitao.li@iomesh.com>
zhitaoli-6 added a commit to iomesh/nfs-ganesha that referenced this issue Apr 28, 2024
Some FSAL persists enough information about lock. Itself can
control whether to reclaim lock, not completely trusting NFS
client. If VIP is detached, the FSAL expects lock record keep in
the backend storage, which is used to check following reclaim.

See nfs-ganesha#1108

Change-Id: Idb223b115fe846968644c86ddc7f9797fca32957
Signed-off-by: zhitaoli <zhitao.li@iomesh.com>
zhitaoli-6 added a commit to iomesh/nfs-ganesha that referenced this issue Apr 30, 2024
Some FSAL persists enough information about lock. Itself can
control whether to reclaim lock, not completely trusting NFS
client. If VIP is detached, the FSAL expects lock record keep in
the backend storage, which is used to check following reclaim.

See nfs-ganesha#1108

Change-Id: Idb223b115fe846968644c86ddc7f9797fca32957
Signed-off-by: zhitaoli <zhitao.li@iomesh.com>
@zhitaoli-6
Copy link
Contributor Author

What can I do to move this patch forward?

@xiaods
Copy link

xiaods commented May 9, 2024

i have seen it is on "Ready to submit" status

@ffilz
Copy link
Member

ffilz commented May 9, 2024

Sorry, two weeks ago, I was in a rush to finish the weekly merge and missed this patch, and then last week I ran out of time to do a weekly merge. I should get it merged this week.

ffilz pushed a commit that referenced this issue May 10, 2024
Some FSAL persists enough information about lock. Itself can
control whether to reclaim lock, not completely trusting NFS
client. If VIP is detached, the FSAL expects lock record keep in
the backend storage, which is used to check following reclaim.

See #1108

Change-Id: Idb223b115fe846968644c86ddc7f9797fca32957
Signed-off-by: zhitaoli <zhitao.li@iomesh.com>
@xiaods
Copy link

xiaods commented May 26, 2024

/close as resolved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants