Updates to the way meta indexing is handled for filestore. #4450

Merged
merged 1 commit into from Aug 30, 2023

derekcollison

Historically we kept indexing information, either by sequence or by subject, as a per msg block operation. These were the ".idx" and ".fss" indexing files. When streams became very large this could have an impact on recovery time. Also, for encryption the fast path for determining if the indexing was current would require loading and decrypting the complete block.

This design moves to a more traditional WAL and snapshot approach. The snapshots for the complete stream, including summary information, global per subject information maps (PSIM) and per msg block details including summary and dmap, are processed asynchronously. The snapshot includes the msg block and hash for the last record that was considered in the snapshot. On recovery the snapshot is read and processed, and any additional records past the point of the snapshot itself are processed. To this end, any non-system removal of a message has to be expressed as a delete tombstone that is always added to the fs.lmb file. These are processed on recovery and our indexing layer knows to skip them.
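
For readers new to the pattern, here is a minimal sketch of the snapshot plus replay recovery described above. It is not the filestore code: `record`, `index`, `snapshot`, `takeSnapshot` and `recoverIndex` are hypothetical names, the WAL is modeled as a single in-memory slice rather than msg blocks on disk, and the FNV record hash is an assumption chosen only to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

// record is a hypothetical WAL entry: a stored message, or a delete
// tombstone whose seq names the message being removed.
type record struct {
	seq       uint64
	subject   string
	tombstone bool
}

// index is a toy stand-in for the stream index (live messages plus a
// PSIM-like per subject count).
type index struct {
	live map[uint64]string // live sequence -> subject
	psim map[string]uint64 // subject -> live message count
}

func newIndex() *index {
	return &index{live: map[uint64]string{}, psim: map[string]uint64{}}
}

// apply folds one WAL record into the index. Tombstones remove a message
// instead of adding one.
func (ix *index) apply(r record) {
	if !r.tombstone {
		ix.live[r.seq] = r.subject
		ix.psim[r.subject]++
		return
	}
	if subj, ok := ix.live[r.seq]; ok {
		delete(ix.live, r.seq)
		ix.psim[subj]--
		if ix.psim[subj] == 0 {
			delete(ix.psim, subj)
		}
	}
}

func (ix *index) clone() *index {
	c := newIndex()
	for k, v := range ix.live {
		c.live[k] = v
	}
	for k, v := range ix.psim {
		c.psim[k] = v
	}
	return c
}

// hashRecord is an assumed record hash (FNV-1a), for illustration only.
func hashRecord(r record) uint64 {
	h := fnv.New64a()
	fmt.Fprintf(h, "%d|%s|%t", r.seq, r.subject, r.tombstone)
	return h.Sum64()
}

// snapshot captures the index plus a marker for the last record it covers.
type snapshot struct {
	ix       *index
	covered  int    // number of WAL records reflected in ix
	lastHash uint64 // hash of the last covered record
}

func takeSnapshot(wal []record) snapshot {
	ix := newIndex()
	for _, r := range wal {
		ix.apply(r)
	}
	s := snapshot{ix: ix, covered: len(wal)}
	if len(wal) > 0 {
		s.lastHash = hashRecord(wal[len(wal)-1])
	}
	return s
}

// recoverIndex loads the snapshot, checks that it still matches the WAL,
// then replays only the records written after the snapshot point.
func recoverIndex(s snapshot, wal []record) (*index, error) {
	if s.covered > len(wal) ||
		(s.covered > 0 && hashRecord(wal[s.covered-1]) != s.lastHash) {
		return nil, errors.New("snapshot does not match WAL; full rebuild needed")
	}
	ix := s.ix.clone()
	for _, r := range wal[s.covered:] {
		ix.apply(r)
	}
	return ix, nil
}

func main() {
	wal := []record{
		{seq: 1, subject: "orders.new"},
		{seq: 2, subject: "orders.new"},
		{seq: 3, subject: "orders.paid"},
	}
	snap := takeSnapshot(wal)
	// Writes after the snapshot: one new message and one delete tombstone.
	wal = append(wal, record{seq: 4, subject: "orders.paid"}, record{seq: 2, tombstone: true})
	ix, err := recoverIndex(snap, wal)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(ix.live), ix.psim["orders.new"], ix.psim["orders.paid"]) // prints: 3 1 2
}
```

In this sketch, a removal that happens after the snapshot is only visible because it was appended as a tombstone; without it, replaying the tail would have no way to learn that sequence 2 is gone.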

Changing to this method drastically improves startup and recovery times, and has simplified the code. Some normal performance benefits have been seen as well.

Signed-off-by: Derek Collison <derek@nats.io>


@neilalexander neilalexander left a comment


LGTM, I really like the approach here.

Some items for 2.11 out of this:

  1. Clean-up of accounting, quite a lot of duplicated code
  2. Clean-up of key loading from disk

server/filestore.go (outdated, resolved)

@wallyqs wallyqs left a comment


LGTM

Historically we kept indexing information, either by sequence or by subject, as a per msg block operation. These were the "*.idx" and "*.fss" indexing files. When streams became very large this could have an impact on recovery time. Also, for encryption the fast path for determining if the indexing was current would require loading and decrypting the complete block.

This design moves to a more traditional WAL and snapshot approach. The snapshots for the complete stream, including summary information, global per subject information maps (PSIM) and per msg block details including summary and dmap, are processed asynchronously. The snapshot includes the msg block and hash for the last record considered in the snapshot. On recovery the snapshot is read and processed, and any additional records past the point of the snapshot itself are processed. To this end, any removal of a message has to be expressed as a delete tombstone that is always added to the fs.lmb file. These are processed on recovery and our indexing layer knows to skip them.

Changing to this method drastically improves startup and recovery times, and has simplified the code. Some normal performance benefits have been seen as well.

Signed-off-by: Derek Collison <derek@nats.io>
@derekcollison derekcollison merged commit b9b284d into dev Aug 30, 2023
2 checks passed
@derekcollison derekcollison deleted the fs-meta-updates branch August 30, 2023 23:49
3 participants