stored: add dedup backend #1663

Draft
wants to merge 54 commits into
base: master

sebsura
Contributor

@sebsura sebsura commented Jan 15, 2024

Thank you for contributing to the Bareos Project!

The dedup backend makes it possible for deduplicating filesystems to deduplicate backed-up data.

Based on PR #1662

Please check

  • Short description and the purpose of this PR is present above this paragraph
  • Your name is present in the AUTHORS file (optional)

If you have any questions or problems, please give a comment in the PR.

Helpful documentation and best practices

Checklist for the reviewer of the PR (will be processed by the Bareos team)

Make sure you check/merge the PR using devtools/pr-tool to have some simple automated checks run and a proper changelog record added.

General
  • Is the PR title usable as CHANGELOG entry?
  • Purpose of the PR is understood
  • Commit descriptions are understandable and well formatted
  • Check backport line
  • Required backport PRs have been created
Source code quality
  • Source code changes are understandable
  • Variable and function names are meaningful
  • Code comments are correct (logically and spelling)
  • Required documentation changes are present and part of the PR
Tests
  • Decision taken that a test is required (if not, then remove this paragraph)
  • The choice of the type of test (unit test or systemtest) is reasonable
  • Testname matches exactly what is being tested
  • On a fail, output of the test leads quickly to the origin of the fault

@sebsura
Contributor Author

sebsura commented Jan 17, 2024

One thing I was not sure about is the error reporting. Currently the device itself reports errors as soon as it catches them, printing an Emsg for each. This makes it easier to give good error reports, but it will probably lead to a lot of duplicate reports, like:

dedup device : could not open dedup volume. ERR=...
normal bareos device: could not open dedup volume. ERR=...

An alternative would be to set errno appropriately (printing a Dmsg instead of an Emsg) and let the normal Bareos routines handle the error reporting.

The drawback is that some failures have no associated error number, since they may be internal logic errors, so there is no way to give a helpful message in those cases.
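One possible middle ground, sketched below, is to hand the caller an error value that carries either an errno or a free-form detail string, so that only one layer prints the message. `DeviceError` and `FormatError` are hypothetical names made up for this illustration and are not existing Bareos code; the message format simply mirrors the example lines above.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical error value for illustration only (not a Bareos type).
struct DeviceError {
  int sys_errno;       // 0 if no OS error number is associated
  std::string detail;  // description used for internal logic errors
};

// Builds the single message that the reporting layer would print.
// Note: std::strerror is not guaranteed to be thread-safe.
inline std::string FormatError(const std::string& what,
                               const DeviceError& err)
{
  if (err.sys_errno != 0) {
    return what + ". ERR=" + std::strerror(err.sys_errno);
  }
  if (!err.detail.empty()) { return what + ". ERR=" + err.detail; }
  return what + ". ERR=unknown internal error";
}
```

With this shape, the device would only emit a debug message when it catches an error, and the generic layer would call something like `FormatError("could not open dedup volume", err)` once.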

@sebsura sebsura force-pushed the dev/ssura/fvec/dedup branch 2 times, most recently from 4a0948a to 61553e0 Compare January 19, 2024 09:31
It was previously not possible to abort once a commit was
started (since the savestate was moved immediately on the CommitBlock
call).  This was fixed by instead waiting for the commit to finish
before issuing the move.

This allows us to split up records into multiple parts with desirable
sizes. For example, a 129k record may be split into a 128k part and a
1k part so that at least the first 128k are dedupable.

The record header is no longer treated specially.  It's just
another (tiny) bit of data.

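The splitting described in this commit message can be sketched as follows. This is a minimal illustration, not the PR's actual code: `SplitIntoParts` is a hypothetical helper, and it assumes that "k" means 1024 bytes and that full-size parts come first with the remainder last.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: splits a record of `record_size` bytes into
// parts of at most `block_size` bytes. Full-size parts come first,
// the (smaller) remainder, if any, comes last.
inline std::vector<std::size_t> SplitIntoParts(std::size_t record_size,
                                               std::size_t block_size)
{
  assert(block_size > 0);
  std::vector<std::size_t> parts;
  while (record_size > 0) {
    std::size_t part = record_size < block_size ? record_size : block_size;
    parts.push_back(part);
    record_size -= part;
  }
  return parts;
}
```

Calling `SplitIntoParts(129 * 1024, 128 * 1024)` yields two parts of 131072 and 1024 bytes, matching the 128k + 1k example above.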
Previously we tracked the available capacity instead, but this caused
problems: mmap wants its offsets page aligned, and we cannot
guarantee that cap * element_size is page aligned.
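To make the alignment problem concrete: mmap requires the file offset to be a multiple of the page size, and an arbitrary `cap * element_size` generally is not. The sketch below uses hypothetical helper names and takes the page size as a parameter (on POSIX it would typically come from `sysconf(_SC_PAGESIZE)`); it is not taken from the PR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true if `offset` could be used as an mmap offset
// for the given page size.
inline bool IsPageAligned(std::size_t offset, std::size_t page_size)
{
  assert(page_size > 0);
  return offset % page_size == 0;
}

// Hypothetical helper: rounds `offset` up to the next multiple of
// `page_size` (returns `offset` unchanged if it is already aligned).
inline std::size_t RoundUpToPage(std::size_t offset, std::size_t page_size)
{
  assert(page_size > 0);
  return (offset + page_size - 1) / page_size * page_size;
}
```

For example, with 4096-byte pages, a capacity of 1000 elements of 24 bytes each gives an offset of 24000, which is not page aligned; the next aligned offset is 24576.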
@arogge arogge added this to the 24.0.0 milestone Apr 2, 2024
3 participants