test or feat: timeline hash via fullbackup #7715

Open
koivunej opened this issue May 13, 2024 · 3 comments
Labels
a/test Area: related to testing c/storage/pageserver Component: storage: pageserver t/feature Issue type: feature, for new features or requests

Comments

@koivunej (Contributor)

At least in tests it would be nice to have a hash of the timeline state at an LSN.

This is computed in test_ancestor_detach_branched_from and should be used in test_import.py as well. Path to a stable, hashable full backup:

  • option to skip zenith.signal
  • sorted visitation order for listings that currently use a HashSet, such as DbDir
  • set 0 as mtime in tar

More context: #7706 (comment)
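The normalization steps above could be sketched on the consumer side as a deterministic digest over the fullbackup tar. This is a hypothetical illustration (the function name and the read-side normalization are assumptions, not the pageserver's actual implementation): entries are visited in sorted name order, zenith.signal is skipped, and mtimes are simply not hashed, so a zeroed mtime on the producer side becomes irrelevant.

```python
import hashlib
import tarfile


def fullbackup_hash(tar_path: str) -> str:
    """Hypothetical sketch: hash a fullbackup tar deterministically.

    Normalizes on the read side so that archive member ordering
    (e.g. HashSet iteration order in the producer) and mtimes do
    not affect the digest, and the LSN-dependent zenith.signal
    file is excluded.
    """
    digest = hashlib.sha256()
    with tarfile.open(tar_path) as tar:
        # Sort members by name so producer-side ordering does not leak
        # into the hash.
        for member in sorted(tar.getmembers(), key=lambda m: m.name):
            if member.name == "zenith.signal":
                continue  # skip the LSN-dependent signal file
            digest.update(member.name.encode())
            if member.isfile():
                digest.update(tar.extractfile(member).read())
    return digest.hexdigest()
```

With this shape, two backups taken with different internal iteration orders and timestamps but identical page contents hash identically.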

Alternatively, the fullbackup and tar_cmp code should be refactored and reused in test_import.py, which currently only compares the tar sizes, probably because a portable fullbackup comparison seemed too time-consuming to implement at the time.
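A portable member-by-member comparison need not be elaborate. A minimal sketch (function names are hypothetical, not the actual tar_cmp in the repo) that reports which entries differ, rather than only comparing archive sizes:

```python
import tarfile


def tar_members(tar_path: str) -> dict[str, bytes]:
    """Read a tar into {member name: contents} for portable comparison."""
    out = {}
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
            else:
                out[member.name] = b""
    return out


def tar_cmp(left: str, right: str) -> list[str]:
    """Return the sorted names of entries that differ between two tars.

    Unlike a size check, this catches same-size archives whose
    contents diverge, and names the offending members.
    """
    a, b = tar_members(left), tar_members(right)
    return sorted(
        name for name in a.keys() | b.keys() if a.get(name) != b.get(name)
    )
```

An empty result means the archives are content-identical regardless of member ordering or timestamps inside the tar headers.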

@koivunej koivunej added t/feature Issue type: feature, for new features or requests c/storage/pageserver Component: storage: pageserver a/test Area: related to testing labels May 13, 2024
@koivunej (Contributor, Author)

sorted visitation order for listings that currently use a HashSet, such as DbDir

One could argue that it would not be easy to hash-DoS us via the relation numbers, so these could all be FxHasher-based collections. I am unsure whether that would give us determinism in all possible cases, though.

@jcsp (Contributor)

jcsp commented May 13, 2024

Would also like to use this for things like test_sharding_split_compaction, where we should check that absolutely everything is still readable after we drop/rewrite layers.

@koivunej (Contributor, Author)

koivunej commented May 15, 2024

For sharding I think the pythonic way will work better: you can merge the per-shard lists of hashes (which will still be unique), as I assume one needs to do with sharding.
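The per-shard merge described above could be sketched as follows (a hypothetical helper, assuming each page lives on exactly one shard so the combined hash list stays duplicate-free):

```python
def merge_shard_hashes(per_shard: list[list[str]]) -> list[str]:
    """Hypothetical sketch: combine per-shard page hash lists into one
    sorted list.

    Because each page is owned by exactly one shard, the merged list
    should contain no duplicates; a duplicate would indicate a page
    unexpectedly present on multiple shards.
    """
    merged = sorted(h for shard in per_shard for h in shard)
    assert len(merged) == len(set(merged)), "unexpected duplicate hash"
    return merged
```

Comparing the merged, sorted list before and after a split then checks that nothing was lost or corrupted, independent of how pages were distributed across shards.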

koivunej added a commit that referenced this issue May 22, 2024
"Taking a fullbackup" is an ugly multi-liner copy-pasted in multiple places, most recently in the timeline ancestor detach tests. Move it under `PgBin`, which is not a great place, but better than yet another utility function.

Additionally:
- clean up `psql_env` repetition (PgBin already configures that)
- move the backup tar comparison into yet another free utility function
- use the backup tar comparison in `test_import.py`, where only a size check was done previously
- clean up extra timeline creation from the test

Cc: #7715