clone/get should support setting annex.private=true #6456

Open
mih opened this issue Feb 17, 2022 · 4 comments · May be fixed by #7247

Comments

mih (Member) commented Feb 17, 2022

This was already requested some time ago in #5835 in a slightly different context, but it has never been done. Being able to do it would be a major improvement when updating large datasets, where repeated temporary clones need to be made to fit within the storage allotment.

In the simplest case, the --reckless switch would get a dedicated mode for this.

matrss (Contributor) commented May 2, 2022

For clone this is possible already, by passing additional options to git clone: datalad clone <url> -c annex.private=true.

This would also be useful for create, e.g. when setting up a mostly read-only dataset in a temporary location.
Appending additional options to the git command does not work here, because they are added at the wrong position.

mih (Member, Author) commented Jan 11, 2023

> This would also be useful for create, e.g. when setting up a mostly read-only dataset in a temporary location. Appending additional options to the git command does not work here, because they are added at the wrong position.

It seems to be doable already too:

% datalad -c annex.private=true create dummy
% cat dummy/.git/config
...
[annex]
        uuid = f1eb3d37-0251-4510-bc98-ad2a7b2fcb55
        version = 8
        private = true
...

However, this way of passing the config item does not work for clone. The resulting heterogeneity is undesirable.

mslw (Contributor) commented Jan 11, 2023

Though at first glance the above doesn't fully work (trying on DataLad 0.17.10):

❱ datalad -c annex.private=true create dummy
❱ cd dummy 
❱ cat .git/config
...
[annex]
	uuid = 1245a968-4d01-4280-be8d-cc2cf66e9c96
	version = 10
	private = true
...
❱ git cat-file blob git-annex:uuid.log
1245a968-4d01-4280-be8d-cc2cf66e9c96 mszczepanik@bnbnb64:~/dummy timestamp=1673436462.930674266s
❱ ls .git/annex/journal-private
ls: cannot access '.git/annex/journal-private': No such file or directory

I can't see why. Could that be due to the internal order of things within create, or did I simply make a mistake?

Edit: I think that's because of the order in which things happen within create. The config overrides are applied after _setup_annex_repo, and thus after git annex init, but private mode requires the config option to be set in the Git repository before git annex init runs.
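
The ordering constraint described here can be sketched outside of DataLad as a plain sequence of Git commands. This is only a minimal illustration, not DataLad's implementation: the helper names private_init_commands and init_private_repo are hypothetical, and it assumes a git-annex version that supports annex.private.

```python
import subprocess
from pathlib import Path


def private_init_commands(path):
    """Return the commands for a private annex, in the order they must run."""
    repo = str(Path(path))
    return [
        ["git", "init", repo],
        # annex.private has to be in the repository's config *before*
        # `git annex init`; otherwise the UUID is recorded in the
        # git-annex branch (uuid.log), as seen in the output above.
        ["git", "-C", repo, "config", "annex.private", "true"],
        ["git", "-C", repo, "annex", "init"],
    ]


def init_private_repo(path, run=subprocess.run):
    """Run the commands in order; `run` is injectable so the order can be tested."""
    for cmd in private_init_commands(path):
        run(cmd, check=True)
```

Calling init_private_repo("dummy") would run the three commands against a real directory. Passing a different run callable allows inspecting the sequence without touching the filesystem.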

I am finishing a PR (above) in which I propose introducing --reckless private for clone and get, and a --private option for create. We can of course discuss whether that is an appropriate naming/interface.

mih (Member, Author) commented Jan 11, 2023

Yes, I can confirm that I concluded too quickly. The config is set, but too late to have an impact on git annex init.
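
The symptom discussed in this thread (annex.private is set in .git/config, yet the repository still shows up in uuid.log) can be checked with a small script. This is a hedged sketch: uuid_is_announced and is_effectively_private are hypothetical helpers, and the parsing assumes the uuid.log line layout shown in the output earlier in the thread, with the UUID as the first whitespace-separated field.

```python
import subprocess


def uuid_is_announced(uuid_log_text, uuid):
    """True if `uuid` is the first field of any line in uuid.log content."""
    for line in uuid_log_text.splitlines():
        fields = line.split()
        if fields and fields[0] == uuid:
            return True
    return False


def is_effectively_private(repo):
    """Check a repository on disk (needs git and an initialized annex)."""
    def git(*args):
        result = subprocess.run(["git", "-C", repo, *args],
                                capture_output=True, text=True)
        return result.stdout.strip(), result.returncode

    uuid, _ = git("config", "annex.uuid")
    private, _ = git("config", "--bool", "annex.private")
    log, rc = git("cat-file", "blob", "git-annex:uuid.log")
    # A missing uuid.log (non-zero exit) means nothing was announced.
    announced = rc == 0 and uuid_is_announced(log, uuid)
    return private == "true" and not announced
```

Fed the uuid.log line and UUID from the 0.17.10 transcript, uuid_is_announced returns True, which matches the observation that the config value alone does not make the repository private.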
