Add shallow checkout for azure pipelines #949

Open
Trzs opened this issue Jan 5, 2024 · 3 comments

@Trzs
Contributor

Trzs commented Jan 5, 2024

  • bootstrap.py always checks out the full repository
  • the Azure runner is severely disk-limited, so the '.git' directories are deleted again afterwards
  • Kokkos needs git information in order to be built, which forces an unwieldy workaround: deleting the '.git' dir in an extra step
  • I propose adding --depth=1 to the git clone commands in bootstrap.py, so that only the necessary data is downloaded in the first place
  • this can be reversed, where necessary, with git fetch --unshallow

Note to @nksauter, @bkpoon
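
A minimal sketch of what the proposal could look like, assuming bootstrap.py assembles its git clone call as an argument list. The helper name `build_clone_command` and its `shallow` parameter are hypothetical and are not taken from bootstrap.py:

```python
def build_clone_command(url, target_dir, shallow=False):
    """Return the argument list for `git clone`, optionally shallow.

    Hypothetical helper for illustration; bootstrap.py may structure
    its clone calls differently.
    """
    cmd = ["git", "clone"]
    if shallow:
        # --depth=1 downloads only the most recent commit
        cmd.append("--depth=1")
    cmd.extend([url, target_dir])
    return cmd


if __name__ == "__main__":
    print(build_clone_command(
        "https://github.com/cctbx/cctbx_project.git",
        "cctbx_project",
        shallow=True,
    ))
```

Keeping `shallow` off by default would leave existing callers with a full clone.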

@phyy-nx
Contributor

phyy-nx commented Jan 5, 2024

$ time git clone git@github.com:cctbx/cctbx_project.git
real	0m44.325s
$ du -hs cctbx_project/
255M	cctbx_project/

$ time git clone --depth=1 git@github.com:cctbx/cctbx_project.git
real	0m12.546s
$ du -hs cctbx_project/
157M	cctbx_project/

XFEL CI on Azure is so tight on space this kinda thing can help (plus it's good dinosaur management). But how does this affect developers? With depth=1, the old commit history is literally not there. Git log shows only the latest commit. So I think we'd only want this for Azure, right? Maybe a flag to bootstrap update that we can set in Azure?
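
One way the opt-in could be wired up, as a sketch only: the `--shallow` option name and both function names are assumptions for illustration, not existing bootstrap.py options. Developers who omit the flag keep the full history.

```python
import argparse


def make_parser():
    """Build a parser with a hypothetical opt-in flag for shallow clones."""
    parser = argparse.ArgumentParser(description="bootstrap sketch")
    parser.add_argument(
        "--shallow",
        action="store_true",
        help="clone with --depth=1 (intended for disk-limited CI runners)",
    )
    return parser


def extra_clone_args(argv):
    """Return the extra git clone arguments implied by the command line."""
    args = make_parser().parse_args(argv)
    return ["--depth=1"] if args.shallow else []


if __name__ == "__main__":
    print(extra_clone_args(["--shallow"]))
    print(extra_clone_args([]))
```

The Azure pipeline would then pass the flag explicitly, while local developer runs stay unchanged.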

@bkpoon
Member

bkpoon commented Jan 5, 2024

How much more disk space is needed? There is already a disk clean up step in the pipeline:

- script: |
    sudo mkdir -p /tmp/empty_dir
    sudo rsync --stats -a --delete /tmp/empty_dir/ /tmp/host_opt/hostedtoolcache
    for d in \
      lib/jvm \
      local/.ghcup \
      local/lib/android \
      local/share/powershell \
      share/dotnet \
      share/swift \
      ; do
      echo Deleting $d
      sudo rsync --stats -a --delete /tmp/empty_dir/ /tmp/host_usr/$d
    done
    df -h
  displayName: Clean up host image
  continueOnError: true

You essentially have root access to the virtual machine on Azure so you can delete whatever you want. There is sort of a limit in that in this pipeline, you are already inside the Docker image, but you can always make other host directories writeable here

container:
  image: ${{ parameters.distribution }}:${{ join('.', parameters.version) }}
  options: --name ci-container --privileged
  volumes:
    - /usr/bin/docker:/tmp/docker:ro
    - /usr:/tmp/host_usr:rw
    - /opt:/tmp/host_opt:rw

For example, in some pipelines, I clean up more stuff and make a swap file

https://github.com/phenix-project/phenix-installer/blob/main/scripts/clean_linux.sh

But that is done in the normal Azure image, not the Docker image. I get about 47 GB free in / after this step. Also, you do not need to use the 14 GB partition, you can use any partition on the image.
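
To decide which partition to build on, a short check like the following could be run on the agent. This is a sketch: the function names are hypothetical, and the candidate paths in the usage example are placeholders to be replaced with mount points that actually exist on the image.

```python
import shutil


def free_gib(path):
    """Free space on the filesystem containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


def roomiest(paths):
    """Return the candidate path whose filesystem has the most free space."""
    return max(paths, key=free_gib)


if __name__ == "__main__":
    # Placeholder candidates; adjust to the mounts available on the runner.
    candidates = ["/", "/tmp"]
    for p in candidates:
        print(f"{p}: {free_gib(p):.1f} GiB free")
    print("most free space:", roomiest(candidates))
```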

I can probably help more next week once I get back.

@Trzs
Contributor Author

Trzs commented Jan 8, 2024

@phyy-nx to get the full git history, you can run either git fetch --unshallow or git pull --unshallow
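
As a sketch, a developer-side helper could restore the history only when the checkout is actually shallow. It relies on `git rev-parse --is-shallow-repository`, which needs a reasonably recent git; the function names and the injectable `run` parameter are assumptions made for illustration.

```python
import subprocess


def is_shallow(repo_dir, run=subprocess.run):
    """Return True if the repository at repo_dir is a shallow clone."""
    result = run(
        ["git", "rev-parse", "--is-shallow-repository"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "true"


def unshallow(repo_dir, run=subprocess.run):
    """Fetch the full history if needed; return True if a fetch was run."""
    if not is_shallow(repo_dir, run=run):
        return False
    run(["git", "fetch", "--unshallow"], cwd=repo_dir, check=True)
    return True
```

Checking first avoids the error git reports when `--unshallow` is used on a repository that already has its complete history.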
