.deb downloaded from snapshots might be broken or incomplete #154

Open
yarikoptic opened this issue Apr 20, 2023 · 1 comment

@yarikoptic

A recent Ubuntu build on AppVeyor (the invocation is defined as git-annex -m deb-url --url http://snapshot.debian.org/archive/debian/20210906T204127Z/pool/main/g/git-annex/git-annex_8.20210903-1_amd64.deb) failed with:

2023-04-20T18:27:12+0000 [INFO    ] datalad_installer Downloading http://snapshot.debian.org/archive/debian/20210906T204127Z/pool/main/g/git-annex/git-annex_8.20210903-1_amd64.deb
2023-04-20T18:31:30+0000 [INFO    ] datalad_installer Running: sudo dpkg -i /home/appveyor/DLTMP/tmp0e53p1a8/git-annex.deb
Selecting previously unselected package git-annex.
(Reading database ... 306255 files and directories currently installed.)
Preparing to unpack .../tmp0e53p1a8/git-annex.deb ...
Unpacking git-annex (8.20210903-1) ...
dpkg-deb (subprocess): cannot copy archive member from '/home/appveyor/DLTMP/tmp0e53p1a8/git-annex.deb' to decompressor pipe: unexpected end of file or stream
dpkg-deb (subprocess): decompressing archive member: lzma error: unexpected end of input
dpkg-deb: error: <decompress> subprocess returned error exit status 2
dpkg: error processing archive /home/appveyor/DLTMP/tmp0e53p1a8/git-annex.deb (--install):
 cannot copy extracted data for './usr/bin/git-annex' to '/usr/bin/git-annex.dpkg-new': unexpected end of file or stream
Errors were encountered while processing:
 /home/appveyor/DLTMP/tmp0e53p1a8/git-annex.deb
Traceback (most recent call last):
  File "/home/appveyor/dlvenv/bin/datalad-installer", line 8, in <module>
    sys.exit(main())
  File "/home/appveyor/dlvenv/lib/python3.8/site-packages/datalad_installer.py", line 1788, in main
    return manager.main(argv)
  File "/home/appveyor/dlvenv/lib/python3.8/site-packages/datalad_installer.py", line 657, in main
    self.addcomponent(name=cr.name, **cr.kwargs)
  File "/home/appveyor/dlvenv/lib/python3.8/site-packages/datalad_installer.py", line 690, in addcomponent
    component(self).provide(**kwargs)
  File "/home/appveyor/dlvenv/lib/python3.8/site-packages/datalad_installer.py", line 1032, in provide
    bins = self.get_installer(method).install(self.NAME, **kwargs)
  File "/home/appveyor/dlvenv/lib/python3.8/site-packages/datalad_installer.py", line 1118, in install
    bindir = self.install_package(package, **kwargs)
  File "/home/appveyor/dlvenv/lib/python3.8/site-packages/datalad_installer.py", line 1391, in install_package
    self.manager.sudo(*cmd)
  File "/home/appveyor/dlvenv/lib/python3.8/site-packages/datalad_installer.py", line 576, in sudo
    return runcmd("sudo", *args, **kwargs)
  File "/home/appveyor/dlvenv/lib/python3.8/site-packages/datalad_installer.py", line 1756, in runcmd
    return subprocess.run(arglist, check=True, **kwargs)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['sudo', 'dpkg', '-i', '/home/appveyor/DLTMP/tmp0e53p1a8/git-annex.deb']' returned non-zero exit status 1.
Command exited with code 1
Running "on_finish" scripts
while [ -f ~/BLOCK ]; do sleep 5; done

So there was no download error, but the .deb we obtained was not good.

I checked -- it happened before a few times this year

(git)smaug:/mnt/datasets/datalad/ci/logs/2023[master]git
$> datalad foreach-dataset --o-s relpath -r -J10 git grep 'lzma error: unexpected end of input' | sort
01/16/push/maint/5649c54/appveyor-9094-failed/sm7u32wv3ofj2r5r.txt:[00:03:58] dpkg-deb (subprocess): decompressing archive member: lzma error: unexpected end of input
01/16/push/maint/5649c54/appveyor-9094-failed/wxq2emxfye6oge4d.txt:[00:03:58] dpkg-deb (subprocess): decompressing archive member: lzma error: unexpected end of input
03/24/pr/7342/8bdd95e/appveyor-9261-failed/3v3vkao7lb57s9uj.txt:[00:03:53] dpkg-deb (subprocess): decompressing archive member: lzma error: unexpected end of input
03/24/push/maint/362834f/appveyor-9275-failed/q7p7yuiuket91hcv.txt:[00:04:27] dpkg-deb (subprocess): decompressing archive member: lzma error: unexpected end of input
03/28/pr/7359/fc8580c/appveyor-9292-failed/luwc0638irf4wb3g.txt:[00:04:11] dpkg-deb (subprocess): decompressing archive member: lzma error: unexpected end of input
04/13/push/maint/f16787c/appveyor-9309-failed/osmhsjv1fu4tcdvh.txt:[00:03:57] dpkg-deb (subprocess): decompressing archive member: lzma error: unexpected end of input
04/13/push/maint/f16787c/appveyor-9309-failed/qy26stlbt7kuvc3i.txt:[00:03:57] dpkg-deb (subprocess): decompressing archive member: lzma error: unexpected end of input
04/13/push/maint/f16787c/appveyor-9309-failed/uhty1lhms1er5prt.txt:[00:04:39] dpkg-deb (subprocess): decompressing archive member: lzma error: unexpected end of input

Well, snapshot.debian.org has various throttling policies in place, so I wonder whether it sometimes just drops the connection. If so, we must catch that and report it. Or what else could it be?
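One way to catch a truncated download before handing it to dpkg: a .deb is an ar archive, and each ar member header declares its size, so we can check that every declared member actually fits within the file. Below is a minimal sketch of that idea. The function name `deb_is_complete` is made up for illustration and is not part of datalad-installer. It only detects truncation, not corruption in the middle of the file.

```python
import os

AR_MAGIC = b"!<arch>\n"   # global header of every ar archive
HEADER_SIZE = 60          # fixed size of each ar member header


def deb_is_complete(path):
    """Return True if every ar member declared in the file fits within it.

    Hypothetical helper: walks the ar member headers of a .deb and compares
    the declared sizes against the real file size.
    """
    total = os.path.getsize(path)
    members = 0
    with open(path, "rb") as fp:
        if fp.read(len(AR_MAGIC)) != AR_MAGIC:
            return False
        offset = len(AR_MAGIC)
        while offset < total:
            fp.seek(offset)
            header = fp.read(HEADER_SIZE)
            # Each header ends with the two-byte magic "`\n".
            if len(header) < HEADER_SIZE or header[58:60] != b"`\n":
                return False
            try:
                # Bytes 48..57 hold the member size as space-padded decimal.
                size = int(header[48:58].decode("ascii").strip())
            except ValueError:
                return False
            offset += HEADER_SIZE + size
            if offset > total:
                return False  # member data runs past the end of the file
            offset += size % 2  # member data is padded to an even length
            members += 1
    return members > 0
```

If this returned False, the installer could report a bad download (or retry) instead of failing later inside dpkg.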

I thought we could add validation of the retrieved .deb and retry if we find it to be a bad download, based on the etag, but I could not figure out what the etag corresponds to:

etag: "c0fc68-5cb5a4359b140"

```
❯ wget -S http://snapshot.debian.org/archive/debian/20210906T204127Z/pool/main/g/git-annex/git-annex_8.20210903-1_amd64.deb
URL transformed to HTTPS due to an HSTS policy
--2023-04-20 15:47:07--  https://snapshot.debian.org/archive/debian/20210906T204127Z/pool/main/g/git-annex/git-annex_8.20210903-1_amd64.deb
Resolving snapshot.debian.org (snapshot.debian.org)... 185.17.185.185, 193.62.202.27, 2001:1af8:4020:b030:deb::185, ...
Connecting to snapshot.debian.org (snapshot.debian.org)|185.17.185.185|:443... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  date: Thu, 20 Apr 2023 19:47:08 GMT
  server: Apache
  x-content-type-options: nosniff
  x-frame-options: sameorigin
  referrer-policy: no-referrer
  x-xss-protection: 1
  permissions-policy: interest-cohort=()
  last-modified: Mon, 06 Sep 2021 21:23:41 GMT
  etag: "c0fc68-5cb5a4359b140"
  accept-ranges: bytes
  content-length: 12647528
  x-clacks-overhead: GNU Terry Pratchett
  cache-control: max-age=31536000, public
  x-varnish: 133040245
  age: 0
  via: 1.1 varnish (Varnish/6.5)
  strict-transport-security: max-age=15768000; preload
  connection: close
Length: 12647528 (12M)
Saving to: ‘git-annex_8.20210903-1_amd64.deb’

git-annex_8.20210903-1_amd64.deb 100%[=========================================================================================================================================================>] 12.06M 4.58MB/s in 2.6s

2023-04-20 15:47:11 (4.58 MB/s) - ‘git-annex_8.20210903-1_amd64.deb’ saved [12647528/12647528]
```
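A possible reading of that etag, assuming the server uses Apache's stock `<size>-<mtime>` form (hex file size, then hex modification time in microseconds). This is an inference from the `server: Apache` header and the numbers lining up, not something confirmed by the snapshot maintainers. The helper name `parse_apache_etag` is made up for this sketch.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_apache_etag(etag):
    """Split an ETag of the assumed '"<size>-<mtime>"' form into its parts.

    Assumption: both fields are hexadecimal, size in bytes and mtime in
    microseconds since the Unix epoch.
    """
    size_hex, mtime_hex = etag.strip().strip('"').split("-")
    size = int(size_hex, 16)
    mtime = EPOCH + timedelta(microseconds=int(mtime_hex, 16))
    return size, mtime


size, mtime = parse_apache_etag('"c0fc68-5cb5a4359b140"')
print(size)               # 12647528, same as content-length
print(mtime.isoformat())  # 2021-09-06T21:23:41+00:00, same as last-modified
```

If that reading is right, the etag is not a content checksum: it carries nothing beyond content-length and last-modified, so it could only help detect a size mismatch.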

I will check with the snapshot people, since I could not quickly grep it in the codebase.
@yarikoptic

In #153 we added a check by content-length, but apparently we have been installing a fixed, outdated version of datalad-installer! Filed datalad/datalad#7380.
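For reference, the general shape of a content-length check with retry, as a rough sketch only. This is not the code from #153. The function name, the `attempts`/`delay` parameters, and the injectable `opener` argument are all assumptions made for illustration.

```python
import http.client
import time
import urllib.request


def download_with_size_check(url, dest, attempts=3, delay=5.0,
                             opener=urllib.request.urlopen):
    """Download url to dest; retry if fewer bytes arrive than Content-Length.

    Hypothetical helper. Returns the number of bytes written on success and
    raises RuntimeError once all attempts are used up.
    """
    problem = "no attempts made"
    for attempt in range(1, attempts + 1):
        try:
            with opener(url) as resp, open(dest, "wb") as out:
                expected = resp.headers.get("Content-Length")
                written = 0
                while True:
                    chunk = resp.read(1 << 16)
                    if not chunk:
                        break
                    out.write(chunk)
                    written += len(chunk)
            # Without a Content-Length header there is nothing to compare.
            if expected is None or written == int(expected):
                return written
            problem = f"got {written} bytes, expected {expected}"
        except (OSError, http.client.HTTPException) as exc:
            # Covers connection resets and incomplete reads.
            problem = str(exc)
        if attempt < attempts:
            time.sleep(delay)
    raise RuntimeError(
        f"download of {url} failed after {attempts} attempts: {problem}"
    )
```

The `opener` parameter is there only so the retry logic can be exercised without touching the network.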

yarikoptic added a commit to yarikoptic/datalad that referenced this issue May 1, 2023
Although snapshots might be more "official" and thus more "reliably present"
than our own server,  snapshots. has all kinds of throttling settings which
delay download or even cause it to fail:
datalad/datalad-installer#154

although datalad-installer should retry now (after it gets upgraded within
appveyor setup see datalad#7380) if
size changes, I think it would still be more robust to just get it from our
server.

I also made a note in
datalad/datalad-installer#160
so may be we gain possibility to specify multiple URLs thus to robustify.

=== Do not change lines below ===
{
 "chain": [],
 "cmd": "sed -i -e s,http://snapshot.debian.org/archive/debian/20210906T204127Z/pool/main/g/git-annex/,https://datasets.datalad.org/datalad/packages/neurodebian/,g .appveyor.yml",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [
  ".appveyor.yml"
 ],
 "pwd": "."
}
^^^ Do not change lines above ^^^