git-lfs-transfer opens a lot of unused sessions #5622

Open
axkibe opened this issue Jan 18, 2024 · 2 comments


axkibe commented Jan 18, 2024

Describe the issue
This isn't a bug that breaks anything, but something that wastes resources.

I'm currently implementing git-lfs-transfer over SSH for my Node.js git server.

I tried to make the log as easy to follow as possible.

The notation I use to assign events to streams is "connection number:session number".

The git client simply tries to make a push with 1 LFS object.

So far so good: first git makes one SSH connection with the git-receive-pack command, then comes a second SSH connection for the LFS transfer, and a third one for git-upload-pack.*

Within the LFS connection, client and server exchange version hellos and the client makes a batch request, which I answer with a go-ahead. (As a single-server solution I don't care much about announcing these things anyway, but I handle the batch request to keep the client happy.) But then the client opens 7 more sessions within that connection, exchanges version hellos with the server on each of them, and never uses them. It only uses the primary session for the put-object request (which I haven't implemented yet).

Jan 18 16:22:28 ssh-lfs[2:1]: client wants to transfer: /axel.git upload
Jan 18 16:22:28 ssh-lfs[2:1]: user axel lfs transfers axel.git (upload)
2:1 OUT:000eversion=1\n0000
2:1  IN:000eversion 1\n
2:1  IN:0000
2:1 OUT:000fstatus 200\n0000
Jan 18 16:22:28 ssh[3]: client connected {
  ip: '::ffff:127.0.0.1',
  family: 'IPv6',
  port: 40928,
  header: {
    greeting: '',
    identRaw: 'SSH-2.0-OpenSSH_9.3p1 Ubuntu-1ubuntu3.2',
    versions: { protocol: '2.0', software: 'OpenSSH_9.3p1' },
    comments: 'Ubuntu-1ubuntu3.2'
  }
}
Jan 18 16:22:28 ssh-git[3:1]: client wants to git-upload-pack: '/axel.git'
Jan 18 16:22:28 ssh-git[3:1]: user axel accesses axel.git (git-upload-pack)
Jan 18 16:22:28 ssh-git[3:1]: spawning axel git-upload-pack
Jan 18 16:22:28 ssh-git[3:1]: path /vcs/git/axel.git
Jan 18 16:22:28 ssh-git[3:1]: git spawn close 0
Jan 18 16:22:28 ssh[3]: client disconnected
2:1  IN:000abatch\n
2:1  IN:0011transfer=ssh\n
2:1  IN:0015hash-algo=sha256\n
2:1  IN:001erefname=refs/heads/master\n
2:1  IN:0001
2:1  IN:004a44b5cad58d7f3247f8a0afada8bf5ecd06c903c84304758015adba455df7c32e 4964\n
2:1  IN:0000
2:1 OUT:000fstatus 200\n
2:1 OUT:0001
2:1 OUT:005144b5cad58d7f3247f8a0afada8bf5ecd06c903c84304758015adba455df7c32e 4964 upload\n
2:1 OUT:0000
Jan 18 16:22:28 ssh-lfs[2:2]: client wants to transfer: /axel.git upload
Jan 18 16:22:28 ssh-lfs[2:2]: user axel lfs transfers axel.git (upload)
2:2 OUT:000eversion=1\n0000
2:2  IN:000eversion 1\n
2:2  IN:0000
2:2 OUT:000fstatus 200\n0000
Jan 18 16:22:28 ssh-lfs[2:3]: client wants to transfer: /axel.git upload
Jan 18 16:22:28 ssh-lfs[2:3]: user axel lfs transfers axel.git (upload)
2:3 OUT:000eversion=1\n0000
2:3  IN:000eversion 1\n
2:3  IN:0000
2:3 OUT:000fstatus 200\n0000
Jan 18 16:22:28 ssh-lfs[2:4]: client wants to transfer: /axel.git upload
Jan 18 16:22:28 ssh-lfs[2:4]: user axel lfs transfers axel.git (upload)
2:4 OUT:000eversion=1\n0000
2:4  IN:000eversion 1\n
2:4  IN:0000
2:4 OUT:000fstatus 200\n0000
Jan 18 16:22:28 ssh-lfs[2:5]: client wants to transfer: /axel.git upload
Jan 18 16:22:28 ssh-lfs[2:5]: user axel lfs transfers axel.git (upload)
2:5 OUT:000eversion=1\n0000
2:5  IN:000eversion 1\n
2:5  IN:0000
2:5 OUT:000fstatus 200\n0000
Jan 18 16:22:28 ssh-lfs[2:6]: client wants to transfer: /axel.git upload
Jan 18 16:22:28 ssh-lfs[2:6]: user axel lfs transfers axel.git (upload)
2:6 OUT:000eversion=1\n0000
2:6  IN:000eversion 1\n
2:6  IN:0000
2:6 OUT:000fstatus 200\n0000
Jan 18 16:22:28 ssh-lfs[2:7]: client wants to transfer: /axel.git upload
Jan 18 16:22:28 ssh-lfs[2:7]: user axel lfs transfers axel.git (upload)
2:7 OUT:000eversion=1\n0000
2:7  IN:000eversion 1\n
2:7  IN:0000
2:7 OUT:000fstatus 200\n0000
Jan 18 16:22:28 ssh-lfs[2:8]: client wants to transfer: /axel.git upload
Jan 18 16:22:28 ssh-lfs[2:8]: user axel lfs transfers axel.git (upload)
2:8 OUT:000eversion=1\n0000
2:8  IN:000eversion 1\n
2:8  IN:0000
2:8 OUT:000fstatus 200\n0000
2:1  IN:0050put-object 44b5cad58d7f3247f8a0afada8bf5ecd06c903c84304758015adba455df7c32e\n
XXXX UNKNOWN CMD: put-object 44b5cad58d7f3247f8a0afada8bf5ecd06c903c84304758015adba455df7c32e
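
For readers following the log: the four hex digits at the start of each IN/OUT entry are the pkt-line length prefix, which counts the payload plus the four prefix bytes themselves, while `0000` and `0001` are the flush and delimiter packets. Below is a minimal Python sketch of that framing, not taken from the server discussed here (which is written for Node.js). The names `encode_pkt`, `decode_pkts`, `FLUSH`, and `DELIM` are hypothetical, and any other special packet is simply rejected.

```python
"""Minimal pkt-line framing helpers (illustrative sketch, hypothetical names)."""

FLUSH = "flush"   # marker yielded for the 0000 flush packet
DELIM = "delim"   # marker yielded for the 0001 delimiter packet
MAX_DATA = 65516  # largest payload git allows in a single pkt-line


def encode_pkt(payload: bytes) -> bytes:
    """Prefix payload with its length as 4 hex digits (the prefix is counted)."""
    if len(payload) > MAX_DATA:
        raise ValueError("payload too large for a single pkt-line")
    return b"%04x" % (len(payload) + 4) + payload


def decode_pkts(buf: bytes):
    """Yield every packet in buf: bytes for data packets, FLUSH or DELIM otherwise."""
    pos = 0
    while pos < len(buf):
        header = buf[pos:pos + 4]
        if len(header) < 4:
            raise ValueError("truncated length header")
        length = int(header.decode("ascii"), 16)
        if length == 0:
            yield FLUSH
            pos += 4
        elif length == 1:
            yield DELIM
            pos += 4
        elif length < 4:
            # Other special packets are outside the scope of this sketch.
            raise ValueError("unsupported special packet")
        elif pos + length > len(buf):
            raise ValueError("truncated packet")
        else:
            yield buf[pos + 4:pos + length]
            pos += length
```

Feeding `decode_pkts` the batch request from session 2:1 should give the four argument lines, a delimiter, the object line, and a flush, in that order. Likewise `encode_pkt(b"status 200\n")` reproduces the `000fstatus 200\n` frame seen in the OUT entries.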

Expected behavior
Especially in this case I wouldn't expect any new sessions. Are they meant for simultaneous data transfer when there are more uploads? Isn't the bottleneck the network most of the time anyway?

Output of git lfs env

git-lfs/3.4.0 (GitHub; linux amd64; go 1.21.0)
git version 2.40.1

Endpoint=https://localhost/axel.git/info/lfs (auth=none)
  SSH=localhost:/axel.git
LocalWorkingDir=/home/axel/testgit/axel
LocalGitDir=/home/axel/testgit/axel/.git
LocalGitStorageDir=/home/axel/testgit/axel/.git
LocalMediaDir=/home/axel/testgit/axel/.git/lfs/objects
LocalReferenceDirs=
TempDir=/home/axel/testgit/axel/.git/lfs/tmp
ConcurrentTransfers=8
TusTransfers=false
BasicTransfersOnly=false
SkipDownloadErrors=false
FetchRecentAlways=false
FetchRecentRefsDays=7
FetchRecentCommitsDays=0
FetchRecentRefsIncludeRemotes=true
PruneOffsetDays=3
PruneVerifyRemoteAlways=false
PruneRemoteName=origin
LfsStorageDir=/home/axel/testgit/axel/.git/lfs
AccessDownload=none
AccessUpload=none
DownloadTransfers=basic,lfs-standalone-file,ssh
UploadTransfers=basic,lfs-standalone-file,ssh
GIT_EXEC_PATH=/usr/lib/git-core
git config filter.lfs.process = "git-lfs filter-process"
git config filter.lfs.smudge = "git-lfs smudge -- %f"
git config filter.lfs.clean = "git-lfs clean -- %f"
  • (Side note: IMO git could/should keep the one SSH connection for receive and upload and open another session within that connection, but that's something I'd have to take up with the git folks. I understand why you, as a filter, can't hijack their connection, though. With an SSH ControlMaster, git will go through one connection; I still have to test whether git-lfs-transfer respects that.)
Member

bk2204 commented Jan 19, 2024

Hey,

In the ideal world, we could well max out a single connection, but experience has shown us that that's not always the case. This value is controlled by the lfs.concurrenttransfers setting, which defaults to 8. However, since by default we do use the ControlMaster connection (except on Windows), we probably would get the same performance from a single connection.

I'll go ahead and change the default here for SSH to 1 and also see if I can make the creation of the connections lazier.
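
To illustrate what "lazier" could mean here, below is a minimal Python sketch of a pool that opens a session only when a transfer actually asks for one and none is idle, up to a limit. It is not git-lfs's implementation (git-lfs is written in Go); `LazySessionPool` and `open_session` are hypothetical names, and the sketch is single-threaded for brevity.

```python
from typing import Callable, Generic, List, TypeVar

S = TypeVar("S")


class LazySessionPool(Generic[S]):
    """Hands out at most `limit` sessions, opening each one only on demand."""

    def __init__(self, open_session: Callable[[], S], limit: int) -> None:
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self._open_session = open_session  # hypothetical factory, e.g. one SSH session
        self._limit = limit
        self._idle: List[S] = []
        self._opened = 0

    @property
    def opened(self) -> int:
        """Number of sessions created so far."""
        return self._opened

    def acquire(self) -> S:
        """Reuse an idle session if there is one, otherwise open a new one."""
        if self._idle:
            return self._idle.pop()
        if self._opened >= self._limit:
            raise RuntimeError("all sessions are busy")
        self._opened += 1
        return self._open_session()

    def release(self, session: S) -> None:
        """Return a session to the pool so the next transfer can reuse it."""
        self._idle.append(session)
```

With a pool like this, a push with a single LFS object would open exactly one session even with a limit of 8, because the batch request and the following put-object can reuse the same session once it is released.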

Author

axkibe commented Jan 19, 2024

Yes, it's multiple sessions in a single SSH connection. In my case, with the server being an asynchronous monolith that just speaks these protocols, the impact is minimal; I only wondered what was happening while debugging. However, on a more traditional server, with sshd and git-lfs-transfer being an actual binary, it would spawn 7 more processes, which I guess is a non-negligible load for them.

If the bottleneck were the server's (or client's) hard drive, simultaneous uploads would make sense, though I doubt that's often the case in practice.
