Automatically detecting smart Git hosts #1628

nwinkler · 2014-12-15T10:38:42Z

Added logic to automatically detect smart Git hosts that allow shallow cloning. This is done by sending an ls-remote request to the server and then evaluating the returned HTTP header fields. For this, Curl verbose logging is enabled for the ls-remote request, since Curl verbose logging sends the returned HTTP headers to stderr.

If the stderr output contains the desired header

Content-Type: application/x-git-upload-pack-advertisement

then the server supports shallow cloning.
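
The check described above comes down to a string match on the captured stderr text. The following is a minimal sketch, assuming the verbose output is already available as a string. The function name `advertisesSmartHttp` is hypothetical and not taken from this PR, and the case-insensitive match is an assumption made here to tolerate servers or proxies that send the header name in lower case.

```javascript
// Hypothetical helper, not the PR's actual code.
// Returns true if the captured verbose output contains the smart HTTP
// advertisement content type.
function advertisesSmartHttp(stderr) {
  var pattern = /content-type:\s*application\/x-git-upload-pack-advertisement/i;
  return pattern.test(String(stderr || ''));
}

// Sample text shaped like the response headers in Curl's verbose output.
var smart = [
  '< HTTP/1.1 200 OK',
  '< Content-Type: application/x-git-upload-pack-advertisement',
  '< Cache-Control: no-cache'
].join('\n');

var dumb = [
  '< HTTP/1.1 200 OK',
  '< Content-Type: text/plain'
].join('\n');

console.log(advertisesSmartHttp(smart)); // true
console.log(advertisesSmartHttp(dumb));  // false
```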

This approach uses Git and Curl for the heavy lifting. Instead of implementing the request to the server using a simple HTTP client, Git is used, since it takes care of authentication using stored credentials.

This approach should also work for BitBucket, which only sends the Content-Type header when a specific user agent is used. Using Git to make the request enables this behavior.

The function to detect the smart Git host (GitRemoteResolver.prototype._supportsShallowCloning) returns a promise that is resolved when the server's request is evaluated. The promise handling required an addition to GitHubResolver.js - to always resolve the promise to true, since GitHub supports shallow cloning.
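
To illustrate the promise-based shape described above, here is a rough sketch. It assumes native `Promise` rather than the Q library used in Bower, and the constructor names `RemoteResolverSketch` and `GitHubResolverSketch`, as well as the `probe` parameter, are hypothetical stand-ins rather than the real resolver code.

```javascript
// Hypothetical stand-ins for the resolver classes; native Promise replaces Q.
function RemoteResolverSketch(probe) {
  // probe: function returning a Promise of the captured stderr text (assumed).
  this._probe = probe;
}

// Resolves to true when the probe output contains the smart host header.
RemoteResolverSketch.prototype.supportsShallowCloning = function () {
  return this._probe().then(function (stderr) {
    var header = 'Content-Type: application/x-git-upload-pack-advertisement';
    return stderr.indexOf(header) !== -1;
  });
};

function GitHubResolverSketch() {
  RemoteResolverSketch.call(this, null);
}
GitHubResolverSketch.prototype = Object.create(RemoteResolverSketch.prototype);
GitHubResolverSketch.prototype.constructor = GitHubResolverSketch;

// GitHub supports shallow cloning, so no request is needed here.
GitHubResolverSketch.prototype.supportsShallowCloning = function () {
  return Promise.resolve(true);
};

var generic = new RemoteResolverSketch(function () {
  return Promise.resolve('< Content-Type: text/html');
});

generic.supportsShallowCloning().then(function (ok) {
  console.log('generic host:', ok);
});
new GitHubResolverSketch().supportsShallowCloning().then(function (ok) {
  console.log('GitHub:', ok);
});
```

Callers can treat both resolvers the same way, since each returns a promise for a boolean.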

This should fix the issues #1558, #1559 and #1568 - and provide a better solution than #1393, which disabled shallow cloning for everyone.

I don't have an instance of GitHub Enterprise that I could test this with - but I'm pretty confident that it is working. It would be great if someone with access to a GitHub Enterprise instance could verify this.

If you run Bower with -V (for verbose logging), you should see a message indicating whether the host in question (for non-GitHub hosts) supports shallow cloning, i.e. is a smart host.

The code could be optimized to cache information about the hosts - I haven't added this yet since I wanted to get feedback on whether the approach I've taken is sensible/acceptable.

sindresorhus · 2014-12-15T18:06:40Z

lib/core/resolvers/GitRemoteResolver.js

+            var stderrString;
+            var isSmartServer;
+
+            stderrString = stderr.toString();


moot as stderr is already a string

Yes, you're right, of course. I copied this from here: https://github.com/bower/bower/blob/master/lib/core/resolvers/GitRemoteResolver.js#L180 - toString() is used on stdout.

sheerun · 2015-01-11T16:42:11Z

Travis is failing for some reason
We probably need at least one test for this feature
Can you squash commits?

nwinkler · 2015-01-12T07:12:53Z

Sure - I'll take a look.

sheerun · 2015-01-12T10:07:17Z

Btw, there is a helpers.require helper that allows for mocking of requires in tests. It may be helpful.

nwinkler · 2015-01-12T14:13:02Z

Do you have an example for using helpers.require to mock a spawned command (via the cmd function)? That would be helpful.

I checked the existing test cases, but I couldn't find any instance where cmd was mocked.

sheerun · 2015-01-12T15:06:25Z

Something like this should work:

var gitRemoteResolverFactory = function (handler) {
  return helpers.require('lib/core/resolvers/GitRemoteResolver', {
    '../../util/cmd': handler
  });
};

var gitRemoteResolver = gitRemoteResolverFactory(function (cmd) {
  return Q.all(["stdout", "stderr"]);
});

EDIT: fixed a few things

nwinkler · 2015-01-12T15:27:13Z

Thanks - this looks helpful indeed! I'll give it a try.

nwinkler · 2015-01-13T13:39:50Z

@sheerun Thanks a lot! Your code example was really helpful.

Added unit tests to verify the smart host detection and the --depth 1 parameter when cloning.
Squashed commits
Rebased onto master

Please let me know what you think.

nwinkler · 2015-02-27T08:35:43Z

Sorry for being such a nuisance, but is there anything I can do to get this moving?

sheerun · 2015-03-23T15:59:02Z

@nwinkler After I rebase it on master, the tests are failing :( Could you look at it?

nwinkler · 2015-03-23T16:06:22Z

Sure, I'll take a look at it either tonight or tomorrow morning.

samccone · 2015-03-24T03:49:29Z

lib/core/resolvers/GitRemoteResolver.js

+//      negotiation that needs to take place.
+//
+// The above should cover most cases, including BitBucket.
+GitRemoteResolver.prototype._supportsShallowCloning = function () {


wow this is so excellent!

Almost seems like this method deserves its own npm lib

Thanks - appreciate the feedback!

sheerun · 2015-03-24T05:53:24Z

this._shallowClone probably needs to be a lazy promise instead of an eager one.

Also, parsing the URL sometimes fails, and this._remote is null.

sheerun · 2015-03-24T06:03:53Z

Also, the result of this should probably be cached between class instances (as far as I can see, GitRemoteResolver is instantiated for each source in bower.json; instead there should be only one remote call per host).

nwinkler · 2015-03-24T07:30:50Z

Yes, I'm aware that the result should probably be cached per host. I wanted to add that once I'm sure that the approach taken is acceptable.

Can you give me an example of the circumstances under which parsing the remote fails?

And I'm not sure I'm following with regards to eager/lazy promises - can you give an example?

nwinkler · 2015-03-24T10:54:52Z

I think I've found the culprit with regards to the failing test cases. It's in test/core/resolverFactory.js. The it('should recognize git remote endpoints correctly') test is creating several instances of GitRemoteResolver (one per URL), and they are all executing git ls-remote (through _supportsShallowCloning).

It looks like the way to solve this would be to mock the GitRemoteResolver._supportsShallowCloning call for these invocations. I've tried using helpers.require, but I'm not sure how to inject the mocked instance into the resolverFactory in the test.

This test (and the one after it) is essentially "spamming" the cmd throttle queue with calls to git ls-remote to detect whether the host supports shallow cloning. These queued requests are causing the timeouts in the other test cases, causing them to fail.

sheerun · 2015-03-24T15:52:04Z

I feel the tests should pass even before you implement mocks. That's why I suggested a lazy promise (i.e. check shallow cloning support only when _shallowClone is actually used, instead of executing it right away in the constructor). This can be done by using _shallowClone to cache the promise, and calling _supportsShallowCloning directly. Something like:

_supportsShallowCloning: =>
  return @_cachedClone if @_cachedClone
  @_cachedClone = Q.resolve(true)

After that you'd need to properly handle failure case of shallow clone checking, and default it to false.

So for example if ls-remote takes more than two seconds / returns non-zero code you assume that host doesn't support shallow cloning.
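
A rough JavaScript sketch of the lazy promise plus the fall back to false could look like the following. It assumes native `Promise` instead of Q, and `LazyShallowCheck`, `check`, and `timeoutMs` are hypothetical names used only for illustration; `check` stands in for the real ls-remote call.

```javascript
// Hypothetical sketch; `check` stands in for the real ls-remote call.
function LazyShallowCheck(check, timeoutMs) {
  this._check = check;         // function returning a Promise of a boolean
  this._timeoutMs = timeoutMs; // give up after this many milliseconds
  this._cached = null;         // filled on first use, not in the constructor
}

LazyShallowCheck.prototype.supportsShallowCloning = function () {
  if (this._cached) {
    return this._cached;
  }

  var timeoutMs = this._timeoutMs;
  var timer;
  var timeout = new Promise(function (resolve) {
    timer = setTimeout(function () { resolve(false); }, timeoutMs);
  });

  // Starting from Promise.resolve() turns a synchronous throw into a rejection.
  // Any failure (for example a non-zero exit code) is mapped to false.
  var attempt = Promise.resolve().then(this._check).catch(function () {
    return false;
  });

  this._cached = Promise.race([attempt, timeout]).then(function (result) {
    clearTimeout(timer);
    return result === true;
  });

  return this._cached;
};

var calls = 0;
var lazy = new LazyShallowCheck(function () {
  calls += 1;
  return Promise.reject(new Error('exit code 128'));
}, 2000);

lazy.supportsShallowCloning().then(function (ok) {
  console.log(ok, calls); // false 1
});
```

Nothing runs until supportsShallowCloning is first called, and repeated calls on the same instance reuse the cached promise.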

After that point probably mocks are OK, but I'm not sure how to handle it. Probably we'd need to use https://github.com/mfncooper/mockery and intercept cmd calls selectively (or even introduce new abstraction layer on commands, so we can mock them more easily).

nwinkler · 2015-03-24T15:57:07Z

Yes, that makes sense. The lazy promise should be easy to add. Thanks for the hint!

nwinkler · 2015-03-24T16:18:35Z

I've changed the implementation to only call git ls-remote when needed, and not from the constructor. Please take a look at the updated code and let me know whether you think I should change something else.

The tests run fine for me locally now - everything rebased on master.

Let me know - I can spend some more time on this tonight/tomorrow if required.

nwinkler · 2015-03-24T19:39:58Z

Put some more work into this and added the following:

Caching for hosts that support shallow cloning. They should only be requested once, minimizing the number of requests.
Verification that this._remote is not empty. If it is empty, false is returned.

Added several unit tests to verify the above.
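
As an illustration of these two additions, here is a minimal sketch of a per-host cache shared across instances, with a guard for an empty remote. The names `shallowCloneCache`, `supportsShallowCloning`, and `probe` are hypothetical, and the shape of `remote` (an object with `protocol` and `host`) is an assumption. This sketch caches the promise for every probed host regardless of outcome, which is a simplification compared to caching only hosts that support shallow cloning.

```javascript
// Shared across all callers: host name -> Promise of a boolean.
var shallowCloneCache = Object.create(null);

// remote: object with `protocol` and `host` (assumed shape), or null.
// probe: function (host) returning a Promise of a boolean; it stands in
// for the real ls-remote based check.
function supportsShallowCloning(remote, probe) {
  // An empty remote or a missing protocol means no host can be probed.
  if (!remote || !remote.protocol || !remote.host) {
    return Promise.resolve(false);
  }
  if (!(remote.host in shallowCloneCache)) {
    shallowCloneCache[remote.host] = probe(remote.host);
  }
  return shallowCloneCache[remote.host];
}

var requests = 0;
function fakeProbe(host) {
  requests += 1;
  return Promise.resolve(host === 'git.example.com');
}

var remote = { protocol: 'https:', host: 'git.example.com' };
Promise.all([
  supportsShallowCloning(remote, fakeProbe),
  supportsShallowCloning(remote, fakeProbe),
  supportsShallowCloning(null, fakeProbe)
]).then(function (results) {
  // Two lookups for the same host trigger only one probe.
  console.log(results, requests);
});
```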

nwinkler · 2015-03-24T19:40:16Z

Let me know if I should squash commits at one point...

sheerun · 2015-03-24T20:08:21Z

We probably need to add caching, as without it this feature can slow down things instead ;/

We can probably extract this functionality as a util. Maybe even an npm module.

I can squash commits for you when merging. You'll still be present as an author.

nwinkler · 2015-03-24T20:17:26Z

I've added caching in my last commit, it will only try once per host.

Looking forward to seeing this merged! And thanks for all the help!

nwinkler · 2015-03-25T10:36:38Z

Just a note to show how significant this change is. For one of our typical projects, the time for bower install with v1.3.9 was two minutes. With the change in v1.3.10 (disabling shallow cloning), this time increased to seven minutes. With the above changes, it's down to two minutes again.

Really looking forward to having this in the official version.

sheerun · 2015-03-26T02:56:18Z

Looks good to me now :) Thank you for this amazing work!

I'll release it as soon as I finish tests for the login command: #1732

sheerun · 2015-03-26T03:14:33Z

@nwinkler For some reason a few tests are failing on master after merging (only on Travis).

Could you look at it?

nwinkler · 2015-03-26T07:32:22Z

I'm taking a look now. It's strange, since in the PR Travis build, everything was fine: https://travis-ci.org/bower/bower/builds/55695755

nwinkler · 2015-03-26T08:39:24Z

I don't think it's my code that caused the unit test issue. I created a new branch off master before you merged in my PR, and the same error happens there: https://travis-ci.org/nwinkler/bower/jobs/55915627

It looks like something changed in the environment or dependencies between yesterday and today.

nwinkler · 2015-03-29T14:24:42Z

@sheerun I found the culprit for the failing tests: The request module was updated from v2.53.0 to v2.54.0. The semver range in bower's package.json is ^2.51.0, which allows the update to v2.54.0.

If you set the version of request to 2.53.0 in package.json, the tests will start to work again.
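
To show why the caret range picked up the new release, here is a simplified sketch of caret matching. It is not the `semver` package's implementation; it only handles plain `major.minor.patch` versions with a major version of 1 or higher, and the function names `parse` and `satisfiesCaret` are hypothetical.

```javascript
// Simplified caret check for versions with major >= 1 and no prerelease tags.
// This is an illustration, not the semver package's implementation.
function parse(version) {
  return version.split('.').map(Number);
}

function satisfiesCaret(version, base) {
  var v = parse(version);
  var b = parse(base);
  if (v[0] !== b[0]) return false;        // major must match
  if (v[1] !== b[1]) return v[1] > b[1];  // any newer minor is allowed
  return v[2] >= b[2];                    // same minor: patch must not be older
}

console.log(satisfiesCaret('2.53.0', '2.51.0')); // true
console.log(satisfiesCaret('2.54.0', '2.51.0')); // true
console.log(satisfiesCaret('3.0.0', '2.51.0'));  // false
```

An exact version such as 2.53.0 in package.json matches only that release, which is why pinning avoids the update.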

nwinkler mentioned this pull request Dec 15, 2014

Configuration support for enabling/disabling shallow cloning #1559

Closed

sindresorhus reviewed Dec 15, 2014

sheerun added the enhancement label Jan 11, 2015

nwinkler force-pushed the detect-smart-git branch from 2118d82 to 3452dce on January 12, 2015 10:05

nwinkler force-pushed the detect-smart-git branch from 3452dce to 6f0028b on January 13, 2015 13:35

sheerun mentioned this pull request Mar 19, 2015

Release 1.4 #1732

Closed

16 tasks

samccone reviewed Mar 24, 2015

nwinkler and others added 2 commits March 24, 2015 17:13

Using a function reference instead of calling directly from constructor

a352d51

nwinkler force-pushed the detect-smart-git branch from 6f0028b to a352d51 on March 24, 2015 16:15

nwinkler added 2 commits March 24, 2015 20:22

Added support for caching hosts that support shallow cloning.

912808b

Added check for empty remote or no protocol set.

7e0a2ea

sheerun added a commit that referenced this pull request Mar 26, 2015

Merge pull request #1628 from nwinkler/detect-smart-git

4d59d26

Automatically detecting smart Git hosts

sheerun merged commit 4d59d26 into bower:master Mar 26, 2015

samny mentioned this pull request Apr 1, 2015

Installing bower dependencies by URL from Github Enterprise stopped working after upgrade to 1.4.1 #1764

Closed

ysbaddaden mentioned this pull request Apr 22, 2015

crystal deps install should do a shallow clone crystal-lang/crystal#555

Closed

joeljeske mentioned this pull request May 29, 2015

Local git server: stuck on checkout master Hacklone/private-bower#140

Closed
