Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(bundler): extract, update, artifacts #3058

Merged
merged 21 commits into from
Jan 14, 2019

Conversation

rarkins
Copy link
Collaborator

@rarkins rarkins commented Jan 12, 2019

This PR adds extract, update and artifacts functions to put Bundler support into an "alpha" support stage. Some known gotchas:

  • Extracting and updating multi-part ranges, e.g. gem "uglifier", ">= 1.3.0", "< 1.4.0", require: false
  • Dynamic switching between Bundler versions. If Bundler 2.0 is meant to achieve this, it doesn't seem to be working. We'll probably solve it by using multiple Docker images
  • Updating the Gemfile.lock requires any referenced gemspec files to be present, so this only works with gitFs mode right now

@rarkins rarkins added the review label Jan 12, 2019
@rarkins
Copy link
Collaborator Author

rarkins commented Jan 12, 2019

Re: locating/writing gemspec files before lock file generation, does anyone know how exhaustively we need to check for these? Do we just need to look for gemspec and <project name>.gemspec, for example?

@rarkins rarkins mentioned this pull request Jan 12, 2019
@rarkins rarkins changed the title feat(bundler): extract, update, artifacts feat(bundler): extract, update, artifacts (WIP) Jan 12, 2019
@bjeanes
Copy link

bjeanes commented Jan 12, 2019

I don't recall ever running into a gem whose gemspec wasn't <name of library>.gemspec in root directory, if that's what you mean. https://guides.rubygems.org/make-your-own-gem/ doesn't mention any extra locations and the template that rubygems generates for new gems would put it there.

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 12, 2019

Is the project name extracted from any project metadata or based purely on the parent directory name?

@bjeanes
Copy link

bjeanes commented Jan 12, 2019

That I don't know. I wouldn't be surprised if it's the other way around and it discovers gems via **/*.gemspec and then uses the metadata from within those files to determin actual gems. I suspect the placement and naming of the gemspec file is purely convention, but I don't know for sure, unfortunately.

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 12, 2019

I’ll copy over every gemspec file it finds just to be “safe”.

@rye
Copy link

rye commented Jan 12, 2019

@rarkins, thanks for doing this!

For what it's worth, gem build explicitly takes an argument that is the gemspec file, so I don't think there exists any requirement within rubygems itself on the file being named or placed in a given location.

After a bit more digging, I found that Bundler's handling of gemspec's is largely inherited from the Bundler::Source::Path class, which has a DEFAULT_GLOB that may be of interest: {,*,*/*}.gemspec. This means that files matching .gemspec, *.gemspec, and */*.gemspec should match that glob. This is, of course, unless the glob option is set in the bundler config? I haven't ever seen this in practice so I'd have to do some more digging to find that.

I hope this is helpful. I'd be glad to do some more research if there are any other questions that need answering.

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 12, 2019

@rye thanks very much for the extra research! Right now the string match we use on the file list should catch every single .gemspec file. When I tested against Rails's repo I found it still failed though because there was an arbitrary inclusion of its ./RAILS_VERSION file. So even if we grab ever gemspec file, literally any other file in the repo could be referenced. Anyway, this will still be fine if we use Renovate's gitFs as a fallback, as every single file is then available.

For now I think we're fine on the gemspec topic. The big remaining one is to be able to dynamically switch Bundler versions to match what's written in the Gemfile.lock. It's not really difficult but will take some time to set up all the Docker images necessary to support for the hosted version.

@rye
Copy link

rye commented Jan 12, 2019

@rarkins, yes, that inclusion of other files in a project is actually very commonplace, since gemspecs are actually Ruby source files and are interpreted as such. One approach to gathering information from a gemspec is to do as Bundler does and actually load the gemspec using RubyGems' own code, then extract dependency information and the like from that. I would be glad to work on this approach when I have some more availability—it should be moderately straightforward though, it's just calling Gem::Specification.load and identifying which fields to pull out. A bit of poking around with pry -rrubygems should be all that is needed. (Edit: to ascertain these fields)

As far as the Bundler version switching goes, that sounds like a good plan. I'll note that Bundler 2.0 (recently launched) includes the ability to interpret prior versions of Gemfile.locks, but many projects have not switched yet and it's probably better to use a project's own Bundler version for parity. That will be an intriguing challenge, especially since Bundler 2.0+ have more strict Ruby version requirements.

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 12, 2019

My own experience attempting to use Bundler 2.0 for both 2.x and 1.x failed. I see they’ve had teething problems and even backed off requiring Rubygems 3.0.0 now, so I figured it’s easiest to just dynamically switch versions using docker tags instead (and parsing Gemfile.lock). With this current approach I hope we don’t have the need for any advanced ruby requiring as we’ll just use the full file system if necessary plus exact Bundler version.

@rye
Copy link

rye commented Jan 12, 2019

@rarkins yep, there have been issues with the 2.0 upgrade all around.

I will note, still, that parsing the Gemfile and Gemfile.lock using Bundler's own code may be a better approach, since it's available to you and would ensure consistent results—Gemfiles themselves are also Ruby source code… (source, group and gem are part of a DSL)

There are idiosyncrasies like that you can use ' or " when wrapping strings and such that would be hard(er) to account for if you're just parsing the file as raw input. And I'm sure someone somewhere has found a reason to require from their Gemfile.

As a proof of concept, I was able to rather easily use Bundler::LockfileParser to parse my project's lockfile:

require 'bundler'

lockfile_contents = open('Gemfile.lock', 'rb', &:read)
Bundler::LockfileParser.new(lockfile_contents).dependencies
# => {"fastlane"=>Gem::Dependency.new("fastlane", Gem::Requirement.new([">= 0"]), :runtime),
# "fastlane-plugin-bugsnag"=>Gem::Dependency.new("fastlane-plugin-bugsnag", Gem::Requirement.new([">= 0"]), :runtime),
# "json"=>Gem::Dependency.new("json", Gem::Requirement.new([">= 0"]), :runtime),
# "rubocop"=>Gem::Dependency.new("rubocop", Gem::Requirement.new(["~> 0.60"]), :runtime),
# "xcodeproj"=>Gem::Dependency.new("xcodeproj", Gem::Requirement.new([">= 0"]), :runtime)}

Bundler::LockfileParser.new(lockfile_contents).dependencies.keys
# => ["fastlane", "fastlane-plugin-bugsnag", "json", "rubocop", "xcodeproj"]

Bundler::LockfileParser.new(lockfile_contents).sources.map(&:remotes).flatten.map(&:to_s)
# => ["https://rubygems.org/"]

(This API works for Bundler 2.0 and 1.17, I didn't test widely but I'd imagine it hasn't changed much.)

So, I guess my argument is that your extraction code could be made much shorter (and more general) if you did opt to use the Bundler code itself. Calling this from JS could be as simple as exec-ing ruby with -rbundler and -e a string containing a script, or you could also have a Ruby script that handles other things. The nice thing is that then you have the exact same information that Bundler has.

Thoughts?

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 12, 2019

@rye I'm still making up my mind regarding whether to do Gemfile parsing and/or updating using Ruby or keep it using JS. I had expected to need Ruby, but for now I think it might be at 95% of use cases already. e.g. even if someone requires something from their Gemfile, it doesn't necessarily mean we need to care about it - or support it.

However I do ideally want Renovate to be "flexible so you don't need to be", so I'm opening to switching to Ruby-based parsing if there are clear benefits.

One other thing to note is that we need to be able to not only parse a file, but also to be able to update it too. i.e. it's not enough to get a parsed object if is then really hard to perform the update. You can see with the current JS-based approach that we can do it with line numbers, which makes for the easiest possible update logic.

BTW I think the lockfile may be the one case where we don't need to do it in Ruby because it's simple and not executable code.

One downside of making it Ruby-based is that we may need to not only call a child process but actually call a Docker child process - which significantly increases the latency. The reason for this is that if we are executing arbitrary Ruby code then we need to as sandboxed as possible within Docker, rather than directly calling Ruby on the same machine/VM. That is, unless we think Bundler's parsing is sandboxed enough that it couldn't allow malicious code. But although this increases latency a lot, it remains to be seen whether it matters. e.g. if a typical Ruby project takes 10s to run and we add 0.5s to that, it's really not a big deal.

I think the way forward will be like this:

  1. Launch with this simple JS-based parsing
  2. Wait for cases where it "doesn't work" (for example, single quotes rather than double was an example you mentioned)
  3. Determine if they can be easily fixed in the JS parsing or not
  4. If not, then decide whether to make the slower Ruby parsing as a fallback option (configurable), or as the only option

Thanks again for your feedback

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 13, 2019

My latest attempt at a Bundler 2 image seems to switch back to 1.x as desired, so I'll use that for now

@bjeanes
Copy link

bjeanes commented Jan 13, 2019

BTW I think the lockfile may be the one case where we don't need to do it in Ruby because it's simple and not executable code.

Correct.

Unfortunately, executable Gemfiles are pretty common. I've done this during Rails upgrades, for instance, so that I can dynamically change versions of some gems (e.g. gem X v1 is only compatible with Rails <=3, but v2 is ONLY compatible with Rails 4+, so there is no bridging version).

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 13, 2019

As a general rule, I don't mind shortcomings/known limitation so long as they're gracefully wrong and also that they don't render the stuff that works useless. So I think in this case if you're pulling in versions based on Ruby logic, it simply won't update those and will only update gems that have version fields defined.

So a quick status update:

  • Parsing done with JS, any versions that rely on third party Ruby files won't get "renovated"
  • Using a Bundler 2.0 image for Docker, anyone running locally is free of course to use their own locally-installed/matching version
  • All *.gemspec files will be copied over prior to bundle lock. If Bundler still fails because the gemspec files reference other files, then need to opt into gitFs

@rye
Copy link

rye commented Jan 13, 2019

@rarkins, I definitely see where you're at. Thanks again—at this point, it seems like this PR is only meant to add bundler support in alpha stage, which it does quite well. I was under the impression that more projects than my own use ' as a string boundary in their Gemfiles… turns out it's just me. 😅 Looking around it should cover a lot of use cases.

Personally, I'd be more than happy to opt-in on one of my projects to see how well it works.

Just out of curiosity, would you be interested in getting Ruby-based parsing (sandboxed appropriately) instead as a contribution later on down the road? I'm now reading through the codebase to figure out how this might be done, but I'd be happy to chip that in later.

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 13, 2019

@rye I think this approach may even be good enough for "GA", although obviously wait for feedback first. I'd definitely be happy to have a Ruby-based extract option added later to capture all use cases. Once the JS one is working and we can identify any shortcomings, being able to test/verify a Ruby-based drop-in replacement should be fairly simple. It would pretty much just be a matter of replacing the existing lib/manager/bundler/extract and update functions.

BTW we already do a Python-based parsing of setup.py files, so it wouldn't be the first time.

Anyway I think I'll merge this in the next day or two and will blog instructions on how to test/what to look for and will appreciate your feedback. Are you using the app or CLI currently?

@rye
Copy link

rye commented Jan 13, 2019

@rarkins All this sounds good. I definitely think having a full test base and feedback from more users will make any further changes this faster and higher-confidence, so releasing this as an opt-in kind of thing is the first step so some entirely bundler-based projects can start testing it out. (I have a few projects that I'll be inclined to use Renovate on with even partial Bundler support.)

Any yes, I'm using the app, didn't even know that Renovate could be run from the CLI until now, but that actually makes a lot of sense. I'm assuming I'll need to explicitly enable something in the config, but I'm definitely going to do that on StoDevX/AAO-React-Native quite quickly after this is made available to that project. (I could definitely run locally in the meantime if that would be more helpful.)

I'll be glad to provide any further feedback and help as a likely end-user of Bundler support in the long run.

@rarkins rarkins changed the title feat(bundler): extract, update, artifacts (WIP) feat(bundler): extract, update, artifacts Jan 14, 2019
@rarkins rarkins merged commit ba77d4a into master Jan 14, 2019
@rarkins rarkins deleted the feat/2983-bundler-extract-update branch January 14, 2019 05:52
@rarkins rarkins removed the review label Jan 14, 2019
@renovate-bot
Copy link
Collaborator

🎉 This PR is included in version 13.175.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@bjeanes
Copy link

bjeanes commented Jan 15, 2019

Where would you like feedback about this feature? New issues?

I don't have anything specific yet other than that every update is in an "error" state (according to master issue) but the job logs don't have anything that sticks out to me (I searched for dep name and for "error" and came up empty handed).

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 15, 2019

@bjeanes I'll take a look for your repo now. For future I've posted this new issue so any feedback doesn't get lost: #3073

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 15, 2019

@bjeanes I see that Renovate has detected Gemfile and .cache/Gemfile. Should the latter be ignored?

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 15, 2019

By the way, I don't see any errors in the logs. It looks like it doesn't see any updates. Are there any that you are expecting? And can you paste a screenshot of the master issue here, or send to support@renovatebot.com if confidential?

@bjeanes
Copy link

bjeanes commented Jan 15, 2019

I see that Renovate has detected Gemfile and .cache/Gemfile. Should the latter be ignored?

In my case, yes, but this isn't anything common or universal about Ruby-land. I imagine there is some way to disable that that within my config.

I don't see errors either. Where I am seeing "error" is in the master issue. Renovate has since edited the issue but from the history view of the comment content:

screenshot 2019-01-15 at 21 24 05

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 15, 2019

First I saw this:

{
 "branch":"renovate/machinist-2.x",
  "dependencies":["machinist"],
  "msg":"Skipping branch creation as not within schedule",
  "time":"2019-01-15T10:20:35.150Z"
}

And then the most recent log shows your repo exiting with:

Response code 429 (Too Many Requests)

It seems rubygems is rate limiting us, even with a very low number of requests so far. I need to look into it more.

@bjeanes
Copy link

bjeanes commented Jan 15, 2019

Skipping branch creation as not within schedule

I forced it in master issue and then it went into error state, so likely you are looking at a log of a newer job which is back to not scheduling it.

@rye
Copy link

rye commented Jan 16, 2019

@rarkins from the documentation on the RubyGems API:

Rate Limits

To protect the RubyGems.org service from abuse, both intentionally and unintentionally, we have rate limits in place for some of our endpoints. Some endpoints may be cached by our CDN at times and therefore may allow higher request rates. The following is a general guideline for the rate limit rules.

  • API and website: 10 requests per second
  • Dependency API: 15 requests per second
  • Website sign up, sign in, api key, and forgot password: 100 requests per 10 minutes

Users who hit a rate limit will see HTTP 429 responses, for the remainder of the limit window. Usually this is just a few minutes.

The RubyGems.org team may occasionally blackhole user IP addresses for extreme cases to protect the platform. If you think this has happened to you, please submit a help ticket and we’ll be happy to look at it.

From what you're seeing, does it look like Renovate is hitting the 15 requests per second limit? If so, it'll be worth digging into why. Alternatively, I would wonder if it's hitting the 100 request/10 minute limit for some reason?

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 16, 2019

@bjeanes I think we're probably hitting the 15 requests per second limit, I will need to work out the best location to limit it

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 16, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants