Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop anemone and use Spidr for Repo discovery #10947

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

sjha4
Copy link
Member

@sjha4 sjha4 commented Mar 22, 2024

To Do:

  • Support basic auth
  • Support proxy

What are the changes introduced in this pull request?

Considerations taken when implementing this change?

What are the testing steps for this pull request?

bundle install
Go to Content > Product > Repo discovery
Run repo discovery.

@sjha4
Copy link
Member Author

sjha4 commented Mar 22, 2024

@evgeni : Thoughts on spidr gem as replacement for anemone?

@evgeni
Copy link
Member

evgeni commented Mar 22, 2024

Doesn't look crazy? Only dep is nokogiri, which we have anyway, tested on modern rubies. Why not.

@sjha4
Copy link
Member Author

sjha4 commented May 21, 2024

I am seeing significant performance difference with what's on the PR vs the existing workflow..Looking at ways to speed this up..Will push updates when I get the performance sorted.

Update: Should be good to go with latest commit.
Able to see chunked output in repo discovery page and the task finishes in about the same time as earlier.

@sjha4 sjha4 changed the title Early Draft - Drop anemone and use Spidr Drop anemone and use Spidr for Repo discovery May 21, 2024
@sjha4 sjha4 marked this pull request as ready for review May 21, 2024 16:59
app/lib/katello/repo_discovery.rb Outdated Show resolved Hide resolved
katello.gemspec Outdated Show resolved Hide resolved
@sjha4 sjha4 force-pushed the anemone branch 2 times, most recently from d96f3f0 to 1d7766a Compare May 21, 2024 17:56
Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation wise I think you should separate the crawler and Docker search into separate classes. Perhaps even the file crawl as well. Right now it's confusing.

app/lib/katello/repo_discovery.rb Outdated Show resolved Hide resolved
app/lib/katello/repo_discovery.rb Outdated Show resolved Hide resolved
app/lib/katello/repo_discovery.rb Outdated Show resolved Hide resolved
katello.gemspec Outdated Show resolved Hide resolved
katello.gemspec Outdated Show resolved Hide resolved
Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this heads in the right direction.

Review wise I wonder if it makes sense to split it up: 1 to create the 3 classes and a follow up that replaces anemone with spidr.

app/lib/katello/resources/discovery/file_discovery.rb Outdated Show resolved Hide resolved
app/lib/katello/resources/discovery/file_discovery.rb Outdated Show resolved Hide resolved
@sjha4
Copy link
Member Author

sjha4 commented Jun 3, 2024

Review wise I wonder if it makes sense to split it up: 1 to create the 3 classes and a follow up that replaces anemone with spidr.

The anemone -> spidr change is localized to yum_discovery only right now as far a changes go. My 2 cents is it's a small enough change to be in one PR?

Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review wise I always prefer to first have a refactor and then an actual code change. Right now if I read the commits then it's the other way around. That makes this whole change harder to follow, but I don't have commit permissions here so I'll let other reviewers weigh in on that aspect.

upstream_credentials_and_search = {
upstream_username: nil,
upstream_password: nil,
search: '*'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be nil if upstream_credentials_and_search is overridden by a caller. If you want it to default to * you can use @search = upstream_credentials_and_search.fetch(:search, '*').

Comment on lines +10 to +11
@upstream_username = upstream_credentials_and_search[:upstream_username].empty? ? nil : upstream_credentials_and_search[:upstream_username]
@upstream_password = upstream_credentials_and_search[:upstream_password].empty? ? nil : upstream_credentials_and_search[:upstream_password]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it was already this way, but isn't this equivalent?

Suggested change
@upstream_username = upstream_credentials_and_search[:upstream_username].empty? ? nil : upstream_credentials_and_search[:upstream_username]
@upstream_password = upstream_credentials_and_search[:upstream_password].empty? ? nil : upstream_credentials_and_search[:upstream_password]
@upstream_username = upstream_credentials_and_search[:upstream_username].presence
@upstream_password = upstream_credentials_and_search[:upstream_password].presence

@@ -1,206 +1,33 @@
require 'uri'
require 'spidr'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should live in the yum class

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants