-
Notifications
You must be signed in to change notification settings - Fork 274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate homepage URLs for repositories #1438
base: master
Are you sure you want to change the base?
Conversation
Repositories can be configured with a homepage, which is prominently featured on GitHub as a link. For repositories under the `rust-lang` organization, we want to make sure that those links only point to domains that are explicitly allowed. Ideally, only domains owned and operated by the Rust project itself will be whitelisted. The risk with other domains is that they might expire silently and get taken over by malicious actors, who can then host phishing campaigns or malware on sites "advertised" by the Rust project.
An initial selection of domains has been added to the allowlist for homepage URLs. The domains are either owned and operated by the infra-team, or belong to GitHub.
This proposal is being discussed here: https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/Domains.20on.20GitHub |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, left one comment.
if let Some(homepage) = &repo.homepage { | ||
if !allowed_domains.iter().any(|domain| domain.is_match(homepage)) { | ||
errors.push(format!( | ||
"homepage URL for {}/{} is not on an allowed domain: {}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this could print the list of allowed domains to make debugging faster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm I'm not sure if that is helpful in the output. What I haven't done yet is document the allowlist in the schema and point to the configuration file. I think that should hopefully be enough for folks to quickly find the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, fair enough, if it's documented than that should be enough.
config.toml
Outdated
"https://areweasyncyet.rs/?.*", | ||
"https://crates.io/?.*", | ||
"https://docs.rs/?.*", | ||
"https://github.com/rust-lang/rust/issues/35437", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious: why also the URL of this github issue? :)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because someone set it as the URL on a repo. 😅 I don't want to allow all of GitHub with something like https://github.com/?.*
, because that could be abused with typosquatting. But we can change this to allow anything under the rust-lang
organization with a rule like http://github.com/rust-lang/?.*
.
@jdno since this patch in in draft, I assume this list is still incomplete, correct? |
Yes, there are three URLs that are currently missing from the list. As far as I can tell, they are not owned/managed by the infra-team:
|
That |
I didn't add all the domains that we own. Instead, I looked at the links that are currently set as a |
@@ -18,6 +18,18 @@ allowed-github-orgs = [ | |||
"rust-analyzer", | |||
] | |||
|
|||
allowed-domains = [ | |||
"https://(.*)rust-lang.org/?.*", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"https://(.*)rust-lang.org/?.*", | |
"https://(.*)\\.rust-lang.org/?.*", |
instead of all of this regex, would it make more sense to parse the URLs and check whether the host part ends with a list of allowed host suffixes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was the first approach that I considered, but then dropped when I came across the link to the GitHub issue and the bit.ly
link. There are probably valid reasons to link to domains that we don't own, but in that case I would only want to allow specific URLs and not all of bit.ly
or GitHub.
Repositories can be configured with a homepage, which is prominently featured on GitHub as a link. For repositories under the
rust-lang
organization, we want to make sure that those links only point to domains that are explicitly allowed. Ideally, only domains owned and operated by the Rust project itself will be whitelisted.The risk with other domains is that they might expire silently and get taken over by malicious actors, who can then host phishing campaigns or malware on sites "advertised" by the Rust project.
An initial selection of domains has been added to the allowlist for homepage URLs. The domains are either owned and operated by the infra-team or belong to GitHub.