Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some improvements to nix parsing #1403

Merged
merged 2 commits into from
May 16, 2024
Merged

Conversation

jopejoe1
Copy link
Contributor

Added build logs to nix, improved the parsing of URL fields in nix, and added binary names to nix.

@jopejoe1 jopejoe1 force-pushed the nix-improvement branch 6 times, most recently from ad5b7f9 to a91285e Compare May 13, 2024 22:06
@AMDmi3
Copy link
Member

AMDmi3 commented May 15, 2024

add build logs to nix

We need direct links to raw build log files.
My bad, we support these kind of links as well.

improve url parsing in nix

You don't need extract_nix_url, as add_links already handles argument as either str, list or None. downloadPage support is useful, but I'd introduce a new link type for it (repology/package.py).

add binary name to nix

This part is invalid, binname is binary package name, not executable name.

@jopejoe1
Copy link
Contributor Author

add build logs to nix

We need direct links to raw build log files.

I don't think it is currently possible to get a link to the raw logs from the data that get published in the package.json as the raw log files all have hash or build ID in their URL, e.g. https://cache.nixos.org/log/s7ir5kdff5437fna8ks274sl537ks5md-firefox-devedition-unwrapped-126.0b5.drv or https://hydra.nixos.org/build/259278227/nixlog/1

improve url parsing in nix

You don't need extract_nix_url, as add_links already handles argument as either str, list or None. downloadPage support is useful, but I'd introduce a new link type for it (repology/package.py).

Would something like UPSTREAM_DOWNLOAD_PAGE be okay for that link type?

add binary name to nix

This part is invalid, binname is binary package name, not executable name.

Removing it.

@@ -297,6 +297,9 @@

# guix
'download_hosts_blacklist': [str],

# nixos
'branch': Coerce(str),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer strict schema so just str.

@@ -83,6 +100,11 @@ def extract_nix_licenses(whatever: Any) -> list[str]:


class NixJsonParser(Parser):
_branch: str

def __init__(self, branch: str = '') -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have str | None for optional values.

@@ -61,6 +61,23 @@ def extract_nix_licenses(whatever: Any) -> list[str]:
return []


def nix_has_logs(meta: dict[str, Any], arch: str) -> bool:
if 'platforms' not in meta or arch not in meta['platforms']:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most X in meta and meta[X] conditions can be simplified to meta.get(X, <default>).

@@ -182,6 +204,31 @@ def iter_parse(self, path: str, factory: PackageFactory) -> Iterable[PackageMake
if 'changelog' in meta:
pkg.add_links(LinkType.UPSTREAM_CHANGELOG, meta.get('changelog'))

for arch in ['x86_64-linux', 'aarch64-linux', 'x86_64-darwin', 'aarch64-darwin']:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's strange to do loop over items of two kinds the differentiate them with extra condition. Let's do two loops (for linux and darwin) instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even better, let's do a constant map of (arch, platform, is_trunk) -> link templates as in

_BUILD_LOGS_LINK_TEMPLATES = {
    ("x86_64", "linux", True): f'https://hydra.nixos.org/job/nixos/trunk-combined/nixpkgs.{key}.x86_64-linux',
    ...
}
for arch, platform in [('x86_64', 'linux')]:
    if nix_has_logs(meta, f'{arch}-{platform}') and (template := _BUILD_LOGS_LINK_TEMPLATES.get(arch, platform, branch is not None)):
        pkg.add_links(template.format(key = key, branch = branch))

which will also reduce code duplication.

@@ -61,6 +61,23 @@ def extract_nix_licenses(whatever: Any) -> list[str]:
return []


def nix_has_logs(meta: dict[str, Any], arch: str) -> bool:
if 'platforms' not in meta or arch not in meta['platforms']:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also simplify this into single return with complex multiline condition.

def nix_has_logs(meta: dict[str, Any], arch: str) -> bool:
if 'platforms' not in meta or arch not in meta['platforms']:
return False
elif 'broken' in meta and meta['broken']:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it documented anywhere that packages with these flags are not built or that these flags indicate build failure? If latter is the case, isn't link still valid containing earlier (possibly successful) builds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some documentation at https://github.com/NixOS/nixpkgs/blob/master/doc/stdenv/meta.chapter.md, but it's sadly incomplete and missing a lot of stuff.

Most of these are pretty much we know it's broken, so no need to waste resources building.

@@ -21,6 +21,7 @@
url: https://channels.nixos.org/nixos-{{branch}}/packages.json.br
parser:
class: NixJsonParser
branch: {{branch}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrap in {% if branch %} and quote for proper typing.

@AMDmi3
Copy link
Member

AMDmi3 commented May 15, 2024

I don't think it is currently possible to get a link to the raw logs from the data that get published in the package.json as the raw log files all have hash or build ID in their URL

I've updated the the comment - let's add these links. I've requested a few changes.

Would something like UPSTREAM_DOWNLOAD_PAGE be okay for that link type?

It would be perfect. Note that you may want to add it to a few places where UPSTREAM_HOMEPAGE or UPSTREAM_DOWNLOADS are handled, namely in sql.d/update/update_problems.sql of this repo and webapp (2 places in code and template I've just fixed). If it's not apparent how these should be handled just ignore this, I'll add them myself.

@jopejoe1
Copy link
Contributor Author

Applied all recommend changes and add champion pr repology/repology-webapp#246 for the webapp.

@AMDmi3 AMDmi3 merged commit 905b67d into repology:master May 16, 2024
1 check passed
@AMDmi3
Copy link
Member

AMDmi3 commented May 16, 2024

Excellent, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants