Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support session dependencies (requires) #631

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

gschaffner
Copy link

@gschaffner gschaffner commented Jul 13, 2022

Hi! This adds session dependency resolution support to close #453 and as a prerequisite for #167. This implementation works and is fully tested, but this PR still needs some documentation and discussion.

To get a topological sort of the dependency graph, I haven't used graphlib.TopologicalSorter or NetworkX DAG methods because I don't think that such sorters can give nox the desired behavior. The problem, essentially, is that we don't just want any topological sorter but a specific (unique, hopefully) topological sort that the user can exert some ordering preference over. For example:

#  /<- e <- d
# c <------ a
# b <------/
#  \<------ f

@nox.session(python=False, requires=["c", "b"])
def a(session): pass

@nox.session(python=False)
def b(session): pass

@nox.session(python=False)
def c(session): pass

@nox.session(python=False, requires=["e"])
def d(session): pass

@nox.session(python=False, requires=["c"])
def e(session): pass

@nox.session(python=False, requires=["b"])
def f(session): pass

For this noxfile I would expect the following behavior:

  • nox -s e runs c, e.
  • nox -s d a runs c, e, d, b, a. Note that d and its dependencies run before a and its dependencies.
  • nox -s a d runs c, b, a, e, d. Note that a and its dependencies run before d and its dependencies. Also note that c runs before b; I'll get to the reasoning for this in a moment.

The problem with using graphlib.TopologicalSorter or NetworkX DAG methods is twofold:

  1. These sorters/methods take as argument just the graph. One cannot specify that they want only a topological sort of the subgraph specifically containing the -s sessions and their dependencies. That is, in the examples above, nox should not run f because f is not a (direct or recursive) dependency of any of the sessions specified by -s.

    One could work around this issue by using e.g. graphlib.TopologicalSorter to produce a topological sort of the entire graph, and then drop from this sort anything in {f and its recursive dependencies} - {*{node and its recursive dependencies} for node in (sessions listed in -s)}. This is still more computationally expensive than it needs to be though, although this should have minimal effect for typical noxfile-sized graphs.

  2. Crucially: graphlib.TopologicalSorter and networkx.topological_sort give no guarantee over which topological sort you get; they just guarantee that you get some topological sort. (The other NX topo. sorter, networkx.lexicographical_topological_sort, does give a unique sort, but I don't think that it can be used to produce the desired behavior still.)

    For nox's behavior to be deterministic, though, we want a specific topological sort1. For example, in the example above, all of the following are valid topological sorts for subgraph for nox -s a d (i.e. the subgraph containing only a, d, and their recursive dependencies):

    1. c, b, a, e, d # the desired sort for nox -s a d ?
    2. b, c, a, e, d # the desired sort for nox -s a d ?
    3. c, e, b, a, d
    4. b, c, e, a, d
    5. c, b, e, a, d
    6. c, e, d, b, a # the desired sort for nox -s d a
    7. b, c, e, d, a
    8. c, b, e, d, a
    9. c, e, b, d, a

    I would argue that (i) is the desired sort for nox -s a d because it is

    • "Lazy", i.e. it does not try run dependencies earlier than necessary. E.g. e does not run any earlier than is required for d.
    • "Stable". E.g. it runs a and its dependencies before d and its dependencies since a occurred before d in the -s listing. Similarly, it runs c before b since c occurred before b in a's requires.

    (ii) almost satisfies the same properties, but it is not stable in b and c. The way that the dependency resolver can choose between (1) and (2) without leaving this choice as an undefined implementation detail is by preferring to run b and c in the order that they appear in a's requires. This is exactly analogous to how the resolver prefers to run a and d in the order that they occur in in nox -s a d.

  3. As a minor point, using graphlib would require nox to either vendor graphlib or depend on e.g. graphlib-backport, which is a small third-party dependency that would be best to avoid for security.

nox._resolver.lazy_stable_topo_sort is a topological sorting algorithm with these properties. It can also detect what graph cycle prevented the subgraph from having a topological sort if the subgraph is cyclic. Its implementation is recursive, but I believe it is still best-case and worst-case O(n) where n is the number of nodes in the subgraph that a sort is requested for. I would love to hear if these properties also make sense to others.

Another point of discussion for this is how requires should interact with filtering flags (-k, -t). I think that nox should not allow sessions to be accidentally run without their dependencies, i.e. the manifest filtering should be done first, followed by inserting the dependencies of the filtered queue into to the queue appropriately. Thoughts?

More Examples

  • Change a in the above example to

    @nox.session(python=False, requires=["d"])
    def a(session): pass

    Then nox -s a d runs c, e, d, a. The resolver ignores the order preference a, d and will instead run d before a to satisfy the dependency a -> d.

  • #  /<------\
    # a    b    c
    #  \->/ \->/
    
    # E.g. `nox -s c` gives "Sessions are in a dependency cycle: c -> a -> b -> c".
    
    @nox.session(python=False, requires=["b"])
    def a(session): pass
    
    @nox.session(python=False, requires=["c"])
    def b(session): pass
    
    @nox.session(python=False, requires=["a"])
    def c(session): pass
  • # a <- b <- c
    
    nox.options.error_on_external_run = True
    nox.options.sessions = ("c",)
    
    @nox.session(python=False)
    def a(session):
        session.run("fail")
    
    @nox.session(python=False, requires=["a"])
    def b(session):
        print("I will never run since one of my required sessions doesn't succeed!")
    
    @nox.session(python=False, requires=["b"])
    def c(session):
        print("I will never run since one of my required sessions doesn't succeed!")
  • Putting {python} in requires is also supported:

    @nox.session(python=["3.8", "3.9"])
    def a(session): pass
    
    @nox.session(requires=["a"])
    def b(session): pass
    # e.g `nox -s b` runs [a-3.8, a-3.9, b].
    
    @nox.session(python="3.8", requires=["a-{python}"])
    def c(session): pass
    # e.g. `nox -s c` runs [a-3.8, c].

Also see the nox._resolver.lazy_stable_topo_sort docstring and tests/test__resolver.py for more examples and comments.

1: It would also be great if the topo. sort was not just deterministic but was mathematically unique such that which sort nox uses is not simply an implementation detail but part of the spec. Uniqueness is hard though. In any case, I think nox should pin its topological sorting algorithm for stability, i.e. don't use an algorithm implemented in an external dependency that we can't pin since pinning it could break installs of other packages into the same venv (thus potentially breaking packaging of nox for most non-Nix package managers).

@gschaffner gschaffner changed the title Session dependencies Support Session dependencies (requires) Jul 13, 2022
@gschaffner gschaffner changed the title Support Session dependencies (requires) Support session dependencies (requires) Jul 13, 2022
@gschaffner gschaffner force-pushed the session-dependencies branch 10 times, most recently from 3e48a6f to dc4af45 Compare July 15, 2022 22:26
@gschaffner
Copy link
Author

gschaffner commented Jul 15, 2022

https://github.com/gschaffner/nox/tree/share-venv branches off of this to also add a working implementation for #167.

Demo: (click to expand)

nox.options.error_on_external_run = True
nox.options.sessions = ("a", "b")

@nox.session(python="3.10")
def env1(session):
    session.install("pytest")

@nox.session(venv="env1")
def a(session):
    print(f"      v Should be 3.10")
    session.run("python", "-V")
    # pytest is available to this session.
    print("      v Should show pytest stderr below.")
    session.run("pytest", "cause_pytest_to_print_stderr")

@nox.session(python="3.9", requires=["env1"])
def b(session):
    # pytest is not available to this session---it depends on env1 as as a prerequisite,
    # but it does not share env1's venv.
    print("      v Should log either 'pytest not found' or an error_on_external_run error.")
    session.run("pytest", "-h")

@nox.session(python=["3.9", "3.10"])
def env2(session):
    session.install("pytest")

@nox.session(python="3.10", venv="env2-{python}")
def c(session): pass
# e.g. `nox -s c` will run [env2-3.10, c] but not env2-3.9

@nox.session(python=["3.9", "3.10"], venv="env2-{python}")
def d(session): pass
# e.g. `nox -s d` will run [env2-3.9, d-3.9, env2-3.10, d-3.10]

@nox.session(python=False, venv="env1-{python}")
def e(session): pass
# e.g. `nox -s e` will raise since the requested venv does not exist

I imagine that #167 should be handled in a separate PR after/if this PR is merged, though.

@FollowTheProcess
Copy link
Collaborator

Hey @gschaffner This is an impressive piece of work! Thanks for all effort you've clearly put in. I'll go through it all soon and add some comments/thoughts but for now I think agree with most of your opinions on how the topo sort should behave wrt lazy/stable.

I think that nox should not allow sessions to be accidentally run without their dependencies

I agree with this so long as it won't adversely affect people who don't use the requires features.

My main thoughts so far from a quick scan through are:

  • We could do with some more tests around this (you've already pointed this out) and it's falling slightly short of out coverage target. As this is a reasonably sized new feature ideally I'd like to see a fair few tests ensuring all sorts of weirdness is handled correctly (think of the worst noxfile someone could write using the graph and ensure we handle it properly etc.)
  • There's a few lint steps that seem to be failing, this is no big deal for now but should be cleaned up before we think about merging

This implementation works and has been tested, but this PR is not done and still needs some discussion

Is it worth marking this PR as draft for now then?

I'd love to get some other maintainers eye's on this as it's pretty sizeable but great work!

@theacodes
Copy link
Collaborator

I am a soft "no" on Nox having a full dependency graph for sessions. I envisioned Nox as a very small tool, and one of the early decisions was not to try to do what Make and ninja already do better.

That said, I am not a BDFL and I don't take a very active role in Nox's new features. If any active maintainer wants this, make it so.

@gschaffner
Copy link
Author

gschaffner commented Aug 24, 2022

Hi all, sorry for the delay! I just fixed the minor pycln issue. I also added integration tests to this branch and to https://github.com/gschaffner/nox/tree/share-venv as well (which extends this branch to add support for #167). The new integration tests should hit full coverage (with the minor exception of one # pragma: no cover for an error check that should not be reachable unless Nox's implementation for venv gets broken by a change in the future). There are

a fair few tests ensuring all sorts of weirdness is handled correctly (think of the worst noxfile someone could write using the graph and ensure we handle it properly etc.)

included, I believe.

Even if this doesn't get merged into upstream Nox, I wanted to add tests to this (in part since we've been using it at my workplace for a month) :)

I agree with this so long as it won't adversely affect people who don't use the requires features.

Yep, it would not affect users who don't use the requires or venv features.

I envisioned Nox as a very small tool, and one of the early decisions was not to try to do what Make and ninja already do better.

I totally agree, there is a big benefit to keeping tools small and targeted. At the same time though I think this has the potential to be useful to quite a few people who want to use a single tool without having to add complexity by wrapping their nox calls in a separate build system. I think a lot of the advantage of Nox comes from it being pure-Python and thus approachable by any Python dev (who may not necessarily be familiar with, or have any other reason to learn, Make or another graph-aware build system).

If any active maintainer wants this, make it so.

I'll post on the linked issues to hopefully bring some attention to this PR and https://github.com/gschaffner/nox/tree/share-venv from those who expressed interest in these features.

Is it worth marking this PR as draft for now then?

Done. The only thing missing still is documentation (and perhaps discussion).

If there is interest in merging this into upstream Nox, it might be good to think about reviewing it in conjunction with https://github.com/gschaffner/nox/tree/share-venv, as I organized the code in this branch with the plan to make it very simple to add venv-sharing on top of it in https://github.com/gschaffner/nox/tree/share-venv.

@smarie
Copy link
Contributor

smarie commented Sep 5, 2022

Great piece of work @gschaffner ! I do not currently have enough bandwidth to review the code, but as far as features are concerned, I think that it would be worth extending what you have done for using {python} in requires, to any (combination of) session parameter(s). Indeed python is now supported as a nox session parameter with @nox.parametrize, the same way other parameters are.

Concerning this mechanism, my other comment is also that requires could maybe accept both a string representing the (possibly parametrized) session id (see also creating custom ids for parametrized sessions), OR a dict dict(session="a", python="3.8", param_a=1, param_2=12). Note that at this stage requiring tuples of parameter values, or several parameter values, seems too much of an extra complexity (think about dict(session="a", python="3.8", params={"a,b"=[(1, 2), (3, 4)]}....)

Finally, it may become more readable to offer a dedicated @nox.requires() decorator if usage shows that the requires parameter gets overused/popular (which would mean success :) ). That could be more elegant in the long run (maybe...to check first).

Once again great proposal, and I hope that this will make its way to the main nox, without degrading its current elegance and performance of course ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

sessions: require a session?
4 participants