
[MNT] remove coverage reporting and pytest-cov from PR CI #6363

Draft · wants to merge 5 commits into base: main

Conversation

@fkiraly (Collaborator) commented Apr 29, 2024:

This PR removes the generation of coverage reports and the installation and use of pytest-cov from standard CI. It also removes the (unreliable) coverage badge from the README.

Reasons:

@fkiraly added the labels maintenance (Continuous integration, unit testing & package distribution) and module:tests (test framework functionality - only framework, excl specific tests) on Apr 29, 2024
@fkiraly (Collaborator, Author) commented May 1, 2024:

Anecdotal, but looks like this leads to substantial runtime improvements.

Before: (screenshot of CI run times)

After: (screenshot of CI run times)

@yarnabrina (Collaborator) commented:

@fkiraly I want to be cautious about this. Can we test these two things:

  1. What happens if, instead of removing coverage completely, we only skip the xml and html reports? I believe a base report (.coverage) always gets generated, and these two run separately.
  2. If it is removed completely (assuming only for CI runs in PRs), how will the coverage report appear in the README (after merging to main)? If it shows missing or 0% or similar, that will be misleading. To test this, maybe you can edit the link in the README on this branch without actually merging.

My caution is mainly because it is highly counterintuitive to me that coverage would affect timing this much. It is more than a 3-4x difference in your screenshots, and if that were the general effect of pytest-cov, users would have detected it long ago. It is a very popular and standard tool, so I am really wondering whether we are missing something else (though I don't have any alternative ideas yet).

@fkiraly (Collaborator, Author) commented May 1, 2024:

> What happens if, instead of removing coverage completely, we only skip the xml and html reports? I believe a base report (.coverage) always gets generated, and these two run separately.

According to the profiler, these parts are indeed what creates the overhead.

How would you turn these off separately? Can you help? Would that mean removing the `--cov-report=html` etc. flags, but keeping `--cov`?
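
For illustration, the distinction would look roughly like this in a generic pytest-cov invocation (flags as documented by pytest-cov; the target path is just an example, not sktime's exact CI command):

```bash
# full coverage run: collects coverage data AND writes xml/html reports
pytest --cov=sktime --cov-report=xml --cov-report=html

# coverage collection only: the .coverage data file is still written,
# but the (potentially slow) report generation is skipped
pytest --cov=sktime --cov-report=

# no coverage at all (what this PR originally did)
pytest
```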

> how will the coverage report appear in the README

This PR also removes the badge from the README, because it is misleading anyway, with or without this PR.

We should find a way to display genuine coverage in the README - I would consider that a separate issue (namely, #5090), which would then include adding the correct coverage display to the README.

@fkiraly (Collaborator, Author) commented May 1, 2024:

> so I am really wondering whether we are missing something else (though I don't have any alternative ideas yet).

So am I.
A wild guess is that we have some runaway import chains in the style of #6355, which are causing the long runtimes.

Or perhaps cause and effect are hard to disentangle in general?

@yarnabrina (Collaborator) commented:

> How would you turn these off separately? Can you help? Would that mean removing the `--cov-report=html` etc. flags, but keeping `--cov`?

Yes, exactly that. Let's see what happens.

> This PR also removes the badge from the README, because it is misleading anyway, with or without this PR.

I think if the README shows 0% or similar, it may give potential users/contributors the negative impression that this framework is untested (e.g., I know I would feel the same about a new tool).

@fkiraly (Collaborator, Author) commented May 2, 2024:

OK - I've now added it back in the test-nosoftdeps-full job and in pyproject.toml, to see what happens.

@yarnabrina (Collaborator) commented:

Only 5 jobs got triggered, not a single testing job! How did it run everything earlier?

@fkiraly (Collaborator, Author) commented May 2, 2024:

> Only 5 jobs got triggered, not a single testing job! How did it run everything earlier?

I see - I think I understand why there is a difference.

Previously, pytest-cov was removed from pyproject.toml, and that change triggered "test all". Now that we have added it back, pyproject.toml is no longer modified, so there is no trigger for testing anything anymore.
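
As an aside, the general idea behind this kind of change-based triggering can be sketched as follows; this is a hypothetical illustration only, not sktime's actual run_test_for_class logic or CI configuration:

```bash
# hypothetical sketch: decide what to test based on which files a PR touches
changed=$(git diff --name-only origin/main...HEAD)

if echo "$changed" | grep -qx "pyproject.toml"; then
    # a dependency/config change could affect everything -> run the full suite
    pytest sktime
else
    # otherwise run only the test modules the PR touched, if any
    changed_tests=$(echo "$changed" | grep "tests/test_.*\.py$" || true)
    if [ -n "$changed_tests" ]; then
        pytest $changed_tests
    fi
fi
```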

@yarnabrina (Collaborator) commented:

https://github.com/sktime/sktime/actions/runs/8940323391

I triggered a manual test all workflow on this branch for debugging.

@fkiraly (Collaborator, Author) commented May 3, 2024:

Thanks.

Something is taking hours again - how do we find out which estimator it gets stuck on?

@yarnabrina (Collaborator) commented:

I am not aware of a better solution than going into verbose mode.
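
For reference, "verbose mode" and related diagnostics could look roughly like this (generic pytest flags; the target path and timeout value are illustrative, and --timeout requires the pytest-timeout plugin):

```bash
# -v prints one line per test as it completes, so a hang shows up
# right after the last completed test in the CI log
pytest -v sktime/forecasting

# if the pytest-timeout plugin is installed, a hard per-test limit
# turns a silent hang into a named failure
pytest --timeout=600 sktime/forecasting

# --durations=20 lists the slowest tests after the run, to spot tests
# that are slow rather than actually stuck
pytest --durations=20 sktime/forecasting
```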

By the way, have you seen the failures? It seems every single "other" run failed with this:

FAILED sktime/tests/tests/test_test_utils.py::test_run_test_for_class - AssertionError: assert 'True_run_always' in ['True_pyproject_change', 'True_changed_class', 'True_changed_tests']

@fkiraly (Collaborator, Author) commented May 3, 2024:

> By the way, have you seen the failures? It seems every single "other" run failed with this:

Thanks for pointing this out - this is a bug in a test I added to make sure we test the run_test_for_class utility.

The bug surfaces only in the test_all workflow, which hits a certain combination of conditions. Fix here: #6383

@yarnabrina (Collaborator) commented:

I checked the other jobs, and so far there are no timeout failures. Only one module job failed, and it's for forecasting:

FAILED sktime/forecasting/model_evaluation/tests/test_evaluate.py::test_evaluate_common_configs[backend8-scoring1-refit-1-10-fh5-ExpandingWindowSplitter] - OverflowError: Python int too large to convert to C long

Ref. https://github.com/sktime/sktime/actions/runs/8940323391/job/24558260007#step:3:6594

Any idea whether it's sporadic? We'll probably know from the random seed diagnostic.

(FYI @benHeid)

@fkiraly (Collaborator, Author) commented May 3, 2024:

> FAILED sktime/forecasting/model_evaluation/tests/test_evaluate.py::test_evaluate_common_configs[backend8-scoring1-refit-1-10-fh5-ExpandingWindowSplitter] - OverflowError: Python int too large to convert to C long

This is definitely a new one - I have not seen this before.

However, there have been failures in test_evaluate_common_configs in ancient times (#1194), but these seem unrelated?

> We'll probably know from the random seed diagnostic.

Probably not, as that diagnostic does not add random seeds except in TestAllForecasters, and test_evaluate_common_configs lives elsewhere.
