Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: pull-image: Only skip tests for TEEs #9613

Conversation

fidencio
Copy link
Member

@fidencio fidencio commented May 9, 2024

On 1423420, I've mistakenly disabled the tests entirely, for both non-TEEs and TEEs.

This happened as I didn't realise that confidential_setup would take non-TEEs into consideration. :-/

Now, let me follow-up on that and make sure that the tests will be running on non-TEEs.

@katacontainersbot katacontainersbot added the size/small Small and simple task label May 9, 2024
@fidencio fidencio force-pushed the topic/skip-pull-image-tests-on-tees-part-II branch from 41e31f9 to c2615cd Compare May 9, 2024 08:59
@katacontainersbot katacontainersbot added size/medium Average sized task and removed size/small Small and simple task labels May 9, 2024
@@ -9,7 +9,7 @@ load "${BATS_TEST_DIRNAME}/lib.sh"
load "${BATS_TEST_DIRNAME}/confidential_common.sh"

setup() {
confidential_setup && skip "Due to issues related to pull-image integration skip tests for ${KATA_HYPERVISOR}."
confidential_setup_non_tees || skip "Due to issues related to pull-image integration skip tests for ${KATA_HYPERVISOR}."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is maybe slightly outside of the scope of this PR, but it feels like the confidential_setup_non_tees/ confidential_setup logic is pretty confusing with various negation and requiring || vs. &&. This lead to the original mistake that at least three people overlooked.

This might be controversial, but I wonder if we would have better clarity (and avoid the issue we had here) with some renaming and separation and a bit more code. So this code would look something like:

if [ ! is_confidential_runtime_class ] then;
    skip "Test not supported for ${KATA_HYPERVISOR}."
fi
if [ is_confidential_hardware ]; then
     skip "Due to issues related to pull-image integration skip tests for ${KATA_HYPERVISOR}."
fi

It isn't as succinct coding wise, but I'd argue that the best code is the easier to understand rather than the shortest.

We could create a confidential_common function that wraps this and was called something like skip_for_non_confidential_and_tee_hardware that would add clarity as well, to avoid the duplication of this code in the bats files?

What do others thinks?

@fidencio fidencio force-pushed the topic/skip-pull-image-tests-on-tees-part-II branch from c2615cd to 56d7aad Compare May 9, 2024 10:05
@stevenhorsman
Copy link
Member

Looking much clearer - Thank you!

@fidencio fidencio force-pushed the topic/skip-pull-image-tests-on-tees-part-II branch from 56d7aad to 16a22b5 Compare May 9, 2024 11:07
Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking much clearer. Thanks for the updates!

@fidencio fidencio force-pushed the topic/skip-pull-image-tests-on-tees-part-II branch from 16a22b5 to be25683 Compare May 9, 2024 14:23
@fidencio fidencio marked this pull request as draft May 9, 2024 15:25
@fidencio fidencio force-pushed the topic/skip-pull-image-tests-on-tees-part-II branch 4 times, most recently from cb5ef4f to a74c399 Compare May 15, 2024 11:11
@fidencio fidencio force-pushed the topic/skip-pull-image-tests-on-tees-part-II branch from a74c399 to 3dd6057 Compare May 15, 2024 17:21
@fidencio fidencio marked this pull request as ready for review May 15, 2024 17:21
Copy link
Contributor

@wainersm wainersm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is much better now! Thanks @fidencio @stevenhorsman !

Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@fidencio
Copy link
Member Author

/test

@fidencio fidencio marked this pull request as draft May 16, 2024 01:24
@stevenhorsman stevenhorsman force-pushed the topic/skip-pull-image-tests-on-tees-part-II branch from 3dd6057 to 9d8bc54 Compare May 16, 2024 07:47
@stevenhorsman
Copy link
Member

We are getting a failure here where the pause image rootfs which we still expect on the host isn't found in the standard place

 count of container rootfs in host is: 0, expect count is: 1

We have seen this before in kata-containers/tests#5781, but we thought that was due to stale processes on bare-metal machines.

With the impending move to shared_fs=none we shouldn't rely on this any more, so I'm tempted to relax the check to ensure that the rootfs count is no higher than we expect, but I've rebased this to pick up the latest nydus-snapshotter version changes first, and will see if that makes a difference.

@stevenhorsman stevenhorsman force-pushed the topic/skip-pull-image-tests-on-tees-part-II branch from b625265 to dfbc2e7 Compare May 16, 2024 09:39
@fidencio
Copy link
Member Author

We are getting a failure here where the pause image rootfs which we still expect on the host isn't found in the standard place

 count of container rootfs in host is: 0, expect count is: 1

We have seen this before in kata-containers/tests#5781, but we thought that was due to stale processes on bare-metal machines.

With the impending move to shared_fs=none we shouldn't rely on this any more, so I'm tempted to relax the check to ensure that the rootfs count is no higher than we expect, but I've rebased this to pick up the latest nydus-snapshotter version changes first, and will see if that makes a difference.

Let's do this, as this whole test will basically be dropped in the very short future.

@stevenhorsman
Copy link
Member

stevenhorsman commented May 16, 2024

Let's do this, as this whole test will basically be dropped in the very short future.

Already done!

@fidencio @wainersm - I also took the opportunity to un skip the tests that check for runc then guest-pull and vice versa. They pass, but we're hitting failures in the attestation tests, but the output isn't useful. Any ideas?:

[run_kubernetes_tests.sh:110] INFO: Executing k8s-confidential-attestation.bats
1..2
# bats warning: Executed 0 instead of expected 2 tests
[run_kubernetes_tests.sh:111] ERROR: bats --show-output-of-passing-tests k8s-confidential-attestation.bats

@wainersm
Copy link
Contributor

Executed 0 instead of expected 2 tests

This kind of issue usually means unset variable being used. Let me look at the test's code.

@wainersm
Copy link
Contributor

Executed 0 instead of expected 2 tests

This kind of issue usually means unset variable being used. Let me look at the test's code.

Ok, k8s-confidential-attestation.bats calls confidential_setup() that no longer exist. We should adjust all tests that are still using that function.

@stevenhorsman
Copy link
Member

stevenhorsman commented May 16, 2024

Executed 0 instead of expected 2 tests

This kind of issue usually means unset variable being used. Let me look at the test's code.

Ok, k8s-confidential-attestation.bats calls confidential_setup() that no longer exist. We should adjust all tests that are still using that function.

Thanks for the made bats knowledge! I'll work on that tomorrow morning.

@fidencio fidencio force-pushed the topic/skip-pull-image-tests-on-tees-part-II branch from dfbc2e7 to 3451411 Compare May 17, 2024 10:33
@fidencio fidencio marked this pull request as ready for review May 17, 2024 10:33
fidencio and others added 7 commits May 17, 2024 12:39
Let's rename it to `is_confidential_runtime_class`, and adapt all the
places where it's called.

The new name provides a better description, leading to a better
understanding of what the function really does.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function is a helper to check whether the KATA_HYPERVISOR being
used is a confidential hardware (TEE) or not, and we can use it to
skip or only run tests on those platforms when needed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
On 1423420, I've mistakenly disabled the tests entirely, for both
non-TEEs and TEEs.

This happened as I didn't realise that `confidential_setup` would take
non-TEEs into consideration. :-/

Now, let me follow-up on that and make sure that the tests will be
running on non-TEEs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Running those with the non-TEE runtime classes will simply fail.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This test is called from `tests/integration/run_kuberentes_tests.sh`,
which already ensures that yq is installed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
- We previously have an expectation for the pause rootfs
to be pull on the host when we did a guest pull. We weren't
really clear why, but it is plausible related to the issues we had
with containerd and nydus caching. Now that is fixed we can begin
to address this with setting shared_fs=none, but let's start with
updating the rootfs host check to be not higher than expected

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Given that we think the containerd -> snapshotter image cache
problems have been resolved by bumping to nydus-snapshotter v0.3.13
we can try removing the skips to test this out

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
@fidencio fidencio force-pushed the topic/skip-pull-image-tests-on-tees-part-II branch from 3451411 to a92defd Compare May 17, 2024 10:40
@stevenhorsman
Copy link
Member

/test

This is done in order to work around
Azure/azure-cli#28984, following a suggestion
on the very same issue.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
@fidencio
Copy link
Member Author

/test

@fidencio fidencio merged commit cbfdc70 into kata-containers:main May 18, 2024
292 of 303 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok-to-test size/medium Average sized task
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants