Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

USM: ssl: Ignore ELF files for other architectures #25505

Merged
merged 9 commits into from May 16, 2024

Conversation

vitkyrka
Copy link
Contributor

@vitkyrka vitkyrka commented May 10, 2024

What does this PR do?

Ignore ELF files which are not for the architecture we're running on. Without this, we could end up installing a uprobe on, for example, an arm64 binary on an amd64 machine, thus corrupting the arm64 instruction and leading to segmentation faults.

Motivation

https://datadoghq.atlassian.net/browse/USMON-981

Additional Notes

Possible Drawbacks / Trade-offs

Describe how to test/QA your changes

Test cases are included in the PR.

@vitkyrka vitkyrka added changelog/no-changelog team/usm The USM team qa/done Skip QA week as QA was done before merge and regressions are covered by tests labels May 10, 2024
@pr-commenter
Copy link

pr-commenter bot commented May 10, 2024

Test changes on VM

Use this command from test-infra-definitions to manually test this PR changes on a VM:

inv create-vm --pipeline-id=34436718 --os-family=ubuntu

Ignore ELF files which are not for the architecture we're running on.
Without this, we could end up installing a uprobe on, for example, an
arm64 binary on an amd64 machine, thus corrupting the arm64 instruction
and leading to segmentation faults.
@vitkyrka vitkyrka force-pushed the vincent.whitchurch/fix-qemu-segfault branch from 27798c0 to fcb01b1 Compare May 14, 2024 14:58
@vitkyrka vitkyrka changed the title USM: ssl: Fix segfault with QEMU USM: ssl: Ignore ELF files for other architectures May 14, 2024
@vitkyrka vitkyrka marked this pull request as ready for review May 14, 2024 15:03
@vitkyrka vitkyrka requested review from a team as code owners May 14, 2024 15:04
@pr-commenter
Copy link

pr-commenter bot commented May 14, 2024

Regression Detector

Regression Detector Results

Run ID: 9dccc378-9789-4f01-aa7d-05dd21098025
Baseline: 2cacfc1
Comparison: f2e1b88

Performance changes are noted in the perf column of each table:

  • ✅ = significantly better comparison variant performance
  • ❌ = significantly worse comparison variant performance
  • ➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf experiment goal Δ mean % Δ mean % CI
file_to_blackhole % cpu utilization -9.74 [-14.78, -4.69]

Fine details of change detection per experiment

perf experiment goal Δ mean % Δ mean % CI
idle memory utilization +0.49 [+0.45, +0.52]
uds_dogstatsd_to_api_cpu % cpu utilization +0.02 [-2.85, +2.90]
uds_dogstatsd_to_api ingress throughput +0.01 [-0.20, +0.21]
trace_agent_json ingress throughput +0.00 [-0.01, +0.01]
trace_agent_msgpack ingress throughput -0.00 [-0.00, +0.00]
tcp_dd_logs_filter_exclude ingress throughput -0.02 [-0.06, +0.02]
pycheck_1000_100byte_tags % cpu utilization -0.05 [-4.72, +4.63]
otel_to_otel_logs ingress throughput -0.12 [-0.48, +0.25]
file_tree memory utilization -0.18 [-0.26, -0.11]
basic_py_check % cpu utilization -0.23 [-2.79, +2.34]
tcp_syslog_to_blackhole ingress throughput -4.48 [-25.15, +16.19]
file_to_blackhole % cpu utilization -9.74 [-14.78, -4.69]

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

  1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.

  2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.

  3. Its configuration does not mark it "erratic".


for {
// To allow time to attach
time.Sleep(1000 * time.Millisecond)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: time.Second
advanced nit: make that configurable (via command line), the CI can be slow sometimes (and USM won't be fast enough to hook)
advanced advanced nit: The binary can block on getting a character in stdin, to allow the test to signal when to start
advanced nit: how does this file differ from fmapper (pkg/network/usm/sharedlibraries/testutil/fmapper/fmapper.go) or prefetch_file (pkg/network/usm/testutil/prefetch_file/prefetch_file.go)

on a side note, why do we need both prefetch_file and fmapper? 😶‍🌫️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I got rid of the new program and just used fmapper which already has a waiting mechanism.

Comment on lines 55 to 65
func waitForProgramNotToBeTraced(t *testing.T, cmd *exec.Cmd) {
programType := "shared_libraries"
pid := cmd.Process.Pid

time.Sleep(3000 * time.Millisecond)

traced := utils.GetTracedPrograms(programType)
for _, prog := range traced {
require.False(t, slices.Contains[[]uint32](prog.PIDs, uint32(pid)))
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I wonder if we can remove the sleep in the code, and check if the path or the pid appear in the block-list (blocklistByID) we have in the different FileRegistries

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@vitkyrka
Copy link
Contributor Author

/merge

@dd-devflow
Copy link

dd-devflow bot commented May 16, 2024

🚂 MergeQueue

Pull request added to the queue.

This build is next! (estimated merge in less than 53m)

Use /merge -c to cancel this operation!

@dd-mergequeue dd-mergequeue bot merged commit a6c060e into main May 16, 2024
272 checks passed
@dd-mergequeue dd-mergequeue bot deleted the vincent.whitchurch/fix-qemu-segfault branch May 16, 2024 13:00
@github-actions github-actions bot added this to the 7.55.0 milestone May 16, 2024
vitkyrka added a commit that referenced this pull request May 22, 2024
* USM: ssl: Ignore ELF files for other architectures

Ignore ELF files which are not for the architecture we're running on.
Without this, we could end up installing a uprobe on, for example, an
arm64 binary on an amd64 machine, thus corrupting the arm64 instruction
and leading to segmentation faults.

* Use blockList in test

* Use OpenFromAnotherProcess instead of libmmap

* Remove unused libmmap

* Simplify testArch

* Skip test on unsupported platforms

* Format fakessl.c with clang-format

* Add shebang link to script

* Move libs to testdata

So that kmt will copy it to the VMs.

(cherry picked from commit a6c060e)

 Conflicts:
	pkg/network/usm/utils/debugger.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog/no-changelog component/system-probe qa/done Skip QA week as QA was done before merge and regressions are covered by tests team/usm The USM team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants