Tcpdump for Everyone: Changes to diego-release for the proposed pcap-release #703

Open
a18e opened this issue Feb 17, 2023 · 1 comment

a18e commented Feb 17, 2023

Recently we proposed pcap-release as an easy way for CF application developers and landscape operators to capture network traffic for their apps and/or their BOSH VMs. See issue cloudfoundry/cf-deployment#980 for a more detailed description of pcap-release.

For the use case of capturing traffic from CF apps, we would need to implement some features in diego-release and would like to get your feedback on our proposed solution.

The following diagram shows how we're planning to capture app network traffic via the pcap-agent on the app-container, which is then sent via the pcap-api to the cf-CLI on the client machine:

[Diagram: single_instance_stream_to_client_pcapagent_on_container]

Our proposed solution would work similarly to the cf app-ssh process:

  • A cf-CLI plugin that implements commands to enable and perform tcpdumps on specific apps/app instances, with the option to pass a packet filter as a parameter (e.g. for a specific source address) (see app-ssh commands)
  • pcap-api (analogous to ssh-proxy for app-ssh) acts as the endpoint for the cf-CLI and passes requests on to the pcap-agent on the app-containers. pcap-api is also responsible for user authentication.
  • pcap-agent (analogous to diego-sshd for app-ssh) runs on the container and acts as a wrapper around libpcap to capture network traffic

We have already successfully completed a spike/PoC in which we modified cloud-controller and diego-release on one of our dev-landscapes to enable pcap-agent globally, i.e. to run the agent on every app-container in the landscape:

  • We added a new package “pcap-agent” to diego-release which builds the pcap-agent from source
    (Note: For the final release, we're planning to use a submodule, see below)
  • The pcap-agent binary is then packaged into the buildpack_app_lifecycle and docker_app_lifecycle (alongside diego-sshd), which are then extracted on every app-container

With these small changes we were able to perform a tcpdump on an app-container via the pcap-agent from any landscape-internal VM.

(Our issue on the required changes to the cloud-controller: cloudfoundry/cloud_controller_ng#3193)

While the PoC includes the pcap-agent source code directly in the diego-release src directory, we’re planning to use a submodule in the future: we will extract the src/pcap folder of the current pcap-release into a separate repository, which will serve as the diego-release submodule.

Before we move further, we would like to get your feedback, especially for the following questions:

  • Do you see any roadblocks or complexities we might have missed?
  • Is not having a Windows pcap-agent an issue?
  • Is it OK to include the pcap-agent-binaries in buildpack_app_lifecycle?
  • Do you agree with keeping the pcap-agent source code in a separate repository and including it here as a submodule?
  • How do we approach having our own go.mod file vs. the one in the diego-release/src/code.cloudfoundry.org folder?
@winkingturtle-vmw
Contributor

@a18e Is this conversation still ongoing in the community? Do you still need an answer to your questions?

Looking at this briefly, one concern that came to mind is the backward compatibility of this feature. My understanding is that pcap-agent claims a port, and I don't know whether we can guarantee that no one's app is already using that port.
