Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dns proxy use original source address and port #28928

Conversation

jrajahalme
Copy link
Member

@jrajahalme jrajahalme commented Nov 1, 2023

Make Cilium DNS proxy transparent by using the original source address (and port) in upstream connections. Since clients can issue multiple requests from the same port without waiting for the responses in between, we must support multiple requests in flight at the same time on the client side as well. To this end we share the DNS client between all requests that use the same upstream 5-tuple.

Support for shared clients is imported from a new version of cilium/dns.

Cilium DNS proxy now uses the original pod's address as the source address towards the DNS servers.

@jrajahalme jrajahalme added area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. area/fqdn Affects the FQDN policies feature labels Nov 1, 2023
@jrajahalme jrajahalme requested review from a team as code owners November 1, 2023 12:23
@jrajahalme jrajahalme marked this pull request as draft November 1, 2023 12:23
@jrajahalme
Copy link
Member Author

In local testing cilium monitor now shows the original pod source address and port is also used on the overlay:

<- endpoint 1929 flow 0x0 , identity 3664->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.203:43094 -> 10.96.0.10:53 udp
Policy verdict log: flow 0x0 local EP ID 1929, remote ID 36693, proto 17, egress, action redirect, auth: disabled, match L3-L4, 10.244.1.203:43094 -> 10.244.0.124:53 udp
-> proxy port 37757 flow 0x0 , identity 3664->unknown state new ifindex 0 orig-ip 0.0.0.0: 10.244.1.203:43094 -> 10.244.0.124:53 udp
<- endpoint 1929 flow 0x0 , identity 3664->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.203:43094 -> 10.96.0.10:53 udp
-> proxy port 37757 flow 0x0 , identity 3664->unknown state established ifindex 0 orig-ip 0.0.0.0: 10.244.1.203:43094 -> 10.244.0.124:53 udp
<- proxy flow 0x5164dc81 , identity 3664->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.203:43094 -> 10.244.0.124:53 udp
-> overlay flow 0x5164dc81 , identity 3664->36693 state unknown ifindex cilium_vxlan orig-ip 0.0.0.0: 10.244.1.203:43094 -> 10.244.0.124:53 udp
<- proxy flow 0xa0e9948a , identity 3664->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.203:43094 -> 10.244.0.124:53 udp
-> overlay flow 0xa0e9948a , identity 3664->36693 state unknown ifindex cilium_vxlan orig-ip 0.0.0.0: 10.244.1.203:43094 -> 10.244.0.124:53 udp
<- overlay flow 0x52db30f3 , identity 36693->unknown state unknown ifindex cilium_vxlan orig-ip 0.0.0.0: 10.244.0.124:53 -> 10.244.1.203:43094 udp
-> proxy flow 0x52db30f3 , identity 36693->3664 state reply ifindex lxc53a2958fd80a orig-ip 10.244.0.124: 10.244.0.124:53 -> 10.244.1.203:43094 udp
<- overlay flow 0x52db30f3 , identity 36693->unknown state unknown ifindex cilium_vxlan orig-ip 0.0.0.0: 10.244.0.124:53 -> 10.244.1.203:43094 udp
-> proxy flow 0x52db30f3 , identity 36693->3664 state reply ifindex lxc53a2958fd80a orig-ip 10.244.0.124: 10.244.0.124:53 -> 10.244.1.203:43094 udp
<- proxy flow 0x0 , identity unknown->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.0.124:53 -> 10.244.1.203:43094 udp
-> endpoint 1929 flow 0x0 , identity 36693->3664 state reply ifindex lxc53a2958fd80a orig-ip 10.244.0.124: 10.96.0.10:53 -> 10.244.1.203:43094 udp
<- endpoint 1929 flow 0x0 , identity 3664->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.203:43094 -> 10.96.0.10:53 udp
-> proxy port 37757 flow 0x0 , identity 3664->unknown state established ifindex 0 orig-ip 0.0.0.0: 10.244.1.203:43094 -> 10.244.0.124:53 udp
<- proxy flow 0xe1185fe7 , identity 3664->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.203:43094 -> 10.244.0.124:53 udp
-> overlay flow 0xe1185fe7 , identity 3664->36693 state unknown ifindex cilium_vxlan orig-ip 0.0.0.0: 10.244.1.203:43094 -> 10.244.0.124:53 udp
<- overlay flow 0x52db30f3 , identity 36693->unknown state unknown ifindex cilium_vxlan orig-ip 0.0.0.0: 10.244.0.124:53 -> 10.244.1.203:43094 udp
-> proxy flow 0x52db30f3 , identity 36693->3664 state reply ifindex lxc53a2958fd80a orig-ip 10.244.0.124: 10.244.0.124:53 -> 10.244.1.203:43094 udp
<- proxy flow 0x0 , identity unknown->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.0.124:53 -> 10.244.1.203:43094 udp
-> endpoint 1929 flow 0x0 , identity 36693->3664 state reply ifindex lxc53a2958fd80a orig-ip 10.244.0.124: 10.96.0.10:53 -> 10.244.1.203:43094 udp

@jrajahalme
Copy link
Member Author

dnsproxy gets timeouts waiting for response from the dns server when curl is issuing both IPv4 (A) and IPv6 (AAAA) queries at the same time. Works flawlessly if limiting to IPv4 only (curl -4 ...).

Functionally significant change is that now there are two upstream connections (one for A and other for AAAA records) at the same time, with the same 5-tuple. So the connections are fighting for the same packets. What I think happens on the timeout is that the "missing" response was delivered to the same socket as the successful one, but the connection does not read multiple answers from the same socket.

@jrajahalme
Copy link
Member Author

Added commit to share a DNS client between requests that share the same upstream 5-tuple.

@jrajahalme
Copy link
Member Author

One of the commits needs to be moved to cilium/dns.

@jrajahalme jrajahalme force-pushed the dns-proxy-use-original-source-address-and-port branch 2 times, most recently from 6e4b87e to a806be7 Compare November 3, 2023 06:18
@jrajahalme
Copy link
Member Author

Moved the refactoring commit to cilium/dns, added TryLock support in sasha-s/go-deadlock/pull/30

@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme added the dont-merge/blocked Another PR must be merged before this one. label Nov 3, 2023
@jrajahalme
Copy link
Member Author

marked as blocked as we need to decide if we wait for sasha-s/go-deadlock#30 to merge or use my fork with it like in the 1st commit via go mod replace right now.

@jrajahalme jrajahalme force-pushed the dns-proxy-use-original-source-address-and-port branch from a806be7 to 0d80ed4 Compare November 3, 2023 10:00
@jrajahalme
Copy link
Member Author

Avoiding the use of original source address if it is known to be from the host networking namespace (host endpoint or localhost).

@jrajahalme
Copy link
Member Author

/test

Copy link
Member

@bimmlerd bimmlerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's still a draft, but I've briefly looked through and the code seems to have become more complex; thus questions around the approach arose

pkg/fqdn/dnsproxy/proxy.go Outdated Show resolved Hide resolved
@jrajahalme jrajahalme force-pushed the dns-proxy-use-original-source-address-and-port branch from 0d80ed4 to f743dda Compare November 6, 2023 10:40
@jrajahalme jrajahalme removed the dont-merge/blocked Another PR must be merged before this one. label Nov 6, 2023
@jrajahalme jrajahalme marked this pull request as ready for review November 6, 2023 10:40
@jrajahalme jrajahalme requested a review from a team as a code owner November 6, 2023 10:40
@jrajahalme
Copy link
Member Author

Rebased to restart CI in hopes cloudflare is better today.

@jrajahalme jrajahalme force-pushed the dns-proxy-use-original-source-address-and-port branch from f743dda to 12bfdeb Compare November 6, 2023 10:56
@jrajahalme
Copy link
Member Author

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 8, 2023
@jrajahalme jrajahalme merged commit 9d70db8 into cilium:main Nov 9, 2023
59 of 61 checks passed
@jrajahalme jrajahalme removed the affects/v1.11 This issue affects v1.11 branch label Nov 9, 2023
@jrajahalme jrajahalme added backport-pending/1.12 backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.12 needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Nov 9, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.12 in 1.12.16 Nov 9, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.14 in 1.14.4 Nov 9, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.13 in 1.13.9 Nov 9, 2023
@github-actions github-actions bot added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Nov 9, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.14 in 1.14.4 Nov 9, 2023
@github-actions github-actions bot added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. labels Nov 9, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Backport done to v1.14 in 1.14.4 Nov 9, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.13 in 1.13.9 Nov 9, 2023
@github-actions github-actions bot added backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. and removed backport-pending/1.12 labels Nov 10, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Backport done to v1.13 in 1.13.9 Nov 10, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.12 in 1.12.16 Nov 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/fqdn Affects the FQDN policies feature area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
No open projects
1.13.9
Backport done to v1.13
1.14.4
Backport done to v1.14
Development

Successfully merging this pull request may close these issues.

None yet

8 participants