Network policy blocks established connections to RDS #236

Open
Mohilpalav opened this issue Mar 19, 2024 · 6 comments
Labels: bug (Something isn't working), strict mode (Issues blocked on strict mode implementation)

Comments

@Mohilpalav

What happened:

We have a workload running in an EKS cluster which makes a request to an RDS cluster on startup. This request is blocked by the network policy, despite an egress rule from that workload to the RDS cluster subnet. We suspect that the outbound connection goes out before the network policy node agent starts tracking connections, so when the response is received the node agent has no matching known-allowed connection and the traffic is denied.

This is what we can see in the network policy flow logs:

Node: ip-10-51-21-121.us-east-1.compute.internal;SIP: 10.47.53.151;SPORT: 5432;DIP: 10.27.36.181;DPORT: 45182;PROTOCOL: TCP;PolicyVerdict: DENY
Node: ip-10-51-21-121.us-east-1.compute.internal;SIP: 10.47.53.151;SPORT: 5432;DIP: 10.27.36.181;DPORT: 45174;PROTOCOL: TCP;PolicyVerdict: DENY

10.47.53.151:5432 -> RDS
10.27.36.181 -> EKS workload

Unfortunately, at the moment the node agent logs only show this (#103):

2024-03-19 21:31:19.049604118 +0000 UTC Logger.check error: failed to get caller
2024-03-19 21:31:19.858783024 +0000 UTC Logger.check error: failed to get caller
2024-03-19 21:31:19.923276681 +0000 UTC Logger.check error: failed to get caller

What you expected to happen:
The connection to RDS should be allowed.

How to reproduce it (as minimally and precisely as possible):

  • create a network policy that allows all egress, but no ingress traffic, for a simple application (a sketch of such a policy follows this list)
  • on startup, the application makes several outbound connections to an external service (e.g. example.com)
  • deploy the application as a multi-replica deployment to make the behavior easier to reproduce
  • check whether any return traffic / responses are denied by the network policy agent when they should not be
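
For reference, a minimal sketch of such a policy (the namespace and labels are illustrative):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-egress
  namespace: test
spec:
  podSelector:
    matchLabels:
      app: repro-app       # selects the test application pods
  policyTypes:
    - Ingress              # listed with no ingress rules => all ingress denied
    - Egress
  egress:
    - {}                   # a single empty rule => all egress allowed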

Anything else we need to know?:
Similar issues:
#73
#186

Environment:

  • Kubernetes version (use kubectl version): v1.28
  • CNI Version: v1.16.4
  • Network Policy Agent Version: v1.0.8
  • OS (e.g: cat /etc/os-release): Amazon Linux 2
  • Kernel (e.g. uname -a): 5.10.210-201.852.amzn2.x86_64
@Mohilpalav added the bug label Mar 19, 2024
@jayanthvn added the strict mode label May 9, 2024
@jayanthvn
Contributor

Here the pod attempted to start a connection before NP enforcement began, and hence the response packet is dropped. Please refer to #189 (comment) for a detailed explanation.

Our recommended solution for this is strict mode, which gates pod launch until policies are configured for the newly launched pod - https://github.com/aws/amazon-vpc-cni-k8s?tab=readme-ov-file#network_policy_enforcing_mode-v1171
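
A sketch of how that mode is enabled, assuming it is set as an environment variable on the aws-node container of the VPC CNI daemonset (excerpt only, not a complete manifest; requires VPC CNI v1.17.1+):

# kube-system/aws-node daemonset, container spec excerpt
containers:
  - name: aws-node
    env:
      - name: NETWORK_POLICY_ENFORCING_MODE
        value: "strict"    # the default is "standard"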

Another option, if you don't want to enable this mode, is to allow the Service CIDRs: given that your pods communicate via Service VIPs, this will allow the return traffic (see the sketch below).
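
A sketch of such an egress rule, assuming the common EKS default Service CIDR of 10.100.0.0/16 (substitute your cluster's actual Service CIDR):

# Fragment of a NetworkPolicy spec
egress:
  - to:
      - ipBlock:
          cidr: 10.100.0.0/16   # assumed Service CIDR; check your cluster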

@achevuru
Contributor

achevuru commented Jun 3, 2024

@Mohilpalav Did Strict mode help with your use case/issue?

@FabrizioCafolla

@Mohilpalav Is there any solution for this issue?

@Monska85

Hello there,

we have the same problem connecting to the RDS service from a pod, and also when contacting the S3 service.
We have tried to reproduce the error, but it is not predictable. We see errors when we deploy many pods at the same time that all try to connect to RDS or S3, but not always.

Did you find any solution to this problem?

@Monska85

Monska85 commented Aug 2, 2024

Hello there,

we found a workaround here.

Using the ANNOTATE_POD_IP environment variable speeds up pod IP discovery, and for now the pod startup issues are no longer present.
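
For anyone else trying this, a sketch of the setting, assuming it goes on the aws-node container of the VPC CNI daemonset (excerpt only, not a complete manifest):

containers:
  - name: aws-node
    env:
      - name: ANNOTATE_POD_IP
        value: "true"      # default "false"; when true, IPAMD annotates pods with vpc.amazonaws.com/pod-ips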

@albertschwarzkopf

In my case, ANNOTATE_POD_IP has not really helped. Pods randomly have issues establishing network connections (e.g. after restarts), even though networking worked before the restart.
