k8s.pod.network.io gives data only from eth0 #30196

Open
prabhatsharma opened this issue Dec 23, 2023 · 10 comments · May be fixed by #34287
Labels: bug (Something isn't working), receiver/kubeletstats

Comments

@prabhatsharma

Component(s)

receiver/kubeletstats

What happened?

Description

I have been trying to get network I/O details per namespace. Comparing the k8s.pod.network.io metrics with cAdvisor's container_network_receive_bytes_total, I found that the kubeletstats receiver does not return data for interfaces other than eth0. This gives an incomplete picture of the network bandwidth utilized.

Steps to Reproduce

sum(rate(k8s_pod_network_io{k8s_namespace_name="$k8s_namespace_name", direction="receive"}[5m])) by (interface)

vs

sum(rate(container_network_receive_bytes_total{namespace="$k8s_namespace_name"}[5m])) by (interface)

Expected Result

I should see data for all interfaces in the k8s_pod_network_io stream.

Actual Result

Got results only for eth0.

[screenshot: query results showing data for eth0 only]

Collector version

v0.90.1

Environment information

Environment

Amazon EKS

OpenTelemetry Collector configuration

receivers:
  kubeletstats:
    collection_interval: 15s
    auth_type: "serviceAccount"
    endpoint: "https://${env:K8S_NODE_NAME}:10250"
    insecure_skip_verify: true
    extra_metadata_labels:
      - container.id
      - k8s.volume.type
    metric_groups:
      - node
      - pod
      - container
      - volume
    metrics:
      k8s.pod.cpu_limit_utilization:
        enabled: true
      k8s.pod.cpu_request_utilization:
        enabled: true
      k8s.pod.memory_limit_utilization:
        enabled: true
      k8s.pod.memory_request_utilization:
        enabled: true

Log output

No response

Additional context

No response

@prabhatsharma added the bug (Something isn't working) and needs triage (New item requiring triage) labels on Dec 23, 2023

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@TylerHelmuth (Member)

@prabhatsharma I believe you're right, thank you for bringing this to our attention.

In https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/kubeletstatsreceiver/internal/kubelet/network.go we use the bytes provided at the root of https://pkg.go.dev/k8s.io/kubelet/pkg/apis/stats/v1alpha1#NetworkStats. If we wanted to record all the interface stats, I believe we'd need to loop through the Interfaces slice.

If we did this, I believe it would add an extra dimension (the interface name) to the datapoints we produce. I believe it would also be a breaking change.
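For illustration, here is a minimal sketch of what that loop could look like against the stats/v1alpha1 types; recordInterface is a hypothetical callback standing in for whatever datapoint recording the receiver actually does, with the interface name passed as an extra attribute:

package kubelet

import (
	stats "k8s.io/kubelet/pkg/apis/stats/v1alpha1"
)

// recordAllInterfaces is an illustrative helper, not the receiver's real code.
// Instead of reading only the inlined default-interface fields on NetworkStats,
// it walks the Interfaces slice and reports each interface by name.
func recordAllInterfaces(s *stats.NetworkStats, recordInterface func(name string, rxBytes, txBytes uint64)) {
	if s == nil {
		return
	}
	for _, iface := range s.Interfaces {
		var rx, tx uint64
		if iface.RxBytes != nil {
			rx = *iface.RxBytes
		}
		if iface.TxBytes != nil {
			tx = *iface.TxBytes
		}
		recordInterface(iface.Name, rx, tx)
	}
}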

@dmitryax @povilasv @jinja2 what are your thoughts?

@povilasv (Contributor) commented Jan 9, 2024

This issue makes sense; it looks like we are collecting only the default interface's network stats:

// Stats for the default interface, if found
InterfaceStats `json:",inline"` // https://pkg.go.dev/k8s.io/kubelet/pkg/apis/stats/v1alpha1#InterfaceStats

It would make sense to add interface name as an extra dimension.

Regarding the breaking change, we could put it behind a feature flag?
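A rough sketch of how such a gate could be registered with the collector's featuregate package; the gate ID and description below are made up for illustration, not an agreed-upon name:

package kubelet

import "go.opentelemetry.io/collector/featuregate"

// Hypothetical gate: the ID is illustrative only; the real name and stage
// would be settled in the implementing PR.
var emitPerInterfaceMetrics = featuregate.GlobalRegistry().MustRegister(
	"receiver.kubeletstats.emitInterfaceMetrics",
	featuregate.StageAlpha,
	featuregate.WithRegisterDescription("When enabled, k8s.pod.network.* and k8s.node.network.* datapoints are emitted per interface instead of only for the default interface."),
)

// At recording time the receiver would branch on emitPerInterfaceMetrics.IsEnabled().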

@TylerHelmuth (Member)

Definitely a feature flag.

Also, I think we're in luck: the existing metric already defines interface as an attribute, so I think we could do this without breaking the existing metric. The breaking change would only come from new default metrics.

@crobert-1 added the enhancement (New feature or request) and bug (Something isn't working) labels and removed the bug (Something isn't working), needs triage (New item requiring triage), and enhancement (New feature or request) labels on Jan 10, 2024
@jinja2 (Contributor) commented Jan 16, 2024

I believe this issue also impacts the k8s.node.network.* metrics. We are only reporting the default interface for node network stats. The default interface appears to be hardcoded to eth0 in the kubelet, so setups whose interface is not named eth0 might not be seeing any network metrics from the receiver at all. I can look into this further if nobody has started work on it.

I am also wondering whether we need additional logic for pods that run with hostNetwork, since those would have all of the host's network interfaces show up, which can blow up cardinality, and the values might not even make sense because they reflect the host's complete traffic.
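Purely as an illustration of that concern, one option would be to fall back to the default interface for host-network pods. This sketch assumes the receiver can learn hostNetwork from pod metadata, since the kubelet stats API itself doesn't expose it:

package kubelet

import (
	stats "k8s.io/kubelet/pkg/apis/stats/v1alpha1"
)

// interfacesToRecord is an illustrative helper: for pods running with
// hostNetwork it returns only the default (inlined) interface to keep
// cardinality bounded; otherwise it returns every reported interface.
func interfacesToRecord(s *stats.NetworkStats, hostNetwork bool) []stats.InterfaceStats {
	if s == nil {
		return nil
	}
	if hostNetwork {
		return []stats.InterfaceStats{s.InterfaceStats}
	}
	return s.Interfaces
}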


This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.


@ChrsMark (Member)

> I believe this issue also impacts the k8s.node.network.* metrics. We are only reporting the default interface for node network stats. The default interface appears to be hardcoded to eth0 in the kubelet, so setups whose interface is not named eth0 might not be seeing any network metrics from the receiver at all. I can look into this further if nobody has started work on it.
>
> I am also wondering whether we need additional logic for pods that run with hostNetwork, since those would have all of the host's network interfaces show up, which can blow up cardinality, and the values might not even make sense because they reflect the host's complete traffic.

#33993 reports the issue for the k8s.node.network.* metrics. Any specific reason #30626 didn't make it in? Was it just left to go stale, or was there another blocking reason?

@TylerHelmuth (Member)

I don't believe there was any blocking reason. We still want this, but it has been hard to prioritize.

@ChrsMark (Member)

> I don't believe there was any blocking reason. We still want this, but it has been hard to prioritize.

Revived that in #34287. PTAL.
