diff --git a/deployment/helm/balloons/README.md b/deployment/helm/balloons/README.md
new file mode 100644
index 000000000..d8e95dce9
--- /dev/null
+++ b/deployment/helm/balloons/README.md
@@ -0,0 +1,104 @@
+# Balloons Policy Plugin
+
+This chart deploys the balloons Node Resource Interface (NRI) plugin. The
+balloons NRI resource policy plugin implements workload placement into
+“balloons”, which are disjoint CPU pools.
+
+## Prerequisites
+
+- Kubernetes 1.24+
+- Helm 3.0.0+
+- Container runtime:
+  - containerd:
+    - At least [containerd 1.7.0](https://github.com/containerd/containerd/releases/tag/v1.7.0)
+      is required to use the NRI feature.
+    - Enable the NRI feature by following
+      [these](https://github.com/containerd/containerd/blob/main/docs/NRI.md#enabling-nri-support-in-containerd)
+      detailed instructions. Optionally, you can let the Helm chart enable
+      NRI in containerd during installation simply by setting the
+      `nri.patchRuntimeConfig` parameter. For instance,
+
+      ```sh
+      helm install my-balloons nri-plugins/nri-resource-policy-balloons --set nri.patchRuntimeConfig=true --namespace kube-system
+      ```
+
+      Enabling `nri.patchRuntimeConfig` creates an init container that turns
+      on the NRI feature in containerd and only then proceeds with the
+      plugin installation.
+
+  - CRI-O
+    - At least [v1.26.0](https://github.com/cri-o/cri-o/releases/tag/v1.26.0)
+      is required to use the NRI feature.
+    - Enable the NRI feature by following
+      [these](https://github.com/cri-o/cri-o/blob/main/docs/crio.conf.5.md#crionri-table)
+      detailed instructions. Optionally, you can let the Helm chart enable
+      NRI in CRI-O during installation simply by setting the
+      `nri.patchRuntimeConfig` parameter. For instance,
+
+      ```sh
+      helm install my-balloons nri-plugins/nri-resource-policy-balloons --namespace kube-system --set nri.patchRuntimeConfig=true
+      ```
+
+## Installing the Chart
+
+Path to the chart: `nri-resource-policy-balloons`
+
+```sh
+helm repo add nri-plugins https://containers.github.io/nri-plugins
+helm install my-balloons nri-plugins/nri-resource-policy-balloons --namespace kube-system
+```
+
+The command above deploys the balloons NRI plugin on the Kubernetes cluster
+within the `kube-system` namespace with the default configuration. To
+customize the parameters described in the
+[Configuration options](#configuration-options) below, you have two options:
+use the `--set` flag, or create a custom values.yaml file and provide it
+with the `-f` flag. For example:
+
+```sh
+# Install the balloons plugin with custom values provided using the --set option
+helm install my-balloons nri-plugins/nri-resource-policy-balloons --namespace kube-system --set nri.patchRuntimeConfig=true
+```
+
+```sh
+# Install the balloons plugin with custom values specified in a custom values.yaml file
+cat <<EOF > myPath/values.yaml
+nri:
+  patchRuntimeConfig: true
+
+tolerations:
+- key: "node-role.kubernetes.io/control-plane"
+  operator: "Exists"
+  effect: "NoSchedule"
+EOF
+
+helm install my-balloons nri-plugins/nri-resource-policy-balloons --namespace kube-system -f myPath/values.yaml
+```
+
+## Uninstalling the Chart
+
+To uninstall the balloons plugin, run the following command:
+
+```sh
+helm delete my-balloons --namespace kube-system
+```
+
+## Configuration options
+
+The table below presents an overview of the parameters available for users
+to customize with their own values, along with the default values.
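+
+For instance, the `config` parameter listed in the table below holds the
+plugin's configuration data. A minimal values.yaml override of its default
+reserved CPU pool might look like the following sketch (the nesting is
+inferred from the table's default value; pick a CPU value that fits your
+cluster):
+
+```yaml
+config:
+  ReservedResources:
+    cpu: 750m
+```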
+
+| Name | Default | Description |
+| ------------------------ | ------- | ----------- |
+| `image.name` | [ghcr.io/containers/nri-plugins/nri-resource-policy-balloons](https://ghcr.io/containers/nri-plugins/nri-resource-policy-balloons) | container image name |
+| `image.tag` | unstable | container image tag |
+| `image.pullPolicy` | Always | image pull policy |
+| `resources.cpu` | 500m | cpu resources for the Pod |
+| `resources.memory` | 512Mi | memory quota for the Pod |
+| `hostPort` | 8891 | metrics port to expose on the host |
+| `config` | <pre>ReservedResources:<br>  cpu: 750m</pre> | plugin configuration data |
+| `nri.patchRuntimeConfig` | false | enable NRI in containerd or CRI-O |
+| `initImage.name` | [ghcr.io/containers/nri-plugins/config-manager](https://ghcr.io/containers/nri-plugins/config-manager) | init container image name |
+| `initImage.tag` | unstable | init container image tag |
+| `initImage.pullPolicy` | Always | init container image pull policy |
+| `tolerations` | [] | specify taint toleration key, operator and effect |
diff --git a/deployment/helm/memory-qos/README.md b/deployment/helm/memory-qos/README.md
new file mode 100644
index 000000000..38fc2b18a
--- /dev/null
+++ b/deployment/helm/memory-qos/README.md
@@ -0,0 +1,102 @@
+# Memory-QoS Plugin
+
+This chart deploys the memory-qos Node Resource Interface (NRI) plugin. The
+memory-qos NRI plugin adds two methods for controlling cgroups v2 `memory.*`
+parameters: memory QoS classes and direct memory annotations.
+
+## Prerequisites
+
+- Kubernetes 1.24+
+- Helm 3.0.0+
+- Container runtime:
+  - containerd:
+    - At least [containerd 1.7.0](https://github.com/containerd/containerd/releases/tag/v1.7.0)
+      is required to use the NRI feature.
+    - Enable the NRI feature by following
+      [these](https://github.com/containerd/containerd/blob/main/docs/NRI.md#enabling-nri-support-in-containerd)
+      detailed instructions. Optionally, you can let the Helm chart enable
+      NRI in containerd during installation simply by setting the
+      `nri.patchRuntimeConfig` parameter. For instance,
+
+      ```sh
+      helm install my-memory-qos nri-plugins/nri-memory-qos --set nri.patchRuntimeConfig=true --namespace kube-system
+      ```
+
+      Enabling `nri.patchRuntimeConfig` creates an init container that turns
+      on the NRI feature in containerd and only then proceeds with the
+      plugin installation.
+
+  - CRI-O
+    - At least [v1.26.0](https://github.com/cri-o/cri-o/releases/tag/v1.26.0)
+      is required to use the NRI feature.
+    - Enable the NRI feature by following
+      [these](https://github.com/cri-o/cri-o/blob/main/docs/crio.conf.5.md#crionri-table)
+      detailed instructions. Optionally, you can let the Helm chart enable
+      NRI in CRI-O during installation simply by setting the
+      `nri.patchRuntimeConfig` parameter. For instance,
+
+      ```sh
+      helm install my-memory-qos nri-plugins/nri-memory-qos --namespace kube-system --set nri.patchRuntimeConfig=true
+      ```
+
+## Installing the Chart
+
+Path to the chart: `nri-memory-qos`.
+
+```sh
+helm repo add nri-plugins https://containers.github.io/nri-plugins
+helm install my-memory-qos nri-plugins/nri-memory-qos --namespace kube-system
+```
+
+The command above deploys the memory-qos NRI plugin on the Kubernetes cluster
+within the `kube-system` namespace with the default configuration. To
+customize the parameters described in the
+[Configuration options](#configuration-options) below, you have two options:
+use the `--set` flag, or create a custom values.yaml file and provide it
+with the `-f` flag.
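+
+If you want to inspect the chart's full set of default values before
+overriding any of them, Helm can render them for you (assuming the
+`nri-plugins` repository alias added above):
+
+```sh
+helm show values nri-plugins/nri-memory-qos
+```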
+For example:
+
+```sh
+# Install the memory-qos plugin with custom values provided using the --set option
+helm install my-memory-qos nri-plugins/nri-memory-qos --namespace kube-system --set nri.patchRuntimeConfig=true
+```
+
+```sh
+# Install the memory-qos plugin with custom values specified in a custom values.yaml file
+cat <<EOF > myPath/values.yaml
+nri:
+  patchRuntimeConfig: true
+
+tolerations:
+- key: "node-role.kubernetes.io/control-plane"
+  operator: "Exists"
+  effect: "NoSchedule"
+EOF
+
+helm install my-memory-qos nri-plugins/nri-memory-qos --namespace kube-system -f myPath/values.yaml
+```
+
+## Uninstalling the Chart
+
+To uninstall the memory-qos plugin, run the following command:
+
+```sh
+helm delete my-memory-qos --namespace kube-system
+```
+
+## Configuration options
+
+The table below presents an overview of the parameters available for users
+to customize with their own values, along with the default values.
+
+| Name | Default | Description |
+| ------------------------ | ------- | ----------- |
+| `image.name` | [ghcr.io/containers/nri-plugins/nri-memory-qos](https://ghcr.io/containers/nri-plugins/nri-memory-qos) | container image name |
+| `image.tag` | unstable | container image tag |
+| `image.pullPolicy` | Always | image pull policy |
+| `resources.cpu` | 10m | cpu resources for the Pod |
+| `resources.memory` | 100Mi | memory quota for the Pod |
+| `nri.patchRuntimeConfig` | false | enable NRI in containerd or CRI-O |
+| `initImage.name` | [ghcr.io/containers/nri-plugins/config-manager](https://ghcr.io/containers/nri-plugins/config-manager) | init container image name |
+| `initImage.tag` | unstable | init container image tag |
+| `initImage.pullPolicy` | Always | init container image pull policy |
+| `tolerations` | [] | specify taint toleration key, operator and effect |
diff --git a/deployment/helm/memtierd/README.md b/deployment/helm/memtierd/README.md
new file mode 100644
index 000000000..73ec9569b
--- /dev/null
+++ b/deployment/helm/memtierd/README.md
@@ -0,0 +1,102 @@
+# Memtierd Plugin
+
+This chart deploys the memtierd Node Resource Interface (NRI) plugin. The
+memtierd NRI plugin enables managing workloads with Memtierd in Kubernetes.
+
+## Prerequisites
+
+- Kubernetes 1.24+
+- Helm 3.0.0+
+- Container runtime:
+  - containerd:
+    - At least [containerd 1.7.0](https://github.com/containerd/containerd/releases/tag/v1.7.0)
+      is required to use the NRI feature.
+    - Enable the NRI feature by following
+      [these](https://github.com/containerd/containerd/blob/main/docs/NRI.md#enabling-nri-support-in-containerd)
+      detailed instructions. Optionally, you can let the Helm chart enable
+      NRI in containerd during installation simply by setting the
+      `nri.patchRuntimeConfig` parameter. For instance,
+
+      ```sh
+      helm install my-memtierd nri-plugins/nri-memtierd --set nri.patchRuntimeConfig=true --namespace kube-system
+      ```
+
+      Enabling `nri.patchRuntimeConfig` creates an init container that turns
+      on the NRI feature in containerd and only then proceeds with the
+      plugin installation.
+
+  - CRI-O
+    - At least [v1.26.0](https://github.com/cri-o/cri-o/releases/tag/v1.26.0)
+      is required to use the NRI feature.
+    - Enable the NRI feature by following
+      [these](https://github.com/cri-o/cri-o/blob/main/docs/crio.conf.5.md#crionri-table)
+      detailed instructions.
+      Optionally, you can let the Helm chart enable NRI in CRI-O during
+      installation simply by setting the `nri.patchRuntimeConfig` parameter.
+      For instance,
+
+      ```sh
+      helm install my-memtierd nri-plugins/nri-memtierd --namespace kube-system --set nri.patchRuntimeConfig=true
+      ```
+
+## Installing the Chart
+
+Path to the chart: `nri-memtierd`.
+
+```sh
+helm repo add nri-plugins https://containers.github.io/nri-plugins
+helm install my-memtierd nri-plugins/nri-memtierd --namespace kube-system
+```
+
+The command above deploys the memtierd NRI plugin on the Kubernetes cluster
+within the `kube-system` namespace with the default configuration. To
+customize the parameters described in the
+[Configuration options](#configuration-options) below, you have two options:
+use the `--set` flag, or create a custom values.yaml file and provide it
+with the `-f` flag. For example:
+
+```sh
+# Install the memtierd plugin with custom values provided using the --set option
+helm install my-memtierd nri-plugins/nri-memtierd --namespace kube-system --set nri.patchRuntimeConfig=true
+```
+
+```sh
+# Install the memtierd plugin with custom values specified in a custom values.yaml file
+cat <<EOF > myPath/values.yaml
+nri:
+  patchRuntimeConfig: true
+
+tolerations:
+- key: "node-role.kubernetes.io/control-plane"
+  operator: "Exists"
+  effect: "NoSchedule"
+EOF
+
+helm install my-memtierd nri-plugins/nri-memtierd --namespace kube-system -f myPath/values.yaml
+```
+
+## Uninstalling the Chart
+
+To uninstall the memtierd plugin, run the following command:
+
+```sh
+helm delete my-memtierd --namespace kube-system
+```
+
+## Configuration options
+
+The table below presents an overview of the parameters available for users
+to customize with their own values, along with the default values.
+
+| Name | Default | Description |
+| ------------------------ | ------- | ----------- |
+| `image.name` | [ghcr.io/containers/nri-plugins/nri-memtierd](https://ghcr.io/containers/nri-plugins/nri-memtierd) | container image name |
+| `image.tag` | unstable | container image tag |
+| `image.pullPolicy` | Always | image pull policy |
+| `resources.cpu` | 250m | cpu resources for the Pod |
+| `resources.memory` | 100Mi | memory quota for the Pod |
+| `outputDir` | empty string | host directory for memtierd.output files |
+| `nri.patchRuntimeConfig` | false | enable NRI in containerd or CRI-O |
+| `initImage.name` | [ghcr.io/containers/nri-plugins/config-manager](https://ghcr.io/containers/nri-plugins/config-manager) | init container image name |
+| `initImage.tag` | unstable | init container image tag |
+| `initImage.pullPolicy` | Always | init container image pull policy |
+| `tolerations` | [] | specify taint toleration key, operator and effect |
diff --git a/deployment/helm/topology-aware/README.md b/deployment/helm/topology-aware/README.md
new file mode 100644
index 000000000..7acbfc7be
--- /dev/null
+++ b/deployment/helm/topology-aware/README.md
@@ -0,0 +1,105 @@
+# Topology-Aware Policy Plugin
+
+This chart deploys the topology-aware Node Resource Interface (NRI) plugin.
+The topology-aware NRI resource policy plugin applies hardware-aware
+resource allocation policies to the containers running on the system.
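+
+Once the plugin is running, you can observe its placement decisions directly
+from inside a container, for example (the exact CPU and memory lists depend
+on your hardware):
+
+```sh
+kubectl exec <pod> -c <container> -- grep allowed_list: /proc/self/status
+```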
+
+## Prerequisites
+
+- Kubernetes 1.24+
+- Helm 3.0.0+
+- Container runtime:
+  - containerd:
+    - At least [containerd 1.7.0](https://github.com/containerd/containerd/releases/tag/v1.7.0)
+      is required to use the NRI feature.
+    - Enable the NRI feature by following
+      [these](https://github.com/containerd/containerd/blob/main/docs/NRI.md#enabling-nri-support-in-containerd)
+      detailed instructions. Optionally, you can let the Helm chart enable
+      NRI in containerd during installation simply by setting the
+      `nri.patchRuntimeConfig` parameter. For instance,
+
+      ```sh
+      helm install my-topology-aware nri-plugins/nri-resource-policy-topology-aware --set nri.patchRuntimeConfig=true --namespace kube-system
+      ```
+
+      Enabling `nri.patchRuntimeConfig` creates an init container that turns
+      on the NRI feature in containerd and only then proceeds with the
+      plugin installation.
+
+  - CRI-O
+    - At least [v1.26.0](https://github.com/cri-o/cri-o/releases/tag/v1.26.0)
+      is required to use the NRI feature.
+    - Enable the NRI feature by following
+      [these](https://github.com/cri-o/cri-o/blob/main/docs/crio.conf.5.md#crionri-table)
+      detailed instructions. Optionally, you can let the Helm chart enable
+      NRI in CRI-O during installation simply by setting the
+      `nri.patchRuntimeConfig` parameter. For instance,
+
+      ```sh
+      helm install my-topology-aware nri-plugins/nri-resource-policy-topology-aware --namespace kube-system --set nri.patchRuntimeConfig=true
+      ```
+
+## Installing the Chart
+
+Path to the chart: `nri-resource-policy-topology-aware`.
+
+```sh
+helm repo add nri-plugins https://containers.github.io/nri-plugins
+helm install my-topology-aware nri-plugins/nri-resource-policy-topology-aware --namespace kube-system
+```
+
+The command above deploys the topology-aware NRI plugin on the Kubernetes
+cluster within the `kube-system` namespace with the default configuration.
+To customize the parameters described in the
+[Configuration options](#configuration-options) below, you have two options:
+use the `--set` flag, or create a custom values.yaml file and provide it
+with the `-f` flag. For example:
+
+```sh
+# Install the topology-aware plugin with custom values provided using the --set option
+helm install my-topology-aware nri-plugins/nri-resource-policy-topology-aware --namespace kube-system --set nri.patchRuntimeConfig=true
+```
+
+```sh
+# Install the topology-aware plugin with custom values specified in a custom values.yaml file
+cat <<EOF > myPath/values.yaml
+nri:
+  patchRuntimeConfig: true
+
+tolerations:
+- key: "node-role.kubernetes.io/control-plane"
+  operator: "Exists"
+  effect: "NoSchedule"
+EOF
+
+helm install my-topology-aware nri-plugins/nri-resource-policy-topology-aware --namespace kube-system -f myPath/values.yaml
+```
+
+## Uninstalling the Chart
+
+To uninstall the topology-aware plugin, run the following command:
+
+```sh
+helm delete my-topology-aware --namespace kube-system
+```
+
+## Configuration options
+
+The table below presents an overview of the parameters available for users
+to customize with their own values, along with the default values.
+
+| Name | Default | Description |
+| ------------------------ | ------- | ----------- |
+| `image.name` | [ghcr.io/containers/nri-plugins/nri-resource-policy-topology-aware](https://ghcr.io/containers/nri-plugins/nri-resource-policy-topology-aware) | container image name |
+| `image.tag` | unstable | container image tag |
+| `image.pullPolicy` | Always | image pull policy |
+| `resources.cpu` | 500m | cpu resources for the Pod |
+| `resources.memory` | 512Mi | memory quota for the Pod |
+| `hostPort` | 8891 | metrics port to expose on the host |
+| `config` | <pre>ReservedResources:<br>  cpu: 750m</pre> | plugin configuration data |
+| `nri.patchRuntimeConfig` | false | enable NRI in containerd or CRI-O |
+| `initImage.name` | [ghcr.io/containers/nri-plugins/config-manager](https://ghcr.io/containers/nri-plugins/config-manager) | init container image name |
+| `initImage.tag` | unstable | init container image tag |
+| `initImage.pullPolicy` | Always | init container image pull policy |
+| `tolerations` | [] | specify taint toleration key, operator and effect |
diff --git a/docs/conf.py b/docs/conf.py
index 90e631737..b3377a93b 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -127,7 +127,7 @@ def gomod_versions(modules):
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path.
-exclude_patterns = ['_build', '.github', '_work', 'generate', 'README.md', 'TODO.md', 'SECURITY.md', 'CODE-OF-CONDUCT.md', 'docs/releases', 'test/self-hosted-runner/README.md', 'test/e2e/README.md', 'docs/resource-policy/releases', 'docs/resource-policy/README.md','test/statistics-analysis/README.md']
+exclude_patterns = ['_build', '.github', '_work', 'generate', 'README.md', 'TODO.md', 'SECURITY.md', 'CODE-OF-CONDUCT.md', 'docs/releases', 'test/self-hosted-runner/README.md', 'test/e2e/README.md', 'docs/resource-policy/releases', 'docs/resource-policy/README.md', 'test/statistics-analysis/README.md', 'deployment/helm/*/*.md']
 
 # -- Options for HTML output -------------------------------------------------
 
diff --git a/docs/deployment/balloons.md b/docs/deployment/balloons.md
new file mode 100644
index 000000000..89c8d3a70
--- /dev/null
+++ b/docs/deployment/balloons.md
@@ -0,0 +1,2 @@
+```{include} ../../deployment/helm/balloons/README.md
+```
\ No newline at end of file
diff --git a/docs/deployment/index.md b/docs/deployment/index.md
new file mode 100644
index 000000000..c2c0d14ab
--- /dev/null
+++ b/docs/deployment/index.md
@@ -0,0 +1,15 @@
+# Deployment
+
+Currently, Helm is the only supported method for installing the NRI plugins.
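+
+A quick way to list the charts the repository provides (the `nri-plugins`
+repository alias is just a local name; any name works):
+
+```sh
+helm repo add nri-plugins https://containers.github.io/nri-plugins
+helm search repo nri-plugins
+```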
+ +```{toctree} +--- +maxdepth: 2 +caption: Contents +--- +balloons.md +topology-aware.md +memory-qos.md +memtierd.md +``` diff --git a/docs/deployment/memory-qos.md b/docs/deployment/memory-qos.md new file mode 100644 index 000000000..4ed196196 --- /dev/null +++ b/docs/deployment/memory-qos.md @@ -0,0 +1,2 @@ +```{include} ../../deployment/helm/memory-qos/README.md +``` diff --git a/docs/deployment/memtierd.md b/docs/deployment/memtierd.md new file mode 100644 index 000000000..40612361c --- /dev/null +++ b/docs/deployment/memtierd.md @@ -0,0 +1,2 @@ +```{include} ../../deployment/helm/memtierd/README.md +``` diff --git a/docs/deployment/topology-aware.md b/docs/deployment/topology-aware.md new file mode 100644 index 000000000..3ad6dc67d --- /dev/null +++ b/docs/deployment/topology-aware.md @@ -0,0 +1,2 @@ +```{include} ../../deployment/helm/topology-aware/README.md +``` diff --git a/docs/index.md b/docs/index.md index 1aa33fb9a..d6e4a7a36 100644 --- a/docs/index.md +++ b/docs/index.md @@ -8,6 +8,7 @@ caption: Contents introduction.md resource-policy/index.rst memory/index.md +deployment/index.md contributing.md Project GitHub repository ``` diff --git a/docs/introduction.md b/docs/introduction.md index af51c169a..6cf1e2bad 100644 --- a/docs/introduction.md +++ b/docs/introduction.md @@ -1,6 +1,7 @@ # Introduction The NRI plugins is a collection of NRI (Node Resource Interface) based plugins -to manage various aspects of pod and container life cycle. -For example the [resource policy plugins](resource-policy/policy/index.md) can be used to modify -the container resource allocation depending on available system resources. +to manage various aspects of pod and container life cycle. For example the +[resource policy plugins](resource-policy/policy/index.md) can be used to +modify the container resource allocation depending on available system +resources. diff --git a/docs/memory/memory-qos.md b/docs/memory/memory-qos.md index da13faf6e..6d7aa4d83 100644 --- a/docs/memory/memory-qos.md +++ b/docs/memory/memory-qos.md @@ -6,6 +6,7 @@ parameters: memory QoS classes and direct memory annotations. ## Workload configuration There are two configuration methods: + 1. Memory QoS classes: memory parameters are calculated in the same way for all workloads that belong to the same class. 2. Direct workload-specific memory parameters. @@ -48,6 +49,7 @@ Plugin configuration lists memory QoS classes and their parameters that affect calculating actual memory parameters. `classes:` is followed by list of maps with following keys and values: + - `name` (string): name of the memory QoS class, matches `class.memory-qos.nri.io` annotation values. - `swaplimitratio` (from 0.0 to 1.0): minimum ratio of container's @@ -94,7 +96,8 @@ This configuration defines the following. - Containerd v1.7+ - Enable NRI in /etc/containerd/config.toml: - ``` + + ```toml [plugins."io.containerd.nri.v1.nri"] disable = false disable_connections = false diff --git a/docs/memory/memtierd.md b/docs/memory/memtierd.md index a98bc89e3..aa6d06301 100644 --- a/docs/memory/memtierd.md +++ b/docs/memory/memtierd.md @@ -31,6 +31,7 @@ The class of a pod or a container is defined using pod annotations: Plugin configuration lists workload classes and their attributes. `classes:` is followed by list of maps with following keys and values: + - `name` (string): name of the class, matches `class.memtierd.nri.io` annotations. - `allowswap` (`true` or `false`): if `true`, allow OS to swap the @@ -91,7 +92,8 @@ for more configuration options. 
- Containerd v1.7+ - Enable NRI in /etc/containerd/config.toml: - ``` + + ```toml [plugins."io.containerd.nri.v1.nri"] disable = false disable_connections = false @@ -101,7 +103,9 @@ for more configuration options. plugin_request_timeout = "2s" socket_path = "/var/run/nri/nri.sock" ``` + - To run the nri-memtierd plugin on a host, install memtierd on the host. + ```bash GOBIN=/usr/local/bin go install github.com/intel/memtierd/cmd/memtierd@latest ``` diff --git a/docs/resource-policy/README.md b/docs/resource-policy/README.md deleted file mode 100644 index 3d6d28ae4..000000000 --- a/docs/resource-policy/README.md +++ /dev/null @@ -1,172 +0,0 @@ -# NRI Resource Policy for Kubernetes - -NRI resource policy is a NRI plugin that will apply hardware-aware -resource allocation policies to the containers running in the system. - -## NRI Resource Policy Usage - -Compile the available resource policies. Currently there exists -topology-aware and balloons policies. The binaries are created to -build/bin directory. - -``` - $ make -``` - -In order to use the policies in a Kubernetes cluster node, a DaemonSet deployment -file and corresponding container image are created to build/images directory. -You need to have Docker installed in order to build the images. - -``` - $ make images - $ ls build/images - nri-resource-policy-balloons-deployment.yaml - nri-resource-policy-balloons-image-ed6fffe77071.tar - nri-resource-policy-topology-aware-deployment.yaml - nri-resource-policy-topology-aware-image-9797e8de7107.tar -``` - -Only one policy can be running in the cluster node at one time. In this example we -run topology-aware policy in the cluster node. - -You need to copy the deployment file (yaml) and corresponding image file (tar) -to the node: - -``` - $ scp nri-resource-policy-topology-aware-deployment.yaml nri-resource-policy-topology-aware-image-9797e8de7107.tar node: -``` - -NRI needs to be setup in the cluster node: - -``` - # mkdir -p /etc/nri - # echo "disableConnections: false" > /etc/nri/nri.conf - # mkdir -p /opt/nri/plugins -``` - -Note that containerd must have NRI support enabled and NRI is currently only -available in 1.7beta or later containerd release. This is why you must do -some extra steps in order to enable NRI plugin support in containerd. - -This will create a fresh config file and backup the old one if it existed: - -``` - # [ -f /etc/containerd/config.toml ] && cp /etc/containerd/config.toml.backup - # containerd config default > /etc/containerd/config.toml -``` - -Edit the `/etc/containerd/config.toml` file and set `plugins."io.containerd.nri.v1.nri"` -option `disable = true` to `disable = false` and restart containerd. - - -Before deploying NRI resource policy plugin, you need to declare the CRDs it needs. 
-Copy first the CRD YAMLs to the node: - -``` - $ scp deployment/base/crds/noderesourcetopology_crd.yaml node: -``` - -Then log in to the node and create the CRDs: - -``` - $ ssh node - (on the node) $ kubectl apply -f noderesourcetopology_crd.yaml -``` - -You can now deploy NRI resource policy plugin: - -``` - $ ctr -n k8s.io images import nri-resource-policy-topology-aware-image-9797e8de7107.tar - $ kubectl apply -f nri-resource-policy-topology-aware-deployment.yaml -``` - -Verify that the pod is running: - -``` - $ kubectl -n kube-system get pods - NAMESPACE NAME READY STATUS RESTARTS AGE - kube-system nri-resource-policy-nblgl 1/1 Running 0 18m -``` - -To see the resource policy logs: - -``` - $ kubectl -n kube-system logs nri-resource-policy-nblgl -``` - -In order to see how resource policy allocates resources for the topology-aware policy, -you can create a simple pod to see the changes: - -``` - $ cat pod0.yaml -apiVersion: v1 -kind: Pod -metadata: - name: pod0 - labels: - app: pod0 -spec: - containers: - - name: pod0c0 - image: busybox - imagePullPolicy: IfNotPresent - command: - - sh - - -c - - echo pod0c0 $(sleep inf) - resources: - requests: - cpu: 750m - memory: '100M' - limits: - cpu: 750m - memory: '100M' - - name: pod0c1 - image: busybox - imagePullPolicy: IfNotPresent - command: - - sh - - -c - - echo pod0c0 $(sleep inf) - resources: - requests: - cpu: 750m - memory: '100M' - limits: - cpu: 750m - memory: '100M' - terminationGracePeriodSeconds: 1 - - $ kubectl apply -f pod0.yaml -``` - -Then if you *have already* deployed nri-resource-policy, the resources are allocated in isolation like this: - -``` - $ kubectl exec pod0 -c pod0c0 -- grep allowed_list: /proc/self/status - Cpus_allowed_list: 8 - Mems_allowed_list: 2 - - $ kubectl exec pod0 -c pod0c1 -- grep allowed_list: /proc/self/status - Cpus_allowed_list: 12 - Mems_allowed_list: 3 -``` - -If you *have not* deployed yet nri-resource-policy, the containers are allocated to same CPUs and memory: - -``` - $ kubectl exec pod0 -c pod0c0 -- grep allowed_list: /proc/self/status - Cpus_allowed_list: 0-15 - Mems_allowed_list: 0-3 - - $ kubectl exec pod0 -c pod0c1 -- grep allowed_list: /proc/self/status - Cpus_allowed_list: 0-15 - Mems_allowed_list: 0-3 -``` - -You can also check the difference in resource allocation using an alternative sequence -of steps. Remove the simple test pod, remove the nri-resource-policy deployment, re-create the -simple test pod, check the resources, re-create the nri-resource-policy deployment, then check -the resources and compare to the previous. You should see the resources reassigned so -that the containers in the pod are isolated from each other into different NUMA nodes -if your HW setup makes this possible. diff --git a/docs/resource-policy/developers-guide/architecture.md b/docs/resource-policy/developers-guide/architecture.md index 006dcdce6..7d8b2d792 100644 --- a/docs/resource-policy/developers-guide/architecture.md +++ b/docs/resource-policy/developers-guide/architecture.md @@ -41,11 +41,12 @@ by NRI-RP with the Kubernetes Control Plane go through the node agent with the node agent performing any direct interactions on behalf of NRI-RP. 
The agent interface implements the following functionality: - - push updated external configuration data to NRI-RP - - updating resource capacity of the node - - getting, setting, or removing labels on the node - - getting, setting, or removing annotations on the node - - getting, setting, or removing taints on the node + +- push updated external configuration data to NRI-RP +- updating resource capacity of the node +- getting, setting, or removing labels on the node +- getting, setting, or removing annotations on the node +- getting, setting, or removing taints on the node The config interface is defined and has its gRPC server running in NRI-RP. The agent acts as a gRPC client for this interface. The low-level @@ -55,9 +56,9 @@ NRI-RP acts as a gRPC client for the low-level plumbing interface. Additionally, the stock node agent that comes with NRI-RP implements schemes for: - - configuration management for all NRI-RP instances - - management of dynamic adjustments to container resource assignments +- configuration management for all NRI-RP instances +- management of dynamic adjustments to container resource assignments ### [Resource Manager](tree:/pkg/resmgr/) @@ -81,35 +82,37 @@ pipeline; hand it off for logging, then relay it to the server and the corresponding response back to the client. B. If the request needs to be intercepted for policying, do the following: - 1. Lock the processing pipeline serialization lock. - 2. Look up/create cache objects (pod/container) for the request. - 3. If the request has no resource allocation consequences, do proxying - (step 6). - 4. Otherwise, invoke the policy layer for resource allocation: - - Pass it on to the configured active policy, which will - - Allocate resources for the container. - - Update the assignments for the container in the cache. - - Update any other containers affected by the allocation in the cache. - 5. Invoke the controller layer for post-policy processing, which will: - - Collect controllers with pending changes in their domain of control - - for each invoke the post-policy processing function corresponding to - the request. - - Clear pending markers for the controllers. - 6. Proxy the request: - - Relay the request to the server. - - Send update requests for any additional affected containers. - - Update the cache if/as necessary based on the response. - - Relay the response back to the client. - 7. Release the processing pipeline serialization lock. + +1. Lock the processing pipeline serialization lock. +2. Look up/create cache objects (pod/container) for the request. +3. If the request has no resource allocation consequences, do proxying + (step 6). +4. Otherwise, invoke the policy layer for resource allocation: + - Pass it on to the configured active policy, which will + - Allocate resources for the container. + - Update the assignments for the container in the cache. + - Update any other containers affected by the allocation in the cache. +5. Invoke the controller layer for post-policy processing, which will: + - Collect controllers with pending changes in their domain of control + - for each invoke the post-policy processing function corresponding to + the request. + - Clear pending markers for the controllers. +6. Proxy the request: + - Relay the request to the server. + - Send update requests for any additional affected containers. + - Update the cache if/as necessary based on the response. + - Relay the response back to the client. +7. Release the processing pipeline serialization lock. 
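+
+As a rough illustration of steps 1-7 above, here is a minimal Go sketch of
+such a serialized intercept path. This is not the actual NRI-RP code; all
+names and types are invented for illustration:
+
+```go
+// Minimal sketch of the serialized request pipeline described in steps 1-7.
+package main
+
+import (
+	"fmt"
+	"sync"
+)
+
+type request struct {
+	container string
+	allocates bool // does the request have resource allocation consequences?
+}
+
+type pipeline struct {
+	mu    sync.Mutex        // processing pipeline serialization lock
+	cache map[string]string // container -> assigned resources
+}
+
+func (p *pipeline) handle(r request) {
+	p.mu.Lock()         // 1. lock the pipeline
+	defer p.mu.Unlock() // 7. release the lock on return
+
+	if _, ok := p.cache[r.container]; !ok {
+		p.cache[r.container] = "" // 2. look up/create cache objects
+	}
+	if r.allocates {
+		// 4. policy layer: allocate and record assignments in the cache
+		p.cache[r.container] = "cpuset=0-1"
+		// 5. controller layer: post-policy processing of pending changes
+		p.enforce(r.container)
+	}
+	// 3./6. proxy the request to the server and relay the response back
+	p.proxy(r)
+}
+
+func (p *pipeline) enforce(c string) { fmt.Println("enforce:", c, p.cache[c]) }
+func (p *pipeline) proxy(r request)  { fmt.Println("proxy:", r.container) }
+
+func main() {
+	p := &pipeline{cache: map[string]string{}}
+	p.handle(request{container: "pod0c0", allocates: true})
+}
+```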
The high-level control flow of the event processing pipeline is one of the following, based on the event type: - - For policy-specific events: - 1. Engage the processing pipeline lock. - 2. Call policy event handler. - 3. Invoke the controller layer for post-policy processing (same as step 5 for requests). - 4. Release the pipeline lock. +- For policy-specific events: + 1. Engage the processing pipeline lock. + 2. Call policy event handler. + 3. Invoke the controller layer for post-policy processing (same as step 5 for + requests). + 4. Release the pipeline lock. ### [Cache](tree:/pkg/resmgr/cache/) @@ -144,14 +147,12 @@ managers event loop. This causes a callback from the resource manager to the policy's event handler with the injected event as an argument and with the cache properly locked. - ### [Generic Policy Layer](blob:/pkg/resmgr/policy/policy.go) The generic policy layer defines the abstract interface the rest of NRI-RP uses to interact with policy implementations and takes care of the details of activating and dispatching calls through to the configured active policy. - ### [Generic Resource Controller Layer](blob:/pkg/resmgr/control/control.go) The generic resource controller layer defines the abstract interface the rest @@ -159,7 +160,6 @@ of NRI-RP uses to interact with resource controller implementations and takes care of the details of dispatching calls to the controller implementations for post-policy enforcment of decisions. - ### [Metrics Collector](tree:/pkg/metrics/) The metrics collector gathers a set of runtime metrics about the containers @@ -168,7 +168,6 @@ collected data to determine how optimal the current assignment of container resources is and to attempt a rebalancing/reallocation if it is deemed both possible and necessary. - ### [Policy Implementations](tree:/cmd/plugins) #### [Topology Aware](tree:/cmd/plugins/topology-aware/) diff --git a/docs/resource-policy/developers-guide/e2e-test.md b/docs/resource-policy/developers-guide/e2e-test.md index 06adb92f5..1d18561ed 100644 --- a/docs/resource-policy/developers-guide/e2e-test.md +++ b/docs/resource-policy/developers-guide/e2e-test.md @@ -3,6 +3,7 @@ ## Prerequisites Install: + - `docker` - `vagrant` @@ -10,14 +11,14 @@ Install: Run policy tests: -``` +```bash cd test/e2e [VAR=VALUE...] ./run_tests.sh policies.test-suite ``` Run tests only on certain policy, topology, or only selected test: -``` +```bash cd test/e2e [VAR=VALUE...] ./run_tests.sh policies.test-suite[/POLICY[/TOPOLOGY[/testNN-*]]] ``` @@ -65,7 +66,8 @@ configuration. The `topology` variable is a JSON array of objects. Each object defines one or more NUMA nodes. Keys in objects: -``` + +```text "mem" mem (RAM) size on each NUMA node in this group. The default is "0G". "nvmem" nvmem (non-volatile RAM) size on each NUMA node @@ -82,12 +84,12 @@ defines one or more NUMA nodes. Keys in objects: The default is 1. ``` - Example: Run the test in a VM with two NUMA nodes. There are 4 CPUs (two cores, two threads per core by default) and 4G RAM in each node -``` + +```bash e2e$ vm_name=my2x4 topology='[{"mem":"4G","cores":2,"nodes":2}]' ./run.sh ``` @@ -97,7 +99,7 @@ two NUMA nodes, each node containing 2 CPU cores, each core containing two threads. And with a NUMA node with 16G of non-volatile memory (NVRAM) but no CPUs. 
-``` +```bash e2e$ vm_name=mynvram topology='[{"mem":"4G","cores":2,"nodes":2,"dies":2,"packages":2},{"nvmem":"16G"}]' ./run.sh ``` diff --git a/docs/resource-policy/developers-guide/unit-test.md b/docs/resource-policy/developers-guide/unit-test.md index 752a11f23..d6f594628 100644 --- a/docs/resource-policy/developers-guide/unit-test.md +++ b/docs/resource-policy/developers-guide/unit-test.md @@ -1,7 +1,8 @@ # Unit tests Run unit tests with -``` + +```bash make test ``` diff --git a/docs/resource-policy/index.md b/docs/resource-policy/index.md index cff336824..bf8d5f4ea 100644 --- a/docs/resource-policy/index.md +++ b/docs/resource-policy/index.md @@ -6,7 +6,6 @@ maxdepth: 2 caption: Contents --- introduction.md -installation.md setup.md configuration.md policy/index.md diff --git a/docs/resource-policy/installation.md b/docs/resource-policy/installation.md deleted file mode 100644 index b08da0738..000000000 --- a/docs/resource-policy/installation.md +++ /dev/null @@ -1,246 +0,0 @@ -# Installation - -This repository hosts a collection of plugins of various types, one of which is the resource -policy plugins. In this example, we will demonstrate the installation process for the topology-aware -plugin, which falls under the resource policy type. The installation methods outlined -here can be applied to any other plugin hosted in this repository, regardless of its type. - -Currently, there are two installation methods available. - -1. [Helm](#installing-the-helm-chart) -2. [Manual](#manual-installation) - -Regardless of the chosen installation method, the NRI plugin installation includes the -following components: DaemonSet, ConfigMap, CustomResourceDefinition, and RBAC-related objects. - -## Prerequisites - -- Container runtime: - - containerD: - - At least [containerd 1.7.0](https://github.com/containerd/containerd/releases/tag/v1.7.0) - release version to use the NRI feature. - - - Enable NRI feature by following [these](https://github.com/containerd/containerd/blob/main/docs/NRI.md#enabling-nri-support-in-containerd) - detailed instructions. You can optionally enable the NRI in containerd using the Helm chart - during the chart installation simply by setting the `nri.patchRuntimeConfig` parameter. - For instance, - - ```sh - helm install topology-aware nri-plugins/nri-resource-policy-topology-aware --namespace kube-system --set nri.patchRuntimeConfig=true - ``` - - Enabling `nri.patchRuntimeConfig` creates an init container to turn on - NRI feature in containerd and only after that proceed the plugin installation. - - - CRI-O - - At least [v1.26.0](https://github.com/cri-o/cri-o/releases/tag/v1.26.0) release version to - use the NRI feature - - Enable NRI feature by following [these](https://github.com/cri-o/cri-o/blob/main/docs/crio.conf.5.md#crionri-table) detailed instructions. - You can optionally enable the NRI in CRI-O using the Helm chart - during the chart installation simply by setting the `nri.patchRuntimeConfig` parameter. - For instance, - - ```sh - helm install topology-aware nri-plugins/nri-resource-policy-topology-aware --namespace kube-system --set nri.patchRuntimeConfig=true - ``` - -- Kubernetes 1.24+ -- Helm 3.0.0+ - -## Installing the Helm Chart - -1. Add the nri-plugins charts repository so that Helm install can find the actual charts. - - ```sh - helm repo add nri-plugins https://containers.github.io/nri-plugins - ``` - -1. List chart repositories to ensure that nri-plugins repo is added. - - ```sh - helm repo list - ``` - -1. Install the plugin. 
Replace release version with the desired version. If you wish to - provide custom values to the Helm chart, refer to the [table](#helm-parameters) below, - which describes the available parameters that can be modified before installation. - Parameters can be specified either using the --set option or through the -f flag along - with the custom values.yaml file. It's important to note that specifying the namespace - (using `--namespace` or `-n`) is crucial when installing the Helm chart. If no namespace - is specified, the manifests will be installed in the default namespace. - - ```sh - # Install the topology-aware plugin with default values - helm install topology-aware nri-plugins/nri-resource-policy-topology-aware --namespace kube-system - - # Install the topology-aware plugin with custom values provided using the --set option - helm install topology-aware nri-plugins/nri-resource-policy-topology-aware --namespace kube-system --set nri.patchRuntimeConfig=true - - # Install the topology-aware plugin with custom values specified in a custom values.yaml file - cat < myPath/values.yaml - nri: - patchRuntimeConfig: true - - tolerations: - - key: "node-role.kubernetes.io/control-plane" - operator: "Exists" - effect: "NoSchedule" - EOF - - helm install topology-aware nri-plugins/nri-resource-policy-topology-aware --namespace kube-system -f myPath/values.yaml - ``` - - The helm repository is named `nri-plugins`, and in step 1, you have the - flexibility to choose any name when adding it. However, it's important to - note that `nri-resource-policy-topology-aware`, which serves as the path - to the chart, must accurately reflect the actual name of the chart. You - can find the path to each chart in the [helm parameters table](#helm-parameters). - - -1. Verify the status of the daemonset to ensure that the plugin is running successfully - - ```bash - kubectl get daemonset -n kube-system nri-resource-policy-topology-aware - - NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE - nri-resource-policy-topology-aware 1 1 0 1 0 kubernetes.io/os=linux 4m33s - ``` - -That's it! You have now installed the topology-aware NRI resource policy plugin using Helm. - -## Uninstalling the Chart - -To uninstall plugin chart just deleting it with the release name is enough: - -```bash -helm uninstall topology-aware --namespace kube-system -``` - -Note: this removes DaemonSet, ConfigMap, CustomResourceDefinition, and RBAC-related objects associated with the chart. - -### Helm parameters - -The tables below present an overview of the parameters available for users to customize with their own values, -along with the default values, for the Topology-aware and Balloons plugins Helm charts. - -#### Topology-aware - -Path to the chart: `nri-resource-policy-topology-aware` - -| Name | Default | Description | -| ------------------ | ----------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------- | -| `image.name` | [ghcr.io/containers/nri-plugins/nri-resource-policy-topology-aware](ghcr.io/containers/nri-plugins/nri-resource-policy-topology-aware) | container image name | -| `image.tag` | unstable | container image tag | -| `image.pullPolicy` | Always | image pull policy | -| `resources.cpu` | 500m | cpu resources for the Pod | -| `resources.memory` | 512Mi | memory qouta for the Pod | -| `hostPort` | 8891 | metrics port to expose on the host | -| `config` |
ReservedResources:
cpu: 750m
| plugin configuration data | -| `nri.patchRuntimeConfig` | false | enable NRI in containerd or CRI-O | -| `initImage.name` | [ghcr.io/containers/nri-plugins/config-manager](ghcr.io/containers/nri-plugins/config-manager) | init container image name | -| `initImage.tag` | unstable | init container image tag | -| `initImage.pullPolicy` | Always | init container image pull policy | -| `tolerations` | [] | specify taint toleration key, operator and effect | - -#### Balloons - -Path to the chart: `nri-resource-policy-balloons` - -| Name | Default | Description | -| ------------------ | ----------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------- | -| `image.name` | [ghcr.io/containers/nri-plugins/nri-resource-policy-balloons](ghcr.io/containers/nri-plugins/nri-resource-policy-balloons) | container image name | -| `image.tag` | unstable | container image tag | -| `image.pullPolicy` | Always | image pull policy | -| `resources.cpu` | 500m | cpu resources for the Pod | -| `resources.memory` | 512Mi | memory qouta for the Pod | -| `hostPort` | 8891 | metrics port to expose on the host | -| `config` |
ReservedResources:
cpu: 750m
| plugin configuration data | -| `nri.patchRuntimeConfig` | false | enable NRI in containerd or CRI-O | -| `initImage.name` | [ghcr.io/containers/nri-plugins/config-manager](ghcr.io/containers/nri-plugins/config-manager) | init container image name | -| `initImage.tag` | unstable | init container image tag | -| `initImage.pullPolicy` | Always | init container image pull policy | -| `tolerations` | [] | specify taint toleration key, operator and effect | - -#### Memtierd - -Path to the chart: `nri-memtierd` - -| Name | Default | Description | -| ------------------ | ----------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------- | -| `image.name` | [ghcr.io/containers/nri-plugins/nri-memtierd](ghcr.io/containers/nri-plugins/nri-memtierd) | container image name | -| `image.tag` | unstable | container image tag | -| `image.pullPolicy` | Always | image pull policy | -| `resources.cpu` | 250m | cpu resources for the Pod | -| `resources.memory` | 100Mi | memory qouta for the | -| `outputDir` | empty string | host directory for memtierd.output files | -| `nri.patchRuntimeConfig` | false | enable NRI in containerd or CRI-O | -| `initImage.name` | [ghcr.io/containers/nri-plugins/config-manager](ghcr.io/containers/nri-plugins/config-manager) | init container image name | -| `initImage.tag` | unstable | init container image tag | -| `initImage.pullPolicy` | Always | init container image pull policy | -| `tolerations` | [] | specify taint toleration key, operator and effect | - - -#### Memory-qos - -Path to the chart: `nri-memory-qos` - -| Name | Default | Description | -| ------------------ | ----------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------- | -| `image.name` | [ghcr.io/containers/nri-plugins/nri-memory-qos](ghcr.io/containers/nri-plugins/nri-memory-qos) | container image name | -| `image.tag` | unstable | container image tag | -| `image.pullPolicy` | Always | image pull policy | -| `resources.cpu` | 10m | cpu resources for the Pod | -| `resources.memory` | 100Mi | memory qouta for the | -| `nri.patchRuntimeConfig` | false | enable NRI in containerd or CRI-O | -| `initImage.name` | [ghcr.io/containers/nri-plugins/config-manager](ghcr.io/containers/nri-plugins/config-manager) | init container image name | -| `initImage.tag` | unstable | init container image tag | -| `initImage.pullPolicy` | Always | init container image pull policy | -| `tolerations` | [] | specify taint toleration key, operator and effect | - - -## Manual installation - -For the manual installation we will be using templating tool to generate Kubernetes YAML manifests. -1. Clone the project to your local machine - ```sh - git clone https://github.com/containers/nri-plugins.git - ``` - -1. Navigate to the project directory - ```sh - cd nri-plugins - ``` - -1. If there are any specific configuration values you need to modify, navigate to the plugins - [directory](https://github.com/containers/nri-plugins/tree/main/deployment/overlays) containing - the Kustomization file and update the desired configuration - values according to your environment in the Kustomization file. - -1. Use kustomize to generate the Kubernetes manifests for the desired plugin and apply the generated - manifests to your Kubernetes cluster using kubectl. 
- - ```sh - kustomize build deployment/overlays/topology-aware/ | kubectl apply -f - - ``` - -1. Verify the status of the DaemonSet to ensure that the plugin is running successfully - - ```bash - kubectl get daemonset -n kube-system nri-resource-policy - - NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE - nri-resource-policy 1 1 0 1 0 kubernetes.io/os=linux 4m33s - ``` - -That's it! You have now installed the topology-aware NRI resource policy plugin using kutomize. - -## Manual uninstallation - -To uninstall plugin manifests you can run the following command: - -```sh -kustomize build deployment/overlays/topology-aware/ | kubectl delete -f - -``` - -Note: this removes DaemonSet, ConfigMap, CustomResourceDefinition, and RBAC-related objects associated -with the chart. diff --git a/docs/resource-policy/policy/balloons.md b/docs/resource-policy/policy/balloons.md index f6a9c80da..eae9263f7 100644 --- a/docs/resource-policy/policy/balloons.md +++ b/docs/resource-policy/policy/balloons.md @@ -35,11 +35,11 @@ min and max frequencies on CPU cores and uncore. 7. Next the policy decides which balloon of the decided type will run the container. Options are: - - an existing balloon that already has enough CPUs to run its - current and new containers - - an existing balloon that can be inflated to fit its current and - new containers - - new balloon. + - an existing balloon that already has enough CPUs to run its + current and new containers + - an existing balloon that can be inflated to fit its current and + new containers + - new balloon. 9. When a CPU is added to a balloon or removed from it, the CPU is reconfigured based on balloon's CPU class attributes, or idle CPU @@ -48,7 +48,7 @@ min and max frequencies on CPU cores and uncore. ## Deployment Deploy nri-resource-policy-balloons on each node as you would for any -other policy. See [installation](../installation.md) for more details. +other policy. See [deployment](../../deployment/index.md) for more details. ## Configuration @@ -83,6 +83,16 @@ Balloons policy parameters: pack new balloons tightly into the same NUMAs/dies/packages. This helps keeping large portions of hardware idle and entering into deep power saving states. +- `PreferSpreadOnPhysicalCores` prefers allocating logical CPUs + (possibly hyperthreads) for a balloon from separate physical CPU + cores. This prevents workloads in the balloon from interfering with + themselves as they do not compete on the resources of the same CPU + cores. On the other hand, it allows more interference between + workloads in different balloons. The default is `false`: balloons + are packed tightly to a minimum number of physical CPU cores. The + value set here is the default for all balloon types, but it can be + overridden with the balloon type specific setting with the same + name. - `BalloonTypes` is a list of balloon type definitions. Each type can be configured with the following parameters: - `Name` of the balloon type. This is used in pod annotations to @@ -98,8 +108,8 @@ Balloons policy parameters: is allowed to co-exist. The default is 0: creating new balloons is not limited by the number of existing balloons. - `MaxCPUs` specifies the maximum number of CPUs in any balloon of - this type. Balloons will not be inflated larger than this. 0 means - unlimited. + this type. Balloons will not be inflated larger than this. 0 means + unlimited. - `MinCPUs` specifies the minimum number of CPUs in any balloon of this type. 
When a balloon is created or deflated, it will always have at least this many CPUs, even if containers in the balloon @@ -111,10 +121,10 @@ Balloons policy parameters: is `false`: prefer placing containers of the same pod to the same balloon(s). - `PreferPerNamespaceBalloon`: if `true`, containers in the same - namespace will be placed in the same balloon(s). On the other - hand, containers in different namespaces are preferrably placed in - different balloons. The default is `false`: namespace has no - effect on choosing the balloon of this type. + namespace will be placed in the same balloon(s). On the other + hand, containers in different namespaces are preferrably placed in + different balloons. The default is `false`: namespace has no + effect on choosing the balloon of this type. - `PreferNewBalloons`: if `true`, prefer creating new balloons over placing containers to existing balloons. This results in preferring exclusive CPUs, as long as there are enough free @@ -133,6 +143,8 @@ Balloons policy parameters: - `numa`: ...in the same numa node(s) as the balloon. - `core`: ...allowed to use idle CPU threads in the same cores with the balloon. + - `PreferSpreadOnPhysicalCores` overrides the policy level option + with the same name in the scope of this balloon type. - `AllocatorPriority` (0: High, 1: Normal, 2: Low, 3: None). CPU allocator parameter, used when creating new or resizing existing balloons. If there are balloon types with pre-created balloons @@ -140,6 +152,7 @@ Balloons policy parameters: `AllocatorPriority` are created first. Related configuration parameters: + - `policy.ReservedResources.CPU` specifies the (number of) CPUs in the special `reserved` balloon. By default all containers in the `kube-system` namespace are assigned to the reserved balloon. @@ -149,6 +162,7 @@ Related configuration parameters: ### Example Example configuration that runs all pods in balloons of 1-4 CPUs. + ```yaml policy: Active: balloons diff --git a/docs/resource-policy/policy/index.md b/docs/resource-policy/policy/index.md index 9017a1a6d..4abd559a4 100644 --- a/docs/resource-policy/policy/index.md +++ b/docs/resource-policy/policy/index.md @@ -2,11 +2,12 @@ Currently there are two resource policies: -The Topology Aware resource policy provides a nearly zero configuration resource -policy that allocates resources evenly in order to avoid the "noisy neighbor" problem. +The Topology Aware resource policy provides a nearly zero configuration +resource policy that allocates resources evenly in order to avoid the "noisy +neighbor" problem. -The Balloons resource policy allows user to allocate workloads to resources in a more -user controlled way. +The Balloons resource policy allows user to allocate workloads to resources in +a more user controlled way. ```{toctree} --- diff --git a/docs/resource-policy/policy/topology-aware.md b/docs/resource-policy/policy/topology-aware.md index 0404a34f4..273203bd8 100644 --- a/docs/resource-policy/policy/topology-aware.md +++ b/docs/resource-policy/policy/topology-aware.md @@ -60,7 +60,7 @@ data transfer between CPU cores. Another property of this setup is that the resource sets of sibling pools at the same depth in the tree are disjoint while the resource sets of descendant pools along the same path in the tree partially overlap, with the intersection -decreasing as the the distance between pools increases. This makes it easy to +decreasing as the the distance between pools increases. This makes it easy to isolate workloads from each other. 
As long as workloads are assigned to pools which has no other common ancestor than the root, the resources of these workloads should be as well isolated from each other as possible on the given @@ -70,49 +70,49 @@ With such an arrangement, this policy should handle topology-aware alignment of resources without any special or extra configuration. When allocating resources, the policy - - filters out all pools with insufficient free capacity - - runs a scoring algorithm for the remaining ones - - picks the one with the best score - - assigns resources to the workload from there +- filters out all pools with insufficient free capacity +- runs a scoring algorithm for the remaining ones +- picks the one with the best score +- assigns resources to the workload from there Although the details of the scoring algorithm are subject to change as the implementation evolves, its basic principles are roughly - - prefer pools lower in the tree, IOW stricter alignment and lower latency - - prefer idle pools over busy ones, IOW more remaining free capacity and - fewer workloads - - prefer pools with better overall device alignment +- prefer pools lower in the tree, IOW stricter alignment and lower latency +- prefer idle pools over busy ones, IOW more remaining free capacity and + fewer workloads +- prefer pools with better overall device alignment ## Features The `topology-aware` policy has the following features: - - topologically aligned allocation of CPU and memory - * assign CPU and memory to workloads with tightest available alignment - - aligned allocation of devices - * pick pool for workload based on locality of devices already assigned - - shared allocation of CPU cores - * assign workload to shared subset of pool CPUs - - exclusive allocation of CPU cores - * dynamically slice off CPU cores from shared subset and assign to workload - - mixed allocation of CPU cores - * assign both exclusive and shared CPU cores to workload - - discovering and using kernel-isolated CPU cores (['isolcpus'](https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html#cpu-lists)) - * use kernel-isolated CPU cores for exclusively assigned CPU cores - - exposing assigned resources to workloads - - notifying workloads about changes in resource assignment - - dynamic relaxation of memory alignment to prevent OOM - * dynamically widen workload memory set to avoid pool/workload OOM - - multi-tier memory allocation - * assign workloads to memory zones of their preferred type - * the policy knows about three kinds of memory: - - DRAM is regular system main memory - - PMEM is large-capacity memory, such as - [Intel® Optane™ memory](https://www.intel.com/content/www/us/en/products/memory-storage/optane-dc-persistent-memory.html) - - [HBM](https://en.wikipedia.org/wiki/High_Bandwidth_Memory) is high speed memory, - typically found on some special-purpose computing systems - - cold start - * pin workload exclusively to PMEM for an initial warm-up period +- topologically aligned allocation of CPU and memory + - assign CPU and memory to workloads with tightest available alignment +- aligned allocation of devices + - pick pool for workload based on locality of devices already assigned +- shared allocation of CPU cores + - assign workload to shared subset of pool CPUs +- exclusive allocation of CPU cores + - dynamically slice off CPU cores from shared subset and assign to workload +- mixed allocation of CPU cores + - assign both exclusive and shared CPU cores to workload +- discovering and using kernel-isolated CPU 
 
 ## Features
 
 The `topology-aware` policy has the following features:
 
-  - topologically aligned allocation of CPU and memory
-    * assign CPU and memory to workloads with tightest available alignment
-  - aligned allocation of devices
-    * pick pool for workload based on locality of devices already assigned
-  - shared allocation of CPU cores
-    * assign workload to shared subset of pool CPUs
-  - exclusive allocation of CPU cores
-    * dynamically slice off CPU cores from shared subset and assign to workload
-  - mixed allocation of CPU cores
-    * assign both exclusive and shared CPU cores to workload
-  - discovering and using kernel-isolated CPU cores (['isolcpus'](https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html#cpu-lists))
-    * use kernel-isolated CPU cores for exclusively assigned CPU cores
-  - exposing assigned resources to workloads
-  - notifying workloads about changes in resource assignment
-  - dynamic relaxation of memory alignment to prevent OOM
-    * dynamically widen workload memory set to avoid pool/workload OOM
-  - multi-tier memory allocation
-    * assign workloads to memory zones of their preferred type
-    * the policy knows about three kinds of memory:
-      - DRAM is regular system main memory
-      - PMEM is large-capacity memory, such as
-        [Intel® Optane™ memory](https://www.intel.com/content/www/us/en/products/memory-storage/optane-dc-persistent-memory.html)
-      - [HBM](https://en.wikipedia.org/wiki/High_Bandwidth_Memory) is high speed memory,
-        typically found on some special-purpose computing systems
-  - cold start
-    * pin workload exclusively to PMEM for an initial warm-up period
+- topologically aligned allocation of CPU and memory
+  - assign CPU and memory to workloads with tightest available alignment
+- aligned allocation of devices
+  - pick pool for workload based on locality of devices already assigned
+- shared allocation of CPU cores
+  - assign workload to shared subset of pool CPUs
+- exclusive allocation of CPU cores
+  - dynamically slice off CPU cores from shared subset and assign to workload
+- mixed allocation of CPU cores
+  - assign both exclusive and shared CPU cores to workload
+- discovering and using kernel-isolated CPU
+  cores (['isolcpus'](https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html#cpu-lists))
+  - use kernel-isolated CPU cores for exclusively assigned CPU cores
+- exposing assigned resources to workloads
+- notifying workloads about changes in resource assignment
+- dynamic relaxation of memory alignment to prevent OOM
+  - dynamically widen workload memory set to avoid pool/workload OOM
+- multi-tier memory allocation
+  - assign workloads to memory zones of their preferred type
+  - the policy knows about three kinds of memory:
+    - DRAM is regular system main memory
+    - PMEM is large-capacity memory, such as
+      [Intel® Optane™ memory](https://www.intel.com/content/www/us/en/products/memory-storage/optane-dc-persistent-memory.html)
+    - [HBM](https://en.wikipedia.org/wiki/High_Bandwidth_Memory) is high
+      speed memory, typically found on some special-purpose computing systems
+- cold start
+  - pin workload exclusively to PMEM for an initial warm-up period
 
 ## Activating the Policy
 
@@ -134,22 +134,25 @@ behavior. These options can be supplied as part of the
 or in a fallback or forced configuration file. These configuration
 options are
 
-  - `PinCPU`
-    * whether to pin workloads to assigned pool CPU sets
-  - `PinMemory`
-    * whether to pin workloads to assigned pool memory zones
-  - `PreferIsolatedCPUs`
-    * whether isolated CPUs are preferred by default for workloads that are
-      eligible for exclusive CPU allocation
-  - `PreferSharedCPUs`
-    * whether shared allocation is preferred by default for workloads that
-      would be otherwise eligible for exclusive CPU allocation
-  - `ReservedPoolNamespaces`
-    * list of extra namespaces (or glob patters) that will be allocated to reserved CPUs
-  - `ColocatePods`
-    * whether try to allocate containers in a pod to the same or close by topology pools
-  - `ColocateNamespaces`
-    * whether try to allocate containers in a namespace to the same or close by topology pools
+- `PinCPU`
+  - whether to pin workloads to assigned pool CPU sets
+- `PinMemory`
+  - whether to pin workloads to assigned pool memory zones
+- `PreferIsolatedCPUs`
+  - whether isolated CPUs are preferred by default for workloads that are
+    eligible for exclusive CPU allocation
+- `PreferSharedCPUs`
+  - whether shared allocation is preferred by default for workloads that
+    would be otherwise eligible for exclusive CPU allocation
+- `ReservedPoolNamespaces`
+  - list of extra namespaces (or glob patterns) that will be allocated to
+    reserved CPUs
+- `ColocatePods`
+  - whether to try to allocate containers in a pod to the same or close-by
+    topology pools
+- `ColocateNamespaces`
+  - whether to try to allocate containers in a namespace to the same or
+    close-by topology pools
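+
+For example, a configuration fragment that sets some of these options could
+look like the following sketch (the exact placement alongside `Active` is
+assumed to match the balloons example elsewhere in these docs, and the
+namespace patterns are illustrative):
+
+```yaml
+policy:
+  Active: topology-aware
+  PinCPU: true
+  PinMemory: true
+  PreferIsolatedCPUs: true
+  # Extra namespaces whose containers go to reserved CPUs.
+  ReservedPoolNamespaces: ["monitoring", "logging-*"]
+  ColocatePods: true
+```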
 
 ## Policy CPU Allocation Preferences
 
@@ -162,53 +165,53 @@ time the container's creation / resource allocation request
 hits the policy.
 
 The set of these extra optimizations consist of
 
-  - assignment of `kube-reserved` CPUs
-  - assignment of exclusively allocated CPU cores
-  - usage of kernel-isolated CPU cores (for exclusive allocation)
+- assignment of `kube-reserved` CPUs
+- assignment of exclusively allocated CPU cores
+- usage of kernel-isolated CPU cores (for exclusive allocation)
 
 The policy uses a combination of the QoS class and the resource requirements
 of the container to decide if any of these extra allocation preferences
 should be applied. Containers are divided into five groups, with each group
 having a slightly different set of criteria for eligibility.
 
-  - `kube-system` group
-    * all containers in the `kube-system` namespace
-  - `low-priority` group
-    * containers in the `BestEffort` or `Burstable` QoS class
-  - `sub-core` group
-    * Guaranteed QoS class containers with `CPU request < 1 CPU`
-  - `mixed` group
-    * Guaranteed QoS class containers with `1 <= CPU request < 2`
-  - `multi-core` group
-    * Guaranteed QoS class containers with `CPU request >= 2`
+- `kube-system` group
+  - all containers in the `kube-system` namespace
+- `low-priority` group
+  - containers in the `BestEffort` or `Burstable` QoS class
+- `sub-core` group
+  - Guaranteed QoS class containers with `CPU request < 1 CPU`
+- `mixed` group
+  - Guaranteed QoS class containers with `1 <= CPU request < 2`
+- `multi-core` group
+  - Guaranteed QoS class containers with `CPU request >= 2`
 
 The eligibility rules for extra optimization are slightly different among
 these groups.
 
-  - `kube-system`
-    * not eligible for extra optimizations
-    * eligible to run on `kube-reserved` CPU cores
-    * always run on shared CPU cores
-  - `low-priority`
-    * not eligible for extra optimizations
-    * always run on shared CPU cores
-  - `sub-core`
-    * not eligible for extra optimizations
-    * always run on shared CPU cores
-  - `mixed`
-    * by default eligible for exclusive and isolated allocation
-    * not eligible for either if `PreferSharedCPUs` is set to true
-    * not eligible for either if annotated to opt out from exclusive allocation
-    * not eligible for isolated allocation if annotated to opt out
-  - `multi-core`
-    * CPU request fractional (`(CPU request % 1000 milli-CPU) != 0`):
-      - by default not eligible for extra optimizations
-      - eligible for exclusive and isolated allocation if annotated to opt in
-    * CPU request not fractional:
-      - by default eligible for exclusive allocation
-      - by default not eligible for isolated allocation
-      - not eligible for exclusive allocation if annotated to opt out
-      - eligible for isolated allocation if annotated to opt in
+- `kube-system`
+  - not eligible for extra optimizations
+  - eligible to run on `kube-reserved` CPU cores
+  - always run on shared CPU cores
+- `low-priority`
+  - not eligible for extra optimizations
+  - always run on shared CPU cores
+- `sub-core`
+  - not eligible for extra optimizations
+  - always run on shared CPU cores
+- `mixed`
+  - by default eligible for exclusive and isolated allocation
+  - not eligible for either if `PreferSharedCPUs` is set to true
+  - not eligible for either if annotated to opt out from exclusive allocation
+  - not eligible for isolated allocation if annotated to opt out
+- `multi-core`
+  - CPU request fractional (`(CPU request % 1000 milli-CPU) != 0`):
+    - by default not eligible for extra optimizations
+    - eligible for exclusive and isolated allocation if annotated to opt in
+  - CPU request not fractional:
+    - by default eligible for exclusive allocation
+    - by default not eligible for isolated allocation
+    - not eligible for exclusive allocation if annotated to opt out
+    - eligible for isolated allocation if annotated to opt in
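+
+For example, a Guaranteed QoS class container requesting exactly `2` CPUs is
+eligible for exclusive allocation by default, whereas one requesting `2500m`
+has a fractional request and is only eligible if annotated to opt in.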
 
 Eligibility for kube-reserved CPU core allocation should always be possible
 to honor. If this is not the case, it is probably due to an incorrect
 configuration
@@ -390,10 +393,10 @@ class so no need to mention `kube-system` in this list.
 
 ## Reserved CPU annotations
 
-User is able to mark certain pods and containers to have a reserved CPU allocation
-by using annotations. Containers having a such annotation will only run on CPUs set
-aside according to the global CPU reservation, as configured by the ReservedResources
-configuration option in the policy section.
+Users can mark certain pods and containers to have a reserved CPU
+allocation by using annotations. Containers with such an annotation will only
+run on CPUs set aside according to the global CPU reservation, as configured by
+the ReservedResources configuration option in the policy section.
 
 For example:
 
@@ -460,19 +463,22 @@ Since these hints are interpreted always by a particular
 *policy implementation* the exact definitions of 'close' and 'far' are also
 somewhat *policy-specific*. However as a general rule of thumb containers
 running
 
-  - on CPUs within the *same NUMA nodes* are considered *'close'* to each other,
-  - on CPUs within *different NUMA nodes* in the *same socket* are *'farther'*, and
-  - on CPUs within *different sockets* are *'far'* from each other
+- on CPUs within the *same NUMA nodes* are considered *'close'* to each other,
+- on CPUs within *different NUMA nodes* in the *same socket* are *'farther'*, and
+- on CPUs within *different sockets* are *'far'* from each other
 
 These hints are expressed by `container affinity annotations` on the Pod.
 There are two types of affinities:
 
-  - `affinity` (or `positive affinty`): cause affected containers to *pull* each other closer
-  - `anti-affinity` (or `negative affinity`): cause affected containers to *push* each other further away
+- `affinity` (or `positive affinity`): causes affected containers to *pull*
+  each other closer
+- `anti-affinity` (or `negative affinity`): causes affected containers to
+  *push* each other further away
 
 Policies try to place a container
-  - close to those the container has affinity towards
-  - far from those the container has anti-affinity towards.
+
+- close to those the container has affinity towards
+- far from those the container has anti-affinity towards.
 
 ### Affinity Annotation Syntax
 
@@ -533,89 +539,100 @@ metadata:
 
 An affinity consists of three parts:
 
-  - `scope expression`: defines which containers this affinity is evaluated against
-  - `match expression`: defines for which containers (within the scope) the affinity applies to
-  - `weight`: defines how *strong* a pull or a push the affinity causes
+- `scope expression`: defines which containers this affinity is evaluated against
+- `match expression`: defines which containers (within the scope) the
+  affinity applies to
+- `weight`: defines how *strong* a pull or a push the affinity causes
 
 *Affinities* are also sometimes referred to as *positive affinities* while
 *anti-affinities* are referred to as *negative affinities*. The reason for
 this is that the only difference between these are that affinities have a
 *positive weight* while anti-affinities have a *negative weight*.
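+
+For example, the following sketch declares an affinity of weight 10 that
+pulls `container1` towards `Guaranteed` QoS class containers in its own pod
+(the scope is omitted, which implies *Pod scope* as described below). The
+container name is illustrative and the annotation key is assumed to follow
+the `resource-policy.nri.io/affinity` form:
+
+```yaml
+metadata:
+  annotations:
+    # Annotation key assumed; see the syntax examples in this section.
+    resource-policy.nri.io/affinity: |
+      container1:
+      - match:
+          key: qosclass
+          op: Equals
+          values:
+          - Guaranteed
+        weight: 10
+```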
-The *scope* of an affinity defines the *bounding set of containers* the affinity can
-apply to. The affinity *expression* is evaluated against the containers *in scope* and
-it *selects the containers* the affinity really has an effect on. The *weight* specifies
-whether the effect is a *pull* or a *push*. *Positive* weights cause a *pull* while
-*negative* weights cause a *push*. Additionally, the *weight* specifies *how strong* the
-push or the pull is. This is useful in situations where the policy needs to make some
-compromises because an optimal placement is not possible. The weight then also acts as
-a way to specify preferences of priorities between the various compromises: the heavier
-the weight the stronger the pull or push and the larger the propbability that it will be
-honored, if this is possible at all.
-
-The scope can be omitted from an affinity in which case it implies *Pod scope*, in other
-words the scope of all containers that belong to the same Pod as the container for which
-which the affinity is defined.
-
-The weight can also be omitted in which case it defaults to -1 for anti-affinities
-and +1 for affinities. Weights are currently limited to the range [-1000,1000].
-
-Both the affinity scope and the expression select containers, therefore they are identical.
-Both of them are *expressions*. An expression consists of three parts:
-
-  - key: specifies what *metadata* to pick from a container for evaluation
-  - operation (op): specifies what *logical operation* the expression evaluates
-  - values: a set of *strings* to evaluate the the value of the key against
+The *scope* of an affinity defines the *bounding set of containers* the
+affinity can apply to. The affinity *expression* is evaluated against the
+containers *in scope* and it *selects the containers* the affinity really has
+an effect on. The *weight* specifies whether the effect is a *pull* or a
+*push*. *Positive* weights cause a *pull* while *negative* weights cause a
+*push*. Additionally, the *weight* specifies *how strong* the push or the pull
+is. This is useful in situations where the policy needs to make some
+compromises because an optimal placement is not possible. The weight then also
+acts as a way to specify preferences or priorities between the various
+compromises: the heavier the weight the stronger the pull or push, and the
+larger the probability that it will be honored, if this is possible at all.
+
+The scope can be omitted from an affinity in which case it implies *Pod scope*,
+in other words the scope of all containers that belong to the same Pod as the
+container for which the affinity is defined.
+
+The weight can also be omitted in which case it defaults to -1 for
+anti-affinities and +1 for affinities. Weights are currently limited to the
+range [-1000,1000].
+
+Both the affinity scope and the match expression select containers, so they
+take an identical form: both are *expressions*. An expression consists of
+three parts:
+
+- key: specifies what *metadata* to pick from a container for evaluation
+- operation (op): specifies what *logical operation* the expression evaluates
+- values: a set of *strings* to evaluate the value of the key against
 
 The supported keys are:
 
-  - for pods:
-    - `name`
-    - `namespace`
-    - `qosclass`
-    - `labels/<label-key>`
-    - `id`
-    - `uid`
-  - for containers:
-    - `pod/<pod-key>`
-    - `name`
-    - `namespace`
-    - `qosclass`
-    - `labels/<label-key>`
-    - `tags/<tag-key>`
-    - `id`
+- for pods:
+  - `name`
+  - `namespace`
+  - `qosclass`
+  - `labels/<label-key>`
+  - `id`
+  - `uid`
+- for containers:
+  - `pod/<pod-key>`
+  - `name`
+  - `namespace`
+  - `qosclass`
+  - `labels/<label-key>`
+  - `tags/<tag-key>`
+  - `id`
 
 Essentially an expression defines a logical operation of the form (key op
 values). Evaluating this logical expression takes the value of the key in the
 container and evaluates it against the *values* using the operation,
 resulting in a boolean true/false result.
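+
+For instance, the following expression (namespace names are illustrative)
+evaluates to true for containers in either the `monitoring` or the `logging`
+namespace:
+
+```yaml
+# A single expression: (namespace In [monitoring, logging])
+key: namespace
+op: In
+values:
+- monitoring
+- logging
+```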
 
 Currently the following operations are supported:
 
-  - `Equals`: equality, true if the *value of key* equals the single item in *values*
-  - `NotEqual`: inequality, true if the *value of key* is not equal to the single item in *values*
-  - `In`: membership, true if *value of key* equals to any among *values*
-  - `NotIn`: negated membership, true if the *value of key* is not equal to any among *values*
-  - `Exists`: true if the given *key* exists with any value
-  - `NotExists`: true if the given *key* does not exist
-  - `AlwaysTrue`: always evaluates to true, can be used to denote node-global scope (all containers)
-  - `Matches`: true if the *value of key* matches the globbing pattern in values
-  - `MatchesNot`: true if the *value of key* does not match the globbing pattern in values
-  - `MatchesAny`: true if the *value of key* matches any of the globbing patterns in values
-  - `MatchesNone`: true if the *value of key* does not match any of the globbing patterns in values
-
-The effective affinity between containers C_1 and C_2, A(C_1, C_2) is the sum of the
-weights of all pairwise in-scope matching affinities W(C_1, C_2). To put it another way,
-evaluating an affinity for a container C_1 is done by first using the scope (expression)
-to determine which containers are in the scope of the affinity. Then, for each in-scope
-container C_2 for which the match expression evaluates to true, taking the weight of the
-affinity and adding it to the effective affinity A(C_1, C_2).
-
-Note that currently (for the topology-aware policy) this evaluation is asymmetric:
-A(C_1, C_2) and A(C_2, C_1) can and will be different unless the affinity annotations are
-crafted to prevent this (by making them fully symmetric). Moreover, A(C_1, C_2) is calculated
-and taken into consideration during resource allocation for C_1, while A(C_2, C_1)
-is calculated and taken into account during resource allocation for C_2. This might be
-changed in a future version.
-
+- `Equals`: equality, true if the *value of key* equals the single item in *values*
+- `NotEqual`: inequality, true if the *value of key* is not equal to the single
+  item in *values*
+- `In`: membership, true if the *value of key* is equal to any among *values*
+- `NotIn`: negated membership, true if the *value of key* is not equal to any
+  among *values*
+- `Exists`: true if the given *key* exists with any value
+- `NotExists`: true if the given *key* does not exist
+- `AlwaysTrue`: always evaluates to true, can be used to denote node-global
+  scope (all containers)
+- `Matches`: true if the *value of key* matches the globbing pattern in values
+- `MatchesNot`: true if the *value of key* does not match the globbing pattern
+  in values
+- `MatchesAny`: true if the *value of key* matches any of the globbing patterns
+  in values
+- `MatchesNone`: true if the *value of key* does not match any of the globbing
+  patterns in values
+
+The effective affinity between containers C_1 and C_2, A(C_1, C_2), is the sum
+of the weights of all pairwise in-scope matching affinities W(C_1, C_2). To put
+it another way, evaluating an affinity for a container C_1 is done by first
+using the scope (expression) to determine which containers are in the scope of
+the affinity. Then, for each in-scope container C_2 for which the match
+expression evaluates to true, the weight of the affinity is added to the
+effective affinity A(C_1, C_2).
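+
+For example, if C_2 falls within the scope of two affinities defined for
+C_1, one matching with weight 100 and another matching with weight -10, the
+effective affinity is A(C_1, C_2) = 100 - 10 = 90, a net pull of C_1 towards
+C_2.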
+
+Note that currently (for the topology-aware policy) this evaluation is
+asymmetric: A(C_1, C_2) and A(C_2, C_1) can and will be different unless the
+affinity annotations are crafted to prevent this (by making them fully
+symmetric). Moreover, A(C_1, C_2) is calculated and taken into consideration
+during resource allocation for C_1, while A(C_2, C_1) is calculated and taken
+into account during resource allocation for C_2. This might be changed in a
+future version.
 
 Currently affinity expressions lack support for boolean operators (and, or,
 not). Sometimes this limitation can be overcome by using joint keys,
 especially with
@@ -623,11 +640,13 @@ matching operators. The joint key syntax allows joining the value of several
 keys with a separator into a single value. A joint key can be specified in a
 simple or full format:
 
-  - simple: `<colon-separated-subkeys>`, this is equivalent to `:::<colon-separated-subkeys>`
-  - full: `<ksep><vsep><ksep-separated-subkeys><vsep><separator>`
+- simple: `<colon-separated-subkeys>`, this is equivalent to
+  `:::<colon-separated-subkeys>`
+- full: `<ksep><vsep><ksep-separated-subkeys><vsep><separator>`
 
-A joint key evaluates to the values of all the `<ksep>`-separated subkeys joined by `<separator>`.
-A non-existent subkey evaluates to the empty string. For instance the joint key
+A joint key evaluates to the values of all the `<ksep>`-separated subkeys
+joined by `<separator>`. A non-existent subkey evaluates to the empty string.
+For instance the joint key
 
 `:pod/qosclass:pod/name:name`
 
@@ -635,8 +654,8 @@ evaluates to
 
 `<qosclass>:<podname>:<containername>`
 
-For existence operators, a joint key is considered to exist if any of its subkeys exists.
-
+For existence operators, a joint key is considered to exist if any of its
+subkeys exists.
 
 ### Examples
 
@@ -677,13 +696,13 @@ one needs to give just the names of the containers, like in the example below.
 
       container4: [ container2, container3 ]
 ```
 
- This shorthand notation defines:
-  - `container3` having
-    - affinity (weight 1) to `container1`
-    - `anti-affinity` (weight -1) to `container2`
-  - `container4` having
-    - `anti-affinity` (weight -1) to `container2`, and `container3`
+
+This shorthand notation defines:
+
+- `container3` having
+  - affinity (weight 1) to `container1`
+  - `anti-affinity` (weight -1) to `container2`
+- `container4` having
+  - `anti-affinity` (weight -1) to `container2`, and `container3`
 
 The equivalent annotation in full syntax would be
 
diff --git a/docs/resource-policy/setup.md b/docs/resource-policy/setup.md
index 88a2004eb..9190aa7c1 100644
--- a/docs/resource-policy/setup.md
+++ b/docs/resource-policy/setup.md
@@ -4,8 +4,8 @@ When you want to try NRI Resource Policy, here is the list of things
 you need to do, assuming you already have a Kubernetes\* cluster up and
 running, using either `containerd` or `cri-o` as the runtime.
 
- * [Install](installation.md) NRI Resource Policy DaemonSet deployment file.
- * Runtime (containerd / cri-o) configuration
+* [Deploy](../deployment/index.md) the NRI Resource Policy DaemonSet
+  deployment file.
+* Runtime (containerd / cri-o) configuration
 
 For NRI Resource Policy, you need to provide a configuration file. The default
 configuration ConfigMap file can be found in the DaemonSet deployment yaml file.
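+
+As a sketch, the policy fragment of such a configuration could look like the
+following (the policy name and CPU amount are illustrative):
+
+```yaml
+policy:
+  Active: topology-aware
+  ReservedResources:
+    CPU: 750m
+```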