Address comment #292 (comment)
enxebre committed Jun 26, 2020
1 parent 981d894 commit 5e56429
Showing 1 changed file with 15 additions and 11 deletions.
26 changes: 15 additions & 11 deletions enhancements/machine-api/control-plane.md
@@ -10,7 +10,12 @@ reviewers:
- smarterclayton
- derekwaynecarr
approvers:
- TBD
- hexfusion
- jeremyeder
- abhinavdahiya
- joelspeed
- smarterclayton
- derekwaynecarr

creation-date: 2020-04-02
last-updated: yyyy-mm-dd
@@ -40,7 +45,6 @@ superseded-by:
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)


## Glossary

Control Plane: The collection of stateless and stateful processes which enable a Kubernetes cluster to meet minimum operational requirements. This includes: kube-apiserver, kube-controller-manager, kube-scheduler, kubelet and etcd.
@@ -61,7 +65,6 @@ This proposal assumes that all etcd operational aspects are managed by the clust
The contract between etcd operations and the compute resources is given by the PDBs that block machine deletion.
Depends on https://issues.redhat.com/browse/ETCD-74?jql=project%20%3D%20ETCD.


## Motivation

The Control Plane is the most critical and sensitive entity of a running cluster. Today OCP Control Plane instances are "pets" and therefore fragile. There are multiple scenarios where adjusting the compute capacity which is backing the Control Plane components might be desirable either for resizing or repairing.
@@ -110,7 +113,6 @@ Although is out of the scope for the first implementation, to provide long term
- This would provide a single provider config to be reused and to be changed across any control plane machine.
- This would give the `ControlPlane` controller all the semantics it needs to fully automate vertical rolling upgrades across multiple failure domains while provider config changes would need to happen in one single place.


The lifecycle of the compute resources remains decoupled from, and orthogonal to, the lifecycle and management of the Control Plane components hosted by those compute resources. All of these components, including etcd, are expected to keep managing themselves as the cluster shrinks and expands the Control Plane compute resources.

### User Stories [optional]
@@ -153,7 +155,7 @@ Additionally it will create a ControlPlane resource to manage the lifecycle of t
- The ControlPlane controller will create MachineSets to adopt those machines by looking up known labels (Adopting behaviour already exists in machineSet logic).
- `machine.openshift.io/cluster-api-machineset: <cluster_name>-<label>-<zone>-controlplane`
- `machine.openshift.io/cluster-api-cluster: clusterID`
- The ControlPlane controller will create and manage a Machine Health Checking resource targeting the Control Plane Machines. Specific MHC details can be found [here](https://github.com/openshift/enhancements/blob/master/enhancements/machine-api/machine-health-checking.md)
- The ControlPlane controller will create and manage a Machine Health Checking resource targeting the Control Plane Machines. It will keep the `maxUnhealthy` value at a level that does not disrupt etcd quorum, e.g. 1 out of 3. Specific MHC details can be found [here](https://github.com/openshift/enhancements/blob/master/enhancements/machine-api/machine-health-checking.md)

Example:
```
@@ -185,17 +187,20 @@ spec:

#### Declarative horizontal scaling
- Out of scope:
- We'll reserve the ability to scale horizontally for further iterations if required. For the initial implementation the ControlPlane controller will limit the underlying machineSet horizontal scale capabilities.
- It will ensure the machineSet replicas is always 1 to enforce recreation of any of the adopted Machines.
- If the machineSet replicas were to be modified out of band, the ControlPlane controller will set it back to 1 while enforcing a "newest" delete policy on the machineSet.
- We'll reserve the ability to scale horizontally for further iterations if required.
- For the initial implementation the ControlPlane controller will limit the underlying machineSet horizontal scale capabilities:
- It will ensure the machineSet replicas is always 1 to enforce recreation of any of the adopted Machines.
- If the machineSet replicas were to be modified out of band, the ControlPlane controller will set it back to 1 while enforcing a "newest" delete policy on the machineSet.
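The replica enforcement described above can be sketched as a small reconcile step. This is only an illustration under stated assumptions: `MachineSetSpec`, its field names, and `reconcile_machine_set` are hypothetical stand-ins, not the Machine API types or the ControlPlane controller's actual code. Only the target values (1 replica, "newest" delete policy) come from the proposal text.

```python
from dataclasses import dataclass

# Hypothetical, simplified stand-in for a MachineSet spec (not the real API type).
@dataclass
class MachineSetSpec:
    replicas: int
    delete_policy: str

# Values taken from the proposal: one replica per machineSet, "newest" delete policy.
DESIRED_REPLICAS = 1
DESIRED_DELETE_POLICY = "Newest"


def reconcile_machine_set(spec: MachineSetSpec) -> bool:
    """Revert out-of-band changes in place; return True if anything was changed."""
    changed = False
    if spec.replicas != DESIRED_REPLICAS:
        spec.replicas = DESIRED_REPLICAS
        changed = True
    if spec.delete_policy != DESIRED_DELETE_POLICY:
        spec.delete_policy = DESIRED_DELETE_POLICY
        changed = True
    return changed
```

Returning whether the spec changed lets a caller skip the update call when nothing drifted, which keeps the step idempotent.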


#### Autoscaling
- Out of scope:
- This proposal sets the foundation for enabling vertical autoscaling. It enables any consumer to develop autoscaling atop the semi-automated process for "Declarative Vertical scaling" described above.
- In a future proposal we plan to add vertical autoscaling ability on the ControlPlane resource.

- Any machine deletion will always honour Pod Disruption Budgets (PDB) and will be blocked by them. This gives etcd guard the chance to block a deletion that it considers disruptive.
#### Node Autorepair
- Any machine deletion will always honour Pod Disruption Budgets (PDB) and will be blocked by them. This gives etcd guard the chance to block a deletion that it considers disruptive for etcd quorum.
- Deletion operations triggered by the managed Machine Health Check will be limited by the `maxUnhealthy` value, e.g. 1 out of 3.
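The "1 out of 3" figure follows from etcd needing a majority of members to keep quorum. A minimal sketch of that arithmetic is below; the function names are hypothetical, and the remediation check is an assumed simplification of how a `maxUnhealthy` limit might gate deletions, not the Machine Health Check implementation.

```python
def quorum_size(members: int) -> int:
    """Smallest majority of an etcd cluster with the given member count."""
    return members // 2 + 1


def max_unhealthy_for_quorum(members: int) -> int:
    """Largest number of members that can be lost while a majority remains."""
    if members < 1:
        raise ValueError("members must be at least 1")
    return members - quorum_size(members)


def remediation_allowed(members: int, unhealthy: int) -> bool:
    """Assumed gate: remediate only while unhealthy machines stay within the limit."""
    return unhealthy <= max_unhealthy_for_quorum(members)
```

For a three-member control plane this gives a limit of one machine, so a second simultaneous failure would not trigger automated deletion under this sketch.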

#### API Changes

@@ -271,7 +276,6 @@ The contract for the machine API is by honouring Pod Disruption Budgets (PBD), w
- Wait for new Machines to come up.
- Wait for old Machines to go away.


### Graduation Criteria

This proposal will be released in 4.N as long as:
@@ -284,7 +288,7 @@ This proposal will be released in 4.N as long as:

New IPI clusters deployed after the targeted release will run the ControlPlane resource deployed by the installer out of the box.

For UPI clusters and existing IPI clusters this is opt-in. As a user I can opt-in by creating a ControlPlane resource, i.e `kubectl create ControlPlane`
For UPI clusters and existing IPI clusters this is opt-in. As a user I can opt in by creating a ControlPlane resource, i.e. `kubectl create ControlPlane`.

### Version Skew Strategy
