Add proposal for AzureManagedCluster v1
jackfrancis committed Nov 3, 2022
1 parent bbb0e9e commit 9600a32
Showing 1 changed file with 80 additions and 0 deletions.
80 changes: 80 additions & 0 deletions docs/proposals/20220825-azuremanaged-cluster-exp-graduation.md
@@ -0,0 +1,80 @@
---
title: AzureManagedCluster graduation from experimental
authors:
- "@jackfrancis"
reviewers:
- "@CecileRobertMichon"
- "@zmalik"
- "@NovemberZulu"
- "@mtougeron"
- "@nojnhuh"
creation-date: 2022-08-25
last-updated: 2022-08-25
status: implementable
see-also:
- https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/2204
- https://github.com/kubernetes-sigs/cluster-api/pull/6988
- https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2739
---


# AzureManagedCluster graduation from experimental

## Summary

`AzureManagedCluster` and its corresponding set of CRDs (we will refer to these CRDs simply as "`AzureManagedCluster`" in this document) are a capz-native implementation of Azure Kubernetes Service (AKS), the managed Kubernetes offering on Azure. Because there is no standard set of Cluster API resource definitions for a "Managed Kubernetes cluster", it is left to each provider to re-use the existing Cluster API specification (for example, the `Cluster` resource and its to-be-implemented-by-provider properties such as `ControlPlaneEndpoint`, `ControlPlaneRef`, and `InfrastructureRef`). As a result, capz implemented "`AzureManagedCluster`" with an API contract designation of "experimental" to allow for rapid prototyping and discovery.

With the recent adoption of "`AzureManagedCluster`" by the capz community for practical, real-world use, we want to identify the set of outstanding items that may prevent graduation from experimental, and address each one of them, so that future adoption can be unlocked, and users can confidently build resilient systems on top of a stable API.

## Motivation

### Goals
- Agree upon a durable, post-experimental specification of the set of CRDs that implement Managed Kubernetes on Azure, e.g., `AzureManagedCluster`, `AzureManagedControlPlane`, `AzureManagedMachinePool`
- Prioritize timeliness of a post-experimental definition so that users can confidently plan for any breaking changes that result from a graduated spec as soon as possible
- Prioritize "architectural affinity" with other Cluster API providers implementing Managed Kubernetes

### Non-Goals / Future Work
- Add to, or remove features from, "`AzureManagedCluster`"
- Standardize any opinions about how best to use AKS
- Promote capz-specific opinions about Managed Kubernetes across the Cluster API provider ecosystem
- Define operational support

## Post-graduation Prerequisites

Concrete prerequisite workstreams are [tracked here](https://github.com/orgs/kubernetes-sigs/projects/26/views/1).

### 1. Land Managed Cluster in Cluster API Proposal

See:

- https://github.com/kubernetes-sigs/cluster-api/pull/6988

The above proposal defines a set of recommendations for Cluster API providers implementing Managed Kubernetes solutions. Because this proposal is an opt-in collection of architectural recommendations, it is not required that "`AzureManagedCluster`" strictly agrees with everything therein. However, by contributing to the successful landing of that proposal in Cluster API, we can best ensure that any capz-specific opinions or learnings are reflected, for the benefit of the larger ecosystem, and for the maximum happiness of AKS customers in particular.

### 2. Integrate Cluster API Proposal Recommendations

Once the proposal is accepted and merged into the Cluster API project as an endorsed set of provider recommendations, capz can audit the existing experimental "`AzureManagedCluster`" implementation for areas of disagreement with those recommendations. For each area of disagreement discovered, there should be a defensible reason to carry a divergent capz Managed Kubernetes implementation forward into a post-experimental definition of "`AzureManagedCluster`", and ideally some form of sign-off from other provider maintainers. Where community consensus cannot be reached, capz should strongly consider evolving the experimental "`AzureManagedCluster`" implementation to meet the Cluster API Managed Kubernetes recommendation as a prerequisite to graduating from experimental.

### 3. Full AKS Feature Support

"`AzureManagedCluster`" should be able to accommodate the entire feature set of AKS. Rather than require the concrete list of features of AKS to be fully implemented as a prerequisite to graduation, we should instead audit the AKS feature matrix against the "`AzureManagedCluster`" architectural surface area and ensure that capz is well prepared to continually integrate existing and new AKS features into "`AzureManagedCluster`" along a non-breaking path forward.

### 4. Affinity between capz / Cluster API resource lifecycle enforcement and AKS lifecycle enforcement

Related to the above audit of AKS features, we should also map the Cluster API resource lifecycle (simply speaking: Create, Read, Update, Delete) onto the set of AKS cluster primitives. The goal is to ensure that capz has sane interfaces to either actively enforce state (for example, add an AKS node pool) or, where appropriate, passively delegate authority (for example, defer to the built-in AKS autoscaler, when enabled, to enforce the `MachinePool` replica count).
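
The autoscaler delegation example can be sketched in Go as follows. This is only an illustration of the enforce-versus-delegate decision, not capz code: the `NodePool` type, its fields, and the `replicasToEnforce` function are hypothetical names invented for this sketch.

```go
package main

import "fmt"

// NodePool is a hypothetical, simplified view of an AKS node pool.
// It does not mirror the real AzureManagedMachinePool API.
type NodePool struct {
	AutoscalerEnabled bool  // true when the built-in AKS autoscaler owns the node count
	SpecReplicas      int32 // replica count requested through the MachinePool spec
	ObservedReplicas  int32 // node count currently reported by AKS
}

// replicasToEnforce returns the replica count the controller should treat as
// desired, and whether it should actively push that count to AKS.
func replicasToEnforce(p NodePool) (replicas int32, enforce bool) {
	if p.AutoscalerEnabled {
		// Delegate: accept the observed count and do not issue a scale request.
		return p.ObservedReplicas, false
	}
	// Enforce: the spec is the source of truth.
	return p.SpecReplicas, true
}

func main() {
	pools := []NodePool{
		{AutoscalerEnabled: true, SpecReplicas: 3, ObservedReplicas: 5},
		{AutoscalerEnabled: false, SpecReplicas: 3, ObservedReplicas: 5},
	}
	for _, p := range pools {
		replicas, enforce := replicasToEnforce(p)
		fmt.Printf("autoscaler=%t replicas=%d enforce=%t\n", p.AutoscalerEnabled, replicas, enforce)
	}
}
```

Keeping the decision in one small function makes it straightforward to audit each AKS primitive for which side, capz or AKS, owns a given field.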

### 5. Azure API Request Optimizations

Prior to graduation from experimental, we should ensure that a sufficient set of configurable interfaces exists to allow "`AzureManagedCluster`" users to tune capz effectively, so that no unnecessary Azure API requests are introduced into their AKS environment. We should ship sane, optimized defaults, but allow for flexible overrides to accommodate a wide variety of operational use cases.

### 6. E2E Tests

For every supported AKS feature, and for every supported lifecycle mutation of said feature, capz should have thorough, regular E2E test coverage.
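
The coverage target above is a cross product of features and lifecycle mutations. The Go sketch below enumerates that matrix. The `testCase` type and `coverageMatrix` function are hypothetical, and the feature names are placeholders rather than a real AKS feature list.

```go
package main

import "fmt"

// testCase pairs one feature with one lifecycle mutation.
type testCase struct {
	Feature  string
	Mutation string
}

// coverageMatrix returns one test case per (feature, mutation) pair,
// ordered by feature first and mutation second.
func coverageMatrix(features, mutations []string) []testCase {
	cases := make([]testCase, 0, len(features)*len(mutations))
	for _, f := range features {
		for _, m := range mutations {
			cases = append(cases, testCase{Feature: f, Mutation: m})
		}
	}
	return cases
}

func main() {
	// Placeholder inputs; a real suite would list supported AKS features.
	features := []string{"feature-a", "feature-b"}
	mutations := []string{"create", "update", "delete"}
	for _, c := range coverageMatrix(features, mutations) {
		fmt.Printf("%s/%s\n", c.Feature, c.Mutation)
	}
}
```

Generating the matrix rather than hand-listing cases makes gaps visible: any pair without a corresponding E2E test is a coverage hole.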

## Open Questions

### Dependency upon (currently experimental) MachinePool spec

At present we re-use the Cluster API `MachinePool` specification (as `AzureManagedMachinePool`) to implement AKS node pools running on Azure VMSS. A consideration here is that `MachinePool` is considered experimental, and behind a feature flag, by Cluster API. Do we want to add the graduation of `MachinePool` out of experimental as a prerequisite for graduating "`AzureManagedCluster`" out of experimental?

We are tracking a concrete Cluster API implementation of `MachinePoolMachine` [here](https://github.com/kubernetes-sigs/cluster-api/pull/6089).
