There are two ways to use the Azure secrets engine with respect to service principals:
Static AAD service principal: you supply the application object ID.
Dynamic AAD service principal: you provide the role information, and the service principal and role assignments are created for you.
To recap, with dynamic principals the service principal is created first, and then a role assignment is made.
Sometimes this fails, and it can fail consistently.
How we found this issue:
We knew about the AAD propagation delay after a service principal is created and wanted to make that more reliable.
We use Terraform to create the static AAD service principal and role assignment, and this failed:
╷
│ Error: authorization.RoleAssignmentsClient#Create: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="RoleAssignmentLimitExceeded" Message="No more role assignments can be created."
│
│ with module.azure-service-principal-xxxxxxxx-xxxxx-xxxx.azurerm_role_assignment.service-principal-built-in-roles["/providers/Microsoft.Management/managementGroups/xxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxx.Name of my Service"],
│ on modules/cross_subscription_service_principal/main.tf line 109, in resource "azurerm_role_assignment" "service-principal-built-in-roles":
│ 109: resource "azurerm_role_assignment" "service-principal-built-in-roles" {
│
╵
I suspect that we either misconfigured the permanently_delete option, or that the chosen ttl causes us to hit our role assignment quota before old objects are deleted or garbage-collected.
However, if you use a Kubernetes deployment and that deployment keeps failing for some reason, each restart creates a new service principal and a new role assignment. This can also lead to resource exhaustion.
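To make the exhaustion risk concrete, here is a rough back-of-the-envelope sketch. It assumes that each restart leaks exactly one role assignment and that a leaked assignment is only removed once its TTL expires. The function names and the numbers in the usage note are hypothetical, and the real quota depends on the scope in question.

```python
import math


def restarts_until_quota(quota: int, existing: int, leaked_per_restart: int = 1) -> int:
    """Return how many restarts fit before the quota is exhausted.

    Assumption: every restart leaks `leaked_per_restart` role assignments
    and nothing is cleaned up in the meantime.
    """
    if leaked_per_restart <= 0:
        raise ValueError("leaked_per_restart must be positive")
    remaining = max(quota - existing, 0)
    return remaining // leaked_per_restart


def steady_state_leaked(ttl_seconds: float, restart_interval_seconds: float,
                        leaked_per_restart: int = 1) -> int:
    """Estimate how many leaked assignments exist at once.

    Assumption: each leaked assignment is removed only when its TTL
    expires, and restarts happen at a fixed interval.
    """
    if restart_interval_seconds <= 0:
        raise ValueError("restart_interval_seconds must be positive")
    return math.ceil(ttl_seconds / restart_interval_seconds) * leaked_per_restart
```

For example, a pod that crash-loops every 5 minutes with a 24 hour TTL gives `steady_state_leaked(24 * 3600, 300)`, which is 288 assignments alive at the same time from a single deployment.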
An operator can then fix the leak by going to the Azure portal, checking the role assignments, deleting the old ones, and so on.
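The manual cleanup above could be partly scripted. The sketch below assumes you have already exported the role assignments and the IDs of the principals that still exist (for example from the portal or a CLI). The `RoleAssignment` type and `find_orphaned` helper are hypothetical and only illustrate the filtering step. They do not call any Azure API and do not delete anything.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RoleAssignment:
    """Hypothetical record for one exported role assignment."""
    id: str
    principal_id: str
    scope: str


def find_orphaned(assignments: Iterable[RoleAssignment],
                  live_principal_ids: Iterable[str]) -> List[RoleAssignment]:
    """Return assignments whose principal no longer exists.

    These are the candidates an operator would review and delete.
    """
    live = set(live_principal_ids)
    return [a for a in assignments if a.principal_id not in live]
```

Reviewing the returned list before deleting anything is probably wise, since a principal that was created moments ago may not yet appear in the exported list of live principals.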
Hi @dnozay! We recently addressed some of the leaking role assignment concerns in #110. Does this address your concerns? If not, specific steps to reproduce leaking role assignments would be helpful here. Thanks!
When role unassignment is performed, if it fails it does not retry:
https://github.com/hashicorp/vault-plugin-secrets-azure/blob/main/path_service_principal.go#L276-L296
This can also be a source of leaks.
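One possible mitigation is to retry the unassignment with exponential backoff before giving up. The following is a minimal sketch of that idea in Python, not the plugin's actual code (the plugin is written in Go). The `retry` helper, its parameters, and the `operation` callable are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], attempts: int = 5, base_delay: float = 1.0,
          max_delay: float = 30.0, sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `operation` until it succeeds or `attempts` is exhausted.

    The delay doubles after each failure, capped at `max_delay`.
    The last exception is re-raised if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            # Out of attempts: surface the failure so the caller can
            # record the role assignment for later cleanup.
            if attempt == attempts - 1:
                raise
            sleep(min(base_delay * 2 ** attempt, max_delay))
    raise AssertionError("unreachable")
```

A real implementation would likely retry only on errors that are known to be transient and would keep track of assignments that still could not be removed, so that they can be cleaned up later instead of leaking.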