Emit kubernetes events from KEDA #1523

Merged on Feb 6, 2021 (6 commits)

Contributor

@ahmelsayed ahmelsayed commented Jan 21, 2021

Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>

This PR adds the following events:

For both ScaledObjects and ScaledJobs:

  • Ready
  • CheckFailed
  • Deleted
  • ScalersStarted
  • ScalersRestarted
  • ScalersStopped

For ScaledObjects:

  • ScaleTargetActivated
  • ScaleTargetDeactivated
  • ScaleTargetActivationFailed
  • ScaleTargetDeactivationFailed

For ScaledJobs:

  • JobsCreated

If this list looks okay, I'll open a docs PR to list them in there.

Checklist

Fixes #530

@@ -90,6 +94,9 @@ func (h *scaleHandler) HandleScalableObject(scalableObject interface{}) error {
cancelValue()
}
h.scaleLoopContexts.Store(key, cancel)
h.recorder.Event(withTriggers, corev1.EventTypeNormal, eventreason.ScalersRestarted, "Restarted scalers watch")
} else {
h.recorder.Event(withTriggers, corev1.EventTypeNormal, eventreason.ScalersStarted, "Started scalers watch")

Contributor

Would this emit every time the controller pod restarts?

Contributor Author

Yes. I wasn't sure if there is a best practice somewhere for when to emit events. I tried to add all the ones mentioned in #530, but if there are best practices for this I'd love to check them.

Contributor

The basic guideline is that an event should only be emitted when something actually occurs. So any time a reconcile happens and takes no action, it should produce no events :) Otherwise they can get spammy and overwhelm etcd.
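
A minimal sketch of that guideline, using only the standard library. The `recorder` type is a hypothetical in-memory stand-in for a real event recorder, and the `ReplicasChanged` reason is made up for illustration; neither comes from this PR.

```go
package main

import "fmt"

// recorder is a hypothetical stand-in for an event recorder. It only mirrors
// the (type, reason, message) part of the calls quoted in this thread.
type recorder struct {
	events []string
}

// Event appends one formatted event to the in-memory list.
func (r *recorder) Event(eventType, reason, message string) {
	r.events = append(r.events, fmt.Sprintf("%s %s: %s", eventType, reason, message))
}

// reconcile brings current to desired and records an event only when it
// changed something. "ReplicasChanged" is a made-up reason for illustration.
func reconcile(r *recorder, current, desired int) int {
	if current == desired {
		return current // no action taken, so no event
	}
	r.Event("Normal", "ReplicasChanged", fmt.Sprintf("replicas %d -> %d", current, desired))
	return desired
}

func main() {
	rec := &recorder{}
	replicas := reconcile(rec, 0, 3)       // real change: one event
	replicas = reconcile(rec, replicas, 3) // no-op pass: nothing recorded
	replicas = reconcile(rec, replicas, 3) // no-op pass: nothing recorded
	fmt.Println("replicas:", replicas, "events:", len(rec.events))
}
```

However many times the no-op pass repeats, the event count stays at one, which is the property that keeps the event stream from growing with every reconcile.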

Member

But I don't suppose controller pod restarts happen that often, or that they are something we should consider normal. I'm not sure how else we could emit this kind of event.

} else {
reqLogger.V(1).Info(msg)
conditions.SetReadyCondition(metav1.ConditionTrue, "ScaledObjectReady", msg)
r.Recorder.Event(scaledObject, corev1.EventTypeNormal, eventreason.Ready, msg)

Contributor

This isn't usually the kind of thing you would want in an event since it's not a specific action or event, it's a convergent state.

Member

Agree, we should emit only once, for the first time.

@tomkerkhove
Member

tomkerkhove commented Jan 22, 2021

Before we merge, can you PR a new sub-page to our Operate docs please?
https://keda.sh/docs/2.1/operate/

Would be good to list all event types and what scenario they represent.

} else {
reqLogger.V(1).Info(msg)
conditions.SetReadyCondition(metav1.ConditionTrue, "ScaledJobReady", msg)
r.Recorder.Event(scaledJob, corev1.EventTypeNormal, eventreason.Ready, msg)

Member

@zroubalik zroubalik Jan 22, 2021

This should be emitted only for the first time (if at all).

Member

@zroubalik zroubalik left a comment

The list of events looks okay!

@tomkerkhove
Member

tomkerkhove commented Jan 22, 2021

What about these events:

  • Scaler failed (which is not stopped I guess?)
  • Scaler unauthorized
  • New Trigger Authentication Created
  • Deployment Scaled to Zero
  • ScaledObject/ScaledJob is linked to trigger authentication

Follow-up PR/issue is ok, just asking

Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>
@ahmelsayed ahmelsayed force-pushed the ahmels/events branch 2 times, most recently from 52306cd to d3191b5 Compare January 29, 2021 19:51
Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>
Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>
@ahmelsayed
Contributor Author

Thanks @coderanger, @zroubalik, @tomkerkhove for the feedback.
I made the following changes in 2872b99

  • The *Ready events now only fire the first time a ScaledObject/ScaledJob is reconciled, or if its previous Ready status was False/Unknown.
  • Removed ScalersRestarted since that would always happen on any ScaledObject/ScaledJob update
  • Renamed events to either include ScaledJob, ScaledObject, or KEDA prefix
  • Added a controller for TriggerAuthentication and events for a newly added TriggerAuthentication or deleted one.
  • Opened kedacore/keda-docs#361 (Provide documentation for Kubernetes events)
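
The first bullet (Ready only on the first reconcile or after a False/Unknown status) can be sketched roughly as below. This is not the code from 2872b99; `conditionStatus` and `shouldEmitReady` are hypothetical names, and the empty status is assumed to mean that no Ready condition has been recorded yet.

```go
package main

import "fmt"

// conditionStatus mirrors the three values a Kubernetes condition can hold.
// The empty string stands for "no status recorded yet" (first reconcile).
type conditionStatus string

const (
	statusTrue    conditionStatus = "True"
	statusFalse   conditionStatus = "False"
	statusUnknown conditionStatus = "Unknown"
)

// shouldEmitReady is a hypothetical helper: it reports whether a Ready event
// is worth emitting, given the Ready status stored before this reconcile.
func shouldEmitReady(previous conditionStatus, readyNow bool) bool {
	if !readyNow {
		return false
	}
	// Emit on the first reconcile ("") and on a False/Unknown -> True change.
	return previous != statusTrue
}

func main() {
	for _, prev := range []conditionStatus{"", statusUnknown, statusFalse, statusTrue} {
		fmt.Printf("previous=%q emit=%v\n", prev, shouldEmitReady(prev, true))
	}
}
```

Comparing against the stored status, rather than emitting unconditionally, is what turns a convergent state into a one-off transition event.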

Some remarks:

  • ScalersStarted event will still fire for all scalers on KEDA restart.
  • @tomkerkhove, Scaler unauthorized will be KEDAScalerFailed, but that will also fire for any error other than Unauthorized. Scalers themselves don't get passed the event recorder object, so errors from them are opaque to KEDA itself. Initially I wanted to avoid scaler authors having to deal with Kubernetes APIs too much. Do you think this is sufficient?
  • @tomkerkhove Regarding "ScaledObject/ScaledJob is linked to trigger authentication", I'm not sure how best to do this tbh. This can happen on any ScaledObject/ScaledJob update. I can diff them on every update and emit those events, but currently we don't really store an easily enumerable list of ScaledObjects/ScaledJobs anywhere (each just gets a context and keeps checking the scalers until the context is canceled). I'll need to store references to all of them to be able to diff the old vs new. @zroubalik what do you think about that?
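
For illustration only, the "store references and diff old vs new" idea from the last remark might look something like this. The PR does not implement it; `refStore`, its `update` method, and the object key and authentication names are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// refStore is a hypothetical in-memory index: object key -> names of the
// TriggerAuthentications it referenced the last time it was reconciled.
type refStore map[string][]string

// update stores the new references for key and returns which names were
// linked and unlinked compared with the previous update.
func (s refStore) update(key string, refs []string) (linked, unlinked []string) {
	before := toSet(s[key])
	after := toSet(refs)
	for name := range after {
		if !before[name] {
			linked = append(linked, name)
		}
	}
	for name := range before {
		if !after[name] {
			unlinked = append(unlinked, name)
		}
	}
	sort.Strings(linked) // map iteration order is random, so sort for stable output
	sort.Strings(unlinked)
	s[key] = append([]string(nil), refs...) // keep a copy for the next diff
	return linked, unlinked
}

// toSet turns a slice of names into a lookup set.
func toSet(names []string) map[string]bool {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[n] = true
	}
	return set
}

func main() {
	store := refStore{}
	store.update("default/my-scaledobject", []string{"kafka-auth", "redis-auth"})
	linked, unlinked := store.update("default/my-scaledobject", []string{"redis-auth", "s3-auth"})
	fmt.Println("linked:", linked, "unlinked:", unlinked)
}
```

The extra state is the cost being weighed in the thread: the store has to be kept in sync with every create, update, and delete.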

Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>
Contributor

@coderanger coderanger left a comment

Overall +1 from me on the API plumbing side.

}

if triggerAuthentication.ObjectMeta.Generation == 1 {
r.Recorder.Event(triggerAuthentication, corev1.EventTypeNormal, eventreason.TriggerAuthenticationAdded, "New TriggerAuthentication configured")

Contributor

This seems like a slightly weird one, but I don't think it will do any harm :)

Member

Yeah, we can go with this for now :)
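
A small sketch of what the `Generation == 1` check is relying on, assuming (as is generally understood for custom resources) that `metadata.generation` starts at 1 on creation and goes up when the spec is changed. `objectMeta` is a hypothetical trimmed-down type, not the real Kubernetes one.

```go
package main

import "fmt"

// objectMeta is a hypothetical, trimmed-down stand-in for Kubernetes object
// metadata; only the fields needed for the check are included.
type objectMeta struct {
	Name       string
	Generation int64
}

// isFirstGeneration mirrors the check in the diff: it is true for as long as
// the object's spec has not been edited since creation.
func isFirstGeneration(meta objectMeta) bool {
	return meta.Generation == 1
}

func main() {
	created := objectMeta{Name: "kafka-auth", Generation: 1}
	edited := objectMeta{Name: "kafka-auth", Generation: 2}
	fmt.Println(isFirstGeneration(created), isFirstGeneration(edited))
}
```

One consequence worth keeping in mind: the check is a property of the object, not of the reconcile, so an unedited object may satisfy it again on a later reconcile, for example after an operator restart.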

@@ -90,6 +95,8 @@ func (h *scaleHandler) HandleScalableObject(scalableObject interface{}) error {
cancelValue()
}
h.scaleLoopContexts.Store(key, cancel)
} else {
h.recorder.Event(withTriggers, corev1.EventTypeNormal, eventreason.KEDAScalersStarted, "Started scalers watch")

Contributor

I think this would display on every restart of the controller? Can probably drop this and the scalers-stopped events since they don't correspond to actual actions, just internal code state.

Member

Yeah, but how often do we want to restart the controller (or would we like to see it restarted)? Is this happening that often? And in fact, on a restart the scalers do start watching, so the event message is correct.

@tomkerkhove
Member

  • Renamed events to either include ScaledJob, ScaledObject, or KEDA prefix

I've had a look at kedacore/keda-docs#361 and it's a bit odd since some have the KEDA prefix and others don't. I know I've requested this, but if others think it's stupid I would remove it or add the prefix to all of them. Thoughts @zroubalik?

Just for context: The reason why I suggested this was for consumers since they typically process the whole event stream and this would make it easier for them to understand where these come from.
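
To make the consumer-side argument concrete, here is a rough sketch of filtering a mixed event stream by reason prefix. The `event` type and `withPrefix` helper are hypothetical, and `ScaledObjectReady` with its message is used as an illustrative reason rather than a confirmed final name.

```go
package main

import (
	"fmt"
	"strings"
)

// event is a hypothetical, minimal view of a Kubernetes event.
type event struct {
	Reason  string
	Message string
}

// withPrefix keeps the events whose reason starts with one of the prefixes.
func withPrefix(events []event, prefixes ...string) []event {
	var kept []event
	for _, e := range events {
		for _, p := range prefixes {
			if strings.HasPrefix(e.Reason, p) {
				kept = append(kept, e)
				break
			}
		}
	}
	return kept
}

func main() {
	stream := []event{
		{Reason: "Pulling", Message: "Pulling image"},
		{Reason: "KEDAScalersStarted", Message: "Started scalers watch"},
		{Reason: "Created", Message: "Created container"},
		{Reason: "ScaledObjectReady", Message: "ScaledObject is ready"}, // illustrative reason and message
	}
	for _, e := range withPrefix(stream, "KEDA", "ScaledObject", "ScaledJob") {
		fmt.Println(e.Reason)
	}
}
```

With unprefixed reasons such as `Ready` or `Deleted`, the same consumer would have to look at the involved object's kind instead of the reason alone.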

Thanks 💘

  • @tomkerkhove, Scaler unauthorized will be KEDAScalerFailed, but that will also fire for any error other than Unauthorized. Scalers themselves don't get passed the event recorder object, so errors from them are opaque to KEDA itself. Initially I wanted to avoid scaler authors having to deal with Kubernetes APIs too much. Do you think this is sufficient?

That's not ideal as you might want to filter out to detect authentication issues, but we can still split them later on if this is too much trouble now. Thoughts @zroubalik?
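
One possible way to split them later without handing scalers the event recorder is to let scalers wrap a sentinel error and have the caller pick the reason. This is only a sketch: `errUnauthorized`, `fetchMetric`, and the `KEDAScalerUnauthorized` reason are hypothetical; only `KEDAScalerFailed` comes from the thread.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnauthorized is a hypothetical sentinel that a scaler could wrap when
// its backing service rejects the credentials.
var errUnauthorized = errors.New("unauthorized")

// fetchMetric is a made-up scaler call that fails in one of two ways.
func fetchMetric(authorized bool) error {
	if !authorized {
		return fmt.Errorf("querying queue length: %w", errUnauthorized)
	}
	return errors.New("querying queue length: connection refused")
}

// reasonFor maps an error to an event reason. KEDAScalerFailed is the reason
// named in the thread; KEDAScalerUnauthorized is a hypothetical addition.
func reasonFor(err error) string {
	if errors.Is(err, errUnauthorized) {
		return "KEDAScalerUnauthorized"
	}
	return "KEDAScalerFailed"
}

func main() {
	fmt.Println(reasonFor(fetchMetric(false)))
	fmt.Println(reasonFor(fetchMetric(true)))
}
```

Scaler authors would then only deal with ordinary Go errors, and the Kubernetes-facing part stays in the code that already owns the recorder.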

  • @tomkerkhove Regarding "ScaledObject/ScaledJob is linked to trigger authentication", I'm not sure how best to do this tbh. This can happen on any ScaledObject/ScaledJob update. I can diff them on every update and emit those events, but currently we don't really store an easily enumerable list of ScaledObjects/ScaledJobs anywhere (each just gets a context and keeps checking the scalers until the context is canceled). I'll need to store references to all of them to be able to diff the old vs new. @zroubalik what do you think about that?

Let's leave this out then, we can still add it later on if need be?

@tomkerkhove tomkerkhove added this to the v2.2 milestone Jan 30, 2021
Member

@zroubalik zroubalik left a comment

Looking good, I like the renamed event names!

  • we should probably add a controller for ClusterTriggerAuthentication and cover this new resource as well.
  • Re the "ScaledObject/ScaledJob is linked to trigger authentication" discussion: yeah, we would need to track it as you suggest. But I don't think it is necessary to add this complexity now. We could add it later if there's a need from the community. What do you say @tomkerkhove?

scaleHandler scaling.ScaleHandler
}

// SetupWithManager initializes the ScaledJobReconciler instance and starts a new controller managed by the passed Manager instance.
func (r *ScaledJobReconciler) SetupWithManager(mgr ctrl.Manager) error {
- r.scaleHandler = scaling.NewScaleHandler(mgr.GetClient(), nil, mgr.GetScheme(), r.GlobalHTTPTimeout)
+ r.scaleHandler = scaling.NewScaleHandler(mgr.GetClient(), nil, mgr.GetScheme(), r.GlobalHTTPTimeout, mgr.GetEventRecorderFor("scale-handler"))

Member

The event recorder for the Metrics Adapter is named keda-metrics-adapter and this one is named scale-handler. For consistency, this one could maybe be named keda-operator / keda-controller, synced with those set in main.go. WDYT?

Contributor Author

Yeah, I think that makes sense. I'll change it to one recorder with the name keda-operator

CHANGELOG.md Outdated
@@ -43,6 +43,7 @@
- Global authentication credentials can be managed using `ClusterTriggerAuthentication` objects ([#1452](https://github.com/kedacore/keda/pull/1452))
- Introducing OpenStack Swift scaler ([#1342](https://github.com/kedacore/keda/issues/1342))
- Introducing MongoDB scaler ([#1467](https://github.com/kedacore/keda/pull/1467))
- Emit Kubernetes Events on KEDA events ([#1523](https://github.com/kedacore/keda/pull/1523)):wq

Member

Nit: This should be moved to Unreleased section above.

Member

And remove the vim suffix at the end :)

@zroubalik
Member

zroubalik commented Feb 1, 2021

  • Renamed events to either include ScaledJob, ScaledObject, or KEDA prefix

I've had a look at kedacore/keda-docs#361 and it's a bit odd since some have the KEDA prefix and others don't. I know I've requested this, but if others think it's stupid I would remove it or add the prefix to all of them. Thoughts @zroubalik?

I personally don't have a problem with ScaledJob, ScaledObject, or KEDA prefix.

  • @tomkerkhove, Scaler unauthorized will be KEDAScalerFailed, but that will also fire for any error other than Unauthorized. Scalers themselves don't get passed the event recorder object, so errors from them are opaque to KEDA itself. Initially I wanted to avoid scaler authors having to deal with Kubernetes APIs too much. Do you think this is sufficient?

That's not ideal as you might want to filter out to detect authentication issues, but we can still split them later on if this is too much trouble now. Thoughts @zroubalik?

+1 split later if needed

  • @tomkerkhove Regarding "ScaledObject/ScaledJob is linked to trigger authentication", I'm not sure how best to do this tbh. This can happen on any ScaledObject/ScaledJob update. I can diff them on every update and emit those events, but currently we don't really store an easily enumerable list of ScaledObjects/ScaledJobs anywhere (each just gets a context and keeps checking the scalers until the context is canceled). I'll need to store references to all of them to be able to diff the old vs new. @zroubalik what do you think about that?

Let's leave this out then, we can still add it later on if need be?

+1

@tomkerkhove
Member

I personally don't have a problem with ScaledJob, ScaledObject, or KEDA prefix.

So you're ok with how they are or would you use KEDA prefix for all?

@zroubalik
Member

I personally don't have a problem with ScaledJob, ScaledObject, or KEDA prefix.

So you're ok with how they are or would you use KEDA prefix for all?

I am ok with how they are now.

@ahmelsayed
Contributor Author

Regarding the naming, I was initially looking at how the default Kubernetes events are named, and they were all named Created, Deleted, Killing, Pulling, etc. So I assumed the pattern is to have a verb for the name and derive the meaning from the object the event is on. The current names have the prefix ScaledObject or ScaledJob, but the ones for scalers are shared by both since scalers are the same regardless of the target.

Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>
Member

@tomkerkhove tomkerkhove left a comment

Sounds good to me, it's a lot better than the default ones :D

LGTM, let me know when the docs are updated!

Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>
@tomkerkhove tomkerkhove merged commit aac70e6 into kedacore:main Feb 6, 2021
@tomkerkhove
Member

🚀

@@ -90,6 +95,8 @@ func (h *scaleHandler) HandleScalableObject(scalableObject interface{}) error {
cancelValue()
}
h.scaleLoopContexts.Store(key, cancel)
} else {
h.recorder.Event(withTriggers, corev1.EventTypeNormal, eventreason.KEDAScalersStarted, "Started scalers watch")

Member

@ahmelsayed Is it possible to add the name of every scaler/trigger here?

@@ -115,6 +122,7 @@ func (h *scaleHandler) DeleteScalableObject(scalableObject interface{}) error {
cancel()
}
h.scaleLoopContexts.Delete(key)
h.recorder.Event(withTriggers, corev1.EventTypeNormal, eventreason.KEDAScalersStopped, "Stopped scalers watch")

Member

@ahmelsayed Is it possible to add the name of every scaler/trigger here?
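
A hedged sketch of what including the trigger names in these two messages could look like. The `trigger` type and `scalersMessage` helper are hypothetical and not part of the merged change; the message prefix follows the "Started/Stopped scalers watch" strings in the diffs above.

```go
package main

import (
	"fmt"
	"strings"
)

// trigger is a hypothetical, minimal view of a trigger: its type and an
// optional name.
type trigger struct {
	Type string
	Name string
}

// scalersMessage builds the event message for the given action ("Started" or
// "Stopped") and appends the triggers it applies to, if any.
func scalersMessage(action string, triggers []trigger) string {
	if len(triggers) == 0 {
		return action + " scalers watch"
	}
	labels := make([]string, 0, len(triggers))
	for _, tr := range triggers {
		if tr.Name != "" {
			labels = append(labels, tr.Type+"/"+tr.Name)
		} else {
			labels = append(labels, tr.Type)
		}
	}
	return fmt.Sprintf("%s scalers watch for triggers: %s", action, strings.Join(labels, ", "))
}

func main() {
	triggers := []trigger{{Type: "kafka"}, {Type: "cron", Name: "nightly"}}
	fmt.Println(scalersMessage("Started", triggers))
	fmt.Println(scalersMessage("Stopped", nil))
}
```

Keeping the reason constant and varying only the message means existing consumers that match on the reason would keep working.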

ycabrer pushed a commit to ycabrer/keda that referenced this pull request Mar 1, 2021
* Emit kubernetes events from KEDA

Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>

* CR comments

Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>

* Fix CI errors

Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>

* goimports

Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>

* Code review comments

Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>

* Fix CHANGELOG.md

Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com>
Rodolfodc pushed a commit to sidilabs/keda that referenced this pull request Mar 11, 2021
Successfully merging this pull request may close these issues.

Emit Kuberentes Events on major KEDA events