[v2] Implement the scaledjob controler for v2 #945

TsuyoshiUshio · 2020-07-22T18:04:56Z

I open this PR for start conversation. This is the initial implementation for the v2 scaled job controller.
This version is for implementing the v1 logic with v2 structure/design(a.k.a. refactoring). Once this is accepted, I'd like to send
other PRs for bug fixes and new features. I tested with Rabbit MQ scaler. Any feedback is welcome.

I don't write test now since the design review might comes first, then I'll write the test. I also want to check if this change doesn't broke anything.

Change introduced

Implement the scale job controller and finalizer with V1 based logic
scaleHandler L224 calls Close() method and it cause a problem of missing connection
ReconcileScaledJob update
Add Local debugging feature for the rabbit-mq scaler

I'd like to discuss when should we call the Close() method. I need to understand life-cycle and behavior more.
If you guys are ok. I'd like to add a sample for getting started for the scaled job for rabbit-mq with documentation.

Checklist

A PR is opened to update the documentation on https://github.com/kedacore/keda-docs
[-] Commits are signed with Developer Certificate of Origin (DCO)
Tests have been added

Fixes #
#789

pkg/controller/scaledjob/scaledjob_controller.go

pkg/scaling/scale_handler.go

pkg/scalers/rabbitmq_scaler.go

Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com>

zroubalik

It's a good start! The scaling logic needs to be improved though, as mentioned for example in these issues (and maybe in others) #636 #801 #525 kedacore/keda-docs#187 #529

The issue described in those is caused by the similar problem. So don't be afraid to modify the logic a lot, modify the ScaleJob type if needed etc. We need to change a way how KEDA resolves number of created jobs during each pollingInterval.

pkg/controller/scaledjob/scaledjob_controller.go

deploy/crds/keda.sh_scaledjobs_crd.yaml

TsuyoshiUshio · 2020-07-23T17:44:26Z

Hi @zroubalik , @ahmelsayed
I'd like to ask you two things. I'll introduce a change of the scale logic at least to cover #636 and related. (However, on this PR, includes the issue that you mentioned only) then I'll create subsequent PRs.

The PR that is mentioned the issue already has a good solution, however, the naming looks not good for avoiding breaking change for v1. So if there is a case, the metric is different between ScaledObject and ScaledJobs, I'd like to separate the metric interface as the the PR suggested. Since the V2 can introduce breaking change from V1, I'd like to introduce the change for the scaler interface. Do you think it is a good idea? or Any suggestion?

I'm not sure if we need the different metrics between ScaledObject and ScaledJob though.

I'm asking the original author why he introduce the interface change. #639 (comment)

Until I've got the answer from the author, I'll keep the interface as-is and try solving the issue.

type Scaler interface {

	GetMetrics(ctx context.Context, metricName string, metricSelector labels.Selector) ([]external_metrics.ExternalMetricValue, error)

	GetMetricSpecForScaleObject() []v2beta2.MetricSpec

	GetMetricSpecForScaleJob() []v2beta2.MetricSpec

	IsActive(ctx context.Context) (bool, error)

	// Close any resources that need disposing when scaler is no longer used or destroyed
	Close() error
}

TsuyoshiUshio · 2020-07-24T05:08:17Z

I'm learning to keep the current interface. Since I can't imagine any case that we need to use different scale metrics between deployment and jobs. If there is specific case is there, please let me know.

zroubalik · 2020-07-24T10:22:52Z

I'm learning to keep the current interface. Since I can't imagine any case that we need to use different scale metrics between deployment and jobs. If there is specific case is there, please let me know.

@TsuyoshiUshio I can't imagine any usecase for adding a separate metrics for Jobs either . So go ahead as you said with the current interface and if there's gonna be a reason to add it, we can do that in a subsequent PRs.

TsuyoshiUshio · 2020-07-24T15:00:23Z

Hi @zroubalik
Thanks! I'll do that.

… queue Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com>

Co-authored-by: Zbynek Roubalik <726523+zroubalik@users.noreply.github.com> Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com>

Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com>

TsuyoshiUshio · 2020-07-25T23:20:09Z

Hi @ahmelsayed @zroubalik
#636 #525
Might be solved since I implement the logic that only count visible queue if it is queue length is less than 32.
Also, I fix the all requests except for the #801 kedacore/keda-docs#187 #529 . I'm start investigate it one by one, however, merge this PR might be good idea since we can start debugging the new scalejob logic and can work parallel.

As you adoviced, I'll create a separate PR for changing rabbit mq for enabling local testing.

I also create a sample project that we can send/receive with Dockerfile and deploy ScaledJob for Azure Queue Stroage that is written by go.
https://github.com/TsuyoshiUshio/sample-go-storage-queue

TsuyoshiUshio · 2020-07-27T16:43:39Z

@zroubalik
If you are ok, could you merge this? I looked at the issues that you showed, the fix will be different, and sometimes, we need to discuss about specific topic on each PRs. I keep on fix these issues for the V2 release.

* [v2] Implement the scaledjob controler for v2 Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com> * Scale Logic of the Azure Storage Queue for not scaling with invisible queue Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com> * Update pkg/controller/scaledjob/scaledjob_controller.go Co-authored-by: Zbynek Roubalik <726523+zroubalik@users.noreply.github.com> Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com> * rollback the change for rabbit mq and fix the protocol on crd Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com> * Add protocol for the scalejobs_crd Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com> Co-authored-by: Zbynek Roubalik <726523+zroubalik@users.noreply.github.com>

TsuyoshiUshio requested review from ahmelsayed and zroubalik as code owners July 22, 2020 18:04

ahmelsayed reviewed Jul 22, 2020

View reviewed changes

pkg/controller/scaledjob/scaledjob_controller.go Outdated Show resolved Hide resolved

ahmelsayed reviewed Jul 22, 2020

View reviewed changes

pkg/scaling/scale_handler.go Outdated Show resolved Hide resolved

ahmelsayed reviewed Jul 22, 2020

View reviewed changes

pkg/scalers/rabbitmq_scaler.go Outdated Show resolved Hide resolved

TsuyoshiUshio force-pushed the tsushi/refactorjob branch from 9c47512 to 9ab83a5 Compare July 23, 2020 07:01

[v2] Implement the scaledjob controler for v2

a443e91

Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com>

TsuyoshiUshio force-pushed the tsushi/refactorjob branch from 9ab83a5 to a443e91 Compare July 23, 2020 07:04

zroubalik reviewed Jul 23, 2020

View reviewed changes

TsuyoshiUshio and others added 3 commits July 25, 2020 15:35

Scale Logic of the Azure Storage Queue for not scaling with invisible…

ee1d682

… queue Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com>

Update pkg/controller/scaledjob/scaledjob_controller.go

6ffb9ba

Co-authored-by: Zbynek Roubalik <726523+zroubalik@users.noreply.github.com> Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com>

rollback the change for rabbit mq and fix the protocol on crd

d011e2b

Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com>

TsuyoshiUshio force-pushed the tsushi/refactorjob branch from 1165914 to d011e2b Compare July 25, 2020 22:35

Add protocol for the scalejobs_crd

eecdc45

Signed-off-by: Tsuyoshi Ushio <ushio@simplearchitect.com>

ahmelsayed approved these changes Jul 27, 2020

View reviewed changes

ahmelsayed merged commit b203f38 into kedacore:v2 Jul 27, 2020

SpiritZhou pushed a commit to SpiritZhou/keda that referenced this pull request Jul 18, 2023

Add documentation for operator Prometheus metrics. (kedacore#945)

a17d853

hschimmel mentioned this pull request May 14, 2024

Scaledjob creates double amount of replicas on the first time #2488

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[v2] Implement the scaledjob controler for v2 #945

[v2] Implement the scaledjob controler for v2 #945

TsuyoshiUshio commented Jul 22, 2020 •

edited

Loading

zroubalik left a comment •

edited

Loading

TsuyoshiUshio commented Jul 23, 2020 •

edited

Loading

TsuyoshiUshio commented Jul 24, 2020

zroubalik commented Jul 24, 2020

TsuyoshiUshio commented Jul 24, 2020

TsuyoshiUshio commented Jul 25, 2020 •

edited

Loading

TsuyoshiUshio commented Jul 27, 2020

[v2] Implement the scaledjob controler for v2 #945

[v2] Implement the scaledjob controler for v2 #945

Conversation

TsuyoshiUshio commented Jul 22, 2020 • edited Loading

Change introduced

Checklist

zroubalik left a comment • edited Loading

Choose a reason for hiding this comment

TsuyoshiUshio commented Jul 23, 2020 • edited Loading

TsuyoshiUshio commented Jul 24, 2020

zroubalik commented Jul 24, 2020

TsuyoshiUshio commented Jul 24, 2020

TsuyoshiUshio commented Jul 25, 2020 • edited Loading

TsuyoshiUshio commented Jul 27, 2020

TsuyoshiUshio commented Jul 22, 2020 •

edited

Loading

zroubalik left a comment •

edited

Loading

TsuyoshiUshio commented Jul 23, 2020 •

edited

Loading

TsuyoshiUshio commented Jul 25, 2020 •

edited

Loading