
KEDA Kubernetes integration #56

Closed
belemaire opened this issue Jul 22, 2020 · 2 comments

@belemaire
Member

To get maximum scale at minimum cost, the idea is to use the KEDA Kubernetes autoscaler for LiveBundle.

The LiveBundle GitHub application (composed of three services: producer/consumer/queue) needs a node with a good number of vCPUs and RAM, as it is in charge of pulling the PR from the repository and running react-native bundle multiple times on it. Here I am specifically talking about the consumer service; the producer and queue services are very lightweight and we don't need to scale them in any way.

The "problem" here regarding the consumer, is that if we allocate a single node for it, the node will just stand idle most of the time, waiting for PRs to crunch. On the other hand, because we don't want to process multiple PRs in // on the same node (to keep processing time as deterministic as possible), if for some reason one or more PRs are opened while a PR is being processed, then they will be queued for processing by the consumer node, and the client will potentially wait for a while to get a QRCode back (also breaking the guarantee on deterministic processing time). What can be done to mitigate the latter, is to have a certain number of consumer replicas nodes (5 for example), running. But this makes the former problem even worse as we now have to pay the cost of 5 running nodes running idle most of the time. For reference a 4VCPU/8GB ram node in a popular public cloud provider, cost around 40$/month to run 24/24 7/7. This might just be too expensive for a lot of potential public users of LiveBundle (and this is only the cost for one, multiply by 5 if you want 5 of them in case you have high PR throughput).

Unfortunately, Kubernetes out of the box only offers a horizontal pod autoscaler, which can scale the number of pods (or nodes, taking a quick shortcut) based on metrics such as CPU utilization. This is not really useful in our case.

KEDA would solve both of these problems, as it is an event-driven autoscaler that can scale the number of pods based on event-driven metrics, such as the number of messages in a queue. It can also scale the number of pods all the way back down to zero. What this would allow, from a high-level perspective, is to keep the number of consumer nodes at zero (no cost); whenever a message is sent to the queue (a PR to process), KEDA would spin up a node to do the processing, and scale back down when done. If 5 PRs are opened simultaneously, KEDA would spin up 5 different nodes to process the messages (based on configuration). So in theory, costly consumer nodes would only run when there is a PR to process, and would disappear afterward. Given that public cloud providers bill nodes by the minute, this would drastically scale costs down (addressing the first problem), while guaranteeing consistent PR processing time (addressing the second problem).
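To make the flow concrete, here is a minimal sketch of what a scale-to-zero friendly consumer could look like: it pulls a single PR message from the queue, runs the bundling step, and exits, so that the pod KEDA started can complete and the replica count can drop back to zero. This is only an illustration, assuming a RabbitMQ-style queue accessed through amqplib and hypothetical `QUEUE_URL`/`QUEUE_NAME` settings and message shape; the actual LiveBundle queue service and bundling commands may differ.

```ts
// Minimal consumer sketch (assumes a RabbitMQ-style queue via amqplib).
// One message == one PR to bundle; the process exits when done so the
// KEDA-managed pod can complete and the replica count can return to zero.
import * as amqp from "amqplib";
import { execSync } from "child_process";

const QUEUE_URL = process.env.QUEUE_URL ?? "amqp://localhost";
const QUEUE_NAME = process.env.QUEUE_NAME ?? "livebundle-prs"; // hypothetical queue name

async function main(): Promise<void> {
  const conn = await amqp.connect(QUEUE_URL);
  const ch = await conn.createChannel();
  await ch.assertQueue(QUEUE_NAME, { durable: true });

  // Pull a single message instead of subscribing, so each pod handles exactly one PR.
  const msg = await ch.get(QUEUE_NAME, { noAck: false });
  if (msg) {
    const { repoUrl, prBranch } = JSON.parse(msg.content.toString());

    // Simplified bundling step; the real consumer would check out the PR and run
    // `react-native bundle` once per platform before uploading the result.
    execSync(`git clone --depth 1 --branch ${prBranch} ${repoUrl} workdir`, { stdio: "inherit" });
    execSync("yarn install", { cwd: "workdir", stdio: "inherit" });
    execSync(
      "npx react-native bundle --platform android --dev false " +
        "--entry-file index.js --bundle-output /tmp/index.android.bundle",
      { cwd: "workdir", stdio: "inherit" }
    );

    ch.ack(msg);
  }

  await ch.close();
  await conn.close();
  // Exiting here lets the pod complete, which is what allows KEDA to scale back to zero.
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```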

@belemaire
Member Author

Unfortunately, after much experimentation with KEDA ScaledDeployments and ScaledJobs, KEDA is not in a state that can fulfill our needs at this point.

ScaledDeployments will not really help us here, since they scale a long-running Deployment rather than spawning one job per queued message; what we need is ScaledJobs.
But this feature currently suffers from big issues that make it unusable for now. It boils down to the two following issues, which I ran into myself with v1.5 during my exploration:
kedacore/keda#801
kedacore/keda#829

It seems like the KEDA team is now focusing solely on the v2 release and won't really address issues in v1.5.
Hopefully v2 will address all these issues and make it usable for our needs; we just need to be a little patient ;)

@belemaire
Member Author

Closing this one as we moved away from Kubernetes, using a new architecture based solely on cloud storage.
