
spike: separate out each validator to its own StatefulSet #301

Open
Anmol1696 opened this issue Nov 4, 2023 · 2 comments
@Anmol1696 (Collaborator)
Overview

With the implementation of ingress, routing traffic to individual pods of a StatefulSet is non-trivial, and might require embedding an Envoy proxy with Istio.

Currently

Each chain is represented with 2 StatefulSets:

  • genesis: 1 replica
  • validator: n replicas

The number of validators exposed to users is a simple number that can be set. Internal routing of traffic to an individual validator relies on the headless service for the validator StatefulSet, using validator-<n>.validator.$NAMESPACE.svc.cluster.local. However, these per-pod addresses are not available to simple ingress rules.
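The layout above can be sketched roughly as follows; the names, labels, image, and port are assumptions inferred from the DNS pattern, not taken from this repo:

```yaml
# Sketch of the current layout (assumed names): a headless Service "validator"
# gives each StatefulSet pod the stable address
# validator-<n>.validator.$NAMESPACE.svc.cluster.local.
apiVersion: v1
kind: Service
metadata:
  name: validator
spec:
  clusterIP: None           # headless: DNS resolves to individual pod IPs
  selector:
    app: validator          # assumed pod label
  ports:
    - name: rpc
      port: 26657           # assumed RPC port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: validator
spec:
  serviceName: validator    # ties pod DNS to the headless Service above
  replicas: 3               # n validators
  selector:
    matchLabels:
      app: validator
  template:
    metadata:
      labels:
        app: validator
    spec:
      containers:
        - name: node
          image: chain-node:latest   # placeholder image
          ports:
            - containerPort: 26657
```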

Proposal

An alternative to the current approach of 1 StatefulSet is to have one StatefulSet per validator. This adds the overhead of many StatefulSets, but removes the overhead of tricky networking to route traffic.

Additional benefit: each validator will have its own Service.
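Under the proposal, each validator would get a 1-replica StatefulSet plus a plain ClusterIP Service that ingress rules can target directly. A minimal sketch (names, image, and port are assumptions):

```yaml
# Hypothetical per-validator layout; repeat for validator-2, validator-3, ...
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: validator-1
spec:
  serviceName: validator-1
  replicas: 1
  selector:
    matchLabels:
      app: validator-1
  template:
    metadata:
      labels:
        app: validator-1
    spec:
      containers:
        - name: node
          image: chain-node:latest   # placeholder image
          ports:
            - containerPort: 26657
---
apiVersion: v1
kind: Service
metadata:
  name: validator-1          # plain ClusterIP Service, usable as an ingress backend
spec:
  selector:
    app: validator-1
  ports:
    - name: rpc
      port: 26657
```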

Alternatives

  • CRDs: starting to look promising; we could simplify our own overhead of Services, StatefulSets, etc. Not sure what this will look like though.
  • Service mesh: would need to be integrated anyway for observability. It might make sense to have a reverse proxy in place, so we could route traffic via the istio-ingress to individual pods of the StatefulSets.
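If the service-mesh route is taken, one possible way to reach an individual StatefulSet pod through istio-ingress is a DestinationRule subset keyed on the statefulset.kubernetes.io/pod-name label that Kubernetes sets on StatefulSet pods. This is only a sketch; the hostnames, gateway name, and namespace are placeholders and the approach has not been validated here:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: validator
spec:
  host: validator.$NAMESPACE.svc.cluster.local
  subsets:
    - name: validator-1
      labels:
        statefulset.kubernetes.io/pod-name: validator-1  # label set by Kubernetes
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: validator-1
spec:
  hosts:
    - validator-1.example.com        # placeholder external host
  gateways:
    - istio-ingress                  # placeholder gateway name
  http:
    - route:
        - destination:
            host: validator.$NAMESPACE.svc.cluster.local
            subset: validator-1
```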
Anmol1696 added the spike label Nov 4, 2023
@Anmol1696 (Collaborator, Author)

Note: each validator will have its own domain. Something like: genesis.<chain>.<domain>, validator-1.<chain>.<domain>.
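With one Service per validator, the per-node domains could map to plain Ingress host rules. A sketch, assuming the per-validator Services from the proposal; the hostnames stand in for genesis.<chain>.<domain> and validator-1.<chain>.<domain>, and the port is an assumption:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: chain-validators
spec:
  rules:
    - host: genesis.chain.example.com       # genesis.<chain>.<domain>
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: genesis
                port:
                  number: 26657
    - host: validator-1.chain.example.com   # validator-1.<chain>.<domain>
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: validator-1
                port:
                  number: 26657
```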

@Anmol1696 (Collaborator, Author)

Anmol1696 commented Nov 4, 2023

Hmm, interesting. The restart itself could be due to the pod consuming more memory than its limits.
After the validator nodes have been running for 23h, memory utilization is close to 1.9Gi/2Gi:

│ devnet mesh-2-genesis-0  ●    3/3  1 Running   97  1975  5  5  60  60 10.244.0.95  pool-11oli7pbe-xn7o7    23h       │

Need to do more research on memory requirements for longer-running devnets.
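One short-term mitigation would be to raise the container memory limit in the pod spec. A sketch; the 2Gi figure comes from the k9s output above, while the 4Gi target is an assumption to be validated by the memory research:

```yaml
# Fragment of the validator container spec (hypothetical values).
resources:
  requests:
    memory: 2Gi   # current observed usage sits near this
  limits:
    memory: 4Gi   # headroom for long-running devnets; needs validation
```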
