Nodetool drain for C* #21

pavolloffay · 2017-07-25T15:09:24Z

Fixes part of https://github.com/uber/jaeger/issues/175.

@mwringe @jpkrohling could you please review?

jpkrohling

I'd rather use a shell script similar to what Hawkular Metrics does.

jpkrohling · 2017-07-25T15:50:17Z

https://github.com/openshift/origin-metrics/blob/master/deployer/templates/hawkular-cassandra-node-emptydir.yaml#L110

https://github.com/openshift/origin-metrics/blob/master/cassandra/cassandra-prestop.sh

pavolloffay · 2017-07-25T19:09:34Z

@jpkrohling how do you want to get that shell script into the official C* Docker image?

jpkrohling · 2017-07-26T06:19:45Z

I see a few ways, but you might find other (better?) ways as well. A few thoughts:

1. Contribute it directly to the Cassandra project, so that the script is available to everyone on the base image.
2. Embed the script within the command, as CockroachDB does: https://git.io/v7Ork . Note that they use it as the actual command, but the same idea could be applied to a preStop hook. The Origin Metrics script isn't that big, so this is still doable.
3. Create it as a secret or as an entry in a config map and mount it into the container.

The first option is preferable, but if you need this feature quickly, you might want to use one of the others.

pavolloffay · 2017-07-26T06:55:03Z

Why is there this?
["/bin/sh", "-c", "PID=$(pidof java) && kill $PID && while ps -p $PID > /dev/null; do sleep 1; done"]
I see that k8s c* is using it, but why?
https://github.com/kubernetes/examples/blob/master/cassandra/cassandra-statefulset.yaml#L40

About 1: I don't think that would happen. It's basically a one-liner, and it depends on what you want to do with the output (or whether you want to do anything with it at all). We currently do not define a CASSANDRA_DATA_VOLUME variable as hawkular-metrics does, so we cannot store the output of that command anywhere; we should just run it.

I'm more for the simplest solution.

jpkrohling · 2017-07-26T07:44:57Z

Why is there this?

It looks like you disconnected when I talked to @mwringe about this. It comes originally from the Kubernetes Cassandra StatefulSet example and we are not sure why it's there.

Even though nodetool drain is a one-liner, the Origin Metrics code does have some logic for locking, so that no two nodetool drain runs happen at the same time on the same node.

I'm also for the simplest solution :) Do you think just adding nodetool drain to the preStop would suffice?

pavolloffay · 2017-07-26T07:51:57Z

the Origin Metrics code does have some logic for locking,

Where is this logic? Can we do the same locking?

jpkrohling · 2017-07-26T07:58:12Z

Where is this logic? Can we do the same locking?

https://github.com/openshift/origin-metrics/blob/master/cassandra/cassandra-prestop.sh

EDIT: sorry, I read the code too fast :) I thought it was a locking mechanism, but it clearly isn't

pavolloffay · 2017-07-26T08:03:14Z

The java kill hack was introduced in kubernetes/kubernetes#39199. I think it's not necessary if we run nodetool drain.

Some other references about nodetool drain:

Support Cassandra in PetSet kubernetes/kubernetes#24030
PR Pet Set Example for Cassandra kubernetes/kubernetes#30577

pavolloffay · 2017-07-26T09:09:53Z

travis/install-start-minikube.sh

@@ -24,7 +24,7 @@ mkdir $HOME/.kube || true
 touch $HOME/.kube/config

 export KUBECONFIG=$HOME/.kube/config
-sudo -E ./minikube start --vm-driver=none --use-vendored-driver


They removed --use-vendored-driver in the latest version, which was released yesterday.

We could pin a specific minikube version to avoid failures like this, but I prefer to use the latest for now.
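
For reference, pinning could look roughly like the sketch below. It assumes the minikube binary is downloaded from the public release bucket. The `minikube_url` helper and the `MINIKUBE_VERSION` variable are hypothetical names that do not exist in travis/install-start-minikube.sh, and `v0.20.0` in the usage comment is only an illustrative tag.

```shell
#!/bin/sh
# Sketch only: minikube_url and MINIKUBE_VERSION are hypothetical names,
# not taken from travis/install-start-minikube.sh.

# Print the download URL for a given minikube release tag.
# With no argument it falls back to "latest", which matches the current behavior.
minikube_url() {
    version="${1:-latest}"
    echo "https://storage.googleapis.com/minikube/releases/${version}/minikube-linux-amd64"
}

# Example usage (not executed here):
#   MINIKUBE_VERSION=v0.20.0
#   curl -Lo minikube "$(minikube_url "$MINIKUBE_VERSION")" && chmod +x minikube
```

Leaving `MINIKUBE_VERSION` unset keeps the build on the latest release, so pinning stays a one-line change if another flag is removed upstream.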

mwringe · 2017-07-26T13:06:13Z

About:
["/bin/sh", "-c", "PID=$(pidof java) && kill $PID && while ps -p $PID > /dev/null; do sleep 1; done"]

I suspect this is here because in previous versions of Kubernetes the preStop hook would block, so having this in the preStop hook would make the pod wait until Cassandra has fully exited. Without this, Cassandra would be killed after a timeout (default 10s?), which could cause data corruption.

But this is no longer the case and it can no longer be assumed to work this way. Even with a prestop hook, your pod will be terminated after a certain timeout (default 30s?).
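
As a reading aid, the one-liner quoted above can be unrolled into a small script. This is only a sketch. The helper names `wait_for_exit` and `stop_and_wait` and the timeout argument are assumptions added for illustration; the Kubernetes example polls with `ps -p` and has no timeout of its own.

```shell
#!/bin/sh
# Sketch only: unrolled form of the kill-and-wait one-liner.
# wait_for_exit, stop_and_wait and the timeout are hypothetical additions.

# Poll once per second until the process is gone or the timeout (seconds) elapses.
# Returns 0 if the process exited, 1 if it was still running at the timeout.
wait_for_exit() {
    pid="$1"
    timeout="${2:-30}"
    elapsed=0
    # kill -0 sends no signal; it only checks whether the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Send SIGTERM (the default signal of kill) and then wait for the process to exit.
stop_and_wait() {
    kill "$1" && wait_for_exit "$1" "${2:-30}"
}

# Example usage inside the container (not executed here):
#   stop_and_wait "$(pidof java)" 30
```

The explicit timeout makes the limit visible: the hook cannot hold the pod open past the grace period, whatever the loop does.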

The way you handle it now is to specify a large terminationGracePeriodSeconds value, which gives your application more time to fully shut down (which is what the PR adds).

I believe all you want now for a prestop hook is to just call nodetool drain. Having it in a separate script is nice and makes it easier to output information to file so that if there is a problem it can be debugged later.
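
A separate script along those lines might look like the sketch below. It is an assumption-heavy illustration, not the Origin Metrics script: `drain_node`, `DRAIN_LOG`, `DRAIN_CMD` and the default log path are hypothetical, and the image in this PR does not currently define a location for such a log.

```shell
#!/bin/sh
# Sketch only: drain_node, DRAIN_LOG, DRAIN_CMD and the log path are hypothetical.

# Run the drain command and append its output to a log file for later debugging.
# Returns the exit code of the drain command.
drain_node() {
    log_file="${DRAIN_LOG:-/tmp/cassandra-prestop.log}"
    drain_cmd="${DRAIN_CMD:-nodetool drain}"

    echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) starting: $drain_cmd" >> "$log_file"
    # $drain_cmd is deliberately unquoted so that "nodetool drain" is split
    # into the command and its argument.
    if $drain_cmd >> "$log_file" 2>&1; then
        echo "drain finished" >> "$log_file"
        return 0
    else
        code=$?
        echo "drain failed with exit code $code" >> "$log_file"
        return "$code"
    fi
}

# A real preStop script would end by calling the function:
#   drain_node
```

Overriding `DRAIN_CMD` allows the logging path to be exercised without a running Cassandra node.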

pavolloffay · 2017-07-26T14:03:31Z

@mwringe thanks for the explanation 👍

@jpkrohling what is missing to merge? I don't think that there is a benefit in having a separate script for this. We currently don't have a place to publish the result of it.

jpkrohling

LGTM

pavolloffay · 2017-07-26T14:18:49Z

@jpkrohling thanks I will open the same for OC

Nodetool drain on preStop

2bd1648

jpkrohling reviewed Jul 25, 2017


pavolloffay added 2 commits July 26, 2017 10:34

Use only nodetool drain in preStop

59cbb7a

fix minikube

f709d43

pavolloffay commented Jul 26, 2017


jpkrohling approved these changes Jul 26, 2017


pavolloffay merged commit 65863f3 into jaegertracing:master Jul 26, 2017

This was referenced Jul 26, 2017

Cassandra nodetool drain jaegertracing/jaeger-openshift#27

Merged

Improve Cassandra deployment for Kubernetes / OpenShift jaegertracing/jaeger#175

Closed
