[incubator/jaeger] Improve the Jaeger chart #3109
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Force-pushed from 3fc92b6 to dac7a59.
Thanks to @mikelorant for helping me with this PR.
/assign @foxish
/assign @unguiculus
/ok-to-test
Unassigning myself - don't know enough about Jaeger to approve/review.
incubator/jaeger/values.yaml
image: jaegertracing/spark-dependencies
tag: latest
pullPolicy: Always
schedule: "*/12 * * * *"
It should run just before the end of the day. Let's say 5-10 min for startup.
I configured it to run every day @ 23:49
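For reference, a daily run at 23:49 corresponds to the cron expression below. This is only a sketch: the parent key `spark` is an assumption about where the chart nests these values, and the schedule is evaluated in the time zone of the cluster's controller manager, which is commonly UTC.

```yaml
# values.yaml fragment (sketch; the `spark` parent key is assumed, not taken from the chart)
spark:
  image: jaegertracing/spark-dependencies
  tag: latest
  pullPolicy: Always
  # Fields: minute hour day-of-month month day-of-week -> 23:49 every day
  schedule: "49 23 * * *"
```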
does it expose
@pavolloffay Take a look at the
Force-pushed from 6e125e9 to cec83a7.
Will take a look this afternoon as well. Been out of the office.
Giving this a test drive today... Test Cases:
Will be adding a few more...
@pavolloffay is there any reason why we need both sets of parameters?
It is possible to replace
@dvonthenen The only reason there are two sets of
@pavelnikolov I think it would go a long way in terms of usability to consolidate. Minimally, there needs to be an update to the README to update the example command line in the
Found a couple of issues when trying to deploy the chart using previously deployed cassandra and elasticsearch stores.
cassandra.datacenter.name: {{ .Values.cassandra.config.dc_name | quote }}
cassandra.keyspace: {{ printf "%s_%s" "jaeger_v1" .Values.cassandra.config.dc_name | quote }}
cassandra.schema.mode: {{ .Values.schema.mode | quote }}
cassandra.servers: {{ template "cassandra.fullname" . }}
This forces the external or previously deployed cassandra host to be -cassandra. This should be set to `.Values.storage.cassandra.host`.
Seems like we might need another define in helpers, like you did for elasticsearch, to toggle between using the template `"cassandra.fullname"` and `.Values.storage.cassandra.host`.
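A minimal sketch of such a helper, modelled on the chart's `elasticsearch.client.url` helper. The helper name `cassandra.host` is hypothetical; the values keys `provisionDataStore.cassandra` and `storage.cassandra.host` and the `cassandra.fullname` template are the ones already mentioned in this thread.

```yaml
{{/*
Hypothetical helper: resolve the Cassandra host.
Uses the provisioned Cassandra service name when the chart deploys Cassandra,
otherwise falls back to the externally supplied host.
*/}}
{{- define "cassandra.host" -}}
{{- if .Values.provisionDataStore.cassandra -}}
{{- template "cassandra.fullname" . -}}
{{- else -}}
{{- .Values.storage.cassandra.host -}}
{{- end -}}
{{- end -}}
```

The ConfigMap entry could then read `cassandra.servers: {{ template "cassandra.host" . }}`, so an external host is no longer overridden by the release-derived name.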
Good catch! I've tested provisioned cassandra, external ES and provisioned ES. This was the only scenario that I haven't tested. I'll fix it right away. Thank you very much for catching this!
collector.http-port: {{ .Values.collector.service.httpPort | quote }}
collector.port: {{ .Values.collector.service.tchannelPort | quote }}
collector.zipkin.http-port: {{ .Values.collector.service.zipkinPort | quote }}
es.server-urls: {{ template "elasticsearch.client.url" . }}
This forces the external or previously deployed elasticsearch host to be -elasticsearch. This should be set to `.Values.storage.elasticsearch.host`.
What values are you using it with? I have tested this with both external and provisioned ES and it is working fine for me. Take a closer look at this helper:
{{- define "elasticsearch.client.url" -}}
{{- $port := .Values.storage.elasticsearch.port | toString -}}
{{- if .Values.provisionDataStore.elasticsearch -}}
{{- $host := printf "%s-%s-%s" .Release.Name "elasticsearch" "client" | trunc 63 | trimSuffix "-" -}}
{{- printf "%s://%s:%s" .Values.storage.elasticsearch.scheme $host $port }}
{{- else }}
{{- printf "%s://%s:%s" .Values.storage.elasticsearch.scheme .Values.storage.elasticsearch.host $port }}
{{- end -}}
{{- end -}}
It doesn't force anything, right?
Sorry.... small goof. I got hit by the `storage.type` thing again by not setting it. I corrected that now, but it looks like the collector and query components are having problems connecting to elasticsearch.
These are the command line parameters I am running with:
--name myrel --set provisionDataStore.cassandra=false --set provisionDataStore.elasticsearch=false --set storage.type=elasticsearch --set storage.elasticsearch.host=elasticsearch --set storage.elasticsearch.port=9200 --set storage.elasticsearch.user=elastic --set storage.elasticsearch.password=changeme
Get pods output:
[dev@k8controller ~]$ kubectl get pods
NAME                                     READY   STATUS             RESTARTS   AGE
elasticsearch-0                          1/1     Running            0          24m
elasticsearch-1                          1/1     Running            0          24m
elasticsearch-2                          1/1     Running            0          24m
myrel-jaeger-agent-31jxv                 1/1     Running            0          8m
myrel-jaeger-agent-89kxf                 1/1     Running            0          8m
myrel-jaeger-agent-gldbw                 1/1     Running            0          8m
myrel-jaeger-agent-lh8tx                 1/1     Running            0          8m
myrel-jaeger-agent-t7c0h                 1/1     Running            0          8m
myrel-jaeger-agent-wgvrm                 1/1     Running            0          8m
myrel-jaeger-collector-623630575-b32fr   0/1     CrashLoopBackOff   6          8m
myrel-jaeger-query-3159756439-nk7c6      0/1     CrashLoopBackOff   6          8m
Error:
{"level":"fatal","ts":1515108228.394109,"caller":"collector/main.go:99","msg":"Unable to set up builder","error":"health check timeout: no Elasticsearch node available"
Verified the configmap has the correct value:
Name:         myrel-jaeger
Namespace:    default
Labels:       app=jaeger
              chart=jaeger-0.3.0
              heritage=Tiller
              jaeger-infra=common-configmap
              release=myrel
Annotations:  <none>
Data
====
...
es.server-urls:
----
http://elasticsearch:9200
...
Oh, good catch. I've tested with ES without username and password. I'll add those two if they are set. As you can see in this file on lines 60 and 62 they haven't been implemented.
@pavelnikolov I would also recommend setting the default username and password to elastic/changeme since those are the defaults.
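A sketch of how the two missing entries could be added to the ConfigMap only when they are set. `es.username` and `es.password` are the Jaeger option names; the values paths follow the `--set` flags used above. Where exactly these lines belong in the chart's ConfigMap template is an assumption.

```yaml
# ConfigMap data fragment (sketch): emit credentials only if provided
{{- if .Values.storage.elasticsearch.user }}
es.username: {{ .Values.storage.elasticsearch.user | quote }}
{{- end }}
{{- if .Values.storage.elasticsearch.password }}
es.password: {{ .Values.storage.elasticsearch.password | quote }}
{{- end }}
```

Note that a ConfigMap stores the password in plain text, so a Secret may be a better home for it.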
@pavelnikolov For me, it's good ✅ My PR #3100 was merged so you will probably have to rebase/update your branch.
Force-pushed from 5ec2c5d to 9153093.
Just rebased.
name: http
protocol: TCP
- containerPort: 9411
- containerPort: {{ .Values.collector.service.zipkinPort }}
@ledor473 I used the named ports from your PR, but the values are no longer hard-coded. Now they come from the `values.yaml` and are mapped in the config map.
@pavelnikolov yes that is even better considering the Collector Pod will listen on that port instead of doing port-forwarding at the Service level 👍
/assign @unguiculus
@unguiculus any changes required? Did you want me to assign this PR to someone else? Or break it into smaller pieces?
/assign @mattfarina
Return the appropriate apiVersion for cronjob APIs.
*/}}
{{- define "cronjob.apiVersion" -}}
{{- if ge .Capabilities.KubeVersion.Minor "8" -}}
This won't work once 1.10 is out. Also, this expects alpha features to be enabled for < 1.8. Better use `Capabilities.APIVersions.Has`.
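The version check breaks on 1.10 because `ge` compares the two strings lexicographically, so "10" sorts before "8". A sketch of the capability-based alternative, assuming CronJob is served as `batch/v1beta1` on newer clusters and only as `batch/v2alpha1` (alpha, must be enabled) on older ones:

```yaml
{{/*
Return the appropriate apiVersion for cronjob APIs.
Sketch: asks the cluster which API group versions it serves instead of
comparing Kubernetes minor versions as strings.
*/}}
{{- define "cronjob.apiVersion" -}}
{{- if .Capabilities.APIVersions.Has "batch/v1beta1" -}}
batch/v1beta1
{{- else -}}
batch/v2alpha1
{{- end -}}
{{- end -}}
```

A CronJob template would then start with `apiVersion: {{ template "cronjob.apiVersion" . }}`.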
Done. I didn't know this was possible! I just learned something new! 🎉
As for:
> Also, this expects alpha features to be enabled for...
This has been documented in the Prerequisites section of the `README.md` file, and the spark cronjob is disabled by default.
Force-pushed from 44e6f84 to 0febc72.
/assign @unguiculus
@dvonthenen there have been some small changes to it since you said LGTM. Still LGTM?
Will check it out
LGTM
@@ -1,28 +1,29 @@
{{- if .Values.collector.enabled -}}
apiVersion: extensions/v1beta1
A quick note: Deployments as `extensions/v1beta1` will be removed from Kubernetes soon. Since then there have been `apps/v1beta1`, `apps/v1beta2`, and `apps/v1` API versions. For broad compatibility we'd like to see charts use `apps/v1beta2` for the moment. In k8s 1.10 the API versions prior to that can be removed.
This is just a note. Not going to have it hold up this pull request.
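For whoever picks this up later, a rough sketch of what the collector Deployment header could look like on `apps/v1beta2`. The resource name and the `component` label are illustrative assumptions, not taken from the chart; the `app` and `release` labels mirror the ones visible on the ConfigMap above. The main behavioural difference is that `apps/v1beta2` no longer defaults `spec.selector`, so it has to be written out and must match the pod template labels.

```yaml
{{- if .Values.collector.enabled -}}
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  # Illustrative name; the chart's own naming helper should be used instead
  name: {{ .Release.Name }}-jaeger-collector
spec:
  # Required in apps/v1beta2: the selector is no longer derived from the template labels
  selector:
    matchLabels:
      app: jaeger
      component: collector  # hypothetical label
      release: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: jaeger
        component: collector  # must match the selector above
        release: {{ .Release.Name }}
    # Pod spec (containers, ports, env) stays as in the existing template
{{- end -}}
```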
Thank you! I'll make sure I update it when 1.10 is out.
@mattfarina should I remove `extensions/v1beta1` now that k8s 1.10 is out?
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: mattfarina, pavelnikolov.
This PR adds the following improvements to the Jaeger chart:
- `provisionDataStore` key in the `values.yaml` file instead of `tags` to configure data store provisioning
- `NodePort` and `ClusterIP` service types