Rollouts doesn't drop old replicaset/pods #3836

Open
1 of 2 tasks
ajax-bychenok-y opened this issue Sep 17, 2024 · 2 comments
Labels
bug Something isn't working

Comments


ajax-bychenok-y commented Sep 17, 2024

Checklist:

  • I've included steps to reproduce the bug.
  • I've included the version of Argo Rollouts.

Describe the bug

Sometimes Argo Rollouts switches the release to a new version (ReplicaSet) but doesn't remove the old one, so its pods keep running indefinitely.
After some digging I realized the cause is a blank value in the annotation argo-rollouts.argoproj.io/scale-down-deadline: "" where a valid date should be set. Because of that, the controller can never remove the old ReplicaSet later.
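
To illustrate the failure mode, here is a minimal sketch (not the actual Argo Rollouts controller code) assuming the controller parses the annotation value as an RFC3339 timestamp and only scales the ReplicaSet down once that time has passed — an empty string can never satisfy that condition:

```go
package main

import (
	"fmt"
	"time"
)

// scaleDownDue reports whether a ReplicaSet carrying the given
// scale-down-deadline annotation value is due for scale-down.
func scaleDownDue(value string, now time.Time) (bool, error) {
	deadline, err := time.Parse(time.RFC3339, value)
	if err != nil {
		// An empty string never parses as a timestamp, so the deadline
		// never fires and the old ReplicaSet is never scaled down.
		return false, fmt.Errorf("invalid scale-down-deadline %q: %w", value, err)
	}
	return now.After(deadline), nil
}

func main() {
	now := time.Now()
	for _, v := range []string{"2024-09-19T09:35:41Z", ""} {
		due, err := scaleDownDue(v, now)
		fmt.Printf("annotation=%q due=%v err=%v\n", v, due, err)
	}
}
```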

To Reproduce

I have no reproduction steps for this problem because it occurs sporadically after the rollout process. Here is the process of my digging:

#1761 (comment)
#1761 (comment)

Expected behavior

The controller should remove the pods of the non-active ReplicaSet.

Screenshots

svc-c6dcc48cb is still alive even though the newer ReplicaSet svc-865f9fcf88 has already been removed and an even newer one, svc-74ff5588fb, is currently serving.

(screenshot of the ReplicaSet list)

Version

app version: v1.7.1+6a99ea9
helm version: "2.37.1"

Logs

No logs for now.


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

ajax-bychenok-y added the bug (Something isn't working) label on Sep 17, 2024
zachaller (Collaborator) commented:

I think this is fixed in 1.7.2, or at least improved. Can you try it?

ajax-bychenok-y (Author) commented:


This is the log for v1.7.1+6a99ea9 (we are going to update to the latest version soon, as advised):
rollouts-fail-old-replica.json

The most interesting part is:

{"level":"info","msg":"Set 'scale-down-deadline' annotation on 'some-svc-55468fc7cb' to 2024-09-19T09:35:41Z (30s)","namespace":"staging-a","rollout":"some-svc","time":"2024-09-19T09:35:11Z"}
{"level":"info","msg":"synced ephemeral metadata nil to Pod some-svc-55468fc7cb-lkr5l","namespace":"staging-a","rollout":"some-svc","time":"2024-09-19T09:35:12Z"}
{"level":"info","msg":"synced ephemeral metadata nil to Pod some-svc-55468fc7cb-p25vp","namespace":"staging-a","rollout":"some-svc","time":"2024-09-19T09:35:12Z"}
{"level":"info","msg":"Conflict when updating replicaset some-svc-55468fc7cb, falling back to patch","namespace":"staging-a","rollout":"some-svc","time":"2024-09-19T09:35:12Z"}
{"level":"info","msg":"Patching replicaset with patch: {\"metadata\":{\"annotations\":{\"rollout.argoproj.io/desired-replicas\":\"2\",\"rollout.argoproj.io/revision\":\"235\",\"scale-down-deadline\":\"\"},\"labels\":{\"rollouts-pod-template-hash\":\"55468fc7cb\"}},\"spec\":{\"replicas\":2,\"selector\":{\"matchLabels\":{\"rollouts-pod-template-hash\":\"55468fc7cb\"}},\"template\":{\"metadata\":{\"annotations\":{\"ad.datadoghq.com/some-svc.checks\":\"{\\n  \\\"jmx\\\": {\\n    \\\"init_config\\\": {\\n      \\\"is_jmx\\\": true,\\n      \\\"collect_default_metrics\\\": true,\\n      \\\"collect_default_jvm_metrics\\\": true,\\n      \\\"new_gc_metrics\\\": true\\n    },\\n    \\\"instances\\\": [{\\n      \\\"host\\\": \\\"%%host%%\\\",\\n      \\\"port\\\": 8855\\n    }]\\n  }\\n}\\n\"},\"labels\":{\"app.kubernetes.io/instance\":\"some-svc\",\"app.kubernetes.io/managed-by\":\"Helm\",\"app.kubernetes.io/name\":\"some-svc\",\"env_name\":\"staging\",\"env_tag\":\"a\",\"helm.sh/chart\":\"some-svc-0.26.0-773.RELEASE\",\"rollouts-pod-template-hash\":\"55468fc7cb\"}}}}}","namespace":"staging-a","rollout":"some-svc","time":"2024-09-19T09:35:12Z"}
{"level":"info","msg":"synced ephemeral metadata nil to ReplicaSet some-svc-55468fc7cb","namespace":"staging-a","rollout":"some-svc","time":"2024-09-19T09:35:12Z"}
{"generation":485,"level":"info","msg":"No status changes. Skipping patch","namespace":"staging-a","resourceVersion":"185480139","rollout":"some-svc","time":"2024-09-19T09:35:12Z"}
{"generation":485,"level":"info","msg":"Reconciliation completed","namespace":"staging-a","resourceVersion":"185480139","rollout":"some-svc","time":"2024-09-19T09:35:12Z","time_ms":74.843767}

As a result, it sets argo-rollouts.argoproj.io/scale-down-deadline to '' and the old ReplicaSet never goes down.
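
For anyone hitting the same thing, here is a small diagnostic sketch (not part of Argo Rollouts; the namespace and kubeconfig flags are only illustrative) that lists ReplicaSets stuck with a present-but-empty scale-down-deadline annotation, which per the log above is the state that never gets cleaned up:

```go
package main

import (
	"context"
	"flag"
	"fmt"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/homedir"
)

// Annotation key used by Argo Rollouts for delayed scale-down of old ReplicaSets.
const scaleDownDeadline = "argo-rollouts.argoproj.io/scale-down-deadline"

func main() {
	ns := flag.String("namespace", "staging-a", "namespace to inspect (illustrative default)")
	kubeconfig := flag.String("kubeconfig", filepath.Join(homedir.HomeDir(), ".kube", "config"), "path to kubeconfig")
	flag.Parse()

	cfg, err := clientcmd.BuildConfigFromFlags("", *kubeconfig)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	rsList, err := client.AppsV1().ReplicaSets(*ns).List(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, rs := range rsList.Items {
		// A present-but-empty deadline is the state described in this issue:
		// it can never be treated as expired, so the ReplicaSet stays up.
		if v, ok := rs.Annotations[scaleDownDeadline]; ok && v == "" {
			fmt.Printf("stuck ReplicaSet: %s/%s (ready replicas: %d)\n", rs.Namespace, rs.Name, rs.Status.ReadyReplicas)
		}
	}
}
```

A ReplicaSet found this way can be re-annotated with a valid RFC3339 deadline, or scaled down manually, as a stop-gap until the controller behaviour is fixed.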
