UPSTREAM: 60978: Fix use of "-w" flag to iptables-restore #18919

Merged

danwinship
Contributor

iptables-restore's option-parsing code is weird and broken, and it requires you to say "-w 5" rather than "-w5". Up until now we've never been running OpenShift against a version of iptables new enough to have that flag, so we didn't notice, but 1.6.2 is now in updates-testing in Fedora 27, and when it hits updates, it will totally break kube-proxy in docker-in-docker. (For master/3.9; older releases still use F25 so won't ever see iptables 1.6.2.)

iptables accepts "-w5" but iptables-restore requires "-w 5"
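As a rough illustration of the difference described above, here is a minimal Python sketch (not the Go code touched by this PR) that builds an `iptables-restore` argument list with the wait time passed as a separate argument. The helper name `restore_command` and its default of 5 seconds are hypothetical, chosen only for this example; the `--noflush` and `--counters` flags are the ones mentioned later in this thread.

```python
def restore_command(wait_seconds=5):
    """Build an iptables-restore argv with the wait time as its own argument.

    Hypothetical helper for illustration only. Per the PR description,
    iptables-restore needs "-w 5" (two arguments), not "-w5" (one argument).
    """
    if wait_seconds < 0:
        raise ValueError("wait_seconds must be non-negative")
    return [
        "iptables-restore",
        "-w", str(wait_seconds),  # two separate argv entries: "-w" and "5"
        "--noflush",
        "--counters",
    ]


if __name__ == "__main__":
    # Print the command line without executing anything.
    print(" ".join(restore_command()))
```

Passing the list to a process launcher as-is (rather than joining it into a single string) keeps `-w` and its value as distinct arguments, which is the form the description says `iptables-restore` requires.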
@danwinship danwinship added kind/bug Categorizes issue or PR as related to a bug. component/networking sig/networking labels Mar 9, 2018
@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 9, 2018
@openshift-merge-robot openshift-merge-robot added the vendor-update Touching vendor dir or related files label Mar 9, 2018
@danwinship
Contributor Author

The iptables update is currently at +3 karma in bodhi (https://bodhi.fedoraproject.org/updates/FEDORA-2018-1c31f1eccd) so it could potentially get pushed to updates at any time. If 3.9 is 100% frozen AND is supposed to be released in the next few days AND the release would be thwarted if extended-networking-minimal failed reliably, then we could consider asking the fedora iptables maintainers to hold the update. Otherwise we should either just get this PR backported quickly, or else revert #18737 on release-3.9 (to put it back to using F25 for dind).

@eparis
Member

eparis commented Mar 9, 2018

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 9, 2018
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, eparis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 9, 2018
@danwinship
Contributor Author

Ah, ok, so the reason that we never noticed this is that we backported the code to add "-w" and "-w NUM" to RHEL, but we didn't backport the later patch to make iptables-restore error out when you pass it a bad option (https://git.netfilter.org/iptables/commit/?id=d89dc47). So we're currently running iptables-restore -w5 --noflush --counters, and iptables-restore says "ok, so that's -w, so I'll wait for the lock, and -5, which I don't know what that means, but whatevs, and --noflush, so I won't flush the existing contents, and --counters, so I'll restore counter values. OK!". (So it's actually waiting for the lock forever, not just for 5 seconds.)
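The behaviour described in that comment can be mimicked with a toy parser. The Python sketch below is only a model of the explanation given here, not a port of the real `iptables-restore` option-parsing code: the function `parse_restore_args`, its `strict` switch, and the returned dictionary are all hypothetical. It assumes that `-w` takes its timeout from the following argument, and that an unknown short option is silently ignored in lenient mode and rejected in strict mode.

```python
def parse_restore_args(argv, strict):
    """Toy model of the option handling described in the comment above.

    strict=False mimics a build without the error-checking patch (unknown
    options are ignored); strict=True mimics a build with it (they fail).
    """
    opts = {"wait": None, "noflush": False, "counters": False, "ignored": []}
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "--noflush":
            opts["noflush"] = True
        elif arg == "--counters":
            opts["counters"] = True
        elif arg.startswith("-") and not arg.startswith("--"):
            # Walk a cluster of short options such as "-w5" one letter at a time.
            for ch in arg[1:]:
                if ch == "w":
                    # The timeout is only picked up from the NEXT argument.
                    if i + 1 < len(argv) and argv[i + 1].isdigit():
                        opts["wait"] = int(argv[i + 1])
                        i += 1  # consume the timeout argument
                    else:
                        opts["wait"] = "forever"
                elif strict:
                    raise ValueError("unknown option -" + ch)
                else:
                    opts["ignored"].append("-" + ch)
        elif strict:
            raise ValueError("unknown argument " + arg)
        else:
            opts["ignored"].append(arg)
        i += 1
    return opts


if __name__ == "__main__":
    print(parse_restore_args(["-w5", "--noflush", "--counters"], strict=False))
    print(parse_restore_args(["-w", "5", "--noflush", "--counters"], strict=True))
```

Under these assumptions, `-w5` in lenient mode ends up waiting with no timeout and drops the `-5`, the same input in strict mode raises an error, and `-w 5` yields a 5-second wait in both modes.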

So we probably should backport this to 3.7 as well in case RHEL iptables ever gets the error-checking fix. (The bug does not exist in 3.6 and earlier.)

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Contributor

Automatic merge from submit-queue (batch tested with PRs 18908, 18919).

@openshift-merge-robot openshift-merge-robot merged commit 9c581ab into openshift:master Mar 9, 2018
@danwinship danwinship deleted the fix-iptables-restore-wait branch March 9, 2018 21:27
@danwinship
Contributor Author

/cherrypick release-3.9

@openshift-cherrypick-robot

@danwinship: new pull request created: #18925

In response to this:

/cherrypick release-3.9

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
