Multiple boundary services targets per VPC #374

Closed
rcgoodfellow opened this issue Jun 6, 2023 · 1 comment · Fixed by #381

@rcgoodfellow (Contributor)

Currently, a VpcCfg has a single BoundaryServices member, which in turn holds a single target IP address used as a tunnel endpoint (TEP) for sending packets to upstream networks.
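For illustration, a minimal sketch of that shape, using field and type names that mirror the description above rather than the actual OPTE definitions:

```rust
use std::net::Ipv6Addr;

/// Boundary services configuration as described above: a single
/// tunnel endpoint. (Names here are illustrative, not the real
/// OPTE types.)
struct BoundaryServices {
    /// The lone underlay address used as the TEP for all traffic
    /// leaving the rack toward upstream networks.
    ip: Ipv6Addr,
}

/// Per-VPC configuration holding exactly one boundary services target.
struct VpcCfg {
    boundary_services: BoundaryServices,
    // ... other per-VPC state elided ...
}
```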

This works for a single-switch environment. However, the rack is multi-switch, and OPTE needs to be explicitly aware of that. Unlike sled-to-sled communication, which is constrained to the rack underlay network, we do not control the physical paths packets take beyond the boundary services TEP. This means we need to maintain overlay path affinity on the underlying physical network (there is no end-to-end ddm protocol ensuring packet-level balancing properties that preserve ordering across flows), and the only component currently in a position to do that is OPTE.

OPTE must be aware that there are multiple boundary services TEPs and assign each flow to a particular TEP. OPTE must also respond to signals about TEP availability. For example, if a sidecar goes down due to a failure or a maintenance event, OPTE must migrate flows off that TEP onto one that is still available. Presumably these signals will come from the control plane; I'll post a corresponding issue in Omicron about that.
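As a rough sketch of that behavior, here is a hypothetical flow-to-TEP table with affinity-preserving assignment and flow migration on a TEP-down signal; none of these names (`TepTable`, `assign`, `tep_down`) are actual OPTE APIs:

```rust
use std::collections::HashMap;
use std::net::Ipv6Addr;

/// Identifies a flow for affinity purposes (e.g. a 5-tuple hash).
type FlowId = u64;

/// Hypothetical table mapping flows to boundary services TEPs.
struct TepTable {
    /// The currently-available boundary services TEPs.
    teps: Vec<Ipv6Addr>,
    /// Flows pinned to a TEP so every packet in a flow takes the
    /// same path through boundary services.
    affinity: HashMap<FlowId, Ipv6Addr>,
}

impl TepTable {
    /// Pick a TEP for a flow, preserving affinity for known flows.
    fn assign(&mut self, flow: FlowId) -> Option<Ipv6Addr> {
        if let Some(&tep) = self.affinity.get(&flow) {
            return Some(tep);
        }
        // New flow: hash it onto one of the available TEPs.
        let tep = *self.teps.get(flow as usize % self.teps.len().max(1))?;
        self.affinity.insert(flow, tep);
        Some(tep)
    }

    /// Control-plane signal: a TEP went away (failure or maintenance).
    /// Migrate its flows onto the TEPs that remain.
    fn tep_down(&mut self, down: Ipv6Addr) {
        self.teps.retain(|&t| t != down);
        let moved: Vec<FlowId> = self
            .affinity
            .iter()
            .filter_map(|(&f, &t)| (t == down).then_some(f))
            .collect();
        for f in moved {
            self.affinity.remove(&f);
            let _ = self.assign(f); // re-pin to a surviving TEP
        }
    }
}
```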

@rcgoodfellow rcgoodfellow added this to the FCS milestone Jun 6, 2023
@rcgoodfellow rcgoodfellow changed the title from "Multiple boudnary services targets per VPC" to "Multiple boundary services targets per VPC" Jun 6, 2023
@rcgoodfellow (Contributor, Author)

I gave this a bit more thought this evening. We may not need to track multiple boundary services addresses in OPTE.

If we use an anycast address for boundary services, then we get route availability tracking for free from ddmd. If an upstream sidecar goes away, so does its ddm peer, and the local ddmd instance running on the sled will remove the routes going through that peer when it expires. Because xde now chooses nexthops based on the host routing tables, this will all just work automatically.

We still have the overlay/underlay path affinity issue to deal with in OPTE. But with an anycast-based approach, it becomes a question of route/nexthop affinity rather than boundary services tunnel endpoint address affinity, which should be a much simpler mechanism to introduce.
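A sketch of what nexthop affinity could look like under that approach, assuming the caller passes in the ECMP candidate nexthops that the host routing table (populated by ddmd) currently holds for the anycast TEP; `NexthopAffinity` is hypothetical, not a real OPTE or xde interface:

```rust
use std::collections::HashMap;
use std::net::Ipv6Addr;

type FlowId = u64;

/// Hypothetical per-flow nexthop pinning for a single anycast TEP.
struct NexthopAffinity {
    /// Flow -> underlay nexthop chosen when the flow began.
    pinned: HashMap<FlowId, Ipv6Addr>,
}

impl NexthopAffinity {
    /// `candidates` are the ECMP nexthops the host routing table
    /// currently offers for the anycast boundary services address.
    fn nexthop(&mut self, flow: FlowId, candidates: &[Ipv6Addr]) -> Option<Ipv6Addr> {
        if let Some(&nh) = self.pinned.get(&flow) {
            // Keep the pin only while the route is still present; if
            // ddmd withdrew it, fall through and re-pin below.
            if candidates.contains(&nh) {
                return Some(nh);
            }
        }
        let nh = *candidates.get(flow as usize % candidates.len().max(1))?;
        self.pinned.insert(flow, nh);
        Some(nh)
    }
}
```

Availability tracking itself stays out of OPTE in this sketch: when ddmd withdraws a route, the candidate set shrinks and affected flows are simply re-pinned on their next packet.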

@rcgoodfellow rcgoodfellow self-assigned this Jun 28, 2023
@askfongjojo askfongjojo modified the milestones: FCS, 1.0.2 Aug 10, 2023
@askfongjojo askfongjojo modified the milestones: 1.0.2, 3 Sep 14, 2023
@morlandi7 morlandi7 modified the milestones: 3, 4 Oct 5, 2023
@morlandi7 morlandi7 modified the milestones: 4, 5 Nov 14, 2023
@morlandi7 morlandi7 modified the milestones: 5, 6 Nov 30, 2023
@rcgoodfellow rcgoodfellow linked a pull request Jan 18, 2024 that will close this issue