Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ECS placement constraints/strategy #1059

Merged
merged 6 commits into from
Aug 10, 2017
Merged

ECS placement constraints/strategy #1059

merged 6 commits into from
Aug 10, 2017

Conversation

ejholmes
Copy link
Contributor

This adds support to the procfile for specifying ECS placement constraints and placement strategies.

I cherry-picked #941 into this, since it makes it easier, so may be better to only look at the second commit.

@ejholmes
Copy link
Contributor Author

ejholmes commented Mar 14, 2017

One thing worth mentioning is that, I can see cases where you might want these to be runtime configurable per environment. For example, you might have a CPU intensive task that you want to run on a c4.8xlarge in production, but then only run on a t2.micro in staging.

I think, in these cases, the best thing to do is setup "named profiles" using container instance attributes, that map to their production/staging equivalents. For example, in the scenario above, I might add a profile=cpu attribute to container instances, where in production it's backed by c4.8xlarge, but only t2.micro in staging. Then I can use that profile in the Procfile:

web:
  ecs:
    placement_constraints:
      - { type: memberOf, expression: "attribute:profile = cpu" }

Copy link
Contributor

@phobologic phobologic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I saw this last night that was going to be my first comment - it seems likely that we'd want to control placement with config, rather than with the code.

I think your method for dealing with that is probably a good one, especially since it keeps things simple.

Only suggestion here is to at least update the Procfile documentation (btw, I don't think we have a Procfile specific doc - if not, its probably getting complicated enough it deserves it's own page :)).

Could you also put your suggestion for managing environment specific placements in that doc?

default:
err = fmt.Errorf("%s is not supported", req.RequestType)
}
func (p *InstancePortsResource) Create(_ context.Context, req customresources.Request) (string, interface{}, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Man - does your linter not freak out at you for not having comments describing exported methods? It gets annoying - I'm wondering if I should just find a way to disable it, or if we should try and get better about listening to it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I disabled that a long time ago because you just end up with a bunch of Thing does thing doc comments to appease the linter.

@ejholmes
Copy link
Contributor Author

Good call. I'll update the procfile doc with information on that. The Procfile isn't documented in docs/ but there's a readme in ./procfile where all the options are documented, which the docs point to.

@ejholmes
Copy link
Contributor Author

Another thing to think about here is that, if you're including a bunch of different instance types in a single ECS cluster, you probably want services that don't specify any placement constraints to default to some constraints, so they don't go bouncing around a bunch of different instance types.

Some kind of first class "profile" support, that maps to some ECS placement constraints might be a better way to go initially. So, when booting Empire, you'd give it a number of named "profiles", which map to ECS placement constraints, and then you specify a default profile. For example:

{
  "default": {
    "placement_constraints": [
      { "type": "memberOf", "expression": "attributes:ecs.instance-type = m3.*" }
    ]
  },
  "memory": {
    "placement_constraints": [
      { "type": "memberOf", "expression": "attributes:ecs.instance-type = x1.*" }
    ]
  }
}

Users would be able to define what "profile" to use for a process in the Procfile:

worker:
  profile: memory

Or they could change this at runtime:

$ emp scale worker --profile memory -a <app> 

Or provide it via emp run:

$ emp run ./bin/job --profile memory -a <app>

Having a simple "named profile" provides users with a lot of simplicity, but still gives operators enough power to define as many profiles as needed, and document those within their org.

Food for thought 🤔

@ejholmes
Copy link
Contributor Author

ejholmes commented Aug 4, 2017

Alright, I'm feeling pretty good about this one now. I'll start doing some more testing, and giving it a whirl on our staging env.

Copy link
Contributor

@phobologic phobologic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good - it's a big change (seems like there was a bit of cleanup in this PR as well?).

return nil, fmt.Errorf("unable to unmarshal placement constraints: %v", err)
}
log.Println(fmt.Sprintf(" DefaultPlacementConstraints: %v", placementConstraints))
return twelvefactor.Transform(s, setDefaultPlacementConstraints(placementConstraints)), nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Man, so totally unrelated to this review in particular, but this finally helped me nail down why I really dislike the GoLang pattern of using 1-2 character variables. Take s for example here. In the review, I can't see what s is without expanding the lines above twice. If it had been named scheduler, I wouldn't even have to expand it, it would have made perfect sense in it's own context, which makes for easier code reviews. Python has it right ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, this overall method is getting a little long. Worth refactoring to break it up into smaller methods?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I can see how that could make reviewing difficult. I tend to just follow the language communities preference for things like this. In the case of Go: https://github.com/golang/go/wiki/CodeReviewComments#receiver-names.

In regards to breaking it up, there's not much within this that's re-usable, so breaking it up might make it hard to follow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't get their dislike of descriptive names. Since when did we run out of bytes for descriptive code?! :)

@@ -334,6 +335,12 @@ var EmpireFlags = []cli.Flag{
EnvVar: "EMPIRE_ECS_DOCKER_CERT_PATH",
},
cli.StringFlag{
Name: FlagECSPlacementConstraintsDefault,
Value: "",
Usage: "ECS placement constraints to set when a process does not set any.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you include a "If this is not set, then this will happen" bit here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I'll expand on this before it gets merged.

proc := Process{
Quantity: 1,
}
var proc Process
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question about this method - it's getting a little long, worth breaking it up?

}); err != nil {
return fmt.Errorf("error pulling image: %v", err)
}
for _, p := range app.Processes {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a pretty big change - there will only ever be a single process, right? Is this just short circuiting needing the p since that's already represented as an array in app.Processes and it only has a single value when Run is called?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda addressed this below, but the overall change is basically a noop, since Empire core only ever passes 1 process, so the end result is the same. It does mean that Empire core could pass multiple processes to run them at once, but primarily this was done just to simplify the method signature since the manifest already contains all the process information.

@ejholmes
Copy link
Contributor Author

ejholmes commented Aug 9, 2017

@phobologic I just rebased this into smaller logical commits if it makes reviewing a little easier. The actual change to implement ECS placement constraints and strategies is in the last commit, which is relatively small. Commits 1-4 probably don't need too much attention, since they're mostly internal changes to make the last commit easier.

  • The first commit is just Move ECSService and InstancePort resources to higher level interface. #941, which has already been approved/tested, so can probably be ignored.
  • The second commit should probably get some attention. It simplifies the scheduler.Run method to only take a Manifest instead of both a Manifest and a Process, since the Manifest already contains the list of processes to run. It also fixes a "bug" so that emp run takes stored process configuration (e.g. size constraints) into account when running named procs.
  • Commits 3 just cleans up some legacy issues with scheduler.Restart. I probably shouldn't include it in this PR.

I'll follow up on the rest of the review comments once we've had a chance to test this in our staging env.

Copy link
Contributor

@phobologic phobologic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked at commit 5 and yeah, looks good.

"Expression": c.Expression,
})
}
serviceProperties["PlacementConstraints"] = placementConstraints
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should actually be added to the task definition, rather than the service. Reason being, this isn't updatable on a service, which would require a replacement of the service. If we do it on the task definition, then no replacement required. Placement strategies can only be applied to the service, so that would need to remain.

…visioner.

Previously, these resources managed the logic for performing resource
replacement themselves. After this change, they are based on the
`privisioner` type, which makes it simpler to determine what requires a
replacement by using a hash.
This commit simplifies the `twelvefactor.Scheduler.Run` interface method
to only take a `twelvefactor.Manifest` instead of both a
`twelvefactor.Manifest` and `twelvefactor.Process`. The manifest
contains the processes that should be run.

The commit also changes the `emp run` functionality, so that when a
"named" command is run, it will use the stored process configuration
(e.g. constraints) instead of the defaults.
This commit simplifies the `twelvefactor.Scheduler.Restart` method to
only pass the App ID, instead of a full manifest. The existing
implementation required the full manifest for the legacy ECS scheduler,
which is no longer required in the CloudFormation world.
This commit adds a `twelvefactor.Transform` method, which returns a
wrapped `twelvefactor.Scheduler` that will transform the
`twelvefactor.Manifest` before passing it to the downstream scheduler.

This can be used to, for example, add default placement constraints to
processes that don't define any.
This commit adds support for specifying ECS placement constraints and
placement strategies in the extended Procfile format.
@ejholmes
Copy link
Contributor Author

This is working pretty nicely in our staging environment. I'll go ahead and pull this in and fix any issues the crop up before the next release.

@ejholmes ejholmes merged commit 914b5dc into master Aug 10, 2017
@ejholmes ejholmes deleted the placement branch August 10, 2017 02:05
@ejholmes ejholmes mentioned this pull request Sep 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants