
Make execution planner smarter to enable running more operators in-place #98

Open
robertknight opened this issue Apr 16, 2024 · 0 comments
Labels: performance (Issues that affect model inference or loading performance)

Comments

robertknight (Owner) commented:

The current execution planner does a simple depth-first traversal of the graph, starting backwards from the requested outputs and visiting each operator's inputs in left-to-right order. It doesn't consider how the scheduling order affects whether operators can be run in-place or not. This can lead to copying tensors unnecessarily.

For example, in the graph segment below, taken from the Magicka model, the current planner runs the right branch before the left branch. This is because the output of the right branch forms the first input to the Tensordot node at the bottom, whereas the output of the left branch is the second input. This is suboptimal because it means the Reshape operator in the right branch is not run in-place, as its input will be needed later by the Shape operator in the left branch. If the left branch were run first, the Reshape could be run in-place.

[Image: graph segment from the Magicka model showing the current execution order]
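
To make the current behaviour concrete, here is a minimal sketch of a planner that does the depth-first, left-to-right traversal described above. The `OpNode` type and `plan` function are simplified placeholders for illustration, not rten's actual planner API.

```rust
use std::collections::HashSet;

/// Simplified operator node. `inputs` holds the indices of the operators
/// whose outputs feed this one, in left-to-right order.
struct OpNode {
    inputs: Vec<usize>,
}

/// Depth-first traversal backwards from the requested outputs. An operator
/// is appended to the plan after all of its inputs have been planned, so the
/// result is a valid topological order.
fn plan(graph: &[OpNode], outputs: &[usize]) -> Vec<usize> {
    fn visit(graph: &[OpNode], op: usize, visited: &mut HashSet<usize>, order: &mut Vec<usize>) {
        if !visited.insert(op) {
            return;
        }
        // Inputs are visited left-to-right, so the subgraph producing the
        // first input is scheduled before the subgraphs producing the later
        // inputs. Its output may therefore still be needed by operators that
        // run afterwards, preventing in-place reuse of that buffer.
        for &input in &graph[op].inputs {
            visit(graph, input, visited, order);
        }
        order.push(op);
    }

    let mut order = Vec::new();
    let mut visited = HashSet::new();
    for &output in outputs {
        visit(graph, output, &mut visited, &mut order);
    }
    order
}
```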
robertknight added the performance label on Apr 20, 2024
robertknight added a commit that referenced this issue Apr 27, 2024
To improve the chances of being able to run an operator in-place, visit inputs
in right-to-left order during planning. The generated plan then computes an
operator's inputs in right-to-left order, which means that the left-most input
is more likely to be available for mutation (i.e. not still needed by other
operators) at the point when the operator is run.

See #98 for an example.

I haven't proven that this change picks the execution order that maximizes the
number of operations run in-place, so consider it a heuristic.
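
Expressed against the simplified `plan` sketch from the comment above (again an illustration, not the actual commit diff), the change amounts to reversing the iteration order over an operator's inputs:

```rust
fn visit(graph: &[OpNode], op: usize, visited: &mut HashSet<usize>, order: &mut Vec<usize>) {
    if !visited.insert(op) {
        return;
    }
    // Visit inputs right-to-left. The left-most input is then planned last,
    // so it is produced just before the consuming operator and is less likely
    // to be needed by other not-yet-run operators, making its buffer a better
    // candidate for in-place reuse.
    for &input in graph[op].inputs.iter().rev() {
        visit(graph, input, visited, order);
    }
    order.push(op);
}
```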