
[adams2019] Add caching to autoscheduler #5697

Merged
merged 41 commits into master from rootjalex/add_autosched_caching
Apr 20, 2021

Conversation

rootjalex
Member

@rootjalex rootjalex commented Feb 3, 2021

Adds caching of features and schedule enumerations to the adams2019 autoscheduler. Backported from @aekul's autoscheduler work.

On my machine, I see roughly a 2x speedup in the time needed to autoschedule lens blur and local laplacian, about a 1.5x speedup for resnet50, and anywhere between 1x and 1.5x for other pipelines (caching is mainly useful for larger pipelines; it has little to no effect on smaller ones).

Originally, caching was disabled by default and enabled by setting these parameters:
HL_USE_MEMOIZED_FEATURES=1
HL_MEMOIZE_BLOCKS=1

Additionally, to test feature caching, setting the following value enables feature-caching verification (this will be quite slow):
HL_VERIFY_MEMOIZED_FEATURES=1

Caching of schedule enumerations (blocks) and features is now enabled by default. To disable either, set the corresponding environment variable:

HL_DISABLE_MEMOIZED_BLOCKS=1
HL_DISABLE_MEMOIZED_FEATURES=1
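
These are plain environment variables. As an illustrative sketch only (the helper name is hypothetical, not the actual Halide implementation), checking a disable flag could look roughly like this:

#include <cstdlib>
#include <string>

// Sketch: returns true unless the named disable flag is set to "1".
// The real autoscheduler reads these variables through its own helpers.
bool caching_enabled(const char *disable_flag) {
    const char *v = std::getenv(disable_flag);
    return !(v && std::string(v) == "1");
}

// Usage sketch:
//   bool cache_blocks   = caching_enabled("HL_DISABLE_MEMOIZED_BLOCKS");
//   bool cache_features = caching_enabled("HL_DISABLE_MEMOIZED_FEATURES");
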

Tests were also added to verify these caching methods (note that there will likely be no speedup on these tests, as the pipelines are not large enough).

This PR was originally #5654 before it was split into two.

@rootjalex rootjalex linked an issue Feb 3, 2021 that may be closed by this pull request
@rootjalex rootjalex requested a review from aekul February 3, 2021 00:14
@steven-johnson
Contributor

See failure in https://buildbot.halide-lang.org/master/#/builders/73/builds/60

@steven-johnson
Contributor

Are the environment vars controlling this (HL_USE_MEMOIZED_FEATURES=1, HL_MEMOIZE_BLOCKS=1, etc.) meant to be a long-term API, or just a short-term expedient?

@rootjalex
Member Author

@steven-johnson Short-term; input on ideas for other APIs would be great. I'm also not set on having caching off by default; it could just as easily be turned on by default.

@steven-johnson
Contributor

I have no opinion on on vs off by default.

I am concerned about how heavily our autoschedulers rely on env vars as a de facto 'API' for controlling a lot of things. I don't have an alternative suggestion at this time, but I'd love to eventually have an approach that doesn't require setting what are (effectively) mutable globals to do this sort of thing.

@rootjalex rootjalex added the autoscheduler Related to one or more of the Autoschedulers label Feb 3, 2021
@rootjalex
Member Author

Apologies for how delayed I am in updating this - the start-of-the-semester craziness hit pretty hard.

@abadams Please let me know if the comments added to Cache.h and Autoschedule.cpp are or are not sufficient

@rootjalex
Member Author

The failure seems to be an unrelated build failure.

If cache_features is enabled (i.e. HL_DISABLE_MEMOIZED_FEATURES!=1) then this function caches
the featurizations of its children, and if called again, reuses those cached featurizations.
The features are saved in a LoopNest's member, std::map<> feature_cache. Some features do not
persist, and the FeaturesIntermediates strucct (see Featurization.h) is used to cache useful
Member
strucct

Member Author
Fixed in dccd642
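
The quoted comment above describes per-LoopNest feature memoization. As a rough sketch only, with hypothetical names (the real types live in Cache.h and Featurization.h), the idea is a map stored on each loop nest that returns a previously computed featurization instead of recomputing it:

#include <cstdint>
#include <map>

// Hypothetical stand-in for the per-stage feature values in Featurization.h.
struct ScheduleFeaturesSketch {};

struct LoopNestSketch {
    // Cache of previously computed featurizations, keyed by a hash that
    // identifies the child being featurized (the real key differs).
    std::map<std::uint64_t, ScheduleFeaturesSketch> feature_cache;

    // Placeholder for the expensive featurization pass.
    ScheduleFeaturesSketch compute_features_uncached(std::uint64_t) {
        return ScheduleFeaturesSketch{};
    }

    ScheduleFeaturesSketch featurize_child(std::uint64_t key) {
        auto it = feature_cache.find(key);
        if (it != feature_cache.end()) {
            return it->second;              // reuse the cached featurization
        }
        ScheduleFeaturesSketch f = compute_features_uncached(key);
        feature_cache.emplace(key, f);      // cache for later calls
        return f;
    }
};
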

@@ -13,6 +13,45 @@ namespace Halide {
namespace Internal {
namespace Autoscheduler {

Member
This needs a big-picture overview comment as well as the list of changes below. E.g. there seem to be two types of caching: feature caching and block caching, but I'm still confused about what block caching is from reading the text below. Say what kinds of caching exist, what values are cached, what the key is, and why this saves work.

Member Author
Hopefully addressed by dccd642? Let me know what you think.

@alexreinking alexreinking modified the milestones: v12.0.0, v13.0.0 Apr 16, 2021
@abadams
Member

abadams commented Apr 19, 2021

@rootjalex It would be good to get this in for the 12.0 release. It just needs a few more comment tweaks, I think.

@abadams abadams modified the milestones: v13.0.0, v12.0.0 Apr 19, 2021
@rootjalex
Member Author

@abadams I think generally the tilings are faster to save than to re-generate. The speedup that Luke sees on the GPU autoscheduler is much more than we see here though, which I assume is because he's generating more tiling options.

@abadams
Member

abadams commented Apr 19, 2021

The thing I'm still confused about is what is being cached in the blocks case. Reading the code it looks like it's the set of child LoopNest nodes for things scheduled compute_root. So what you're saving is loop nest construction time. Is that correct? The comment made me think it was just saving the tile sizes, which didn't sound useful.

@rootjalex
Member Author

Those LoopNests are the ones generated from the tiling options; I think it's a combination of saving LoopNest construction as well as tiling generation.

@rootjalex
Member Author

I just updated the comment; hopefully it makes the cache description clearer?

@rootjalex rootjalex requested a review from aekul April 20, 2021 02:12
@abadams abadams requested review from aekul and removed request for aekul April 20, 2021 02:16
@aekul
Contributor

aekul commented Apr 20, 2021

In the blocks case, it's saving all the compute_root level loop nests. Most importantly, this includes their featurizations, which is the main motivation. Maybe update the comments to clarify this.
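
A rough sketch of that block-caching idea, with hypothetical names (the real implementation is in Cache.h/Cache.cpp): the compute_root-level loop nests generated for a given enumeration state, featurizations included, are stored under a key describing that state, so revisiting an equivalent state skips re-enumerating tilings and rebuilding the nests:

#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real LoopNest type (which also carries its features).
struct LoopNestSketch {};
using LoopNestPtr = std::shared_ptr<const LoopNestSketch>;

// Sketch of a block cache: the key identifies the enumeration state, and the value
// is the set of compute_root-level child loop nests generated for that state.
struct BlockCacheSketch {
    std::map<std::uint64_t, std::vector<LoopNestPtr>> memoized_blocks;

    bool lookup(std::uint64_t key, std::vector<LoopNestPtr> &out) const {
        auto it = memoized_blocks.find(key);
        if (it == memoized_blocks.end()) {
            return false;                   // state not seen before; enumerate as usual
        }
        out = it->second;                   // reuse the cached loop nests and features
        return true;
    }

    void store(std::uint64_t key, std::vector<LoopNestPtr> blocks) {
        memoized_blocks.emplace(key, std::move(blocks));
    }
};
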

Contributor

@aekul aekul left a comment


The changes I requested look good.

@rootjalex
Member Author

I think the tests that are failing and hanging are due to the issue in #5925. Merging per @abadams.

@rootjalex rootjalex merged commit c1de142 into master Apr 20, 2021
@rootjalex rootjalex deleted the rootjalex/add_autosched_caching branch April 20, 2021 21:21
frengels pushed a commit to frengels/Halide that referenced this pull request Apr 30, 2021
* add feature caching and block caching to adams2019 autoscheduler

* added caching verification for features

* add caching docstrings
Labels
autoscheduler Related to one or more of the Autoschedulers
Development

Successfully merging this pull request may close these issues.

Caching schedule enumeration in the Adams2019 autoscheduler
5 participants