Skip to content

Commit

Permalink
Merge branch 'main' into darc-main-a0aec00d-d482-4e39-96c7-5aacd70aca5b
Browse files Browse the repository at this point in the history
  • Loading branch information
lewing committed Apr 19, 2023
2 parents 95f80da + d634ea0 commit 377228d
Show file tree
Hide file tree
Showing 6 changed files with 100 additions and 55 deletions.
6 changes: 5 additions & 1 deletion documentation/ProjectReference-Protocol.md
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ When `Clean`ing the output of a project, `CleanReferencedProjects` ensures that

## Targets required to be referenceable

These targets should exist in a project to be compatible with the common targets' `ProjectReference`. Some are called only conditionally.
These targets should exist in a project to be compatible with the common targets' `ProjectReference` (unless [marked with the `SkipNonexistentTargets='true'` metadatum](#targets-marked-with-skipnonexistenttargetstrue-metadatum)). Some are called only conditionally.

These targets are all defined in `Microsoft.Common.targets` and are defined in Microsoft SDKs. You should only have to implement them yourself if you require custom behavior or are authoring a project that doesn't import the common targets.

Expand Down Expand Up @@ -85,6 +85,10 @@ If implementing a project with an “outer” (determine what properties to pass
* As of 15.7, this is _optional_. If a project does not contain a `GetCopyToOutputDirectoryItems` target, projects that reference it will not copy any of its outputs to their own output folders, but the build can succeed.
* `Clean` should delete all outputs of the project.
* It is not called during a normal build, only during "Clean" and "Rebuild".

### Targets Marked With `SkipNonexistentTargets='true'` Metadatum
`GetTargetFrameworks` and `GetTargetFrameworksWithPlatformForSingleTargetFramework` are skippable if nonexistent since some project types (for example, `wixproj` projects) may not define them. See [this comment](https://github.com/dotnet/msbuild/blob/cc55017f88688cbe3f9aa810cdf44273adea76ea/src/Tasks/Microsoft.Managed.After.targets#L74-L77) for more details.

## Other protocol requirements

As with all MSBuild logic, targets can be added to do other work with `ProjectReference`s.
Expand Down
51 changes: 20 additions & 31 deletions documentation/specs/single-project-isolated-builds.md
Original file line number Diff line number Diff line change
@@ -1,54 +1,43 @@
# Single project isolated builds: implementation details
# Single Project Isolated Builds: Implementation Details

<!-- workflow -->
Single project isolated builds can be achieved by providing MSBuild with input and output cache files.

The input cache files contain the cached results of all the targets that a project calls on its references. When a project builds without isolation, it builds its references via [MSBuild task](aka.ms/msbuild_tasks) calls. In isolated builds, the engine, instead of executing these tasks, serves their results from the provided input caches. In an isolated project build, only the top level project (built via the BuildManager APIs) should build targets. Any referenced projects by the top level project should be provided from the input caches.
The input cache files contain the cached `TargetResult`s of all targets that a project calls on its references. When a project builds without isolation, it builds its references via [MSBuild task](aka.ms/msbuild_tasks) calls. In isolated builds, the engine, instead of executing these tasks, serves their results from the provided input caches. In an isolated project build, only the top level project (built via the `BuildManager` APIs) should build targets; Any referenced projects by the top level project should be provided from the input caches.

The output cache file tells MSBuild where to serialize the results of building the current project. This output cache becomes an input cache for all other projects that depend on the current project.
The output cache file can be omitted in which case the build reuses prior results but does not write out any new results. This is useful when one wants to re-execute the build for a project without building its references.
The output cache file tells MSBuild where to serialize the `TargetResult`s for a project's built targets and becomes an input cache for dependent projects.

The presence of either input or output caches turns on [isolated build constraints](static-graph.md##single-project-isolated-builds).

## Input / Output cache implementation
## Input / Output Cache Implementation
<!-- cache structure -->
The cache files contain the serialized state of MSBuild's [ConfigCache](https://github.com/dotnet/msbuild/blob/main/src/Build/BackEnd/Components/Caching/ConfigCache.cs) and [ResultsCache](https://github.com/dotnet/msbuild/blob/main/src/Build/BackEnd/Components/Caching/ResultsCache.cs). These two caches have been traditionally used by the engine to cache build results. For example, it is these caches which ensure that a target is only built once per build submission. The `ConfigCache` entries are instances of [BuildRequestConfiguration](https://github.com/dotnet/msbuild/blob/37c5a9fec416b403212a63f95f15b03dbd5e8b5d/src/Build/BackEnd/Shared/BuildRequestConfiguration.cs#L25). The `ResultsCache` entries are instances of [BuildResult](https://github.com/dotnet/msbuild/blob/37c5a9fec416b403212a63f95f15b03dbd5e8b5d/src/Build/BackEnd/Shared/BuildResult.cs#L34), which contain or more instances of [TargetResult](https://github.com/dotnet/msbuild/blob/37c5a9fec416b403212a63f95f15b03dbd5e8b5d/src/Build/BackEnd/Shared/TargetResult.cs#L22).

One can view the two caches as the following mapping: `(project path, global properties) -> results`. `(project path, global properties)` is represented by a `BuildRequestConfiguration`, and the results are represented by `BuildResult` and `TargetResult`.
The cache files contain the serialized state of MSBuild's [`ConfigCache`](https://github.com/dotnet/msbuild/blob/main/src/Build/BackEnd/Components/Caching/ConfigCache.cs) and [`ResultsCache`](https://github.com/dotnet/msbuild/blob/main/src/Build/BackEnd/Components/Caching/ResultsCache.cs), which have been traditionally used by the engine to cache build results. They ensure that a target is only built once per build submission. `ConfigCache` entries are instances of [`BuildRequestConfiguration`](https://github.com/dotnet/msbuild/blob/37c5a9fec416b403212a63f95f15b03dbd5e8b5d/src/Build/BackEnd/Shared/BuildRequestConfiguration.cs#L25)s (a `(project path, global properties)` tuple), and `ResultsCache` entries are instances of [`BuildResult`](https://github.com/dotnet/msbuild/blob/37c5a9fec416b403212a63f95f15b03dbd5e8b5d/src/Build/BackEnd/Shared/BuildResult.cs#L34)s, which contain [`TargetResult`](https://github.com/dotnet/msbuild/blob/37c5a9fec416b403212a63f95f15b03dbd5e8b5d/src/Build/BackEnd/Shared/TargetResult.cs#L22)s. The `ConfigCache` entries and `ResultsCache` entries form a [bijection](https://en.wikipedia.org/wiki/Bijection).

<!-- cache lifetime -->
The input and output cache files have the same lifetime as the `ConfigCache` and the `ResultsCache`. The `ConfigCache` and the `ResultsCache` are owned by the [BuildManager](https://github.com/dotnet/msbuild/blob/main/src/Build/BackEnd/BuildManager/BuildManager.cs), and their lifetimes are one `BuildManager.BeginBuild` / `BuildManager.EndBuild` session. On commandline builds, since MSBuild.exe uses one BuildManager with one BeginBuild / EndBuild session, the cache lifetime is the same as the entire process lifetime. When other processes (e.g. Visual Studio's devenv.exe) perform msbuild builds via the `BuildManager` APIs, there can be multiple build sessions in the same process.
In a build, the input and output cache files have the same lifetime as the `ConfigCache` and `ResultsCache`. The `ConfigCache` and `ResultsCache` are owned by the [`BuildManager`](https://github.com/dotnet/msbuild/blob/main/src/Build/BackEnd/BuildManager/BuildManager.cs), and their lifetimes are one `BuildManager.BeginBuild` / `BuildManager.EndBuild` session. On command-line builds, the cache lifetime is the same as the entire process lifetime since `MSBuild.exe` uses one `BuildManager` with one `BeginBuild` / `EndBuild` session. When other processes (e.g. Visual Studio's `devenv.exe`) perform MSBuild builds via the `BuildManager` APIs, there can be multiple build sessions in the same process.

<!-- constraints -->

When MSBuild is loading input cache files, it has to merge multiple incoming instances of `ConfigCache` and `ResultsCache` into one instance of each. The [CacheAggregator](https://github.com/dotnet/msbuild/blob/37c5a9fec416b403212a63f95f15b03dbd5e8b5d/src/Build/BackEnd/BuildManager/CacheAggregator.cs#L13) is responsible for stitching together the pairs of deserialized `ConfigCache`/`ResultsCache` entries from each input cache file.
The following constraints are enforced during cache aggregation:
- For each input cache, `ConfigCache.Entries.Size == ResultsCache.Entries.Size`
- For each input cache, there is exactly one mapping from ConfigCache to ResultsCache (on `BuildResult.ConfigurationId` == `BuildRequestConfiguration.ConfigurationId`)
- Colliding configurations (defined as tuples of `(project path, global properties)`) get their corresponding BuildResult entries merged at the level of TargetResult entries. TargetResult conflicts are handled via the "first one wins" strategy. This is in line with vanilla msbuild's behaviour where a target tuple of `(project path, global properties, target)` gets executed only once.
When loading input cache files, MSBuild merges incoming instances of `ConfigCache`s and `ResultsCache`s into one instance of each with the help of the [`CacheAggregator`](https://github.com/dotnet/msbuild/blob/51df47643a8ee2715ac67fab8d652b25be070cd2/src/Build/BackEnd/BuildManager/CacheAggregator.cs#L15), which enforces the following constraints:
- No duplicate cache entries
- Bijection:
- `ConfigCache.Entries.Size == ResultsCache.Entries.Size`
- `BuildResult.ConfigurationId` == `BuildRequestConfiguration.ConfigurationId`

The output cache file **only contains results for additional work performed in the current BeginBuild / EndBuild session**. Entries from input caches are not transferred to the output cache.
Note that the output cache file contains a single `BuildResult` with the `TargetResult`s from the project specified to be built in the `BeginBuild` / `EndBuild` session, as any `BuildResult`s obtained through isolation exemption are excluded to prevent potential duplicate input cache entries; Entries from input caches are not transferred to the output cache.

<!-- How input / output cache entries are separated with the override caches -->
Entries that make it into the output cache file are separated from entries serialized from input cache files via the use of [ConfigCacheWithOverride](https://github.com/dotnet/msbuild/blob/main/src/Build/BackEnd/Components/Caching/ConfigCacheWithOverride.cs) and [ResultsCacheWithOverride](https://github.com/dotnet/msbuild/blob/main/src/Build/BackEnd/Components/Caching/ResultsCacheWithOverride.cs). These are composite caches. Each contains two underlying caches: a cache where input caches files are loaded into (called the override cache), and a cache where new results are written into (called the current cache). Cache reads are satisified from both underlying caches (override cache is queried first, current cache is queried second). Writes are only written to the current cache, never into the override cache. The output cache file only contains the serialized current cache, and not the override cache, thus ensuring that only newly built results are serialized in the output cache file. It is illegal for both the current cache and override cache to contain entries for the same project configuration, a constraint that is checked by the two override caches on each cache read.
Input cache entries are separated from output cache entries with the composite caches [`ConfigCacheWithOverride`](https://github.com/dotnet/msbuild/blob/main/src/Build/BackEnd/Components/Caching/ConfigCacheWithOverride.cs) and [`ResultsCacheWithOverride`](https://github.com/dotnet/msbuild/blob/main/src/Build/BackEnd/Components/Caching/ResultsCacheWithOverride.cs). Each composite cache contains two underlying caches: a cache where input caches files are loaded into (the override cache), and a cache where new results are written into (the current cache).* In the `ConfigCacheWithOverride`, these caches are instances of `ConfigCache`s and, in the `ResultsCacheWithOverride`, these caches are instances of `ResultsCache`s. A query for a cache entry is first attempted from the override cache and, if unsatisfied, a second attempt is made from the current cache. Writes are only written to the current cache, never into the override cache.* It is illegal for both the current cache and override cache to contain entries for the same project configuration, a constraint that is checked by the two override caches on each cache query.

## Isolation implementation
## Isolation Implementation

[Isolation constraints](static-graph.md##single-project-isolated-builds) are implemented in the Scheduler and the TaskBuilder. [TaskBuilder.ExecuteInstantiatedTask](https://github.com/dotnet/msbuild/blob/37c5a9fec416b403212a63f95f15b03dbd5e8b5d/src/Build/BackEnd/Components/RequestBuilder/TaskBuilder.cs#L743) ensures that the `MSBuild` task is only called on projects declared in `ProjectReference`. [Scheduler.CheckIfCacheMissOnReferencedProjectIsAllowedAndErrorIfNot](https://github.com/dotnet/msbuild/blob/37c5a9fec416b403212a63f95f15b03dbd5e8b5d/src/Build/BackEnd/Components/Scheduler/Scheduler.cs#L1818) ensures that all `MSBuild` tasks are cache hits.
[Isolation constraints](static-graph.md##single-project-isolated-builds) are implemented in the `Scheduler` and `TaskBuilder`. [`TaskBuilder.ExecuteInstantiatedTask`](https://github.com/dotnet/msbuild/blob/37c5a9fec416b403212a63f95f15b03dbd5e8b5d/src/Build/BackEnd/Components/RequestBuilder/TaskBuilder.cs#L743) ensures that the `MSBuild` task is only called on projects declared in a `ProjectReference` item. [`Scheduler.CheckIfCacheMissOnReferencedProjectIsAllowedAndErrorIfNot`](https://github.com/dotnet/msbuild/blob/37c5a9fec416b403212a63f95f15b03dbd5e8b5d/src/Build/BackEnd/Components/Scheduler/Scheduler.cs#L1818) ensures that all `MSBuild` tasks are cache hits.

### How isolation exemption complicates everything
<!-- Potential cache scenarios caused by exemption -->
Project references [can be exempt](static-graph.md#exempting-references-from-isolation-constraints) from isolation constraints via the `GraphIsolationExemptReference` item.
### Isolation Exemption
The `Scheduler` [skips isolation constraints](static-graph.md#exempting-references-from-isolation-constraints) on project references via the:

The `Scheduler` knows to skip isolation constraints on an exempt `BuildRequest` because the [ProjectBuilder compares](https://github.com/dotnet/msbuild/blob/37c5a9fec416b403212a63f95f15b03dbd5e8b5d/src/Build/BackEnd/Components/RequestBuilder/RequestBuilder.cs#L349) each new `BuildRequest` against the `GraphIsolationExemptReference` items defined in the calling project, and if exempt, sets `BuildRequest.SkipStaticGraphIsolationConstraints`. When a `BuildRequest` is marked as exempt, the `Scheduler` also marks its corresponding `BuildRequestConfiguration` as exempt as well, which aids in further identification of exempt projects outside the `Scheduler`.
* `GraphIsolationExemptReference` item. The `RequestBuilder` sets the `SkipStaticGraphIsolationConstraints` property of a `BuildRequest` to `true` if the `RequestBuilder` matches it against a `GraphIsolationExemptReference` item defined in the calling project. Additionally, the `RequestBuilder` marks the `BuildRequest`'s corresponding `BuildRequestConfiguration` as exempt to allow the `TaskBuilder` to verify exemption from isolation constraints.

The build results for the exempt project are also included in the current cache (according to the above mentioned rule that newly built results are serialized in the output cache file). This complicates the way caches interact in several scenarios:
1. the same project can be exempt via multiple references, thus potentially colliding when multiple output cache files containing the same exempt project get aggregated. For example, given the graph `{A->B, A->C}`, where both `B` and `C` build the exempt project `D`, `D` will appear in the output caches of both `B` and `C`. When `A` aggregates these two caches, it will encounter duplicate entries for `D`, and will need to merge the results.
2. a project can be both exempt and present in the graph at the same time. For example, given the graph `{A->B}`, both `A` and `B` are in the graph, but `A` can also mark `B` as exempt (meaning that `A` contains both a `ProjectReference` item to `B`, and a `GraphIsolationExemptReference` item to `B`). The fact that `B` is in the graph means that `A` will receive an input cache containing B's build results. There are two subcases here:
1. `A` builds targets from `B` that already exist in the input cache file from `B`. In this case, all the builds of `B` will be cache hits, and no target results from `B` will make it into `A`'s output cache, since nothing new was built.
2. `A` builds targets from `B` that do not exist in the input cache file from `B`. If `B` weren't exempt from isolation constraints, this scenario would lead to a build break, as cache misses are illegal under isolation. With `B` being exempt, the new builds of `B` will get included in `A`'s output cache. The results from `B`'s cache file won't get included in `A`'s output cache file, as they weren't built by `A`.
3. A project, which is not in the graph, can be exempt by two parent/child projects from the graph. For example, given the graph `{A->B}`, both `A` and `B` can exempt project `D` (meaning that neither `A` nor `B` have a `ProjectReference` to `D`, but both `A` and `B` have a `GraphIsolationExemptReference` to `D`). The fact that `B` is in the graph means that `A` will receive an input cache containing `B`'s build results. Since `B` builds targets from `D`, it means that `B`'s output cache file also contains target results from `D`. There are two subcases here:
1. `A` builds targets from `D` that already exist in the input cache file from `B`. This is handled in the same way as the above case `2.1.`
2. `A` builds targets from `D` that do not exist in the input cache file from `B`, meaning that `A` builds additional targets from `D` which `B` didn't build. This is handled in the same way as teh above case `2.2.`
* `isolate:MessageUponIsolationViolation` switch. The `RequestBuilder` sets the `SkipStaticGraphIsolationConstraints` property of _every_ `BuildRequest` to `true`. The `TaskBuilder` verifies exemption from isolation constraints just by the switch value.

**Current issue:** if multiple nodes in the graph exempt the same project file, the build results of the exempt project will trickle up and conflict in the first parent that tries to merge them. Documented in issue [#4386](https://github.com/dotnet/msbuild/issues/4386).
\* Except in the following scenario when a `ProjectReference` is exempted from isolation constraints: a dependency project A outputs a cache file F containing a `BuildResult` with `TargetResult`s T<sub>cached</sub> for targets t<sub>1</sub>, t<sub>2</sub>, ..., t<sub>m</sub> and a dependent project B uses F as an input cache file but builds and obtains the `TargetResult`s T<sub>new</sub> for targets t<sub>m + 1</sub>, t<sub>m + 2</sub>, ..., t<sub>n</sub> such that 0 < m < n. In this case, T<sub>new</sub> will be placed into the `ResultsCache` containing T<sub>cached</sub> to enforce no overlap between the override and current caches in the `ConfigCacheWithOverride`.
Loading

0 comments on commit 377228d

Please sign in to comment.