Remove --experimental_worker_allow_json_protocol
This flag has already been flipped to default to true. Unlike other experimental flags, there is no reason to disable it: a worker can only support a single protocol format at a time, so any build that relies on a JSON worker would simply fail with the flag disabled.

Closes #13599

Closes #14679.

PiperOrigin-RevId: 441738381
keith authored and copybara-github committed Apr 14, 2022
1 parent 081f831 commit 09df7c0
Showing 5 changed files with 58 additions and 79 deletions.
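
As a rough illustration of what the change does (a Python model, not the actual Java code), the gate removed from the worker configuration logic can be sketched as follows. The function names are hypothetical, and the `proto` default for `requires-worker-protocol` is an assumption that this commit does not show.

```python
import enum


class WorkerProtocolFormat(enum.Enum):
    PROTO = "proto"
    JSON = "json"


def requested_format(execution_requirements):
    """Read `requires-worker-protocol`; the proto default is an assumption."""
    value = execution_requirements.get("requires-worker-protocol", "proto")
    return WorkerProtocolFormat(value)


def format_before(execution_requirements, allow_json_flag):
    """Model of the old behavior: JSON was rejected unless the flag was on."""
    fmt = requested_format(execution_requirements)
    if not allow_json_flag and fmt is WorkerProtocolFormat.JSON:
        raise IOError(
            "Persistent worker protocol format must be set to proto unless"
            " --experimental_worker_allow_json_protocol is used")
    return fmt


def format_after(execution_requirements):
    """Model of the new behavior: the requested format is used as-is."""
    return requested_format(execution_requirements)
```

Under this model, a rule that sets `"requires-worker-protocol": "json"` could only ever build with the flag enabled, which is why keeping the flag as an opt-out had no practical use.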
4 changes: 1 addition & 3 deletions site/en/docs/creating-workers.md
Original file line number Diff line number Diff line change
@@ -9,9 +9,7 @@ would benefit from cross-action caching, you may want to implement your own
persistent worker to perform these actions.

The Bazel server communicates with the worker using `stdin`/`stdout`. It
-supports the use of protocol buffers or JSON strings. Support for JSON is
-experimental and thus subject to change. It is guarded behind the
-`--experimental_worker_allow_json_protocol` flag.
+supports the use of protocol buffers or JSON strings.

The worker implementation has two parts:

112 changes: 56 additions & 56 deletions site/en/docs/persistent-workers.md
@@ -3,33 +3,34 @@ Book: /_book.yaml

# Persistent Workers

-This page covers how to use persistent workers, the benefits, requirements,
-and how workers affect sandboxing.
+This page covers how to use persistent workers, the benefits, requirements, and
+how workers affect sandboxing.

A persistent worker is a long-running process started by the Bazel server, which
-functions as a _wrapper_ around the actual _tool_ (typically a compiler), or is
-the _tool_ itself. In order to benefit from persistent workers, the tool must
+functions as a *wrapper* around the actual *tool* (typically a compiler), or is
+the *tool* itself. In order to benefit from persistent workers, the tool must
support doing a sequence of compilations, and the wrapper needs to translate
between the tool's API and the request/response format described below. The same
-worker might be called with and without the `--persistent_worker` flag
-in the same build, and is responsible for appropriately starting and talking to
-the tool, as well as shutting down workers on exit. Each worker instance is
-assigned (but not chrooted to) a separate working directory under
+worker might be called with and without the `--persistent_worker` flag in the
+same build, and is responsible for appropriately starting and talking to the
+tool, as well as shutting down workers on exit. Each worker instance is assigned
+(but not chrooted to) a separate working directory under
`<outputBase>/bazel-workers`.

Using persistent workers is an
-[execution strategy](/docs/user-manual#execution-strategy)
-that decreases start-up overhead, allows more JIT compilation, and enables
-caching of for example the abstract syntax trees in the action execution. This
-strategy achieves these improvements by sending multiple requests to a
-long-running process.
+[execution strategy](/docs/user-manual#execution-strategy) that decreases
+start-up overhead, allows more JIT compilation, and enables caching of for
+example the abstract syntax trees in the action execution. This strategy
+achieves these improvements by sending multiple requests to a long-running
+process.

Persistent workers are implemented for multiple languages, including Java,
[Scala](https://github.com/bazelbuild/rules_scala){: .external},
[Kotlin](https://github.com/bazelbuild/rules_kotlin){: .external}, and more.

-Programs using a NodeJS runtime can use the [@bazel/worker](https://www.npmjs.com/package/@bazel/worker)
-helper library to implement the worker protocol.
+Programs using a NodeJS runtime can use the
+[@bazel/worker](https://www.npmjs.com/package/@bazel/worker) helper library to
+implement the worker protocol.

## Using persistent workers {:#usage}

@@ -38,23 +39,23 @@ uses persistent workers by default when executing builds, though remote
execution takes precedence. For actions that do not support persistent workers,
Bazel falls back to starting a tool instance for each action. You can explicitly
set your build to use persistent workers by setting the `worker`
-[strategy](/docs/user-manual#execution-strategy) for the applicable tool mnemonics.
-As a best practice, this example includes specifying `local` as a fallback to
-the `worker` strategy:
+[strategy](/docs/user-manual#execution-strategy) for the applicable tool
+mnemonics. As a best practice, this example includes specifying `local` as a
+fallback to the `worker` strategy:

```posix-terminal
bazel build //{{ '<var>' }}my:target{{ '</var>' }} --strategy=Javac=worker,local
```

Using the workers strategy instead of the local strategy can boost compilation
-speed significantly, depending on implementation. For Java, builds can be
-2–4 times faster, sometimes more for incremental compilation. Compiling
-Bazel is about 2.5 times as fast with workers. For more details, see the
+speed significantly, depending on implementation. For Java, builds can be 2–4
+times faster, sometimes more for incremental compilation. Compiling Bazel is
+about 2.5 times as fast with workers. For more details, see the
"[Choosing number of workers](#number-of-workers)" section.

If you also have a remote build environment that matches your local build
environment, you can use the experimental
-[_dynamic_ strategy](https://blog.bazel.build/2019/02/01/dynamic-spawn-scheduler.html),
+[*dynamic* strategy](https://blog.bazel.build/2019/02/01/dynamic-spawn-scheduler.html){: .external},
which races a remote execution and a worker execution. To enable the dynamic
strategy, pass the
[--experimental_spawn_scheduler](/reference/command-line-reference#flag--experimental_spawn_scheduler)
@@ -72,8 +73,8 @@ amount of JIT compilation and cache hits you get. With more workers, more
targets will pay start-up costs of running non-JITted code and hitting cold
caches. If you have a small number of targets to build, a single worker may give
the best trade-off between compilation speed and resource usage (for example,
-see [issue #8586](https://github.com/bazelbuild/bazel/issues/8586){: .external}. The
-`worker_max_instances` flag sets the maximum number of worker instances per
+see [issue #8586](https://github.com/bazelbuild/bazel/issues/8586){: .external}.
+The `worker_max_instances` flag sets the maximum number of worker instances per
mnemonic and flag set (see below), so in a mixed system you could end up using
quite a lot of memory if you keep the default value. For incremental builds the
benefit of multiple worker instances is even smaller.
@@ -107,9 +108,8 @@ discarded):

**Figure 2.** Graph of performance improvements of incremental builds.

-The speed-up depends on the change being made. A speed-up of a
-factor 6 is measured in the above situation when a commonly used constant
-is changed.
+The speed-up depends on the change being made. A speed-up of a factor 6 is
+measured in the above situation when a commonly used constant is changed.

## Modifying persistent workers {:#options}

@@ -137,17 +137,12 @@ flag makes each worker request use a separate sandbox directory for all its
inputs. Setting up the [sandbox](/docs/sandboxing) takes some extra time,
especially on macOS, but gives a better correctness guarantee.

-You can use the `--experimental_worker_allow_json_protocol` flag to allow
-workers to communicate with Bazel through JSON instead of protocol buffers
-(protobuf). The worker and the rule that consumes it can then be modified to
-support JSON.
-
The
[`--worker_quit_after_build`](/reference/command-line-reference#flag--worker_quit_after_build)
flag is mainly useful for debugging and profiling. This flag forces all workers
to quit once a build is done. You can also pass
-[`--worker_verbose`](/reference/command-line-reference#flag--worker_verbose) to get
-more output about what the workers are doing. This flag is reflected in the
+[`--worker_verbose`](/reference/command-line-reference#flag--worker_verbose) to
+get more output about what the workers are doing. This flag is reflected in the
`verbosity` field in `WorkRequest`, allowing worker implementations to also be
more verbose.

@@ -184,6 +179,7 @@ ctx.actions.run(
"supports-workers" : "1", "requires-worker-protocol" : "json" }
)
```
+
With this definition, the first use of this action would start with executing
the command line `/bin/some_compiler -max_mem=4G --persistent_worker`. A request
to compile `Foo.java` would then look like:
@@ -196,14 +192,12 @@ inputs: [
]
```

-The worker receives this on `stdin` in JSON format (because
-`requires-worker-protocol` is set to JSON, and
-`--experimental_worker_allow_json_protocol` is passed to the build to enable
-this option). The worker then performs the action, and sends a JSON-formatted
-`WorkResponse` to Bazel on its stdout. Bazel then parses this response and
-manually converts it to a `WorkResponse` proto. To communicate
-with the associated worker using binary-encoded protobuf instead of JSON,
-`requires-worker-protocol` would be set to `proto`, like this:
+The worker receives this on `stdin` in newline-delimited JSON format (because
+`requires-worker-protocol` is set to JSON). The worker then performs the action,
+and sends a JSON-formatted `WorkResponse` to Bazel on its stdout. Bazel then
+parses this response and manually converts it to a `WorkResponse` proto. To
+communicate with the associated worker using binary-encoded protobuf instead of
+JSON, `requires-worker-protocol` would be set to `proto`, like this:

```
execution_requirements = {
@@ -225,21 +219,24 @@ Each worker can currently only process one request at a time. The experimental
threads, if the underlying tool is multithreaded and the wrapper is set up to
understand this.

-In [this GitHub repo](https://github.com/Ubehebe/bazel-worker-examples){: .external}, you can
-see example worker wrappers written in Java as well as in Python. If you are
-working in JavaScript or TypeScript, the [@bazel/worker
-package](https://www.npmjs.com/package/@bazel/worker){: .external} and
+In
+[this GitHub repo](https://github.com/Ubehebe/bazel-worker-examples){: .external},
+you can see example worker wrappers written in Java as well as in Python. If you
+are working in JavaScript or TypeScript, the
+[@bazel/worker package](https://www.npmjs.com/package/@bazel/worker){: .external}
+and
[nodejs worker example](https://github.com/bazelbuild/rules_nodejs/tree/stable/examples/worker){: .external}
might be helpful.

## How do workers affect sandboxing? {:#sandboxing}

Using the `worker` strategy by default does not run the action in a
-[sandbox](/docs/sandboxing), similar to the `local` strategy. You can set
-the `--worker_sandboxing` flag to run all workers inside sandboxes, making sure
-each execution of the tool only sees the input files it's supposed to have. The
-tool may still leak information between requests internally, for instance
-through a cache. Using `dynamic` strategy [requires workers to be sandboxed](https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/exec/SpawnStrategyRegistry.java){: .external}.
+[sandbox](/docs/sandboxing), similar to the `local` strategy. You can set the
+`--worker_sandboxing` flag to run all workers inside sandboxes, making sure each
+execution of the tool only sees the input files it's supposed to have. The tool
+may still leak information between requests internally, for instance through a
+cache. Using `dynamic` strategy
+[requires workers to be sandboxed](https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/exec/SpawnStrategyRegistry.java){: .external}.

To allow correct use of compiler caches with workers, a digest is passed along
with each input file. Thus the compiler or the wrapper can check if the input is
@@ -259,9 +256,12 @@ and this sandboxing must be separately enabled with the
For more information on persistent workers, see:

* [Original persistent workers blog post](https://blog.bazel.build/2015/12/10/java-workers.html)
-* [Haskell implementation description](https://www.tweag.io/blog/2019-09-25-bazel-ghc-persistent-worker-internship/){: .external}
-* [Blog post by Mike Morearty](https://medium.com/@mmorearty/how-to-create-a-persistent-worker-for-bazel-7738bba2cabb){: .external}
+* [Haskell implementation description](https://www.tweag.io/blog/2019-09-25-bazel-ghc-persistent-worker-internship/)
+{: .external}
+* [Blog post by Mike Morearty](https://medium.com/@mmorearty/how-to-create-a-persistent-worker-for-bazel-7738bba2cabb)
+{: .external}
* [Front End Development with Bazel: Angular/TypeScript and Persistent Workers
-w/ Asana](https://www.youtube.com/watch?v=0pgERydGyqo){: .external}
-* [Bazel strategies explained](https://jmmv.dev/2019/12/bazel-strategies.html)
-* [Informative worker strategy discussion on the bazel-discuss mailing list](https://groups.google.com/forum/#!msg/bazel-discuss/oAEnuhYOPm8/ol7hf4KWJgAJ){: .external}
+w/ Asana](https://www.youtube.com/watch?v=0pgERydGyqo) {: .external}
+* [Bazel strategies explained](https://jmmv.dev/2019/12/bazel-strategies.html) {: .external}
+* [Informative worker strategy discussion on the bazel-discuss mailing list](https://groups.google.com/forum/#!msg/bazel-discuss/oAEnuhYOPm8/ol7hf4KWJgAJ)
+{: .external}
Original file line number Diff line number Diff line change
@@ -46,16 +46,6 @@ public class WorkerOptions extends OptionsBase {
})
public Void experimentalPersistentJavac;

-@Option(
-name = "experimental_worker_allow_json_protocol",
-defaultValue = "true",
-documentationCategory = OptionDocumentationCategory.UNDOCUMENTED,
-effectTags = {OptionEffectTag.BUILD_FILE_SEMANTICS},
-help =
-"Allows workers to use the JSON worker protocol until it is determined to be"
-+ " stable.")
-public boolean experimentalJsonWorkerProtocol;
-
/**
* Defines a resource converter for named values in the form [name=]value, where the value is
* {@link ResourceConverter.FLAG_SYNTAX}. If no name is provided (used when setting a default),
Original file line number Diff line number Diff line change
@@ -90,14 +90,6 @@ public WorkerConfig compute(Spawn spawn, SpawnExecutionContext context)

HashCode workerFilesCombinedHash = WorkerFilesHash.getCombinedHash(workerFiles);

-WorkerProtocolFormat protocolFormat = Spawns.getWorkerProtocolFormat(spawn);
-if (!workerOptions.experimentalJsonWorkerProtocol) {
-if (protocolFormat == WorkerProtocolFormat.JSON) {
-throw new IOException(
-"Persistent worker protocol format must be set to proto unless"
-+ " --experimental_worker_allow_json_protocol is used");
-}
-}
WorkerKey key =
createWorkerKey(
spawn,
@@ -108,7 +100,7 @@ public WorkerConfig compute(Spawn spawn, SpawnExecutionContext context)
workerFiles,
workerOptions,
context.speculating(),
-protocolFormat);
+Spawns.getWorkerProtocolFormat(spawn));
return new WorkerConfig(key, flagFiles);
}

1 change: 0 additions & 1 deletion src/test/shell/integration/bazel_worker_test.sh
@@ -35,7 +35,6 @@ example_worker=$(find $BAZEL_RUNFILES -name ExampleWorker_deploy.jar)

add_to_bazelrc "build -s"
add_to_bazelrc "build --spawn_strategy=worker,standalone"
-add_to_bazelrc "build --experimental_worker_allow_json_protocol"
add_to_bazelrc "build --worker_verbose --worker_max_instances=1"
add_to_bazelrc "build --debug_print_action_contexts"
add_to_bazelrc "build --noexperimental_worker_multiplex"

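For context on the protocol this commit makes unconditionally available, here is a minimal sketch of a JSON worker loop in Python. It assumes one JSON `WorkRequest` per line on `stdin`, as the updated documentation describes, and that the response uses the camelCase keys `exitCode`, `output`, and `requestId`. The names `handle_request` and `serve` are hypothetical, and the tool logic is a stand-in that only echoes its arguments.

```python
import io
import json
import sys


def handle_request(request):
    """Hypothetical tool logic: echo the arguments instead of compiling."""
    args = request.get("arguments", [])
    return {
        "exitCode": 0,
        "output": "would run with: " + " ".join(args),
        # Echo the request id back (0 is assumed when the field is absent).
        "requestId": request.get("requestId", 0),
    }


def serve(stdin, stdout):
    """Answer one JSON WorkRequest per line until stdin is closed."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        response = handle_request(json.loads(line))
        # One response per line, flushed so the caller is not left waiting.
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()


if __name__ == "__main__":
    # Only enter the request loop when started as a persistent worker.
    if "--persistent_worker" in sys.argv:
        serve(sys.stdin, sys.stdout)
```

Passing the streams into `serve` keeps the loop easy to exercise with in-memory `io.StringIO` objects. A real worker would also need to keep any non-protocol output off `stdout`.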