Bazel build outputs

Bazel uses an opinionated scheme to store build artifacts, which is a common source of confusion for developers. This page tries to clarify how things work.

The Bazel `output_base`

By design, bazel build commands never write files to a project's source directory (or any of its sub-directories). Instead, Bazel stores all outputs in a separate, user-specific directory called the user_output_root, which defaults to:

~/.cache/bazel/_bazel_$USER on Linux.
%HOME%\_bazel_%USERNAME% on Windows.

For each workspace directory where bazel is run, a separate directory, named the output_base, is created under the user_output_root, as:

  ${user_output_root}/<WORKSPACE_HASH>

Where <WORKSPACE_HASH> is a long hexadecimal hash (computed from the directory's absolute path).

Because this path is hard to predict, run bazel info output_base inside a Bazel project to print it. For example:

$ mkdir -p /tmp/project1 && cd /tmp/project1 && touch WORKSPACE.bazel
$ bazel info output_base
Starting local Bazel server and connecting to it...
/usr/local/google/home/digit/.cache/bazel/_bazel_digit/6c7b78994da78136b5cb6b7607361ad3

$ mkdir -p /tmp/project2 && cd /tmp/project2 && touch WORKSPACE.bazel
$ bazel info output_base
Starting local Bazel server and connecting to it...
/usr/local/google/home/digit/.cache/bazel/_bazel_digit/c37b9d68308ee5abe2f781dd38b733b9

$ mkdir -p /tmp/not-a-project && cd /tmp/not-a-project
$ bazel info output_base
WARNING: Invoking Bazel in batch mode since it is not invoked from within a workspace (below a directory having a WORKSPACE file).
ERROR: The 'info' command is only supported from within a workspace (below a directory having a WORKSPACE file).
See documentation at https://bazel.build/concepts/build-ref#workspace
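Although the path looks unpredictable, it is a deterministic function of the workspace path, which is why two project directories never share an output_base. The following is a rough sketch only, assuming that <WORKSPACE_HASH> is the MD5 hex digest of the workspace's absolute path and that the Linux default user_output_root applies. The helper name `guess_output_base` is hypothetical, and bazel info output_base remains the authoritative source (startup options such as --output_base or --output_user_root change the result).

```python
import getpass
import hashlib
import os


def guess_output_base(workspace_dir, user_output_root=None):
    """Estimate the default output_base for a workspace directory.

    Assumption (not confirmed by this page): <WORKSPACE_HASH> is the MD5 hex
    digest of the workspace's absolute path. Check the result against
    `bazel info output_base` before relying on it.
    """
    if user_output_root is None:
        # Linux default location described above: ~/.cache/bazel/_bazel_$USER
        user_output_root = os.path.expanduser(
            os.path.join("~", ".cache", "bazel", "_bazel_" + getpass.getuser())
        )
    absolute_path = os.path.abspath(workspace_dir).encode("utf-8")
    # MD5 is used here as a plain path fingerprint, not for security.
    workspace_hash = hashlib.md5(absolute_path, usedforsecurity=False).hexdigest()
    return os.path.join(user_output_root, workspace_hash)
```

Because the hash depends only on the path, calling the helper with /tmp/project1 and /tmp/project2 yields two different directories, and renaming a project directory yields a new one.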

This scheme is flexible but not perfect:

Pro: multiple users on the same machine can share the same read-only project directory.
Pro: multiple project directories for the same user will always use independent output paths.
Con: Looking at generated files directly from the command-line or even graphical explorers is very difficult.
Con: Removing a project directory (e.g. with rm -rf .../my-project) does not remove its outputs (a significant source of waste).
Con: Moving a project directory (e.g. with mv my-project my-project2) does not reuse previous output_base content (and leaves the old ones in place, now inaccessible).
Con: The default location for the user_output_root, and thus the output_base, is often not on the same filesystem / partition as the project. This can have unexpected consequences in terms of performance / disk usage.

Call bazel clean to remove build outputs from the current output_base. This must be done before removing a source project directory.

In practice, it is easy to bloat the content of the user_output_root with build artifacts from stale Bazel projects that were never properly cleaned. Worse, trying to simply remove the user_output_root manually may not work, because Bazel creates read-only build artifacts by default, which prevent a command like rm -rf ~/.cache/bazel from working!
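For a stale output_base whose project directory is already gone, one workaround is to restore write permission on every directory before deleting the tree. This is a minimal sketch, assuming a POSIX filesystem where only directory permissions block deletion. `force_remove_tree` is a hypothetical helper name, and bazel clean --expunge run from the workspace remains the preferred route while the project still exists.

```python
import os
import shutil
import stat


def force_remove_tree(root):
    """Delete a directory tree even if its directories are read-only."""
    # Make the top-level directory writable so its entries can be unlinked.
    os.chmod(root, os.stat(root).st_mode | stat.S_IRWXU)
    # os.walk is top-down by default, so each sub-directory is made writable
    # before the walk descends into it.
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            # Skip symlinks: chmod would follow them and modify their targets
            # (for example the project's own source tree).
            if not os.path.islink(path):
                os.chmod(path, os.stat(path).st_mode | stat.S_IRWXU)
    shutil.rmtree(root)
```

The permission change is applied to directories only, because on POSIX systems removing a file requires write access to its parent directory, not to the file itself.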

Bazel `output_base` content:

Several things are actually stored under the output_base:

Workspace directories for external repositories:

Those correspond to external project dependencies. Often these are not part of the project's source tree, but downloaded from the network or generated programmatically.

Their content is stored under ${output_base}/external/<repository_name>, where the external part is hard-coded, and <repository_name> matches the external repository's canonical name.

Note that this content is not removed by bazel clean by default. Use bazel clean --expunge to remove it from the output_base.

Build artifacts:

The files generated by running bazel build. These are stored under:

${output_base}/execroot/<workspace_name>/bazel-out/<config_dir>/bin/

Where:

The execroot, bazel-out and bin parts are hard-coded and cannot be changed.
For targets defined in the project's own BUILD.bazel files, <workspace_name> defaults to __main__, unless it is set in the project's WORKSPACE.bazel file with a directive like:

  ```
  workspace(
    name = "my_project",
  )
  ```

- For targets defined in external repositories, `<workspace_name>` matches
the repository's canonical name.

- The `<config_dir>` value is a name derived from the build configuration used
to configure the target that generated the build artifact. This allows
rebuilding the same target in different ways, each time using a different
`<config_dir>` value.

Note: The `<config_dir>` value is **generally unpredictable**. More on this [here][bazel-config-dirs]
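The layout described above can be written down as a small path-building helper. This is an illustrative sketch only: `artifact_path` is a hypothetical name, and in practice the `<config_dir>` component should come from Bazel itself (for example from the output of bazel info bazel-bin) rather than being guessed.

```python
import posixpath


def artifact_path(output_base, workspace_name, config_dir, relative_path):
    """Compose the location of a build artifact under the output_base.

    The "execroot", "bazel-out" and "bin" components are the hard-coded
    parts of the layout; the other components vary per project and build.
    """
    return posixpath.join(
        output_base,
        "execroot",
        workspace_name,
        "bazel-out",
        config_dir,
        "bin",
        relative_path,
    )
```

For instance, passing `__main__`, `k8-fastbuild` and `src/foo/foo.o` produces a path ending in execroot/__main__/bazel-out/k8-fastbuild/bin/src/foo/foo.o.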

Test results:

The log files generated when bazel test is called, stored under ${output_base}/execroot/<workspace_name>/bazel-out/<config_dir>/testlogs/.

Internal cache and configuration files:

Used by remote builds and remote cache features. These files can be ignored by developers.

Bazel `execroot` directory:

The execroot is used to run Bazel commands that generate build artifacts, but how this is done depends on whether sandboxing is enabled for a specific action.

On Linux, sandboxing is enabled by default for all actions. There is no sandboxing support on Windows (as of Bazel 7).
A Bazel action can disable sandboxing intentionally by using the no-sandbox tag in its definition.
Sandboxing can be disabled globally with an option like --spawn_strategy=local when invoking bazel.

Without sandboxing:

When sandboxing is disabled, all build actions that generate artifacts for a given workspace will put output files under ${output_base}/execroot/<workspace_name>.

All paths to source files and build artifacts that appear in the action's command will thus be relative to it.

Bazel ensures that symlinks to the input sources used by the command are created under the execroot before the command is launched.

For example, an action that would compile the //src/foo/foo.cc file, which contains #include "foo.h", corresponding to //src/foo/foo.h could look like:

gcc -c -o bazel-out/k8-fastbuild/bin/src/foo/foo.o src/foo/foo.cc -Isrc/foo

Which works because:

Before running the command, Bazel would create the symlink ${output_base}/execroot/__main__/src pointing to $PROJECT/src, so that src/foo/foo.cc and src/foo/foo.h resolve to $PROJECT/src/foo/foo.cc and $PROJECT/src/foo/foo.h as expected.
The location bazel-out/k8-fastbuild/bin/src/foo/foo.o is the final output path, for the object file created by compiling foo.cc in the build configuration used by this command.
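The symlink staging described above can be imitated outside of Bazel to see why the relative paths resolve. This is a simplified sketch, not Bazel's actual implementation: `stage_execroot` and `demo` are hypothetical names, the project is a throwaway temporary directory, and a platform with symlink support (such as Linux) is assumed.

```python
import os
import tempfile


def stage_execroot(project_dir, execroot_dir, top_level_entries):
    """Symlink top-level source entries of the project into the execroot."""
    os.makedirs(execroot_dir, exist_ok=True)
    for entry in top_level_entries:
        link = os.path.join(execroot_dir, entry)
        if not os.path.lexists(link):
            os.symlink(os.path.join(project_dir, entry), link)


def demo():
    """Return True if src/foo/foo.cc under the execroot is the project file."""
    with tempfile.TemporaryDirectory() as tmp:
        project = os.path.join(tmp, "project")
        os.makedirs(os.path.join(project, "src", "foo"))
        source = os.path.join(project, "src", "foo", "foo.cc")
        with open(source, "w") as f:
            f.write('#include "foo.h"\n')

        execroot = os.path.join(tmp, "output_base", "execroot", "__main__")
        stage_execroot(project, execroot, ["src"])

        # The path the compiler would see, relative to the execroot.
        staged = os.path.join(execroot, "src", "foo", "foo.cc")
        return os.path.realpath(staged) == os.path.realpath(source)
```

Only the single `src` symlink is needed here: every path below it, including the header found through -Isrc/foo, resolves through that link into the project directory.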

With sandboxing:

When sandboxing is enabled, Bazel creates a temporary directory for each command (e.g. ${output_base}/sandbox/linux-sandbox/<random-number>) and populates it with a symlink tree that mimics the execroot layout, but only for the inputs it knows about. In this case, the steps would look like:

Create symlinks for the inputs ${sandbox}/execroot/__main__/src/foo/foo.cc and ${sandbox}/execroot/__main__/src/foo/foo.h, pointing to $PROJECT/src/foo/foo.cc and $PROJECT/src/foo/foo.h respectively.
Run the exact same command under ${sandbox}/execroot/__main__ instead of ${output_base}/execroot/__main__.
Once the command completes, copy the known output from the sandbox path at ${sandbox}/execroot/__main__/bazel-out/k8-fastbuild/bin/src/foo/foo.o to its final location at ${output_base}/execroot/__main__/bazel-out/k8-fastbuild/bin/src/foo/foo.o.
Finally, remove the sandbox directory and all its content. This also means that undeclared outputs are ignored.
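The four steps above can be sketched as a small simulation. This is a simplified model under stated assumptions rather than Bazel's real sandbox: `run_sandboxed` is a hypothetical helper, the action is any Python callable that receives its working directory, and pre-creating the output directories inside the sandbox is a convenience of this sketch.

```python
import os
import shutil


def run_sandboxed(project_dir, execroot_dir, sandbox_dir, inputs, outputs,
                  action, workspace_name="__main__"):
    """Run `action` in a throwaway tree that mimics the execroot layout.

    `inputs` and `outputs` are paths relative to the execroot. `action` is
    called with the directory it must treat as its working directory.
    """
    sandbox_execroot = os.path.join(sandbox_dir, "execroot", workspace_name)

    # Step 1: symlink only the declared inputs into the sandbox.
    for rel in inputs:
        link = os.path.join(sandbox_execroot, rel)
        os.makedirs(os.path.dirname(link), exist_ok=True)
        os.symlink(os.path.join(project_dir, rel), link)

    # Sketch assumption: parent directories of declared outputs exist up front.
    for rel in outputs:
        out_dir = os.path.dirname(os.path.join(sandbox_execroot, rel))
        os.makedirs(out_dir, exist_ok=True)

    try:
        # Step 2: run the same command, but rooted in the sandbox.
        action(sandbox_execroot)

        # Step 3: copy only the declared outputs to their final location.
        for rel in outputs:
            dest = os.path.join(execroot_dir, rel)
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            shutil.copy2(os.path.join(sandbox_execroot, rel), dest)
    finally:
        # Step 4: discard the sandbox, including any undeclared outputs.
        shutil.rmtree(sandbox_dir, ignore_errors=True)
```

An action that writes an extra file next to foo.o would see that file vanish with the sandbox, which mirrors how undeclared outputs are dropped.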