Skip to content

Commit

Permalink
Rewrite some visibility documentation
Browse files Browse the repository at this point in the history
This is a precursor to documenting the new changes to visibility -- "public", "private", and --incomaptible_fix_package_group_syntax; and to documenting the new bzl visibility feature.

package_group() function doc:
- `into text`: Clarify visibility levels. Replace legacy "rule" terminology with more modern "target", and be a little more pedantic about granting access to packages vs targets.
- `packages` attr: Be consistent about these being "package specifications", not an enumeration of packages. Say the allowed forms, then the semantics without respect to `includes`. Mention `//...` special case, and repo behavior.
- `includes` attr: Move interaction between negation and includes to here.

dependencies.md:
- Better hint that granting access to a package group can never remove access from the package to itself ("may grant broader visibility").
- Describe visibility as a "dual" of dependency rather than "opposite".
- Paragraph break.

visibility.md:
- Again be pedantic about access for package vs target, while factoring out "targets defined in" verbiage.
- Again be pedantic about not suggesting you can remove access from a package to itself.
- Factor out mention of how the syntax is weird.
- Expand wording for load() visibility a little.

codebase.md:
- Drive by rewrite of part of Skyframe section.
- Drive-by fixes for Starlark section.
- Streamline wording of visibility section.
- Add more mention of package_group implementation classes. Maybe a little more verbose than needed, but I figured it was better to include them than not.

Work toward #11261.

PiperOrigin-RevId: 478572996
Change-Id: I1edc9bc80337976cc5dc21cf48a1d7e00924c63f
  • Loading branch information
brandjon authored and Copybara-Service committed Oct 3, 2022
1 parent 93e091f commit 0d5eb3c
Show file tree
Hide file tree
Showing 4 changed files with 171 additions and 138 deletions.
17 changes: 8 additions & 9 deletions site/en/basics/dependencies.md
Expand Up @@ -73,18 +73,17 @@ Bazel in the

## Minimizing Module Visibility

Bazel and other build systems allow each target to specify a visibility: a
property that specifies which other targets may depend on it. Targets can be
public, in which case they can be referenced by any other target in the
workspace; private, in which case they can be referenced only from within the
same `BUILD` file; or visible to only an explicitly defined list of other
targets. A visibility is essentially the opposite of a dependency: if target A
wants to depend on target B, target B must make itself visible to target A. As
with most programming languages, it is usually best to minimize visibility as
Bazel and other build systems allow each target to specify a visibility — a
property that determines which other targets may depend on it. A private target
can only be referenced within its own `BUILD` file. A target may grant broader
visibility to the targets of an explicitly defined list of `BUILD` files, or, in
the case of public visibility, to every target in the workspace.

As with most programming languages, it is usually best to minimize visibility as
much as possible. Generally, teams at Google will make targets public only if
those targets represent widely used libraries available to any team at Google.
Teams that require others to coordinate with them before using their code will
maintain an allow list of customer targets as their target’s visibility. Each
maintain an allowlist of customer targets as their target’s visibility. Each
team’s internal implementation targets will be restricted to only directories
owned by the team, and most `BUILD` files will have only one target that isn’t
private.
Expand Down
47 changes: 25 additions & 22 deletions site/en/concepts/visibility.md
Expand Up @@ -19,32 +19,33 @@ For more details on package and subpackages, see
## Visibility specifications {:#visibility-specifications}

All rule targets have a `visibility` attribute that takes a list of labels. One
target is visible to another if they are in the same package, or if they are
granted visibility by one of the labels.
target is visible to another if they are in the same package, or if visibility
is granted to the depending target's package.

Each label has one of the following forms:
Each label has one of the following forms. With the exception of the last form,
these are just syntactic placeholders that do not correspond to any actual
target.

* `"//visibility:public"`: Anyone can use this target. (May not be combined
* `"//visibility:public"`: Grants access to all packages. (May not be combined
with any other specification.)

* `"//visibility:private"`: Only targets in this package can use this
target. (May not be combined with any other specification.)
* `"//visibility:private"`: Does not grant any additional access; only targets
in this package can use this target. (May not be combined with any other
specification.)

* `"//foo/bar:__pkg__"`: Grants access to targets defined in `//foo/bar` (but
not its subpackages). Here, `__pkg__` is a special piece of syntax
representing all of the targets in a package.
* `"//foo/bar:__pkg__"`: Grants access to `//foo/bar` (but not its
subpackages).

* `"//foo/bar:__subpackages__"`: Grants access to targets defined in
`//foo/bar`, or any of its direct or indirect subpackages. Again,
`__subpackages__` is special syntax.
* `"//foo/bar:__subpackages__"`: Grants access `//foo/bar` and all of its
direct and indirect subpackages.

* `"//foo/bar:my_package_group"`: Grants access to all of the packages named
by the given [package group](/reference/be/functions#package_group).
* `"//some_pkg:my_package_group"`: Grants access to all of the packages that
are part of the given [`package_group`](/reference/be/functions#package_group).

* Package groups do not support the special `__pkg__` and
`__subpackages__` syntax. Within a package group, `"//foo/bar"` is
equivalent to `"//foo/bar:__pkg__"` and `"//foo/bar/..."` is equivalent
to `"//foo/bar:__subpackages__"`.
* Package groups use a different syntax for specifying packages. Within a
package group, the forms `"//foo/bar:__pkg__"` and
`"//foo/bar:__subpackages__"` are respectively replaced by `"//foo/bar"`
and `"//foo/bar/..."`.

For example, if `//some/package:mytarget` has its `visibility` set to
`[":__subpackages__", "//tests:__pkg__"]`, then it could be used by any target
Expand Down Expand Up @@ -187,12 +188,14 @@ for more details.

## Visibility of bzl files {:#visibility-bzl-files}

`load` statements are currently not subject to visibility. It is possible to
load a `bzl` file anywhere in the workspace.
`BUILD` and `.bzl` files, as processed by Bazel during loading, are not
considered to be targets and therefore are not subject to visibility. It is
therefore possible to load a `.bzl` file from anywhere in the workspace.

However, users may choose to run the Buildifier linter.
The [bzl-visibility](https://github.com/bazelbuild/buildtools/blob/master/WARNINGS.md#bzl-visibility) check
provides a warning if users `load` from beneath a subdirectory named `internal` or `private`.
The [bzl-visibility](https://github.com/bazelbuild/buildtools/blob/master/WARNINGS.md#bzl-visibility)
check provides a warning if users `load` from beneath a subdirectory named
`internal` or `private`.

## Visibility of implicit dependencies {:#visibility-implicit-dependencies}

Expand Down
142 changes: 72 additions & 70 deletions site/en/contribute/codebase.md
Expand Up @@ -246,17 +246,17 @@ it)
Every repository is composed of packages, a collection of related files and
a specification of the dependencies. These are specified by a file called
`BUILD` or `BUILD.bazel`. If both exist, Bazel prefers `BUILD.bazel`; the reason
why BUILD files are still accepted is that Bazel’s ancestor, Blaze, used this
why `BUILD` files are still accepted is that Bazel’s ancestor, Blaze, used this
file name. However, it turned out to be a commonly used path segment, especially
on Windows, where file names are case-insensitive.

Packages are independent of each other: changes to the BUILD file of a package
cannot cause other packages to change. The addition or removal of BUILD files
Packages are independent of each other: changes to the `BUILD` file of a package
cannot cause other packages to change. The addition or removal of `BUILD` files
_can _change other packages, since recursive globs stop at package boundaries
and thus the presence of a BUILD file stops the recursion.
and thus the presence of a `BUILD` file stops the recursion.

The evaluation of a BUILD file is called "package loading". It's implemented in
the class `PackageFactory`, works by calling the Starlark interpreter and
The evaluation of a `BUILD` file is called "package loading". It's implemented
in the class `PackageFactory`, works by calling the Starlark interpreter and
requires knowledge of the set of available rule classes. The result of package
loading is a `Package` object. It's mostly a map from a string (the name of a
target) to the target itself.
Expand Down Expand Up @@ -304,7 +304,7 @@ Packages are composed of targets, which have the following types:

The name of a target is called a _Label_. The syntax of labels is
`@repo//pac/kage:name`, where `repo` is the name of the repository the Label is
in, `pac/kage` is the directory its BUILD file is in and `name` is the path of
in, `pac/kage` is the directory its `BUILD` file is in and `name` is the path of
the file (if the label refers to a source file) relative to the directory of the
package. When referring to a target on the command line, some parts of the label
can be omitted:
Expand All @@ -321,9 +321,9 @@ be implemented either in Starlark (the `rule()` function) or in Java (so called
rule will be implemented in Starlark, but some legacy rule families (such as Java
or C++) are still in Java for the time being.

Starlark rule classes need to be imported at the beginning of BUILD files using
the `load()` statement, whereas Java rule classes are "innately" known by Bazel,
by virtue of being registered with the `ConfiguredRuleClassProvider`.
Starlark rule classes need to be imported at the beginning of `BUILD` files
using the `load()` statement, whereas Java rule classes are "innately" known by
Bazel, by virtue of being registered with the `ConfiguredRuleClassProvider`.

Rule classes contain information such as:

Expand Down Expand Up @@ -365,41 +365,39 @@ similarly-named package `com.google.devtools.build.lib.skyframe` contains the
implementation of Bazel on top of Skyframe. More information about Skyframe is
available [here](/reference/skyframe).

Generating a new `SkyValue` involves the following steps:

1. Running the associated `SkyFunction`
2. Declaring the dependencies (such as `SkyValue`s) that the `SkyFunction` needs
to do its job. This is done by calling the various overloads of
`SkyFunction.Environment.getValue()`.
3. If a dependency is not available, Skyframe signals that by returning null
from `getValue()`. In this case, the `SkyFunction` is expected to yield
control to Skyframe by returning null, then Skyframe evaluates the
dependencies that haven't been evaluated yet and calls the `SkyFunction`
again, thus going back to (1).
4. Constructing the resulting `SkyValue`

A consequence of this is that if not all dependencies are available in (3), the
function needs to be completely restarted and thus computation needs to be
re-done, which is obviously inefficient. `SkyFunction.Environment.getState()`
lets us directly work around this issue by having Skyframe maintain the
`SkyKeyComputeState` instance between calls to `SkyFunction.compute` for the
same `SkyKey`. Check out the example in the javadoc for
`SkyFunction.Environment.getState()`, as well as real usages in the Bazel
codebase.

Other indirect workarounds:

1. Declaring dependencies of `SkyFunction`s in groups so that if a function
has, say, 10 dependencies, it only needs to restart once instead of ten
times.
2. Splitting `SkyFunction`s so that one function does not need to be restarted
many times. This has the side effect of interning data into Skyframe that
may be internal to the `SkyFunction`, thus increasing memory use.

These are all just workarounds for the limitations of Skyframe, which
is mostly a consequence of the fact that Java doesn't support lightweight
threads and that we routinely have hundreds of thousands of in-flight Skyframe
nodes.
To evaluate a given `SkyKey` into a `SkyValue`, Skyframe will invoke the
`SkyFunction` corresponding to the type of the key. During the function's
evaluation, it may request other dependencies from Skyframe by calling the
various overloads of `SkyFunction.Environment.getValue()`. This has the
side-effect of registering those dependencies into Skyframe's internal graph, so
that Skyframe will know to re-evaluate the function when any of its dependencies
change. In other words, Skyframe's caching and incremental computation work at
the granularity of `SkyFunction`s and `SkyValue`s.

Whenever a `SkyFunction` requests a dependency that is unavailable, `getValue()`
will return null. The function should then yield control back to Skyframe by
itself returning null. At some later point, Skyframe will evaluate the
unavailable dependency, then restart the function from the beginning — only this
time the `getValue()` call will succeed with a non-null result.

A consequence of this is that any computation performed inside the `SkyFunction`
prior to the restart must be repeated. But this does not include work done to
evaluate dependency `SkyValues`, which are cached. Therefore, we commonly work
around this issue by:

1. Declaring dependencies in batches (by using `getValuesAndExceptions()`) to
limit the number of restarts.
2. Breaking up a `SkyValue` into separate pieces computed by different
`SkyFunction`s, so that they can be computed and cached independently. This
should be done strategically, since it has the potential to increases memory
usage.
3. Storing state between restarts, either using
`SkyFunction.Environment.getState()`, or keeping an ad hoc static cache
"behind the back of Skyframe".

Fundamentally, we need these types of workarounds because we routinely have
hundreds of thousands of in-flight Skyframe nodes, and Java doesn't support
lightweight threads.

## Starlark {:#starlark}

Expand All @@ -410,16 +408,16 @@ guarantees to enable concurrent reads. It is not Turing-complete, which
discourages some (but not all) users from trying to accomplish general
programming tasks within the language.

Starlark is implemented in the `com.google.devtools.build.lib.syntax` package.
Starlark is implemented in the `net.starlark.java` package.
It also has an independent Go implementation
[here](https://github.com/google/starlark-go){: .external}. The Java
implementation used in Bazel is currently an interpreter.

Starlark is used in four contexts:
Starlark is used in several contexts, including:

1. **The BUILD language.** This is where new rules are defined. Starlark code
running in this context only has access to the contents of the BUILD file
itself and Starlark files loaded by it.
1. **The `BUILD` language.** This is where new rules are defined. Starlark code
running in this context only has access to the contents of the `BUILD` file
itself and `.bzl` files loaded by it.
2. **Rule definitions.** This is how new rules (such as support for a new
language) are defined. Starlark code running in this context has access to
the configuration and data provided by its direct dependencies (more on this
Expand All @@ -430,8 +428,8 @@ Starlark is used in four contexts:
are defined. Starlark code running in this context can run arbitrary code on
the machine where Bazel is running, and reach outside the workspace.

The dialects available for BUILD and .bzl files are slightly different because
they express different things. A list of differences is available
The dialects available for `BUILD` and `.bzl` files are slightly different
because they express different things. A list of differences is available
[here](/rules/language#differences-between-build-and-bzl-files).

More information about Starlark is available
Expand All @@ -446,7 +444,7 @@ quite sensibly, a (target, configuration) pair.
It's called the "loading/analysis phase" because it can be split into two
distinct parts, which used to be serialized, but they can now overlap in time:

1. Loading packages, that is, turning BUILD files into the `Package` objects
1. Loading packages, that is, turning `BUILD` files into the `Package` objects
that represent them
2. Analyzing configured targets, that is, running the implementation of the
rules to produce the action graph
Expand Down Expand Up @@ -831,18 +829,17 @@ It's under review in pull request
### Visibility {:#visibility}

If you work on a large codebase with a lot of developers (like at Google), you
don't necessarily want everyone else to be able to depend on your code so that
you retain the liberty to change things that you deem to be implementation
details (otherwise, as per [Hyrum's law](https://www.hyrumslaw.com/){: .external},
people _will_ come to depend on all parts of your code).

Bazel supports this by the mechanism called _visibility: _you can declare that a
particular rule can only be depended on using the visibility attribute
(documentation
[here](/reference/be/common-definitions#common-attributes)).
This attribute is a little special because unlike every other attribute, the set
of dependencies it generates is not simply the set of labels listed (yes, this
is a design flaw).
want to take care to prevent everyone else from arbitrarily depending on your
code. Otherwise, as per [Hyrum's law](https://www.hyrumslaw.com/){: .external},
people _will_ come to rely on behaviors that you considered to be implementation
details.

Bazel supports this by the mechanism called _visibility_: you can declare that a
particular target can only be depended on using the
[visibility](/reference/be/common-definitions#common-attributes) attribute. This
attribute is a little special because, although it holds a list of labels, these
labels may encode a pattern over package names rather than a pointer to any
particular target. (Yes, this is a design flaw.)

This is implemented in the following places:

Expand All @@ -852,9 +849,14 @@ This is implemented in the following places:
packages directly (`//pkg:__pkg__`) or subtrees of packages
(`//pkg:__subpackages__`). This is different from the command line syntax,
which uses `//pkg:*` or `//pkg/...`.
* Package groups are implemented as their own target and configured target
types (`PackageGroup` and `PackageGroupConfiguredTarget`). We could probably
replace these with simple rules if we wanted to.
* Package groups are implemented as their own target (`PackageGroup`) and
configured target (`PackageGroupConfiguredTarget`). We could probably
replace these with simple rules if we wanted to. Their logic is implemented
with the help of: `PackageSpecification`, which corresponds to a
single pattern like `//pkg/...`; `PackageGroupContents`, which corresponds
to a single `package_group`'s `packages` attribute; and
`PackageSpecificationProvider`, which aggregates over a `package_group` and
its transitive `includes`.
* The conversion from visibility label lists to dependencies is done in
`DependencyResolver.visitTargetVisibility` and a few other miscellaneous
places.
Expand Down Expand Up @@ -915,7 +917,7 @@ built). Derived artifacts can themselves be multiple kinds:

There is no fundamental reason why source artifacts cannot be tree artifacts or
unresolved symlink artifacts, it's just that we haven't implemented it yet (we
should, though -- referencing a source directory in a BUILD file is one of the
should, though -- referencing a source directory in a `BUILD` file is one of the
few known long-standing incorrectness issues with Bazel; we have an
implementation that kind of works which is enabled by the
`BAZEL_TRACK_SOURCE_DIRECTORIES=1` JVM property)
Expand Down Expand Up @@ -1672,6 +1674,6 @@ tools. There are many examples of `BuildIntegrationTestCase` classes in the
Bazel repository.

Analysis tests are implemented as subclasses of `BuildViewTestCase`. There is a
scratch file system you can use to write BUILD files, then various helper
scratch file system you can use to write `BUILD` files, then various helper
methods can request configured targets, change the configuration and assert
various things about the result of the analysis.

0 comments on commit 0d5eb3c

Please sign in to comment.