Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework exclude rule merging #9307

Merged
merged 55 commits into from May 15, 2019
Merged

Rework exclude rule merging #9307

merged 55 commits into from May 15, 2019

Conversation

melix
Copy link
Contributor

@melix melix commented Apr 27, 2019

Context

As a follow-up to #9197, this commit properly fixes the
exclude rule merging algorithm, by completely rewriting
it. The new merging algorithm works by implementing the
minimal set of algebra operations that make sense to
minimize computation durations. In order to do this,
this commit introduces a number of exclude specs
(found in their own package) and factories to create
actual implementation of those specs.

Specs represent the different kind of excludes we can
find:

  • excluding a group
  • excluding a module (no group defined)
  • excluding a group+module
  • excluding an artifact of a group+module
  • pattern-matching excludes
  • unions of excludes
  • intersections of excludes

With all those minimal bricks, factories are responsible
of generating consistent specs. The dumbest factory
will just generate new instances for everything. This
is the default factory.

Minimally, this factory has to be backed by an optimizing
factory, which will take care of handling special cases:

  • union or intersection of a single spec
  • union or intersection of 2 specs
    • when one of them is null
    • when both are equal

Then we have a factory which performs the minimal algebra
to minimize specs:

  • unions of unions
  • intersections of intersections
  • union of a union and individual specs
  • insection of an intersection and individual spec
  • ...

This factory can be as smart as it can, but one must be
careful that it's worth it: some previously implemented
optimizations (like (A+B).A = A turned out to be costly
to detect, and didn't make it the final cut.

Yet another factory is there to reduce the memory footprint
and, as a side effect, make things faster by interning
the specs: equivalent specs are interned and indexed, which
allows us to optimize unions and intersections of specs.

Last but not least, a caching factory is there to avoid
recomputing the same intersections and unions of specs
when we have already done the job. This is efficient if
the underlying (delegate) specs are easily compared,
which is the case thanks to the interning factory.

All in all, the delegation chain allows us to make
the algorithm fast and hopefully reliable, while
making it easier to debug.

Hopefully this should fix the inconsistent classpath ordering
issues some customers have reported
.

@melix melix added this to the 5.5 RC1 milestone Apr 27, 2019
@melix melix requested review from ljacomet and bigdaz April 27, 2019 20:15
@melix melix self-assigned this Apr 27, 2019
@melix melix force-pushed the cc/dm/rework-exclude-merging branch 2 times, most recently from 0598b6c to 9b3d1c7 Compare April 27, 2019 22:21
@melix melix changed the title Rework exclude rule merging WIP - Rework exclude rule merging Apr 27, 2019
@melix melix force-pushed the cc/dm/rework-exclude-merging branch from 9b3d1c7 to f0d843a Compare April 28, 2019 07:00
Copy link
Member

@ljacomet ljacomet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall!

A couple comments and still need to take a deeper look at the indexed / interned part.

}
simpleExcludes.add((ModuleIdExclude) spec);
}
flattened.add(spec);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure that's a no-op, and if it was not you would get a ConcurrentModificationException

flattened.removeAll(simpleExcludes);
flattened.add(delegate.moduleSet(simpleExcludes.stream().map(ModuleIdExclude::getModuleId).collect(Collectors.toSet())));
} else {
flattened.add(simpleExcludes.iterator().next());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT that spec is already in flattened so this line is a no-op.


where:
left | right | expected
/* everything() | nothing() | everything()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to revert the commenting?

@melix melix force-pushed the cc/dm/rework-exclude-merging branch 6 times, most recently from 8b8a96a to 0d78e67 Compare April 29, 2019 13:47
Copy link
Member

@ljacomet ljacomet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor review comments

@@ -38,6 +38,7 @@ open class MinifyPlugin : Plugin<Project> {
val keepPatterns = mapOf(
"fastutil" to setOf(
"it.unimi.dsi.fastutil.ints.IntOpenHashSet",
"it.unimi.dsi.fastutil.ints.IntRBTreeSet",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is no longer needed AFAICT

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, will remove.

List<ModuleSetExclude> moduleSetExcludes = UnionOf.MODULE_SET.fromMap(byType);
List<ExcludeSpec> other = UnionOf.NOT_JOINABLE.fromMap(byType);
if (!moduleIdExcludes.isEmpty()) {
if (moduleIdExcludes.size() > 1 || !moduleIdSetsExcludes.isEmpty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small comment to express intent might help there, or extract to a method.

// More than one ModuleIdExclude -> make it a ModuleIdSetExclude

Also, with the flattening done below, there might be a way to optimize when the moduleIdSetsExcludes.isEmpty() is true as you now create an immutable list that will be flattened again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code path is not called often and doesn't show up as a hotspot, so I didn't bother optimizing yet.

@melix melix force-pushed the cc/dm/rework-exclude-merging branch 2 times, most recently from 533e90d to ec085cf Compare May 2, 2019 19:48
@@ -992,7 +992,8 @@ include 'other'

@RequiredFeatures([
@RequiredFeature(feature = GradleMetadataResolveRunner.REPOSITORY_TYPE, value = "maven"),
@RequiredFeature(feature = GradleMetadataResolveRunner.EXPERIMENTAL_RESOLVE_BEHAVIOR, value = "true")
@RequiredFeature(feature = GradleMetadataResolveRunner.EXPERIMENTAL_RESOLVE_BEHAVIOR, value = "true"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this test look like leftovers from trying things locally and should be reverted AFAICT

eskatos and others added 7 commits May 3, 2019 14:31
As a follow-up to #9197, this commit properly fixes the
exclude rule merging algorithm, by completely rewriting
it. The new merging algorithm works by implementing the
minimal set of algebra operations that make sense to
minimize computation durations. In order to do this,
this commit introduces a number of exclude specs
(found in their own package) and factories to create
actual implementation of those specs.

Specs represent the different kind of excludes we can
find:

- excluding a group
- excluding a module (no group defined)
- excluding a group+module
- excluding an artifact of a group+module
- pattern-matching excludes
- unions of excludes
- intersections of excludes

With all those minimal bricks, factories are responsible
of generating consistent specs. The dumbest factory
will just generate new instances for everything. This
is the default factory.

Minimally, this factory has to be backed by an optimizing
factory, which will take care of handling special cases:

- union or intersection of a single spec
- union or intersection of 2 specs
   - when one of them is null
   - when both are equal

Then we have a factory which performs the minimal algebra
to minimize specs:

- unions of unions
- intersections of intersections
- union of a union and individual specs
- insection of an intersection and individual spec
- ...

This factory can be as smart as it can, but one must be
careful that it's worth it: some previously implemented
optimizations (like (A+B).A = A turned out to be costly
to detect, and didn't make it the final cut.

Last but not least, a caching factory is there to avoid
recomputing the same intersections and unions of specs
when we have already done the job. This is efficient if
the underlying (delegate) specs are easily compared,
which is the case thanks to the interning factory.

All in all, the delegation chain allows us to make
the algorithm fast and hopefully reliable, while
making it easier to debug.
It didn't prove as fast as it was intended to be. Instead, we performed
the same optimization for single groups/modules as we did for module sets.
The merge cache can be reused at different stages. In particular, when
the normalizing factory normalizes unions/intersections, it is possible
that the normalized union is found in cache.
This commit optimizes flattening by avoiding the creation of intermediate
data structures. In particular using lists we were converting from and to
sets unnecessarily.
Measurements show it's significantly faster than using a read/write lock.
melix added 4 commits May 6, 2019 10:01
When sorting selectors, we will only give priority to a "latest"
selector if it's a require selector.
Sorting can be expensive, especially when the list is almost always sorted.
This commit changes the implementation to use on the fly sorting, as we
add selectors.
@melix melix force-pushed the cc/dm/rework-exclude-merging branch from 233d744 to dbb2ca6 Compare May 6, 2019 10:24
melix added 7 commits May 6, 2019 14:23
This involves the creation of an array, which starts becoming
relevant when we create lots of those objects.
While it's better to use ImmutableList for consistency,
it is surprisingly expensive to have the answer to the
question `isEmpty` on an empty immutable list.

This is done only where it showed up as a hotspot in
profiling: getting artifacts.
Deduplication is done when adding a reason. The rationale
is that in most cases, we have a single reason, and it's extremely
unlikely to have a large set of reasons.
This is an attempt to optimize performance: instead of iterating
over all dependencies, findind constraints, then checking if they
belong to the list of newly activated constraints, which can be
expensive in case of dependency locking, we now reverse the lookup
and for activated constraints, get the list of dependency states
which are effectively affected.
@melix melix force-pushed the cc/dm/rework-exclude-merging branch from 5030414 to be092aa Compare May 8, 2019 14:21
@melix melix force-pushed the cc/dm/rework-exclude-merging branch from cc97554 to 59d2b1f Compare May 9, 2019 16:24
This should reduce the amount of garbage created during
selection of artifact transforms.
@melix melix force-pushed the cc/dm/rework-exclude-merging branch 3 times, most recently from 14185bc to a54fceb Compare May 13, 2019 05:38
melix added 4 commits May 14, 2019 10:20
Serializing attributes (and reading them) can be quite expensive.
In project like Android projects, there are many attributes serialized
and they often have the same values. This prevents overhead by keeping
the serialized values de-duplicated.
Desugaring can be expensive, and this commit reduces the cost
by making sure we don't desugar the same instance of attributes
twice. It makes use of the fact we have an immutable attributes
factory which returns always the same instance of attributes.
Component attribute matching is the most expensive part of attribute
selection whenever there's an ambiguity. In Android, that's very often
the case. This commit changes the caching code to use more than one
(actually cache all) query cache in order to try to reduce selection
duration.
Instead of checking for **every** file entry if the file is new,
and then copy it if it's missing, we only need to do this once,
at the top-level exploded directory: this saves a lot of "file.exist"
calls, which can be quite expensive.
@melix melix force-pushed the cc/dm/rework-exclude-merging branch from f2cf15d to 8c227ac Compare May 14, 2019 21:08
This fixes the Gradleception build.
@melix melix force-pushed the cc/dm/rework-exclude-merging branch from 8c227ac to a6577d6 Compare May 14, 2019 21:19
@melix melix changed the title WIP - Rework exclude rule merging Rework exclude rule merging May 15, 2019
@melix melix merged commit 9e8afca into master May 15, 2019
@melix melix deleted the cc/dm/rework-exclude-merging branch May 15, 2019 06:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants