Comparing changes

base repository: sbt/sbt-assembly
base: v1.2.0
head repository: sbt/sbt-assembly
compare: v2.0.0
  • 6 commits
  • 39 files changed
  • 4 contributors

Commits on May 1, 2022

  1. Refactored assembly to use in-memory model instead of IO model (#464)

    What's changed
    -------------------------------
    The plugin has been refactored to use in-memory processing of library entries, in contrast to its old version where library jars are unzipped to disk.
    
    This has positive performance implications, especially for large projects, machines with slow disks (e.g. spinning hard drives) or systems with slow file systems (e.g. WSL1 emulated file access).
    
    It has also streamlined the plugin code for contributors.
    
    New features for end users
    -------------------------------
    - General improved performance compared to older versions
    - `ThisBuild.repeatableBuild` introduced. If set to `false` - parallelizes jar creation for a faster performance at the expense of losing caching and a consistent hash. Defaults to `true`
    - Clearer parameters for a custom `MergeStrategy` - conflicts are provided as a collection of `Dependency`s and the merge result is represented as an `Either` of `JarEntry`s or an error message
    - Caching now prevents both file merging (other than renames) and jar creation when the inputs have not changed. Previously, caching only covered jar creation and files were always merged
     
    Bug fixes
    -------------------------------
    - `Merge` now reports the correct number of affected files
    - Files that conflict with directories after the merge will now be printed as a clear error message to the user, instead of failing at runtime
    - The caching directory now reflects the `crossVersion`. Previously, cross-builds (e.g. `2.12.8`, `2.13.8`) shared the same cache directory, unnecessarily invalidating the cache each time
    
    Breaking changes
    -------------------------------
    - Dropped `cacheUnzip`
    - Dropped `excludedFiles`
    - Directories named LICENSE or README are not renamed anymore under the default merge strategy. If these directories conflict with files named LICENSE or README, the files will be renamed to include the assembly jar name (minus the .jar extension) if it is a project file or the jar name if it is a library jar entry.
    - Fails `*.class` renames via `MergeStrategy.rename`, where it was a silent no-op previously
    fnqista authored May 1, 2022

    592fe87

Commits on Jun 14, 2022

  1. 9795533

Commits on Jun 15, 2022

  1. d090b47

Commits on Oct 4, 2022

  1. Fix windows problems (#472)

    fnqista committed Oct 4, 2022
    14c0f72

Commits on Oct 5, 2022

  1. Merge pull request #478 from fnqista/develop

    Fix windows problems (#472)
    eed3si9n authored Oct 5, 2022

    dbe137f
  2. Run CI on Windows (#471)

    Run CI on Windows
    nightscape authored Oct 5, 2022
    1a529bc
Showing with 1,263 additions and 646 deletions.
  1. +4 −1 .github/workflows/ci.yml
  2. +96 −26 README.md
  3. +1 −2 build.sbt
  4. +1 −5 src/main/contraband/AssemblyOption.contra
  5. +638 −304 src/main/scala/sbtassembly/Assembly.scala
  6. +4 −15 src/main/scala/sbtassembly/AssemblyKeys.scala
  7. +14 −22 src/main/scala/sbtassembly/AssemblyOption.scala
  8. +49 −56 src/main/scala/sbtassembly/AssemblyPlugin.scala
  9. +47 −92 src/main/scala/sbtassembly/AssemblyUtils.scala
  10. +215 −113 src/main/scala/sbtassembly/MergeStrategy.scala
  11. +0 −4 src/sbt-test/caching/caching/build.sbt
  12. +42 −0 src/sbt-test/caching/custommergestrat/build.sbt
  13. +7 −0 src/sbt-test/caching/custommergestrat/project/plugins.sbt
  14. +1 −0 src/sbt-test/caching/custommergestrat/src/main/resources/sbtassembly
  15. +13 −0 src/sbt-test/caching/custommergestrat/test
  16. +26 −0 src/sbt-test/merging/custommergestrat/build.sbt
  17. BIN src/sbt-test/merging/custommergestrat/lib/1.jar
  18. +7 −0 src/sbt-test/merging/custommergestrat/project/plugins.sbt
  19. +1 −0 src/sbt-test/merging/custommergestrat/src/main/resources/sbtassembly
  20. +4 −0 src/sbt-test/merging/custommergestrat/test
  21. +5 −0 src/sbt-test/merging/mergefail3/build.sbt
  22. BIN src/sbt-test/merging/mergefail3/lib/1.jar
  23. +7 −0 src/sbt-test/merging/mergefail3/project/plugins.sbt
  24. +1 −0 src/sbt-test/merging/mergefail3/src/main/resources/sbtassembly
  25. +2 −0 src/sbt-test/merging/mergefail3/test
  26. +12 −0 src/sbt-test/merging/mergefail4/build.sbt
  27. +7 −0 src/sbt-test/merging/mergefail4/project/plugins.sbt
  28. +1 −0 src/sbt-test/merging/mergefail4/src/main/scala/Test.scala
  29. +2 −0 src/sbt-test/merging/mergefail4/test
  30. +5 −5 src/sbt-test/merging/merging/build.sbt
  31. +33 −0 src/sbt-test/merging/rename/build.sbt
  32. +7 −0 src/sbt-test/merging/rename/project/plugins.sbt
  33. +1 −0 src/sbt-test/merging/rename/src/main/resources/a
  34. +1 −0 src/sbt-test/merging/rename/src/main/resources/b
  35. +1 −0 src/sbt-test/merging/rename/src/main/resources/c
  36. +1 −0 src/sbt-test/merging/rename/src/main/resources/d
  37. +1 −0 src/sbt-test/merging/rename/src/main/resources/e
  38. +4 −0 src/sbt-test/merging/rename/test
  39. +2 −1 src/sbt-test/sbt-assembly/piecemeal/build.sbt
5 changes: 4 additions & 1 deletion .github/workflows/ci.yml
@@ -6,10 +6,12 @@ on:
- cron: '0 21 * * 0'
jobs:
test:
runs-on: ubuntu-latest
runs-on: ${{ matrix.os }}

strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-latest]
include:
- java: 8
scala: 2.12.8
@@ -26,6 +28,7 @@ jobs:
java-version: "adopt@1.${{ matrix.java }}"
- uses: coursier/cache-action@v5
- name: Build and test
shell: bash
run: |
sbt -v +publishLocal $(if [[ "${{matrix.scala}}" != "" ]] ; then echo "++${{matrix.scala}}!" ; fi) test scripted
rm -rf "$HOME/.ivy2/local" || true
122 changes: 96 additions & 26 deletions README.md
@@ -74,13 +74,12 @@ one) then you'll end up with a fully executable JAR, ready to rock.
Here is the list of the keys you can rewire that are scoped to current subproject's `assembly` task:

assemblyJarName test mainClass
assemblyOutputPath assemblyOption assembledMappings
assembledMappings
assemblyOutputPath assemblyOption

And here is the list of the keys you can rewire that are scoped globally:

assemblyAppendContentHash assemblyCacheOutput assemblyCacheUnzip
assemblyExcludedJars assemblyMergeStrategy assemblyShadeRules
assemblyAppendContentHash assemblyCacheOutput assemblyShadeRules
assemblyExcludedJars assemblyMergeStrategy assemblyRepeatableBuild

Keys scoped to the subproject should be placed in `.settings(...)` whereas the globally scoped keys can either be placed inside of `.settings(...)` or scoped using `ThisBuild / ` to be shared across multiple subprojects.

@@ -140,10 +139,11 @@ of the following built-in strategies or writing a custom one:
* `MergeStrategy.first` picks the first of the matching files in classpath order
* `MergeStrategy.last` picks the last one
* `MergeStrategy.singleOrError` bails out with an error message on conflict
* `MergeStrategy.concat` simply concatenates all matching files and includes the result
* `MergeStrategy.filterDistinctLines` also concatenates, but leaves out duplicates along the way
* `MergeStrategy.concat` simply concatenates all matching files and includes the result. There is also an overload that accepts a line separator for formatting the result
* `MergeStrategy.filterDistinctLines` also concatenates, but leaves out duplicates along the way. There is also an overload that accepts a `Charset` for reading the lines
* `MergeStrategy.rename` renames the files originating from jar files
* `MergeStrategy.discard` simply discards matching files
* `MergeStrategy.preferProject` will choose the first project file over library files if present. Otherwise, it works like `MergeStrategy.first`

The mapping of path names to merge strategies is done via the setting
`assemblyMergeStrategy` which can be augmented as follows:
@@ -161,8 +161,19 @@ ThisBuild / assemblyMergeStrategy := {
```

**NOTE**:
- Actually, a merge strategy serves two purposes:
* To merge conflicting files
* To transform a single file (despite the naming), such as in the case of a `MergeStrategy.rename`. Sometimes, the transformation is a pass-through, as in the case of a `MergeStrategy.deduplicate` if there are no conflicts on a `target` path.
- `ThisBuild / assemblyMergeStrategy` expects a function. You can't do `ThisBuild / assemblyMergeStrategy := MergeStrategy.first`!
- Some files must be discarded or renamed to avoid breaking the zip (due to duplicate file names) or violating a license. Delegate default handling to `(ThisBuild / assemblyMergeStrategy)`, as in the pattern matching example above.
- Renames are processed first, since renamed file targets might match more merge patterns afterwards. By default, LICENSEs and READMEs are renamed before applying every other merge strategy. If you need custom renaming logic, create a new rename merge strategy so that it is processed first, along with the custom logic. See how to create custom `MergeStrategy`s in a later section of this README.
- There is an edge case that may occasionally fail. If a project has a file that has the same relative path as a directory to be written, an error notification will be written to the console as shown below. To resolve this, create a shade rule or a new merge strategy.

```bash
[error] Files to be written at 'shadeio' have the same name as directories to be written:
[error] Jar name = commons-io-2.4.jar, jar org = commons-io, entry target = shadeio/input/Tailer.class (from original source = org/apache/commons/io/input/Tailer.class)
[error] Project name = foo, target = shadeio
```

By the way, the first case pattern in the above using `PathList(...)` is how you can pick `javax/servlet/*` from the first jar. If the default `MergeStrategy.deduplicate` is not working for you, that likely means you have multiple versions of some library pulled by your dependency graph. The real solution is to fix that dependency graph. You can work around it by `MergeStrategy.first` but don't be surprised when you see `ClassNotFoundException`.
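
As a minimal sketch of how the built-in strategies listed earlier can be combined in one mapping, the fragment below uses `MergeStrategy.first` for a `PathList` pattern and `MergeStrategy.preferProject` for a single file. The file name `application.conf` is only an illustrative assumption, not something this change prescribes.

```scala
// build.sbt fragment (sketch); the matched paths are illustrative assumptions.
ThisBuild / assemblyMergeStrategy := {
  // Take the first javax/servlet/* entry in classpath order.
  case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
  // Let the project's own copy win over copies coming from library jars.
  case "application.conf" => MergeStrategy.preferProject
  // Delegate everything else to the previously configured strategy.
  case x =>
    val oldStrategy = (ThisBuild / assemblyMergeStrategy).value
    oldStrategy(x)
}
```

Keeping the final `case x` delegation means the default handling still applies to every path that is not matched explicitly.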

@@ -192,18 +203,67 @@ Here is the default:
}
```

Custom `MergeStrategy`s can find out where a particular file comes
from using the `sourceOfFileForMerge` method on `sbtassembly.AssemblyUtils`,
which takes the temporary directory and one of the files passed into the
strategy as parameters.
#### Creating a custom Merge Strategy (since 2.0.0)
Custom merge strategies can be plugged into the `assemblyMergeStrategy` function, for example:

```scala
...
ThisBuild / assemblyMergeStrategy := {
case "matching-file" => CustomMergeStrategy("my-custom-merge-strat") { conflicts =>
// NB! same as MergeStrategy.discard
Right(Vector.empty)
}
case x =>
val oldStrategy = (ThisBuild / assemblyMergeStrategy).value
oldStrategy(x)
}
...
```

The `CustomMergeStrategy` accepts a `name` and a `notifyIfGTE` that affects how the result is reported in the logs.
Please see the scaladoc for more details.

Finally, to perform the actual merge/transformation logic, a function has to be provided. The function accepts a `Vector` of `Dependency`, where you can access the `target` of type `String` and the byte payload of type `LazyInputStream`, which is just a type alias for `() => InputStream`.

The input `Dependency` also has two subtypes that you can pattern match on:
- `Project` represents an internal/project dependency
- `Library` represents an external/library dependency that also contains the `ModuleCoordinate` (jar org, name and version) it originated from

To create a merge result, a `Vector` of `JarEntry` must be returned wrapped in an `Either.Right`, or empty to discard these conflicts from the final jar.
`JarEntry` only has two fields, a `target` of type `String` and the byte payload of type lazy `InputStream`.

To fail the assembly, return an `Either.Left` with an error message.
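
The two outcomes described above can be sketched together as follows. This is a hedged example, not the plugin's own code: the strategy name `keep-project-copy` is made up, `JarEntry` is assumed to live in `sbtassembly.Assembly` next to `Project` and `Library`, and `stream` is assumed to be the name of the `LazyInputStream` field on `Dependency`. Please check the scaladoc before relying on these names.

```scala
// build.sbt fragment (sketch under the assumptions stated above).
ThisBuild / assemblyMergeStrategy := {
  case "matching-file" =>
    import sbtassembly.Assembly.{JarEntry, Project}
    CustomMergeStrategy("keep-project-copy") { conflicts =>
      // Look for a conflicting entry that comes from the project itself.
      conflicts.collectFirst { case p: Project => p } match {
        case Some(p) =>
          // Keep only the project's copy; `stream` is an assumed field name.
          Right(Vector(JarEntry(p.target, p.stream)))
        case None =>
          // No project copy found: fail the assembly with an error message.
          Left(s"No project copy found among ${conflicts.size} conflicting entries")
      }
    }
  case x =>
    val oldStrategy = (ThisBuild / assemblyMergeStrategy).value
    oldStrategy(x)
}
```

Returning `Right(Vector.empty)` instead would discard all conflicting entries, as in the earlier example.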

There is also a factory specifically for renames, so it gets processed first along with the built-in rename merge strategy, before other merge strategies, as mentioned in a previous section. It accepts a function `Dependency => String`, so the `Dependency` can be inspected and a new `target` path returned.

Here is an example that appends a `String` to the original `target` path of the matched file.

```scala
...
case "matching-file" =>
import sbtassembly.Assembly.{Project, Library}
CustomMergeStrategy.rename {
case dependency@(_: Project) => dependency.target + "_from_project"
case dependency@(_: Library) => dependency.target + "_from_library"
}
...
```

For more information/examples, see the scaladoc/source code in `sbtassembly.Assembly` and `sbtassembly.MergeStrategy`.

**NOTE**:
- The `name` parameter will be distinguished from a built-in strategy. For example, the `name`=`First` will execute its custom logic along with the built-in `MergeStrategy.first`. They cannot cancel/override one another. In fact, the custom merge strategy will be logged as `First (Custom)` for clarity.
- However, you should still choose a unique `name` for a custom merge strategy within the build. Even if all built-in and custom merge strategies are guaranteed to execute if they match a pattern regardless of their `name`s, similarly-named custom merge strategies will have their log reports joined. YMMV, but you are encouraged to **avoid duplicate names**.

#### Third Party Merge Strategy Plugins

Support for special-case merge strategies beyond the generic scope can be
provided by companion plugins, below is a non-exhaustive list:
~~Support for special-case merge strategies beyond the generic scope can be
provided by companion plugins, below is a non-exhaustive list:~~

~~* Log4j2 Plugin Caches (`Log4j2Plugins.dat`):~~
~~<https://github.com/idio/sbt-assembly-log4j2>~~

* Log4j2 Plugin Caches (`Log4j2Plugins.dat`):
<https://github.com/idio/sbt-assembly-log4j2>
*The log4j plugin needs to be updated for sbt-assembly version 2.0.0

### Shading

@@ -412,19 +472,11 @@ lazy val app = (project in file("app"))

### Caching

By default for performance reasons, the result of unzipping any dependency JAR files to disk is cached from run-to-run. This feature can be disabled by setting:

```scala
ThisBuild / assemblyCacheUnzip := false
Caching is implemented by checking the latest timestamp of all input dependencies (class and jar files) and certain configuration changes in the build file.

// or
lazy val app = (project in file("app"))
.settings(
assembly / assemblyOption ~= { _.withCacheUnzip(false) }
)
```
In addition the über JAR is cached so its timestamp changes only when the input changes.

In addition the über JAR is cached so its timestamp changes only when the input changes. This feature requires checking the SHA-1 hash of all *.class files, and the hash of all dependency *.jar files. If there are a large number of class files, this could take a long time, although with hashing of jar files, rather than their contents, the speed has recently been [improved](https://github.com/sbt/sbt-assembly/issues/68). This feature can be disabled by setting:
To disable caching:

```scala
ThisBuild / assemblyCacheOutput := false
@@ -436,14 +488,32 @@ lazy val app = (project in file("app"))
)
```

**NOTE**:
- Unfortunately, using a custom `MergeStrategy` other than `rename` will create a function in which the plugin cannot predict
the outcome. This custom function must always be executed if it matches a `PathList` pattern, and thus, **will disable caching**.

#### Jar assembly performance

By default, the setting key `assemblyRepeatableBuild` is set to `true`. This ensures that the jar entries are assembled in a specific order, resulting in a consistent hash for the jar.

There is actually a performance improvement to be gained if this setting is set to `false`, since jar entries will now be assembled in parallel. The trade-off is, the jar will not have a consistent hash, and thus, caching will not work.

To set the repeatable build to false:

```scala
ThisBuild / assemblyRepeatableBuild := false
```

If a repeatable build/consistent jar is not of much importance, one may avail of this feature for improved performance, especially for large projects.

### Prepending a launch script

You can prepend a launch script to the über jar. This script will be a valid shell and batch script and will make the jar executable on Unix and Windows. If you enable the shebang, the file will be detected as an executable under Linux, but this will cause an error message to appear on Windows. On Windows, just append ".bat" to the file's name to make it executable.

```scala
import sbtassembly.AssemblyPlugin.defaultUniversalScript

ThisBuild / assemblyPrependShellScript := = Some(defaultUniversalScript(shebang = false)))
ThisBuild / assemblyPrependShellScript := Some(defaultUniversalScript(shebang = false))

lazy val app = (project in file("app"))
.settings(
3 changes: 1 addition & 2 deletions build.sbt
@@ -1,4 +1,4 @@
ThisBuild / version := "1.2.0-SNAPSHOT"
ThisBuild / version := "2.0.1-SNAPSHOT"
ThisBuild / organization := "com.eed3si9n"

def scala212 = "2.12.8"
@@ -12,7 +12,6 @@ lazy val root = (project in file("."))
name := "sbt-assembly"
scalacOptions := Seq("-deprecation", "-unchecked", "-Dscalac.patmat.analysisBudget=1024", "-Xfuture")
libraryDependencies ++= Seq(
"org.scalactic" %% "scalactic" % "3.0.8",
"com.eed3si9n.jarjarabrams" %% "jarjar-abrams-core" % "1.8.1",
"org.scalatest" %% "scalatest" % "3.1.1" % Test,
)
6 changes: 1 addition & 5 deletions src/main/contraband/AssemblyOption.contra
@@ -2,8 +2,6 @@ package sbtassembly
@target(Scala)

type AssemblyOption {
assemblyDirectory: java.io.File @since("0.15.0")

## include compiled class files from itself or subprojects
includeBin: Boolean! = true @since("0.15.0")

@@ -14,14 +12,12 @@ type AssemblyOption {

excludedJars: sbt.Keys.Classpath! = raw"Nil" @since("0.15.0")

excludedFiles: sbtassembly.Assembly.SeqFileToSeqFile! = raw"sbtassembly.Assembly.defaultExcludedFiles" @since("0.15.0")
repeatableBuild: Boolean! = true @since("2.0.0")

mergeStrategy: sbtassembly.MergeStrategy.StringToMergeStrategy! = raw"sbtassembly.MergeStrategy.defaultMergeStrategy" @since("0.15.0")

cacheOutput: Boolean! = true @since("0.15.0")

cacheUnzip: Boolean! = true @since("0.15.0")

appendContentHash: Boolean! = false @since("0.15.0")

prependShellScript: sbtassembly.Assembly.SeqString @since("0.15.0")