Faster Vector concatenation #10159

Merged
merged 6 commits into from Nov 30, 2022

Commits on Sep 16, 2022

  1. fix minor issues concerning large Vectors before implementing faster concatenation
    
    * misleading indentation
    * `a6 = new Arr6(LASTWIDTH)`:
      The whole point of LASTWIDTH is that Arr6 may be larger than the other arrays.
      This is considered in all initializations except here.
      The added test shows how initFrom failed so far on arrays with size in
      (Int.MaxValue/2, Int.MaxValue] and works now.
    * `else if (xor < WIDTH6) { // level = 5`:
      Similarly to the last one, Arr6 may have a size up to LASTWIDTH, so this check
      fails for valid Arr6s with size in [32,64]. About WIDTH6, see next point.
    * `BITS6` and `WIDTH6`:
      There is no valid usage for them. The BITSn and WIDTHn variables mark which bits
      of the index correspond to which level, or more precisely, Vectors up to level n
      have sizes in [0, WIDTHn). The correct value of BITS6 would therefore be (BITS * 6 + 1),
      and WIDTH6 would be Int.MaxValue+1, one "more" than the longest possible vector
      of level 6.
      The current value is just wrong, as it arbitrarily "cuts" the range of 6 bits
      which compose the index in Arr6 into 1 high and 5 lower bits.
      In fact, the line above was the only place where one of `BITS6` and `WIDTH6` was used.
      I suspect this was a little oversight because of the irregularity of level 6.
    ansvonwa committed Sep 16, 2022
    74955b8
  2. d7d696b
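
The arithmetic behind the `BITS6`/`WIDTH6` point in commit 74955b8 can be checked with a few constants. The sketch below is a standalone illustration, not the library source: `BITS = 5` and `LASTWIDTH = 2 * WIDTH` are inferred from the sizes 32 and 64 quoted in the commit message, the shift used for the Arr6 slot is an assumption that follows from them, and the names `oldWidth6`, `correctWidth6` and `arr6Slot` are hypothetical.

```scala
object VectorLevelWidths {
  // Assumed from the commit message: each level consumes 5 index bits (WIDTH = 32),
  // and the outermost Arr6 may hold up to LASTWIDTH = 64 entries.
  val BITS: Int = 5
  val WIDTH: Int = 1 << BITS        // 32
  val LASTWIDTH: Int = WIDTH << 1   // 64

  // Value implied for the removed constant: BITS * 6 = 30 bits.
  val oldWidth6: Long = 1L << (BITS * 6)

  // Value the commit message calls correct: BITS * 6 + 1 = 31 bits.
  val correctWidth6: Long = 1L << (BITS * 6 + 1)

  // Slot in Arr6 that a given element index falls into (assumes 5 lower levels).
  def arr6Slot(index: Int): Int = index >>> (BITS * 5)
}
```

Under these assumptions, any index of at least `oldWidth6` lands in an Arr6 slot of 32 or higher, which is why a check of the form `xor < WIDTH6` would reject valid large vectors, while `correctWidth6` lies just past `Int.MaxValue`.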

Commits on Sep 24, 2022

  1. add VectorBuilder.alignTo and fast Vector.prependedAll

    * add alignTo and leftAlignPrefix in VectorBuilder:
      alignTo(n: Int, v: Vector) ensures that if n elements are added
      before Vector v, the Builder is aligned to the structure of v.
      This way, the Arrays of v can be reused in the Builder, which improves
      performance drastically.
      Formally, alignTo sets `len1`, `lenRest` and the internal arrays as
      if there were $x = len1 + lenRest$ elements already added, such that
      $x + n + prefixLength(v)$ is a multiple of the maximum prefix length
      of a Vector of v's level.
      The number $x$ is stored in `offset` and the new variable
      `prefixIsRightAligned` is set true to record the fact that in each
      prefix array, the length is WIDTH and the content is right-aligned.
      This is different from the normal case, where the prefixes are either
      full and of size WIDTH or, if `initFrom` was used, full and trimmed.
      To remove the leading null entries in the prefix arrays, a call to
      `leftAlignPrefix` is needed. This is done at the beginning of
      result().
    * prependedAll0 and appendedAll0:
      When an IterableOnce (left side; may be a Vector itself) and a
      Vector (right side) are concatenated and the left one is shorter
      than the right Vector, a VectorBuilder aligned to the right Vector is
      used.
      `++` and `concat` forward to `appendedAll`, so they profit from this
      optimization.
    * add tests and benchmark
    ansvonwa committed Sep 24, 2022
    6abbe43
  2. e4b872a
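
The alignment condition stated for `alignTo` in commit 6abbe43 (choose $x$ so that $x + n + prefixLength(v)$ is a multiple of the maximum prefix length) can be worked through numerically. The helper below is a hypothetical sketch, not the `VectorBuilder` code: it takes the prefix length and the maximum prefix length as plain parameters, and it assumes the smallest non-negative $x$ is the one wanted.

```scala
object AlignmentSketch {
  // Hypothetical helper, not part of VectorBuilder: returns the smallest x >= 0
  // such that (x + n + prefixLen) is a multiple of maxPrefixLen.
  def alignmentOffset(n: Int, prefixLen: Int, maxPrefixLen: Int): Int = {
    require(maxPrefixLen > 0, "maxPrefixLen must be positive")
    // floorMod keeps the result in [0, maxPrefixLen) even when n is negative.
    Math.floorMod(-(n.toLong + prefixLen), maxPrefixLen.toLong).toInt
  }
}
```

For example, with `n = 3`, a prefix of 10 elements and a maximum prefix length of 32, the offset is 19, since 19 + 3 + 10 = 32. A negative `n` also yields a valid offset, because `floorMod` never returns a negative result for a positive modulus.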

Commits on Oct 26, 2022

  1. make VectorBuilder reusable again

    * `leftAlignPrefix()` now keeps the length of the outermost array at
      (LAST)WIDTH.
      This is necessary because after a call to `result()`, the VB may be
      used further.
    * Allow arbitrarily misleading/wrong arguments in `alignTo()`:
      It should no longer fail, even if the hint given is completely nonsensical;
      `leftAlignPrefix()` now compensates for that.
      Negative alignments are allowed, e.g. if you plan to drop elements of
      the vector to be added.
    * Add a bunch of tests to cover previously uncovered branches.
    ansvonwa committed Oct 26, 2022
    a30168d
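
The reuse property that commit a30168d restores can be illustrated with the public `VectorBuilder` API alone. This is a minimal sketch that assumes a Scala version in which the builder may keep receiving elements after `result()`, as the commit message describes; it does not call `alignTo`, and the object and method names are hypothetical.

```scala
import scala.collection.immutable.VectorBuilder

object BuilderReuseSketch {
  // Builds two results from the same builder; the second call to result()
  // is expected to include the elements added after the first call.
  def buildTwice(): (Vector[Int], Vector[Int]) = {
    val vb = new VectorBuilder[Int]
    vb ++= (1 to 40)
    val first = vb.result()
    vb ++= (41 to 100)
    val second = vb.result()
    (first, second)
  }
}
```

The first result should be unaffected by the later additions, since `Vector` is immutable, while the second should contain all 100 elements.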

Commits on Oct 31, 2022

  1. use VB.alignTo only if RHS is significantly larger

    * To compensate for alignTo's (small, but constant) overhead, it is only
      called when the right Vector is more than 64 elements longer than the
      left one.
    * replaced some `….knownSize` by `k`
    ansvonwa committed Oct 31, 2022
    7b8c407
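
The size heuristic from commit 7b8c407 amounts to a simple comparison. The sketch below is illustrative only: `Threshold` and `wouldAlign` are hypothetical names, the value 64 is taken from the commit message, and the actual check inside `Vector` may be written differently. The `concat` helper only demonstrates that `++` returns the same elements whichever path is chosen internally.

```scala
object AlignThresholdSketch {
  // Assumed constant, taken from the "more than 64 elements longer" wording.
  val Threshold: Int = 64

  // Hypothetical predicate: true when the aligned-builder path would be chosen.
  def wouldAlign(leftSize: Int, rightSize: Int): Boolean =
    rightSize.toLong - leftSize > Threshold

  // Whichever path is taken, concatenation must yield the same elements.
  def concat(left: Vector[Int], right: Vector[Int]): Vector[Int] = left ++ right
}
```

With a left side of 10 elements, a right Vector of 74 elements would stay on the ordinary path under this reading, while one of 75 elements would use the aligned builder.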