Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving performance on Chunk #3301

Open
ahoy-jon opened this issue Apr 5, 2020 · 6 comments
Open

Improving performance on Chunk #3301

ahoy-jon opened this issue Apr 5, 2020 · 6 comments

Comments

@ahoy-jon
Copy link
Contributor

ahoy-jon commented Apr 5, 2020

Chunk could be improved in different ways.

  • Concat could be specialized and improved with:
case class ConcatArr[A](body: Array[Arr[A]], tail:Array[Arr[A]]) extends NonEmpty[A]
case class Concat[A](body: Array[NonEmpty[A]], tail:Array[NonEmpty[A]]) extends NonEmpty[A]

This change could help a lot with #3233 .
The tail of those structures would be used to hold the element just added to the end of the structure.

For example:

val c1 = Arr(Array(1,2,3))
val c2 = c1 + 4
//ConcatArr(body = Array(c),  tail = Array(4))
val c3 = c2 + 5
//ConcatArr(body = Array(c), tail = Array(4,5))
val c4 = c2 + 6 + 7 + 8 + 9 + 10 + 11
//ConcatArr(body = Array(c, Arr(Array(4, 5, 6, 7, 8, 9, 10, 11)), tail = Array())
  • Slice could be removed and pushed to Arr and VectorChunk
case class Arr[A](array: Array[A], offset: Int, l: Int) extends NonEmpty[A]
case class VectorChunk[A](private val vector: Vector[A], offset: Int, l: Int) extends NonEmpty[A]
  • materialize could be optimized, keep Concat and group substructures.

  • most of the complex components should have private "smart constructors".

private def arr[A](array: Array[A]): Arr[A] =
    Arr(array, ClassTag(array.getClass.getComponentType))

private case class Arr[A](private val array: Array[A], ct: ClassTag[A]) extends NonEmpty[A] {

What do you think about those changes?

@Vilkina
Copy link
Contributor

Vilkina commented Apr 6, 2020

I'll take that ticket

@ahoy-jon ahoy-jon mentioned this issue Apr 7, 2020
2 tasks
@jdegoes
Copy link
Member

jdegoes commented Apr 13, 2020

  1. Materialize should not be changed, it's purpose is to force the chunk to be strict (no laziness), so making it lazy which just defeat the point of its existence.
  2. How would Concat specialization improve anything?
  3. Removing Slice could be useful. "Inlining" it into Concat (concat would have to force it down into the left and right children, discarding, if necessary, left or right when they are entirely sliced out), Vector, and Array chunks would create more flattening and reduce nesting.

@ahoy-jon
Copy link
Contributor Author

  1. it could be changed so it doesn't rewrite already optimized parts of the structures.
  2. The specialization could help with the different layers of the Tree, Level0 (ConcatArr) for structures up to 1024 elements, LevelN (Concat). The tail helps with append on the right.
  3. was done then removed in a previous PR (NonEmptyChunk), will reintroduce it when possible.

@jdegoes
Copy link
Member

jdegoes commented Apr 13, 2020

  1. The goal of materialize is to allocate an array, and copy all the data inside there, then wrap the array into a chunk. I think it can be optimized somewhat, but mainly by using System.arraycopy to copy the array parts into the target array.
  2. Ok, I see. Yes, I think Concat could benefit from richer structure. See here for what Scala's vector is currently using (a radix balanced finger tree).
  3. Benchmarks should guide the deletion of Slice. I'd try to slice a bit, then slice some more, then slice some more, etc., building up Slice(Slice(Slice(...))) very deeply, exercising all possible cases, and then show the performance improvement possible with "slice fusion".

Looking forward to seeing this work!

@ahoy-jon
Copy link
Contributor Author

  1. Scala Vector and Clojure Vector are using a tree structure with leaf at 32 elements. It seems that the page size makes is great bellow 256 elements. Materializing on large chunks could have a cost (to be proven with benchmarks).
  2. I took the time to read it before proposing the issue. A version with 1 level and a tail (like the PersistentVector from Clojure) could be faster to implement.
  3. Slice fusion requires a lot less allocation + ... I will do a PR just for that one!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants