Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(): optimize rect rendering + calcOwnMatrix #1

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ShaMan123
Copy link

@ShaMan123 ShaMan123 commented Apr 11, 2024

@jiayihu and I worked on this and decided to publish our findings.
They are all present under OptimizedRect class

  1. strokeRect/fillRect are to be used instead of
ctx.beginPath()
ctx.rect()
ctx.fill()
ctx.stroke()
  1. Any usage of path2D is extremely heavy.

  2. As @asturur suggested matrix cache key is a bad idea, turns out it is very bad. I used a different approach, deleting the matrix instead and recalculating it if undefined. This can be done in the source code in _set for example. calcTransformMatrix should be changed as well to use a different approach.

  3. Using save, restore is bad for performance, using setTransform vs. transform is better

  4. It proves that removing originX/Y is good for perf

perf.mov

Copy link
Author

@ShaMan123 ShaMan123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2024-04-11 at 7 57 53 Screenshot 2024-04-11 at 7 58 16

fabric is not causing frame drops with these optimizations

class OptimizedRect extends fabric.Rect {
m;
calcOwnMatrix() {
// return [1, 0, 0, 1, this.left, this.top];
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to test the perf of this impl of calcOwnMatrix try to uncomment this line and compare results. Should be the same

@ShaMan123
Copy link
Author

And the most important part is that we move to the top of the list of 2D renderers!

@asturur
Copy link
Member

asturur commented Apr 11, 2024

  • originX and originY will be a large improvement yes.
  • I also think transformMatrices can be cached on the object itself per render and trashed away every time, the string caching is nonsense
  • save/restore is ok to swap with setTransform but there are situations in which you still have to keep track of the previous transform and other properties like fill/stroke etc etc is not an easy replacement, is it?

How did you measure the difference between save/restore and trasnfrom/setTransform?

@jiayihu
Copy link

jiayihu commented Apr 11, 2024

As @asturur suggested matrix cache key is a bad idea, turns out it is very bad. I used a different approach, deleting the matrix instead and recalculating it if undefined. This can be done in the source code in _set for example.

I suggested we can use a this.transformCacheValues = [top, left, scaleX, scaleY, ...] instead of composing a long string cache key, which we've seen causes lots of memory (because of the long strings) and heavy garbage collection cycles (which were taking 51s and clearing 60MB of memory!).
We can just check if any value changed, e.g. between this.transformCacheValues[0] and this.top. If no change then reuse cached transform, otherwise return a new transform matrix and mutate this.transformCacheValues[0] = this.top for instance.
Tldr; there is no reason to use a string cache key, it's just an old habit.

@asturur
Copy link
Member

asturur commented Apr 11, 2024

Ages ago i removed it and then i said maybe is faster without but the speed was the same, so i didn't touch it.

All of those checks for performances are non breaking, from the rect optimization when rx/ry is not used to the save/restore, each of those can be done anytime if they come with a way to test the performance and they do not add 500k of code there is not much discussion to have around them, if is faster is faster.
They are also independent from one to another, so they can be done in parallel

For the matrices i would start from originX/originY tho because that reduces by a bunch the cost of calculating a matrix and possibly change how we approach matrix calculation and the discussion around setCoords.
With originX/originY gone the stroke, strokeUniform for example become non influential anymore for the matrix itself ( as they are today with center/center ) is also faster to get the translate matrix and possibly the translate + rotate can be calculated in one step rather than with multiplication, so there are a lot of little things that maybe done makes matrix caching easier or less relevant.
There could be different approaches like each object tracks the matrix for his canvas state rather than the one relative to the parent or both, and that changes how you can use save/restore or setTransform.

The big issue is that originX/originY is the biggest breaking change so now i would like to do it but also not


m[4] = x;
m[5] = y;
return (this.m = m);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this entirely equivalent? if doing rotate/scale/skew as a 2x2 and then slapping the top/left at the end is the same as starting with the translate as we do today and multiply the ohers as a 2by3, we don't need to swap the order, we can keep the translate as the first step.
Consider also that multiplyTransformMatrixArray should be used in sensitive parts since it adds the inital iMatrix multiplication that is not useful, so you can remove it

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be 100% eq since translation is not affected by the plane when using center origin.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if doing rotate/scale/skew as a 2x2 and then slapping the top/left at the end is the same as starting with the translate as we do today and multiply the ohers as a 2by3, we don't need to swap the order, we can keep the translate as the first step.

True, that is why I left it commented out, because if the gain is not huge it is too messy to keep

@asturur
Copy link
Member

asturur commented Apr 11, 2024

Keep in mind that there may be issues with strokeRect and fillRect with:

  • stroke uniform
  • patterns + scaling
  • gradients + scaling

Paths behave in a different way, you can draw a path and do transform operations and then fill or stroke, i don't think you can do that with stroke/fill rect as much you can't do that with the text.

@ShaMan123
Copy link
Author

ShaMan123 commented Apr 11, 2024

I love the input @asturur and agree.
I am aligned on originX/Y as the first step of transform/coords stuff.

Regarding matrix/coords hashing I would like to propose my solution:
own matrix can be invalidated by _set or a proxy as we do with dirty flagging. We have a known set of properties that invalidate the matrix. That seems straight forward enough.
calcTransformMatrix on the other hand depends on all ancestors. Therefore I suggest the hash to contain the timestamp and an array of ancestors of the time of execution. Then to decide if we need to recalculate the matrix or not we compare the current ancestors against the hash and their timestamps against the hash. If the ancestors are eq and all their timestamps are lt the hash timestamp we can use the existing matrix.
This is supposed to be very lean because it depends on tree depth.

@ShaMan123
Copy link
Author

The same hash can be used for setCoords maybe with an addition of width/height/ more props

@asturur
Copy link
Member

asturur commented Apr 11, 2024

the issue with that is that i don't really want to move that everything needs set and we need to keep track over every change.
That would mean that scaleX and company needs to become a setter/getter.

I want to see if after some improvements matrix caching is still necessary ( today isn't because both are slow from my last check )

@jiayihu
Copy link

jiayihu commented Apr 11, 2024

I agree with @asturur, I don't like relying on _set to check and do dirty flagging. It's easy for a developer to do obj.left += 10 although it's not recommended.
That's why I think using an array of the values as "cache key" is better. I don't see any downside of it literally, it has fixed irrelevant memory cost plus + fixed irrelevant computing cost + more robust than _set.

Something like this:

[
  this.top,
  this.left,
  this.scaleX,
  this.scaleY,
  this.skewX,
  this.skewY,
  this.angle,
  this.originX,
  this.originY,
  this.width,
  this.height,
  this.strokeWidth,
  this.flipX,
  this.flipY,
]

@ShaMan123
Copy link
Author

In an optimized fabric world I would not want to use set at all.
In that world I would expose methods to set the matrix value and not assign decomposed values to the object.
As an intermediate step this is a valid option IMO, or else no caching or a major refactor.

@asturur
Copy link
Member

asturur commented Apr 11, 2024

setting matrix to the object does not make sense, people work with scale in mind, not with matrices.
At that point you would work with the native canvas api.
Calculating a matrix without originX/originY won't need those anymore:

  this.originX,
  this.originY,
  this.width,
  this.height,
  this.strokeWidth,

That is why i think we are running ahead ourselves thinking the matrices can't be calculated once per object per render

@ShaMan123
Copy link
Author

I am just saying something like
object.setScale(2, 2) can mutate the matrix directly instead of doing

object.scaleX = 2
object.calcTransformMatrix()

@asturur
Copy link
Member

asturur commented Apr 11, 2024

yeah is what i don't like
We know that with setters and getters we could keep the matrix in place all the time

@ShaMan123
Copy link
Author

I am not against getter/setter

@jiayihu
Copy link

jiayihu commented Apr 11, 2024

setting matrix to the object does not make sense, people work with scale in mind, not with matrices.

I did not suggest setting a matrix to the object, rather setting an array of the matrix inputs as cache key. For instance:

calcOwnMatrix(): TMat2D {
  const matrixInputs = [
      this.top,
      this.left,
      this.scaleX,
      this.scaleY,
      this.skewX,
      this.skewY,
      this.angle
  ]
  const cache = this.ownMatrixCache;
    
  if (cache && ownMatrixCache.matrixInputs.every((a, i) => a === matrixInputs[i])) {
    return cache.value;
  }

  const center = this.getRelativeCenterPoint(),
    options = {
      angle: this.angle,
      translateX: center.x,
      translateY: center.y,
      scaleX: this.scaleX,
      scaleY: this.scaleY,
      skewX: this.skewX,
      skewY: this.skewY,
      flipX: this.flipX,
      flipY: this.flipY,
    },
    value = composeMatrix(options);
  this.ownMatrixCache = { matrixInputs, value };
  return value;
}

That is why i think we are running ahead ourselves thinking the matrices can't be calculated once per object per render

Of course then if we see that calculating on-the-fly has no additional cost then we can drop it. But otherwise doing ownMatrixCache.matrixInputs.every((a, i) => a === matrixInputs[i]) looks very low-cost and effective to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants