New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.
Already on GitHub? Sign in to your account
Anormal time spent in initializeCollections
(quadratic behaviour somewhere ?)
#4171
Comments
Do you use composite PKs, or FKs as PKs heavily? That could add some overhead to the Could be more about the hydration, especially if your entities are huge. The Can't help much without a complete reproduction, this will be probably very specific to your entity definition and number of items in the context. You know what method does it, so it should be quite easy, or what am I missing? It is about populating some to-many relation, so just create the entity definitions, populate with a lot of random data and do the find call. |
thanks for the quick response; I'll make a repro ! I understand those kind of issues without context can be quite annoying x) In the meantime, just sharing this in case it rings a bell. our collections are large (a few dozen to a few thouthands items max) but nothing too crazy; never more than 3/4000 instances. I think we should not be using mapToPk as we only have basic PKs(id) and non composite Foreign keys 馃 but the execution cost near the lines 170 / 171 / 174 feels wrong. |
That's indeed very interesting/weird. But at the same time, I feel like this will be much easier to fix than a hydration/propagation issue :] I guess we will just need to chunk the |
Yeah, that was my initial guess; with ~1000 items in batch, each having ~1000 children, fusing the loop, or optimising the comparaison could probably brings a 2 or 3 magnitude order speedup. |
Well, if you want to have 1 million entities in context, I'm afraid it will never be very fast :D Looks like this is enough to reproduce: test('perf: populating many large one to many collections (#4171)', async () => {
console.time('create');
for (let i = 1; i <= 1000; i++) {
const author = new Author2('Jon Snow ' + i, `snow-${i}@wall.st`);
for (let j = 1; j <= 20; j++) {
author.books.add(new Book2(`My Life on The Wall, part ${i}/${j}`, author));
}
await orm.em.fork().persist(author).flush();
}
console.timeEnd('create');
console.time('find');
await orm.em.fork().find(Author2, {}, { populate: ['books'] });
console.timeEnd('find');
}); If I comment the problematic filter line and use an empty array instead, the |
Starting from your repro, a quick benchmark which may help: https://stackblitz.com/edit/mikro-orm-repro-bzsaor?file=test.ts It's surprising:
|
The slow part is actually reading the relation property (so the |
I tried to quickly patch the code like this: initializeOneToMany(filtered, children, prop, field) {
for (const entity of filtered) {
const items = children.filter(child => {
// /* 馃敶 */ if (prop.targetMeta.properties[prop.mappedBy].mapToPk) {
// /* 馃敶 */ return child[prop.mappedBy] === (0, wrap_1.helper)(entity).getPrimaryKey();
// /* 馃敶 */ }
// /* 馃敶 */ return Reference_1.Reference.unwrapReference(child[prop.mappedBy]) === entity;
return Reference_1.Reference.unwrapReference(child.__helper.__data[prop.mappedBy]) === entity;
});
entity[field].hydrate(items);
}
} Everything is indeed faster 馃憤, but the cost remains high in unexpected places: and there is still some weird flamegraph 馃 |
so I guess the nested |
performance issue in
initializeCollections
(using "5.6.16-dev.6", but the issue was present in previous versions too)
馃憢 Hello @B4nan
I have a worker that process quite a few rows from various tables in batch.
Initialize Collections
takes way more time that we expect 馃some batch hints about some quadratic behaviours: here:
150ms
only to find and hydrate models12seconds
to initialize collectionspossible culprit:
at https://github.com/mikro-orm/mikro-orm/blob/master/packages/core/src/entity/EntityLoader.ts#L223
filter
within afor
?helper(entity).getPrimaryKey()
is not cached above the filter closure ?@B4nan Not quite sure what is the best way to help you reproduce this. not event sure if it's worth anything in case you'll simply spot the problem by just looking at the code.
I'm a bit lost there, but tell me if I can do anything to help !
Best
The text was updated successfully, but these errors were encountered: