Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch inserts in mysql/mariadb cluster (autoincrement by >1) #3828

Closed
sazalihisham88 opened this issue Dec 6, 2022 · 22 comments
Closed

Batch inserts in mysql/mariadb cluster (autoincrement by >1) #3828

sazalihisham88 opened this issue Dec 6, 2022 · 22 comments
Labels
bug Something isn't working

Comments

@sazalihisham88
Copy link

sazalihisham88 commented Dec 6, 2022

I have an entity like this:

export class EventGrossSales {

  @PrimaryKey()
  id: number;

  @Unique({ name: 'unique_key' })
  @Property({ length: 45, nullable: true, default: 'NULL' })
  orderNumber?: string;

  @Property({ columnType: 'text', length: 65535, nullable: true, default: 'NULL' })
  installationStatus?: string;

  @Property({ columnType: 'text', length: 65535, nullable: true, default: 'NULL' })
  installationDoneYear?: string;

  @Property({ columnType: 'text', length: 65535, nullable: true, default: 'NULL' })
  installationDoneMonth?: string;

  @Property({ columnType: 'text', length: 65535, nullable: true, default: 'NULL' })
  installationDoneDate?: string;

  @Property({ columnType: 'text', length: 65535, nullable: true, default: 'NULL' })
  accntId?: string;

My use case is I am extracting these row from an excel sheet. By looping each of the row (for loop), I called persist() and flush() it once loop exits.

But this error occured:

JIT runtime error: Cannot read properties of undefined (reading 'id')

  function(entity, data, factory, newEntity, convertCustomTypes, schema) {
>   if (typeof data.id !== 'undefined') entity.id = data.id;
                   ^
[query] rollback
TypeError: Cannot read properties of undefined (reading 'id')
    at eval (eval at createFunction (/Users/sazalihisham/Works/sifu/admin-service/node_modules/@mikro-orm/core/utils/Utils.js:735:20), <anonymous>:6:19)
    at Function.callCompiledFunction (/Users/sazalihisham/Works/sifu/admin-service/node_modules/@mikro-orm/core/utils/Utils.js:746:20)
    at ObjectHydrator.hydrate (/Users/sazalihisham/Works/sifu/admin-service/node_modules/@mikro-orm/core/hydration/ObjectHydrator.js:24:23)
    at ChangeSetPersister.reloadVersionValues (/Users/sazalihisham/Works/sifu/admin-service/node_modules/@mikro-orm/core/unit-of-work/ChangeSetPersister.js:279:27)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at ChangeSetPersister.persistNewEntities (/Users/sazalihisham/Works/sifu/admin-service/node_modules/@mikro-orm/core/unit-of-work/ChangeSetPersister.js:105:17)
    at ChangeSetPersister.runForEachSchema (/Users/sazalihisham/Works/sifu/admin-service/node_modules/@mikro-orm/core/unit-of-work/ChangeSetPersister.js:68:13)
    at UnitOfWork.commitCreateChangeSets (/Users/sazalihisham/Works/sifu/admin-service/node_modules/@mikro-orm/core/unit-of-work/UnitOfWork.js:680:9)
    at UnitOfWork.persistToDatabase (/Users/sazalihisham/Works/sifu/admin-service/node_modules/@mikro-orm/core/unit-of-work/UnitOfWork.js:633:13)
    at MySqlConnection.transactional (/Users/sazalihisham/Works/sifu/admin-service/node_modules/@mikro-orm/knex/AbstractSqlConnection.js:36:25)
    at UnitOfWork.doCommit (/Users/sazalihisham/Works/sifu/admin-service/node_modules/@mikro-orm/core/unit-of-work/UnitOfWork.js:262:17)
    at UnitOfWork.commit (/Users/sazalihisham/Works/sifu/admin-service/node_modules/@mikro-orm/core/unit-of-work/UnitOfWork.js:235:13)
    at SqlEntityManager.flush (/Users/sazalihisham/Works/sifu/admin-service/node_modules/@mikro-orm/core/EntityManager.js:641:9)
    at GrossSalesService.uploadGrossSales (/Users/sazalihisham/Works/sifu/admin-service/src/gross-sales/gross-sales.service.ts:118:13)

Further console.log in the compiled code it is due to undefined hash.

data from map Map(1) {
  '3' => {
    id: 3,
    installationStatus: null,
    installationDoneYear: null,
    installationDoneMonth: null,
    installationDoneDate: null,
    platform: null,
  }
} undefined 4

Any idea what is the root cause of this? It is pretty deep for me to understand since I am quite new to orm. I also noticed that if I use persistAndFlush() each row within the loop, it works. But it is very expensive. Currently we are using mariadb galera cluster with 3 nodes.

Thanks!

@B4nan
Copy link
Member

B4nan commented Dec 6, 2022

Uf, please simplify before reporting anything, this is crazy.

@B4nan
Copy link
Member

B4nan commented Dec 6, 2022

You haven't provided anything reproducible, just error message, please try to put together a reproduction as in https://github.com/mikro-orm/mikro-orm/tree/master/tests/issues

Closing before you provide one.

My use case is I am extracting these row from an excel sheet. By looping each of the row (for loop), I called persist() and flush() it once loop exits.

Why don't you show the code of what you are doing? :]

@B4nan B4nan closed this as not planned Won't fix, can't repro, duplicate, stale Dec 6, 2022
@sazalihisham88
Copy link
Author

@B4nan my apologies. Its pretty difficult for me to reproduce it because it happens on this specific environment I have. It is working on my another env. I am posting because I have a difficulty to understand on how this error is happening. Meaning that which part should I check on or what usually trigger this error.

Thanks a lot for your fast response

@B4nan
Copy link
Member

B4nan commented Dec 6, 2022

The stack trace shows one thing - the error originated from ChangeSetPersister.reloadVersionValues, so focus on optimistic versioning. I don't see any version property in the entity definition you provided (not even in the full one before edit).

@B4nan
Copy link
Member

B4nan commented Dec 6, 2022

How does GrossSalesService.uploadGrossSales look like?

@Property({ columnType: 'text', length: 65535, nullable: true, default: 'NULL' })

Also I can see one (unrelated) error in your entity definition, you cant do default: 'NULL', that literally means a string 'NULL', either do default: null or defaultRaw: 'NULL'.

@sazalihisham88
Copy link
Author

How does GrossSalesService.uploadGrossSales look like?

@Property({ columnType: 'text', length: 65535, nullable: true, default: 'NULL' })

Also I can see one (unrelated) error in your entity definition, you cant do default: 'NULL', that literally means a string 'NULL', either do default: null or defaultRaw: 'NULL'.

I used entity generator to populate those properties because as u can see there are too many columns which I can't avoid.

BTW here is the simplified uploadGrossSales

public async uploadGrossSales(filePath: string) : Promise <any> {
        try {
            console.time('execution time');            
            const wb = new ExcelJS.Workbook();
            await wb.xlsx.readFile(`data/gross_sales/1670296753952.xlsx`);
            const sheet = wb.worksheets[0];
            const rowCount = sheet.rowCount;
            const data = [];
            for(let i = 1; i < rowCount; i++) {
                const row = sheet.getRow(i);
                // skip headers row by checking first column value
                if(row.getCell(1).value == 'ORDER_NUMBER') continue;
                const newEventGrossSales = new EventGrossSalesTemp(
                    String(row.getCell(1).value),
                    (row.getCell(2).value)? String(row.getCell(2).value) : null,
                    (row.getCell(3).value)? String(row.getCell(3).value) : null,
                    (row.getCell(4).value)? String(row.getCell(4).value) : null,
                    (row.getCell(5).value)? String(row.getCell(5).value) : null,
                    (row.getCell(6).value)? String(row.getCell(6).value) : null,
                    (row.getCell(7).value)? String(row.getCell(7).value) : null,
                    (row.getCell(8).value)? String(row.getCell(8).value) : null,
                    (row.getCell(9).value)? String(row.getCell(9).value) : null,
                    (row.getCell(10).value)? String(row.getCell(10).value) : null,
                    // more columns
                );
                this.em.persist(newEventGrossSales);
            }
            await this.em.flush();
            console.timeEnd('execution time');
            return;
        } catch (error) {
            console.error(error);
            throw { appMessage: `processing failed - ${error.message}`, appCode: 'ERR_PROCESSING_FAILED'};
        }
    }

It works fine on single mariadb instance. The error occured on another env that runs mariadb galera cluster. Anyway I am currently reading and learning about the optimistic locking as you suggested :D

@B4nan
Copy link
Member

B4nan commented Dec 6, 2022

How does your ORM config look like? Maybe you use global contexts? What versions are you on (if not latest, update)?

I dont think entity generator would generate default: 'null' unless you actually have the string defaults set in your schema. But that would be a separate issue.

@sazalihisham88
Copy link
Author

The only ORM config I have:

MikroOrmModule.forRoot({
        entities: ['./dist/**/*.entity.js'], // path to our JS entities (dist), relative to `baseDir`
        entitiesTs: ['src/**/*.entity.ts'], // path to our TS entities (source), relative to `baseDir`
        dbName: process.env.DB_NAME,
        user: process.env.DB_USER,
        password: process.env.DB_PASSWORD,
        host: process.env.DB_HOST,
        type: 'mysql',
        port: 330642,
        debug: true,
        // logger: console.log.bind(console)
    }),

Currently using mikro-orm version 5.4.2

Thanks for the advice regarding the entity definition. Will ammend that.

@sazalihisham88
Copy link
Author

@B4nan another difference is in galera cluster, the id autoincrement by 3 not 1. I wonder if that may be the cause?

@B4nan
Copy link
Member

B4nan commented Dec 6, 2022

Hmm interesting, that could be doing some issues. Batch inserts won't work with that in mysql, you'd have to disable those (useBatchInserts: false).

@B4nan
Copy link
Member

B4nan commented Dec 6, 2022

MySQL is currently the only driver not supporting returning statements, so we expect the PKs to be a sequence and compute them based on the returned inserted id. That happens here:

https://github.com/mikro-orm/mikro-orm/blob/master/packages/mysql/src/MySqlDriver.ts#L16

@sazalihisham88
Copy link
Author

So if i disabled useBatchInsert, is that means the performance will be like persistAndFlush in the for loop? There are 7k rows in that excel sheet :(

@B4nan
Copy link
Member

B4nan commented Dec 6, 2022

Well, let's first verify this is the culprit, then we can deal with getting around it. If you dont care about the entity instances and you just need to dump the data into the database, you could use QB with multi insert query. Or you could extend the driver and modify that line yourself.

@sazalihisham88
Copy link
Author

Yes after disabling batch insert, it works. With current code, it is 3x slower. So I guess I need to try QB multi-insert and see if performance improve. Thanks a lot for the pointer :D

@B4nan
Copy link
Member

B4nan commented Dec 6, 2022

We could introduce some dialects here to support this out of box, ideally we could detect it upfront. Let's reopen.

@B4nan B4nan reopened this Dec 6, 2022
@B4nan B4nan changed the title JIT runtime error: Cannot read properties of undefined (reading 'id') Batch updates in mariadb galera cluster (autoincrement by 3) Dec 6, 2022
@B4nan B4nan added the bug Something isn't working label Dec 6, 2022
@B4nan
Copy link
Member

B4nan commented Dec 6, 2022

Please confirm that SHOW STATUS LIKE 'wsrep%'; returns some values for you, sounds like the way to detect it.

@sazalihisham88
Copy link
Author

@B4nan yes it returns some values. May I know which wsrep property you are interested in?

@sazalihisham88
Copy link
Author

This might also helps:

image

@B4nan
Copy link
Member

B4nan commented Dec 6, 2022

Good. FYI the value should be driven by your cluster side, more about this here:

https://galeracluster.com/library/kb/auto-increment-multiples.html

@B4nan B4nan changed the title Batch updates in mariadb galera cluster (autoincrement by 3) Batch inserts in mysql/mariadb cluster (autoincrement by >1) Dec 6, 2022
@sazalihisham88
Copy link
Author

Yes. And seems like there is a number of factors need to be considered too. But it will be a cool feature :D

@B4nan B4nan closed this as completed in 516db6d Dec 6, 2022
@sazalihisham88
Copy link
Author

@B4nan i saw this changes have been pushed to master. Is it considered included in the latest version?

@B4nan
Copy link
Member

B4nan commented Dec 8, 2022

You can already use the dev version, every commit to master is published to NPM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants