Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duration at the summary page is wrong #1561

Open
7 tasks done
JP95Git opened this issue Feb 22, 2024 · 2 comments
Open
7 tasks done

Duration at the summary page is wrong #1561

JP95Git opened this issue Feb 22, 2024 · 2 comments

Comments

@JP95Git
Copy link

JP95Git commented Feb 22, 2024

I use pgloader to migrate a huge database from MariaDB to PostgreSQL. Based on the summary page, the migration takes too long, so I tried to adjust the parameters. Then I noticed that either the duration in the summary page or the times in the NOTICE lines are wrong.

The summary shows almost 8 hours, but the job took only 2,5 hours.

I can always reproduce this, even on different machines. All machines are virtual, running in rather big data centres at the office or at my customer.

  • pgloader --version
    pgloader version "3.6.af8c3c1"
    compiled with Clozure Common Lisp Version 1.12.2 (v1.12.2) LinuxX8664

Also reproducable with the other compiler (forget the name).

  • did you test a fresh compile from the source tree?
    yes

  • did you search for other similar issues?
    yes, none found.

  • how can I reproduce the bug?
    yes, always

Started with

pgloader --client-min-messages=notice database.load

database.load

LOAD DATABASE
	FROM mysql://root:password@host/database
	INTO pgsql://root:password@localhost:5432/database

	WITH
		no truncate,
		create tables,
		include drop,
		create indexes,
		reset sequences,
		foreign keys,
		downcase identifiers,
		uniquify index names,
		prefetch rows = 10000,
		rows per range = 10000,
		batch rows = 10000,
		workers = 4,
		concurrency = 1

	SET MySQL PARAMETERS
		net_read_timeout = '500',
		net_write_timeout = '500'
;

I can't attach a database dump, because my database is over 200 GB. Using an empty database is to fast to compare the times.

  • pgloader output you obtain
2024-02-21T17:56:04.000581+01:00 NOTICE Starting pgloader, log system is ready.
2024-02-21T17:56:04.001357+01:00 LOG pgloader version "3.6.af8c3c1"
[...]
2024-02-21T20:29:39.970798+01:00 LOG report summary reset
                        table name     errors       read   imported      bytes      total time       read      write
----------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
                   fetch meta data          0        211        211                     0.085s
                    Create Schemas          0          0          0                     0.002s
                  Create SQL Types          0          0          0                     0.001s
                     Create tables          0        100        100                     0.238s
                    Set Table OIDs          0         50         50                     0.001s
----------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
[...]
----------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
           COPY Threads Completion          0          8          8               1h34m24.228s
                    Create Indexes          0        111        111               5h26m22.551s
            Index Build Completion          0        111        111                 38m51.722s
                   Reset Sequences          0         27         27                     6.022s
                      Primary Keys          0         49         49                     0.168s
               Create Foreign Keys          0         50         50                 20m10.162s
                   Create Triggers          0          0          0                     0.001s
                  Install Comments          0          0          0                     0.000s
                        after load          0         23         23                     0.512s
----------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
                 Total import time          ✓  154281165  154281165    15.2 GB    7h59m55.363s

I removed information about the tables because they are private.

  • data that is being loaded, if relevant
    yes, all data has been loaded.

  • How the data is different from what you expected, if relevant
    Does not apply.

@JP95Git
Copy link
Author

JP95Git commented Feb 23, 2024

[user@host programs]$ /path/to/pgloader/bin/pgloader --client-min-messages=notice database.load
2024-02-22T10:43:40.001437+01:00 NOTICE Starting pgloader, log system is ready.
2024-02-22T10:43:40.007689+01:00 LOG pgloader version "3.6.af8c3c1"
[...]
2024-02-23T02:34:37.652045+01:00 LOG report summary reset
                        table name     errors       read   imported      bytes      total time       read      write
----------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
                   fetch meta data          0        211        211                     0.494s
                    Create Schemas          0          0          0                     0.001s
                  Create SQL Types          0          0          0                     0.000s
                     Create tables          0        100        100                     0.360s
                    Set Table OIDs          0         50         50                     0.002s
----------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
[...]
----------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
           COPY Threads Completion          0          8          8              14h37m57.750s
                    Create Indexes          0        111        111                4h47m9.918s
            Index Build Completion          0        111        111                 48m26.715s
                   Reset Sequences          0         27         27                     0.870s
                      Primary Keys          0         49         49                     1.044s
               Create Foreign Keys          0         50         50                 24m11.139s
                   Create Triggers          0          0          0                     0.000s
                  Install Comments          0          0          0                     0.000s
                        after load          0         23         23                     0.364s
----------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
                 Total import time          ✓  4449186294  4449186294   227.9 GB   20h37m47.805s

Output from the virtual server at my customer.

Duration based on the log lines = ~ 15.5 hours
Duration reported by pgLoader = 20.5 hours

@hvanoch
Copy link

hvanoch commented Feb 23, 2024

We are having the same issues. On our end it seems that the "Create Indexes" part makes no sense. As an example, this is a summary which only migrates the schema. Real time it took was 14 sec:

            table name     errors       rows      bytes      total time
----------------------  ---------  ---------  ---------  --------------
       fetch meta data          0       3405                     0.293s
        Create Schemas          0          0                     0.437s
      Create SQL Types          0          0                     0.004s
         Create tables          0       1062                     3.530s
        Set Table OIDs          0        531                     0.047s
----------------------  ---------  ---------  ---------  --------------
----------------------  ---------  ---------  ---------  --------------
Index Build Completion          0       1906                     8.285s
        Create Indexes          0       1906                  1m38.173s
       Reset Sequences          0        310                     0.144s
          Primary Keys          0        521                     0.282s
   Create Foreign Keys          0        968                     0.785s
       Create Triggers          0          0                     0.000s
      Install Comments          0        664                     0.274s
            after load          0          2                     0.013s
----------------------  ---------  ---------  ---------  --------------
     Total import time          ✓          0                  1m47.957s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants