
Reduce generation loss #3563

Open · wants to merge 1 commit into main
Conversation

jonsneyers (Member)

At very high quality / low distance settings, some of the default encoder choices are doing more harm than good, especially w.r.t. generation loss. In particular:

  • VarDCT mode: Gaborish is causing more distortion than it reduces, and the error accumulates badly over generations. This PR disables Gaborish by default at distances up to 0.5.
  • Disabling Gaborish reduced both quality and file size; this PR adjusts the calibration to bring gab0 closer to gab1.
  • Modular integer quantization of XYB is too coarse to allow very high-precision encoding.
  • The additional Modular quantization of B is not useful and causes additional cumulative error.

The rationale for these changes is as follows: at very low distances, it is likely that additional editing/processing will still be done, i.e. we are in an authoring workflow rather than a web-delivery scenario. Generation loss is therefore a critical issue that needs to be addressed.
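To see why a decode-side filter interacts so badly with re-encoding, here is a toy sketch (not libjxl code; the filter and quantizer are made up for illustration): plain quantization is idempotent, so repeated encode/decode cycles stop losing information after the first generation, but inserting a non-idempotent filter such as a blur into every cycle makes the error keep drifting.

```python
# Toy model (not libjxl code): quantization alone is idempotent across
# generations, but adding a Gaborish-like filter to every cycle is not.

def quantize(xs, q=8):
    """Round each sample to a multiple of 1/q; idempotent by construction."""
    return [round(x * q) / q for x in xs]

def smooth(xs):
    """A toy 3-tap blur, standing in for a Gaborish-like filter."""
    out = []
    for i, x in enumerate(xs):
        left = xs[max(i - 1, 0)]
        right = xs[min(i + 1, len(xs) - 1)]
        out.append(0.25 * left + 0.5 * x + 0.25 * right)
    return out

signal = [0.1, 0.8, 0.3, 0.9, 0.2]
plain = list(signal)
filtered = list(signal)
for _ in range(10):
    plain = quantize(plain)                # stabilizes after one generation
    filtered = quantize(smooth(filtered))  # keeps drifting every generation

print(plain == quantize(signal))     # True: no further loss after generation 1
print(filtered == quantize(signal))  # False: the filter accumulates loss
```

The real situation in VarDCT is more subtle (Gaborish is paired with an approximate inverse at encode time), but the mechanism is the same: any transform that does not round-trip exactly through quantization adds a fresh error on every generation.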

Single generation

Before:
benchmark_xl v0.10.2 22a9c10 [NEON]
12 total threads, 637 tasks, 12 threads, 0 inner threads

Encoding       kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
-----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0:4         53140 69538320   10.4686088   2.072   9.923          nan 100.00000000  99.99   0.00000000  0.000000000000  10.469      0
jxl:d0.05        53140 67333277   10.1366518   2.067  20.379   0.29220747  95.38172761  58.50   0.08118059  0.822899360818  10.137      0
jxl:d0.1         53140 48950295    7.3691957   2.127  23.880   0.33223548  94.89087682  55.27   0.10573762  0.779201225712   7.369      0
jxl:d0.2         53140 33948482    5.1107559   2.229  28.661   0.42857408  93.89324626  51.52   0.16581995  0.847465257243   5.111      0
jxl:d0.3         53140 26599076    4.0043435   2.311  32.443   0.57239349  92.73351028  49.00   0.23488577  0.940563314038   4.004      0
jxl:d0.4         53140 22097056    3.3265893   2.356  34.718   0.71591109  91.56903642  47.16   0.30289841  1.007618637718   3.327      0
jxl:d0.5         53140 19001497    2.8605701   2.446  37.477   0.86379899  90.38862331  45.76   0.36794302  1.052526796888   2.873      0
jxl:d0.51        53140 18752324    2.8230585   2.410  35.786   0.87492628  90.27763397  45.64   0.37405473  1.055978366225   2.832      0
jxl:d1           53140 11657700    1.7550021   2.280  41.043   1.58120143  84.82011959  41.70   0.64830139  1.137770330813   2.773      0
jxl:m:d0.05      53140 55399234    8.3400477   1.098   8.520   0.20588803  95.43668063  58.16   0.08899073  0.742186942752   8.340      0
jxl:m:d0.1       53140 41436894    6.2380948   1.381   9.264   0.31810229  94.80887562  54.10   0.13607218  0.848831128358   6.238      0
jxl:m:d0.2       53140 30007994    4.5175372   1.660  10.348   0.51027208  93.47759059  49.98   0.22139302  1.000151229075   4.518      0
jxl:m:d0.3       53140 24505493    3.6891662   1.848  11.225   0.68457476  92.19044178  47.67   0.29185825  1.076713577529   3.690      0
Aggregate:       53140 31656837    4.7657614   1.974  19.938   0.52768621  93.00299076  52.86   0.20930740  0.934194153364   4.939      0

After:
benchmark_xl v0.10.2 2b193d2 [NEON]
12 total threads, 637 tasks, 12 threads, 0 inner threads

Encoding       kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
-----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0:4         53140 69538320   10.4686088   2.088   9.984          nan 100.00000000  99.99   0.00000000  0.000000000000  10.469      0
jxl:d0.05        53140 66089537    9.9494136   2.098  22.034   0.16267897  95.73751400  60.52   0.05078334  0.505264442084   9.949      0
jxl:d0.1         53140 48310883    7.2729357   2.131  25.377   0.32303696  95.00890681  55.46   0.09997654  0.727122974420   7.273      0
jxl:d0.2         53140 33551290    5.0509608   2.212  31.203   0.52211921  93.85639968  50.84   0.17964052  0.907357246076   5.051      0
jxl:d0.3         53140 26294047    3.9584231   2.313  33.833   0.67932596  92.60807086  48.16   0.25624444  1.014323892842   3.966      0
jxl:d0.4         53140 21918684    3.2997364   2.368  37.906   0.82838599  91.43068193  46.33   0.32801396  1.082359608772   3.315      0
jxl:d0.5         53140 18923204    2.8487835   2.443  41.930   0.98757012  90.24280634  44.95   0.39566406  1.127161245178   2.984      0
jxl:d0.51        53140 18752324    2.8230585   2.425  38.016   0.87492628  90.27763397  45.64   0.37405473  1.055978366225   2.832      0
jxl:d1           53140 11657700    1.7550021   2.260  42.186   1.58120143  84.82011959  41.70   0.64830139  1.137770330813   2.773      0
jxl:m:d0.05      53140 55315055    8.3273750   1.114   8.609   0.17825803  95.64803959  58.55   0.07206166  0.600084494106   8.327      0
jxl:m:d0.1       53140 41104085    6.1879922   1.425   9.465   0.30245872  94.98616634  54.18   0.12241306  0.757491042333   6.188      0
jxl:m:d0.2       53140 29738958    4.4770353   1.740  10.345   0.49932055  93.63880816  49.96   0.21010832  0.940662376037   4.477      0
jxl:m:d0.3       53140 24378701    3.6700784   1.920  11.180   0.67874604  92.30811594  47.68   0.28368398  1.041142432635   3.673      0
Aggregate:       53140 31435753    4.7324784   1.997  20.860   0.51939696  93.05431958  52.77   0.19904515  0.881671672306   4.923      0

Overall (at these very low distances) this gives a nice improvement in BPP*pnorm. It does introduce a discontinuity at d0.5: since Gaborish gets enabled only at d > 0.5, the first generation of d0.51 is slightly better quality than the first generation of d0.5.
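The discontinuity amounts to a hard threshold on the default. As a sketch of the rule described above (not the actual libjxl source):

```python
def gaborish_enabled_by_default(distance: float) -> bool:
    # Sketch of the default this PR describes, not the actual libjxl code:
    # Gaborish stays off up to and including distance 0.5.
    return distance > 0.5

# d0.5 encodes without Gaborish and d0.51 with it, hence the small quality
# jump between those two settings in the first-generation results above.
print(gaborish_enabled_by_default(0.5), gaborish_enabled_by_default(0.51))
```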

Multi-generation

Testing with 10 intermediate generations, the generation loss issue becomes clear:

Before:
benchmark_xl v0.10.2 22a9c10 [NEON]
12 total threads, 637 tasks, 12 threads, 0 inner threads
Generation loss testing with 10 intermediate generations

Encoding       kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
-----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0:4         53140 69586836   10.4759126   0.192   0.866          nan 100.00000000  99.99   0.00000000  0.000000000000  10.476      0
jxl:d0.05        53140 66648186   10.0335151   0.196   1.912   3.73499970  86.30593047  44.56   0.89634434  8.993484512441  34.769      0
jxl:d0.1         53140 48305305    7.2720960   0.199   2.212   3.73802646  85.75859064  43.99   0.89005544  6.472568586806  25.404      0
jxl:d0.2         53140 33380778    5.0252912   0.207   2.688   3.74770894  84.70778938  43.80   0.87528230  4.398548405896  17.853      0
jxl:d0.3         53140 26148209    3.9364680   0.215   3.081   3.72280015  83.50169961  43.77   0.86804287  3.417022979741  14.165      0
jxl:d0.4         53140 21728671    3.2711310   0.222   3.335   3.70316481  82.45400293  43.46   0.86632896  2.833875548643  11.857      0
jxl:d0.5         53140 18735046    2.8204574   0.225   3.490   3.78689229  81.03160550  42.96   0.88218701  2.488170852721  10.477      0
jxl:d0.51        53140 18481637    2.7823080   0.227   3.554   3.78943443  80.82840503  42.90   0.88291424  2.456539379935  10.350      0
jxl:d1           53140 11473364    1.7272514   0.211   3.851   4.33372505  72.98258002  40.04   1.11665253  1.928739624560   7.551      0
jxl:m:d0.05      53140 55478487    8.3519788   0.106   0.806   0.94522626  91.69883246  56.32   0.37685361  3.147473380173   8.916      0
jxl:m:d0.1       53140 41469885    6.2430614   0.131   0.879   1.09014931  90.28594211  52.89   0.44540513  2.780691604205   7.095      0
jxl:m:d0.2       53140 30026224    4.5202817   0.163   0.959   1.28927657  88.27007918  49.07   0.53604961  2.423095223780   5.795      0
jxl:m:d0.3       53140 24526939    3.6923948   0.177   1.041   1.49017273  86.09299761  46.85   0.61623735  2.275391552839   5.380      0
Aggregate:       53140 31379339    4.7239856   0.186   1.870   2.58447337  85.46657089  48.52   0.73614243  3.254219268898  11.223      0

After:
benchmark_xl v0.10.2 2b193d2 [NEON]
12 total threads, 637 tasks, 12 threads, 0 inner threads
Generation loss testing with 10 intermediate generations

Encoding       kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
-----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0:4         53140 69586836   10.4759126   0.193   0.872          nan 100.00000000  99.99   0.00000000  0.000000000000  10.476      0
jxl:d0.05        53140 65898561    9.9206632   0.195   2.034   0.31137996  94.77593503  58.74   0.08306947  0.824104203433   9.921      0
jxl:d0.1         53140 48067113    7.2362375   0.199   2.387   0.56743481  93.35413797  53.88   0.15707780  1.136652255755   7.246      0
jxl:d0.2         53140 33314710    5.0153450   0.206   2.903   0.86919606  91.07749996  49.37   0.27402673  1.374338584121   5.226      0
jxl:d0.3         53140 26131220    3.9339104   0.214   3.323   1.18395798  88.91236535  46.78   0.38239799  1.504319424389   4.746      0
jxl:d0.4         53140 21769978    3.2773496   0.221   3.597   1.44660343  87.13655866  45.04   0.47493384  1.556524204485   4.743      0
jxl:d0.5         53140 18817199    2.8328250   0.225   3.832   1.72020761  85.26976210  43.76   0.55819762  1.581276208007   4.877      0
jxl:d0.51        53140 18481637    2.7823080   0.228   3.582   3.78943443  80.82840503  42.90   0.88291424  2.456539379935  10.350      0
jxl:d1           53140 11473364    1.7272514   0.211   3.885   4.33372505  72.98258002  40.04   1.11665253  1.928739624560   7.551      0
jxl:m:d0.05      53140 55321297    8.3283147   0.108   0.809   0.37407158  95.10988285  57.94   0.10546228  0.878323023911   8.328      0
jxl:m:d0.1       53140 41108008    6.1885828   0.137   0.887   0.61221855  94.01731746  53.56   0.17845403  1.104377529510   6.238      0
jxl:m:d0.2       53140 29741507    4.4774191   0.163   0.966   0.94721991  91.97989836  49.38   0.29829694  1.335600422312   4.902      0
jxl:m:d0.3       53140 24378351    3.6700257   0.183   1.048   1.14434603  89.94132561  47.14   0.39316213  1.442915091842   4.397      0
Aggregate:       53140 31283376    4.7095390   0.187   1.945   1.05234469  89.37912533  51.49   0.31003724  1.366022347459   6.511      0

Before this PR, 10 generations took d0.3 from pnorm 0.23 to pnorm 0.86, significantly worse than first-generation d1.
After this PR, d0.3 goes from pnorm 0.25 to pnorm 0.38, still about as good as first-generation d0.5.
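Pulling the exact d0.3 pnorm values out of the single- and multi-generation tables above, the cumulative error growth over 10 generations can be quantified as:

```python
# pnorm at d0.3, copied from the tables above (gen1 = single generation,
# gen10 = after 10 intermediate generations)
before = {"gen1": 0.23488577, "gen10": 0.86804287}
after  = {"gen1": 0.25624444, "gen10": 0.38239799}

growth_before = before["gen10"] / before["gen1"]  # ~3.7x over 10 generations
growth_after  = after["gen10"] / after["gen1"]    # ~1.5x over 10 generations
print(round(growth_before, 1), round(growth_after, 1))
```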

Encode/decode speed was not measured precisely, but skipping Gaborish obviously makes both somewhat faster (at d <= 0.5).

These changes make it substantially more feasible to use very high-quality lossy JXL (e.g. d0.1 or d0.2) as an alternative to lossless in an authoring workflow. Of course, only lossless is fully safe w.r.t. generation loss, but if the number of re-encodes is reasonable (e.g. fewer than 10), d0.2 will now be just as good in practice.

@jonsneyers added the "encoder" label (Related to the libjxl encoder) on May 7, 2024
veluca93 (Member) commented May 7, 2024

Seems sensible enough to me, but I'll leave it to @jyrkialakuijala.

jyrkialakuijala (Contributor) left a comment

I'm looking at this at a more fundamental level (Gaborish inaccuracies and other sources of generation loss; experimentally I found that they mostly originate in the quantization adjustments in enc_group.cc, and I don't yet understand why those induce generation loss).

While I study the more fundamental causes, it is OK to get this merged, as it already helps with the symptoms.

jonsneyers (Member, Author)

Rebased since there were merge conflicts; didn't change anything, but this needs re-approval.

mo271 previously approved these changes May 13, 2024

mo271 (Member) commented May 13, 2024

Are there some tests that are newly failing?

jonsneyers (Member, Author)

Fixed tests, I think.
