Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a PUSHENVACCMANY opcode. #18964

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

silene
Copy link
Contributor

@silene silene commented Apr 22, 2024

When closures are nested inside closures, it might happen that several consecutive values from the environment of the outer closure are pushed onto the stack in order to create the inner closure. This opcode replaces a sequence of repeated "PUSH; ENVACC" of decreasing indices.

@silene silene added kind: performance Improvements to performance and efficiency. part: VM Virtual machine. labels Apr 22, 2024
@silene silene requested review from a team as code owners April 22, 2024 13:33
@coqbot-app coqbot-app bot added the needs: full CI The latest GitLab pipeline that ran was a light CI. Say "@coqbot run full ci" to get a full CI. label Apr 22, 2024
@silene
Copy link
Contributor Author

silene commented Apr 22, 2024

@coqbot bench

@SkySkimmer
Copy link
Contributor

@coqbot run full ci

@coqbot-app coqbot-app bot removed the needs: full CI The latest GitLab pipeline that ran was a light CI. Say "@coqbot run full ci" to get a full CI. label Apr 22, 2024
Copy link
Contributor

coqbot-app bot commented Apr 23, 2024

🏁 Bench results:

┌─────────────────────────────────────┬─────────────────────────┬───────────────────────────────────────┬───────────────────────────────────────┬──────────────────────────┐
│                                     │      user time [s]      │              CPU cycles               │           CPU instructions            │  max resident mem [KB]   │
│                                     │                         │                                       │                                       │                          │
│            package_name             │   NEW      OLD    PDIFF │      NEW             OLD        PDIFF │      NEW             OLD        PDIFF │   NEW      OLD    PDIFF  │
├─────────────────────────────────────┼─────────────────────────┼───────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ coq-neural-net-interp-computed-lite │  292.94   298.51  -1.87 │  1337376212675   1362945810944  -1.88 │  3427633620998   3425755675583   0.05 │ 1126092  1126220   -0.01 │
│                       coq-equations │    7.67     7.75  -1.03 │    31232464513     31279393886  -0.15 │    50989406523     50991934811  -0.00 │  389500   387024    0.64 │
│                            coq-core │  129.12   130.32  -0.92 │   501645934426    494777457833   1.39 │   530237285987    529760364758   0.09 │  457428   457172    0.06 │
│                       coq-fiat-core │   59.03    59.28  -0.42 │   246341481533    246343944115  -0.00 │   367778466964    367734962739   0.01 │  485152   485172   -0.00 │
│                                 coq │  712.69   715.01  -0.32 │  3004847193203   3010431173986  -0.19 │  5387718512900   5382402532438   0.10 │ 2635536  2479068    6.31 │
│               coq-engine-bench-lite │  156.24   156.66  -0.27 │   661223793281    664833544571  -0.54 │  1222292862412   1228680783574  -0.52 │ 1043404  1234972  -15.51 │
│                        coq-bedrock2 │  360.69   361.61  -0.25 │  1643490734343   1648498494650  -0.30 │  3099264764486   3099070479673   0.01 │  860020   857888    0.25 │
│                        coq-compcert │  279.89   280.49  -0.21 │  1266275209679   1267951675810  -0.13 │  1935198116186   1935220449530  -0.00 │ 1121432  1121600   -0.01 │
│                      coq-verdi-raft │  582.29   583.49  -0.21 │  2647608299650   2654377098824  -0.26 │  4168632669309   4168532028648   0.00 │  846132   846252   -0.01 │
│              coq-mathcomp-ssreflect │   92.47    92.66  -0.21 │   421774059102    422649908762  -0.21 │   658849808817    658804637858   0.01 │ 1368496  1370540   -0.15 │
│                         coq-coqutil │   42.10    42.16  -0.14 │   185750354312    185543679671   0.11 │   270283559110    270214000373   0.03 │  561104   560908    0.03 │
│                    coq-math-classes │   85.20    85.29  -0.11 │   383686922341    383578520477   0.03 │   535914160132    535905070497   0.00 │  504008   503772    0.05 │
│         coq-rewriter-perf-SuperFast │  784.42   784.83  -0.05 │  3565451279982   3567808178669  -0.07 │  6192112194519   6188016676551   0.07 │ 1573088  1576368   -0.21 │
│          coq-performance-tests-lite │  701.61   701.81  -0.03 │  3170165211432   3167252957625   0.09 │  5634922136796   5632585885634   0.04 │ 2264784  2264624    0.01 │
│                      coq-coquelicot │   39.45    39.42   0.08 │   175474158261    174906657949   0.32 │   248974058823    248954748669   0.01 │  852324   853380   -0.12 │
│               coq-mathcomp-fingroup │   30.66    30.63   0.10 │   139944753286    139293533344   0.47 │   208117424283    208112236560   0.00 │  563128   563800   -0.12 │
│                         coq-bignums │   29.36    29.32   0.14 │   132918823123    132613386682   0.23 │   192016388233    191991933510   0.01 │  474692   472712    0.42 │
│                             coq-vst │  873.95   872.31   0.19 │  3975721302252   3965064869241   0.27 │  6746609739164   6742135338972   0.07 │ 2151116  2150852    0.01 │
│                    coq-fiat-parsers │  309.64   308.96   0.22 │  1381486577633   1377059088865   0.32 │  2433582884768   2432364900899   0.05 │ 2401476  2397628    0.16 │
│             coq-metacoq-safechecker │  423.72   422.70   0.24 │  1938747473822   1934497633657   0.22 │  3197609774908   3198484627713  -0.03 │ 2059936  2070028   -0.49 │
│                          coq-stdlib │  362.47   361.57   0.25 │  1533246253284   1527949096954   0.35 │  1286627907700   1286636395989  -0.00 │  716604   716136    0.07 │
│                           coq-color │  251.99   251.36   0.25 │  1131832066668   1130892337096   0.08 │  1633995155302   1634087888208  -0.01 │ 1205556  1209988   -0.37 │
│                   coq-iris-examples │  468.43   467.13   0.28 │  2128879146292   2122381994926   0.31 │  3276692450136   3276735898861  -0.00 │ 1113096  1111656    0.13 │
│                            coq-hott │  157.15   156.65   0.32 │   702271434934    698881837564   0.49 │  1117300623760   1117422153008  -0.01 │  550956   550664    0.05 │
│                   coq-metacoq-pcuic │  985.26   982.00   0.33 │  4409070155383   4400440333963   0.20 │  6475547257617   6475553899766  -0.00 │ 2686696  2683344    0.12 │
│            coq-metacoq-translations │   16.78    16.71   0.42 │    75262124909     74640342840   0.83 │   124806694898    124791623620   0.01 │  846184   846760   -0.07 │
│                coq-mathcomp-algebra │  233.61   232.60   0.43 │  1066514781598   1061887978279   0.44 │  1760451893710   1760430620131   0.00 │ 1292736  1290612    0.16 │
│        coq-fiat-crypto-with-bedrock │ 6192.93  6163.49   0.48 │ 28210520752958  28071579373013   0.49 │ 50340405359330  50331188371458   0.02 │ 3246068  3246124   -0.00 │
│                        coq-coqprime │   48.28    48.05   0.48 │   216784645213    214366982054   1.13 │   332238795863    332158351482   0.02 │  783112   784400   -0.16 │
│                 coq-category-theory │  692.12   688.79   0.48 │  3149486336848   3133373309177   0.51 │  5259458038399   5258252021965   0.02 │  954536   956004   -0.15 │
│                        coq-rewriter │  389.82   387.87   0.50 │  1771156520032   1762622846747   0.48 │  2950512693657   2950456650106   0.00 │ 1480308  1477600    0.18 │
│                         coq-unimath │ 2442.84  2430.02   0.53 │ 11122093695785  11060690640192   0.56 │ 21927452993238  21925293474639   0.01 │ 1254004  1254700   -0.06 │
│                            coq-corn │  715.87   711.85   0.56 │  3250841497170   3233999876384   0.52 │  5070751897585   5071026378416  -0.01 │  765564   767464   -0.25 │
│                       coq-fourcolor │ 1348.22  1339.94   0.62 │  6163684429084   6127024668020   0.60 │ 12142970445065  12140211770497   0.02 │ 2129040  2132464   -0.16 │
│                coq-metacoq-template │  150.68   149.64   0.70 │   668808561975    664052579912   0.72 │  1045714068271   1045895420615  -0.02 │ 1484784  1484408    0.03 │
│                 coq-metacoq-erasure │  504.23   500.40   0.77 │  2300263018188   2283394361121   0.74 │  3586947198473   3586899103873   0.00 │ 2157152  2157612   -0.02 │
│               coq-mathcomp-solvable │  118.49   117.45   0.89 │   541422555492    536154489265   0.98 │   853860355609    853829831996   0.00 │  852980   853196   -0.03 │
│                  coq-mathcomp-field │  134.72   133.52   0.90 │   614880228195    609334732057   0.91 │  1010485164937   1010479596950   0.00 │ 1417984  1417800    0.01 │
│                           coq-verdi │   49.43    48.66   1.58 │   221523243376    219179247566   1.07 │   340654128395    340609984149   0.01 │  531856   531288    0.11 │
│              coq-mathcomp-odd-order │  779.52   762.10   2.29 │  3569136163636   3489827573851   2.27 │  5994515937715   5994633368995  -0.00 │ 1523556  1523700   -0.01 │
│              coq-mathcomp-character │  104.85   102.30   2.49 │   478759934084    467602836745   2.39 │   741996254352    742077614302  -0.01 │ 1044260  1044440   -0.02 │
└─────────────────────────────────────┴─────────────────────────┴───────────────────────────────────────┴───────────────────────────────────────┴──────────────────────────┘

🐢 Top 25 slow downs
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                 TOP 25 SLOW DOWNS                                                                 │
│                                                                                                                                                   │
│   OLD       NEW      DIFF   %DIFF    Ln                      FILE                                                                                 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 180.5720  184.0040  3.4320   1.90%   233  coq-fiat-crypto-with-bedrock/rupicola/bedrock2/deps/riscv-coq/src/riscv/Proofs/DecodeByExtension.v.html │
│  76.6340   78.9300  2.2960   3.00%   365  coq-mathcomp-odd-order/theories/PFsection4.v.html                                                       │
│ 216.8910  218.8720  1.9810   0.91%   103  coq-fiat-crypto-with-bedrock/src/Arithmetic/BarrettReduction.v.html                                     │
│  62.5680   64.4540  1.8860   3.01%   368  coq-mathcomp-odd-order/theories/PFsection4.v.html                                                       │
│ 255.1950  256.8250  1.6300   0.64%  1629  coq-metacoq-pcuic/pcuic/theories/PCUICSR.v.html                                                         │
│ 235.7040  237.2370  1.5330   0.65%   141  coq-fiat-crypto-with-bedrock/src/UnsaturatedSolinasHeuristics/Tests.v.html                              │
│  63.0140   64.3190  1.3050   2.07%   609  coq-fiat-crypto-with-bedrock/rupicola/bedrock2/bedrock2/src/bedrock2Examples/lightbulb.v.html           │
│  45.9500   47.0050  1.0550   2.30%   926  coq-fiat-crypto-with-bedrock/src/Bedrock/End2End/X25519/GarageDoor.v.html                               │
│  16.2470   17.2340  0.9870   6.07%   607  coq-mathcomp-odd-order/theories/PFsection9.v.html                                                       │
│   2.6920    3.6230  0.9310  34.58%   490  coq-stdlib/Reals/Cauchy/ConstructiveCauchyRealsMult.v.html                                              │
│   7.0530    7.9560  0.9030  12.80%   881  coq-vst/veric/binop_lemmas4.v.html                                                                      │
│  47.9980   48.8710  0.8730   1.82%    50  coq-fiat-crypto-with-bedrock/src/Bedrock/Field/Synthesis/Examples/p224_64_new.v.html                    │
│  50.4160   51.2510  0.8350   1.66%    50  coq-fiat-crypto-with-bedrock/src/Bedrock/Field/Synthesis/Examples/p224_64_new.v.html                    │
│ 134.7340  135.4950  0.7610   0.56%   155  coq-fiat-crypto-with-bedrock/src/UnsaturatedSolinasHeuristics/Tests.v.html                              │
│  38.4040   39.1310  0.7270   1.89%   224  coq-performance-tests-lite/PerformanceExperiments/rewrite_lift_lets_map.v.html                          │
│  39.3170   40.0250  0.7080   1.80%   236  coq-rewriter/src/Rewriter/Rewriter/Examples/PerfTesting/LiftLetsMap.v.html                              │
│ 155.0040  155.7090  0.7050   0.45%  1190  coq-unimath/UniMath/CategoryTheory/GrothendieckConstruction/IsPullback.v.html                           │
│  98.2100   98.8470  0.6370   0.65%   376  coq-unimath/UniMath/ModelCategories/Generated/LNWFSMonoidalStructure.v.html                             │
│  25.5250   26.1460  0.6210   2.43%    22  coq-fiat-crypto-with-bedrock/src/Spec/Test/X25519.v.html                                                │
│  25.5210   26.1410  0.6200   2.43%    17  coq-fiat-crypto-with-bedrock/src/Spec/Test/X25519.v.html                                                │
│  99.9450  100.4770  0.5320   0.53%    20  coq-fiat-crypto-with-bedrock/src/Rewriter/Passes/NBE.v.html                                             │
│   3.2750    3.7800  0.5050  15.42%  1060  coq-unimath/UniMath/CategoryTheory/EnrichedCats/Examples/FunctorCategory.v.html                         │
│  27.0900   27.5910  0.5010   1.85%    68  coq-fiat-crypto-with-bedrock/rupicola/bedrock2/deps/riscv-coq/src/riscv/Proofs/VerifyDecode.v.html      │
│  19.9990   20.4990  0.5000   2.50%    25  coq-fiat-crypto-with-bedrock/src/Spec/Test/X25519.v.html                                                │
│  17.5520   17.9870  0.4350   2.48%    32  coq-performance-tests-lite/src/pattern.v.html                                                           │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
🐇 Top 25 speed ups
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                             TOP 25 SPEED UPS                                                              │
│                                                                                                                                           │
│   OLD       NEW      DIFF     %DIFF    Ln                    FILE                                                                         │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 257.2970  251.4590  -5.8380   -2.27%     8  coq-neural-net-interp-computed-lite/theories/MaxOfTwoNumbersSimpler/Computed/AllLogits.v.html │
│  42.8080   39.9910  -2.8170   -6.58%   236  coq-rewriter-perf-SuperFast/src/Rewriter/Rewriter/Examples/PerfTesting/LiftLetsMap.v.html     │
│  64.5210   63.2970  -1.2240   -1.90%   609  coq-bedrock2/bedrock2/src/bedrock2Examples/lightbulb.v.html                                   │
│  12.2070   11.4890  -0.7180   -5.88%   194  coq-fiat-crypto-with-bedrock/src/Fancy/Barrett256.v.html                                      │
│  67.0150   66.4630  -0.5520   -0.82%    27  coq-fiat-crypto-with-bedrock/src/Rewriter/Passes/ToFancyWithCasts.v.html                      │
│  95.3060   94.7610  -0.5450   -0.57%   999  coq-performance-tests-lite/src/fiat_crypto_via_setoid_rewrite_standalone.v.html               │
│  26.0150   25.4800  -0.5350   -2.06%   129  coq-fiat-crypto-with-bedrock/src/Curves/Weierstrass/Projective.v.html                         │
│  17.7590   17.2620  -0.4970   -2.80%   481  coq-verdi-raft/theories/RaftProofs/EndToEndLinearizability.v.html                             │
│  41.2450   40.7660  -0.4790   -1.16%   139  coq-fiat-parsers/src/Parsers/Refinement/SharpenedJSON.v.html                                  │
│  48.2050   47.8050  -0.4000   -0.83%   558  coq-bedrock2/bedrock2/src/bedrock2Examples/insertionsort.v.html                               │
│ 131.9040  131.5070  -0.3970   -0.30%    22  coq-fiat-crypto-with-bedrock/src/Rewriter/Passes/ArithWithCasts.v.html                        │
│  95.2500   94.8720  -0.3780   -0.40%   968  coq-performance-tests-lite/src/fiat_crypto_via_setoid_rewrite_standalone.v.html               │
│   9.7880    9.4260  -0.3620   -3.70%    87  coq-engine-bench-lite/coq/PerformanceDemos/quadratic_reduction.v.html                         │
│  24.2840   23.9460  -0.3380   -1.39%    85  coq-fiat-crypto-with-bedrock/src/Curves/Montgomery/AffineProofs.v.html                        │
│  17.0700   16.7740  -0.2960   -1.73%    24  coq-fiat-crypto-with-bedrock/src/Rewriter/Passes/MulSplit.v.html                              │
│  23.5260   23.2300  -0.2960   -1.26%    23  coq-fiat-crypto-with-bedrock/src/Rewriter/Passes/Arith.v.html                                 │
│   4.4570    4.2000  -0.2570   -5.77%   199  coq-fiat-crypto-with-bedrock/src/Fancy/Montgomery256.v.html                                   │
│   0.6640    0.4090  -0.2550  -38.40%  2108  coq-stdlib/FSets/FMapFacts.v.html                                                             │
│   4.9340    4.6910  -0.2430   -4.93%   320  coq-metacoq-erasure/erasure/theories/ErasureProperties.v.html                                 │
│  44.1920   43.9520  -0.2400   -0.54%   827  coq-vst/veric/binop_lemmas4.v.html                                                            │
│   1.3210    1.0810  -0.2400  -18.17%    89  coq-engine-bench-lite/coq/PerformanceDemos/quadratic_reduction.v.html                         │
│   0.8250    0.6170  -0.2080  -25.21%   736  coq-stdlib/Reals/Cauchy/ConstructiveCauchyReals.v.html                                        │
│  20.0360   19.8320  -0.2040   -1.02%   325  coq-engine-bench-lite/coq/PerformanceDemos/one_step_reduction.v.html                          │
│  20.1720   19.9850  -0.1870   -0.93%   214  coq-engine-bench-lite/coq/PerformanceDemos/one_step_reduction.v.html                          │
│   9.5010    9.3210  -0.1800   -1.89%   118  coq-unimath/UniMath/Semantics/LinearLogic/LiftingModel.v.html                                 │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

@silene
Copy link
Contributor Author

silene commented Apr 23, 2024

The overall .vo size reduction is rather low, of the order of 0.5% on the whole CI. That said, some developments still benefit heavily from this pull request, e.g., coq-fiat-crypto-with-bedrock with a 34% reduction on Field25519.vo (and five other .vo files being reduced by 15% or more). Regarding coq-hott, the results are rather disappointing, so the hypothesis from #18959 is flawed. The only exception is file DPathCube.vo, whose size is reduced by 28%.

@SkySkimmer
Copy link
Contributor

SkySkimmer commented Apr 23, 2024

The -15% max memory usage in engine-bench-lite may be real too

   After |   Peak Mem | File Name                                        |   Before |   Peak Mem ||    Change || Change (mem) | % Change | % Change (mem)
---------------------------------------------------------------------------------------------------------------------------------------------------------
0m12.09s | 1027784 ko | quadratic_reduction.vo                           | 0m12.48s | 1219400 ko || -0m00.39s ||   -191616 ko |   -3.12% |        -15.71%

@silene
Copy link
Contributor Author

silene commented Apr 23, 2024

Hard to tell. If the optimization triggers, the generated bytecode is a bit smaller, so it might delay the next garbage collection. But on the other hand, the allocation of the aux closure will hasten the next garbage collection. Since this code path will be triggered very often (even if the optimization is not), perhaps I should rewrite the closure to make it allocation-free.

@silene silene added the request: full CI Use this label when you want your next push to trigger a full CI. label Apr 23, 2024
@coqbot-app coqbot-app bot removed the request: full CI Use this label when you want your next push to trigger a full CI. label Apr 23, 2024
@silene
Copy link
Contributor Author

silene commented Apr 23, 2024

Actually, this is a semi-pointless optimization, since a look at the assembly code shows that I had not taken into account the allocation for the pair. So, despite the latest change, the code is still not allocation-free. But what is done is done, and the number of allocations is nonetheless halved.

@silene silene added the request: full CI Use this label when you want your next push to trigger a full CI. label May 3, 2024
@coqbot-app coqbot-app bot removed the request: full CI Use this label when you want your next push to trigger a full CI. label May 3, 2024
@ppedrot ppedrot self-assigned this May 21, 2024
@ppedrot
Copy link
Member

ppedrot commented May 21, 2024

@silene same remark as the other PR, could you rebase to bench after #19014?

When closures are nested inside closures, it might happen that several
consecutive values from the environment of the outer closure are pushed
onto the stack in order to create the inner closure. This opcode replaces
a repeated sequence of opcodes "PUSH; ENVACC" of decreasing indices.
@silene silene added the request: full CI Use this label when you want your next push to trigger a full CI. label May 21, 2024
@coqbot-app coqbot-app bot removed the request: full CI Use this label when you want your next push to trigger a full CI. label May 21, 2024
@silene
Copy link
Contributor Author

silene commented May 21, 2024

@coqbot bench

Copy link
Contributor

coqbot-app bot commented May 22, 2024

🏁 Bench results:

┌─────────────────────────────────────┬─────────────────────────┬───────────────────────────────────────┬───────────────────────────────────────┬──────────────────────────┐
│                                     │      user time [s]      │              CPU cycles               │           CPU instructions            │  max resident mem [KB]   │
│                                     │                         │                                       │                                       │                          │
│            package_name             │   NEW      OLD    PDIFF │      NEW             OLD        PDIFF │      NEW             OLD        PDIFF │   NEW      OLD    PDIFF  │
├─────────────────────────────────────┼─────────────────────────┼───────────────────────────────────────┼───────────────────────────────────────┼──────────────────────────┤
│ coq-neural-net-interp-computed-lite │  255.73   262.00  -2.39 │  1166043841143   1196140547624  -2.52 │  3038763516866   3209158266281  -5.31 │ 1112676  1114968   -0.21 │
│              coq-mathcomp-odd-order │  782.77   795.70  -1.62 │  3583397918495   3643212735186  -1.64 │  6077741767438   6077798068872  -0.00 │ 1664880  1665084   -0.01 │
│                           coq-color │  249.13   252.36  -1.28 │  1120947974998   1134839311086  -1.22 │  1632105695401   1632212985177  -0.01 │ 1170244  1165500    0.41 │
│               coq-engine-bench-lite │  155.25   157.22  -1.25 │   656777755732    666648496752  -1.48 │  1221573801085   1229113144747  -0.61 │ 1101716  1232444  -10.61 │
│                  coq-mathcomp-field │  140.92   142.53  -1.13 │   643526418084    650571914976  -1.08 │  1073516082653   1073506492285   0.00 │ 1435912  1436288   -0.03 │
│                   coq-metacoq-pcuic │  976.69   987.58  -1.10 │  4371644494896   4426215481031  -1.23 │  6449402128769   6450620279506  -0.02 │ 2678292  2675120    0.12 │
│                         coq-coqutil │   42.20    42.64  -1.03 │   186146572857    187441659108  -0.69 │   270740752105    270842533608  -0.04 │  560836   560700    0.02 │
│                      coq-verdi-raft │  576.84   582.63  -0.99 │  2623619614000   2649554546391  -0.98 │  4160706140082   4160715361680  -0.00 │  848020   848224   -0.02 │
│          coq-performance-tests-lite │  702.80   709.38  -0.93 │  3174256704895   3205103329928  -0.96 │  5637119449002   5687080059475  -0.88 │ 2274660  2276052   -0.06 │
│                coq-metacoq-template │  149.98   151.30  -0.87 │   666454350091    671644144053  -0.77 │  1045362602558   1045362751024  -0.00 │ 1515200  1516496   -0.09 │
│         coq-rewriter-perf-SuperFast │  780.29   786.91  -0.84 │  3546244112973   3576159782665  -0.84 │  6173720057502   6225563395710  -0.83 │ 1578732  1576008    0.17 │
│        coq-fiat-crypto-with-bedrock │ 5921.80  5970.79  -0.82 │ 26966862830700  27186015975440  -0.81 │ 48055848537562  48708683583874  -1.34 │ 3246024  3246080   -0.00 │
│                        coq-rewriter │  385.05   387.90  -0.73 │  1749971748727   1764076072589  -0.80 │  2934448896564   2974661228626  -1.35 │ 1479932  1480056   -0.01 │
│                       coq-fourcolor │ 1391.44  1399.96  -0.61 │  6365528577670   6404946215188  -0.62 │ 12721238329828  13506346789515  -5.81 │ 2137744  2137824   -0.00 │
│                    coq-fiat-parsers │  305.56   307.37  -0.59 │  1360575811667   1371463353283  -0.79 │  2396340385256   2434012572621  -1.55 │ 2459172  2458212    0.04 │
│                coq-mathcomp-algebra │  258.27   259.79  -0.59 │  1178016211473   1184055832061  -0.51 │  1999264022720   1999214607425   0.00 │ 1335236  1334392    0.06 │
│             coq-metacoq-safechecker │  416.18   418.61  -0.58 │  1905790397479   1915961841380  -0.53 │  3172522519590   3172846703091  -0.01 │ 2039160  2037240    0.09 │
│                 coq-category-theory │  658.02   661.86  -0.58 │  2991161234205   3008973422910  -0.59 │  4922166693498   4920232765361   0.04 │  969904   972196   -0.24 │
│              coq-mathcomp-ssreflect │  218.72   219.88  -0.53 │   998826184653   1003867285518  -0.50 │  1675254071996   1676139974959  -0.05 │ 1726624  1718796    0.46 │
│                            coq-core │  128.88   129.50  -0.48 │   501473124035    500017466390   0.29 │   531895361897    531778379656   0.02 │  457664   458204   -0.12 │
│               coq-mathcomp-fingroup │   30.40    30.54  -0.46 │   138468702307    139141183398  -0.48 │   207652252397    207646328709   0.00 │  568992   569712   -0.13 │
│                 coq-metacoq-erasure │  501.65   503.87  -0.44 │  2289842179242   2298607416963  -0.38 │  3577058608209   3577286289029  -0.01 │ 2099916  2108652   -0.41 │
│              coq-mathcomp-character │  108.29   108.76  -0.43 │   494599581859    496829008686  -0.45 │   790172385594    790151542428   0.00 │ 1130728  1127972    0.24 │
│                          coq-stdlib │  349.29   350.74  -0.41 │  1483260839539   1482432171780   0.06 │  1257252177936   1257456219381  -0.02 │  712696   718984   -0.87 │
│                        coq-compcert │  279.33   280.43  -0.39 │  1263490118172   1269781764415  -0.50 │  1931752657816   1939685079642  -0.41 │ 1135072  1135032    0.00 │
│                            coq-hott │  160.13   160.76  -0.39 │   716980415373    719875267490  -0.40 │  1142779545498   1142727353179   0.00 │  474676   471652    0.64 │
│                         coq-bignums │   29.37    29.48  -0.37 │   133124139976    133671291699  -0.41 │   192184610917    192226554246  -0.02 │  470408   472752   -0.50 │
│                      coq-coquelicot │   39.50    39.64  -0.35 │   175900392853    176957091994  -0.60 │   249296684004    249283025445   0.01 │  853128   852476    0.08 │
│                      coq-test-suite │  700.82   703.24  -0.34 │  2986151876355   2995381894463  -0.31 │  5351319836584   5360459729095  -0.17 │ 2696252  2694688    0.06 │
│                   coq-iris-examples │  468.60   470.14  -0.33 │  2129394343464   2137199704721  -0.37 │  3277318054277   3277349012089  -0.00 │ 1117064  1115608    0.13 │
│                       coq-fiat-core │   59.15    59.32  -0.29 │   246230748666    248256326146  -0.82 │   368476806284    368457427291   0.01 │  476660   476748   -0.02 │
│                             coq-vst │  876.42   878.84  -0.28 │  3983427030703   3995586501298  -0.30 │  6737624290115   6737642841711  -0.00 │ 2212372  2211632    0.03 │
│                            coq-corn │  716.23   718.00  -0.25 │  3253840403588   3259171932852  -0.16 │  5075255461474   5076269071998  -0.02 │  786116   786820   -0.09 │
│                         coq-unimath │ 2577.83  2584.04  -0.24 │ 11736273584154  11764509949987  -0.24 │ 23090071223472  23088860031200   0.01 │ 1272696  1275444   -0.22 │
│                       coq-equations │    7.57     7.58  -0.13 │    31071425584     31214689786  -0.46 │    50906180803     50928968012  -0.04 │  386768   386964   -0.05 │
│                           coq-verdi │   48.99    49.02  -0.06 │   219310469875    220689489315  -0.62 │   340652227473    340623737175   0.01 │  527788   529316   -0.29 │
│                    coq-math-classes │   85.90    85.92  -0.02 │   385778017858    386344990054  -0.15 │   536783032373    536757329682   0.00 │  503160   503760   -0.12 │
│                        coq-coqprime │   48.18    48.19  -0.02 │   216412249431    216194834990   0.10 │   332866787576    334662769032  -0.54 │  782628   785876   -0.41 │
│               coq-mathcomp-solvable │  118.68   118.46   0.19 │   541858296828    540433037175   0.26 │   856112276500    856224689501  -0.01 │  883480   883908   -0.05 │
│            coq-metacoq-translations │   16.92    16.87   0.30 │    75662327900     75737198758  -0.10 │   124718004428    124708246114   0.01 │  841272   839704    0.19 │
│                        coq-bedrock2 │  362.89   360.55   0.65 │  1652756697576   1643178605071   0.58 │  3105954348834   3109231706873  -0.11 │  862040   859032    0.35 │
└─────────────────────────────────────┴─────────────────────────┴───────────────────────────────────────┴───────────────────────────────────────┴──────────────────────────┘

🐢 Top 25 slow downs
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                             TOP 25 SLOW DOWNS                                                              │
│                                                                                                                                            │
│   OLD       NEW      DIFF     %DIFF     Ln                    FILE                                                                         │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│  61.7760   64.9430  3.1670      5.13%   609  coq-bedrock2/bedrock2/src/bedrock2Examples/lightbulb.v.html                                   │
│  61.8080   64.6620  2.8540      4.62%   609  coq-fiat-crypto-with-bedrock/rupicola/bedrock2/bedrock2/src/bedrock2Examples/lightbulb.v.html │
│  11.8360   12.4830  0.6470      5.47%   194  coq-fiat-crypto-with-bedrock/src/Fancy/Barrett256.v.html                                      │
│ 154.8680  155.5140  0.6460      0.42%  1190  coq-unimath/UniMath/CategoryTheory/GrothendieckConstruction/IsPullback.v.html                 │
│   2.3940    2.9520  0.5580     23.31%   487  coq-stdlib/Numbers/HexadecimalFacts.v.html                                                    │
│   4.4040    4.8600  0.4560     10.35%   733  coq-category-theory/Construction/Comma/Adjunction.v.html                                      │
│  44.5720   44.9270  0.3550      0.80%   827  coq-vst/veric/binop_lemmas4.v.html                                                            │
│   0.5920    0.8800  0.2880     48.65%   160  coq-stdlib/Numbers/HexadecimalNat.v.html                                                      │
│  33.8000   34.0670  0.2670      0.79%   839  coq-unimath/UniMath/CategoryTheory/GrothendieckConstruction/IsPullback.v.html                 │
│   1.0160    1.2830  0.2670     26.28%    88  coq-engine-bench-lite/coq/PerformanceDemos/quadratic_reduction.v.html                         │
│   4.3230    4.5650  0.2420      5.60%   199  coq-fiat-crypto-with-bedrock/src/Fancy/Montgomery256.v.html                                   │
│   2.6590    2.8600  0.2010      7.56%   761  coq-category-theory/Construction/Comma/Adjunction.v.html                                      │
│   0.3100    0.5080  0.1980     63.87%   705  coq-stdlib/Numbers/HexadecimalFacts.v.html                                                    │
│  17.6040   17.7920  0.1880      1.07%    31  coq-engine-bench-lite/coq/PerformanceDemos/pattern.v.html                                     │
│  13.0320   13.2170  0.1850      1.42%  1028  coq-unimath/UniMath/CategoryTheory/LocalizingClass.v.html                                     │
│   0.7240    0.9060  0.1820     25.14%   813  coq-stdlib/MSets/MSetRBT.v.html                                                               │
│  25.4350   25.6080  0.1730      0.68%    12  coq-fourcolor/theories/job535to541.v.html                                                     │
│   0.0010    0.1740  0.1730  17300.00%   157  coq-metacoq-pcuic/pcuic/theories/Syntax/PCUICReflect.v.html                                   │
│   5.0570    5.2210  0.1640      3.24%  1258  coq-mathcomp-odd-order/theories/PFsection9.v.html                                             │
│   0.2490    0.4050  0.1560     62.65%   443  coq-stdlib/Reals/Abstract/ConstructiveLimits.v.html                                           │
│   2.4780    2.6160  0.1380      5.57%   509  coq-metacoq-pcuic/pcuic/theories/PCUICWfUniverses.v.html                                      │
│   0.5100    0.6430  0.1330     26.08%   808  coq-category-theory/Construction/Comma/Adjunction.v.html                                      │
│   7.1650    7.2920  0.1270      1.77%   583  coq-unimath/UniMath/CategoryTheory/Monoidal/Displayed/Symmetric.v.html                        │
│   3.8590    3.9840  0.1250      3.24%    61  coq-fiat-crypto-with-bedrock/rupicola/src/Rupicola/Examples/L64X128.v.html                    │
│   0.2170    0.3370  0.1200     55.30%   637  coq-stdlib/Reals/Abstract/ConstructiveSum.v.html                                              │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
🐇 Top 25 speed ups
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                               TOP 25 SPEED UPS                                                                │
│                                                                                                                                               │
│   OLD       NEW      DIFF    %DIFF    Ln                     FILE                                                                             │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 220.2730  213.8020  -6.4710  -2.94%     8  coq-neural-net-interp-computed-lite/theories/MaxOfTwoNumbersSimpler/Computed/AllLogits.v.html      │
│ 241.8720  235.7000  -6.1720  -2.55%   141  coq-fiat-crypto-with-bedrock/src/UnsaturatedSolinasHeuristics/Tests.v.html                         │
│  83.6260   80.0150  -3.6110  -4.32%   365  coq-mathcomp-odd-order/theories/PFsection4.v.html                                                  │
│ 138.6460  135.0940  -3.5520  -2.56%   155  coq-fiat-crypto-with-bedrock/src/UnsaturatedSolinasHeuristics/Tests.v.html                         │
│  69.2210   66.3670  -2.8540  -4.12%   368  coq-mathcomp-odd-order/theories/PFsection4.v.html                                                  │
│ 257.9050  255.5070  -2.3980  -0.93%  1629  coq-metacoq-pcuic/pcuic/theories/PCUICSR.v.html                                                    │
│ 130.9410  128.6240  -2.3170  -1.77%    22  coq-fiat-crypto-with-bedrock/src/Rewriter/Passes/ArithWithCasts.v.html                             │
│ 222.1250  220.3510  -1.7740  -0.80%   103  coq-fiat-crypto-with-bedrock/src/Arithmetic/BarrettReduction.v.html                                │
│  39.9690   38.4870  -1.4820  -3.71%   224  coq-performance-tests-lite/PerformanceExperiments/rewrite_lift_lets_map.v.html                     │
│  40.7940   39.3660  -1.4280  -3.50%   236  coq-rewriter-perf-SuperFast/src/Rewriter/Rewriter/Examples/PerfTesting/LiftLetsMap.v.html          │
│  40.7680   39.3620  -1.4060  -3.45%   236  coq-rewriter/src/Rewriter/Rewriter/Examples/PerfTesting/LiftLetsMap.v.html                         │
│ 101.4070  100.2060  -1.2010  -1.18%    20  coq-fiat-crypto-with-bedrock/src/Rewriter/Passes/NBE.v.html                                        │
│  17.5580   16.7380  -0.8200  -4.67%   607  coq-mathcomp-odd-order/theories/PFsection9.v.html                                                  │
│  46.3640   45.5920  -0.7720  -1.67%    50  coq-fiat-crypto-with-bedrock/src/Bedrock/Field/Synthesis/Examples/p224_64_new.v.html               │
│  44.0130   43.2770  -0.7360  -1.67%    50  coq-fiat-crypto-with-bedrock/src/Bedrock/Field/Synthesis/Examples/p224_64_new.v.html               │
│  11.7690   11.0580  -0.7110  -6.04%  2103  coq-mathcomp-ssreflect/mathcomp/ssreflect/order.v.html                                             │
│  27.7450   27.1270  -0.6180  -2.23%    68  coq-fiat-crypto-with-bedrock/rupicola/bedrock2/deps/riscv-coq/src/riscv/Proofs/VerifyDecode.v.html │
│  37.7430   37.1490  -0.5940  -1.57%     3  coq-fiat-crypto-with-bedrock/src/ExtractionJsOfOCaml/bedrock2_fiat_crypto.v.html                   │
│   9.9040    9.3550  -0.5490  -5.54%    87  coq-engine-bench-lite/coq/PerformanceDemos/quadratic_reduction.v.html                              │
│  13.3830   12.8600  -0.5230  -3.91%    22  coq-fiat-crypto-with-bedrock/src/Spec/Test/X25519.v.html                                           │
│  37.2540   36.7320  -0.5220  -1.40%   139  coq-fiat-parsers/src/Parsers/Refinement/SharpenedJSON.v.html                                       │
│  13.4220   12.9010  -0.5210  -3.88%    17  coq-fiat-crypto-with-bedrock/src/Spec/Test/X25519.v.html                                           │
│  22.0330   21.5500  -0.4830  -2.19%   376  coq-fiat-crypto-with-bedrock/rupicola/bedrock2/compiler/src/compiler/FlatToRiscvMetric.v.html      │
│  20.3040   19.8390  -0.4650  -2.29%   214  coq-engine-bench-lite/coq/PerformanceDemos/one_step_reduction.v.html                               │
│  20.0680   19.6170  -0.4510  -2.25%   325  coq-engine-bench-lite/coq/PerformanceDemos/one_step_reduction.v.html                               │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

@ppedrot
Copy link
Member

ppedrot commented May 22, 2024

Looks like the memory reduction of coq-engine-bench-lite is reproducible. Apart from that, the leading vo size reductions are:

coq-metacoq-safechecker/./safechecker/theories/PCUICTypeChecker.vo 4254073 4113413 140660 (-3%)
coq-category-theory/./Instance/Coq/Par.vo 8373788 8200012 173776 (-2%)
coq-fiat-parsers/./src/Parsers/BooleanRecognizerOptimized.vo 8323229 8146557 176672 (-2%)
coq-fiat-crypto-with-bedrock/./src/Bedrock/Field/Synthesis/Examples/p224_64_new.vo 2346344 2119714 226630 (-10%)
coq-metacoq-safechecker/./safechecker/theories/PCUICSafeRetyping.vo 3493440 3256420 237020 (-7%)
coq-metacoq-pcuic/./pcuic/theories/PCUICConversion.vo 4585514 4270334 315180 (-7%)
coq-metacoq-safechecker/./safechecker/theories/PCUICSafeConversion.vo 8168971 7684625 484346 (-6%)
coq-test-suite/./_build/default/test-suite/success/SchemeEquality.vo 4396229 3843741 552488 (-13%)
coq-fiat-crypto-with-bedrock/./src/Bedrock/End2End/X25519/Field25519.vo 1926146 1269105 657041 (-34%)
coq-metacoq-safechecker/./safechecker/theories/PCUICSafeReduce.vo 7839729 6824982 1014747 (-13%)
TOTAL 1457683104 1449720654 -7962450 (-1%)

Copy link
Member

@ppedrot ppedrot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simple and good enough for me.

@ppedrot
Copy link
Member

ppedrot commented May 22, 2024

@silene if you're happy with the current state of the PR I can merge it.

@ppedrot ppedrot added this to the 8.20+rc1 milestone May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: performance Improvements to performance and efficiency. part: VM Virtual machine.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants