more robust CPU feature compilation and runtime selection #98

Open
hannesm opened this issue Jan 4, 2021 · 0 comments
Labels
help wanted Extra attention is needed

hannesm commented Jan 4, 2021

at the moment (with #96 merged):

  • __builtin_bswap is avoided (since together with -mssse3 it leads to SSSE3 instructions, which fault on CPUs without SSSE3, e.g. AMD Phenom II)
  • chacha source is copied to be compiled twice, once with and once without -mssse3

newer compilers with more aggressive analysis may turn even more code (of other hashes / ...) into SSSE3 instructions.

to avoid this potential miscompilation, "all" C code should be compiled twice:

  • once with machine intrinsics flags that are useful (atm ssse3/aes/pclmul)
  • once without any machine-dependent intrinsics
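
a minimal sketch of the "compile twice" idea, assuming a hypothetical `VARIANT` macro that suffixes every exported symbol so both objects can be linked into one library. the function name `example_bswap32` and the build commands in the comment are illustrative only and not taken from the repository:

```c
#include <stdint.h>

/*
 * Hypothetical build, one source file compiled into two objects:
 *   cc -c          -DVARIANT=generic impl.c -o impl_generic.o
 *   cc -c -mssse3  -DVARIANT=ssse3   impl.c -o impl_ssse3.o
 * VARIANT defaults to "generic" so the file also builds on its own.
 */
#ifndef VARIANT
#define VARIANT generic
#endif

/* Two-level concatenation so that VARIANT is expanded before pasting. */
#define CONCAT_(a, b) a##_##b
#define CONCAT(a, b) CONCAT_(a, b)
#define FN(name) CONCAT(name, VARIANT)

/* Expands to example_bswap32_generic or example_bswap32_ssse3. */
uint32_t FN(example_bswap32)(uint32_t x);

/* Plain C byte swap; the compiler is free to pick instructions
 * according to the flags this translation unit was built with. */
uint32_t FN(example_bswap32)(uint32_t x) {
  return ((x & 0x000000ffu) << 24) |
         ((x & 0x0000ff00u) << 8)  |
         ((x >> 8) & 0x0000ff00u)  |
         (x >> 24);
}
```

with this layout the source exists only once, and the only thing that differs between the two objects is the compiler flags and the symbol suffix.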

then all entry points (called from OCaml) should be dispatched at runtime based on the CPU feature flags that are actually present.
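
a rough sketch of what such a runtime dispatch could look like, assuming GCC or Clang on x86 where `__builtin_cpu_init` and `__builtin_cpu_supports` are available. all function names (`example_xor`, `example_xor_generic`, `example_xor_ssse3`) are hypothetical, and in a real build the two variants would live in separate translation units compiled with different flags:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portable variant (would be built without -mssse3). */
static void example_xor_generic(const uint8_t *src, uint8_t *dst, size_t n) {
  for (size_t i = 0; i < n; i++) dst[i] ^= src[i];
}

/* Hypothetical accelerated variant (would be built with -mssse3).
 * The body is identical here purely to keep the sketch self-contained. */
static void example_xor_ssse3(const uint8_t *src, uint8_t *dst, size_t n) {
  for (size_t i = 0; i < n; i++) dst[i] ^= src[i];
}

typedef void (*xor_fn)(const uint8_t *, uint8_t *, size_t);

/* Returns non-zero if the running CPU reports SSSE3.
 * Falls back to "no" on compilers/targets without the builtins. */
static int cpu_has_ssse3(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  return __builtin_cpu_supports("ssse3") != 0;
#else
  return 0;
#endif
}

/* Single entry point, e.g. what an OCaml stub would call.
 * The selection is cached after the first call; this sketch does not
 * synchronise the cache, so concurrent first calls would need care. */
void example_xor(const uint8_t *src, uint8_t *dst, size_t n) {
  static xor_fn impl = NULL;
  if (impl == NULL)
    impl = cpu_has_ssse3() ? example_xor_ssse3 : example_xor_generic;
  impl(src, dst, n);
}
```

the caller only ever sees `example_xor`, so adding another variant later (e.g. for a further feature flag) would not change the interface exposed to OCaml.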

the issue with the above approach is that it is not yet clear which CPU features can be used in which settings, and which features are actually useful (thinking of SSE4 etc. as well). to avoid a huge matrix of variants, time should be spent researching what is useful first.

hannesm added the "help wanted" label Jan 4, 2021