Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Constify the code generator #242

Draft
wants to merge 21 commits into
base: master
Choose a base branch
from
Draft

Conversation

vbe0201
Copy link

@vbe0201 vbe0201 commented Oct 21, 2021

This is an initial draft to make the phf_generator crate usable as const fn, as directly and indirectly requested in #188, #196, #198, and #202. Being able to construct static maps from arbitrary const expressions as keys is the main goal of this PR.

The current approach depends on quite a bunch of nightly feature gates so I'm not expecting to get this merged anytime soon. For the time being, I'd rather collect feedback and thoughts on this and discuss the approach that is taken to make this a thing.

phf_shared

Using the const_trait_impl feature, I largely made the PhfBorrow and PhfHash traits const-compatible and adapted the proper implementations for types. All this is behind a const-api Cargo feature.

In the process, I also replaced the siphasher dependency by a corresponding in-crate implementation that supports compile-time hashing. It runs (and passes) the same test suite as the siphasher trait and I generally think the maintenance burden for this addition will be very low due to being a non-cryptographic hash that will hardly ever require any changes to the implementation.

Unresolved

  • I replaced std::ptr::copy_nonoverlapping operations with fixed-length arrays in a similar fashion to [buf[0], buf[1], buf[2], buf[3]]. I haven't decompiled the code yet but should the code emitted for this end up being worse than for copy_nonoverlapping I will use the const-api feature to provide separate implementations intended for performant runtime usage and purely compile-time usage, respectively.

phf_generator

const compatibility turned out to be quite difficult here. At its core we have the generate_hash function which should enable us to generate a HashState over a given slice of key entries. This turned out to be a non-trivial journey that mandates a breaking change to the API which can be sensibly gated behind yet another const-api Cargo feature for this crate.

To make const fn generate_hash a thing, we need to have fixed-length slices - this will barely be a problem with phf_macros but it would be a big problem for phf_codegen where the number of map entries cannot be measured statically - the const-api feature comes yet again handy here as we can exclude the phf_codegen crate from constified phf_generator APIs and it will continue to work as it did until now.

For a const PRNG, I decided to go with wyrand which is also used in quite a few programming languages and has some good attributes going for it, e.g. a very minimal implementation that will also turn out faster at runtime than rand's SmallRng. I consider the maintenance burden very low here too because the core implementation is less than 20 lines of code.

Unresolved

  • Some further additions to the RNG implementation will be required to make letter generation for gen_hash_test viable which currently does not compile.

  • The last unimplemented feature currently is sorting buckets by descending order. I will try to get this resolved as fast as possible, but help with the implementation would be appreciated here.

phf

Lastly, for the phf crate itself, we will want to expose phf_map and phf_set macros that expand to const expressions but don't break existing code at the same time.

I solved this by introducing declarative macros with macro_rules! that expand into a constant expression using APIs provided by the phf_macros crate. This will also fix #183.

However, this comes at the cost that phf_generator must be a hard dependency of phf when building with the macro feature.

phf_macros

I had to repurpose this crate into providing utilities for macro implementations in the documentation as the now declarative macros would need types from phf which leads to a dependency cycle if all of the previous user API should remain intact.

I figured deleting or renaming it is not within the scope of my decisions but I also don't feel this is the optimal state, so I'm leaving the "future" of this crate open for further discussion here.

phf_codegen

phf_codegen is the reason phf_shared and phf_generator have their const API behind the const-api feature - a way to generate PHF types dynamically is still needed and should be provided.

This however imposed an issue for me during development - in-tree cargo check will make the phf_codegen crate error over the const API of phf_generator which actually shouldn't be exposed to it. I haven't found a good solution yet so help with that is heavily appreciated.


Lastly, all of this is in a very experimental state and will evolve over time as Rust upstream grants us more flexibility. I'd love to hear your thoughts on this and please let me know if something is not up to standards.

@vbe0201
Copy link
Author

vbe0201 commented Oct 23, 2021

I'd like to point out that this is functional now and passes most of the existing tests.

The ones that particularly don't pass are those which test for duplicate keys in the map. This is currently an unresolved issue because we can't do const fn comparisons yet and even if we could, we'd be limited on rudimentary panic messages which are not descriptive of what the duplicate key actually is.

I will try to get this sorted out in the future when possible, but until then, check_duplicates remains a no-op.

I'd also like to point out two new tests I added:

    #[test]
    fn test_constexpr_keys() {
        static MAP: phf::Map<u8, isize> = phf_map! {
            stringify!(abc).len() as u8 => 0,
            5 + 4 + 3 => 1,
        };

        assert_eq!(MAP.get(&3), Some(&0));
        assert_eq!(MAP.get(&12), Some(&1));
        assert_eq!(MAP.get(&4), None);
    }

    #[test]
    fn test_nested_map() {
        static MAP: phf::Map<&'static str, phf::Map<&'static str, u16>> = phf_map! {
            "nested" => phf_map! {
                "map" => 1337,
            },
        };

        assert_eq!(
            MAP.get(&"nested").and_then(|m| m.get(&"map")),
            Some(&1337)
        );
    }

This comes at the cost that all the PHF structures are Clonable and Copyable now and the same must be fulfilled for all other types a user possibly wants to store inside them. I think the tradeoff is sustainable.

This also enables a clear formulation of the limitations of the macro syntax:

keys and values may be arbitrary expressions which both evaluate into a copyable object with the additions that keys must satisfy PhfHash and PhfBorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Nested maps don't compile
1 participant