New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add initial version of redis-codegen #1
base: main
Are you sure you want to change the base?
Conversation
redis-codegen generates medium level API interfaces based on the redis command documention provided as commands.json from the redis-doc repository. Currently the generated commands still use the implement_commands macro and are included using the include macro. There are some breaking changes trough this. All shorthand medium level commands from the old definition needs to be moved out and reimplemented. Probably named as high level api.
This also adds _single fn variants that do not take a slice, as a shorthand.
redis-code now generate a single module per type of implementation. The traits Commands and AsyncCommands are implemented in the commands and async_commands module. Cmd gets a additional implementation in the command module. The same holds for pipeline and cluster_pipeline. Token enum generation is WIP.
We generate a tuple based interface for block arguments now. This is over specific, but should provide a stricter API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, this is looking pretty great! I think we can probably do a PR to the real project soon.
Can you make a list of open questions/issues that we should discuss?
Cargo.toml
Outdated
@@ -13,6 +13,12 @@ edition = "2018" | |||
all-features = true | |||
rustdoc-args = ["--cfg", "docsrs"] | |||
|
|||
|
|||
[workspace] | |||
members = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit: let's just put this on a single line?
redis-codegen/Cargo.toml
Outdated
serde_path_to_error = {version = "0.1"} | ||
thiserror = "1.0" | ||
log = "0.4" | ||
itertools = "0.10" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what's going on with the indentation here?
(Also, would be nice to have trailing newlines in all the TOML files.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the current state needs some more cleanup at various points. The commits are more or less just savepoints after a days worth of work at this point. There are some parts that changed drastically from commit 1 and I am thinking of squashing the commits for a reviewable PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think squashing makes a lot of sense here.
tests/generate.rs
Outdated
} | ||
} | ||
|
||
const BASE_URI: &str = "https://raw.githubusercontent.com/redis/redis-doc/master/commands.json"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be unused. What's the reason to keep it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to opentelemetry-rust I want to add a test that is ignored by default which downloads an up to date version of the commands.json. But that had low priority and therefore I have not implemented that yet.
tests/generate.rs
Outdated
|
||
// Compare the old version and new version and fail the test if they're different. | ||
|
||
if versions[0] != versions[1] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just return if they're the same instead? Saves a level of indentation.
redis-codegen/src/ident.rs
Outdated
// the following keywords are not supported as raw identifiers and are therefore suffixed with an underscore. | ||
"self" | "super" | "extern" | "crate" => ident += "_", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this actually an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not at this time. I can remove that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, let's ditch those for now.
I will work on the oneof support and token creation tomorrow and will collect a list of open todos and questions after that. |
Add allow deprecated to commands.rs Add docs to tokens.rs
This test is ignored by default similar to the sync_schemas command of opentelemetry-rust. This test can be used to update the commands.json
Alright. I implemented a first version of argument token generation. token.rs should now contain all No to open TODOs:
|
Sounds good to me!
Can you be more specific? I guess so but not exactly sure.
Can you make a review comment in the relevant spot with specific suggestions on why you're not happy yet?
Adding traits how? I think this too would benefit from a review comment in context, so it's easier for me to review.
Sure, that's fine. Do you think we could propose a way upstream to make that information part of
Here too, can you be more specific about your concerns? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functions that return an iterator are currently handwritten. I hope that is alright.
Sure, that's fine. Do you think we could propose a way upstream to make that information part of commands.json?
I guess that could be added. Maybe a flag in the hints is enough? There is request_policy:special
but it seems this just means there is some kind of special handling needed.
We probably want to sanitize comments similar to prost right?
Can you be more specific? I guess so but not exactly sure.
Currently we take the strings from the commands.json to render docs. Prost sanitizes urls (adds <
and >
and escapes [
and ]
) See https://github.com/tokio-rs/prost/blob/master/prost-build/src/ast.rs#L144. I guess we should add this here as well for our docs.
I am not 100% happy with the way the modules are laid out in redis_codegen::code_generator. Maybe you have a cleaner idea?
Here too, can you be more specific about your concerns?
This is related to the files/modules in the code_generator directory. I guess we could move all generators (the ones with the _generator suffix) into a new directory/module to get this a little bit cleaner, but I guess this is purely cosmetic regarding discoverability and maintainability.
I hope the inline comments are more helpful.
kind: GenerationKind::Full, | ||
}; | ||
|
||
let generation_unit: Box<dyn Generator> = match generation_type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one point I am not particular happy with. The Box seem overkill for this match.
I guess if we match here we can just directly call the generate
fn on these different generators.
src/generated/tokens.rs
Outdated
/// Redis Type: Filterby | ||
pub enum Filterby { | ||
/// MODULE | ||
Module(String), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to allow for a better usability for the types in token.rs by adding traits like ToRedisString, ToRedisKey? These currently have underlying types based on the redis RESP (String, i64, f64)
Adding traits how? I think this too would benefit from a review comment in context, so it's easier for me to review.
In this file in general we have types that just wrap some kind of base type. I am thinking of making this generic with a trait bound of ToRedisString
so that I can put any type (that implements ToRedisString
) in my code into this enum and don't have to convert it into a String
first. Similar for the other types (i64, f64, and probably Keys which are currently Strings too).
But that might increase compilation times a lot as there are many types here and most of them are candidates to get a generic.
While shortly looking into adding the generated types as arguments to the API, I think I found some flaws in my current implementation. I currently check if we already generated a type by that name. I don't think I can assume uniqueness based on the commands.json. I will look into splitting the types into submodules base on the group of the command. For this the argument names or token names should be unique for each group. I probably have time on Tuesday and will look into that. I don't want to end up with a module per command, so we might need a smart way to avoid that when the arguments are not unique across a command group. We could add the command name itself to the name of the types, but I don't think that will result in a good usable API. One quick example is the |
This also fixes some logical flaws with the previous implementation. Previously when two types hat the same token name or in the worst case the same argument name, an identical type was created. This adds a TypeRegistry which creates a submodule when such an duplicate type is encountered. It also helps to resolve types when generating code. This solution is pretty regarding the names of modules, but at least this works for now.
I fixed the issue with duplicate types by adding a I guess there is still space for improvement regarding ergonomic usage. Maybe use some heuristic to create mutliple functions for each command when there are mutual exclusive arguments. This would need some support in the command.json. Currently these are just marked as optional. |
I think this is very promising -- I guess my biggest question is how we might get to merge this. I think a key thing that's hard to see right now is how the API changes from the old version to the new version. Maybe it would be possible to somehow (semi-manually) generate a list of function signatures from the current source code and create a diff from that to the new function signatures? |
I created a cleaned up overview of all functions in the I created a gist with both versions so you can use diff tool you prefer yourself. I hope I did not delete any commands while cleaning the output up. |
A short recap of my talks with upstream: Their reply schema PR looks pretty good. It is based on JSON schema which should allow us to generate types for the responses. I proposed some additions to their PR to improve the generation of a more ergonomic high level API. Sadly before we can utilize this we would need add support for RESP3. I am awaiting some feedback on adding a cursor hint. I have resolved the CLIENT KILL mis-documentation and we can remove that from our denylist. |
Redis Maintainers asked to use the output of a working redis instance to generate the commands.json ourselves. This adds a built_commands_json module which is a more typesafe copy of the python script redis uses to generate the commands.json in the docs repo.
This adds support to automatically generate cursor based commands using an iterator result. This currently is not finished, as the command json generator is missing support for optional and multiple arguments.
The commands.json generation based on the redis-cli output now correctly computes subcommands.
I need to retouch the type registry. The generation of the types (and deduplication) seems to be nondeterministic currently. Probably caused by iterating over a HashMap. |
Note, upstream now has optional support for the RedisJSON module. Should figure out how that fits into this... |
redis-codegen generates medium level API interfaces based on the redis
command documention provided as commands.json from the redis-doc
repository.