Add initial version of redis-codegen #1

valkum · 2022-06-16T11:39:13Z

redis-codegen generates medium level API interfaces based on the redis
command documention provided as commands.json from the redis-doc
repository.

redis-codegen generates medium level API interfaces based on the redis command documention provided as commands.json from the redis-doc repository. Currently the generated commands still use the implement_commands macro and are included using the include macro. There are some breaking changes trough this. All shorthand medium level commands from the old definition needs to be moved out and reimplemented. Probably named as high level api.

This also adds _single fn variants that do not take a slice, as a shorthand.

redis-code now generate a single module per type of implementation. The traits Commands and AsyncCommands are implemented in the commands and async_commands module. Cmd gets a additional implementation in the command module. The same holds for pipeline and cluster_pipeline. Token enum generation is WIP.

We generate a tuple based interface for block arguments now. This is over specific, but should provide a stricter API.

djc

Hey, this is looking pretty great! I think we can probably do a PR to the real project soon.

Can you make a list of open questions/issues that we should discuss?

djc · 2022-06-20T08:56:22Z

Cargo.toml

@@ -13,6 +13,12 @@ edition = "2018"
 all-features = true
 rustdoc-args = ["--cfg", "docsrs"]

+
+[workspace]
+members = [


Style nit: let's just put this on a single line?

djc · 2022-06-20T08:57:21Z

redis-codegen/Cargo.toml

+serde_path_to_error = {version = "0.1"}
+thiserror = "1.0"
+log = "0.4"
+ itertools = "0.10"


Not sure what's going on with the indentation here?

(Also, would be nice to have trailing newlines in all the TOML files.)

Yeah, the current state needs some more cleanup at various points. The commits are more or less just savepoints after a days worth of work at this point. There are some parts that changed drastically from commit 1 and I am thinking of squashing the commits for a reviewable PR.

Yes, I think squashing makes a lot of sense here.

djc · 2022-06-20T10:40:19Z

tests/generate.rs

+    }
+}
+
+const BASE_URI: &str = "https://raw.githubusercontent.com/redis/redis-doc/master/commands.json";


This seems to be unused. What's the reason to keep it here?

Similar to opentelemetry-rust I want to add a test that is ignored by default which downloads an up to date version of the commands.json. But that had low priority and therefore I have not implemented that yet.

djc · 2022-06-20T10:42:31Z

tests/generate.rs

+
+    // Compare the old version and new version and fail the test if they're different.
+
+    if versions[0] != versions[1] {


Can we just return if they're the same instead? Saves a level of indentation.

djc · 2022-06-20T10:45:15Z

redis-codegen/src/ident.rs

+        // the following keywords are not supported as raw identifiers and are therefore suffixed with an underscore.
+        "self" | "super" | "extern" | "crate" => ident += "_",


Is this actually an issue?

Not at this time. I can remove that.

Yeah, let's ditch those for now.

valkum · 2022-06-20T21:49:56Z

Hey, this is looking pretty great! I think we can probably do a PR to the real project soon.

Can you make a list of open questions/issues that we should discuss?

I will work on the oneof support and token creation tomorrow and will collect a list of open todos and questions after that.

Add allow deprecated to commands.rs Add docs to tokens.rs

This test is ignored by default similar to the sync_schemas command of opentelemetry-rust. This test can be used to update the commands.json

valkum · 2022-06-22T23:45:16Z

Alright. I implemented a first version of argument token generation. token.rs should now contain all oneof definitions and arguments that have a token associated with it.
The enums and structs there implement all ToRedisArgs and write the token to the connection before their value. Currently these are not enforced to use in the other generated APIs.

No to open TODOs:

Currently I parse everything from the commands.json but not everything is used, leading to some clippy warnings.
We do not utilize the command key spec currently. I have not found a usecase for this at the current state.
The old implement_commands!() macro generated some function bodies with std::mem::replace usage. I guess these were for move related behavior which a newer compiler is able to figure out by itself? At least I cannot come up with a need currently. I implemented the generation without std::mem:replace in the current version but left the old way in as a comment. If you agree, I can remove comment with the old way.
I implemented the basic functionality to generate alias commands to allow for some backwards compatibility. These get a deprecation tag and need the redis-rs version entered there.
Make the command* APIs aware of the now generated tokens for a true high level API.
We probably want to sanitize comments similar to prost right?
I am not 100% happy with the Box<dyn Generator> usage in code_generator/mods.rs. Maybe you have a better pattern in mind?
Do we want to allow for a better usability for the types in token.rs by adding traits like ToRedisString, ToRedisKey? These currently have underlying types based on the redis RESP (String, i64, f64)
The functions that return an iterator are currently handwritten. I hope that is alright.
I am not 100% happy with the way the modules are laid out in redis_codegen::code_generator. Maybe you have a cleaner idea?

djc · 2022-06-23T08:24:41Z

If you agree, I can remove comment with the old way.

Sounds good to me!

We probably want to sanitize comments similar to prost right?

Can you be more specific? I guess so but not exactly sure.

I am not 100% happy with the Box<dyn Generator> usage in code_generator/mods.rs. Maybe you have a better pattern in mind?

Can you make a review comment in the relevant spot with specific suggestions on why you're not happy yet?

Do we want to allow for a better usability for the types in token.rs by adding traits like ToRedisString, ToRedisKey? These currently have underlying types based on the redis RESP (String, i64, f64)

Adding traits how? I think this too would benefit from a review comment in context, so it's easier for me to review.

The functions that return an iterator are currently handwritten. I hope that is alright.

Sure, that's fine. Do you think we could propose a way upstream to make that information part of commands.json?

I am not 100% happy with the way the modules are laid out in redis_codegen::code_generator. Maybe you have a cleaner idea?

Here too, can you be more specific about your concerns?

valkum

The functions that return an iterator are currently handwritten. I hope that is alright.

Sure, that's fine. Do you think we could propose a way upstream to make that information part of commands.json?

I guess that could be added. Maybe a flag in the hints is enough? There is request_policy:special but it seems this just means there is some kind of special handling needed.

We probably want to sanitize comments similar to prost right?

Can you be more specific? I guess so but not exactly sure.

Currently we take the strings from the commands.json to render docs. Prost sanitizes urls (adds < and > and escapes [ and ]) See https://github.com/tokio-rs/prost/blob/master/prost-build/src/ast.rs#L144. I guess we should add this here as well for our docs.

I am not 100% happy with the way the modules are laid out in redis_codegen::code_generator. Maybe you have a cleaner idea?

Here too, can you be more specific about your concerns?

This is related to the files/modules in the code_generator directory. I guess we could move all generators (the ones with the _generator suffix) into a new directory/module to get this a little bit cleaner, but I guess this is purely cosmetic regarding discoverability and maintainability.

I hope the inline comments are more helpful.

valkum · 2022-06-23T21:57:15Z

redis-codegen/src/code_generator/mod.rs

+            kind: GenerationKind::Full,
+        };
+
+        let generation_unit: Box<dyn Generator> = match generation_type {


This is one point I am not particular happy with. The Box seem overkill for this match.
I guess if we match here we can just directly call the generate fn on these different generators.

valkum · 2022-06-23T22:04:19Z

src/generated/tokens.rs

+/// Redis Type: Filterby
+pub enum Filterby {
+    /// MODULE
+    Module(String),


Do we want to allow for a better usability for the types in token.rs by adding traits like ToRedisString, ToRedisKey? These currently have underlying types based on the redis RESP (String, i64, f64)

Adding traits how? I think this too would benefit from a review comment in context, so it's easier for me to review.

In this file in general we have types that just wrap some kind of base type. I am thinking of making this generic with a trait bound of ToRedisString so that I can put any type (that implements ToRedisString) in my code into this enum and don't have to convert it into a String first. Similar for the other types (i64, f64, and probably Keys which are currently Strings too).
But that might increase compilation times a lot as there are many types here and most of them are candidates to get a generic.

valkum · 2022-06-26T22:19:43Z

While shortly looking into adding the generated types as arguments to the API, I think I found some flaws in my current implementation. I currently check if we already generated a type by that name. I don't think I can assume uniqueness based on the commands.json. I will look into splitting the types into submodules base on the group of the command. For this the argument names or token names should be unique for each group. I probably have time on Tuesday and will look into that. I don't want to end up with a module per command, so we might need a smart way to avoid that when the arguments are not unique across a command group. We could add the command name itself to the name of the types, but I don't think that will result in a good usable API.

One quick example is the count argument from GEOSEARCHSTORE and from HRANDFIELD

This also fixes some logical flaws with the previous implementation. Previously when two types hat the same token name or in the worst case the same argument name, an identical type was created. This adds a TypeRegistry which creates a submodule when such an duplicate type is encountered. It also helps to resolve types when generating code. This solution is pretty regarding the names of modules, but at least this works for now.

valkum · 2022-06-30T01:55:54Z

I fixed the issue with duplicate types by adding a TypeRegistry which will create modules on the fly when it encounters a duplicate type name.
The API should use the generated types now and also fixes the problem with duplicate issued argument identifiers in the API functions.

I guess there is still space for improvement regarding ergonomic usage. Maybe use some heuristic to create mutliple functions for each command when there are mutual exclusive arguments. This would need some support in the command.json. Currently these are just marked as optional.

djc · 2022-06-30T10:59:08Z

I think this is very promising -- I guess my biggest question is how we might get to merge this. I think a key thing that's hard to see right now is how the API changes from the old version to the new version. Maybe it would be possible to somehow (semi-manually) generate a list of function signatures from the current source code and create a diff from that to the new function signatures?

valkum · 2022-07-03T21:50:13Z

I created a cleaned up overview of all functions in the Commands trait for the old and the new api.
Normal diff output is bad at showing the differences though. diff --word-diff is working better but not optimal. difftastic had a huge memory footprint and failed to produce output.

I created a gist with both versions so you can use diff tool you prefer yourself. I hope I did not delete any commands while cleaning the output up.
The gist can be found here: https://gist.github.com/valkum/ad83391b94e76c6a684136dae79ea278
The aligned versions should output better diffs. I aligned the commands by line by hand.

valkum · 2022-07-17T17:19:03Z

A short recap of my talks with upstream:
They are happy that someone is using their json spec. They propose to query the commands.json from a redis instance instead of using the hosted version from their repository. This is something we can definitely add and replace the download "test".

Their reply schema PR looks pretty good. It is based on JSON schema which should allow us to generate types for the responses.
The question of de-duplicating response types would be important there to. So maybe we can come up with a generic way to generate these types and solve name clashes. (as posted earlier having a module for each command would probably yield a lot of identical types and is a waste of computation time when compiling redis-rs).

I proposed some additions to their PR to improve the generation of a more ergonomic high level API. Sadly before we can utilize this we would need add support for RESP3.

I am awaiting some feedback on adding a cursor hint.

I have resolved the CLIENT KILL mis-documentation and we can remove that from our denylist.

Redis Maintainers asked to use the output of a working redis instance to generate the commands.json ourselves. This adds a built_commands_json module which is a more typesafe copy of the python script redis uses to generate the commands.json in the docs repo.

This adds support to automatically generate cursor based commands using an iterator result. This currently is not finished, as the command json generator is missing support for optional and multiple arguments.

The commands.json generation based on the redis-cli output now correctly computes subcommands.

valkum · 2022-09-05T23:28:21Z

I need to retouch the type registry. The generation of the types (and deduplication) seems to be nondeterministic currently. Probably caused by iterating over a HashMap.
After that I plan to update all the tests and examples and do some small finishing touches before I submit a squashed PR to the upstream repo for broader feedback.

djc · 2022-09-07T10:19:01Z

Note, upstream now has optional support for the RedisJSON module. Should figure out how that fits into this...

valkum added 3 commits June 11, 2022 23:12

Add support for multiple tag through slices

1693f92

This also adds _single fn variants that do not take a slice, as a shorthand.

valkum force-pushed the codegen branch from e066a58 to c2fda18 Compare June 16, 2022 18:28

Improve handling of block arguments

afde254

We generate a tuple based interface for block arguments now. This is over specific, but should provide a stricter API.

djc reviewed Jun 20, 2022

View reviewed changes

valkum added 7 commits June 23, 2022 01:22

Refactor Generator trait and some smaller related parts

b1b1fbd

Add struct and enum generated based on tokens and oneof arguments

2da46c3

Update generated files

944fba6

Remove some of the warning in generated files.

fc18c4d

Add allow deprecated to commands.rs Add docs to tokens.rs

Cleanup and add exception for rustfmt to generated files

e0cab8b

Add sync_commands_json test

a88af98

This test is ignored by default similar to the sync_schemas command of opentelemetry-rust. This test can be used to update the commands.json

Tidy up Cargo.toml and remove outdated Todo

15807e1

valkum commented Jun 23, 2022

View reviewed changes

valkum added 2 commits June 30, 2022 03:48

Start updating some of the tests

3206673

valkum added 4 commits July 25, 2022 00:35

Add initial overwrite support

a266465

This adds support to automatically generate cursor based commands using an iterator result. This currently is not finished, as the command json generator is missing support for optional and multiple arguments.

Add generated notice to generated files

48bd85e

Finish commands.json generation and update generated code

d7fa5bf

The commands.json generation based on the redis-cli output now correctly computes subcommands.

valkum force-pushed the codegen branch from 043f441 to d7fa5bf Compare September 5, 2022 23:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add initial version of redis-codegen #1

Add initial version of redis-codegen #1

valkum commented Jun 16, 2022

djc left a comment

djc Jun 20, 2022

djc Jun 20, 2022

valkum Jun 20, 2022

djc Jun 23, 2022

djc Jun 20, 2022

valkum Jun 20, 2022

djc Jun 20, 2022

djc Jun 20, 2022

valkum Jun 20, 2022

djc Jun 23, 2022

valkum commented Jun 20, 2022

valkum commented Jun 22, 2022

djc commented Jun 23, 2022

valkum left a comment

valkum Jun 23, 2022

valkum Jun 23, 2022

valkum commented Jun 26, 2022

valkum commented Jun 30, 2022

djc commented Jun 30, 2022

valkum commented Jul 3, 2022

valkum commented Jul 17, 2022 •

edited

valkum commented Sep 5, 2022

djc commented Sep 7, 2022 •

edited


		// Compare the old version and new version and fail the test if they're different.

		if versions[0] != versions[1] {

		// the following keywords are not supported as raw identifiers and are therefore suffixed with an underscore.
		"self" \| "super" \| "extern" \| "crate" => ident += "_",

Add initial version of redis-codegen #1

Are you sure you want to change the base?

Add initial version of redis-codegen #1

Conversation

valkum commented Jun 16, 2022

djc left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

valkum commented Jun 20, 2022

valkum commented Jun 22, 2022

djc commented Jun 23, 2022

valkum left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

valkum commented Jun 26, 2022

valkum commented Jun 30, 2022

djc commented Jun 30, 2022

valkum commented Jul 3, 2022

valkum commented Jul 17, 2022 • edited

valkum commented Sep 5, 2022

djc commented Sep 7, 2022 • edited

valkum commented Jul 17, 2022 •

edited

djc commented Sep 7, 2022 •

edited