Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define a naming convention for each language #20

Open
jonaslagoni opened this issue Oct 8, 2021 · 13 comments
Open

Define a naming convention for each language #20

jonaslagoni opened this issue Oct 8, 2021 · 13 comments

Comments

@jonaslagoni
Copy link
Member

Many languages have conventions and strict ways you name either properties, classes, etc. And almost all languages won't accept a property such as {"properties": {"some weird €&# property name": {}}.

So we need to define all the conventions that will be followed, so it is clear what to use in which case.

This is needed to provide an accurate test runner for the test suite.

@jdesrosiers
Copy link
Member

Maybe we don't need to define this. Each implementation will build their own test runner that runs the test suite against their implementation and can define their own naming conventions. The test suite just needs to assert that the some weird €&# property name property is represented in the object some how. The implementation's test runner would then do something like:

const propertyName = "some weird €&# property name";
const sanitizedName = sanitizeName(propertyName);
return hasPropertyName(type, sanitizedName);

@jonaslagoni
Copy link
Member Author

jonaslagoni commented Oct 19, 2021

That could be an option.

I just feel like we leave too much for the implementors to figure out, so IMO this needs to be described in a way so the outputs will ways be compilable and consistent for the end-user.

Some of the naming rules we have encountered so far:

  1. Property names and data model names cannot contain a number as the first character, so we need to define behavior when that is encountered: {"properties": {"12Prop": {}}
  2. The property name cannot be the same as the data model it belongs to: {"name": "PropClass", "properties": {"PropClass": {}}
  3. Property names and data model names cannot be reserved keyword names: {"properties": {"return": {}}, {"name": "return"}
  4. What happens if the property name contains special cases (including space ): {"properties": {"some prop !"#€%&/()=": {}},
  5. If a property is renamed, we must make sure that it does not clash with the already existing property name (say we append number to the property name when a number is first char): {"properties": {"12Prop": {}, "number12Prop}.

These are just some of the rules we have encountered and had to find a behavior for. Would be great if this was a standard behavior we could follow... As the output for the end-user is more predictable.

@chriskapp
Copy link

Ok, I think you could add a general recommendation to use only normal ASCII characters for identifiers but in the end the rules which characters are allowed for a specific target language should be always handled by the code generator, since this is the only place where we know the exact rules which characters are allowed. Iam also not sure whether this is something which must be handled in a test suite, since this is basically an implementation detail.

@jdesrosiers
Copy link
Member

I just feel like we leave too much for the implementors to figure out, so IMO this needs to be described in a way so the outputs will ways be compilable and consistent for the end-user.

Different languages have different naming conventions (camelCase vs PascalCase vs underscore_case vs snake-case). We can't define a rule for every language, but you've convinced me that names should be generated that are consistent with a language's standard convention. That means we have to allow some some leeway for implementations to transform names how they see fit to match their target language's style.

What if we provide recommendations for dealing with certain issues, but not make it a hard requirement? For example, if the target language doesn't support numbers in a name, the number should be replaced with the spelled out version of the number. The implementation doesn't need to do it if the language is fine with numbers in names, but if they choose the alter the name, there's a clear recommendation about how to do it.

Iam also not sure whether this is something which must be handled in a test suite, since this is basically an implementation detail.

I too see this as an implementation detail, but I also see the value in providing implementations some guidance.

@jonaslagoni
Copy link
Member Author

jonaslagoni commented Oct 20, 2021

Ok, I think you could add a general recommendation to use only normal ASCII characters for identifiers but in the end the rules which characters are allowed for a specific target language should be always handled by the code generator, since this is the only place where we know the exact rules which characters are allowed

@chriskapp recommendation does not change that users will provide it 🙂 It will happen either way, and we at least need some guidance how to best handle such cases when you encounter it 😄

What if we provide recommendations for dealing with certain issues, but not make it a hard requirement? For example, if the target language doesn't support numbers in a name, the number should be replaced with the spelled-out version of the number. The implementation doesn't need to do it if the language is fine with numbers in names, but if they choose the alter the name, there's a clear recommendation about how to do it.

I like this a lot!

I get that simply forcing a naming convention will be impossible and in many cases, it comes down to preference as well. But explaining how to handle these issues, would be helpful for everyone, of course, if you want another convention that's up to you, but at least we give you some recommendations on how it can be dealt with.

So for the following problems, what would you folks recommend they are handled? #20 (comment)

  1. I like your suggestion @jdesrosiers that the number should be replaced with the spelled-out version. i.e. 12Prop becomes TwelveProp. We could also just do it with the first character so it becomes One2Prop 🤔
  2. Honestly, I don't see any other way than prepending/postpending something like Reserved (or any other). I.e. similar to Define how to handle reserved keywords #18
  3. see Define how to handle reserved keywords #18
  4. One way could be to remove special characters entirely... However, if you have two properties such as prop and prop+, these two will get the same property name, which I don't think is a usable solution. So maybe we can do that same thing as issue 1? 🤔 There is quite a lot of special characters though, and mapping them is not easy if not impossible.
  5. Same as issue 2.

@jonaslagoni
Copy link
Member Author

jonaslagoni commented Oct 21, 2021

Stumbled upon this blog post by QuickType that explains their process (to some extend) how they handle it - https://blog.quicktype.io/little-big-detail-1-perfect-property-names/

@yordis
Copy link

yordis commented Dec 20, 2021

OpenAPI Operation ID suffers from the same problem.

I wouldn't recommend going with language-specific support, the problem most of the time is when people come up with characters that make it harder to convert to a particular casing correctly. For example, allowing acronyms, in combination with PascalCase and or camelCase.

Reasons why we decided to enforce to use /^[a-z]([a-z][0-9]_)+$/ (or something around those lines, forgot the regex value) and you can decide to the casing you want. Everything is lowercased for simplifying and everything is separated by _.

@jonaslagoni
Copy link
Member Author

jonaslagoni commented Jan 6, 2022

I wouldn't recommend going with language-specific support, the problem most of the time is when people come up with characters that make it harder to convert to a particular casing correctly. For example, allowing acronyms, in combination with PascalCase and or camelCase.

@yordis do you mind clarifying this a bit more, I don't think I follow?

Reasons why we decided to enforce to use /^[a-z]([a-z][0-9]_)+$/ (or something around those lines, forgot the regex value) and you can decide to the casing you want. Everything is lowercased for simplifying and everything is separated by _.

Enforcing is good when you have control of the schema, cause that way you always end up with the expected name 😄 However, as JSON Schema doesn't have any enforced rules, we can't either.

So I think the solution is to offer a well-defined set of naming conventions that can be followed if you cannot control the input JSON schema (for example when you cant enforce linting).

I think we can give the solution of enforcing namings in those cases where it fit's 🙂

@yordis
Copy link

yordis commented Mar 1, 2022

@jonaslagoni I get your perspective, I just think you will be taken too much of a burden, and hopefully will not halt the execution of the proposal.

It is quite hard to know every programing language out there today and tomorrow, in order to make a good judgment.

Especially given programming languages such as Clojure that uses dashes in their function names as a convention in some projects.

I would stay away from anything related to allowed values of any kind and let the developers decide on that matters.

Tackling such a problem is a huge task and I think what is valuable from the spec is having a reserved key to work around it.

:2cents:

@jonaslagoni
Copy link
Member Author

It is quite hard to know every programing language out there today and tomorrow, in order to make a good judgment.

Definitely, and I don't expect we will be able to define it for every language for this proposal to form 🙂 However, this is something that can be slowly added in order to work together to reduce time to implementation.

If you are a tool developer and want to create a generator, figuring all of these things out would be a mess 😅 So I would say collaborating on this is paramount!

Of course, nothing is forced, you can have your own interpretation steer from the suggestions, up to you 🙂 But that hurt the consistency for users as using two tools provide different outcomes.

@gregsdennis
Copy link
Member

I'm perfectly happy with merely stating that implementations need to be able to handle certain cases (perhaps with examples) and leaving it to them (who we expect should be the experts in their chosen language) to decide exactly how they're handled.

I don't think this needs to be specified for any particular languages.

As a potential implementor, I certainly would not want this specification mandating how I name properties or types in C#. There are well-known conventions already in place that I would follow, and if there are conflicts between this spec and those conventions, I will use the conventions as it's what my user base would expect.

@jonaslagoni
Copy link
Member Author

As a potential implementor, I certainly would not want this specification mandating how I name properties or types in C#. There are well-known conventions already in place that I would follow, and if there are conflicts between this spec and those conventions, I will use the conventions as it's what my user base would expect.

Agreed, the naming conventions should definitely not be seen as a mandate, but instead, something that helps you be aware of cases you can encounter and how to mitigate them with suggestions based on how others have solved them.

I.e. knowledge sharing as much as possible 🙂

@yordis
Copy link

yordis commented Jul 27, 2023

https://github.com/stripe/openapi uses x-resourceId as the "schema id" concept we discussed.

By now, the schema identity for code-gen sake is primarily given.

image

It should be a simple enough string for people to generate most SDKs. Leave aside how a programming language would prefer to implement it, but suggest avoiding weird values.

Could it be a reverse-DNS name to reflect global uniqueness to some extent (in case you reuse types from even external specs)? But still, I would keep it loose and see how people leverage it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants