PSL: define the grammar of string literals #4167

Closed
tomhoule opened this issue May 6, 2020 · 11 comments · Fixed by prisma/prisma-engines#2996
@tomhoule
Contributor

tomhoule commented May 6, 2020

This is mostly an issue in @default("some string here") directives.

At the moment, escaping has an effect on parsing, but not on migrations, nor on what gets inserted by the query engine. The default string the query engine will insert is exactly what is inside the quotes in the schema, including the backslashes used for escaping.

Illustration

Given the following schema:

datasource db {
  provider = "sqlite"
  url      = "file:dev.db"
}

generator client {
  provider = "prisma-client-js"
}

model User {
  id         Int    @id
  name       String @default("Jean Claude")
  secondName String @default("Jean \" \n Claude")
}

if you create one user with just the id, leaving the fields that have defaults to be filled in by Prisma, the response will be (I tested this on both Postgres and SQLite):

{
  "data": {
    "createOneUser": {
      "id": 6,
      "name": "Jean Claude",
      "secondName": "Jean \\\" \\n Claude"
    }
  }
}
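To make the mismatch concrete, here is a small Python check (an illustration only, not Prisma code) that decodes the `secondName` value from the response above and compares it with the raw text between the quotes in the schema. The variable names are made up for this sketch.

```python
import json

# Raw characters between the quotes of @default("...") for secondName.
schema_literal_body = r'Jean \" \n Claude'

# The secondName value exactly as it appears in the JSON response above.
response_value = r'"Jean \\\" \\n Claude"'

# Decoding the response shows what was actually stored in the database:
# the backslashes from the schema are still there.
stored = json.loads(response_value)

# For comparison: what a JSON-style reading of the schema literal would give,
# i.e. a real double quote and a real newline.
intended = json.loads('"' + schema_literal_body + '"')

print(repr(stored))
print(repr(intended))
```

The stored value is character for character the schema text, backslashes included, which is the behavior described at the top of this issue.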

Proposed solution

Similar to most languages and formats with string literals (e.g. JSON), define \ as an escaping character that will be removed from the string during parsing.

Some languages allow a \ before any character, some (like JSON) only as the first token of a recognized escape sequence.

Regarding escape sequences: some string literals cannot be expressed as literals within the current schema parser. The most common example is probably strings containing a newline. Therefore common escape sequences, like \n, could be a worthwhile addition. Same goes for a "bare" double quote.
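A minimal sketch of what that unescaping step could look like, in Python for brevity. This is not the engine's implementation: the function name `unescape`, the `strict` flag, and the error messages are all assumptions made for illustration. `strict=True` follows the JSON approach (only recognized escape sequences are accepted), while `strict=False` follows the "backslash before any character" approach. Surrogate pairs in `\uXXXX` escapes are deliberately not handled.

```python
# Escape sequences recognized by JSON, besides \uXXXX.
SIMPLE_ESCAPES = {
    '"': '"', "\\": "\\", "/": "/",
    "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t",
}
HEX_DIGITS = "0123456789abcdefABCDEF"


def unescape(body: str, strict: bool = True) -> str:
    """Remove escaping backslashes from the text between the quotes."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError(f"dangling backslash at offset {i}")
        nxt = body[i + 1]
        if nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 2
        elif nxt == "u":
            digits = body[i + 2:i + 6]
            if len(digits) != 4 or any(c not in HEX_DIGITS for c in digits):
                raise ValueError(f"invalid unicode escape at offset {i}")
            out.append(chr(int(digits, 16)))
            i += 6
        elif strict:
            raise ValueError(f"unknown escape \\{nxt} at offset {i}")
        else:
            # Lenient mode: the backslash is dropped, the character is kept.
            out.append(nxt)
            i += 2
    return "".join(out)
```

With this sketch, `unescape(r'Jean \" \n Claude')` yields a string containing a bare double quote and a real newline, which is what the `secondName` default in the illustration was presumably meant to express.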

Related issues

#1888

@tomhoule
Contributor Author

tomhoule commented May 6, 2020

I am wondering if this is spec material, or something engineering should take care of and have documented.

@tomhoule
Contributor Author

There is initial work towards addressing the most important limitations here: prisma/prisma-engines#738

I will update this issue once that is reviewed and merged.

@tomhoule
Contributor Author

Current state with prisma/prisma-engines#738: the only changes from the previous state (no escaping implemented) are that \" in a schema string literal is translated to just " (so you can have double quotes inside string literals), and \n is interpreted as a newline character.
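As a rough model of that interim behavior (a Python sketch, not the code from the PR): only the two sequences named above are translated, and every other character passes through unchanged. The function name is hypothetical, and the pass-through handling of other backslash sequences is an assumption based on the description above rather than on the PR itself.

```python
def unescape_interim(body: str) -> str:
    """Translate only \\" and \\n; leave everything else untouched."""
    out = []
    i = 0
    while i < len(body):
        pair = body[i:i + 2]
        if pair == '\\"':
            out.append('"')   # escaped double quote
            i += 2
        elif pair == "\\n":
            out.append("\n")  # newline escape
            i += 2
        else:
            # Assumption: any other character, including a backslash that
            # starts an unrecognized sequence, is kept verbatim.
            out.append(body[i])
            i += 1
    return "".join(out)
```

Under this model `\t` in a schema literal would still reach the database as a backslash followed by `t`, which is the kind of gap a full grammar would close.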

@janpio janpio transferred this issue from prisma/specs Sep 12, 2020
@tomhoule tomhoule transferred this issue from prisma/migrate Nov 5, 2020
@tomhoule
Contributor Author

tomhoule commented Nov 5, 2020

Moving this issue to prisma/prisma since it is about the Prisma Schema Language. This is not critical for the migration engine at the moment, but it would be nice to have a grammar.

@matthewmueller
Contributor

matthewmueller commented Nov 5, 2020

Agreed. Here's a non-official grammar, but probably a good starting point: https://github.com/prisma/prismafile/blob/master/src/parser/index.pegjs

It supports escape sequences in the same way that JSON does: https://github.com/prisma/prismafile/blob/23098e1febebf0aa6ca790aff41faac71dcb233a/src/parser/index.pegjs#L392

Currently it drives the prisma/upgrade tool.

@janpio janpio added the kind/feature A request for a new feature. label Nov 6, 2020
@pantharshit00 pantharshit00 added team/client Issue for team Client. team/schema Issue for team Schema. and removed team/client Issue for team Client. labels Apr 23, 2021
@tomhoule
Contributor Author

Relevant: prisma/prisma-engines#1973

@Ultra-Instinct-05

Is this unofficial grammar available anywhere for viewing? Right now, https://github.com/prisma/prismafile/blob/master/src/parser/index.pegjs leads to a 404 (which probably means the repo doesn't exist or is private).

@tomhoule tomhoule self-assigned this Jun 20, 2022
@janpio janpio added this to the 4.0.0 milestone Jun 22, 2022
tomhoule added a commit to prisma/prisma-engines that referenced this issue Jun 22, 2022
Simplify pest grammar of string literals

We want to tokenize invalid escape sequences and report them as such
later in validation.

This results in better error messages across the board.

closes prisma/prisma#4167
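The idea in this commit message can be sketched in two phases. The snippet below is a Python illustration, not the pest grammar or the Rust validation code from prisma-engines; the names `TOKEN`, `tokenize`, and `validate` and the error wording are assumptions. The tokenizer accepts a backslash followed by any character, so a bad escape never aborts parsing, and a later validation pass reports each invalid sequence with its offset.

```python
import re

# Permissive token pattern: a complete \uXXXX escape, a backslash followed by
# ANY character, a lone trailing backslash, or a run of ordinary characters.
TOKEN = re.compile(r'\\u[0-9a-fA-F]{4}|\\.|\\|[^\\]+', re.DOTALL)
VALID_SIMPLE = set('"\\/bfnrt')


def tokenize(body):
    """Phase 1: split the literal body into (offset, text) tokens."""
    return [(m.start(), m.group()) for m in TOKEN.finditer(body)]


def validate(tokens):
    """Phase 2: collect (offset, message) pairs for invalid escapes."""
    errors = []
    for offset, text in tokens:
        if not text.startswith("\\"):
            continue  # ordinary characters
        if len(text) == 6 and text[1] == "u":
            continue  # well-formed \uXXXX
        if len(text) == 2 and text[1] in VALID_SIMPLE:
            continue  # recognized simple escape
        errors.append((offset, f"unknown escape sequence {text!r}"))
    return errors
```

Because every escape is its own token with a known offset, the validator can point at the exact sequence that is wrong instead of failing with a generic parse error for the whole literal.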
@Ultra-Instinct-05

Is this unofficial grammar present anywhere for viewing ? Right now, https://github.com/prisma/prismafile/blob/master/src/parser/index.pegjs leads to a 404 (which probably means the repo doesn't exist or is private)

@tomhoule Sorry for the ping, but is there any update on this? Is the PSL grammar file available for others to view?

@tomhoule
Contributor Author

Hi @Ultra-Instinct-05 , sorry I missed your previous comment.

The grammar used inside Prisma is https://github.com/prisma/prisma-engines/blob/main/libs/datamodel/schema-ast/src/parser/datamodel.pest — note however that we apply a bunch of validations on top, so this file defines a grammar that is more permissive than what we actually allow (for example: trailing commas in attributes). Notably, the new grammar for string literals is meant to be an exact implementation of the grammar for string literals from the JSON spec (https://datatracker.ietf.org/doc/html/rfc8259).
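For readers who want the string rule without opening the pest file, here is the RFC 8259 string grammar transcribed into a Python regular expression. This is a hand-written transcription of section 7 of the RFC, not something extracted from datamodel.pest, and the helper name `is_valid_string_literal` is made up for this sketch.

```python
import re

# RFC 8259, section 7:
#   string    = quotation-mark *char quotation-mark
#   char      = unescaped / escape ( " \ / b f n r t / uXXXX )
#   unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
JSON_STRING = re.compile(
    r'"(?:'
    r'[^"\\\x00-\x1f]'        # unescaped: anything but ", \ and control chars
    r'|\\["\\/bfnrt]'         # simple two-character escapes
    r'|\\u[0-9a-fA-F]{4}'     # \u followed by exactly four hex digits
    r')*"'
)


def is_valid_string_literal(source: str) -> bool:
    """Return True if `source` (quotes included) is a valid JSON string."""
    return JSON_STRING.fullmatch(source) is not None
```

Note that a raw newline inside the quotes is rejected by this grammar, so a multi-line default has to be written with `\n`, as in the `secondName` example at the top of the thread.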

@tomhoule
Contributor Author

There is no spec-style, well-defined grammar beyond what the implementation defines, but if this is something you would like to have, the best course of action would be to open an issue so we can track interest in it and make prioritization decisions.

@Ultra-Instinct-05

@tomhoule I maintain a Sublime Text package to highlight schema files https://github.com/Sublime-Instincts/PrismaHighlight. It would be nice if there was documentation on the official grammar, so that the package highlights stuff correctly. Right now it works, but I just looked at various example schema files to make the package.

I am pretty sure even VS Code can take advantage of it 🙂
