Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spec unclear regarding malformed sql comments #57

Open
Naveenaidu opened this issue Mar 28, 2021 · 2 comments
Open

Spec unclear regarding malformed sql comments #57

Naveenaidu opened this issue Mar 28, 2021 · 2 comments

Comments

@Naveenaidu
Copy link

Naveenaidu commented Mar 28, 2021

@adamegyed @aabmass While reading through the specification of the sqlcommenter, specifically the Comment Escaping section. It states that:

If a comment already exists within a SQL statement, we MUST NOT mutate that statement.

The above statement is clear for the scenarios when we have properly formed comment such as:

SELECT * from FOO -- this is the comment
SELECT * from FOO /* this is the comment */

But the spec does not state what happens when we have malformed comments, such as:

SELECT * from FOO /* this is the comment
SELECT * from FOO *//* this is the comment 

Looking at the implementation of sqlcommenter in the provided languages, there seem to be discrepancies. Please do correct me if that is wrong.

Java

  private Boolean hasSQLComment(String stmt) {
    if (stmt == null || stmt.isEmpty()) {
      return false;
    }

    // Perhaps we now have the closing comment or not but that doesn't matter
    // as the SQL should be reported as having an opening comment regardless
    // of it is in properly escaped or not.
    return stmt.contains("--") || stmt.contains("/*");
  }

In this implementation as soon as we see that there is a opening comment tag, we return the SQL statement as it is.

References:

  1. Test cases

NodeJS

exports.hasComment = (sql) => {

    if (!sql)
        return false;

    // See https://docs.oracle.com/cd/B12037_01/server.101/b10759/sql_elements006.htm
    // for how to detect comments.
    const indexOpeningDashDashComment = sql.indexOf('--');
    if (indexOpeningDashDashComment >= 0)
        return true;

    const indexOpeningSlashComment = sql.indexOf('/*');
    if (indexOpeningSlashComment < 0)
        return false;

    // Check if it is a well formed comment.
    const indexClosingSlashComment = sql.indexOf('*/');
    
    /* c8 ignore next */
    return indexOpeningSlashComment < indexClosingSlashComment;
}

In this implementation we return false if there is a malformed comments which is totally opposite to what the implementation of java does.. That means, we do append the comment created by the library to the SQL statement.

References:

  1. Test cases

Python and Ruby

I was unable to find the place where, we check if the SQL statement has any comments or not. Does this mean that the implementation in these languages, we append the comment created by library irrespective of if there is a comment present in the statement. I just had a cursory look in the library, so please do correct me if I'm wrong.


Keeping in consideration the above thing, the question would be:

What to do when we have a malformed comment in the SQL statement, do we ignore the statement and return it as it is OR do we append the comment to the SQL statement?

@Naveenaidu
Copy link
Author

There another question regarding the spec:

The URL encoding is different especially in the implementation of python library.

def url_quote(s):
    if not isinstance(s, (str, bytes)):
        return s
    quoted = url_quote_fn(s)
    # Since SQL uses '%' as a keyword, '%' is a by-product of url quoting
    # e.g. foo,bar --> foo%2Cbar
    # thus in our quoting, we need to escape it too to finally give
    #      foo,bar --> foo%%2Cbar
    return quoted.replace('%', '%%')

As you see above, there are escaping the % sign after the url-encoding. This does not happen in other language implementation. So I'm not sure why they used this approach in python.

When again reading through the python documentation of the sqlcommenter, I find the following:

In the example of the SQLAlchemy section, the comment to the SQL statement is as follows:

2019-05-28 11:52:06.527 PDT [64087] LOG:  statement: SELECT * FROM polls_question
/*db_driver='psycopg2',framework='sqlalchemy%3A1.3.4',
traceparent='00-5bd66ef5095369c7b0d1f8f4bd33716a-c532cb4098ac3dd2-01',
tracestate='congo%%3Dt61rcWkgMzE%%2Crojo%%3D00f067aa0ba902b7'*/

If you notice the value tracestate key in the above comment you would notice that we do %%3D instead of %3D, but for the other key framework whose value is sqlalchemy%3A1.3.4 we do not escape the % sign. Why do we escape the % sign in the tracestate and why not in framework

Where as

In the example of the Psycopg2 section, the comment to the SQL statement is as follows:

2019-05-28 02:33:25.287 PDT [57302] LOG:  statement: SELECT * FROM
polls_question /*db_driver='psycopg2%%3A2.8.2%%20%%28dt%%20dec%%20pq3%%20ext%%20lo64%%29',
dbapi_level='2.0',dbapi_threadsafety=2,driver_paramstyle='pyformat',
libpq_version=100001,traceparent='00-5bd66ef5095369c7b0d1f8f4bd33716a-c532cb4098ac3dd2-01',
tracestate='congo%%3Dt61rcWkgMzE%%2Crojo%%3D00f067aa0ba902b7'*/

If you notice the comment here, all the % in the URL encoded value are escaped by % value. This behaviour is different than the one in SQLAlchemy.

This makes me ask the question:

  1. Why are there two different implementations of how to URL encode the values in python libraries. And why don't we escape the % in other libraries?
  2. If I were to implement the spec should I follow this implementation and escape the % symbol in the URL encoded values OR follow the spec and just URL encode the values and not escape the & symbol.

@aabmass
Copy link
Member

aabmass commented Apr 15, 2021

@omirho, can you comment on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants