Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

minifyAliases does not fully resolve PostgreSQL alias identifier length constraints #11937

Closed
2 of 5 tasks
Nick-Riggs opened this issue Feb 19, 2020 · 5 comments · Fixed by #11940
Closed
2 of 5 tasks
Labels
dialect: postgres For issues and PRs. Things that involve PostgreSQL (and do not involve all dialects). hard For issues and PRs. status: understood For issues. Applied when the issue is understood / reproducible. type: feature For issues and PRs. For new features. Never breaking changes.

Comments

@Nick-Riggs
Copy link

Nick-Riggs commented Feb 19, 2020

Issue Description

minifyAliases does not minify aliases used in the FROM clause, leading to PostgreSQL identifier length issues.

What are you doing?

We have a query with a long chain of includes to eager load. This lead to a query generated by sequelize that had some rather long aliases in the FROM clause. Some of these identifiers exceeded PostgreSQL's 63-byte length constraint, causing errors. We found the minifyAliases option, which seems to only minify aliases in the SELECT clause, not the FROM clause.

Example code

const Sequelize = require('sequelize');
const Model = Sequelize.Model;
const sequelize = new Sequelize({
    dialect: "sqlite",
    storage: "database.sqlite",
    minifyAliases: true
});
class User extends Model { }
User.init({
    name: Sequelize.STRING,
    email: Sequelize.STRING
}, {
    sequelize,
    modelName: "user_model_name_that_is_long_for_demo"
});
class Project extends Model { }
Project.init({
    name: Sequelize.STRING
}, {
    sequelize,
    modelName: "project_model_name_that_is_long_for_demo"
});
class Company extends Model { }
Company.init({
    name: Sequelize.STRING
}, {
    sequelize,
    modelName: "company_model_name_that_is_long_for_demo"
});
User.hasMany(Project);
Project.belongsTo(Company);
(async () => {
    await sequelize.sync({ force: true });
    await User.findAll({
        include: {
            model: Project,
            include: Company
        }
    });
})();

What do you expect to happen?

A query to be generated with all aliases mapped to identifiers that don't break PostgreSQL's length limit.

What is actually happening?

Long aliases in the FROM clause, see below:

SELECT "user_model_name_that_is_long_for_demo"."id",
       "user_model_name_that_is_long_for_demo"."name",
       "user_model_name_that_is_long_for_demo"."email",
       "user_model_name_that_is_long_for_demo"."createdAt",
       "user_model_name_that_is_long_for_demo"."updatedAt",
       "project_model_name_that_is_long_for_demos"."id"                                                  AS "_0",
       "project_model_name_that_is_long_for_demos"."name"                                                AS "_1",
       "project_model_name_that_is_long_for_demos"."createdAt"                                           AS "_2",
       "project_model_name_that_is_long_for_demos"."updatedAt"                                           AS "_3",
       "project_model_name_that_is_long_for_demos"."userModelNameThatIsLongForDemoId"                    AS "_4",
       "project_model_name_that_is_long_for_demos"."companyModelNameThatIsLongForDemoId"                 AS "_5",
       "project_model_name_that_is_long_for_demos->company_model_name_that_is_long_for_demo"."id"        AS "_6",
       "project_model_name_that_is_long_for_demos->company_model_name_that_is_long_for_demo"."name"      AS "_7",
       "project_model_name_that_is_long_for_demos->company_model_name_that_is_long_for_demo"."createdAt" AS "_8",
       "project_model_name_that_is_long_for_demos->company_model_name_that_is_long_for_demo"."updatedAt" AS "_9"
FROM "user_model_name_that_is_long_for_demos " AS "user_model_name_that_is_long_for_demo "
         LEFT OUTER JOIN "project_model_name_that_is_long_for_demos " AS "project_model_name_that_is_long_for_demos"
                         ON "user_model_name_that_is_long_for_demo"."id " =
                            "project_model_name_that_is_long_for_demos"."userModelNameThatIsLongForDemoId "
         LEFT OUTER JOIN "company_model_name_that_is_long_for_demos " AS "project_model_name_that_is_long_for_demos->company_model_name_that_is_long_for_demo "
                         ON "project_model_name_that_is_long_for_demos"."companyModelNameThatIsLongForDemoId " =
                            "project_model_name_that_is_long_for_demos->company_model_name_that_is_long_for_demo"."id";

Environment

  • Sequelize version: 5.21.4
  • Node.js version: v12.7.0
  • Operating System: Windows 10

How does this problem relate to dialects?

minifyAliases is really only necessary for PostgreSQL but works with any dialect.

  • I think this problem happens regardless of the dialect.

Would you be willing to resolve this issue by submitting a Pull Request?

  • Yes, I have the time and I know how to start.
  • Yes, I have the time but I don't know how to start, I would need guidance.
  • No, I don't have the time, although I believe I could do it if I had the time...
  • No, I don't have the time and I wouldn't even know how to start.
@mrrinot
Copy link
Contributor

mrrinot commented Feb 19, 2020

I looked into it this morning to remind myself why I didn't do this at the time.
It seems very much harder and trickier than just aliasing the fields the way I did it.

It would mean aliasing table names before aliasing fields because I use the table names to store field aliases (because multiple table can have identically named field).

One of our (@mattp94) original ideas was to alias every fields on model define/init to then replace them after the query executes.

I think it could work, I'll try to make it work and report back here but no promises.

@mrrinot
Copy link
Contributor

mrrinot commented Feb 19, 2020

As expected, aliasing the modelName is very easy, the problem comes when I want to revert it back.

I tried to bypass touching the aliasing part in the query construction by modifying the modelName in the Model.init() function.

It works of course, but after that your model is aliased forever, I don't think it's what people want.
Even if I managed to rename the keys in the dataValues returned by any query, the model names would still stay aliased throughout your project.

The solution is to add a new attribute to the model definition on the fly (something like automaticAlias).
The problem is that you would have to use that optional field everywhere you would use the modelName in the query construction AND in the results construction code only when minifyAliasesis set.

The code for both these parts is huge and very hard to understand and modify without potential unintended side-effects.

Also, this solution reduces the model names to 2 or 3 characters, so you could still surpass the 63 characters limit if you nest includes deep enough (around 13 nested includes)

@papb papb added dialect: postgres For issues and PRs. Things that involve PostgreSQL (and do not involve all dialects). hard For issues and PRs. status: understood For issues. Applied when the issue is understood / reproducible. type: feature For issues and PRs. For new features. Never breaking changes. labels Feb 19, 2020
@Nick-Riggs
Copy link
Author

@mrrinot Thank you for looking into it. I took some time and reviewed the code as well and you got a lot further than I did. I agree it's a difficult problem to tackle.

While some extreme cases would still hit the after being minified, I think it would help the problem a great deal. As I hit the problem after only 5 well named, but not overly verbose, associations.

Along the lines of some of the suggestions you made, I had already worked out a work-around for us for our particular problem query. I'll explain it for any that find themselves here before it's resolved (hopefully).

I ended up making copies of some of my associations, just with very small names. And using those associations in the query having the length issue.

So as an example from the code I posted above, something like this:

User.hasMany(Project);
Project.belongsTo(Company);

//copy of the associations above, just to use when hitting the postgresql limitation
User.hasMany(Project, { as: "p" });
Project.belongsTo(Company, { as: "c" });

(async () => {
    await sequelize.sync({ force: true });

    await User.findAll({
        //using the `as` option to ensure use of the copied association
        include: {
            as: "p",
            model: Project,
            include: {
                as: "c",
                model: Company
            }
        }
    });
})();

This results in SQL such as the following in the FROM clause:

`company_model_name_that_is_long_for_demos` AS `p->c`

Then, (I'm not sure this would be the case for everyone), I was able to assign the copied association back to the "real" association without side-effects (for me) on the instance. So that the rest of my code base wouldn't have to be updated to check for both. Such as:

user.project = user.p;
delete user["p"];

@mrrinot
Copy link
Contributor

mrrinot commented Feb 19, 2020

I was giving up when I thought: "hey you don't need to alias each individual model just the JOIN aliases on includes" because single model names should never go over 63 characters (at least it's easy to fix in this case).

So I tracked down where includes are aliased in the code: here.

I tried simply aliasing the asRight attributes in hope to replace it globally later in the SQL string.

I used your example to test it:

  "project_model_name_that_is_long_for_demos"."companyModelNameThatIsLongForDemoId"                 AS "_5",
  "project_model_name_that_is_long_for_demos->company_model_name_that_is_long_for_demo"."id"        AS "_6",

First problem was that in this example, the second join alias uses the first join alias.

Here project_model_name_that_is_long_for_demos was replaced by %0 and in the same replace the line below became: %0->company_model_name_that_is_long_for_demo instead of %1.

To prevent this, I reversed the aliases order and the %1was then replaced first. I think the optimal way to do this is to order aliases by length to prevent collision like that (replacing the longest one first always works)

This fixed your test so I then tried the solution with every other test in the suite.

One of them tests belongsToMany association like so

ctx.Article.belongsToMany(ctx.Label, { through: 'ArticleLabels' });

Without an as: "" option the alias is the same as the model name, and thus the generated SQL is:
INNER JOIN "ArticleLabels" AS "ArticleLabels"

This caused the global replace to produce INNER JOIN "%0" AS "%0" which doesn't work.

To counteract this (and also for performance, I guess ?) I thought to only shorten a JOIN alias if it surpasses the 63 character limit so that it is only used on long includes.

I'll try to open a merge request for people to look

@mrrinot
Copy link
Contributor

mrrinot commented Feb 19, 2020

Here is the MR #11940 if anyone wants to look at it and review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dialect: postgres For issues and PRs. Things that involve PostgreSQL (and do not involve all dialects). hard For issues and PRs. status: understood For issues. Applied when the issue is understood / reproducible. type: feature For issues and PRs. For new features. Never breaking changes.
Projects
None yet
3 participants