Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Left join and group by on nullable field returns runtime UnexpectedNullError error - PostgreSQL, query macros #3164

Open
BenJeau opened this issue Mar 31, 2024 · 5 comments
Labels

Comments

@BenJeau
Copy link

BenJeau commented Mar 31, 2024

Bug Description

When doing a left join between two tables on a field that is nullable and performing a group by on the joined table field, a UnexpectedNullError is returned

Minimal Reproduction

The following is a simplified reproduction from my current project:

CREATE TABLE "providers" (
    "id" BIGINT PRIMARY KEY,
    "name" TEXT NOT NULL
);

CREATE TABLE "sources" (
    "id" BIGINT PRIMARY KEY,
    "provider_id" INTEGER REFERENCES "providers" ("id")
);

INSERT INTO "providers" ("name") VALUES ('github');
INSERT INTO "sources" ("provider_id") VALUES (1), (1), (NULL);

The following query within SQLx errors but should return valid data:

sqlx::query!(
        r#"SELECT count(*), providers.id, providers.name
FROM sources
LEFT JOIN providers ON providers.id = sources.provider_id
GROUP BY providers.id;"#
    )
    .fetch_all(pool)
    .await;

Value of fetch_all return:

Err(
    ColumnDecode {
        index: "1",
        source: UnexpectedNullError,
    },
)

Query data using an external DB tool:

[
  {
    "count": 1,
    "id": null,
    "name": null
  },
  {
    "count": 2,
    "id": 1,
    "name": "google"
  }
]

I alter the query to reference the original table only, everything works:

sqlx::query!(
        r#"SELECT count(*), sources.provider_id
FROM sources
LEFT JOIN providers ON providers.id = sources.provider_id
GROUP BY sources.provider_id;"#
    )
    .fetch_all(pool)
    .await;

Value of fetch_all return:

Ok(
    [
        Record {
            count: Some(
                1,
            ),
            provider_id: None,
        },
        Record {
            count: Some(
                2,
            ),
            provider_id: Some(
                1,
            ),
        },
    ]
)

But since I want the other field (name) from the other table, I cannot/do not want to do the last query.

Info

  • SQLx version: 0.7.4
  • SQLx features enabled: ["chrono", "json", "macros", "postgres", "runtime-tokio-native-tls", "uuid"]
  • Database server and version: PostgreSQL 16.2 (running within docker with the postgres:16.2-alpine3.19 image)
  • Operating system: MacOS 14.4.1
  • rustc --version: rustc 1.77.0 (aedd173a2 2024-03-17) (running within docker with the rust:1.77.0-slim-bookworm image)
@BenJeau BenJeau added the bug label Mar 31, 2024
@BenJeau
Copy link
Author

BenJeau commented Mar 31, 2024

I have tested with 0.7.4, 0.7.0 and 0.6.3 and they all have this issue.

@abonander
Copy link
Collaborator

Please provide the output of EXPLAIN(VERBOSE, FORMAT JSON) <query>

@BenJeau
Copy link
Author

BenJeau commented Mar 31, 2024

I've also tried the query and query_as version that is not a macro and it is returning the data as expected with no errors.


Here's the query explained as requested:

[
  {
    "Plan": {
      "Node Type": "Aggregate",
      "Strategy": "Hashed",
      "Partial Mode": "Simple",
      "Parallel Aware": false,
      "Async Capable": false,
      "Startup Cost": 88.43,
      "Total Cost": 101.13,
      "Plan Rows": 1270,
      "Plan Width": 44,
      "Output": ["count(*)", "providers.id", "providers.name"],
      "Group Key": ["providers.id"],
      "Planned Partitions": 0,
      "Plans": [
        {
          "Node Type": "Hash Join",
          "Parent Relationship": "Outer",
          "Parallel Aware": false,
          "Async Capable": false,
          "Join Type": "Left",
          "Startup Cost": 38.58,
          "Total Cost": 77.13,
          "Plan Rows": 2260,
          "Plan Width": 36,
          "Output": ["providers.id", "providers.name"],
          "Inner Unique": true,
          "Hash Cond": "(sources.provider_id = providers.id)",
          "Plans": [
            {
              "Node Type": "Seq Scan",
              "Parent Relationship": "Outer",
              "Parallel Aware": false,
              "Async Capable": false,
              "Relation Name": "sources",
              "Schema": "public",
              "Alias": "sources",
              "Startup Cost": 0.00,
              "Total Cost": 32.60,
              "Plan Rows": 2260,
              "Plan Width": 4,
              "Output": ["sources.id", "sources.provider_id"]
            },
            {
              "Node Type": "Hash",
              "Parent Relationship": "Inner",
              "Parallel Aware": false,
              "Async Capable": false,
              "Startup Cost": 22.70,
              "Total Cost": 22.70,
              "Plan Rows": 1270,
              "Plan Width": 36,
              "Output": ["providers.id", "providers.name"],
              "Plans": [
                {
                  "Node Type": "Seq Scan",
                  "Parent Relationship": "Outer",
                  "Parallel Aware": false,
                  "Async Capable": false,
                  "Relation Name": "providers",
                  "Schema": "public",
                  "Alias": "providers",
                  "Startup Cost": 0.00,
                  "Total Cost": 22.70,
                  "Plan Rows": 1270,
                  "Plan Width": 36,
                  "Output": ["providers.id", "providers.name"]
                }
              ]
            }
          ]
        }
      ]
    }
  }
]

@BenJeau BenJeau changed the title Left join and group by on nullable field returns runtime UnexpectedNullError error - PostgreSQL Left join and group by on nullable field returns runtime UnexpectedNullError error - PostgreSQL, query macros Mar 31, 2024
@spencerbart
Copy link

I've run into this issue a couple times

@spencerbart
Copy link

spencerbart commented Apr 5, 2024

A column should be nullable during left joins that but the generated query-<hash>.json marks the column as nullable false and then I get the error ColumnDecode { index: "7", source: UnexpectedNullError }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants