Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Count returns incorrect results for multiple primary keys #5989

Closed
dsbert opened this issue May 1, 2020 · 2 comments · Fixed by #6870
Closed

Count returns incorrect results for multiple primary keys #5989

dsbert opened this issue May 1, 2020 · 2 comments · Fixed by #6870
Assignees

Comments

@dsbert
Copy link
Contributor

dsbert commented May 1, 2020

When using an entity with multiple primary keys, count() can return an incorrect result.

The current behavior of count with primary keys is to concat the two keys together and then count the distinct result of that. If the primary keys are of the same data type and can contain the same values, this will result in an incorrect count. See below for an example.

There are other reasons why the current method of concat is not desirable, particular regarding performance as it causes indexes to be ignored.

Some related issues

Issue type:

[ ] question
[X] bug report
[ ] feature request
[ ] documentation issue

Database system/driver:

[ ] cordova
[ ] mongodb
[X] mssql
[ ] mysql / mariadb
[ ] oracle
[ ] postgres
[ ] cockroachdb
[ ] sqlite
[ ] sqljs
[ ] react-native
[ ] expo

TypeORM version:

[X] latest
[ ] @next
[ ] 0.x.x (or put your version here)

Steps to reproduce or a small repository showing the problem:

The following example incorrectly returns a count of 1 when there are 2 records

@Entity()
export class MyEntity {
	@PrimaryColumn()
	field1: number;

	@PrimaryColumn()
	field2: number;
}
{
	field1: 11,
	field2: 101
	// concat(field1, field2) = '11101'
},
{
	field1: 1,
	field2: 1101
	// concat(field1, field2) = '11101'
},
@imnotjames imnotjames self-assigned this Oct 4, 2020
imnotjames added a commit to imnotjames/typeorm that referenced this issue Oct 8, 2020
currently we use concatenation of multiple primary keys and a
COUNT DISTINCT of that to figure out how many records we have
matched in a query.

however, that fails if the records have keys when the keys
are ambigious when concatenated (`"A", "AA"` & `"AA", "A"`)

the fact that we do a distinct can also be a performance impact
that isn't needed when we aren't doing joins

as such, in MySQL & Postgres we can use the built in counting
of multiple distinct values to resolve some of the issues,
and in other environments we can make it SLIGHTLY better by
adding delimiters between the concatenated values.  It is not
perfect because it technically could run into the same issue
if the delimiters are in the primary keys but it's BETTER
in most cases.

also, in cases where we do not perform any joins we can
short circuit all of this and do a much more performant
`COUNT(1)` operation

fixes typeorm#5989
fixes typeorm#5314
fixes typeorm#4550
imnotjames added a commit to imnotjames/typeorm that referenced this issue Oct 8, 2020
currently we use concatenation of multiple primary keys and a
COUNT DISTINCT of that to figure out how many records we have
matched in a query.

however, that fails if the records have keys when the keys
are ambigious when concatenated (`"A", "AA"` & `"AA", "A"`)

the fact that we do a distinct can also be a performance impact
that isn't needed when we aren't doing joins

as such, in MySQL & Postgres we can use the built in counting
of multiple distinct values to resolve some of the issues,
and in other environments we can make it SLIGHTLY better by
adding delimiters between the concatenated values.  It is not
perfect because it technically could run into the same issue
if the delimiters are in the primary keys but it's BETTER
in most cases.

also, in cases where we do not perform any joins we can
short circuit all of this and do a much more performant
`COUNT(1)` operation

fixes typeorm#5989
fixes typeorm#5314
fixes typeorm#4550
@imnotjames
Copy link
Contributor

@dsbert can you confirm that #6870 fixes the issues you're seeing?

@dsbert
Copy link
Contributor Author

dsbert commented Oct 9, 2020

@imnotjames I'll look for some time this weekend to confirm.

pleerock pushed a commit that referenced this issue Oct 9, 2020
currently we use concatenation of multiple primary keys and a
COUNT DISTINCT of that to figure out how many records we have
matched in a query.

however, that fails if the records have keys when the keys
are ambigious when concatenated (`"A", "AA"` & `"AA", "A"`)

the fact that we do a distinct can also be a performance impact
that isn't needed when we aren't doing joins

as such, in MySQL & Postgres we can use the built in counting
of multiple distinct values to resolve some of the issues,
and in other environments we can make it SLIGHTLY better by
adding delimiters between the concatenated values.  It is not
perfect because it technically could run into the same issue
if the delimiters are in the primary keys but it's BETTER
in most cases.

also, in cases where we do not perform any joins we can
short circuit all of this and do a much more performant
`COUNT(1)` operation

fixes #5989
fixes #5314
fixes #4550
zaro pushed a commit to zaro/typeorm that referenced this issue Jan 12, 2021
)

currently we use concatenation of multiple primary keys and a
COUNT DISTINCT of that to figure out how many records we have
matched in a query.

however, that fails if the records have keys when the keys
are ambigious when concatenated (`"A", "AA"` & `"AA", "A"`)

the fact that we do a distinct can also be a performance impact
that isn't needed when we aren't doing joins

as such, in MySQL & Postgres we can use the built in counting
of multiple distinct values to resolve some of the issues,
and in other environments we can make it SLIGHTLY better by
adding delimiters between the concatenated values.  It is not
perfect because it technically could run into the same issue
if the delimiters are in the primary keys but it's BETTER
in most cases.

also, in cases where we do not perform any joins we can
short circuit all of this and do a much more performant
`COUNT(1)` operation

fixes typeorm#5989
fixes typeorm#5314
fixes typeorm#4550
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants