Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Allows valid non-object JSON to be retrieved in simple-json columns #6574

Merged
merged 16 commits into from Oct 10, 2020

Conversation

mkornatz
Copy link
Contributor

@mkornatz mkornatz commented Aug 17, 2020

Closes #5501.

A breaking change / bug was introduced in #4476 as a proposed fix for #4440 (which really just needed a try/catch). It assumes that all simple-json type properties should be objects, and returns {} in all other cases. However, that goes against the spec for JSON.parse() which can return an object, array, string, number, or null. Ultimately, this breaks all existing simple-json properties which are non-objects.

A similar fix was implemented in #5125 and has a conversation around it. However, no tests were added to that PR, and the build is failing. I've created this PR in hopes of getting this out sooner than later, because it's breaking my application.

#6335 is also a duplicate.

@mkornatz mkornatz changed the title Allows valid non-object JSON to be retrieved in simple-json columns fix: Allows valid non-object JSON to be retrieved in simple-json columns Aug 17, 2020
src/util/DateUtils.ts Outdated Show resolved Hide resolved
@imnotjames
Copy link
Contributor

Looks like this should also drop the tests related to #4440 ? Otherwise they fail.

@mkornatz
Copy link
Contributor Author

Yep, thanks @imnotjames. I had reverted that test previously, but failed to update it to align with expected behavior. In terms of standard operating procedure for the project, should the entire set of tests for #4440 be dropped, or should they be left in-place but updated for expected functionality?

@pleerock
Copy link
Member

pleerock commented Aug 17, 2020

so, I have the same comment as in #6335

This basically reverts #4476 . First, you need ask initial authors why then made this change, then you can explain why change was wrong and your proposal is right. At the end we need to came to conclusion what we should do.

we can't have changes back and forward depend on different user needs. We need a vote before this can be merged.

@mkornatz
Copy link
Contributor Author

The original issue in #4440 (fixed in #4476) looks like it could have been caused by a lack of data validation for the entity, which caused an invalid JSON object to be stored in the db and throw an error when loaded back into the entity. Although, the issue makes no mention of what the data looks like that was assigned to the simple-json column, so it's hard to be certain of the underlying cause of the user's issue. The fix for this issue could have just been adding a try/catch around JSON.parse() to avoid the parse error. Instead, any non-object was being filtered, even though JSON.parse can validly return a number, string, object, array, or null.

As @svvac mentioned in #5125, adding additional "magic" around the JSON.parse method will only cause confusion/bugs. The type is called simple-json not simple-object which, to me, means that it should be able to support any valid JSON value. Properly formatted numbers, strings, objects, arrays, or nulls are all valid JSON.

One major downside to the current implementation (always returning an object when loading a simple-json column) is that finding an entity and then re-saving that entity would potentially turn any non-object value into {}, which is destructive and in my opinion undesired behavior.

@imnotjames
Copy link
Contributor

imnotjames commented Aug 17, 2020

simple-json documentation - https://github.com/typeorm/typeorm/blob/master/docs/entities.md#simple-json-column-type

There is a special column type called simple-json which can store any values which can be stored in database via JSON.stringify. Very useful when you do not have json type in your database and you want to store and load object without any hustle.

@imnotjames
Copy link
Contributor

imnotjames commented Aug 17, 2020

The original issue in #4440 (fixed in #4476) looks like it could have been caused by a lack of data validation for the entity, which caused an invalid JSON object to be stored in the db and throw an error when loaded back into the entity.

In all likelihood, the value in the database was [object Object] - which is completely invalid for JSON. This is guessing based off the error message...

SyntaxError: Unexpected token o in JSON at position 1

> JSON.parse('[object Object]')
Uncaught SyntaxError: Unexpected token o in JSON at position 1

and what they'd said in the issue..

This seems to be something that happened before, back in November 2016. I read #96 and it seems like it was fixed. However... it now seems like the objects hitting TypeORM are still objects, so they're either not being stringified (which I did try doing manually to no success) or something else is going on here.

> String({})
'[object Object]'

@imnotjames
Copy link
Contributor

imnotjames commented Aug 17, 2020

Scratch that on the previous comment -

I think #4440 was using simple-json against a field that wasn't a string and was instead a native postgres JSON field.

Under the typeorm posgres driver, pg converted it to an object and then the object got passed to simple-json. simple-json called parse on that and it failed.

I can replicate this case locally, but:

> JSON.parse({})
Uncaught SyntaxError: Unexpected token o in JSON at position 1

They were then able to switch to using the native json via @Column('json')

In other words, maybe a bit of user error there?

@svvac
Copy link

svvac commented Aug 17, 2020

With these changes, Typeorm still tries to « fix » values it doesn't understand (= data that doesn't fit the specified model), and still does so silently, leading to potential data loss.

I don't think an ORM library should take this responsibility. Let's say I carelessly save a large JSON payload larger that the column datatype and it gets truncated in the DB. I don't want that this (admittedly invalid) column to get wiped on the next save call with no indication. In that case the library should crash (because of unmet expectation : data must be valid JSON), so I can tell something is wrong, fix it, and have the opportunity to salvage what I can from my junk row.

I do see why the empty string could get special treatment, especially since some backends don't allow setting default values on TEXT columns. That being said I don't think {} is a very sane default in this case.

@imnotjames
Copy link
Contributor

imnotjames commented Aug 17, 2020

With these changes, Typeorm still tries to « fix » values it doesn't understand (= data that doesn't fit the specified model), and still does so silently, leading to potential data loss.
I don't think an ORM library should take this responsibility.

I'd agree with this.

Let's say I carelessly save a large JSON payload larger that the column datatype and it gets truncated in the DB.

The direct example from #4440 is that you set the JSON data type & then try to use simple-json. It'll still wipe the data with no indication. Perhaps for simple-json if it's not a string that gets returned it should error out, too?

I do see why the empty string could get special treatment

I'd agree with this, too. Perhaps null would be a more realistic default?

@svvac
Copy link

svvac commented Aug 17, 2020

The direct example from #4440 is that you set the JSON data type & then try to use simple-json. It'll still wipe the data with no indication.

Yes that seems to be the original problem the fix was trying to address. However my point was more generally about returning a default when JSON.parse fails for whatever reason (i.e. the try/catch block).

Perhaps for simple-json if it's not a string that gets returned it should error out, too?

Sounds like a good first line of defense against a DB/model configuration mismatch.

@imnotjames
Copy link
Contributor

imnotjames commented Aug 17, 2020

Looking at how other places handle this - for example, the MySQL driver has to handle JSON fields as strings similar to simple-json -

value = typeof value === "string" ? JSON.parse(value) : value;

If it's a string it tries to parse it. If it's not a string it does not parse it.

@mkornatz
Copy link
Contributor Author

I've updated this PR based on the latest conversation. This fully reverts the change from #4440 (seen here, https://github.com/typeorm/typeorm/pull/4476/files#diff-f0f40758177783349c31da1ee0725addL173) and also implements an error when the value is non-string.

Perhaps for simple-json if it's not a string that gets returned it should error out, too?

Sounds like a good first line of defense against a DB/model configuration mismatch.

Although, as @imnotjames points out in the above comment, drivers handle a JSON field differently, I've implemented the error approach when the value is not a string. Obviously, this will prevent data destruction as mentioned above.

I do see why the empty string could get special treatment, especially since some backends don't allow setting default values on TEXT columns. That being said I don't think {} is a very sane default in this case.

I'd agree with this, too. Perhaps null would be a more realistic default?

Could you elaborate, @imnotjames or @svvac? I don't necessarily see why an empty string should be handled as a special case. For back ends that don't support setting a default, do they not generally default to null naturally? Any defaults should be the responsibility of the user to implement as part of their column definition.

When it comes to empty strings, I would expect they can be a valid value as JSON.stringify and JSON.parse have no issues with them:

const stringified = JSON.stringify("");
const parsed = JSON.parse(stringified);
parsed === ""; // true
typeof parsed === "string"; // true

@svvac
Copy link

svvac commented Aug 18, 2020

When it comes to empty strings, I would expect they can be a valid value as JSON.stringify and JSON.parse have no issues with them

The problem is with an empty string from the database since JSON.parse("") throws.

Any defaults should be the responsibility of the user to implement as part of their column definition.

Agreed.

For back ends that don't support setting a default, do they not generally default to null naturally?

I kinda seem to remember some database backends defaulting to an empty string for some columns types, though I can't seem to find anything on that front right now so it's likely I'm just mistaken.

...and also implements an error when the value is non-string.

Your latest revision will choke on NULL-ed fields since typeof null === 'object'.

On that note it might be worth to point out that a NULL field in database will become indistinguishable from a field set to "null" (i.e. JSON.stringify(null)) for Typeorm consumers.

That may or may not be a problem, though I'm pretty sure this was the behaviour in place before #4476.

@mkornatz
Copy link
Contributor Author

mkornatz commented Aug 18, 2020

Your latest revision will choke on NULL-ed fields since typeof null === 'object'.

Yep, you're totally right. Just fixed that to allow nulls through to JSON.parse.

On that note it might be worth to point out that a NULL field in database will become indistinguishable from a field set to "null" (i.e. JSON.stringify(null)) for Typeorm consumers.
That may or may not be a problem, though I'm pretty sure this was the behaviour in place before #4476.

I agree that this probably isn't an issue due to the fact that whether it's a db null or a JSON null, they both essentially mean the same thing.

@svvac
Copy link

svvac commented Aug 18, 2020

I agree that this probably isn't an issue due to the fact that whether it's a db null or a JSON null, they both essentially mean the same thing.

Yes, but WHERE a IS NULL and WHERE a = 'null' aren't interchangeable.

That being said, coalescing js-null into sql-NULL could make sense, but it might be taking the logic too far off the « simple » in simple-json.

Either way that's something that should be mentioned in the docs somewhere.

@mkornatz
Copy link
Contributor Author

After my last push/test, I saw that prepareHydratedValue is being called multiple times on the field value, and since it's first a string and then an object, the second time through the method, the SimpleJsonIsNotAStringError error was being thrown. The failing test was github issues > #1898 Simple JSON breaking in @next (see CI https://app.circleci.com/pipelines/github/typeorm/typeorm/1358/workflows/9ef472ba-45ac-4429-8bb3-71eb9ce850fa/jobs/2167).

It seems the best course of action is the original implementation:

static stringToSimpleJson(value: any) {
        return typeof value === "string" ? JSON.parse(value) : value;
}

Yes, but WHERE a IS NULL and WHERE a = 'null' aren't interchangeable.
That being said, coalescing js-null into sql-NULL could make sense, but it might be taking the logic too far off the « simple » in simple-json.

Within the preparePersistentValue methods in the drivers, if a value is null then it is not passed through JSON.stringify before persisting in the db, which subsequently stores a native NULL db value in the column. The case where "null" would persist in the database is when assigning the string "null" to the entity before saving, or when using {default: "null"} when default values are allowed on simple-json columns (e.g. SQLite). I made a note about nulls in the docs for simple-json.

await repo.save(post);
const postFound = await repo.findOne(1);
postFound!.id.should.eql(1);
expect(postFound!.jsonField).to.eql("null");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This expectation is now specifically noted in the docs for simple-json.

@mkornatz
Copy link
Contributor Author

Any thoughts on getting this merged in?

Again, the original behavior before the breaking change (which was not noted as a breaking change in the changelog) seems to be the best behavior for the expectation of the functionality. This PR reverts that breaking change.

@dapriett
Copy link

dapriett commented Oct 7, 2020

If anyone needs this, I have this run at startup in my app as a patch until we get a fix:

import { DateUtils } from 'typeorm/util/DateUtils';

DateUtils.stringToSimpleJson = function(value: any) {
   try {
         return value ? JSON.parse(value) : null;	
   } catch (err) {	
        return null;	
  }	
}

docs/entities.md Outdated Show resolved Hide resolved
@imnotjames
Copy link
Contributor

Rebased against master.

@imnotjames
Copy link
Contributor

🤔 For some reason a completely unrelated test brought in via #6634 referenced a file in test for 4440.

That's.. odd. I'll fix it first and then rebase again.

@imnotjames
Copy link
Contributor

I'm refactoring some of the tests you wrote to handle a few more cases and codify the error cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hacktoberfest-accepted label hacktoberfest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect data loading from JSON string for column type 'simple-json'. Why?
5 participants