Type accounting rigour #65

adrian-burlacu-software · 2021-07-16T07:49:02Z

When I compare two arrays:
1st arg = ['a', 'b', 'c']
and
2nd arg = ['a', 'b']

results in object:
{ '2': undefined }

I would expect -- given your immense abilities for designing software :P -- to get this:
[null/same?, null/same?, undefined]

if I have this problem:
1st argument: unchanged
2nd arg = ['a', 'c']

result would be:
[null/same?, 'c', undefined]

We don't need to prematurely quibble about perf do we lads?

What about rhs and lhs have an array and an obj?
1st arg = ['a', 'b', 'c']
2nd arg = {'1': "a", 'b': "c"}

I would output this result annotated with metadata:
{
type: Object,
value: 2nd arg = {'1': "a", 'b': "c"}
}
I realize this might be a tad too intrusive, maybe think of some prototype property(maybe through a proxy?) that could annotate the object without the interference to the output value ;)

Typescript custom type comparison support would be extra fancy ca$h that would make me want to SCREAM WITH JOY.

Love always ❤️,
Adrian

anko · 2021-07-16T21:22:53Z

Random reactions from someone who's stared at this library quite a lot:

When I compare two arrays:
1st arg = ['a', 'b', 'c']
and
2nd arg = ['a', 'b']

results in object:
{ '2': undefined }

Reproducing in node REPL just to make sure we're on the same page:

> let dod = require('deep-object-diff')
> dod.diff(['a', 'b', 'c'], ['a', 'b'])
{ '2': undefined }

I would expect [...] to get this:
[null/same?, null/same?, undefined]

A diff describes the set of operations to be applied to previous to get next. In your proposed format, the "same"-markers don't really add information: everything not-different is already implicit by not being listed. Such a "verbose diff" would be unusable for e.g. delta encoding.

To list all unchanged properties, you can iterate the "previous" object, excluding properties that appear in the diff. The loops to iterate the "previous" object and to iterate the proposed "verbose diff" are basically the same code, with very little difference in performance or clarity.

if I have this problem:
1st argument: unchanged
2nd arg = ['a', 'c']

result would be:
[null/same?, 'c', undefined]

In REPL:

> dod.diff(['a', 'b', 'c'], ['a', 'c'])
{ '1': 'c', '2': undefined }

The first element is omitted because it didn't change; the second changed to a 'c'; and the last was deleted, so undefined. This seems correct to me.

We don't need to prematurely quibble about perf do we lads?

I don't know what you mean here.

What about rhs and lhs have an array and an obj?
1st arg = ['a', 'b', 'c']
2nd arg = {'1': "a", 'b': "c"}

I would output this result annotated with metadata:
{
type: Object,
value: 2nd arg = {'1': "a", 'b': "c"}
}

You're right that this library currently confuses arrays and number-keyed objects. For example, it thinks this array and object are equal:

> dod.diff(['a', 'b', 'c'], {0: 'a', 1: 'b', 2: 'c'})
{}

This could indeed be clarified by changing the output format there from {} to e.g. { diff: {}, typeDiff: 'object' }; mentioning each property's type diff in addition to its value diff.

It would take a pretty extreme situation where such disambiguation would be useful though. I can imagine very few situations where ① it makes any sense to diff an array against an object, and ② that object contains numeric keys, and ③ their types being different actually matters. Any such situations are so niche that I think the inconvenience of having to manually Array.isArray the inputs against each other is pretty small. Object and Array are the only combination of types that are ambiguous this way.

I think it's not worth making the diff format more complex for everyone, just to cater to a single extremely rare edge case, which has a solid work-around. This caveat should be mentioned in the readme though; it currently isn't.

adrian-burlacu-software · 2021-07-17T16:41:31Z

To clarify, given that I'm too noob on the nice formatting:

What I mean by this: We don't need to prematurely quibble about perf do we lads?

Let's not account for performance or "polluting results with extra details" when I'm actually trying to suggest that we provide a more complete diff API -- that is more realistic to the natural expectations -- as opposed to the current API design.
With regards to the "polluting results with extra details" argument, you can stow it, as I already suggested a proxy wrapper around the result that would probably solve the problem without strictly changing the output API unless the user digs into those optional metadata. I get it, if you were writing the change, it would definitely take some coding and probably not a small amount of lines.

Further, it seems that you missed my point about the arrays. Because I don't want any translation from arrays to objects, there is no way to somehow say what elements of the arrays is changed and which is not without the filler values. I would say let's have an option between this full mode and the smaller, more performance-oriented, object-converting current way.

Good job, random quibbler, you mentioned a bug in comparing types, and I want you to know that I appreciate that!!

Toodleloo,
Adrian

anko · 2021-07-21T16:41:01Z

On extra details:

.detailedDiff separates additions, changes, and deletions:

> dod.detailedDiff({a:1, b:2, c:3}, {a:2, b:2, x:42})
{ added: { x: 42 }, deleted: { c: undefined }, updated: { a: 2 } }

That format is my "natural expectation", personally.

What do you have in mind for the extra-detailed API to look like? Pseudo-code would be informative.

On array diffs:

It's undocumented, but there's an arrayDiff function in require('./node_modules/deep-object-diff/dist/arrayDiff'):

> d = require('./node_modules/deep-object-diff/dist/arrayDiff')
> d([1,2,3], [1,3])
[ <1 empty item>, 3, undefined ]

Related discussions: #14 #6

there is no way to somehow say what elements of the arrays is changed and which is not without the filler values

I don't understand why filler values are necessary. Could you give an example?

> d.detailedDiff([1,2,3,4,5], [1,2,3,'x'])
{ added: {}, deleted: { '4': undefined }, updated: { '3': 'x' } }

In the above, it's clear index 3 changed, and index 4 disappeared.

adrian-burlacu-software · 2021-07-25T09:44:10Z

I don't understand why filler values are necessary. Could you give an example?
> d.detailedDiff([1,2,3,4,5], [1,2,3,'x'])
{ added: {}, deleted: { '4': undefined }, updated: { '3': 'x' } }
In the above, it's clear index 3 changed, and index 4 disappeared.

Obviously the detailed diff is fine and that API isn't about array/obj type accuracy, so I guess this case is about:

Bug in successfully comparing arrays to objects
Documenting (and integrating?) the arrayDiff function into the regular .diff, as cool as it is to find this issue for documentation.
Output differences between custom types

adrian-burlacu-software changed the title ~~Keep result types the same with arguments~~ Type accounting rigour Jul 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Type accounting rigour #65

Type accounting rigour #65

adrian-burlacu-software commented Jul 16, 2021 •

edited

anko commented Jul 16, 2021

adrian-burlacu-software commented Jul 17, 2021

anko commented Jul 21, 2021

adrian-burlacu-software commented Jul 25, 2021 •

edited

Type accounting rigour #65

Type accounting rigour #65

Comments

adrian-burlacu-software commented Jul 16, 2021 • edited

anko commented Jul 16, 2021

adrian-burlacu-software commented Jul 17, 2021

anko commented Jul 21, 2021

adrian-burlacu-software commented Jul 25, 2021 • edited

adrian-burlacu-software commented Jul 16, 2021 •

edited

adrian-burlacu-software commented Jul 25, 2021 •

edited