Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(NODE-3015): ObjectId.equals should use Buffer.equals for better performance #478

Merged
merged 6 commits into from Jan 26, 2022

Conversation

Uzlopak
Copy link
Contributor

@Uzlopak Uzlopak commented Jan 20, 2022

This is a PR based on #287 by @siboulet but for the new typescript codebase.

I used the following benchmark:

const ObjectId = require("./lib/objectid").ObjectId;

const data = [];
const id = new ObjectId().toString();
const lastId = new ObjectId();

const runs = 1000000;
for (let i = 1; i < runs; i++) {
    data.push(new ObjectId());
}
data.push(lastId);
const results = [];

function runFilter(name, cb) {
    const now = Date.now();
    let length = data.filter(cb).length;
    results.push({ name, length, time: Date.now() - now });
}

console.log(`Running tests on ${data.length} ids array`);
runFilter("Filter toString to string", s => s.toString() === id);
runFilter("Filter toStringBoth", s => s.toString() === lastId.toString());
runFilter("Filter String() Both", s => String(s) === String(lastId));
runFilter("Filter equals objectId", s => s.equals(lastId));
runFilter("Filter equals to string", s => s.equals(id));
runFilter("Filter direct id", s => s.id === lastId.id);
runFilter(
    "Filter buffer compare",
    s => Buffer.compare(s.id, lastId.id) === 0
);
runFilter(
    "Check all bytes in order",
    s => (
        // first two timestamp bytes
        s.id[0] === lastId.id[0] &&
        s.id[1] === lastId.id[1] &&
        // last two timestamp bytes
        s.id[2] === lastId.id[2] &&
        s.id[3] === lastId.id[3] &&
        // three bytes machine id
        s.id[4] === lastId.id[4] &&
        s.id[5] === lastId.id[5] &&
        s.id[6] === lastId.id[6] &&
        // two bytes process id
        s.id[7] === lastId.id[7] &&
        s.id[8] === lastId.id[8] &&
        // last three incrementor bytes
        s.id[9] === lastId.id[9] &&
        s.id[10] === lastId.id[10] &&
        s.id[11] === lastId.id[11]
    )
);
runFilter(
    "Check all bytes reverse",
    s => (
        // last three incrementor bytes
        s.id[11] === lastId.id[11] &&
        s.id[10] === lastId.id[10] &&
        s.id[9] === lastId.id[9] &&
        // two bytes process id
        s.id[8] === lastId.id[8] &&
        s.id[7] === lastId.id[7] &&
        // three bytes machine id
        s.id[6] === lastId.id[6] &&
        s.id[5] === lastId.id[5] &&
        s.id[4] === lastId.id[4] &&

        // last two timestamp bytes
        s.id[3] === lastId.id[3] &&
        s.id[2] === lastId.id[2] &&
        // first two timestamp bytes
        s.id[1] === lastId.id[1] &&
        s.id[0] === lastId.id[0]
    )
);
runFilter(
    "Check all bytes in significance order",
    // last three incrementor bytes
    s => (
        s.id[11] === lastId.id[11] &&
        s.id[10] === lastId.id[10] &&
        s.id[9] === lastId.id[9] &&

        // last two timestamp bytes
        s.id[3] === lastId.id[3] &&
        s.id[2] === lastId.id[2] &&
        s.id[1] === lastId.id[1] &&
        s.id[0] === lastId.id[0] &&
        // two bytes process id
        s.id[8] === lastId.id[8] &&
        s.id[7] === lastId.id[7] &&
        // three bytes machine id
        s.id[6] === lastId.id[6] &&
        s.id[5] === lastId.id[5] &&
        s.id[4] === lastId.id[4]
        // first two timestamp bytes
    )
);
runFilter(
    "Check the most significant bytes first",
    s => (
        // last two timestamp bytes
        s.id[3] === lastId.id[3] &&
        // s.id[2] === lastId.id[2] &&
        
        // last three incrementor bytes
        s.id[11] === lastId.id[11] &&
        s.id[10] === lastId.id[10] &&
        s.id[9] === lastId.id[9] &&
        
        Buffer.compare(s.id, lastId.id) === 0
    )
);
runFilter(
    "Filter buffer equals",
    s => s.id.equals(lastId.id)
);
runFilter(
    "Filter buffer using equals modified",
    s => s.id[11] === lastId.id[11] && s.id.equals(lastId.id)
);
runFilter(
    "Filter buffer using compare",
    s => s.id[11] === lastId.id[11] && Buffer.compare(s.id, lastId.id) === 0
);
console.log(JSON.stringify(results, null, 2));

and got the following result

Running tests on 1000000 ids array
[
  {
    "name": "Filter toString to string",
    "length": 0,
    "time": 183
  },
  {
    "name": "Filter toStringBoth",
    "length": 1,
    "time": 295
  },
  {
    "name": "Filter String() Both",
    "length": 1,
    "time": 459
  },
  {
    "name": "Filter equals objectId",
    "length": 1,
    "time": 23
  },
  {
    "name": "Filter equals to string",
    "length": 0,
    "time": 927
  },
  {
    "name": "Filter direct id",
    "length": 1,
    "time": 13
  },
  {
    "name": "Filter buffer compare",
    "length": 1,
    "time": 103
  },
  {
    "name": "Check all bytes in order",
    "length": 1,
    "time": 27
  },
  {
    "name": "Check all bytes reverse",
    "length": 1,
    "time": 26
  },
  {
    "name": "Check all bytes in significance order",
    "length": 1,
    "time": 25
  },
  {
    "name": "Check the most significant bytes first",
    "length": 1,
    "time": 26
  },
  {
    "name": "Filter buffer equals",
    "length": 1,
    "time": 88
  },
  {
    "name": "Filter buffer using equals modified",
    "length": 1,
    "time": 20
  },
  {
    "name": "Filter buffer using compare",
    "length": 1,
    "time": 21
  }
]

So it is indeed the fastest way to compare two ObjectIds in Buffer format.

Double check the following

@dariakp dariakp changed the title perf: NODE-3015 ObjectId.equals should use Buffer.equals perf(NODE-3015): ObjectId.equals should use Buffer.equals Jan 20, 2022
@dariakp
Copy link
Contributor

dariakp commented Jan 21, 2022

@Uzlopak Hi there, thanks for submitting this PR. We still have the same concerns as with the other PR in terms of ensuring this library continues to work in the web environment (which does not have the Buffer class). In order to move forward on this issue, we'd need to make sure that this logic only runs in the node.js environment and we'd also want tests to cover the behavior.

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Jan 23, 2022

Why is that even an issue? As I can see you use rollup-plugin-polyfill-node and you build for browser. Also you use karma according to your package.json. You even have a package "Buffer" as dependency.

So what is the real issue? Dont you trust your CI/CD pipeline? Should I have a look on your CI/CD pipeline and the builds?

@nbbeeken
Copy link
Contributor

Hi @Uzlopak I made the mistake of thinking there was a difference in id typing in the browser and node. We do indeed bundle Buffer for a consistent experience across platforms. If you log a Buffer in chrome it reports itself as a Uint8Array, but it is in fact the extended class from the Buffer polyfil upon closer inspection. CI aside I was just inspecting manually and got mislead by what I saw in the logs, thanks for contributing this improvement.

I still agree with @dariakp here that having some additional testing coverage would be great to ensure the behavior we expect continues to work as designed. I've pushed up some tests, and a small improvement to another section of the equality.

src/objectid.ts Outdated Show resolved Hide resolved
Co-authored-by: Bailey Pearson <bailey.pearson@gmail.com>
src/objectid.ts Outdated Show resolved Hide resolved
src/objectid.ts Outdated Show resolved Hide resolved
Co-authored-by: Neal Beeken <neal.beeken@mongodb.com>
nbbeeken
nbbeeken previously approved these changes Jan 25, 2022
@nbbeeken
Copy link
Contributor

@baileympearson I've fixed my tests that broke when we switched to using the symbol. I brought over a helper we have in our driver that lets me obtain the symbol at the top of the tests for checking.

@nbbeeken nbbeeken changed the title perf(NODE-3015): ObjectId.equals should use Buffer.equals fix(NODE-3015): ObjectId.equals should use Buffer.equals for better performance Jan 26, 2022
@nbbeeken nbbeeken merged commit 8305bdf into mongodb:master Jan 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants