Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[@parcel/graph] SharedTypeMap should dynamically select the store data type #8621

Open
ahaoboy opened this issue Nov 11, 2022 · 5 comments
Open

Comments

@ahaoboy
Copy link
Contributor

ahaoboy commented Nov 11, 2022

🙋 feature request

SharedTypeMap and SerializedAdjacencyList use Uint32Array to store data, but in most case Uint16Array is enough, so if can dynamically select the store data type, it's can save half memory to store and transform?

class SharedTypeMap {
 data: Uint32Array;
}
 
export type SerializedAdjacencyList<TEdgeType> = {|
  nodes: Uint32Array,
  edges: Uint32Array,
|};

🤔 Expected Behavior

Dynamically select the AdjacencyList's store data type

😯 Current Behavior

The AdjacencyList's store data type is always Uint32Array, in most cases it is wasted

💁 Possible Solution

Each resize we can choose the appropriate data type(Uint32Array/Uint16Array) depending on the number of nodes/edges
We can hard code Uint32Array to Uint16Array and test some usecase, find if this approach is effective by comparing

🔦 Context

None

💻 Examples

I make a sample with 100_000 nodes and edges, it looks like to be working
Snipaste_2022-11-11_18-08-07

Snipaste_2022-11-11_18-07-29

@devongovett
Copy link
Member

@lettertwo

@lettertwo
Copy link
Member

lettertwo commented Nov 11, 2022

I like this idea! The heuristic seems simple enough. This might just work for edges in particular, since they are re-hashed and copied into the new array on resize. Nodes might need a bit extra work, since the 16 bit array data could not just be cloned into the 32 bit array Nodes may also just work, if passing a 16 bit array to a new 32 bit array just does the right thing (it might?).

I think it has some conceptual overlap with another potential optimization, too: Edge types are always going to fit into 8 bits (or could be enforced as such), so we could save a fair amount of space if we used DataView instead of a fixed-size TypedArray (see #6922 (comment)). It would be nice to squeeze out all the extra space we can this way, but implementing on top of DataView brings some complications (as described in that comment).

@ahaoboy
Copy link
Contributor Author

ahaoboy commented Nov 12, 2022

@lettertwo
I think this lib can be used by more projects, but it depends some parcel utils and can't used in stackbitz, can you fix this?
stackblitz/core#1855

@ahaoboy
Copy link
Contributor Author

ahaoboy commented Nov 12, 2022

I like this idea! The heuristic seems simple enough. This might just work for edges in particular, since they are re-hashed and copied into the new array on resize. Nodes might need a bit extra work, since the 16 bit array data could not just be cloned into the 32 bit array Nodes may also just work, if passing a 16 bit array to a new 32 bit array just does the right thing (it might?).

I think it has some conceptual overlap with another potential optimization, too: Edge types are always going to fit into 8 bits (or could be enforced as such), so we could save a fair amount of space if we used DataView instead of a fixed-size TypedArray (see #6922 (comment)). It would be nice to squeeze out all the extra space we can this way, but implementing on top of DataView brings some complications (as described in that comment).

Yes, in most case(may 99.99%) edeg type is less than 255, so use u8 rather than u32 can save more memory, but this adds more
complexity. I think it's worth a try~

@mischnic
Copy link
Member

I think this lib can be used by more projects, but it depends some parcel utils and can't used in stackbitz, can you fix this?

The only usage is

import {SharedBuffer} from '@parcel/utils';

which imports
// @flow
/* global MessageChannel:readonly */
export let SharedBuffer: Class<ArrayBuffer> | Class<SharedArrayBuffer>;
// $FlowFixMe[prop-missing]
if (process.browser) {
SharedBuffer = ArrayBuffer;
// Safari has removed the constructor
if (typeof SharedArrayBuffer !== 'undefined') {
let channel = new MessageChannel();
try {
// Firefox might throw when sending the Buffer over a MessagePort
channel.port1.postMessage(new SharedArrayBuffer(0));
SharedBuffer = SharedArrayBuffer;
} catch (_) {
// NOOP
}
channel.port1.close();
channel.port2.close();
}
} else {
SharedBuffer = SharedArrayBuffer;
}

I think we could just copy that over into the graph package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants