Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrapping std::vector in JS bindings can lead to confusing crashes #170

Open
alice opened this issue Apr 1, 2024 · 8 comments
Open

Wrapping std::vector in JS bindings can lead to confusing crashes #170

alice opened this issue Apr 1, 2024 · 8 comments

Comments

@alice
Copy link
Contributor

alice commented Apr 1, 2024

We already have a note in the README about JS bindings for std::vector.

However, the situation is even more annoying than I'd realised. I spent several hours on Friday debugging crashes which were being triggered by an example JavaScript script to walk the Mac tree. After a lot of painful work, I realised the issue was that I'd written code roughly like this:

function getNodeListValue(node, attribute) {
  let array = [];
  let vector = node.getNodeListValue(attribute);
  for (let i = 0; i < vector.size(); i++)
    array.push(vector.get(i));

  return array;
}

At the end of the function, vector goes out of scope and is marked for garbage collection. Some time later, the garbage collector sweeps through and deletes the wrapped std::vector<AXAPINode> - causing all of the AXAPINodes it contains to also be deleted. And if that garbage collection happens between returning array and using one of array's members, that member now wraps a null AXAPINode object, causing a crash when you try to access its data.

I was able to figure that out because I understand the wrapped C++ code and am already regularly compiling and debugging it, but if I had simply been trying to use the library it would have been impossible to deduce.

I think I'm leaning towards not exposing methods which return std::vector to the SWIG bindings. We could alternatively warn users about this behaviour, but it's quite a foreign concept to JavaScript that you can have a reference become stale.

@joanmarie
Copy link
Contributor

This sounds related to an issue I was seeing locally with AtspiHyperlink: If I didn't free the pointer to the interface, there was a memory leak. If I did free it, the C++ example worked as expected and was leak-free. BUT both the python and the node examples reliably crashed. I haven't yet figured out how to solve that -- other than to not include AtspiHyperlink support with the other, non-crashful, interfaces. 😞

I mention all of this because I don't know if we have two SWIG issues to resolve, or a single one related to garbage collecting.

@alice
Copy link
Contributor Author

alice commented Apr 1, 2024

Ooh interesting - could you give an example of crashy (python/JS) code?

@joanmarie
Copy link
Contributor

Ooh interesting - could you give an example of crashy (python/JS) code?

I most certainly can. Behold #171. (There's got to be some swiggy solution for this scenario.)

@joanmarie
Copy link
Contributor

Looks like I fixed my issue via shared pointer.

@spectranaut
Copy link
Contributor

spectranaut commented Apr 1, 2024

I wonder if Joanie's shared_ptr solution would work here -- does array.push(vector.get(i)); put what is essentially a pointer to the AXAPINode onto the array? If so, then the object will still get garbage collected like Alice said. But on the other hand, if it triggers a copy constructor... I wonder if there is someway to get around this.

Here is Joanie's shared_ptr use: https://github.com/Igalia/AXAccess/pull/171/files#diff-6acaa5ccb23aa8579f65ac9c02d9e6668713c18a2db32ed7ab9f24d625043606R24

I agree with your assessment that if there is no way to fix this then we shouldn't expose these Vector objects in nodejs. But also, I'm very curious if this problem occurs in python, did you happen to test, Alice? If it doesn't happen in python, and in python the objects are native list objects not this custom Vector object, I'm starting to feel like we might benefit from #32 sooner than later.

@alice
Copy link
Contributor Author

alice commented Apr 2, 2024

does array.push(vector.get(i)); put what is essentially a pointer to the AXAPINode onto the array?

Here's the relevant generated code:

SWIGINTERN std::vector< mac_inspect::AXAPINode >::const_reference std_vector_Sl_mac_inspect_AXAPINode_Sg__get(std::vector< mac_inspect::AXAPINode > *self,int i){
                int size = int(self->size());
                if (i>=0 && i<size)
                    return (*self)[i];
                else
                    throw std::out_of_range("vector index out of range");
            }

static SwigV8ReturnValue _wrap_AXAPINodeVector_get(const SwigV8Arguments &args) {
// ...
  std::vector< mac_inspect::AXAPINode >::value_type *result = 0 ;
// ...
try {
    result = (std::vector< mac_inspect::AXAPINode >::value_type *) &std_vector_Sl_mac_inspect_AXAPINode_Sg__get(arg1,arg2);
  } catch(std::out_of_range &_e) {
    SWIG_exception_fail(SWIG_IndexError, (&_e)->what());
  }
  jsresult = SWIG_NewPointerObj(SWIG_as_voidptr(result), SWIGTYPE_p_mac_inspect__AXAPINode, 0 |  0 );
  
  SWIGV8_RETURN(jsresult);
}

So it's returning a (raw) pointer to vector[i] and yeah, it'll be garbage collected. Would be nice if we got a copy, but then presumably I wouldn't have had this crash.

You could possibly work around it (I haven't checked) by trying to explicitly call the copy constructor from JS? But it still seems counter-intuitive that you'd even need to do that. I think instead we could consider bundling the plain SWIG-generated code with some glue JavaScript when it comes time to ship the library.

I haven't checked what happens in Python yet. There's less of an incentive to do this specific weird thing in Python, since the wrapped C++ vectors just work like Python arrays out of the box, but you might still be able to get into trouble if you take a reference to an element and expect it to work after the vector has gone out of scope; I'll try and check later.

@spectranaut
Copy link
Contributor

I'm curious what you mean by "bundling the plain SWIG-generate code with some glue JavaScript"!

@alice
Copy link
Contributor Author

alice commented Apr 2, 2024

Oh right, I should have unpacked that a bit - I mean, if the generated SWIG bindings give us getNodeListValueAtIndex(i), we ship the NPM package with some hand-written javascript along the lines of

function getListAttributeValue(node, attribute) {
  let count = node.getListElementCount(attribute);
  if (count === 0)
    return [];

  const type = node.getListElementType(attribute);
  if (!kSupportedListTypes.has(type)) {
    const typeString = axapi_inspect.ValueTypeToString(type);
    console.error(`Unsupported list type: ${typeString} for ${attribute}`);
    return undefined;
  }

  let array = [];
  for (let i = 0; i < count; i++)
    array.push(getListAttributeValueAtIndex(node, type, attribute, i));

  return array;
}

function getListAttributeValueAtIndex(node, type, attribute, index) {
  switch (type) {
    case axapi_inspect.ValueType_NODE:
      return node.getNodeListValueAtIndex(attribute, index);
    case axapi_inspect.ValueType_STRING:
      return node.getStringListValueAtIndex(attribute, index);
    case axapi_inspect.ValueType_RANGE:
      return node.getRangeListValueAtIndex(attribute, index);
    case axapi_inspect.ValueType_DICTIONARY:
      return node.getDictionaryListValueAtIndex(attribute, index);
    default:
      return undefined;
  }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants