Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Requesting Py_buffer with additional fields either crashes or returns struct with partially garbage values #2371

Open
chickenservice opened this issue May 5, 2024 · 1 comment

Comments

@chickenservice
Copy link

Environment

  • Pythonnet version: 3.0.3
  • Python version: 3.12.1
  • Operating System: Windows 11
  • .NET Runtime: .NET 4.7.2, .NET 6.0

Details

  • Describe what you were trying to get done.

Requesting buffer with format string crashes (both .NET 4.7.2 and NET 6.0)

Requesting a buffer with PyBUF_FULL crashes the process. This seems to be caused by the string marshaling of the format field in the Py_buffer struct. Changing this to IntPtr and marshalling the string by hand with e.g. Marshal.PtrToStringAnsi resolves the crash.

Requesting buffer with strides for bytearray returns garbage strides values (.NET 6.0)

The issue seems to be that the .NET marshaller creates a copy and not a reference of the struct if the struct is not blittable. bytearray uses the default PyBuffer_FillInfo implementation from abstract.c to populate the Py_buffer struct. When requesting strides the function sets the strides field to the address of the len field PyBuffer_FillInfo. Upon returning from the P/Inovke call the marshaller copies the values back to the managed struct. The pointer is copied correctly but now points to garbage because it's referencing the field in the unmanaged struct created by the marshaller. From Formatted Blittable Classes:

When these types require marshalling, a pointer to the object in the heap is passed to the callee directly. The callee can change the contents of the memory location being referenced by the pointer.

and Formatted Non-Blittable Classes

If a non-blittable class is marshalled by reference, the callee receives a pointer to a pointer to a copy of the data structure.
...
If the OutAttribute attribute is set, the state is always copied back to the instance on return, marshalling as necessary.

I tested this with mixed-mode debugging as well (the struct in PyBuffer_FillInfo is correctly populated). Why this works in .NET 4.7.2 and not in .NET 6.0 is a mystery to me but relying on such implementation details is anyway not desirable. Note that this only happens with objects that use PyBuffer_FillInfo, requesting strides for e.g. numpy arrays works fine as they populate strides with a dedicated array.

Take a look at this branch to see examples for the issues above and a potential solution. Below are two example tests to reproduce these bugs.

public class TestPyBuffer
{
        [Test]
        public void TestBufferRequestFull()
        {
            string bufferTestString = "hello world! !$%&/()=?";

            using var _ = Py.GIL();

            using var pythonArray = ByteArrayFromAsciiString(bufferTestString);
            byte[] managedArray = new byte[bufferTestString.Length];

            using (PyBuffer buf = pythonArray.GetBuffer(PyBUF.FULL)) // Crashes
            {
                Assert.AreEqual(buf.Format, "B");
            }
        }

        [Test]
        public void TestBufferRequestStridesBytearray()
        {
            string bufferTestString = "hello world! !$%&/()=?";

            using var _ = Py.GIL();

            using var pythonArray = ByteArrayFromAsciiString(bufferTestString);
            byte[] managedArray = new byte[bufferTestString.Length];

            using (PyBuffer buf = pythonArray.GetBuffer(PyBUF.CONTIG_RO))
            {
                Assert.AreEqual(22, buf.Shape[0]); // Fails in .NET 6.0
            }
        }
}

Potential solution

By changing the string? format field to IntPtr and the bool _readonly' field to int` we make the struct blittable in which case it is passed as a true reference. This resolves both bugs, which is why I report them within the same issue. In my opinion structs should be blittable wherever possible to bypass the marshaller. I'd be happy to fix this as it relates to #1838.

/* buffer interface */
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Auto)]
internal struct Py_buffer {
    public IntPtr buf;
    public IntPtr obj;        /* owned reference */
    /// <summary>Buffer size in bytes</summary>
    [MarshalAs(UnmanagedType.SysInt)]
    public nint len;
    [MarshalAs(UnmanagedType.SysInt)]
    public nint itemsize;  /* This is Py_ssize_t so it can be
                         pointed to by strides in simple case.*/
    public int _readonly;
    public int ndim;
    public IntPtr format;
    public IntPtr shape;
    public IntPtr strides;
    public IntPtr suboffsets;
    public IntPtr _internal;
}
@filmor
Copy link
Member

filmor commented May 5, 2024

First, thank you for the thorough report, explanation and debugging on this!

As Py_buffer is not exposed in our API, it's completely fine with adjusting it in the way you suggested. I can only find a single reference to this field, in GetBuffer, where it's filled from the ItemFormats dictionary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants