New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend PayloadReader APIs to meet the WinForms requirements #11341
base: SafePayloadReader
Are you sure you want to change the base?
Conversation
- RecordMap needs to remain internal, but implement IReadOnlyDictionary<int, SerializationRecord> so it can be returned from a public method without exposing the type itself - introduce new PayloadReader.Read overload that can return RecordMap via out parameter - make SerializationRecord.ObjectId public - make MemberReferenceRecord public, expose Reference property
- extend ArrayRecord with ElementTypeName and ElementTypeLibraryName public properties
…needed for writing to System.Windows.Forms where it's used (System.Windows.Forms compiles)
- stop using internal APIs - fix the bugs: - update the record map, so mapping system class of common primtive types to PrimitiveTypeRecord<T> is visible also to those who access it via Id - add missing support for mapping to PrimitiveTypeRecord for IntPtr and UIntPtr - use GetSerializationRecord when unwrapping - remove FormatterTypeStyle.TypesWhenNeeded test (not supported) - remove tests for internal APIs tested in PayloadReader project: - RecordMapTests - NullRecordTests - ClassInfoTests - MemberTypeInfo
- check records by Ids to detect custom comparers for Hashtable - ObjectRecordDeserializer needs to handle ArrayRecords and raw primitive values - add support for jagged arrays with custom offsets (just to rejct them later) - materialize arrays of primitive types using the provided API - when mapping System Classes to PrimitiveTypeRecord<T> the Ids need to be preserved - don't iterate over the same member twice - support arrays of nullable primitives (don't represent them as ArrayRecord<ClassRecord> because they consists of MemberPrimitiveTypedRecord and NullsRecords)
…eds a discussion whether we want to support that or not)
… using any of the internal APIs
...rivate.Windows.Core/src/System/Windows/Forms/BinaryFormat/BinaryFormattedObjectExtensions.cs
Show resolved
Hide resolved
...rivate.Windows.Core/src/System/Windows/Forms/BinaryFormat/BinaryFormattedObjectExtensions.cs
Show resolved
Hide resolved
...e.Windows.Core/src/System/Windows/Forms/BinaryFormat/Deserializer/ArrayRecordDeserializer.cs
Show resolved
Hide resolved
} | ||
while (_memberNamesIterator.MoveNext()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the order is no longer guaranteed to be preserved and I know it looks less clean than before, but I really hope that I don't need to expose an array (we are using dictionary to store the member names for security reasons:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does have to be maintained for this to be compatible.
I'm not sure guarding against a lot of member names and member data is that useful. Are they really going to be able to DOS with this? Wouldn't an upper cap on number of members be more practical? I can't imagine any real-world class having more than 64 members that get serialized, picking some large number beyond that might be more useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does have to be maintained for this to be compatible.
I tried to write a failing test for that and I've failed, it seems that the order is preserved as long as we don't mutate the dictionary afterwards: https://stackoverflow.com/a/64596145/5852046 But is it safe to rely on that assumption? I am not convinced.
We could use OrderedDictionary
which offers O(1)
lookups, but the inserts can be O(n)
at worst, but the same is true for Dictionary
. (source)
Wouldn't an upper cap on number of members be more practical?
I was thinking about introducing such limit: https://github.com/adamsitnik/SafePayloadReader/blob/f9beee7176ce0bf29fdea794d4f79de5c024103c/System.Runtime.Serialization.BinaryFormat/PayloadOptions.cs#L14
@GrabYourPitchforks Thoughts?
...stem.Private.Windows.Core/src/System/Windows/Forms/BinaryFormat/Deserializer/Deserializer.cs
Show resolved
Hide resolved
...stem.Private.Windows.Core/src/System/Windows/Forms/BinaryFormat/Deserializer/Deserializer.cs
Show resolved
Hide resolved
@@ -75,7 +75,7 @@ public void DeserializeStoredObjects(object value, TypeSerializableValue[] seria | |||
// Explicitly not supporting offset arrays | |||
from value in BinaryFormatterTests.RawSerializableObjects() | |||
from FormatterAssemblyStyle assemblyFormat in new[] { FormatterAssemblyStyle.Full, FormatterAssemblyStyle.Simple } | |||
from FormatterTypeStyle typeFormat in new[] { FormatterTypeStyle.TypesAlways, FormatterTypeStyle.TypesWhenNeeded, FormatterTypeStyle.XsdString } | |||
from FormatterTypeStyle typeFormat in new[] { FormatterTypeStyle.TypesAlways/*, FormatterTypeStyle.TypesWhenNeeded, FormatterTypeStyle.XsdString*/ } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JeremyKuhne this is important and we need to discuss whether we want to support it (I hope not, as according to my understanding it requires loading types which Payload Reader needs to avoid)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to support it, just made sure it was possible to do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing we would want to add back here is FormatterTypeStyle.TypesAlways | FormatterTypeStyle.XsdString
. That always forces strings to go as records.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that with that option tests fail again. There is what I believe to be a bug in the serializer where it will put out duplicate negative ids. My RecordMap
allows for that as they aren't supposed to be referenced. I didn't try to constrain further, this may only happen for the root record, or there might be some other constraint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to support it, just made sure it was possible to do so.
Awesome! I've changed the code to throw NotSupportedException
and removed the related code.
FormatterTypeStyle.TypesAlways | FormatterTypeStyle.XsdString
Thanks for sharing that! I had no idea that FormatterTypeStyle
an be used as [Flags]
(it's not marked as such...)
There is what I believe to be a bug in the serializer where it will put out duplicate negative ids
Thanks a lot for the hint, it has saved me a lot of debugging
src/System.Private.Windows.Core/tests/BinaryFormatTests/FormatTests/Common/PrimitiveTests.cs
Show resolved
Hide resolved
[MemberData(nameof(ArrayInfo_ParseSuccessData))] | ||
public void ArrayInfo_Parse_Success(Stream stream, int expectedId, int expectedLength) | ||
{ | ||
using BinaryReader reader = new(stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed these and similar tests as they belong to Payload Reader test suite
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They need to be fully covered, so should at least have a tracking issue to make sure we don't lose any of these scenarios.
...m.Private.Windows.Core/tests/BinaryFormatTests/FormatTests/FormattedObject/HashTableTests.cs
Show resolved
Hide resolved
src/System.Runtime.Serialization.BinaryFormat/Records/ArrayRecord.cs
Outdated
Show resolved
Hide resolved
src/System.Runtime.Serialization.BinaryFormat/Records/BinaryArrayRecord.cs
Show resolved
Hide resolved
src/System.Runtime.Serialization.BinaryFormat/Records/RectangularOrCustomOffsetArrayRecord.cs
Show resolved
Hide resolved
src/System.Runtime.Serialization.BinaryFormat/Records/SerializationRecord.cs
Show resolved
Hide resolved
src/System.Runtime.Serialization.BinaryFormat/Records/SystemClassWithMembersAndTypesRecord.cs
Show resolved
Hide resolved
src/System.Windows.Forms/src/System/Windows/Forms/BinaryFormat/ArrayInfo.cs
Show resolved
Hide resolved
return _lastType; | ||
} | ||
|
||
_lastLibraryId = libraryId; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last retrieved is an important perf optimization, we should try and keep that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
libraryId == _lastLibraryId
was much faster when the API was accepting Id libraryId
(integer). Now the public API provides AssemblyNameInfo
and we need to compare strings to check for equality (in this particular case).
I doubt it would move the needle now, as what we have (a dictionary lookup) is very fast (O(1)
)
_array = _arrayType is BinaryArrayType.Rectangular | ||
? Array.CreateInstance(_elementType, binaryArray.Lengths.ToArray()) | ||
: Array.CreateInstance(_elementType, arrayRecord.Length); | ||
int[] lengths = new int[arrayRecord.Rank]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be stackalloc'ed, should have made that more explicit in my own code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be stackalloc'ed
But there is no overload for Array.CreateInstance
that accepts a span (only arrays?)?
From apisof.net:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A feature request is in order, perhaps. You can allocate arrays on the stack with https://learn.microsoft.com/en-us/dotnet/csharp/whats-new/csharp-12#inline-arrays.
....Windows.Core/src/System/Windows/Forms/BinaryFormat/Deserializer/ObjectRecordDeserializer.cs
Show resolved
Hide resolved
|
||
namespace FormatTests.Common; | ||
|
||
public abstract class NullRecordTests<T> : SerializationTest<T> where T : ISerializer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These bad input tests need to be retained. As a quick hack you can make them just have the raw data streams.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ReadPrimitiveTypes
should be replaced with the implementation I have in BinaryReaderExtensions.ReadPrimitiveArray
. ToArray
needs to just pass back the existing array. Critical for the allocations. We need to be able to do the same for BinaryArray
when it is primitive as well (think byte[,]
), but that can come as a work item as long as the object model allows for it. I seem to recall some single dimensional primitive arrays ending up in BinaryArray
as well, but I'm not positive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ToArray needs to just pass back the existing array
I am going to offer that in the API proposal. Something like MoveToImmutable or a boolean flag for ToArray
. Do you have any preferences?
ReadPrimitiveTypes should be replaced with the implementation I have in BinaryReaderExtensions.ReadPrimitiveArray
That implementation may be faster, but it has some bugs:
-
It assumes that every
Stream
can returnLength
andPosition
while non-seekable streams will simply throw:
winforms/src/System.Private.Windows.Core/src/System/IO/BinaryReaderExtensions.cs
Lines 51 to 55 in 31176a4
internal static long Remaining(this BinaryReader reader) { Stream stream = reader.BaseStream; return stream.Length - stream.Position; } -
For certain types it works only for big endian:
winforms/src/System.Private.Windows.Core/src/System/IO/BinaryReaderExtensions.cs
Line 129 in 31176a4
if (sizeof(T) != 1 && !BitConverter.IsLittleEndian) |
My goal is to get the API design accepted and merged. Once we get there we can talk about perf (if there are not other more important things to work on). What is important is that all the mentioned optimizations can be achieved with current public API shape.
- FormatterTypeStyle.TypeAlways is a must have, remove unsupported RecordTypes from the public enum and throw NotSupportedException when they are provided - add support for Formatters.FormatterTypeStyle.XsdString - enable tests for FormatterTypeStyle.TypesAlways | FormatterTypeStyle.XsdStringr
ArrayType.Rectangular => _elementType.MakeArrayType(arrayRecord.Rank), | ||
_ => _elementType.MakeArrayType() | ||
}; | ||
// Tricky part: for arrays of classes/structs the following record allocates and array of class records |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamsitnik Is this allocating an additional array or is it just returning the internal data structure?
@JeremyKuhne I was working on the MSRCs today and managed to make only a small progress with transition to SafePayloadReader APIs. What I have here compiles, but you can revert the last revert to have a temporary state where some APIs have been transitioned and there are compiler errors.
Microsoft Reviewers: Open in CodeFlow