How to cover BCL interfaces/implementations by semi-automatically aggregation for around members. #91

Open
kekyo opened this issue Aug 11, 2021 · 9 comments

@kekyo
Owner

kekyo commented Aug 11, 2021

TODO:

@cyborgyn

It is worth looking at https://github.com/nanoframework/CoreLibrary

  • It has a mini, stripped down mscorlib implementation
  • Has no generics in it
  • Has no unsafe code in it

I will try to compile it with IL2C to see how much effort a full-blown implementation of it for IL2C would take.
This would require us to:

  • implement a few more IL codes
  • parsing of mscorlib itself (which is in this case the nanoFramework mscorlib)
  • fix explicit interface implementation handling
  • implement C functions for [MethodImpl(MethodImplOptions.InternalCall)]

@kekyo
Owner Author

kekyo commented Oct 13, 2021

We need an automated way to find out which members are missing in IL2C. I think this is a difficult topic:

  • We can traverse all members of mscorlib/corlib with Cecil and/or reflection.
  • The very core implementations for IL2C exist only as C source code.
  • We need to match these two sets of member information and extract where the implementation status differs. But how can we extract this from the C source code in a stable way?
    • If we use basic text parsing: this is unstable and may fail on complex declarations (already contains ? :)
    • If we use a C language parser: Clang or some other library? (I know nclang and ClangSharp)
  • We need to translate using an exclusion list of members.
    • (I'm not sure yet) System.Math members may be safe to translate automatically from their IL implementations, because they contain pure computation in IL.
    • All members at the bottom of a P/Invoke call graph need their translation reviewed, or must be hand-coded.
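To make the "basic text parsing" option concrete, here is a minimal sketch in Python. It is not part of IL2C. The symbol names such as `System_Math_Abs` are hypothetical placeholders, and the expected set stands in for whatever Cecil or reflection would produce. The regex is deliberately naive, which shows the instability mentioned above: macros, function pointers, and calls inside function bodies would all confuse it.

```python
import re

# Naive matcher for simple one-line C prototypes, e.g.
#   extern int32_t Foo_Bar(int32_t x);
# It does not understand macros, function pointers or function bodies.
PROTOTYPE = re.compile(
    r"^\s*(?:extern\s+)?[A-Za-z_][\w\s\*]*?\b([A-Za-z_]\w*)\s*\([^;{]*\)\s*;",
    re.MULTILINE,
)


def declared_functions(c_source: str) -> set[str]:
    """Return the function names that look like prototypes in a C header."""
    return set(PROTOTYPE.findall(c_source))


def missing_members(expected: set[str], c_source: str) -> list[str]:
    """Return expected names that have no prototype in the C header."""
    return sorted(expected - declared_functions(c_source))


# Hypothetical header text and hypothetical expected member names.
header = """
extern int32_t System_Math_Abs(int32_t value);
extern double System_Math_Sqrt(double value);
"""
expected = {"System_Math_Abs", "System_Math_Sqrt", "System_Math_Pow"}

print(missing_members(expected, header))
```

A real C parser (for example through ClangSharp) would replace `declared_functions` while keeping the set difference unchanged.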

@cyborgyn

Indeed, it is a difficult topic. I think we need to pick a BCL implementation and work against it, otherwise building it from the bottom up will be an endless effort. If we can make IL2C compile such an mscorlib into C, it will produce the same naming convention that is otherwise used in the app.

I looked at Microsoft's implementation of the BCL (mscorlib) a bit and spent a few days trying to put together a stripped-down version, but it has a lot of problems:

  • Very much dependent on Win32 DLL calls, structures
  • Very much depends on unsafe methods, string hacks
  • All of the classes are interconnected, basically spaghetti code (e.g. I couldn't eliminate the usage of CultureInfo and culture handling altogether)
  • It has differences for different targets (NETCF, NETCORE, etc...), even in method signatures

The nanoFramework seems to be a fairly complete reimplementation of .NET with far fewer methods. All methods that are not implemented in IL (C#) are internal calls, and on top of that, it was specifically designed with resource-constrained embedded systems in mind (just like IL2C, as I understand it).

By compiling mscorlib into C, we would get the proper class descriptors, vtables, stack frames, everything, even for the method stubs. Those could be removed from the IL2C.Runtime files, leaving only the actual method implementations. Then, trying to compile the mscorlib against the framework with gcc would produce easily identifiable error messages showing which methods don't have an implementation (the compile log could be 'grep'-ed for those, I think).
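As a rough illustration of the 'grep' idea, the sketch below pulls symbol names out of a build log in Python. It assumes that the missing implementations surface as GNU ld "undefined reference to" errors at link time. The log text and symbol names are made up for the example and do not come from a real IL2C build.

```python
import re

# GNU ld reports missing symbols as: undefined reference to `name'
# Some locales print typographic quotes instead, so both forms are accepted.
UNDEFINED_REF = re.compile(
    r"undefined reference to [`'\u2018]([^'\u2019]+)['\u2019]"
)


def undefined_symbols(link_log: str) -> list[str]:
    """Return the unique symbol names reported as undefined, sorted."""
    return sorted(set(UNDEFINED_REF.findall(link_log)))


# Hypothetical log excerpt; the symbol names are placeholders.
sample_log = """\
/usr/bin/ld: corlib.o: in function `caller_a':
corlib.c:(.text+0x1c): undefined reference to `System_Math_Sqrt'
/usr/bin/ld: corlib.o: in function `caller_b':
corlib.c:(.text+0x4a): undefined reference to `System_String_Concat'
corlib.c:(.text+0x7e): undefined reference to `System_Math_Sqrt'
collect2: error: ld returned 1 exit status
"""

for name in undefined_symbols(sample_log):
    print(name)
```

The resulting list would be the starting point for deciding which methods still need a body.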

Once we have identified the methods that are missing from IL2C.Runtime, we can generate them with a simple body that throws NotImplementedException, and then implement them over time.

If we fork from nanoFramework.CoreLibrary, we can even choose where to implement the missing parts: C# side, or C side.

cyborgyn added a commit to cyborgyn/IL2C that referenced this issue Oct 13, 2021
@kekyo
Owner Author

kekyo commented Oct 14, 2021

I will also read through nanoFramework.CoreLibrary; I feel we can take some ideas from it.

It may be difficult to port nanoFramework.CoreLibrary as it is, because there is a difference between AOT and an interpreter.

@cyborgyn

cyborgyn commented Oct 16, 2021

According to my initial tests with compiling nanoFramework's mscorlib, I see the following path would be viable to have an optional "external" mscorlib implementation:

  • Need to rearrange a bit the IL2C.Runtime *.h files
    • Need to introduce a #ifndef EXTCORLIB clause in il2c.h around the place where System/*.h is included
    • In #else we would include the generated headers
    • Split out special functions from System/*.h into separate header files, to be able to include just those with the external mscorlib
  • Introduce #ifndef EXTCORLIB also in C files, around VTABLE and IL2C_RUNTIME_TYPE declarations (since it will be done in the generated code, with the generated methods)
  • Need to fix up a bit HeaderWriter
    • Fix #include generator ordering to be able to include types in correct order depending on each other (first pass)
    • Split the second pass into a separate file, to be able to include special functions/macros before including the body declarations
  • Need to modify a bit the nanoFramework.CoreLib, to also include those private fields the IL2C.Runtime expects to have
  • Need to patch up the generated C/H files with special cases, which are naturally not generated correctly for this special case, like:
    • typedef struct System_Byte System_Byte; => typedef struct uint8_t System_Byte;
    • In Delegate.h: typedef System_MulticastDelegate System_Delegate; => typedef struct System_Delegate System_Delegate;
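The "#include generator ordering" item above is essentially a topological sort over type dependencies. A minimal Python sketch of that idea follows. It uses the standard library's `graphlib` (Python 3.9+), not the actual HeaderWriter code, and the header names and dependency map are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def include_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Order headers so that each one comes after everything it depends on.

    Raises graphlib.CycleError if the dependencies are circular, in which
    case forward declarations would be needed to break the cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical dependency map: header -> headers it must be included after.
deps = {
    "System/Object.h": set(),
    "System/String.h": {"System/Object.h"},
    "System/Delegate.h": {"System/Object.h"},
    "System/MulticastDelegate.h": {"System/Delegate.h"},
}

for header in include_order(deps):
    print(f'#include "{header}"')
```

The exact order among headers that do not depend on each other is not significant; only the "dependency first" property matters.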

If after these changes compilation succeeds, we will have a new runtime static library, with external references to methods not implemented, and this could be collected probably with libtools somehow.

Edit

There is no need to patch the generated C code. Instead, I introduced the Interop attributes in nanoFramework and used NativeType to annotate all the simple types, except for Delegate and MulticastDelegate.

@kekyo
Owner Author

kekyo commented Oct 16, 2021

I think I partly understand your strategy :) Could you tell me whether my understanding is correct:

  1. Assume that nanoFramework.CoreLibrary publishes a reasonably sized set of symbols,
  2. Build IL2C with a little tweaking so that it refers to the CoreLibrary symbols,
  3. Extract the result from the symbol information of the combined native binary (with nm and the like).

The difference between the list of symbols obtained from a normal IL2C binary and from the CoreLibrary-combined binary is probably what is still missing, and closing that gap is the next step for our implementation.

@cyborgyn

Yes, something close. The produced combined library will contain a lot of extern symbols, which will be coming from the CoreLibrary's methods, annotated with [MethodImpl(MethodImplOptions.InternalCall)] but no implementation found in the Runtime. We can decide one by one, which one to implement on which side: C# or C. At least I hope, that it will work like this.
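A small Python sketch of how the extern symbols could be collected from `nm` output. It assumes the default GNU `nm` text format, where defined symbols have an address column and undefined ones do not. The object and symbol names in the sample are invented for illustration.

```python
def parse_nm(nm_output: str) -> tuple[set[str], set[str]]:
    """Split default-format `nm` output into (defined, undefined) symbol sets."""
    defined: set[str] = set()
    undefined: set[str] = set()
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 3:
            # "<address> <type> <name>": the symbol is defined in this object.
            defined.add(parts[2])
        elif len(parts) == 2:
            # "<type> <name>" with no address: the symbol is only referenced.
            undefined.add(parts[1])
        # Anything else (blank lines, "member.o:" archive headers) is skipped.
    return defined, undefined


def unresolved(nm_output: str) -> list[str]:
    """Symbols referenced somewhere in the library but defined nowhere in it."""
    defined, undefined = parse_nm(nm_output)
    return sorted(undefined - defined)


# Hypothetical `nm` output for a combined static library.
sample_nm = """\
runtime.o:
0000000000000000 T System_Object_ToString
                 U System_String_Concat

corlib.o:
0000000000000000 T System_String_Concat
                 U System_Math_Sqrt
"""

print(unresolved(sample_nm))
```

Each name that remains would be a candidate to implement either on the C# side or on the C side.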

@kekyo
Owner Author

kekyo commented Oct 17, 2021

I understand now!

Could you get the feature/implement-ilcodes branch ready for review before doing that? It already has a long commit tree, so I want to review it before continuing with this idea.

(I assume you will work on feature/external-mscorlib; could you rebase it onto the merged PR commit in devel?)

@cyborgyn

I think it is ready for review. In the meantime, I will stop working on the feature/external-mscorlib branch and add some more unit tests to the PR if I have some time.
