Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose CLong, CULong, and NFloat interchange types #13788

Closed
tannergooding opened this issue Nov 12, 2019 · 46 comments · Fixed by #46401
Closed

Expose CLong, CULong, and NFloat interchange types #13788

tannergooding opened this issue Nov 12, 2019 · 46 comments · Fixed by #46401
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-Interop-coreclr
Milestone

Comments

@tannergooding
Copy link
Member

tannergooding commented Nov 12, 2019

Rationale

There exists a few common interchange types which are variable sized. Unlike many of the other types, it is non-trivial for a user to define these types themselves and even if they did, there are some calling conventions where structs are not marshaled the same as the underlying T they may wrap.

These types include long and unsigned long types defined by C/C++ which have variable size depending on the target platform/architecture. On Windows, these types are always 4-bytes and are equivalent to int/uint in C#. On Unix, however, these types are 4-bytes or 8-bytes depending on if you are targeting a 32-bit or 64-bit platform which makes them equivalent to nint/nuint.

Likewise Apple defines CGFloat which is float on 32-bit systems and double on 64-bit systems.

As such, I propose we expose minimal types that are recognized by the runtime and are handled as if they were the corresponding underlying primitive type to avoid the ABI issues. These types are explicitly very minimal and do not provide "common" operations such as parsing, formatting, or even simple operators.

Proposal

namespace System.Runtime.InteropServices
{
    public readonly struct CLong : IEquatable<CLong>
    {
        public CLong(nint value);

        public nint Value { get; }

        public override bool Equals(object o);
        public bool Equals(CLong other);

        public override int GetHashCode();

        public override string ToString();
    }

    public readonly struct CULong : IEquatable<CULong>
    {
        public CULong(nuint value);

        public nuint Value { get; }

        public override bool Equals(object o);
        public bool Equals(CULong other);

        public override int GetHashCode();

        public override string ToString();
    }

    public readonly struct NFloat : IEquatable<NFloat>
    {
        public NFloat(double value);

        public double Value { get; }

        public override bool Equals(object o);
        public bool Equals(NFloat other);

        public override int GetHashCode();

        public override string ToString();
    }
}
@tannergooding
Copy link
Member Author

CC. @jkoritzinsky, @AaronRobinsonMSFT, @jkotas for thoughts/opinions/etc.

@tannergooding
Copy link
Member Author

It does look like there is one unused bit on StructLayout for LayoutKind, so it would also be possible to have StructLayout(LayoutKind.Transparent), for example, as an alternative proposal.

@GrabYourPitchforks
Copy link
Member

Would something like the below also make sense?

[MarshalAs(UnmanagedType.I4)] // This HRESULT should be marshaled as a 32-bit integer
public struct HRESULT
{
    internal int _hr; // contains the actual data
}

@FiniteReality
Copy link

Would something like the below also make sense?

[MarshalAs(UnmanagedType.I4)] // This HRESULT should be marshaled as a 32-bit integer
public struct HRESULT
{
    internal int _hr; // contains the actual data
}

How would this work for pointers, which are based on the native platform word size? Some native libraries, such as DirectX, have APIs which take arrays of pointers. It's hard to accept parameters like that as Span<T> does not support pointer types as T, so it's somewhat common to see code like this:

struct Pointer<T> where T : unmanaged
{
    public T* value;
}

Passing these values as Span<T> is then possible, as you can use Span<Pointer<T>>

@jkotas
Copy link
Member

jkotas commented Nov 13, 2019

We have number of similar suggestions to add more rules for generating interop marshallers (e.g. one for Span<T> - https://github.com/dotnet/coreclr/issues/14460) or to fix bugs in existing interop marshallers. My preference would be to implement these new rules using a build-time interop source generator only. It would create incentive for projects to switch to build-time generated interop and help us avoid all problems with runtime-generated interop that we have today (startup performance, version resilient AOT, compatibility, diagnosability and ease of maintenance).

Once we have a built-time generated interop, I would be open to add as many of these new marshaling rules as necessary.

@tannergooding
Copy link
Member Author

Once we have a built-time generated interop, I would be open to add as many of these new marshaling rules as necessary.

I'm not sure that will be a sufficient way to go about this. There are a large number of interop libraries in existence today, some of which are purposefully exposing bindings in a particular way and which may not be able to change due to it being breaking or due to it being non-trivial effort.

You likewise have cases where there is a lot of customization and/or configurability desired. For example:

  • You may not want bindings for all exported declarations, you may only want a subset of them
  • You may want to expose 1-to-1 blittable P/Invokes or you may want more friendly P/Invokes which use ref/out/in, arrays, strings, etc
  • You may COM headers where you want to interop using C# interfaces or you may want to manually layout the VTBL and use function pointers
  • You may be using a header file that benefits from or requires re-interpretation of certain types (e.g. SIZE_T as nuint, or HWND__ as IntPtr)
  • You may be using a header file that benefits from fixups of various names (e.g. tagPOINT to POINT or D3DCOLORVALUE to DXGI_RGBA)
  • You may be using a header file that has non-trivial differences between Unix and Windows

While I think having some sort of source generator to assist with this would be nice, and while there are many existing tools that help with the binding generation process (ClangSharp, CppAst, etc); they will almost always need some kind of manual fixups, customization, etc. They also all, today, use existing interop mechanisms exposed through System.Runtime.InteropServices and would likely need to continue doing so in the future as well (even any generated by some future "offficial" binding generator).

@jkotas
Copy link
Member

jkotas commented Nov 13, 2019

I am not suggesting that we should invent universal interop source generator. I agree that it is close to impossible to build such thing. Let me change my statement to: "I would be open to add as many of these new marshaling rules as reasonable.".

This is what you can write today:

internal struct HRESULT
{
    internal int _hr;
    internal HRESULT(int hr) { _hr = hr; }
}
[DllImport("MyLibrary", EntryPoint="MethodThatReturnsHRESULT")]
private extern static int _MethodThatReturnsHRESULT();

internal static HRESULT MethodThatReturnsHRESULT()
{
    return new HRESULT(_MethodThatReturnsHRESULT(());
}

This proposal would let you avoid a bit of the boiler plate code and instead write:

internal struct HRESULT
{
    internal int _hr;
}
[DllImport("MyLibrary")]
internal extern static [return: MarshalAs(UnmanagedType.TransparentStruct)] MethodThatReturnsHRESULT();

My point is that boiler plate code generator should run at build time. Runtime is not a good place for it.

@tannergooding
Copy link
Member Author

So basically your proposal is the user would write the latter and then some build time tool (possibly C# source generators or an IL rewriter, so other languages can also benefit) would convert it to the former; Is that correct?

I'd be interested in hearing the proposal for how that would work with function pointers and delegates, although I could conceive of that being handled by wrappers as well.

I'd imagine there would also be a fairly long tail of bugs around ensuring the JIT produced good codegen for these scenarios (but fixing those would help with better codegen in the JIT overall, not just for interop).

@jkotas
Copy link
Member

jkotas commented Nov 13, 2019

I'd imagine there would also be a fairly long tail of bugs around ensuring the JIT produced good codegen

I expect that this would have better performance from the get-go (both startup performance and throughput). The marshaling IL generated by the runtime today is no different from what you can write in C#. However:

  • The marshaling IL generated by the runtime is special cased in number of places of the system, and as a result comes with number of limitations. For example, the JIT is not able to inline it - even if it is something trivial like converting bool to BOOL.
  • It is hard to spot and fix inefficiencies in the runtime generated IL
  • It is hard to evolve the runtime generator without introducing regressions

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@tannergooding tannergooding changed the title Expose a new UnmanagedType: TransparentStruct Expose a new TransparentStruct attribute and CLong/CULong interchange types Apr 22, 2020
@tannergooding tannergooding added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Apr 22, 2020
@tannergooding
Copy link
Member Author

Updated the proposal based on the discussion that has been happening on #27530 (comment)

@tannergooding tannergooding changed the title Expose a new TransparentStruct attribute and CLong/CULong interchange types Expose CLong, CULong, and NFloat interchange types Apr 23, 2020
@tannergooding
Copy link
Member Author

@jkotas, @AaronRobinsonMSFT, @jkoritzinsky. I've updated the top proposal based on the most recent discussion from #27530.

The proposal suggests we expose CLong (C/C++ long type), CULong (C/C++ unsigned long type), and NFloat (Apple's CGFloat type) and no longer suggests we expose a TransparentStruct or equivalent attribute.
The proposal also explicitly calls out that these types are explicitly minimal and do not plan on exposing common operations such as parsing/formatting or operators.

Could you let me know if everything looks good so I can mark this as api-ready-for-review?

@AaronRobinsonMSFT
Copy link
Member

@tannergooding I think these are good for us. What about mono? I assume we would want identical semantics so we should hear their thoughts as well.

@tannergooding
Copy link
Member Author

Xamarin defines nfloat here: https://docs.microsoft.com/en-us/dotnet/api/system.nfloat?view=xamarin-ios-sdk-12. I don't think they have a CLong equivalent today.

Feel free to tag whoever you think is appropriate, I'm unsure who the relevant area owners are here.

@AaronRobinsonMSFT
Copy link
Member

@lambdageek and @marek-safar Do either of you have thoughts on this API proposal?

@jkotas
Copy link
Member

jkotas commented Apr 23, 2020

Could you let me know if everything looks good

It looks good to me.

NFloat

I am not sure how much we get by adding this - since Xamarin framework has nfloat already. @lambdageek @marek-safar Would it make sense to switch from nfloat to it in .NET 5?

@tannergooding
Copy link
Member Author

I'm not sure they could switch, outside using it themselves as the internal field of their nfloat (which would just let all runtime have a consistent type to check for callconv handling).

Their nfloat implements all the common methods, interfaces, and operators.

@marek-safar
Copy link
Contributor

I don't think Xamarin APIs can migrate to this version of nfloat. It'd be a noticeable breaking change for the customers so we'd end up with 2 types in "BCL" with the same name in different namespaces which I don't think is an improvement for the developers.

@rolfbjarne do you agree?

@rolfbjarne
Copy link
Member

I don't think Xamarin APIs can migrate to this version of nfloat. It'd be a noticeable breaking change for the customers so we'd end up with 2 types in "BCL" with the same name in different namespaces which I don't think is an improvement for the developers.

Correct, Xamarin.iOS/Mac can't migrate to these versions of nint/nuint/nfloat, it would be quite the breaking change.

What we'd need in order to avoid breaking changes is for the runtime to treat the types we define (System.nint, System.nuintand System.nfloat) as primitive types (to prevent ABI issues). Ideally the generated native code would be equivalent to using the int/long/float/double version too (for performance reasons).

I'm not sure they could switch, outside using it themselves as the internal field of their nfloat

This runs into the ABI problem again.

@tannergooding
Copy link
Member Author

tannergooding commented Apr 23, 2020

Just to clarify, while the proposed NFloat corresponds to System.nfloat, the proposed CLong and CULong do not correspond to System.nint/System.nuint.

CLong and CULong correspond to the C/C++ long and unsigned long keywords respectively, which are 32-bits on Windows, 32-bits on 32-bit Unix, and 64-bits on 64-bit Unix. That is, it is equivalent to System.Int32 on Windows and System.IntPtr on Unix.

System.nint and System.nuint are Xamarin types that correspond to System.IntPtr and System.UIntPtr respectively, but which expose operators/etc. They (IntPtr and UIntPtr) won't have new types exposed because it is already a runtime primitive with the correct behavior and the C# language is exposing new nint and nuint keywords for them in C# 9 (Xamarin likely has a separate decision on how to handle that, if at all).

@tannergooding tannergooding added api-ready-for-review and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Jun 12, 2020
@carlossanlop carlossanlop added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-ready-for-review labels Jul 6, 2020
@markusschaber
Copy link

markusschaber commented Jul 13, 2020

Shouldn't CLong / CULong expose 64 bit values, so the full range can be marshaled on 64 Bit Linux / MacOS / iOS / Android?

@DeafMan1983
Copy link

Ok @tannergooding !! I understand you. Net is not Java, correctly! But who is still Java programmer and he/she want learn with C# if they mess up like "where is NativeLong" in C#. That is problem. I want support Java-programmers into C#. "c" of clong, etc.. are okay - We need duplicate to "native" or you make easy explanation for Java programmer if they learn and understand Java and C# equivalent.

Please don't be aggressive me! I just want help if Java coders write easy with translation from Java to C# like Tangible Software Solutions programs. But they don't care converter and just translate by hands.

@tannergooding
Copy link
Member Author

tannergooding commented Sep 23, 2020

Please don't be aggressive me!

I definitely apologize if it came across that way. I didn't mean to come across as aggressive and just intended to explain why we don't do it like Java and the reasoning why the name CLong was chosen here.

But who is still Java programmer and he/she want learn with C# if they mess up like "where is NativeLong" in C#. That is problem. I want support Java-programmers into C#.

I think that is a documentation issue. We have docs like https://docs.microsoft.com/dotnet/framework/interop/marshaling-data-with-platform-invoke that provide a mapping of common types from C or the Windows API to the corresponding .NET Type.

I would expect it to be possible to provide similar for other common languages, like Java, Rust, Objective-C, or Swift and even do this for other APIs, as JNI is by far not the only API that differs. Things like the Math, Console, Networking, File I/O, and more APIs all differ significantly both in where/how they are exposed and in how they are used (something akin to a "Get started with C#, I know 'x'" rather than just "Get started with C#, I'm a beginner".)

@FiniteReality
Copy link

mess up like "where is NativeLong" in C#.

For what it's worth, C#9 also added native integers (i.e. nint and nuint) which should cover the "native" (i.e. CPU word size) integers: https://github.com/dotnet/csharplang/blob/cc11a3bb7a1be4c9437846b40d07c5174e579b15/proposals/csharp-9.0/native-integers.md

@DeafMan1983
Copy link

DeafMan1983 commented Sep 23, 2020

Great job @FiniteReality nice explanation and formation with C# and nint/nuint.
but you can't do like this:

public class Window : nint // nint is as struct = Exception error
{
   // constructure
}

But I need add protected constructor like this

public class Window : nint // nint is as class = Correctly
{
   public Window(int value) : base(value)
   {
   }
}

But why do you not define nint with "class" and shouldn't be "struct"
Like that I explain you.
object -> nint/nuint -> class = works like GCHandle
object -> nint/nuint -> struct = throws exception because it doesn't work with IntPtr or UIntPtr like unusable...

Please remember about ObjectRefine like Gtk2 & Gtk3 you know far but we do not implement for simple C# class like class without g_object_ref and g_object_unref. How do you resolve class from "typedef {} DefineObject"? It means as class with subclass.
IntPtr and UIntPtr use only "ref" for making struct.

I hope you understand me. I am not wrongly.

// EDIT:
I will show you like you say but.
grafik
It is like nint as struct

But why does Display not like struct as subclass?
grafik

That is problem.

@FiniteReality
Copy link

You're using OOP wrong; if you want a handle type you should be using something like SafeHandle to properly implement your type. A nint is just the same as any other integer type, like int or long - and I don't think you'd want to subclass int for anything, because what would that even mean?

@tannergooding
Copy link
Member Author

@AaronRobinsonMSFT, this is going to require special handling so that CLong, CULong, and NFloat are treated as the primitive type rather than as structs for the purposes of marshalling (I believe this only impacts Windows thiscall).

There is already special handling for this scenario for COM methods in dllimport.cpp, but IIRC we are also trying to move more logic to just be in the JIT and to match it by default.

So, should the new CLong, CULong, NFloat logic be added to dllimport or to the JIT?

@jkotas
Copy link
Member

jkotas commented Sep 29, 2020

I think CLong, CULong and NFloat should be handled as intrinsic types in the unmanaged calling convention in the JIT. I think it would be nice to finish #39294 first before starting on this one.

cc @jkoritzinsky

@jkoritzinsky
Copy link
Member

I'll try to get back on #39294 so we can finish it up. I agree that these types should be handled in the JIT. That makes things like UnmanagedCallersOnly support for these types extremely easy.

@AaronRobinsonMSFT
Copy link
Member

I think it would be nice to finish #39294 first before starting on this one.

Agreed.

That makes things like UnmanagedCallersOnly support for these types extremely easy.

Yep.

@ceztko
Copy link

ceztko commented Nov 16, 2020

I also think a solution for ABI compatibility for the long types in unix architectures is desirable. If I can give a suggestion for naming: just call them nlong, nulong. Yes, n prefix would not reflect exactly the same scaling semantics as for nint, nuint, and nfloat but I would dislike a new non-obvious c or C prefix way more. After all, the hope is that native types are used just by people that know what they are meant for.

I also have a question: where I can find the cs source for System.nfloat ? I could not find it in xamarin repositories I searched.

@am11
Copy link
Member

am11 commented Nov 16, 2020

where I can find the cs source for System.nfloat

It is part of mono mini, also mirrored in runtime repo:

public unsafe struct nfloat : IFormattable, IConvertible, IComparable, IComparable<nfloat>, IEquatable <nfloat>

@tannergooding
Copy link
Member Author

tannergooding commented Nov 16, 2020

I also think a solution for ABI compatibility for the long types in unix architectures is desirable.

That is what the CLong and CULong types are. I don't think the n prefix is better for long and/or ulong in this context. They are in no way tied to the native platform/architecture like nint, nuint, or nfloat are and are instead tied to how the underlying ABI handles the correspondingly named types from the C programming language.

@ceztko
Copy link

ceztko commented Nov 16, 2020

They are in no way tied to the native platform/architecture like nint, nuint, or nfloat

"No way" seems too bold here when actually C/C++ long has a scaling semantics that behaves exactly like nint in all Unix .NET supported ABIs. My opinion here is that in the long run over detailed naming scheme hurts eyes, and c/C prefixes are not that fantastic choice that makes them appear like they are the perfect fit. n prefix would have the advantage of grouping all the types that have special meaning with regard to runtime architecture and ABIs. Also consider how the very n prefix is probably already a compromise when used for nfloat: there's no correlation to FPU registers size compared to what happens for integral types (FP registers are most often 80bit sizes), and calling it "native" gives the wrong feeling that is the right type to use for general math in all architectures, which of course it's not true. Still I take the n choice in nfloat as an acceptable, and understandable, compromise.

@DeafMan1983
Copy link

Oh my god powerful teacher, @ceztko ! Good job! Linux/Unix or BSD are very different to MS-DOS. Why does young Bill Gates work hard since initial Windows Operating System with 32Bit like C/C++ formation. Is it correctly?

Since my birth C++ releases initial and it works for Unix Format too. Mathematics failed to mean 64 or 32 for long.That's why I understand that now. OK we need with nint, nuint, nfloat, nshort etc for C# Unix-Net.

@tannergooding
Copy link
Member Author

While it does match the behavior for Unix platforms which are LP64, it does not match Windows, which is LLP64 (long is always 32-bits`).

"native" in this context doesn't mean register size, which might even be 128 or more bits on some platforms. It means it matches the bitness of the underlying platform, so either 32 or 64-bits and is ultimately just extending an existing concept.

With regards to the C long and unsigned long keywords: If you are only supporting Unix, then using nint and nuint are ultimately better. If you are only supporting Windows, then int is better. CLong and CULong will ultimately just be interchange types with no operator support for the scenarios where you need to support bindings for both Unix and Windows simultaneously and where the C library (for whatever reason) decided to use long on both.

This scenario arises quite frequently in code that was ported to Windows before the 64-bit split and less commonly after where codebases often use intptr_t and uintptr_t instead to ensure consistent behavior and usability. However, it does still crop up from time to time, such as with several of the libClang bindings.

@ceztko
Copy link

ceztko commented Nov 16, 2020

@tannergooding thank you for continuing the discussion. Actually there was no need to re-instantiate explanation for your proposed CLong/CULong types which is was already clear to me. I just said that I don't like c/ C and I suggested to make it easy and prefix all possibly scaling types with n and specify implementation details and rationale in the documentation. In the end, I guess you'll have internally a committee at Microsoft and have a resolution later on names of these types. Since I am not going to take part at this committee I just wanted to leave my feedback. That's all.

@AaronRobinsonMSFT
Copy link
Member

In the end, I guess you'll have internally a committee at Microsoft and have a resolution later on names of these types.

The API review was already done and referenced above at #13788 (comment). Everyone is welcome to participate in those reviews as they are public, this was already happened unfortunately. Thank you for sharing your feedback through it is much appreciated.

@markusschaber
Copy link

This scenario arises quite frequently in code that was ported to Windows before the 64-bit split and less commonly after where codebases often use intptr_t and uintptr_t instead to ensure consistent behavior and usability. However, it does still crop up from time to time, such as with several of the libClang bindings.

Just as an anecdote: In our case, it's an existing, historically grown C API/ABI with roots about 25 years in the past. Back then, 64 bit for Desktop (Windows, Linux) was in the far future, and even more in the dozen or so embedded platforms which are our main target. As there's also a C++ API using the same structs and datatypes, the use of "long" also could not be upgraded to modern types like "int32_t" or "ptrsize_t" without breaking ABI compatibility for C++ clients.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Dec 25, 2020
@jkoritzinsky jkoritzinsky self-assigned this Dec 25, 2020
@GSPP
Copy link

GSPP commented Dec 30, 2020

This looks like something that would benefit from a proof of concept to validate the design. For example, I notice that there are no conversion operators. The types are intentionally minimal. I wonder how well this works under practical circumstances. I also wonder if this feature carries it's weight. In this issue, there was pretty minimal discussion around that. Maybe this discussion happened elsewhere?

@ssa3512
Copy link

ssa3512 commented Dec 30, 2020

@GSPP my understanding (and need) is that these really are only for use in cross platform private interop methods. You would never expose them in a public API - you would instead expose int or long or some other result in the public surface. This was discussed in the API review video linked here

@AaronRobinsonMSFT
Copy link
Member

is that these really are only for use in cross platform private interop methods. You would never expose them in a public API - you would instead expose int or long or some other result in the public surface.

Correct. Exposing these to users would be rather annoying and represent a leak in the API abstraction being exposed. The primary purpose here is to help interop do the right thing when it concerns cross platform APIs that use a C/C++ long on the native side.

@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jan 12, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Feb 12, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-Interop-coreclr
Projects
None yet
Development

Successfully merging a pull request may close this issue.