Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Analyzer]: Unexpected allocation when converting arrays to spans #69577

Closed
stephentoub opened this issue May 19, 2022 · 18 comments
Closed

[Analyzer]: Unexpected allocation when converting arrays to spans #69577

stephentoub opened this issue May 19, 2022 · 18 comments
Labels
api-approved API was approved in API review, it can be implemented area-System.Memory code-analyzer Marks an issue that suggests a Roslyn analyzer tenet-performance Performance related issue
Milestone

Comments

@stephentoub
Copy link
Member

The C# compiler has a valuable optimization that enables code like:

ReadOnlySpan<byte> span = new byte[] { 1, 2, 3 };

to be converted into code that emits that constant data into the assembly, and then this code just creates a span directly referencing that data. However, this optimization is limited to single byte primitive types, to all of the initialization data being constant, and to immediately storing into a ReadOnlySpan<byte>. If you do:

ReadOnlySpan<byte> span = new byte[] { 1, 2, someStaticField };

or

Span<byte> span = new byte[] { 1, 2, 3 };

or

ReadOnlySpan<char> span = new char[] { 'a', 'b', 'c' };

those will all allocate.

With #60948 and compiler support via dotnet/roslyn#61414 (or something similar), this optimization will helpfully be extended as well to char, short, ushort, int, uint, long, ulong, single, and double, such that code like:

ReadOnlySpan<char> span = new char[] { 'a', 'b', 'c' };

will similarly become non-allocating... but only when targeting .NET 7 or newer such that RuntimeHelpers.CreateSpan is available. When that methods isn't available, this same line of code will again regress to allocating.

It's thus really easy for a developer to accidentally allocate, as this syntax is something a developer typically only writes when they're actively trying to optimize and avoid the allocation. Hopefully language syntax will help to make this simpler, clearer, and less error prone in the future (dotnet/csharplang#5295), but for now we can add an analyzer to detect these cases and warn the developer about the unexpected allocation.

The analyzer would warn when:

  • Casting a new T[] expression to a ReadOnlySpan<T> either implicitly or explicitly, and
  • Either T isn't byte/sbyte/bool or CreateSpan is available and T isn't byte/sbyte/bool/char/ushort/short/uint/int/ulong/long/single/double, or
  • Some of the data being used to initialize the array isn't const

We could restrict this further to only applying to properties/method following the pattern:

static ReadOnlySpan<T> Data => new T[] { ... };

or we could extend it to statement bodies as well.

The default level should probably be suggestion.

For a fixer, we could write one for the property case, changing it to instead be a static readonly array. Or we could just not have a fixer at all.

@stephentoub stephentoub added tenet-performance Performance related issue code-analyzer Marks an issue that suggests a Roslyn analyzer api-ready-for-review API is ready for review, it is NOT ready for implementation labels May 19, 2022
@stephentoub stephentoub added this to the 7.0.0 milestone May 19, 2022
@ghost
Copy link

ghost commented May 19, 2022

Tagging subscribers to this area: @dotnet/area-system-memory
See info in area-owners.md if you want to be subscribed.

Issue Details

The C# compiler has a valuable optimization that enables code like:

ReadOnlySpan<byte> span = new byte[] { 1, 2, 3 };

to be converted into code that emits that constant data into the assembly, and then this code just creates a span directly referencing that data. However, this optimization is limited to single byte primitive types, to all of the initialization data being constant, and to immediately storing into a ReadOnlySpan<byte>. If you do:

ReadOnlySpan<byte> span = new byte[] { 1, 2, someStaticField };

or

Span<byte> span = new byte[] { 1, 2, 3 };

or

ReadOnlySpan<char> span = new char[] { 'a', 'b', 'c' };

those will all allocate.

With #60948 and compiler support via dotnet/roslyn#61414 (or something similar), this optimization will helpfully be extended as well to char, short, ushort, int, uint, long, ulong, single, and double, such that code like:

ReadOnlySpan<char> span = new char[] { 'a', 'b', 'c' };

will similarly become non-allocating... but only when targeting .NET 7 or newer such that RuntimeHelpers.CreateSpan is available. When that methods isn't available, this same line of code will again regress to allocating.

It's thus really easy for a developer to accidentally allocate, as this syntax is something a developer typically only writes when they're actively trying to optimize and avoid the allocation. Hopefully language syntax will help to make this simpler, clearer, and less error prone in the future (dotnet/csharplang#5295), but for now we can add an analyzer to detect these cases and warn the developer about the unexpected allocation.

The analyzer would warn when:

  • Casting a new T[] expression to a ReadOnlySpan<T> either implicitly or explicitly, and
  • Either T isn't byte/sbyte/bool or CreateSpan is available and T isn't byte/sbyte/bool/char/ushort/short/uint/int/ulong/long/single/double, or
  • Some of the data being used to initialize the array isn't const

We could restrict this further to only applying to properties/method following the pattern:

static ReadOnlySpan<T> Data => new T[] { ... };

or we could extend it to statement bodies as well.

The default level should probably be suggestion.

For a fixer, we could write one for the property case, changing it to instead be a static readonly array. Or we could just not have a fixer at all.

Author: stephentoub
Assignees: -
Labels:

area-System.Memory, tenet-performance, code-analyzer, api-ready-for-review

Milestone: 7.0.0

@GrabYourPitchforks
Copy link
Member

@stephentoub You gave as an example of a pitfall Span<byte> span = new byte[] { 1, 2, 3 };, but your "analyzer would warn when" section doesn't cover this case. Was this intentional?

@stephentoub
Copy link
Member Author

stephentoub commented May 20, 2022

Was this intentional?

It was. I used it as an example, but I think this is likely to be intentional on the part of a developer, and so warning about this would be likely to be noisy. I'd be more inclined to see a separate analyzer that sees you're never writing to a Span<T> (or passing it to something that accepts a Span and doesn't have a ReadOnlySpan overload) and recommends it be changed into a ReadOnlySpan<T>, at which point either the optimization would kick in or this analyzer would start warning.

@Neme12
Copy link

Neme12 commented May 30, 2022

I'm not sure this analyzer is useful for the general public, it seems like something only dotnet/runtime cares about. Or if does ship with .NET and is available to everyone, I wouldn't have the analyzer trigger automatically, only when you give a hint that you intend for this to be constant, for example:

// constalloc
static ReadOnlySpan<byte> Data => new byte[] { ... };

The analyzer would see e.g. the "constalloc" comment and would know that you intend for this to not be an array allocation but rather constant data embedded in the assembly. Since you gave an explicit hint, it could even be a warning by default. And it could catch the Span<> case as well like this.

Because what if you really need to have something non constant in there, let's say a field, and don't care about an allocation? The analyzer shouldn't bother people unless they explicitly want this to be constant, not even as a suggestion, because they can do nothing about that suggestion if they really need something non-constant. I think it would be much better if the analyzer only triggered when you explicitly mark the allocation somehow, for example with a comment like that. Alternatively, it could be an attribute that isn't emitted.

@stephentoub
Copy link
Member Author

stephentoub commented May 30, 2022

I'm not sure this analyzer is useful for the general public, it seems like something only dotnet/runtime cares about.

I disagree. Someone not familiar with the Roslyn optimization is unlikely to employ this pattern, as it's very counter intuitive, so it's only likely to affect those folks actually trying to remove allocations. On top of which, it won't be a warning or error by default, so it's just like any other suggestion in VS that can be upgraded to a warning or error for folks that care. And even if a hypothetical new language feature arrives, we still want to protect against this use, which must still work correctly or else risk significant regressions.

@Neme12
Copy link

Neme12 commented May 30, 2022

I disagree. Someone not familiar with the Roslyn optimization is unlikely to employ this pattern, as it's very counter intuitive, so it's only likely to affect those folks actually trying to remove allocations

Why is it unlikely? What if they just want to allocate an array but store it in a span for whatever reason, e.g. they always prefer that to arrays because accessing a span doesn't do variance checks?

@Neme12
Copy link

Neme12 commented May 30, 2022

On top of which, it won't be a warning or error by default, so it's just like any other suggestion in VS that can be upgraded to a warning or error for folks that care.

IDE suggestions are usually actionable though. But if you're allocating an array from non-constants, there's nothing you can do about that suggestion. It's a false positive and not useful to you at all. In fact it might only confuse you because you know nothing about any assembly data optimization, you're just trying to allocate an array and store it in a span.

@stephentoub
Copy link
Member Author

stephentoub commented May 30, 2022

they always prefer that to arrays because accessing a span doesn't do variance checks?

Knowing about and being concerned about the overhead of variance checks is expert level. And I struggle to imagine someone being ok with allocating a new array on every property access who is ok with that but not with the smaller overhead of variance checks.

Also, there are no types that have variance checks that this optimization would trigger for.

@stephentoub
Copy link
Member Author

stephentoub commented May 30, 2022

IDE suggestions are usually actionable though.

a) This is as well.
b) Many suggestions aren't.
c) If it's really confusing, disable the suggestion.

@Neme12
Copy link

Neme12 commented May 30, 2022

Knowing about and being concerned about the overhead of variance checks is expert level.

OK, maybe you just prefer span for another reason, e.g. you're going to slice it.

And I struggle to imagine someone being ok with allocating a new array on every property access who is ok with that but not with the smaller overhead of variance checks.

If the analyzer only worked for properties, then yeah it would make more sense. But if you're just doing this inside method:

ReadOnlySpan<byte> span = new byte[] { something1, something2 };

then you might not know anything about any optimization, you're just doing what you did before.

a) This is as well.

But only sometimes - it will only actionable in a minority of cases. If you're really putting something non-constant in the array, then chances are you really need something non-constant and you cannot make it constant, otherwise it would have been constant already.

c) If it's really confusing, disable the suggestion.

Hm, but let's say you find this analyzer useful, but only in cases where constant data is really what you want and you'd still find it useful if you could be explicit about it (like what the hypothetical language feature would be for - it would let you be explicit about it). Even in dotnet/runtime, if this is enabled as a warning, I can imagine there will be a lot of false positives where you're doing what I shown above inside a method and can't make it constant. As I said, if you're passing something non-constant, chances are you actually need that as opposed to it being a mistake.

@Neme12
Copy link

Neme12 commented May 30, 2022

Also, there are no types that have variance checks that this analyzer would trigger for.

You said in the original post that it would trigger when:

  • Either T isn't byte/sbyte/bool or CreateSpan is available and T isn't byte/sbyte/bool/char/ushort/short/uint/int/ulong/long/single/double, or

So when T is anything else.

@stephentoub
Copy link
Member Author

stephentoub commented May 30, 2022

Even in dotnet/runtime, if this is enabled as a warning, I can imagine there will be a lot of false positives where you're doing what I shown above inside a method and can't make it constant.

It will not. And part of the act of creating an analyzer is validating it on several repos, including runtime. If it were to be too noisy, we would tweak it.

maybe you just prefer span for another reason, e.g. you're going to slice it.
ReadOnlySpan<byte> span = new byte[] { something1, something2 };

Your example should be using stackalloc, not array allocation. It'd be a good thing for an analyzer to fire there.

it will only actionable in a minority of cases

Please share the data source that shows this is the minority use case.

@Neme12
Copy link

Neme12 commented May 30, 2022

I don't get why the default assumption would be that when you're allocating an array and putting something non-constant into it, you actually intended something constant and don't need that non-constant thing in there at all (even though you explicitly wrote it).

@stephentoub
Copy link
Member Author

stephentoub commented May 30, 2022

I don't get why the assumption would be that when you're allocating an array and putting something non-constant into it, you actually intended something constant and don't need that non-constant thing in there at all (even though you explicitly wrote it).

And directly assigning the new into a ReadOnlySpan. The whole point of suggestions are heuristics that might be of use. It's not an error, it's not a warning, it's a suggestion for something you might want to look at. Most suggestions are about things you "explicitly wrote"... by the cited logic nothing should ever suggest anything.

@Neme12
Copy link

Neme12 commented May 30, 2022

All suggestions are about things you "explicitly wrote"

Yes but it's almost always something that can be fixed automatically and is always actionable. Here it wouldn't always be actionable. You cannot just magically make a non-constant field, parameter or variable into a constant.

@stephentoub
Copy link
Member Author

Yes but it's almost always something that can be fixed automatically and is always actionable.

This is not true.

@stephentoub
Copy link
Member Author

Regardless, your objection seems to be that there will be too many false positives. As I've stated, if that ends up being the case, we can tweak the rule until its not the case. That's what we always do with new rules.

@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Jun 23, 2022
@terrajobst terrajobst modified the milestones: 7.0.0, Future Jun 23, 2022
@stephentoub
Copy link
Member Author

I'm going to close this for now. The same cases we'd likely be able to flag with a low false positive rate are also the ones the C# compiler will likely be optimizing in the near future.

@stephentoub stephentoub closed this as not planned Won't fix, can't repro, duplicate, stale Nov 15, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Dec 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Memory code-analyzer Marks an issue that suggests a Roslyn analyzer tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

5 participants
@GrabYourPitchforks @stephentoub @terrajobst @Neme12 and others