Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libraries-pgo] NRE in System.Text.Json.Serialization.Tests.TypeInfoResolverFunctionalTests.DoNotSerializeValue42 #72775

Closed
jakobbotsch opened this issue Jul 25, 2022 · 6 comments · Fixed by #72842
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jul 25, 2022
@ghost
Copy link

ghost commented Jul 25, 2022

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

Issue Details

Pipeline run: https://dnceng.visualstudio.com/public/_build/results?buildId=1900401&view=ms.vss-test-web.build-test-results-tab&runId=49454490&resultId=197453&paneView=debug
Console log: https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-main-d807ca12a06b41e2b8/System.Text.Json.Tests/1/console.82f1ff0d.log?helixlogtype=result

I see failures in both arm64 and win-x86 runs.

Author: jakobbotsch
Assignees: -
Labels:

area-System.Text.Json

Milestone: -

@jakobbotsch jakobbotsch added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed area-System.Text.Json untriaged New issue has not been triaged by the area owner labels Jul 25, 2022
@jakobbotsch jakobbotsch added this to the 7.0.0 milestone Jul 25, 2022
@ghost
Copy link

ghost commented Jul 25, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Pipeline run: https://dnceng.visualstudio.com/public/_build/results?buildId=1900401&view=ms.vss-test-web.build-test-results-tab&runId=49454490&resultId=197453&paneView=debug
Console log: https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-main-d807ca12a06b41e2b8/System.Text.Json.Tests/1/console.82f1ff0d.log?helixlogtype=result

I see failures in both arm64 and win-x86 runs.

Author: jakobbotsch
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@jakobbotsch jakobbotsch self-assigned this Jul 25, 2022
@jakobbotsch
Copy link
Member Author

Managed to get a repro on win-x86 after running several instances in a loop since this morning.
The attached PGO data repros it with

$env:COMPlus_JitRandomGuardedDevirtualization=57
$env:COMPlus_PGODataPath="pgo.txt"
$env:COMPlus_ReadPGOData=1
$env:COMPlus_TieredCompilation=0
$env:COMPlus_TieredPGO=1

and command line

C:\dev\dotnet\runtime\artifacts\bin\testhost\net7.0-windows-Release-x86\dotnet.exe exec --runtimeconfig System.Text.Json.Tests.runtimeconfig.json --depsfile System.Text.Json.Tests.deps.json xunit.console.dll System.Text.Json.Tests.dll -xml testResults.xml -nologo -notrait category=OuterLoop -notrait category=failing -method System.Text.Json.Serialization.Tests.TypeInfoResolverFunctionalTests.DoNotSerializeValue42

pgo.zip

@jakobbotsch
Copy link
Member Author

It seems gtTryRemoveBoxUpstreamEffects removes a box that is unused in the hot path but used in the cold path. When we hit the cold path we then ending up passing null instead of the result of boxing an int.
Trees after GDV expansion (hot call [000038], cold call [000018]):

------------ BB01 [000..???) -> BB04 (cond), preds={} succs={BB03,BB04}

***** BB01
STMT00000 ( 0x000[E-] ... 0x012 )
               [000007] -A---------ASG       ref   
               [000006] D------N---                         ├──▌  LCL_VAR   ref    V03 tmp0         
               [000005] -----------                         └──▌  ALLOCOBJ  ref   
               [000004] H----------                            └──▌  CNS_INT(h) int    0xAAE1444 class

***** BB01
STMT00001 ( ??? ... ??? )
               [000014] -A-XG------ASG       ref   
               [000013] D------N---                         ├──▌  LCL_VAR   ref    V04 tmp1         
               [000001] ---XG------                         └──▌  FIELD     ref    untypedPredicate
               [000000] -----------                            └──▌  LCL_VAR   ref    V00 this         

***** BB01
STMT00002 ( ??? ... ??? )
               [000012] -A---------ASG       int   
               [000011] -------N---                         ├──▌  IND       int   
               [000010] -----------                         │  └──▌  ADD       byref 
               [000008] -----------                         │     ├──▌  LCL_VAR   ref    V03 tmp0         
               [000009] -----------                         │     └──▌  CNS_INT   int    4
               [000003] -----------                         └──▌  LCL_VAR   int    V02 arg2         

***** BB01
STMT00005 ( ??? ... ??? )
               [000027] -A-X-------ASG       int   
               [000026] D------N---                         ├──▌  LCL_VAR   int    V06 tmp3         
               [000025] ---X-------                         └──▌  IND       int   
               [000024] -----------                            └──▌  ADD       byref 
               [000022] -----------                               ├──▌  LCL_VAR   ref    V04 tmp1         
               [000023] -----------                               └──▌  CNS_INT   int    12

***** BB01
STMT00006 ( ??? ... ??? )
               [000031] -----------JTRUE     void  
               [000030] -----------                         └──▌  NE        int   
               [000029] H----------                            ├──▌  CNS_INT(h) int    0xEF90DC8 ftn
               [000028] -----------                            └──▌  LCL_VAR   int    V06 tmp3         

------------ BB03 [???..???) -> BB02 (always), preds={} succs={BB02}

***** BB03
STMT00007 ( ??? ... ??? )
               [000037] -A-X-------ASG       ref   
               [000036] D------N---                         ├──▌  LCL_VAR   ref    V07 tmp4         
               [000035] ---X-------                         └──▌  IND       ref   
               [000034] -----------                            └──▌  ADD       byref 
               [000032] -----------                               ├──▌  LCL_VAR   ref    V04 tmp1         
               [000033] -----------                               └──▌  CNS_INT   int    4

***** BB03
STMT00008 ( ??? ... ??? )
               [000038] I-C-G------CALL      int    <>c.<JsonIgnoreConditionIsCorrectlyTranslatedToShouldSerializeDelegateAndChangingShouldSerializeIsRespected>b__36_4 (exactContextHnd=0x0EF7CB54)
               [000043] ----------- this                    ├──▌  LCL_VAR   ref    V07 tmp4         
               [000040] ----------- arg1                    ├──▌  LCL_VAR   ref    V01 arg1         
               [000041] ----------- arg2                    └──▌  BOX       ref   
               [000042] -----------                            └──▌  LCL_VAR   ref    V03 tmp0         

***** BB03
STMT00009 ( ??? ... ??? )
               [000046] -AC--------ASG       int   
               [000045] D------N---                         ├──▌  LCL_VAR   int    V05 tmp2         
               [000044] --C--------                         └──▌  RET_EXPR  int   (inl return expr [000038])

------------ BB04 [???..???), preds={} succs={BB02}

***** BB04
STMT00010 ( ??? ... ??? )
               [000048] -AC-G------ASG       int   
               [000047] D------N---                         ├──▌  LCL_VAR   int    V05 tmp2         
               [000018] --C-G------                         └──▌  CALL ind  int   
               [000053] ---X------- this                       ├──▌  IND       ref   
               [000052] -----------                            │  └──▌  ADD       byref 
               [000051] -----------                            │     ├──▌  LCL_VAR   ref    V04 tmp1         
               [000050] -----------                            │     └──▌  CNS_INT   int    4
               [000002] ----------- arg1                       ├──▌  LCL_VAR   ref    V01 arg1         
               [000017] ----------- arg2                       ├──▌  BOX       ref   
               [000016] -----------                            │  └──▌  LCL_VAR   ref    V03 tmp0         
               [000049] ----------- calli tgt                  └──▌  LCL_VAR   int    V06 tmp3         

------------ BB02 [???..013) (return), preds={} succs={}

***** BB02
STMT00004 ( ??? ... ??? )
               [000020] --C--------RETURN    int   
               [000021] -----------                         └──▌  LCL_VAR   int    V05 tmp2         
...
----------- Statements (and blocks) added due to the inlining of call [000038] -----------

Arguments setup:
gtTryRemoveBoxUpstreamEffects: attempting to remove side effects of BOX (valuetype) [000041] (assign/newobj STMT00000 copy STMT00002

Bashing NEWOBJ [000007] to NOP

Bashing COPY [000012] to NOP; no source side effects.
...
After constant propagation on [000016]:
STMT00010 ( ??? ... ??? )
N014 ( 53, 26) [000048] -ACXG---R--ASG       int    $VN.Void
N013 (  3,  2) [000047] D------N---                         ├──▌  LCL_VAR   int    V05 tmp2         d:2 $VN.Void
N012 ( 49, 23) [000018] --CXG------                         └──▌  CALL ind  int    $241
N006 (  8,  7) [000061] -A-X----R-- this setup                 ├──▌  ASG       ref    <l:$1c8, c:$1c7>
N005 (  3,  2) [000060] D------N---                            │  ├──▌  LCL_VAR   ref    V08 tmp5         d:1 $VN.Void
N004 (  4,  4) [000053] ---X-------                            │  └──▌  IND       ref    <l:$1ca, c:$1c9>
N003 (  2,  2) [000052] -------N---                            │     └──▌  ADD       byref  <l:$103, c:$104>
N001 (  1,  1) [000051] -----------                            │        ├──▌  LCL_VAR   ref    V04 tmp1         u:1 (last use) <l:$1c0, c:$82>
N002 (  1,  1) [000050] -----------                            │        └──▌  CNS_INT   int    4 $43
N008 (  9,  6) [000017] ----------- arg2 on STK                ├──▌  BOX       ref    $VN.Null
               [000069] -----------                            │  └──▌  CNS_INT   ref    null $VN.Null
N011 (  3,  2) [000049] ----------- calli tgt                  └──▌  LCL_VAR   int    V06 tmp3         u:1 (last use) <l:$200, c:$240>
N009 (  3,  2) [000062] ----------- this in ecx                ├──▌  LCL_VAR   ref    V08 tmp5         u:1 (last use) <l:$2c0, c:$86>
N010 (  3,  2) [000002] ----------- arg1 in edx                └──▌  LCL_VAR   ref    V01 arg1         u:1 (last use) $81

Full jitdump attached. I'll continue investigation tomorrow, but cc @AndyAyersMS if you have any ideas here.

out.txt

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Jul 25, 2022

Small repro (run with COMPlus_TieredPGO=1):

using System;
using System.Runtime.CompilerServices;
using System.Threading;

public class Program
{
    public static void Main()
    {
        for (int i = 0; i < 100; i++)
        {
            Call(new Impl1());
            if (i > 30)
                Thread.Sleep(10);
        }

        Call(new Impl2());
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Call(I i)
    {
        i.Foo(5);
    }
}

public interface I
{
    void Foo(object o);
}

class Impl1 : I
{
    public void Foo(object o)
    {
    }
}

class Impl2 : I
{
    public void Foo(object o)
    {
        Console.WriteLine("Got object of type {0}", o.GetType());
    }
}

Expected: "Got object of type System.Int32"
Actual: NullReferenceException thrown

@AndyAyersMS
Copy link
Member

Yeah, there's an implicit assumption that GT_BOX won't be duplicated, so that any use is the sole use.

Two ideas for a fix:

  • prespill any box in any arg tree (seems clunky) and (perhaps) disallow cloning of boxes
  • add a flag or field to GenTreeBox indicating the box is now multi-use, set this on the original and the clone when cloning, and then bail out in gtTryRemoveBoxUpstreamEffects if this is set.

Doing the latter would also enable forward sub of boxes by the inliner (eg see notes on #9118).

This seems like it should repro on 6.0.

jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Jul 26, 2022
There was an assumption that the local operand to a GT_BOX node is
single-use which was being violated when GDV clones these as arguments.
We now allow multi-uses of these locals by setting a flag when cloning
and then handle it in gtTryRemoveBoxUpstreamEffects.

There is also GTF_VAR_CLONED that would be set on the local itself, but
given that transformations can affect the operand local node arbitrarily
I went with another of these type of flags on GT_BOX instead.

Fix dotnet#72775
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jul 26, 2022
jakobbotsch added a commit that referenced this issue Jul 27, 2022
There was an assumption that the local operand to a GT_BOX node is
single-use which was being violated when GDV clones these as arguments.
We now allow multi-uses of these locals by setting a flag when cloning
and then handle it in gtTryRemoveBoxUpstreamEffects.

There is also GTF_VAR_CLONED that would be set on the local itself, but
given that transformations can affect the operand local node arbitrarily
I went with another of these type of flags on GT_BOX instead.

Fix #72775
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 27, 2022
github-actions bot pushed a commit that referenced this issue Jul 28, 2022
There was an assumption that the local operand to a GT_BOX node is
single-use which was being violated when GDV clones these as arguments.
We now allow multi-uses of these locals by setting a flag when cloning
and then handle it in gtTryRemoveBoxUpstreamEffects.

There is also GTF_VAR_CLONED that would be set on the local itself, but
given that transformations can affect the operand local node arbitrarily
I went with another of these type of flags on GT_BOX instead.

Fix #72775
carlossanlop pushed a commit that referenced this issue Aug 11, 2022
* JIT: Avoid removing multi-use boxes

There was an assumption that the local operand to a GT_BOX node is
single-use which was being violated when GDV clones these as arguments.
We now allow multi-uses of these locals by setting a flag when cloning
and then handle it in gtTryRemoveBoxUpstreamEffects.

There is also GTF_VAR_CLONED that would be set on the local itself, but
given that transformations can affect the operand local node arbitrarily
I went with another of these type of flags on GT_BOX instead.

Fix #72775

* Add a test

* Test nit

Co-authored-by: Jakob Botsch Nielsen <jakob.botsch.nielsen@gmail.com>
@ghost ghost locked as resolved and limited conversation to collaborators Aug 26, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants