Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Forward sub does not kick in in some physical promotion cases #90426

Open
jakobbotsch opened this issue Aug 11, 2023 · 1 comment
Open
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@jakobbotsch
Copy link
Member

Example from @stephentoub:

public class Tests
{
    private ParsedStat _stat;

    public ulong GetTime()
    {
        ParsedStat stat = _stat;
        return stat.utime + stat.stime;
    }

    internal struct ParsedStat
    {
        internal int pid;
        internal string comm;
        internal char state;
        internal int ppid;
        internal int session;
        internal ulong utime;
        internal ulong stime;
        internal long nice;
        internal ulong starttime;
        internal ulong vsize;
        internal long rss;
        internal ulong rsslim;
    }
}

Codegen:

;  V00 this         [V00,T01] (  3,  3   )     ref  ->  rcx         this class-hnd single-def <Program>
;* V01 loc0         [V01    ] (  0,  0   )  struct (80) zero-ref    do-not-enreg[SF] <Program+ParsedStat>
;# V02 OutArgs      [V02    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;  V03 tmp1         [V03,T02] (  2,  2   )    long  ->  rax         single-def "V01.[008..016)"
;  V04 tmp2         [V04,T03] (  2,  2   )    long  ->  rcx         single-def "V01.[016..024)"
;  V05 tmp3         [V05,T00] (  3,  6   )   byref  ->  rcx         single-def "Spilling address for field-by-field copy"
;
; Lcl frame size = 0

G_M25807_IG01:  ;; offset=0x0000
						;; size=0 bbWeight=1 PerfScore 0.00
G_M25807_IG02:  ;; offset=0x0000
       add      rcx, 8
       mov      rax, qword ptr [rcx+0x08]
       mov      rcx, qword ptr [rcx+0x10]
       add      rax, rcx
						;; size=15 bbWeight=1 PerfScore 4.50
G_M25807_IG03:  ;; offset=0x000F
       ret      

Expected codegen:

mov rax, qword ptr [rcx+0x10]
add rax, qword ptr [rcx+0x18]
ret

The add rcx, 8 can be fixed via #88386. However, the containment issue happens because assignment decomposition produces commas:

***** BB01
STMT00000 ( 0x000[E-] ... 0x006 )
               [000026] -A-XG------COMMA     void  
               [000014] DA-X-------                         ├──▌  STORE_LCL_VAR byref  V05 tmp3         
               [000001] ---X-------                         │  └──▌  FIELD_ADDR byref  Program:_stat
               [000000] -----------                         │     └──▌  LCL_VAR   ref    V00 this          (last use)
               [000025] -A--G------                         └──▌  COMMA     void  
               [000020] DA--G------                            ├──▌  STORE_LCL_VAR long   V03 tmp1         
               [000019] n---G------                            │  └──▌  IND       long  
               [000018] -----------                            │     └──▌  ADD       byref 
               [000016] -----------                            │        ├──▌  LCL_VAR   byref  V05 tmp3         
               [000017] -----------                            │        └──▌  CNS_INT   long   8
               [000024] DA--G------                            └──▌  STORE_LCL_VAR long   V04 tmp2         
               [000023] n---G------                               └──▌  IND       long  
               [000022] -----------                                  └──▌  ADD       byref 
               [000015] -----------                                     ├──▌  LCL_VAR   byref  V05 tmp3         
               [000021] -----------                                     └──▌  CNS_INT   long   16

***** BB01
STMT00001 ( 0x007[E-] ... 0x014 )
               [000013] -----------RETURN    long  
               [000012] -----------                         └──▌  ADD       long  
               [000027] -----------                            ├──▌  LCL_VAR   long   V03 tmp1          (last use)
               [000028] -----------                            └──▌  LCL_VAR   long   V04 tmp2          (last use)

We should be able to produce top level statements instead to allow forward sub to kick in (or alternatively, teach forward sub about commas).

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Aug 11, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Aug 11, 2023
@ghost
Copy link

ghost commented Aug 11, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Example from @stephentoub:

public class Tests
{
    private ParsedStat _stat;

    public ulong GetTime()
    {
        ParsedStat stat = _stat;
        return stat.utime + stat.stime;
    }

    internal struct ParsedStat
    {
        internal int pid;
        internal string comm;
        internal char state;
        internal int ppid;
        internal int session;
        internal ulong utime;
        internal ulong stime;
        internal long nice;
        internal ulong starttime;
        internal ulong vsize;
        internal long rss;
        internal ulong rsslim;
    }
}

Codegen:

;  V00 this         [V00,T01] (  3,  3   )     ref  ->  rcx         this class-hnd single-def <Program>
;* V01 loc0         [V01    ] (  0,  0   )  struct (80) zero-ref    do-not-enreg[SF] <Program+ParsedStat>
;# V02 OutArgs      [V02    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;  V03 tmp1         [V03,T02] (  2,  2   )    long  ->  rax         single-def "V01.[008..016)"
;  V04 tmp2         [V04,T03] (  2,  2   )    long  ->  rcx         single-def "V01.[016..024)"
;  V05 tmp3         [V05,T00] (  3,  6   )   byref  ->  rcx         single-def "Spilling address for field-by-field copy"
;
; Lcl frame size = 0

G_M25807_IG01:  ;; offset=0x0000
						;; size=0 bbWeight=1 PerfScore 0.00
G_M25807_IG02:  ;; offset=0x0000
       add      rcx, 8
       mov      rax, qword ptr [rcx+0x08]
       mov      rcx, qword ptr [rcx+0x10]
       add      rax, rcx
						;; size=15 bbWeight=1 PerfScore 4.50
G_M25807_IG03:  ;; offset=0x000F
       ret      

Expected codegen:

mov rax, qword ptr [rcx+0x10]
add rax, qword ptr [rcx+0x18]
ret

The add rcx, 8 can be fixed via #88386. However, the containment issue happens because assignment decomposition produces commas:

***** BB01
STMT00000 ( 0x000[E-] ... 0x006 )
               [000026] -A-XG------COMMA     void  
               [000014] DA-X-------                         ├──▌  STORE_LCL_VAR byref  V05 tmp3         
               [000001] ---X-------                         │  └──▌  FIELD_ADDR byref  Program:_stat
               [000000] -----------                         │     └──▌  LCL_VAR   ref    V00 this          (last use)
               [000025] -A--G------                         └──▌  COMMA     void  
               [000020] DA--G------                            ├──▌  STORE_LCL_VAR long   V03 tmp1         
               [000019] n---G------                            │  └──▌  IND       long  
               [000018] -----------                            │     └──▌  ADD       byref 
               [000016] -----------                            │        ├──▌  LCL_VAR   byref  V05 tmp3         
               [000017] -----------                            │        └──▌  CNS_INT   long   8
               [000024] DA--G------                            └──▌  STORE_LCL_VAR long   V04 tmp2         
               [000023] n---G------                               └──▌  IND       long  
               [000022] -----------                                  └──▌  ADD       byref 
               [000015] -----------                                     ├──▌  LCL_VAR   byref  V05 tmp3         
               [000021] -----------                                     └──▌  CNS_INT   long   16

***** BB01
STMT00001 ( 0x007[E-] ... 0x014 )
               [000013] -----------RETURN    long  
               [000012] -----------                         └──▌  ADD       long  
               [000027] -----------                            ├──▌  LCL_VAR   long   V03 tmp1          (last use)
               [000028] -----------                            └──▌  LCL_VAR   long   V04 tmp2          (last use)

We should be able to produce top level statements instead to allow forward sub to kick in (or alternatively, teach forward sub about commas).

Author: jakobbotsch
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@jakobbotsch jakobbotsch self-assigned this Aug 11, 2023
@jakobbotsch jakobbotsch added this to the 9.0.0 milestone Aug 11, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Aug 11, 2023
jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Aug 14, 2023
The existing code creates cascading COMMA trees with embedded
statements. When we are decomposing a top-level store, however, we can
just create top-level statements. This uses less memory and creates more
natural-looking IR that for instance allows forward sub to kick in.

Fix dotnet#90426
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Aug 14, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Sep 13, 2023
@jakobbotsch jakobbotsch modified the milestones: 9.0.0, 10.0.0 Jul 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant