This repository has been archived by the owner on Mar 6, 2023. It is now read-only.

about winograd transform matrix #6

Closed
janboeye opened this issue Apr 20, 2018 · 5 comments

Comments

@janboeye

Hi @merrymercy,

In conv2d.py, the Winograd algorithm seems to apply the G and B transforms as ordinary matrix multiplications, which would not trade multiplications for additions/subtractions. Is my understanding correct?
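For concreteness, the transforms I mean are the ones from Winograd F(2x2, 3x3). Below is a minimal NumPy sketch of my understanding (not the repo's code), using the standard constant matrices; the kernel transform U = G g G^T is the multiplication in question:

```python
import numpy as np

# Winograd F(2x2, 3x3) constant matrices. Note G's entries are only
# 0, +-1, +-0.5, so a naive dense matmul wastes work on zero terms.
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
Bt = np.array([[1.0,  0.0, -1.0,  0.0],
               [0.0,  1.0,  1.0,  0.0],
               [0.0, -1.0,  1.0,  0.0],
               [0.0,  1.0,  0.0, -1.0]])
At = np.array([[1.0, 1.0,  1.0,  0.0],
               [0.0, 1.0, -1.0, -1.0]])

def winograd_2x2_3x3(g, d):
    """One 2x2 output tile from a 3x3 kernel g and a 4x4 input tile d."""
    U = G @ g @ G.T            # kernel transform (the multiplication in question)
    V = Bt @ d @ Bt.T          # input transform
    return At @ (U * V) @ At.T # elementwise product + output transform

def direct_corr(g, d):
    """Reference: direct 2D correlation producing the same 2x2 tile."""
    out = np.zeros((2, 2))
    for i in range(2):
        for j in range(2):
            out[i, j] = np.sum(g * d[i:i+3, j:j+3])
    return out
```

The two functions agree on any (g, d) pair, so the question is only about how the constant-matrix products are executed, not about correctness.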

Thanks

@merrymercy
Owner

No.
I build this constant matrix with a utility function, const_array, and unroll all of the transform matrix multiplications. TVM's simplify pass then removes the zero terms. You can print the lowered IR to confirm.
I posted the performance comparison in apache/tvm#898
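As a back-of-the-envelope illustration (plain Python counting, not TVM code): once the transform loops are fully unrolled, both G lookups in each term are compile-time constants, so any term with a zero coefficient can be deleted outright.

```python
# The F(2x2, 3x3) G matrix as used for the kernel transform.
G = [[1.0,  0.0, 0.0],
     [0.5,  0.5, 0.5],
     [0.5, -0.5, 0.5],
     [0.0,  0.0, 1.0]]

# The kernel transform is
#   U[eps][nu] = sum over (r_kh, r_kw) of
#                weight[r_kh][r_kw] * G[eps][r_kh] * G[nu][r_kw].
# After unrolling eps, nu, r_kh, r_kw, count which terms survive
# constant folding (i.e. have a nonzero constant coefficient).
total = surviving = 0
for eps in range(4):
    for nu in range(4):
        for r_kh in range(3):
            for r_kw in range(3):
                total += 1
                if G[eps][r_kh] * G[nu][r_kw] != 0.0:
                    surviving += 1
print(total, surviving)  # prints "144 64"
```

So under this counting, only 64 of the 144 unrolled terms keep a multiplication; the rest fold away, which is the reduction the naive dense matmul would miss.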

@janboeye
Author

janboeye commented Apr 20, 2018

@merrymercy

Thanks for the explanation.
Could you explain how const_array helps remove the zeros?

produce G {
  for (i, 0, 4) {
    for (j, 0, 3) {
      G[((i*3) + j)] = select(((i == 3) && (j == 2)), 1.000000f, select((((i == 3) && (j == 1)) || ((i == 3) && (j == 0))), 0.000000f, select(((i == 2) && (j == 2)), 0.500000f, select(((i == 2) && (j == 1)), -0.500000f, select((((i == 2) && (j == 0)) || (((i == 1) && (j == 2)) || (((i == 1) && (j == 1)) || ((i == 1) && (j == 0))))), 0.500000f, select((((i == 0) && (j == 2)) || (((i == 0) && (j == 1)) || !((i == 0) && (j == 0)))), 0.000000f, 1.000000f))))))
    }
  }
}

I got this IR, but I do not understand how TVM can remove all the zeros in this select statement.

Thanks

@merrymercy
Owner

You should lower the whole kernel transform
https://github.com/dmlc/tvm/blob/5d53f0f9ecb490245f8dba542437b5b70b7ba87d/topi/python/topi/mali/conv2d.py#L566-L571
and unroll the axes eps, nu, r_kh, r_kw. Then these select expressions will be simplified.
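To see why unrolling matters, here is the select chain from the lowered IR above transcribed into plain Python (a model of the IR, not TVM itself). With i and j still loop variables, nothing can fold; with the loops unrolled, every call site has constant i, j, so the whole chain collapses to a single literal, and a term multiplied by a 0.0 literal can be dropped.

```python
def sel(cond, a, b):
    """Python stand-in for the IR's select(cond, a, b)."""
    return a if cond else b

def g_entry(i, j):
    # Transcription of the select expression from the lowered IR above.
    return sel(i == 3 and j == 2, 1.0,
           sel((i == 3 and j == 1) or (i == 3 and j == 0), 0.0,
           sel(i == 2 and j == 2, 0.5,
           sel(i == 2 and j == 1, -0.5,
           sel((i == 2 and j == 0) or (i == 1 and j == 2)
               or (i == 1 and j == 1) or (i == 1 and j == 0), 0.5,
           sel((i == 0 and j == 2) or (i == 0 and j == 1)
               or not (i == 0 and j == 0), 0.0, 1.0))))))

# At a constant call site the chain is just a constant, e.g.:
print(g_entry(3, 1))  # prints "0.0"
print(g_entry(2, 1))  # prints "-0.5"
```

This is the same reduction TVM's simplifier performs after the eps, nu, r_kh, r_kw loops are unrolled.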

@janboeye
Author

janboeye commented Apr 20, 2018

@merrymercy

I wrote the lowering code as follows:

    # transform kernel
    s[G].compute_inline()
    eps, nu, k, c, kk = s[U].op.axis
    r_kh, r_kw = s[U].op.reduce_axis
    s[U].reorder(k, c, kk, eps, nu, r_kh, r_kw)
    _ = [s[U].unroll(x) for x in [eps, nu, r_kh, r_kw]]
    print("transform kernel lower")
    su = tvm.create_schedule(s[U].op)
    print(tvm.lower(su, [kernel, G], simple_mode=True))

but I got the following IR:

produce G {
  for (i, 0, 4) {
    for (j, 0, 3) {
      G[((i*3) + j)] = select(((i == 3) && (j == 2)), 1.000000f, select((((i == 3) && (j == 1)) || ((i == 3) && (j == 0))), 0.000000f, select(((i == 2) && (j == 2)), 0.500000f, select(((i == 2) && (j == 1)), -0.500000f, select((((i == 2) && (j == 0)) || (((i == 1) && (j == 2)) || (((i == 1) && (j == 1)) || ((i == 1) && (j == 0))))), 0.500000f, select((((i == 0) && (j == 2)) || (((i == 0) && (j == 1)) || !((i == 0) && (j == 0)))), 0.000000f, 1.000000f))))))
    }
  }
}
produce U {
  for (eps, 0, 4) {
    for (nu, 0, 4) {
      for (k, 0, 256) {
        for (c, 0, 1280) {
          for (kk, 0, 4) {
            U[((((((((eps*4) + nu)*256) + k)*1280) + c)*4) + kk)] = 0.000000f
            for (r_kh, 0, 3) {
              for (r_kw, 0, 3) {
                U[((((((((eps*4) + nu)*256) + k)*1280) + c)*4) + kk)] = (U[((((((((eps*4) + nu)*256) + k)*1280) + c)*4) + kk)] + ((weight[(((((((k*5120) + c) + (kk*1280))*3) + r_kh)*3) + r_kw)]*G[((eps*3) + r_kh)])*G[((nu*3) + r_kw)]))
              }
            }
          }
        }
      }
    }
  }
}

Could you help me check why my lowering code is not right?

Still, the generated CUDA code has already removed the zeros in G, and s[G].compute_inline() is necessary for that.

Thanks

@janboeye
Author

I modified the lowering code as follows, so that the inlining and unrolling are applied to the schedule su that is actually lowered:

    print("transform kernel lower")
    su = tvm.create_schedule(s[U].op)
    su[G].compute_inline()
    eps1, nu1, k1, c1, kk1 = su[U].op.axis
    r_kh1, r_kw1 = su[U].op.reduce_axis
    su[U].reorder(k1, c1, kk1, eps1, nu1, r_kh1, r_kw1)
    _ = [su[U].unroll(x) for x in [eps1, nu1, r_kh1, r_kw1]]
    print(tvm.lower(su, [kernel, G], simple_mode=True))

With this change I get the right IR.

Thanks
