This repository has been archived by the owner on Mar 6, 2023. It is now read-only.

about winograd transform matrix #6

Closed
janboeye opened this issue Apr 20, 2018 · 5 comments

Comments

@janboeye

Hi @merrymercy,

In conv2d.py, the Winograd algorithm seems to apply the G and B transforms as ordinary matrix multiplications, which would not trade multiplications for additions/subtractions. Is my understanding correct?
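For concreteness, the transforms I mean are the ones from Winograd F(2x2, 3x3). Below is a minimal NumPy sketch of my understanding (not the repo's code), using the standard constant matrices; the kernel transform U = G g G^T is the multiplication in question:

```python
import numpy as np

# Winograd F(2x2, 3x3) constant matrices. Note G's entries are only
# 0, +-1, +-0.5, so a naive dense matmul wastes work on zero terms.
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
Bt = np.array([[1.0,  0.0, -1.0,  0.0],
               [0.0,  1.0,  1.0,  0.0],
               [0.0, -1.0,  1.0,  0.0],
               [0.0,  1.0,  0.0, -1.0]])
At = np.array([[1.0, 1.0,  1.0,  0.0],
               [0.0, 1.0, -1.0, -1.0]])

def winograd_2x2_3x3(g, d):
    """One 2x2 output tile from a 3x3 kernel g and a 4x4 input tile d."""
    U = G @ g @ G.T            # kernel transform (the multiplication in question)
    V = Bt @ d @ Bt.T          # input transform
    return At @ (U * V) @ At.T # elementwise product + output transform

def direct_corr(g, d):
    """Reference: direct 2D correlation producing the same 2x2 tile."""
    out = np.zeros((2, 2))
    for i in range(2):
        for j in range(2):
            out[i, j] = np.sum(g * d[i:i+3, j:j+3])
    return out
```

The two functions agree on any (g, d) pair, so the question is only about how the constant-matrix products are executed, not about correctness.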

Thanks

@merrymercy
Owner

No.
I build this constant matrix with a utility function, const_array, and unroll all of the transform matrix multiplications. TVM's simplify pass then removes the zero terms. You can print the lowered IR to confirm.
I posted the performance comparison in apache/tvm#898
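As a back-of-the-envelope illustration (plain Python counting, not TVM code): once the transform loops are fully unrolled, both G lookups in each term are compile-time constants, so any term with a zero coefficient can be deleted outright.

```python
# The F(2x2, 3x3) G matrix as used for the kernel transform.
G = [[1.0,  0.0, 0.0],
     [0.5,  0.5, 0.5],
     [0.5, -0.5, 0.5],
     [0.0,  0.0, 1.0]]

# The kernel transform is
#   U[eps][nu] = sum over (r_kh, r_kw) of
#                weight[r_kh][r_kw] * G[eps][r_kh] * G[nu][r_kw].
# After unrolling eps, nu, r_kh, r_kw, count which terms survive
# constant folding (i.e. have a nonzero constant coefficient).
total = surviving = 0
for eps in range(4):
    for nu in range(4):
        for r_kh in range(3):
            for r_kw in range(3):
                total += 1
                if G[eps][r_kh] * G[nu][r_kw] != 0.0:
                    surviving += 1
print(total, surviving)  # prints "144 64"
```

So under this counting, only 64 of the 144 unrolled terms keep a multiplication; the rest fold away, which is the reduction the naive dense matmul would miss.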

@janboeye
Author

janboeye commented Apr 20, 2018

@merrymercy

Thanks for the explanation.
Could you explain how const_array helps remove the zeros?

produce G {
  for (i, 0, 4) {
    for (j, 0, 3) {
      G[((i*3) + j)] = select(((i == 3) && (j == 2)), 1.000000f, select((((i == 3) && (j == 1)) || ((i == 3) && (j == 0))), 0.000000f, select(((i == 2) && (j == 2)), 0.500000f, select(((i == 2) && (j == 1)), -0.500000f, select((((i == 2) && (j == 0)) || (((i == 1) && (j == 2)) || (((i == 1) && (j == 1)) || ((i == 1) && (j == 0))))), 0.500000f, select((((i == 0) && (j == 2)) || (((i == 0) && (j == 1)) || !((i == 0) && (j == 0)))), 0.000000f, 1.000000f))))))
    }
  }
}

I got this IR, but I do not understand how TVM can remove all the zeros in this select statement.

Thanks

@merrymercy
Owner

You should lower the whole kernel transform
https://github.com/dmlc/tvm/blob/5d53f0f9ecb490245f8dba542437b5b70b7ba87d/topi/python/topi/mali/conv2d.py#L566-L571
and unroll the axes eps, nu, r_kh, r_kw. Then these select expressions will be simplified.
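To see why unrolling matters, here is the select chain from the lowered IR above transcribed into plain Python (a model of the IR, not TVM itself). With i and j still loop variables, nothing can fold; with the loops unrolled, every call site has constant i, j, so the whole chain collapses to a single literal, and a term multiplied by a 0.0 literal can be dropped.

```python
def sel(cond, a, b):
    """Python stand-in for the IR's select(cond, a, b)."""
    return a if cond else b

def g_entry(i, j):
    # Transcription of the select expression from the lowered IR above.
    return sel(i == 3 and j == 2, 1.0,
           sel((i == 3 and j == 1) or (i == 3 and j == 0), 0.0,
           sel(i == 2 and j == 2, 0.5,
           sel(i == 2 and j == 1, -0.5,
           sel((i == 2 and j == 0) or (i == 1 and j == 2)
               or (i == 1 and j == 1) or (i == 1 and j == 0), 0.5,
           sel((i == 0 and j == 2) or (i == 0 and j == 1)
               or not (i == 0 and j == 0), 0.0, 1.0))))))

# At a constant call site the chain is just a constant, e.g.:
print(g_entry(3, 1))  # prints "0.0"
print(g_entry(2, 1))  # prints "-0.5"
```

This is the same reduction TVM's simplifier performs after the eps, nu, r_kh, r_kw loops are unrolled.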

@janboeye
Author

janboeye commented Apr 20, 2018

@merrymercy

I wrote the lowering code as follows:

    # transform kernel
    s[G].compute_inline()
    eps, nu, k, c, kk = s[U].op.axis
    r_kh, r_kw = s[U].op.reduce_axis
    s[U].reorder(k, c, kk, eps, nu, r_kh, r_kw)
    _ = [s[U].unroll(x) for x in [eps, nu, r_kh, r_kw]]
    print("transform kernel lower")
    su = tvm.create_schedule(s[U].op)
    print(tvm.lower(su, [kernel, G], simple_mode=True))

but I got the following IR:

produce G {
  for (i, 0, 4) {
    for (j, 0, 3) {
      G[((i*3) + j)] = select(((i == 3) && (j == 2)), 1.000000f, select((((i == 3) && (j == 1)) || ((i == 3) && (j == 0))), 0.000000f, select(((i == 2) && (j == 2)), 0.500000f, select(((i == 2) && (j == 1)), -0.500000f, select((((i == 2) && (j == 0)) || (((i == 1) && (j == 2)) || (((i == 1) && (j == 1)) || ((i == 1) && (j == 0))))), 0.500000f, select((((i == 0) && (j == 2)) || (((i == 0) && (j == 1)) || !((i == 0) && (j == 0)))), 0.000000f, 1.000000f))))))
    }
  }
}
produce U {
  for (eps, 0, 4) {
    for (nu, 0, 4) {
      for (k, 0, 256) {
        for (c, 0, 1280) {
          for (kk, 0, 4) {
            U[((((((((eps*4) + nu)*256) + k)*1280) + c)*4) + kk)] = 0.000000f
            for (r_kh, 0, 3) {
              for (r_kw, 0, 3) {
                U[((((((((eps*4) + nu)*256) + k)*1280) + c)*4) + kk)] = (U[((((((((eps*4) + nu)*256) + k)*1280) + c)*4) + kk)] + ((weight[(((((((k*5120) + c) + (kk*1280))*3) + r_kh)*3) + r_kw)]*G[((eps*3) + r_kh)])*G[((nu*3) + r_kw)]))
              }
            }
          }
        }
      }
    }
  }
}

Could you help me check why my lowering code is not right?

Still, the generated CUDA code has already removed the zeros in G, and s[G].compute_inline() is necessary for that.

Thanks

@janboeye
Author

I modified the lowering code as follows, so that the inlining and unrolling are applied to the schedule su that is actually lowered:

    print("transform kernel lower")
    su = tvm.create_schedule(s[U].op)
    su[G].compute_inline()
    eps1, nu1, k1, c1, kk1 = su[U].op.axis
    r_kh1, r_kw1 = su[U].op.reduce_axis
    su[U].reorder(k1, c1, kk1, eps1, nu1, r_kh1, r_kw1)
    _ = [su[U].unroll(x) for x in [eps1, nu1, r_kh1, r_kw1]]
    print(tvm.lower(su, [kernel, G], simple_mode=True))

With this change I get the right IR.

Thanks
