
[bug report] Some bugs or inconvenience in bidirectional_sequence_lstm and lstmunit_activation #401

Open
gdh1995 opened this issue May 23, 2022 · 4 comments

@gdh1995
Contributor

gdh1995 commented May 23, 2022

When I tried to use the bidirectional LSTM operator, I found some bugs and fixed part of them. I've created gdh1995@9ae1832 to show my patch; the ones I can't fix are also described in this issue.

Fixed issues

Below are the detailed problems in https://github.com/VeriSilicon/TIM-VX/blob/f8741b4704ab9caffd49f6fc15f72616962ba1d1/src/tim/vx/internal/src/ops/vsi_nn_op_bidirectional_sequence_lstm.c :

  1. when the passed inputs[BI_LSTM_BW_INPUT_H_STATE] is empty, it copies the shape from outputs[BI_LSTM_BW_OUTPUT_OUTPUT], but the latter may also be empty when only curr_param->merge_outputs is set
  2. it doesn't support empty c_state inputs, while unidirectional LSTM does
  3. when it creates new c_state outputs while iterating forwards, it takes the dtype from outputs[BI_LSTM_FW_OUTPUT_OUTPUT], but I think a better source would be inputs[BI_LSTM_FW_INPUT_C_STATE]
    1. the same applies while iterating backwards
  4. it sets lstm_ovxlib.internal_dtype, which should have been lstmunit_ovxlib.internal_dtype
  5. the order of tensors in lstmcell_reshape_output_tensors_bw is D'C'B'A' when the input words are A, B, C, D
    1. in PyTorch, the order is still A'B'C'D', so a merged output tensor is [ [A, A'], [B, B'], [C, C'], [D, D'] ]
    2. this judgment was tested on torch==1.9.0+cu111 with Python 3.7 x64 for Windows; see the sketch after this list
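
To double-check the PyTorch claim above, here is a minimal sketch (it only assumes torch is installed; none of the names below come from TIM-VX) that verifies the backward half of a bidirectional nn.LSTM output is stored in input order A'B'C'D':

```python
# Minimal sketch: check that PyTorch stores the backward direction of a
# bidirectional LSTM in input order (A'B'C'D'), not reversed (D'C'B'A').
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, batch, input_size, hidden = 4, 1, 3, 5   # four input steps: A, B, C, D

bi = nn.LSTM(input_size, hidden, bidirectional=True)

# A plain unidirectional LSTM that reuses the backward-direction weights.
bw = nn.LSTM(input_size, hidden)
with torch.no_grad():
    bw.weight_ih_l0.copy_(bi.weight_ih_l0_reverse)
    bw.weight_hh_l0.copy_(bi.weight_hh_l0_reverse)
    bw.bias_ih_l0.copy_(bi.bias_ih_l0_reverse)
    bw.bias_hh_l0.copy_(bi.bias_hh_l0_reverse)

x = torch.randn(seq_len, batch, input_size)

out_bi, _ = bi(x)                        # (seq_len, batch, 2 * hidden)
out_bw, _ = bw(torch.flip(x, dims=[0]))  # run the backward pass on D, C, B, A

# The backward half of out_bi is already re-ordered to match the input steps,
# so it equals the manual backward run flipped back to input order.
assert torch.allclose(out_bi[:, :, hidden:], torch.flip(out_bw, dims=[0]), atol=1e-5)
print("backward outputs follow input order, so the merged tensor is [[A, A'], [B, B'], ...]")
```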

Left bugs

When testing the operator, I noticed that some kernel functions of lstmunit_activation don't work under certain combinations of I/O data types. As far as I've found, they are:

On the main branch of TIM-VX with the A311D-6.4.10.2 package (https://github.com/VeriSilicon/TIM-VX/releases/tag/v1.1.42), the evis kernel (src/tim/vx/internal/src/kernel/evis/lstmunit_activation_evis.c) never works:

  1. the kernel never writes its output tensors, so the tensor will still contain old data from a previous graph run (see the sketch after this list)
  2. the failed kernels are:
    • GEN_LSTMUNIT_STRUCT_ITEMS(0, 0, 0, 0, 0, F16, F16, F16, SIGMOID, S)
    • GEN_LSTMUNIT_STRUCT_ITEMS(0, 0, 0, 0, 0, F16, U8, F16, SIGMOID, S)
  3. for example, after I created a graph with only 1 bidi-LSTM layer using the cl kernel, ran it, and then re-created the graph using the evis kernel, the new result (for an input tensor of the same shape) was exactly the same as the cl kernel's previous output
  4. meanwhile, the version in x86_64_linux works well
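
For reference, this is roughly how I check whether a kernel wrote its output at all. It is only a sketch: run_lstm_graph is a hypothetical placeholder for whatever code builds and runs the graph, not a real TIM-VX API.

```python
# Sketch of a "did the kernel write anything?" check. `run_lstm_graph` is a
# hypothetical callback that runs the graph and fills the given output buffer.
import numpy as np

SENTINEL = 0x7C  # arbitrary byte pattern, unlikely to appear as a real result

def output_was_written(run_lstm_graph, out_shape, dtype=np.uint8):
    out = np.full(out_shape, SENTINEL, dtype=dtype)  # pre-fill the output buffer
    run_lstm_graph(out)                              # the graph should overwrite it
    return not np.array_equal(out, np.full(out_shape, SENTINEL, dtype=dtype))
```

If the kernel silently skips writing, this returns False even though the graph appears to run.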

The cl kernel (src/tim/vx/internal/src/kernel/cl/lstmunit_activation_cl.c) doesn't support F32toU8_F32_SIGMOID:

  1. the failed kernel is GEN_LSTMUNIT_STRUCT_ITEMS(0, 0, 0, 0, 0, F32, U8, F32, SIGMOID, S)
  2. the kernel only outputs very small numbers (all items in the uint8-quantized tensor are either 0 or 1; see the sketch after this list)
  3. here's my test case:
    1. by default the unidirectional LSTM layer creates float16 tensors as the outputs of the FullConnect nodes
    2. I want to quantize the weights and the LSTM input+output tensors into uint8, so the hidden state is also uint8-quantized
    3. therefore lstmunit_activation receives several float16 inputs and is expected to generate a uint8-quantized tensor
  4. both the A311D-6.4.10.2 version and the x86_64_linux one give wrong outputs
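
To show why "only 0 or 1" can't be right, here is a small sketch with a hypothetical scale and zero point (the real quantization parameters of my tensor are not listed here): sigmoid outputs lie in (0, 1), so they should quantize to a wide spread of uint8 codes.

```python
# Sketch: what a uint8-quantized sigmoid output is expected to look like.
# The scale / zero_point below are hypothetical, not the values from my graph.
import numpy as np

def quantize_u8(x, scale, zero_point):
    # standard asymmetric affine quantization: q = round(x / scale) + zero_point
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

logits = np.linspace(-4.0, 4.0, 9)       # example pre-activation values
act = 1.0 / (1.0 + np.exp(-logits))      # sigmoid outputs, all in (0, 1)

print(quantize_u8(act, scale=1.0 / 256.0, zero_point=0))
# expected: a spread of codes like [  5  12  31  69 128 187 225 244 251],
# not a tensor that is almost entirely 0s and 1s
```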

Here's a summary of the two kernel bugs (uint8+fp16 means only the hidden states are uint8; the activation inputs and c_states are still float16):

                   | simulator (x86_64_linux)    | A311D
                   | evis    cl         cpu      | evis       cl         cpu
---------------------------------------------------------------------------
unidi | fp16       | ok      ok         ok       | not-write  ok         ok
      | uint8+fp16 | ok      too-small  ok       | not-write  too-small  ok
---------------------------------------------------------------------------
bidi  | fp16       | ok      ok         ok       | not-write  ok         ok
      | uint8+fp16 | ok      too-small  ok       | not-write  too-small  ok

A311D + evis: never writes the tensor at all
A311D + cl:

  • fp16+fp16 (S_F32toF32_F32_SIGMOID): ok
  • uint8+fp16 (S_F32toU8_F32_SIGMOID): value error: all are too small
@gdh1995 gdh1995 changed the title [bug report] Some bugs or inconvenience in vsi_nn_op_bidirectional_sequence_lstm.c [bug report] Some bugs or inconvenience in bidirectional_sequence_lstm and lstmunit_activation May 23, 2022
@sunshinemyson
Contributor

@gdh1995

Thanks for the details. We will check it ASAP.

@sunshinemyson
Contributor

@gdh1995

Sorry it has taken so long to get an update back to you. I discussed this with the team; we think the implementation inside "internal" is not robust and we don't want to maintain it internally anymore.

For bidirectional_sequence_lstm support, we would like to compose it in tim-vx. If that is OK with you, I can share more details about unidirectional_sequence_lstm on GitHub.

Thanks

@gdh1995
Contributor Author

gdh1995 commented Jun 16, 2022

Sorry, my English is not so good and I don't quite get what you mean.

If vsi_nn_op_bidirectional_sequence_lstm.c will not be updated, that doesn't matter to me - I can maintain a fork of TIM-VX in my private workspace and pull new changes periodically.

I still hope the buggy behavior of the EVIS and CL kernels can be fixed - because, according to their names, they should run faster than the CPU versions.

I know little about LSTM variants, so I guess the info about unidirectional LSTM would be too hard for me. If you want to add a new version of bidi-LSTM, I'm afraid I can't understand it easily, but I'm glad to try it.

@sunshinemyson sunshinemyson added this to the 22Q4_rel milestone Sep 28, 2022
@sunshinemyson
Contributor

11fd278 We have another implementation instead of reusing the original op inside the "internal" implementation.
