Problems during training #15

PanTings · 2020-10-16T02:04:19Z

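Hi, after I start training I get a size mismatch error and I'm not sure what causes it. Could it be a different torch version? I have 1.5.0+cpu installed on my machine. Or is there some other cause?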
2020-10-14 11:25:57 Epoch 0...
Traceback (most recent call last):
File "F:\MyProgram\PyCharm\Neu-Review-Rec-master\main.py", line 213, in <module>
fire.Fire()
File "E:\Program Files (x86)\Python-3.6.8\lib\site-packages\fire\core.py", line 138, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "E:\Program Files (x86)\Python-3.6.8\lib\site-packages\fire\core.py", line 468, in _Fire
target=component.__name__)
File "E:\Program Files (x86)\Python-3.6.8\lib\site-packages\fire\core.py", line 672, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "F:\MyProgram\PyCharm\Neu-Review-Rec-master\main.py", line 85, in train
output = model(train_datas)
File "E:\Program Files (x86)\Python-3.6.8\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "F:\MyProgram\PyCharm\Neu-Review-Rec-master\framework\models.py", line 42, in forward
output = self.predict_net(ui_feature, uids, iids).squeeze(1)
File "E:\Program Files (x86)\Python-3.6.8\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "F:\MyProgram\PyCharm\Neu-Review-Rec-master\framework\prediction.py", line 32, in forward
return self.model(feature, uid, iid)
File "E:\Program Files (x86)\Python-3.6.8\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "F:\MyProgram\PyCharm\Neu-Review-Rec-master\framework\prediction.py", line 140, in forward
fm_out = self.build_fm(feature)
File "F:\MyProgram\PyCharm\Neu-Review-Rec-master\framework\prediction.py", line 129, in build_fm
fm_linear_part = self.fc(input_vec)
File "E:\Program Files (x86)\Python-3.6.8\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "E:\Program Files (x86)\Python-3.6.8\lib\site-packages\torch\nn\modules\linear.py", line 87, in forward
return F.linear(input, self.weight, self.bias)
File "E:\Program Files (x86)\Python-3.6.8\lib\site-packages\torch\nn\functional.py", line 1610, in linear
ret = torch.addmm(bias, input, weight.t())
RuntimeError: size mismatch, m1: [128 x 64], m2: [128 x 1] at C:\w\b\windows\pytorch\aten\src\TH/generic/THTensorMath.cpp:41

ShomyLiu · 2020-10-16T02:06:25Z

Hi, could you share how you ran it, including the exact command?

PanTings · 2020-10-16T02:09:43Z

Thanks!
main.py train --model=MPCN --num_fea=2 --output=fm --use_gpu=False --gpu_id=-1

PanTings · 2020-10-16T02:24:39Z

Also, setting output to mlp, lfm, or nfm gives a similar error, each time in prediction.py on the line that calls self.fc().

ShomyLiu · 2020-10-16T02:52:33Z

Hi, num_fea for the MPCN model should be 1, since it only uses the review-level feature, so:

python main.py train --model=MPCN --num_fea=1 --output=fm

Setting it to 2 should raise an error.
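The shapes in the traceback fit this explanation. Here is a rough sketch of the arithmetic, in plain Python so it runs without torch. It assumes the linear layer in the prediction module is declared with `num_fea` times the per-feature width as its input size, which the thread does not confirm; the helper names `matmul_shape` and `fm_linear_shape` are hypothetical and not part of the repository.

```python
def matmul_shape(m1, m2):
    """Shape of m1 @ m2 for 2-D shapes, raising on an inner-dimension mismatch."""
    if m1[1] != m2[0]:
        raise ValueError(
            f"size mismatch, m1: [{m1[0]} x {m1[1]}], m2: [{m2[0]} x {m2[1]}]"
        )
    return (m1[0], m2[1])


def fm_linear_shape(batch_size, produced_width, num_fea, per_feature_width):
    """Result shape of the linear part, given what the model actually produced."""
    # Assumption: the layer is declared with num_fea * per_feature_width inputs
    # and a single output.
    declared_inputs = num_fea * per_feature_width
    # A linear layer multiplies the input by the transposed weight:
    # [batch, produced_width] @ [declared_inputs, 1].
    return matmul_shape((batch_size, produced_width), (declared_inputs, 1))


# With num_fea=1 the 64-wide feature matches the declared input size.
print(fm_linear_shape(128, 64, num_fea=1, per_feature_width=64))

# With num_fea=2 the layer expects 128 inputs but still receives 64.
try:
    fm_linear_shape(128, 64, num_fea=2, per_feature_width=64)
except ValueError as err:
    print(err)
```

Under that assumption, `m1: [128 x 64]` is a batch of 128 samples with 64 features each, and `m2: [128 x 1]` is the transposed weight of a layer expecting 128 inputs. That the first 128 equals the batch size is a coincidence.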

PanTings · 2020-10-16T03:04:23Z

Oh, but when I set it to 1 I get this error instead:
ValueError: the num_fea of MPCN is error, please specific --num_fea=2

ShomyLiu · 2020-10-16T03:09:14Z

MPCN's num_fea is 1. Are you sure you are on the latest version? See:
https://github.com/ShomyLiu/Neu-Review-Rec/blob/master/models/mpcn.py#L21

PanTings · 2020-10-16T03:13:16Z

Wow, thank you so much for your patient answers. It really was the num_fea setting.

xiulingque · 2022-04-01T14:28:42Z

Hi, after I start training I get a dimension mismatch error. I don't know what causes it and haven't found a working fix. How can I solve this?
load npy from dist...

user config:
vocab_size => 50002
word_dim => 300
r_max_len => 202
u_max_r => 13
i_max_r => 24
train_data_size => 51764
test_data_size => 6471
val_data_size => 6471
user_num => 5543
item_num => 3570
batch_size => 128
print_step => 100

Traceback (most recent call last):
File "main.py", line 212, in <module>
fire.Fire()
File "/root/.local/lib/python3.7/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/root/.local/lib/python3.7/site-packages/fire/core.py", line 471, in _Fire
target=component.__name__)
File "/root/.local/lib/python3.7/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "main.py", line 45, in train
model = Model(opt, getattr(models, opt.model))
File "/root/Neu-Review-Rec/framework/models.py", line 17, in __init__
self.net = Net(opt)
File "/root/Neu-Review-Rec/models/mpcn.py", line 39, in __init__
self.reset_para()
File "/root/Neu-Review-Rec/models/mpcn.py", line 102, in reset_para
self.user_word_embs.weight.data.copy_(w2v.cuda())
RuntimeError: The size of tensor a (50002) must match the size of tensor b (24150) at non-singleton dimension 0

ShomyLiu · 2022-04-02T01:16:57Z

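This is most likely because the preloaded vocabulary does not match the configured size. Check it, then either change vocab_size or change the vocabulary, and try again.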
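One way to catch this before the model is built is to compare the shape of the loaded embedding matrix with the configured values. The helper below is a minimal sketch, not code from the repository: `check_embedding_shape` is a hypothetical name, and it assumes the pretrained embeddings are a 2-D array with one row per vocabulary entry and `word_dim` columns.

```python
def check_embedding_shape(w2v_shape, vocab_size, word_dim):
    """Return a list of human-readable mismatches (empty if the config fits)."""
    # Assumption: the embedding matrix is 2-D, shaped (rows, cols).
    rows, cols = w2v_shape
    problems = []
    if rows != vocab_size:
        problems.append(
            f"vocab_size is {vocab_size} but the embedding file has {rows} rows"
        )
    if cols != word_dim:
        problems.append(
            f"word_dim is {word_dim} but the embedding file has {cols} columns"
        )
    return problems


# Values taken from the config dump and the error message above.
for problem in check_embedding_shape((24150, 300), vocab_size=50002, word_dim=300):
    print(problem)
```

If the embeddings are stored as an .npy file, the `.shape` of the array returned by `numpy.load` can be passed as `w2v_shape`. Whether to change vocab_size or regenerate the embedding file depends on which of the two matches the preprocessed dataset.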

xiulingque · 2022-04-17T14:07:26Z

It's solved now. Thanks for the reply, much appreciated.
