About the Plus version of the video task model #8

Open
a773783082 opened this issue Jan 14, 2024 · 1 comment

Comments

a773783082 commented Jan 14, 2024

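The GitHub page only provides two models for image tasks, R50 and SwinL. In the files of the Hugging Face demo I also found an R50 version for video tasks (visual prompt, GLEE_vos_r50.pth). Could the authors open-source the SwinL version for video tasks? Was the SwinL version left out because the GPU used on Hugging Face cannot run it?
Also, regarding my experience using the model: it performs poorly on text prompts it has not learned. For example, with custom-list it does not recognize a person's head ("head"); only when I enter "human head" does it sometimes return a result, and a fairly poor one.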

wjf5203 (Collaborator) commented Mar 19, 2024

Thank you for your interest! For the VOS version, we have only trained R50 weights so far. We will release the SwinL weights later, or you can train them yourself with a script that we will also release later. Empirically, using a larger backbone on VOS does not bring a big improvement (probably within two points). In addition, GLEE saw relatively little part-level data during training, which likely explains the weak results on prompts such as "head". We have released three versions of the model (pretrain, joint-training, and scaleup), which may have different generalization properties.
