Speed up entity linking process #90

Closed
manandey opened this issue Dec 3, 2021 · 5 comments

manandey commented Dec 3, 2021

Hi,

I want to run entity linking on a large subset of the c4/en dataset. With the current settings, I can extract entities for roughly 1,000 rows/hr on CPU and about 5,000 rows/hr on GPU. Given the size of the c4 dataset, is there any way to speed this up, for example through batching or multiprocessing? Any advice would be highly appreciated. I would also like to know whether the data is currently held in memory. Thanks!
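To put those rates in perspective, here is a quick back-of-the-envelope estimate. The subset size of 1,000,000 rows is purely a hypothetical figure for illustration, and `estimate_hours` is just a helper name made up for this sketch; only the two throughput numbers come from the measurements above.

```python
def estimate_hours(num_rows: int, rows_per_hour: float) -> float:
    """Wall-clock hours needed to process num_rows at a fixed throughput."""
    return num_rows / rows_per_hour


# Hypothetical subset size, chosen only to illustrate the arithmetic.
SUBSET_ROWS = 1_000_000

for label, rate in (("CPU", 1_000), ("GPU", 5_000)):
    hours = estimate_hours(SUBSET_ROWS, rate)
    print(f"{label}: {hours:.0f} h (~{hours / 24:.1f} days)")
# CPU: 1000 h (~41.7 days)
# GPU: 200 h (~8.3 days)
```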

@arjenpdevries
Member

Hi @manandey, sorry for not getting back to you. We are actually investigating how to tag a (different) large corpus, so the question is relevant. Did you make any progress on this yourself?

manandey reopened this Dec 10, 2021
@manandey
Author

Hi @arjenpdevries, thanks a lot for your response! Batching did increase the speed to some extent, but it would still be quite slow for a large corpus. I will keep the issue open since you are already investigating this. :) Any suggestions you come up with would be a great help. Thanks again!
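For anyone landing here, a minimal sketch of the batching plus multiprocessing pattern discussed above. It does not use this project's API: `link_batch` is a hypothetical stand-in that only picks out capitalised tokens so the example runs end to end, and `batched`, `link_corpus`, `batch_size`, and `workers` are names assumed for illustration. The real mention detection and disambiguation call would go inside `link_batch`.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import islice
from typing import Iterable, Iterator, List


def batched(rows: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most batch_size rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def link_batch(batch: List[str]) -> List[List[str]]:
    """Hypothetical placeholder for the real entity linking call.

    Returns the capitalised tokens of each text; swap in the actual linker.
    """
    return [[tok for tok in text.split() if tok[:1].isupper()] for text in batch]


def link_corpus(
    rows: Iterable[str], batch_size: int = 32, workers: int = 2
) -> List[List[str]]:
    """Batch the rows, then process batches serially or in worker processes."""
    batches = batched(rows, batch_size)
    if workers <= 1:
        # Serial path: handy for debugging and for a single-GPU setup.
        results = map(link_batch, batches)
        return [ents for batch_result in results for ents in batch_result]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input batches.
        results = pool.map(link_batch, batches)
        return [ents for batch_result in results for ents in batch_result]


if __name__ == "__main__":
    docs = ["Paris is in France", "no entities here", "Ada Lovelace wrote notes"]
    print(link_corpus(docs, batch_size=2, workers=2))
```

Two caveats with this sketch: `ProcessPoolExecutor.map` collects its input up front rather than lazily, so for a corpus the size of c4 the rows would need to be fed in shards, and each worker process would have to load its own copy of the model, which mainly pays off on CPU.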

github-actions bot commented Feb 9, 2022

This issue has not seen recent activity

github-actions bot added the Stale label Apr 11, 2022
KDercksen removed the Stale label Apr 11, 2022