
Add blip-2 to bettertransformer #1125

Merged: 7 commits into huggingface:main on Jun 28, 2023

Conversation

@baskrahmer (Contributor) commented Jun 21, 2023

What does this PR do?

Add BLIP-2 to the BetterTransformer API.

Part of #1056
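
For illustration, a minimal usage sketch of the conversion this PR enables. The checkpoint name is an example assumption rather than anything prescribed by the PR:

```python
from transformers import Blip2ForConditionalGeneration
from optimum.bettertransformer import BetterTransformer

# Example checkpoint; any BLIP-2 variant covered by this PR should work the same way.
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

# Swap the supported attention layers for their BetterTransformer counterparts;
# the returned model is then used exactly like the original for inference.
model = BetterTransformer.transform(model)
```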

Before submitting

  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev commented Jun 21, 2023

The documentation is not available anymore as the PR was closed or merged.

@baskrahmer baskrahmer marked this pull request as ready for review June 24, 2023 07:41
@fxmarty (Contributor) left a comment


LGTM, thank you! Could you just resolve the conflict?

We should definitely have a nice table in the docs with the speedups (if there are any; I suspect that for some architectures/settings the gain is not necessarily huge).

@baskrahmer (Contributor, Author)

@fxmarty yes, I agree such a comparison would be nice! If benchmarks already exist, I would be interested in working on this.

I also thought about adding tests that assert speedups, but if the timings fluctuate this could make the CI flaky.

@fxmarty merged commit 9abc249 into huggingface:main on Jun 28, 2023
63 of 64 checks passed
@fxmarty (Contributor) commented Jun 28, 2023

There are some scripts we used for blog posts, but we did not put the results in the documentation itself: https://github.com/huggingface/optimum/tree/main/tests/benchmark

The encoder implementations may need to be revamped soon though, as we currently error out when they are used for training, while there is no longer any real reason to.

@kirillsemenov1314 commented Dec 14, 2023

@baskrahmer thank you very much for contributing BLIP-2 support!
I have a question: does it currently only support the FlanT5 model? If so, I'm a bit confused about how it is supported, since the T5 model cannot be supported due to the nature of its attention mechanism; shouldn't the same apply to FlanT5?

Also, does it show any improvement in inference speed on a T4 GPU? I could not get any in my experiments; maybe I did something wrong.
Thanks in advance!

@baskrahmer (Contributor, Author)

Hey @kirillsemenov1314 :)

> I have a question: does it currently only support the FlanT5 model? If so, I'm a bit confused about how it is supported, since the T5 model cannot be supported due to the nature of its attention mechanism; shouldn't the same apply to FlanT5?

I believe this statement no longer holds. The BetterTransformer implementation of the T5 layer is found in this file, so I suggest going through it if you're interested in the implementation.

> Also, does it show any improvement in inference speed on a T4 GPU? I could not get any in my experiments; maybe I did something wrong. Thanks in advance!

This is an interesting topic. AFAIK, active work is being done on this tool, which can also be used to run benchmarks on BetterTransformer architectures. Inference speed is influenced by a variety of factors, such as the model, dataset and hardware. It can thus very well be that there is no significant speedup from BetterTransformer in your case, and it does not necessarily imply that you are doing something wrong.
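
For a quick sanity check on your own T4, a minimal timing sketch could look like this (this is just an ad-hoc example, not the benchmarking tool mentioned above; `time_generate` is a throwaway helper):

```python
import time
import torch

@torch.no_grad()
def time_generate(model, inputs, n_runs=10):
    # Warm-up run so lazy initialization and CUDA kernel caching don't skew timing.
    model.generate(**inputs)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_runs):
        model.generate(**inputs)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_runs

# Call this on the same inputs before and after BetterTransformer.transform(model)
# and compare the averages; single measurements fluctuate a lot.
```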
