Enable uploading multiple images in demo.py #232

Open: wants to merge 2 commits into main

ifsheldon

I made a few changes to enable uploading multiple images. This should close #180.

The changes include:

  • Changed the logic of the upload button, image window, and text input
  • Added a new UI component, a gallery, to show the uploaded images

One remaining limitation is that multiple images cannot be uploaded all at once, because I suspect the following line may cause issues:

conv.messages[-1][1] = ' '.join([conv.messages[-1][1], text])

So for now, each image upload must be followed by a text input before the next image is uploaded.
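To make the suspicion concrete, here is a toy model of how a merge like the `' '.join(...)` line could behave when two uploads arrive back to back. It is only a sketch under assumptions, not the repository's code: the `upload_image` and `ask` helpers, the `"Human"` role name, and the placeholder string are hypothetical stand-ins chosen for illustration.

```python
# Toy model of a conversation as a list of [role, text] pairs.
# All names here are hypothetical; they only mimic the shape of the
# `conv.messages[-1][1] = ' '.join([...])` line quoted in this PR.

IMAGE_PLACEHOLDER = "<Img><ImageHere></Img>"  # assumed placeholder format


def upload_image(messages):
    """Record an uploaded image as its own user message."""
    messages.append(["Human", IMAGE_PLACEHOLDER])


def ask(messages, text):
    """Attach text to a trailing image message, or start a new message."""
    last_is_image = (
        bool(messages)
        and messages[-1][0] == "Human"
        and messages[-1][1].endswith("</Img>")
    )
    if last_is_image:
        # Same shape as the line quoted above: merge into the LAST message only.
        messages[-1][1] = " ".join([messages[-1][1], text])
    else:
        messages.append(["Human", text])


def demo_back_to_back():
    """Upload two images in a row, then send a single prompt."""
    messages = []
    upload_image(messages)
    upload_image(messages)
    ask(messages, "Describe both images.")
    return messages


if __name__ == "__main__":
    for role, text in demo_back_to_back():
        print(role, "|", text)
```

In this toy model the prompt is attached only to the last image message, and the first image is left as a separate user turn with no text. That is one way a merge of this shape could misbehave with consecutive uploads, and it is consistent with requiring a text input between images.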

LFavano commented May 22, 2023

I previously tried to feed multiple images manually using chat.upload_img(), but when asking the model to describe the different pictures uploaded it would still only consider one. In case you already tested this, is this also the case with your code?

@ifsheldon
Author

> I previously tried to feed multiple images manually using chat.upload_img(), but when asking the model to describe the different pictures uploaded it would still only consider one. In case you already tested this, is this also the case with your code?

I don't have this issue. Everything works fine. The quoted code is exactly the source of your issue. You can try my branch.

LFavano commented May 23, 2023

> > I previously tried to feed multiple images manually using chat.upload_img(), but when asking the model to describe the different pictures uploaded it would still only consider one. In case you already tested this, is this also the case with your code?
>
> I don't have this issue. Everything works fine. The quoted code is exactly the source of your issue. You can try my branch.

I tried your branch but am still having the issue. Here are my steps:

  1. Upload the first image
  2. Use the prompt "Please describe the first image provided, a second image is coming after" and a couple of other variations. The description provided here is good.
  3. Upload a second image
  4. Use the prompt "Please describe both the images provided". The description provided is accurate, but it covers only the second image and ignores the first one.

From this point on, asking "Please describe the [first/second] image" only yields odd descriptions that seem to mix up the two images.

If you can get the model to describe both images at the same time (or do any reasoning on multiple images at once), could you share the prompt you used?

EDIT: I realized that the outcome is a bit random. Sometimes the descriptions get mixed up and other times they don't, but I can't manage to get reliably accurate reasoning over multiple images.

Author

ifsheldon commented May 23, 2023

@LFavano yeah, the answer quality of MiniGPT4 can fluctuate, especially for the smaller models. I guess your prompt is also a bit misleading. The image embeddings are actually appended to the prompt, so the total embedding sequence the model sees is <embedding of "Please describe the first image provided, a second image is coming after"> + <embedding of the first image>. That is why MiniGPT4 sometimes gives confusing output: the input it receives is confusing as well.

@arshadshk

When an image is uploaded, it gets converted to an embedding and then concatenated to the token embeddings. In the case of multiple images, is anyone here aware of a model that takes in multiple such image embeddings simultaneously and concatenates them in the same sequence, i.e., a kind of one-pass inference?
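The concatenation discussed in this thread can be sketched without any model at all. The snippet below is a simplified illustration, not MiniGPT4's implementation: it assumes a prompt in which each `<ImageHere>` marker stands for one image, and it uses plain strings as stand-ins for real embedding tensors. The `build_sequence` helper and its tuple output format are hypothetical.

```python
# Minimal sketch: interleave text segments and image embeddings in one sequence.
# Strings stand in for embedding tensors; `build_sequence` is a hypothetical helper.

PLACEHOLDER = "<ImageHere>"  # assumed marker, one per uploaded image


def build_sequence(prompt, image_embeddings):
    """Return the ordered list of ("text", ...) and ("image", ...) pieces.

    The prompt is split on the placeholder, and the i-th image embedding is
    placed where the i-th placeholder was, so order in the prompt decides
    order in the final sequence.
    """
    segments = prompt.split(PLACEHOLDER)
    if len(segments) != len(image_embeddings) + 1:
        raise ValueError("number of placeholders must match number of images")

    sequence = []
    for index, segment in enumerate(segments):
        if segment:  # skip empty text around a leading or trailing placeholder
            sequence.append(("text", segment))
        if index < len(image_embeddings):
            sequence.append(("image", image_embeddings[index]))
    return sequence


if __name__ == "__main__":
    # Text first, image second: the ordering described earlier in the thread.
    print(build_sequence("Please describe the first image. <ImageHere>", ["img_1"]))
    # Two images in a single sequence, i.e. one pass over both.
    print(build_sequence("<ImageHere> and <ImageHere> compare them", ["img_1", "img_2"]))
```

Under these assumptions, a prompt that talks about "the first image" before the image's position in the sequence places the instruction ahead of the content it refers to, and two placeholders in one prompt yield both images in a single sequence. Whether a given model reasons well over such a sequence is a separate question that this sketch does not answer.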

Successfully merging this pull request may close these issues.

Does MiniGPT4 support multiple image uploading?
3 participants