Implement new paper: Dreambooth-StableDiffusion, Google Imagen based Textual Inversion alternative #914

underlines · 2022-09-23T10:39:01Z

Is your feature request related to a problem? Please describe.
Only word embeddings are optimized in the current Textual Inversion implementation, whereas Dreambooth fine-tunes the diffusion model as a whole. That's revolutionary.

Describe the solution you'd like
Add https://github.com/XavierXiao/Dreambooth-Stable-Diffusion

Describe alternatives you've considered

Additional context

The training images are taken from the issue in the Textual Inversion repository; they are 3 images of a large trash container. Regularization images are generated with the prompt "photo of a container". Regularization images are shown here:

After training, images generated with the prompt "photo of a sks container":

https://github.com/XavierXiao/Dreambooth-Stable-Diffusion/blob/main/assets/a-container-0038.jpg

Images generated with the prompt "photo of a red sks container":

https://github.com/XavierXiao/Dreambooth-Stable-Diffusion/blob/main/assets/a-red-sks-container-0021.jpg

Images generated with the prompt "a dog on top of sks container":

https://github.com/XavierXiao/Dreambooth-Stable-Diffusion/blob/main/assets/a-dog-on-top-of-sks-container-0023.jpg
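For readers wondering what the regularization images mentioned above are for: in the DreamBooth paper they feed a second "prior preservation" term in the training loss, so the model learns the new subject ("sks container") without forgetting the generic class ("container"). Below is a minimal sketch of how the two terms combine, in plain Python on lists of floats. The names `mse`, `dreambooth_loss`, and `prior_weight` are made up for illustration and are not taken from the linked repository, whose actual training code may differ.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(instance_pred, instance_noise,
                    class_pred, class_noise,
                    prior_weight=1.0):
    """Combine the subject loss with the prior-preservation loss.

    instance_*: noise predicted / noise added for a training image of the
                subject (prompt such as "photo of a sks container").
    class_*:    the same quantities for a regularization image
                (prompt such as "photo of a container").
    prior_weight: hypothetical name for the weight of the prior term.
    """
    instance_loss = mse(instance_pred, instance_noise)
    prior_loss = mse(class_pred, class_noise)
    return instance_loss + prior_weight * prior_loss


if __name__ == "__main__":
    # Toy numbers only, to show how the two terms are combined.
    print(dreambooth_loss([1.0, 2.0], [1.0, 0.0], [0.0], [1.0], prior_weight=0.5))
```

Setting `prior_weight` to 0 in this sketch reduces it to plain fine-tuning on the subject images alone, which is the case the regularization images are meant to guard against.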


bmaltais · 2022-09-23T14:12:11Z

I think we can already consume the resulting ckpt... but to create them you need to use the project you linked to. Not sure if it is worth integrating the creation part in this project when there is a dedicated repo for that.

It also requires a GPU with 35GB of VRAM or more... so most regular folks don't have access to that. So pretty much a non-starter. https://github.com/JoePenna/Dreambooth-Stable-Diffusion/

grexzen · 2022-09-23T14:34:06Z

Yeah no point for this gen UI, but for re training that is an awesome find.

OP needs to create a comparison though between textual inversion and this to see if there are real advantages across many prompts and image sets.

ExponentialML · 2022-09-23T14:38:58Z

> Yeah no point for this gen UI, but for re training that is an awesome find.
>
> OP needs to create a comparison though between textual inversion and this to see if there are real advantages across many prompts and image sets.

The difference between TI and Dreambooth is pretty sizable, with the latter having a major advantage.
There are a lot of examples here.

Also, there's really no need to implement Dreambooth in this. It finetunes the entire model, meaning you simply just replace the default model with the trained one afterwards. There are no embeddings to use here.
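As a rough illustration of "replace the default model with the trained one", here is a small sketch that copies a Dreambooth-trained checkpoint into the webui's model folder. It assumes checkpoints live under `models/Stable-diffusion` inside the webui directory; the function name `install_checkpoint` and the example paths are made up for this sketch, so adjust them to your setup.

```python
import shutil
from pathlib import Path


def install_checkpoint(trained_ckpt, webui_root, name=None):
    """Copy a trained .ckpt into <webui_root>/models/Stable-diffusion.

    The folder layout is an assumption about the webui install;
    change it if your checkpoints are stored elsewhere.
    """
    src = Path(trained_ckpt)
    if not src.is_file():
        raise FileNotFoundError(f"checkpoint not found: {src}")
    models_dir = Path(webui_root) / "models" / "Stable-diffusion"
    models_dir.mkdir(parents=True, exist_ok=True)
    dest = models_dir / (name or src.name)
    shutil.copy2(src, dest)  # copy, so the original training output stays in place
    return dest


# Example call (hypothetical paths):
# install_checkpoint("logs/last.ckpt", "stable-diffusion-webui", name="sks-container.ckpt")
```

After copying, the fine-tuned model is selected like any other checkpoint; no embedding file is involved.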

jd-3d · 2022-09-25T05:02:31Z

I would love to see the training part of the implementation put into the webui. There is a new memory tweak that just came out that allows the training to run on 24GB of VRAM which really opens things up to a lot of people:

See here for setting it up with the memory optimizations:
https://github.com/gammagec/Dreambooth-SD-optimized
and here for more info:
https://www.reddit.com/r/StableDiffusion/comments/xmkwmp/i_got_dreambooth_for_sd_to_work_on_my_3090_w24_gb/

Another example of how well it works:
https://www.reddit.com/r/StableDiffusion/comments/xn1jln/i_used_googles_dreambooth_to_finetune_the/

underlines · 2022-09-26T14:08:37Z

Nice explanation of the paper, showing what's possible:

https://www.youtube.com/watch?v=NnoTWZ9qgYg&t=5s

LiJT · 2022-09-29T14:06:11Z

Please!!!!!!!!

jd-3d · 2022-10-02T17:50:27Z

It's down to 10GB VRAM requirements now. This would be an amazing feature. More info here:
https://www.reddit.com/r/StableDiffusion/comments/xtc25y/dreambooth_stable_diffusion_training_in_10_gb/

bmaltais · 2022-10-11T07:40:48Z

How much VRAM does this version require?

In reply to d8ahazard (Tuesday, October 4, 2022):

> Added a q&d port of the "Optimized-Dreambooth-SD" repo's version for training checkpoints via #1655 (https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/1655). Still needs to be implemented and added to the UI, but the basic bit to do the work should be there.

d8ahazard · 2022-10-11T14:24:15Z

> How much VRAM does this version require?

#2002

The PR here can be run with 8GB using the --medvram flag on launch, but it's VERY slow ATM. Testing with WSL and DeepSpeed to see if I can't make it faster.

0xdevalias · 2022-11-01T03:59:34Z

Potentially Related:

Running AUTOMATIC1111 / stable-diffusion-webui with Dreambooth fine-tuned models #1429
[Feature request] Dreambooth deepspeed #1734
[Feature Request]: Dreambooth on 8GB VRam GPU (holy grail) #3586
DreamBooth training in under 8 GB VRAM and textual inversion under 6 GB! #1741
Dreambooth #2002
- Dreambooth #2002 (comment)
  - Closing, opening new PR to squash commits and make it clean.
Dreambooth: Ready to go! #3995
- Dreambooth: Ready to go! #3995 (comment)
  - Please give this a look and merge.

mezotaken · 2023-01-12T22:14:12Z

https://github.com/d8ahazard/sd_dreambooth_extension

dvztimes mentioned this issue Sep 24, 2022

Request - Create a Textual Inversion GUI Tab #993

Closed

ClashSAN mentioned this issue Sep 25, 2022

[Feature Request] Implement DreamBooth with memory optimizations. This is much better textual inversion via fine-tuning #1010

Closed

0xdevalias mentioned this issue Nov 1, 2022

[Feature request] Dreambooth deepspeed #1734

Closed

This was referenced Nov 1, 2022

Running AUTOMATIC1111 / stable-diffusion-webui with Dreambooth fine-tuned models #1429

Closed

[Feature Request]: Dreambooth on 8GB VRam GPU (holy grail) #3586

Closed

Dreambooth: Ready to go! #3995

Closed

mezotaken added the enhancement New feature or request label Jan 12, 2023

mezotaken closed this as completed Jan 12, 2023
