
tracker: move prepare_inputs_for_generation into the generation mixin 🧹 #32685

Open · 1 of 8 tasks
gante opened this issue on Aug 14, 2024 · 2 comments
Assignees: gante, ydshieh
Labels: Generation, WIP

gante (Member) commented on Aug 14, 2024

🧹 This is a tracker regarding the move of prepare_inputs_for_generation into the generation mixin 🧹

Why?

  1. prepare_inputs_for_generation is not part of the core modeling, but rather a utility for generate
  2. it should greatly reduce the need to touch modeling code when generate changes. Fewer modeling changes -> improved model stability
  3. greatly reduced number of lines of code 🙏

Tracker

Ordered list of tasks:

  • Fix related slow tests before we start — all llama, generate, and cache_utils [except sink cache, broken at the moment] slow tests should be passing to ensure we don’t break anything (Llama: make slow tests green 🟢 #33138)
  • Make PreTrainedModel stop inheriting from GenerationMixin, so that can_generate() becomes independent of whether prepare_inputs_for_generation is overwritten (Generation: deprecate PreTrainedModel inheriting from GenerationMixin #33203); see the can_generate() sketch after this list
  • Move llama’s prepare_inputs_for_generation to the generation mixin. This also implies moving the function that prepares the 4D attention mask (the one called from within it); see the shared-function sketch after this list
  • Add tests for the generalist prepare_inputs_for_generation — currently we don’t test it directly, and we should (see the test sketch after this list)
  • Address the case of synced_gpus in generate: when synced_gpus is set and cache_position is out of bounds, take the latest available input_ids for dummy computations (see Fix synced GPUs #33252; should fix Multi GPU generate with llama shape error #32885, Shape mismatch when generating with multiple processes #32603, and Bugfix for generation with an early-stopping process #32641); see the dummy-token sketch after this list
  • Delete prepare_inputs_for_generation from as many models as possible. There may be merge conflicts here, due to the 4D mask function. Try to iron out as many trivial cases as possible
  • Change prepare_inputs_for_generation to forward **kwargs from its input to its output. With minimal changes, this should enable most VLMs to use the shared function (they forward pixel_values from the input to the output); the shared-function sketch after this list shows this pattern
  • By this point, most cases of prepare_inputs_for_generation should be removed 🤗 We will need to check the remaining ones individually; there may be further simplification patterns available!
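
For reference, a minimal sketch of how can_generate() could work once the inheritance change lands. This is illustrative, not the exact transformers implementation: the class bodies are stubs, and the real method has more checks.

```python
# Minimal sketch, assuming the post-#33203 world: once `PreTrainedModel` no
# longer inherits from `GenerationMixin`, `can_generate()` can simply test for
# the mixin itself, instead of inspecting whether
# `prepare_inputs_for_generation` was overridden.

class GenerationMixin:
    def generate(self, *args, **kwargs):
        ...  # the full generation loop lives here


class PreTrainedModel:  # no longer inherits from GenerationMixin
    @classmethod
    def can_generate(cls) -> bool:
        # A model can generate iff it opted into the mixin explicitly.
        return issubclass(cls, GenerationMixin)


class LlamaForCausalLM(PreTrainedModel, GenerationMixin):
    pass


assert LlamaForCausalLM.can_generate()
assert not PreTrainedModel.can_generate()
```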
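
And a sketch of what the shared prepare_inputs_for_generation could look like on the mixin, written as a standalone function for brevity. Illustrative only: the real method also has to build the 4D attention mask and handle several cache types.

```python
import torch


def prepare_inputs_for_generation(
    input_ids: torch.LongTensor,
    past_key_values=None,
    attention_mask=None,
    cache_position=None,
    **kwargs,
):
    # With a cache, `cache_position` indexes the tokens the model has not
    # processed yet, so we only feed those.
    if past_key_values is not None:
        input_ids = input_ids[:, cache_position]

    model_inputs = {
        "input_ids": input_ids,
        "past_key_values": past_key_values,
        "attention_mask": attention_mask,
        "cache_position": cache_position,
    }
    # Forward extra model-specific inputs (e.g. `pixel_values` for VLMs)
    # untouched from input to output -- this is the pattern that lets most
    # VLMs reuse the shared function without overriding it.
    model_inputs.update(kwargs)
    return model_inputs
```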
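
The direct tests could then look something like this (hypothetical test names, exercising the sketch above rather than the library's test suite):

```python
import torch


def test_prepare_inputs_slices_with_cache():
    input_ids = torch.tensor([[1, 2, 3, 4]])
    cache_position = torch.tensor([3])  # only the last token is new
    out = prepare_inputs_for_generation(
        input_ids, past_key_values=object(), cache_position=cache_position
    )
    assert out["input_ids"].tolist() == [[4]]


def test_prepare_inputs_forwards_extra_kwargs():
    input_ids = torch.tensor([[1, 2, 3]])
    pixel_values = torch.zeros(1, 3, 224, 224)
    out = prepare_inputs_for_generation(input_ids, pixel_values=pixel_values)
    assert out["pixel_values"] is pixel_values
```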
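
Finally, the synced_gpus case, sketched as a hypothetical helper: a process that finished generating early must keep doing dummy forward passes so that collective ops stay in sync across GPUs, and if cache_position already points past the end of input_ids it should fall back to the latest available tokens instead of indexing out of bounds.

```python
import torch


def select_input_tokens(
    input_ids: torch.LongTensor, cache_position: torch.LongTensor
) -> torch.LongTensor:
    # Hypothetical helper: when this process finished early (synced_gpus) and
    # `cache_position` overshoots, reuse the latest tokens as dummy inputs.
    if cache_position[-1] >= input_ids.shape[1]:
        return input_ids[:, -cache_position.shape[0] :]
    return input_ids[:, cache_position]
```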
gante self-assigned this on Aug 14, 2024
gante (Member, Author) commented on Aug 14, 2024

@ydshieh edit the tracker above as soon as you start working on a task, so we don't risk doing redundant work 🤗 (e.g. by adding the link to a draft PR)

I'll do the same!

ydshieh self-assigned this on Aug 14, 2024
ydshieh (Collaborator) commented on Aug 14, 2024

Thanks
