Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix local repos with remote code not registering for pipelines #33100

Merged
merged 9 commits into from
Aug 30, 2024

Conversation

Rocketknight1
Copy link
Member

@Rocketknight1 Rocketknight1 commented Aug 23, 2024

Draft PR, don't merge! Experimenting with fixes for the issue with local repos and remote code not registering correctly for pipelines.

TODO:

  • Add a test
  • Maybe run some slow autoclass/pipeline tests to make sure I'm not breaking anything
  • Try removing model_class.register_for_auto_class(cls.__name__) entirely?

Fixes #32923

@Rocketknight1 Rocketknight1 changed the title Extremely experimental fix! Fix local repos with remote code not registering for pipelines Aug 23, 2024
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Rocketknight1 Rocketknight1 force-pushed the fix_local_vs_remote_autoclassing branch from 0bb67d8 to b2eb197 Compare August 26, 2024 14:11
@Rocketknight1 Rocketknight1 force-pushed the fix_local_vs_remote_autoclassing branch from 5f4eb03 to 8142311 Compare August 27, 2024 17:02
@Rocketknight1 Rocketknight1 marked this pull request as ready for review August 27, 2024 17:09
@Rocketknight1
Copy link
Member Author

This is now ready for review! The explanation for it is a bit confusing, though. The original issue is that custom code models did not correctly register with the pipeline model mappings when they were loaded from a local directory instead of from the Hub. The cause is this block of code that treats models differently when they are loaded from a local dir:

if os.path.isdir(config._name_or_path):
    model_class.register_for_auto_class(cls.__name__)
else:
    cls.register(config.__class__, model_class, exist_ok=True)

When I investigated, this code block was added by @sgugger to fix another issue, where custom code models loaded from local directories did not correctly save their modeling files when saved with save_pretrained().

In testing, I tried reverting this change, and it appears that custom code models now correctly save those files even after the revert.

Therefore, this PR reverts the change, and adds two new tests, one test to ensure that loading a local custom code model in a pipeline works now (this test fails without this PR), and one test to ensure that custom code models loaded from a local dir still correctly save their modeling files (this test passes on main, but I'm adding it as an anti-regression precaution).

cc @LysandreJik

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok the fix works for me and the test is good! As said in the other PR, would it be possible to move the user-owned repos to hf-internal-testing ?

@Rocketknight1
Copy link
Member Author

@LysandreJik done!

@Rocketknight1 Rocketknight1 merged commit 38d58a4 into main Aug 30, 2024
24 checks passed
@Rocketknight1 Rocketknight1 deleted the fix_local_vs_remote_autoclassing branch August 30, 2024 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Local repos not correctly registering for pipelines
3 participants