How to load a package as a plugin, and reference submodules? #4

Open
jamesbcd opened this issue Feb 12, 2022 · 13 comments
@jamesbcd

jamesbcd commented Feb 12, 2022

@avylove, I love this plugin library. But I'm stuck on packages. Is it possible to point me to an example where a package is loaded as a plugin? For example, if I have the following:

mypackage/
    __init__.py (Contains the subclass of the plugin parent)
    submodule1.py
    submodule2.py

How can I import submodule1.py from within __init__.py, or import submodule2 from within submodule1.py?

Thanks so much.

@avylove
Contributor

avylove commented Feb 12, 2022

A plugin is specifically a class, but the loader accepts modules and will look in the module for plugin classes.

PluginLib doesn't care what you do within the classes or modules as long as you meet the spec defined in the parent class.

Yes, you can import submodule1 from __init__.py and import submodule2 from submodule1 as long as you don't have circular dependencies. All of this should be resolved when the modules get loaded. It should work with relative imports (.module1) in mypackage, but I recommend using absolute imports (mypackage.module1).

In your example, you just need to make sure you import the module containing the parent before calling PluginLoader and then supply modules=['mypackage'] as an argument.

Using the example from the docs, but spreading it out across mypackage.

├── mypackage
│   ├── __init__.py
│   ├── module1.py
│   └── module2.py
├── sample.py
└── test.py

Define the parent in sample module

"""
sample.py
"""
import pluginlib

@pluginlib.Parent('parser')
class Parser(object):

    @pluginlib.abstractmethod
    def parse(self, string):
        pass

mypackage root

"""
mypackage/__init__.py
"""
import sample
import mypackage.module1

class JSON(sample.Parser):

    _alias_ = 'json'

    def parse(self, string):
        return mypackage.module1.parse(string)

mypackage.module1

"""
mypackage/module1.py
"""
from mypackage.module2 import parse

mypackage.module2

"""
mypackage/module2.py
"""
import json

def parse(string):
    return json.loads(string)

Loading and calling plugins in test module

"""
test.py
"""
import pluginlib
import sample  # Import module with parent classes so they get registered

loader = pluginlib.PluginLoader(modules=['mypackage'])  # specify mypackage as a search location
plugins = loader.plugins   # Load the plugins
parser = plugins.parser.json()  # Use plugin
print(parser.parse('{"json": "test"}'))

Does that help? Let me know if there is something I didn't cover or you run into an issue.

@jamesbcd
Author

Thanks, that's really clear. What I failed to mention is that mypackage is external to the application. I load it using PluginLoader(paths=[path_to_mypackage], prefix_package='myapp.plugins')

When the various submodules in mypackage are self-contained, it all works fine. But as soon as I try to import one submodule into the other, I get:

pluginlib.exceptions.PluginImportError: Error while importing candidate plugin module myapp.plugins.mypackage from /path/to/mypackage: ModuleNotFoundError("No module named 'myapp.plugins.mypackage'")

The import statement in the submodule doesn't seem to know about its new namespace, even when I use relative references.

@avylove
Contributor

avylove commented Feb 12, 2022

It sounds like myapp isn't in your Python path. Assuming its path is /home/jamesbcd/dev/myapp, run it like this:

PYTHONPATH=/home/jamesbcd/dev python test.py

That's for Linux. On Windows it's a little different, but you can Google that if you need to. Basically, you need to tell Python where to find packages. Installed packages are on the default path, and the path also includes the directory your initial script is in, but Python needs to be told about anything else.

For the example above, if you rearrange the files like this:

├── mypackage
│   ├── __init__.py
│   ├── module1.py
│   └── module2.py
└── test
    ├── sample.py
    └── test.py

and ran test.py from the test directory without setting PYTHONPATH, you'd get a similar result:

$ python test.py 
Traceback (most recent call last):
...
pluginlib.exceptions.PluginImportError: Error while importing candidate plugin module mypackage from None: ModuleNotFoundError("No module named 'mypackage'")

But with PYTHONPATH set to one directory up (the same as ..), it works as expected.

$ PYTHONPATH=../ python test.py 
{'json': 'test'}

It's not a PluginLib-specific error. PluginLib is reporting it because it was encountered while importing mypackage to load plugins.
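
As a hedged aside, the path lookup described above can be checked from Python itself. The sketch below builds a throwaway package (the name demo_pkg_for_path_check and the temporary directory are illustrative, not from this thread) and uses importlib.util.find_spec to show that the package only becomes importable once its parent directory is on sys.path, which is what setting PYTHONPATH arranges at startup.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

PKG_NAME = "demo_pkg_for_path_check"  # hypothetical package name, for illustration only


def is_importable(name):
    """Return True if Python can currently locate a top-level module or package."""
    importlib.invalidate_caches()  # notice directories created after startup
    return importlib.util.find_spec(name) is not None


# Build a throwaway package inside a fresh temporary directory.
parent_dir = Path(tempfile.mkdtemp())
(parent_dir / PKG_NAME).mkdir()
(parent_dir / PKG_NAME / "__init__.py").write_text("VALUE = 1\n")

before = is_importable(PKG_NAME)     # parent directory is not on sys.path yet

sys.path.insert(0, str(parent_dir))  # comparable to PYTHONPATH=<parent_dir>
after = is_importable(PKG_NAME)

sys.path.remove(str(parent_dir))     # leave sys.path as we found it
print(before, after)
```

The same check works for a real plugin package: if find_spec returns None for it, any absolute import of that package from inside the plugin code will fail in the same way.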

@jamesbcd
Author

myapp is already in the python path, as mypackage can already import from it just fine.

However, I realise that perhaps I've misunderstood how pluginlib works. Is it correct to say that when passing a path to PluginLoader, pluginlib still relies on a normal python import to be successful at that path, and pluginlib's real work is then to find the correct plugin classes within that imported package?

In which case perhaps my problem is that mypackage is not in the python path. I tried appending the parent path of mypackage to sys.path prior to instantiating PluginLoader(paths=['path/to/mypackage'], prefix_package='myapp.plugins'). This sort of worked, but results in this warning:

PluginWarning: Duplicate plugins found for <class 'myapp.plugins.mypackage.MyPlugin'>: 'myapp.plugins.mypackage.MyPlugin and mypackage.MyPlugin

Is it possible to avoid duplicate plugins? Should I not specify a prefix_package? Thanks for your help. The python import machinery I find a little confusing.

@avylove
Contributor

avylove commented Feb 13, 2022

mypackage does need to be in the path if you are going to use internal imports.

An easy way to verify the paths is to run python -c "import mypackage" from the same directory you're invoking PluginLib from.

Yes, PluginLib is really just a wrapper around metaclasses and import magic. It relies on the same import machinery, but calls it in a way to make things flexible. The idea of paths is so users can provide file paths in a config without really knowing they are using Python. If your users are Python devs, modules might make more sense. Either way, that plugin code needs to be able to operate independently, either being self-contained or having all its imports in the Python path.

You're getting duplicate warnings because it's finding the same class twice. The plugins get registered when they are loaded so if you're loading them manually and then loading them through PluginLoader, you'll get duplicates. It won't break anything and you can disable the warning with the warnings package, but it's probably better to resolve the issue.

The prefix package is really just there so bare modules can be loaded by path without risking conflict with something else. I don't think it factors in here.
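
For reference, here is a minimal sketch of silencing a single warning category with the standard warnings module, as mentioned above. PluginWarning here is a local stand-in class and load_plugins() is a hypothetical function, both defined only so the sketch runs without PluginLib installed. With the real library you would pass its own PluginWarning class instead (assumed, not verified here, to be importable from pluginlib.exceptions alongside PluginImportError).

```python
import warnings


class PluginWarning(UserWarning):
    """Local stand-in for the library's warning class (illustration only)."""


def load_plugins():
    """Hypothetical loader call that emits a duplicate-plugin warning."""
    warnings.warn("Duplicate plugins found", PluginWarning)
    return {"parser": {}}


# Suppress only this category, and only while the plugins are being loaded.
# catch_warnings() restores the previous filters when the block exits.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=PluginWarning)
    plugins = load_plugins()

print(plugins)
```

Scoping the filter to a with block keeps other warnings visible, which matters here because the duplicate warning can also point at a real problem.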

@jamesbcd
Author

Is it possible that the duplication occurs during the plugin directory walk? I'm providing PluginLoader with the path of the parent directory of mypackage. Should it be the actual path to mypackage instead?

@jamesbcd
Author

jamesbcd commented Feb 13, 2022

Actually, as I debug, the issue seems to be the recursive importing of mypackage. As PluginLoader executes the mypackage module, the loader encounters this code inside mypackage/__init__.py:

from mypackage.submodule1 import SomeClass

I think that executing this import statement is causing mypackage to be imported the first time (before PluginLoader completes its work). Then, once PluginLoader does complete, mypackage is imported the second time (now with the myapp.plugins prefix).

Does that sound possible?

@avylove
Contributor

avylove commented Feb 13, 2022

I don't think that's it exactly, but while working on another issue, it seems PluginLoader is loading package modules multiple times on Python 3.5+. I will push up a fix for that soon and maybe that will solve your issue. I'm very curious. Will reply here once I push that up.

@avylove
Contributor

avylove commented Feb 13, 2022

Just pushed up some changes. Please see if that helps with the duplicate imports. Assuming it does, I have a little maintenance work to do on the repo and then I can push a new version to PyPI.

@avylove
Contributor

avylove commented Feb 13, 2022

Pushed 0.9.0 to PyPI, so should be easier to test. Let me know if it improves things.

@jamesbcd
Author

jamesbcd commented Feb 14, 2022

Thank you! This is working much better now. It also turns out that plugin duplication now depends on whether the imports inside mypackage are relative or absolute.

I tested it out with a minimal example closer to my actual app:

myapp/
├─ __init__.py
├─ __main__.py
├─ plugin_parent.py
external/
├─ plugin_root1/
│  ├─ mypackage_root/
│  │  ├─ mypackage/
│  │  │  ├─ __init__.py
│  │  │  ├─ submodule1.py
│  │  │  ├─ submodule2.py
│  │  ├─ docs/
│  │  ├─ data/
│  ├─ otherpackage_root/
├─ plugin_root2/

#
# myapp/__main__.py
#
import sys
from pathlib import Path
import pluginlib

from .plugin_parent import Plugin

plugin_roots = [Path(__file__).parent.parent / "external/plugin_root1",
                Path(__file__).parent.parent / "external/plugin_root2"]

for plugin_root in plugin_roots:
    print(f"Looking in plugin root {plugin_root}")
    for path in plugin_root.iterdir():
        if not path.is_dir():
            continue
        print(f"\tLooking in directory {path}")
        
        path_in_syspath = str(path) in sys.path  # sys.path holds strings, so compare as str
        if not path_in_syspath:
            sys.path.append(str(path))
        
        loader = pluginlib.PluginLoader(paths=[str(path)], prefix_package='myapp.plugins')
        plugins = loader.plugins
        
        if not path_in_syspath:
            sys.path.remove(str(path))
        
        print(f"\t\t{plugins}")

#
# external/plugin_root1/mypackage_root/mypackage/__init__.py
#
import sys
from myapp.plugin_parent import Plugin

from .submodule1 import func1

print("\t\tExecuting mypackage/__init__.py")

class PluginSubclass(Plugin):
    pass

#
# Output (with redundant path info trimmed out)
#
Looking in plugin root external/plugin_root1
        Looking in directory external/plugin_root1/otherpackage_root
                {'Plugin': {}}
        Looking in directory external/plugin_root1/mypackage_root
                Executing mypackage/__init__.py
                {'Plugin': {'PluginSubclass': <class 'myapp.plugins.mypackage_root.mypackage.PluginSubclass'>}}
Looking in plugin root external/plugin_root2

This works! (Prior to your pluginlib 0.9 update, the above failed with ModuleNotFoundError("No module named 'myapp.plugins'").)

What's interesting is that if I switch mypackage/__init__.py from relative to absolute imports, I get the duplicated plugin warning again:

#
# external/plugin_root1/mypackage_root/mypackage/__init__.py
#
import sys
from myapp.plugin_parent import Plugin

from mypackage.submodule1 import func1    # <-- Change to absolute

print("\t\tExecuting mypackage/__init__.py")

class PluginSubclass(Plugin):
    pass

#
# Output (trimmed)
#
Looking in plugin root external/plugin_root1
        Looking in directory external/plugin_root1/otherpackage_root
                {'Plugin': {}}
        Looking in directory external/plugin_root1/mypackage_root
                Executing mypackage/__init__.py
                Executing mypackage/__init__.py
external/plugin_root1/mypackage_root/mypackage/__init__.py:7: PluginWarning: Duplicate plugins found for <class 'myapp.plugins.mypackage_root.mypackage.PluginSubclass'>: myapp.plugins.mypackage_root.mypackage.PluginSubclass and mypackage.PluginSubclass
  class PluginSubclass(Plugin):
                {'Plugin': {'PluginSubclass': <class 'mypackage.PluginSubclass'>}}
Looking in plugin root external/plugin_root2

So relative imports worked as I'd hoped, whereas absolute imports do seem to result in mypackage being imported/executed twice, under two different fully-qualified names. I'm not sure if something about my directory structure is causing the duplication?

@avylove
Contributor

avylove commented Feb 14, 2022

I think the difference is due to the way the modules are imported when given as a path. In this case, the importlib loader's exec_module() method is used and the work is done in its own namespace. I think using absolute imports bleeds into the global namespace. Essentially the same code is imported in two different namespaces.
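
To illustrate the "same code in two namespaces" effect, here is a minimal sketch that is not PluginLib's actual loading code. It executes one source file under two module names with the standard importlib.util helpers. The file name demo_plugin.py, the prefix demo_app.plugins, and the helper load_as() are all made up for the example.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_as(name, file_path):
    """Execute file_path as a module called `name` (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(name, file_path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module       # register under the chosen name
    spec.loader.exec_module(module)  # run the file's code in the new module's namespace
    return module


# A throwaway single-file "plugin"; the class name mirrors the thread's example.
source = Path(tempfile.mkdtemp()) / "demo_plugin.py"
source.write_text("class PluginSubclass:\n    pass\n")

prefixed = load_as("demo_app.plugins.demo_plugin", str(source))
bare = load_as("demo_plugin", str(source))

# One file on disk, but two module objects and two distinct class objects.
print(prefixed.PluginSubclass is bare.PluginSubclass)
print(prefixed.PluginSubclass.__module__, bare.PluginSubclass.__module__)
```

Because the two classes are different objects with different fully-qualified names, a registry keyed on the class name would see them as two competing definitions, which is consistent with the duplicate warning reported earlier in the thread.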

It would be good to let these kinds of behaviors go without a warning in cases like this, but this is actually handled in the PluginType metaclass when plugins are defined, not when they are loaded, so it may be difficult to know if the conflict refers to the same object or not. Potentially, we could check whether they refer to the same code, but, even if that worked, I think this could cover up other issues.

I need to think on it a bit.

You may want to consider using modules instead of paths. It uses a more direct import logic, so these types of issues shouldn't come up. That said, it really comes down to what kind of contract you want to have with your users. Using modules assumes your plugins are installed Python modules rather than a directory of Python modules.

@jamesbcd
Author

That all makes sense. Thanks for all your help and insight - really useful and much appreciated.
