Turning Plugin Management into Actual Package Management #179

Open
s-m-e opened this issue May 21, 2020 · 30 comments

@s-m-e

s-m-e commented May 21, 2020

QGIS Enhancement: Turning Plugin Management into Actual Package Management

date: 2020-05-21
author: Sebastian M. Ernst (@s-m-e)
contact: ernst@pleiszenburg.de
maintainer: @s-m-e
version: QGIS 3.16

Abstract

QGIS Python plugins cannot explicitly depend on regular Python packages. Although QGIS Python plugins can depend on other QGIS Python plugins, a feature introduced in QGIS 3.8, this mechanism is far from mature. Code quality, design and maintainability of the entire current plugin management system within QGIS, based on a detailed analysis of version 3.12, are questionable at best. This document proposes (a) to re-implement the existing plugin management system with all of its features, (b) to clean up the cross-plugin dependency design and (c) to add support for both the conda and the pip Python package managers for managing QGIS Python plugins - effectively adding support for dependencies between QGIS Python plugins and regular Python packages. These proposed changes are fully backward compatible and do not introduce adverse performance characteristics.

TL;DR

Although not overly complicated from a purely technical point of view, this QEP is an unusually long one. Certain gaps in QGIS' relationship to the Python ecosystem make QGIS fall far short of its true potential. This document proposes to close the gaps. However, figuring out an actually good, sustainable solution - not just any solution - is not trivial. It requires a thorough analysis of (1) the problem, (2) the current codebase, (3) all possible options forward and (4) potential pitfalls. Unfortunately, the topic of Python modules and packaging is also paved with common misconceptions. This QEP has therefore intentionally been developed in great detail in a separate GitHub repository. The following table of contents provides direct links to the individual chapters of the QEP within the separate repository. Once the discussion ends, a final version of this QEP might replace this "TL;DR" section.

Interaction between traditional QGIS plugins and conda / pip

The described approach does not introduce any changes to the QGIS "legacy" Python plugin "package" format. It is therefore perfectly feasible to distribute a QGIS Python plugin in multiple formats, e.g. in the legacy format and as a conda package, while maintaining the entire project in a simple directory tree within a git repository or similar.

Because the support for conda and pip should not replace the existing "legacy" QGIS plugin "package" format, all three distribution paths are supposed to coexist. It is proposed to design the new pluginmanager module in a modular way, containing one "backend" each for conda, pip and the "legacy" QGIS Python plugin "package" format. Each backend is on its own responsible for installing & uninstalling QGIS Python plugins as well as handling their dependencies. Dependencies can intentionally not be handled across multiple backends, which would blow up the complexity of the implementation beyond a manageable point. The pluginmanager will only handle direct conflicts between the different backends and prohibit broken installations by exploiting dry-run capabilities of all involved tools. If a plugin is available from multiple backends (and/or in multiple versions), a user will be asked to choose a backend (and a version) for installation.
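
The backend layout described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Backend` interface, the `InMemoryBackend` stand-in and the `installation_choices` helper are hypothetical names and are not taken from the QEP or from QGIS.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class Backend(ABC):
    """Hypothetical interface: one distribution path (conda, pip or QGIS zip packages)."""

    name = "abstract"

    @abstractmethod
    def versions(self, plugin: str) -> List[str]:
        """Versions of `plugin` that this backend can provide."""

    @abstractmethod
    def install(self, plugin: str, version: str, dry_run: bool = False) -> bool:
        """Install `plugin`; with dry_run=True only report whether it would succeed."""


class InMemoryBackend(Backend):
    """Stand-in backend backed by a dict, for illustration only."""

    def __init__(self, name: str, catalog: Dict[str, List[str]]):
        self.name = name
        self._catalog = catalog
        self.installed: Dict[str, str] = {}

    def versions(self, plugin: str) -> List[str]:
        return list(self._catalog.get(plugin, []))

    def install(self, plugin: str, version: str, dry_run: bool = False) -> bool:
        ok = version in self._catalog.get(plugin, [])
        if ok and not dry_run:
            # Only a real (non dry-run) installation changes the backend's state.
            self.installed[plugin] = version
        return ok


def installation_choices(backends: List[Backend], plugin: str) -> List[Tuple[str, str]]:
    """All (backend name, version) pairs a user could be asked to choose from."""
    return [(b.name, v) for b in backends for v in b.versions(plugin)]
```

Each backend only knows about its own packages, which mirrors the statement that dependencies are intentionally not resolved across backends.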

QGIS already supports package repositories. The concept will simply be extended by allowing conda and pip repositories with configuration options specific to those backends. In the case of conda, different conda repositories within a QGIS configuration could refer to different conda package channels. In the case of pip, different pip repositories could refer to specific package sources.

Table of Contents

  1. Motivation
  2. Proposed, Preferred Solution
  3. Alternative, Unfavorable Solutions
  4. Performance Implications
  5. Backward Compatibility
  6. Copyright
  7. Issue Tracking ID(s)
  8. Votes
@rouault

rouault commented May 21, 2020

First, I should state this is really an AMAZING analysis !!! Probably one of the most thorough QEPs ever.

Could I just ask that you extend a bit the TL;DR section to highlight a bit more about the interaction with pip and conda (mostly from things mentioned in https://github.com/qgist/pluginmanager-qep/blob/master/QEP.md#step-2-adding-support-for-python-package-managers), which is the main motivation for this rework. This will make this QEP even better

I'm not sure if, given this proposal, QGIS would be able to install new QGIS plugins from pip or conda, or if it would find QGIS plugins previously installed through them ?

Tiny detail regarding discovery of QGIS plugins from pip / conda: you mention scanning through the packages to find those with metadata.txt. I was wondering if for example having a dummy "qgis-base-plugin" package that pip/conda QGIS plugin packages would be required to depend on wouldn't make the discovery faster if there's some centralized file somewhere with the chain of dependencies (one advantage I can think of in having this base package, however, would be the ability to uninstall all packaged QGIS plugins by removing it). But I haven't looked at the implementation details of pip/conda, so perhaps the approach of scanning for metadata.txt is the best one.

@s-m-e
Author

s-m-e commented May 21, 2020

@rouault

Could I just ask that you extend a bit the TL;DR section to highlight a bit more about the interaction with pip and conda [...]

Do you want this information simply added from the long version to the TL;DR - or does the actual long version leave any open questions for you?

I'm not sure if, given this proposal, QGIS would be able to install new QGIS plugins from pip or conda, or if it would find QGIS plugins previously installed through them ?

QGIS would be able to install new QGIS plugins from PyPI (through pip) or through conda, yes. It would also find plugins previously installed through pip and conda if a user was running pip and/or conda outside of QGIS. After all, for simplicity, QGIS would scan every folder in Python's sys.path for Python packages containing a metadata.txt file. It does not matter how those Python packages got there, whether through QGIS interacting with pip or conda or through using pip or conda independently.

[...] dummy "qgis-base-plugin" package that pip/conda QGIS plugin packages would be required to depend on [...]

Sort of a "reverse dependency lookup" - i.e. "Which Python packages depend on the qgis-base-plugin package?" Both pip and conda can sort of do this, but especially in the case of conda, the process can be agonizingly slow. Just scanning through the filesystem is way faster. Python has an excellent scandir implementation, added in Python 3.5. It's perfectly capable of doing what I propose really, really fast :)

@rouault

rouault commented May 21, 2020

Do you want this information simply added from the long version to the TL;DR

yes. As it appears otherwise close to the end of the long version, given reader fatigue, it is a bit hard to extract the high level overview of how QGIS would interact exactly with pip/conda.

Python has an excellent scandir implementation, added in Python 3.5. It's perfectly capable of doing what I propose really, really fast :)

Sounds good

@olivierdalang

olivierdalang commented May 21, 2020

Wow !!! It was a great read !! Thanks for the proposal :-)

I also had in mind working on adding pip support to QGIS, but certainly with less ambition than you. Also, from a recent small contribution to the plugin manager, I can only agree that the code really deserves a rewrite.

A few suggestions for the QEP :

1/ Maybe add a short description on how the experience would change
both for plugin developers (will they need to now write a setup.py file ?) and end users (do they need to choose between qgis repo, pypi or conda ? do they choose once, or on a per-plugin basis ? for conda, is everything installed for them automatically ? what if there are some incompatible requirements ? etc.)
(mostly addressed already, I agree with @rouault's comment about read fatigue :-) )

2/ I think the alternative/unfavorable sections should mention the somewhat opposite strategy suggested on the ML thread (moving more towards C++ and keeping the python part to a minimum). It's not clear to me what goals of this QEP would be harder to achieve with that other approach ?

And a few questions/suggestions on the proposal itself :

1/ About backwards compatibility: there will be a lot of things to take into account for a rewrite from scratch (metadata.txt VS setup.py, qgis plugin repository VS pypi/conda repos, etc.). Handling all this legacy logic will make it hard to have a clean implementation and it may also become confusing for plugin developers (will they have to create both setup.py and metadata.txt ?).
An alternative could be to deal with backwards compatibility at the repository level, e.g. automate the publishing of pypi packages from the QGIS repo for old-style plugins, and automate the publishing of QGIS packages from pypi for new-style plugins.

2/ It's not exactly clear to me from reading the QEP what the intention is in terms of repositories. Would QGIS-Django stay the central repository for QGIS plugins (and thus expose a custom python package index, mirroring plugins.xml) ? Or would pypi become the central repo ? I think we want to keep one central repository at all costs (as there's no such thing as multiple central repositories :-) ). Multiple repositories should only be used for enterprise/dev setups, not for distributing plugins to the public.
As a reference, Django packages are hosted directly on pypi, and can be listed independently using classifiers, and it works well enough. Not sure if classifiers could be used for plugin discovery too ?

3/ Would packages be installed in a virtualenv ? Is it doable to have one virtualenv per user profile ?

4/ Maybe C++ plugins could be managed completely independently, so that the new implementation wouldn't have to take care of this. Basically it means the described C++ fallback manager wouldn't be a fallback, but just the regular C++ manager. It could be moved into QGIS settings instead, and maybe even rebranded as "extensions" to avoid confusion with python plugins (I'm not really sure of this, as I never installed/enabled/disabled a C++ plugin, but it seems you can't do much with C++ plugins in the manager anyways, and the move is towards killing them).

5/ Would the new plugin manager be a python plugin itself ?

6/ I'm not sure I like the idea of an alternative installer for Windows. If this work ends up making QGIS fully installable via pip or conda, that's a nice bonus, but I wouldn't advertise this as an installation method to the general public; it's already confusing enough now with just two ways to install.

Many thanks again for the great work ! I'll be following this with great interest ;-)

@nyalldawson
Contributor

Nice description, thanks!

A couple of things which come immediately to mind

  • I don't think we should brand the existing way of providing plugins as a "legacy" approach, as this has negative connotations. (At least until the new way is a feature match!)

  • The current plugin repo approach has a very attractive feature in that it is SUPER easy for users/organisations to create custom plugin repos. All you need is a simple xml file stored somewhere -- this can even be on github directly or on a network share! There's no special server software required at all. And this is VERY widely used by organisations who either want to publish internal-only plugins or who have their own curated whitelist of plugins. It is unclear to me how this use case would be supported by the pip/conda style approach. How would you see this use case being addressed?

  • In the absence of a formal stable API for scripts to handle plugin management, the pyplugin_installer package has become a quasi-public API for this. Using methods from pyplugin_installer is the ONLY way scripts can be written which do tasks like automatically installing another plugin or automatically checking for updates and updating a plugin. This is CRUCIAL functionality for enterprise deployments, where there is a need for QGIS startup scripts to be able to force install and force update internal organisation specific plugins. If pyplugin_installer is to be removed then we'll need to offer a formal, stable API to support this use case (and hopefully one much nicer than having to use the raw pyplugin_installer class, which is seriously horrible to use!)

  • Something to keep in mind: The convention in 3.x is that stable API is exposed through the qgis. prefix. If scripts are directly importing other classes (such as from processing import ...), then that's considered private api and use-at-your-own-risk. I'd like to see this same convention being used here, so that anything you consider as stable, public api is injected into the qgis.plugin_management (or similar) namespace, and anything not public stays outside of the qgis... prefix.

  • Regarding c++ plugins: I think you could potentially avoid the need for the fallback c++ plugin management GUI. There's only a handful of c++ plugins left in the default install, and there's open PRs for removing another 3 of these (Drop the evis plugin QGIS#36607, Drop Globe plugin QGIS#36604, Drop compass plugin QGIS#36599). Potentially, the remaining plugins could be either moved to app directly (georeferencer, offline editing, geometry checker) or enabled/disabled by default with a QSettings key as the only way to change this. This would mean the new plugin manager GUI could focus SOLELY on Python plugin management...

  • In order to integrate with QGIS best, QgsTask should be utilised during plugin installation and upgrades

  • You'll NEED to use the Qt network framework for ALL network requests (i.e. QgsNetworkAccessManager or QgsBlockingNetworkRequest). You can't use Python network modules like requests here. Using QgsNetworkAccessManager is the only way to get the requests correctly using the user's proxy configuration, get the requests logged with the Network Logger panel, and the ONLY way to maintain the current functionality which integrates the QGIS authentication framework with plugin installation/upgrades.

After all, for simplicity, QGIS would scan every folder in Python's sys.path for Python packages containing a metadata.txt file.

This sounds quite costly to do -- would it happen on every startup? We've already got an issue with sloooow QGIS startup speeds caused by all the Python imports happening when Processing is initialised, and I'd be concerned that this approach may make things much worse 😱

@dmarteau

dmarteau commented May 22, 2020

Hi, Nice analysis !

Python does not intentionally allow this use-case for imported modules, so QGIS' implementation can be considered a hack and is unreliable. However, the added benefit from this feature is the ability to re-load a plugin at QGIS runtime, which greatly helps with the development of QGIS Python plugins.

I cannot agree more, this part is a total hack and relies on deprecated API. This is one of the most fragile parts of the python plugin support in QGIS, and will blow up at some point in the future.

Worse, the unloading hack is not able to remove all of a plugin's declarations, especially declarations in other namespaces. So, unloading a plugin is merely a false promise since many side effects are potentially left behind.

Anyway, I have been considering the question for some time now, and I have never found a satisfactory path that supports both clean and sustainable uninstall and hot uninstall of packages. My guess is that a restart of the python interpreter would be the safer solution.

@dmarteau

dmarteau commented May 22, 2020

After all, for simplicity, QGIS would scan every folder in Python's sys.path for Python packages containing a metadata.txt file.

There is no need for this, Python has an extension auto-discovery mechanism using setup.py entry_points: It completely removes the need for metadata.txt and entry points located in the __init__.py
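
For readers unfamiliar with the mechanism, entry point discovery could look roughly like the sketch below. It relies on `importlib.metadata` from the standard library (Python 3.8 or newer); the group name `qgis.plugins` is purely an assumption for illustration and is not an existing QGIS convention.

```python
from importlib.metadata import entry_points

# Hypothetical group name. A plugin would declare it in its setup.py, e.g.
#   entry_points={"qgis.plugins": ["myplugin = myplugin:classFactory"]}
PLUGIN_GROUP = "qgis.plugins"


def discover_plugins(group: str = PLUGIN_GROUP) -> list:
    """Return the entry points registered under `group` by installed packages."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable interface
        return list(eps.select(group=group))
    # Python 3.8 / 3.9: plain mapping of group name to entry points
    return list(eps.get(group, []))


for ep in discover_plugins():
    # ep.load() would import the module and return the referenced object.
    print(ep.name, "->", ep.value)
```

Discovery itself only reads the installed distributions' metadata; nothing is imported until `load()` is called on an entry point.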

@dmarteau

dmarteau commented May 22, 2020

@olivierdalang

3/ Would packages be installed in a virtualenv ? Is it doable to have one virtualenv per user profile ?

This would be a desirable feature but you'll have to run a Python interpreter for each environment.

@s-m-e
Author

s-m-e commented May 22, 2020

@rouault

Do you want this information simply added from the long version to the TL;DR

yes. As it appears otherwise close to the end of the long version, given reader fatigue, it is a bit hard to extract the high level overview of how QGIS would interact exactly with pip/conda.

Done - I added a new section to TL;DR. I hope it covers all desired aspects.

@s-m-e
Author

s-m-e commented May 22, 2020

@olivierdalang

both for plugin developers (will they need to now write a setup.py file ?) [...]

They can, if they want to ship the plugin as a wheel, but they do not have to. It all depends on their desire to have (a) Python package dependencies and (b) binary extensions in the plugin. If (a) or (b) can be answered with a "yes", they will very likely need a setup.py file (or flit, or other similar tools as they become more mature). In other words, all rules and technologies of contemporary Python packaging apply.

[...] and end users (do they need to choose between qgis repo, pypi or conda ? do they choose once, or on a per-plugin basis ?

A user can choose on a per-plugin basis - depending on whether or not a certain plugin is available through multiple distribution channels.

for conda, is everything installed for them automatically ? what if there are some incompatible requirements ?

Yes, if using conda (or pip) "everything" (i.e. all required dependencies) will be installed automatically. QGIS' pluginmanager will "only" make sure that there is no conflict between the backends. This can be achieved through dry-runs of the package managers. In the event of a conflict, QGIS' pluginmanager can provide a rather detailed analysis of the problem and suggestions on how to move forward (very similar to how apt or zypper are handling conflicts).
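
A dry-run check of this kind could be sketched as below. This is an assumption-laden illustration and not the QEP's implementation: the helper names are hypothetical, `conda` is assumed to be on the PATH, and pip's `--dry-run` option is only available in newer pip releases (22.2 and later).

```python
import subprocess
import sys
from typing import Callable, List


def pip_dry_run_cmd(requirement: str) -> List[str]:
    """pip invocation that resolves `requirement` without installing anything."""
    return [sys.executable, "-m", "pip", "install", "--dry-run", requirement]


def conda_dry_run_cmd(spec: str, channel: str) -> List[str]:
    """conda invocation that solves the environment without changing it."""
    return ["conda", "install", "--dry-run", "--json", "--channel", channel, spec]


def would_succeed(cmd: List[str], run: Callable = subprocess.run) -> bool:
    """Run a dry-run command and report whether it exited without error."""
    result = run(cmd, capture_output=True, text=True)
    return result.returncode == 0
```

A plugin manager could call `would_succeed` on the relevant backend's command before committing to an installation and show the captured output to the user if the check fails.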

2/ I think the alternative/unfavorable sections should mention the somewhat opposite strategy suggested on the ML thread (moving more towards C++ and keeping the python part to a minimum). It's not clear to me what goals of this QEP would be harder to achieve with that other approach ?

I can add this information, yes. In a nutshell: The pluginmanager will heavily interact with Python plugins, e.g. for import (loading), detailed error handling or introspection, see proof of concept implementation for instance. While the proposal suggests to implement a sub-process layer around pip and conda, I really want to replace this asap with API calls if reasonably good APIs become available. Both tools are written in Python, after all. In summary, shifting as much of the pluginmanager logic to C++ as possible would cause a massive C++-to-Python-calls layer, much larger than the current one, plus a lot of related unnecessary complexity. From my perspective, there is just no good reason why this should be done.

1/ [...] Handling all this legacy logic will make it hard to have a clean implementation [...]

It's doable :) It's just a question of cleanly separating the pieces.

[...] and it may also become confusing for plugin developers (will they have to create both setup.py and metadata.txt ?).

Yep. It really requires some good documentation on plugin-writing. It's already complicated and the documentation has "gaps" and inconsistencies, mildly put. When I wrote my first plugins, I ended up reading QGIS source code (for the first time).

As far as setup.py files go - as mentioned, it depends on the plugin author's needs. I suggest to provide updated, well documented plugin templates based on cookiecutter. For Python packages with binary extensions, I commonly recommend the pylibrary template. This could be adapted for QGIS.

An alternative could be to deal with backwards compatibility at the repository level, e.g. automate the publishing of pypi packages from the QGIS repo for old-style plugins, and automate the publishing of QGIS packages from pypi for new-style plugins.

That's an option and it's up to you. You will have to support older versions of QGIS. So adding those features on a repository level will probably make the repository implementation much more complex. Its code is not as bad as the QGIS plugin installer code, but it would certainly require a re-write in a few places. From my perspective, the easier solution is to offer a "second" repository at qgis.org and use software like warehouse for it. warehouse is driving pypi.org. warehouse is not the only open source Python package repository implementation in existence.

Would QGIS-Django stay the central repository for QGIS plugins (and thus expose a custom python package index, mirroring plugins.xml) ?

QGIS-Django would stay the central repo.

Or would pypi become the central repo ?

No. pypi.org could be used by plugin authors, but the QGIS pluginmanager could be configured to not use / search it by default. If a user wants to install plugins from pypi.org, it must be added manually as a repository.

I think we want to keep one central repository at all costs [...]

Agreed.

As a reference, Django packages are hosted directly on pypi, and can be listed independently using classifiers, and it works well enough. Not sure if classifiers could be used for plugin discovery too ?

That's an idea worth exploring. warehouse (as well as the anaconda cloud) does support queries like this.

3/ Would packages be installed in a virtualenv ? Is it doable to have one virtualenv per user profile ?

Virtual environments are a topic of their own. Exploring related issues and ideas would result in another QEP of similar proportions ;) Bottom line: This proposal assumes that there is a working Python configuration in place, whether system-wide or within a virtual environment. It is assumed that QGIS is started inside this pre-existing configuration / environment. Independently, it is theoretically possible to make QGIS manage virtual environments and connect them to profiles. This would actually allow cleaning up some really wild manipulations of sys.path. There are better and much cleaner ways of achieving this through environment variables and pth-files.

4/ Maybe C++ plugins could be managed completely independently, [...]

I would not kill them or the C++ plugin infrastructure in general (in my personal opinion). It has its legitimate use-cases. I'd also prefer to not manage C++ plugins somewhere else. At least from my perspective, having two places inside QGIS for managing "plugins" only adds confusion.

5/ Would the new plugin manager be a python plugin itself ?

I have considered this option. It can be done like this, yes, but there is a major problem: The proposed pluginmanager should handle plugin loading. It cannot load itself. I also do not want to leave plugin loading to the old QGIS Python infrastructure. Bottom line: Some changes to QGIS core are required either way. I could in theory imagine a scenario where the pluginmanager remains a separate project and can be updated independently of QGIS - while QGIS core looks for it and initializes it if present.

6/ I'm not sure I like the idea of an alternative installer for Windows. If this work ends up making QGIS fully installable via pip or conda, that's a nice bonus, but I wouldn't advertise this as an installation method to the general public; it's already confusing enough now with just two ways to install.

Yes and no. Both current ways have serious limitations. Laying the foundation for a potential third way does not hurt. One day, this third way may prove to be better and replace the two existing distribution methods. Or it may not - who knows.

@s-m-e
Author

s-m-e commented May 22, 2020

@nyalldawson

I don't think we should brand the existing way of providing plugins as a "legacy" approach, as this has negative connotations.

Agreed. I am more than happy to change the text and use a better term. Any suggestion?

The current plugin repo approach has a very attractive feature in that it is SUPER easy for users/organisations to create custom plugin repos. All you need is a simple xml file stored somewhere -- this can even be on github directly or on a network share! There's no special server software required at all. And this is VERY widely used by organisations who either want to publish internal-only plugins or who have their own curated whitelist of plugins. It is unclear to me how this use case would be supported by the pip/conda style approach. How would you see this use case being addressed?

Well, allow me to use the term "legacy" here ... For the legacy packaging approach, nothing really changes. Those xml-files could still be used. For pip and conda, the most flexible approach is to use special server software, yes. However, both tools can handle simple folders (i.e. network shares) just fine. pip simply looks at the folder when a user points it to the folder (e.g. while running pip install with certain options, --no-index --find-links). All you have to do is to dump the wheel files in there. conda has a conda index command which quickly builds a package index inside the folder (which is holding conda packages) before a user can then run conda install.
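
As a rough sketch of the folder-based workflow just described, the commands could be assembled as follows. The helper names are hypothetical; the pip flags are the ones named above, `conda index` is assumed to be available (it is shipped with conda-build), and pointing conda at the folder through a `file://` channel URL is an assumption of this sketch.

```python
import sys
from pathlib import Path
from typing import List


def pip_install_from_folder(package: str, folder: str) -> List[str]:
    """Install `package` only from wheel files found in `folder` (no index server)."""
    return [sys.executable, "-m", "pip", "install",
            "--no-index", "--find-links", folder, package]


def conda_index_cmd(folder: str) -> List[str]:
    """Build the package index inside a folder holding conda packages."""
    return ["conda", "index", folder]


def conda_install_from_folder(package: str, folder: str) -> List[str]:
    """Install `package` using only the local folder as a channel."""
    channel = Path(folder).resolve().as_uri()  # e.g. file:///srv/plugins
    return ["conda", "install", "--override-channels", "--channel", channel, package]


# The resulting lists can be handed to subprocess.run(...).
print(pip_install_from_folder("my_plugin", "/srv/plugins"))
```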

In the absence of a formal stable API for scripts to handle plugin management, the pyplugin_installer package has become a quasi-public API for this. Using methods from pyplugin_installer is the ONLY way scripts can be written which do tasks like automatically installing another plugin or automatically checking for updates and updating a plugin. This is CRUCIAL functionality for enterprise deployments, where there is a need for QGIS startup scripts to be able to force install and force update internal organisation specific plugins. If pyplugin_installer is to be removed then we'll need to offer a formal, stable API to support this use case (and hopefully one much nicer than having to use the raw pyplugin_installer class, which is seriously horrible to use!)

O.k., I did not know that people relied on qgis.pyplugin_installer in such a way. The proposed index API must become really good then ... Question in return: Is it worth it to consider adding wrappers for certain APIs in qgis.pyplugin_installer (similar to what I propose for qgis.utils) for backward compatibility?

Something to keep in mind: The convention in 3.x is that stable API is exposed through the qgis. prefix. If scripts are directly importing other classes (such as from processing import ...), then that's considered private api and use-at-your-own-risk. I'd like to see this same convention being used here, so that anything you consider as stable, public api is injected into the qgis.plugin_management (or similar) namespace, and anything not public stays outside of the qgis... prefix.

Makes sense, agreed.

Regarding c++ plugins: I think you could potentially avoid the need for the fallback c++ plugin management GUI. There's only a handful of c++ plugins left in the default install, and there's open PRs for removing another 3 of these (qgis/QGIS#36607, qgis/QGIS#36604, qgis/QGIS#36599). Potentially, the remaining plugins could be either moved to app directly (georeferencer, offline editing, geometry checker) or enabled/disabled by default with a QSettings key as the only way to change this. This would mean the new plugin manager GUI could focus SOLELY on Python plugin management...

It would simplify things, yes, but it would also sort of limit the C++ plugin infrastructure even further. I think it's worth keeping it around - see my earlier reply to Olivier Dalang.

In order to integrate with QGIS best, QgsTask should be utilised during plugin installation and upgrades

A task queue, I assume (?). Sure, this can be done. The pluginmanager could prepare "transactions" and then leave their execution to QgsTask.

You'll NEED to use the Qt network framework for ALL network requests (i.e. QgsNetworkAccessManager or QgsBlockingNetworkRequest). You can't use Python network modules like requests here. Using QgsNetworkAccessManager is the only way to get the requests correctly using the user's proxy configuration, get the requests logged with the Network Logger panel, and the ONLY way to maintain the current functionality which integrates the QGIS authentication framework with plugin installation/upgrades.

I have noticed that, yes. My proof of concept actually uses QgsBlockingNetworkRequest and QgsApplication.authManager. I am not particularly happy with it but I can see the point - especially with authentication management.

After all, for simplicity, QGIS would scan every folder in Python's sys.path for Python packages containing a metadata.txt file.

This sounds quite costly to do -- would it happen on every startup? We've already got an issue with sloooow QGIS startup speeds caused by all the Python imports happening when Processing is initialised, and I'd be concerned that this approach may make things much worse 😱

I came across related discussions and thought about solving it. (a) Python's scandir is really fast. Even in a worst-case scenario, the proposed approach is not as bad as it sounds. The only bottlenecks I can think of are really slow spinning disks and congested network shares - but both of them would blow up the current QGIS infrastructure as well. (b) There is no need to import certain things (which causes the current slowdowns on startup) if you use some of the magic provided by ast and inspect. I would handle actual imports as lazy as possible, which should speed up the QGIS loading times considerably.
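
To illustrate point (b), here is a minimal sketch of inspecting a plugin's `__init__.py` with `ast` instead of importing it. It checks for the `classFactory` function that QGIS Python plugins define; the helper name is hypothetical and the approach is an illustration, not the proof-of-concept code.

```python
import ast


def defines_class_factory(source: str) -> bool:
    """Check for a top-level `classFactory` function without executing the code."""
    tree = ast.parse(source)
    return any(
        isinstance(node, ast.FunctionDef) and node.name == "classFactory"
        for node in tree.body
    )


# The import below is never executed, so a heavy or missing module costs nothing.
example = (
    "import some_heavy_module\n"
    "\n"
    "def classFactory(iface):\n"
    "    return None\n"
)
print(defines_class_factory(example))
```

Because the source is only parsed, none of the plugin's own imports run at this stage, which is what keeps startup cheap.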

I have played with the proposed solution. I'd be happy to implement a simple benchmark plugin for tests, which could be used in actual, complex deployments for gaining some useful numbers. How about that?

@s-m-e
Author

s-m-e commented May 22, 2020

@dmarteau

My guess is that a restart of the python interpreter would be the safer solution.

Yes, indeed. But this would require a cleanup of the Python thread initialization code in QGIS core. It's a good idea but a little bit beyond the scope of this proposal.

After all, for simplicity, QGIS would scan every folder in Python's sys.path for Python packages containing a metadata.txt file.

There is no need for this, Python has an extension auto-discovery mechanism using setup.py entry_points: It completely removes the need for metadata.txt and entry points located in the __init__.py

It's an interesting idea but also a technology that is horribly badly documented.

My thinking is that looking for metadata.txt files is a dead-simple low-tech solution. The added benefit is backward compatibility.
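
Reading such a file needs nothing beyond the standard library, since metadata.txt is an INI-style file with a `[general]` section. A minimal sketch, with a hypothetical helper name:

```python
import configparser


def read_plugin_metadata(text: str) -> dict:
    """Parse the contents of a metadata.txt file and return its [general] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, e.g. qgisMinimumVersion
    parser.read_string(text)
    if not parser.has_section("general"):
        return {}
    return dict(parser["general"])


sample = "[general]\nname=Demo Plugin\nqgisMinimumVersion=3.0\nversion=0.1\n"
print(read_plugin_metadata(sample)["name"])
```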

@dmarteau

@s-m-e

It's an interesting idea but also a technology that is horribly badly documented.

I understand the point and I agree that setup.py has never been very well documented - it has become somewhat better now that the old distutils API is deprecated - but it is still the reference for packaging python and should not be put aside.

Global scanning may have really unwanted side effects as it will scan everything even if not related to qgis.

@s-m-e
Author

s-m-e commented May 22, 2020

@dmarteau

[...] but it is still the reference for packaging python and should not be put aside.

You're right, yep.

Global scanning may have really unwanted side effects as it will scan everything even if not related to qgis.

Agreed. Though, I'd not perform a truly "global" scan. This is what sys.path in QGIS 3.12 actually looks like at the moment (on conda on Linux, just as an example):

  • $CONDA/envs/$ENV/share/qgis/python
  • $HOME/.local/share/QGIS/QGIS3/profiles/$PROFILE/python
  • $HOME/.local/share/QGIS/QGIS3/profiles/$PROFILE/python/plugins
  • $CONDA/envs/$ENV/share/qgis/python/plugins
  • $CONDA/envs/$ENV/share/qgis/python/plugins
  • $CONDA/envs/$ENV/share/qgis/python
  • $PWD
  • $CONDA/envs/$ENV/lib/python37.zip
  • $CONDA/envs/$ENV/lib/python3.7
  • $CONDA/envs/$ENV/lib/python3.7/lib-dynload
  • $HOME/.local/lib/python3.7/site-packages
  • $CONDA/envs/$ENV/lib/python3.7/site-packages
  • $HOME/.local/share/QGIS/QGIS3/profiles/$PROFILE/python

I have used some "pseudo-environment variables" in this list for simplicity. Anyway, after killing the redundancies in here, I only need to scan the first level of those folders - for folders containing two specific files inside their root, __init__.py and metadata.txt.
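A minimal sketch of such a first-level scan, assuming only what is described above: duplicates are dropped, non-directories (such as the zip entry) are skipped, and a folder counts as a candidate if it directly contains both __init__.py and metadata.txt. The function names are made up for illustration.

```python
import os
import sys


def unique_dirs(paths):
    """Drop duplicate and non-directory entries while keeping the original order."""
    seen = set()
    result = []
    for path in paths:
        key = os.path.normcase(os.path.abspath(path))
        if key in seen or not os.path.isdir(key):
            continue
        seen.add(key)
        result.append(key)
    return result


def find_plugin_candidates(paths):
    """Scan only the first level of each folder for plugin-like packages."""
    found = []
    for base in unique_dirs(paths):
        try:
            with os.scandir(base) as entries:
                for entry in entries:
                    if not entry.is_dir():
                        continue
                    has_init = os.path.isfile(os.path.join(entry.path, "__init__.py"))
                    has_meta = os.path.isfile(os.path.join(entry.path, "metadata.txt"))
                    if has_init and has_meta:
                        found.append(entry.path)
        except OSError:
            # Unreadable folder (permissions, vanished network share): skip it
            continue
    return found


if __name__ == "__main__":
    for candidate in find_plugin_candidates(sys.path):
        print(candidate)
```

Nothing is imported during the scan; the cost is one directory listing per unique sys.path entry plus two file checks per subfolder.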

@dmarteau

dmarteau commented May 22, 2020

for folders containing two specific files inside their root, __init__.py and metadata.txt.

In this context, I think that metadata.txt is a poor choice and should be named 'qgismetadata.txt' or something which relates to QGIS without ambiguity.

@s-m-e changed the title from "Turning Plugin Management into Actual Package Management" to "Turning Plugin Management into Actual Package Management (QGIS Grant 2020 program)" May 23, 2020
@3nids added the Grant-2020 (QEP for 2020 Grant program) label May 25, 2020
@olivierdalang

olivierdalang commented May 29, 2020

@s-m-e I gave this another thought. I very much adhere to the main goal, but find the proposed approach too risky. I particularly fear that having multiple repos/backends will end up being very confusing for users (which backend should I install a plugin from?), for plugin developers (so, is it metadata.txt or setup.py, the QGIS plugin repo or PyPI, etc.?), and even in terms of logic (for example, I don't understand the implication of the sentence "Dependencies can intentionally not be handled across multiple backends").

I think we should drop the idea of having pip/conda installable QGIS plugins as a first step. I don't think we have enough real-world use cases to justify this. If a plugin really provides an API to be used by other plugins, it can most probably be factored out of a QGIS plugin and become a regular Python library. And for other cases, it's not clear how changing the package management improves other cross-plugin dependency issues (e.g. nondeterministic loading order).

Instead, I suggest focusing on making external libraries installable, which is where this proposal's value lies IMO (besides cleaning up the code base). We'd stick to the good old QGIS plugins repo as the one and only repository for QGIS plugins. And then it's just about allowing control over the Python environment being used. It's still just one environment, optionally a conda environment, and ideally one per user profile.

Going this way is less ambitious, but also less risky. It basically comes down to adding support for pip_requirements and conda_requirements in metadata.txt, which is easy to explain to developers. For users, by default, there would always be a Python backend supporting pip, so they would not even notice a difference (besides maybe a new "libraries" panel showing pip freeze in the plugin manager). They'd get prompted to switch to a conda environment the day they try to install a plugin with conda_requirements.
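To make the idea concrete, here is a small sketch of how a pip_requirements field could be read from metadata.txt. The field does not exist in QGIS today; its name comes from the suggestion above, and the one-requirement-per-line layout is purely an assumption chosen so that specifiers containing commas stay intact.

```python
import configparser

# Hypothetical metadata.txt content; pip_requirements is a proposed field,
# not one that QGIS currently understands.
EXAMPLE = """\
[general]
name=Example Plugin
version=0.1
pip_requirements=
    requests>=2.20
    numpy>=1.18,<2
"""


def read_pip_requirements(metadata_text):
    """Return the list of requirement strings from the [general] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(metadata_text)
    raw = parser.get("general", "pip_requirements", fallback="")
    # One requirement per (indented) line; blank lines are ignored
    return [line.strip() for line in raw.splitlines() if line.strip()]


print(read_pip_requirements(EXAMPLE))
```

The resulting list could be written to a requirements file or handed to pip, and a conda_requirements field could be read the same way.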

Maybe I underestimate the use case for pip/conda installable Python plugins. If that's the case, I'd love to hear more real-world examples where this would improve cross-plugin dependency issues. I'd also like to hear more about what conda packages you're thinking of (we could start even smaller, without conda, as pip already covers many libraries including stuff like tensorflow, and other libraries that are notoriously annoying to install with pip on Windows, such as scipy or numpy, are already covered by the OSGeo4W installer).

@s-m-e
Author

s-m-e commented May 29, 2020

@olivierdalang

@s-m-e I gave this another thought.

Thanks.

TL;DR: Given the feedback that I have seen (including your latest reply), I am under the impression that there are a significant number of misconceptions about Python packaging as well as "Python environments" and the implications of this proposal. Without writing too many more words (I just wrote 1.5k words in an attempt to reply to you at the required level of detail), how about organizing a (virtual) session for discussing this proposal? Short "presentation" / summary plus Q&A or similar. It would allow us to actually talk through individual aspects one by one. We could then publish / link a full recording and a written (short) summary here.

@s-m-e
Author

s-m-e commented May 29, 2020

@rouault @dmarteau @nyalldawson @olivierdalang

Thanks for your feedback. Following up on the above comment, I would like to organize a small online Q&A plus discussion session for actually talking about this proposal. I have created a survey for finding a suitable time sometime next week (first week of June 2020). All proposed times are Central European Summer Time (CEST). I'd greatly appreciate your feedback.

@wonder-sk
Member

My thoughts so far:

  • I like the idea to make the plugin installation/management more flexible
  • Have you had a chance to look at some other FOSS projects, how they handle all this? (e.g. Blender, Gimp, and there must be various others)
  • I am not convinced it is a good idea to support conda + pip + "legacy" plugins all at the same time. It will probably be confusing to developers, and they will need some guidelines on when/why they should choose one over the other
  • I quite like @olivierdalang 's suggestion to stick to the existing plugin installation and "only" address the issue of installation of custom python packages (ideally only from a single packaging system - PyPI)
  • if we would adopt conda/pip based discovery and installation of plugins, how do we handle the plugin review process? Right now QGIS admins can remove a plugin ("unapprove") if they find issues with it. I guess it would require a custom blacklist for conda/pip plugins?
  • one scary thing about conda+pip is that plugins could install the same dependency through two different systems. QGIS runs Python plugins in a single process within a single Python interpreter, so I guess only one of those modules could be loaded, risking problems in the other plugin...
  • how would we handle conflicting versions of dependencies - e.g. plugin1 wants depX in version < 10 and plugin2 wants depX in version >= 12 ?
  • how would qgis installation itself integrate with conda/pip system? What I have in mind are packages like PyQt and PyQGIS shipped with QGIS - do we need to make them somehow conda/pip compatible in order to allow plugin packages to depend on them?
  • how about packages that depend on PyQt? For example, if a plugin depends on PyQt3D - from what I remember these Python/Qt dependencies can break easily due to C++ interfaces if they are not compiled using the same compiler (and the same version and some compilation options)
  • I wonder how we would bring conda (or pip) toolchain to Windows/macOS - would we need to ship QGIS with MSVC compiler on Windows and with XCode on macOS?
  • existing macOS packages are based on Homebrew packages, new macOS packages (to be finished soon) are based on custom-built Python - essentially creating another Python distribution like with OSGeo4W (just FYI, not sure if/how that matters with conda/pip integration)

Based on the thoughts above, it seems safer and easier to target a single packaging system and only support installation of wheels. Independent from that could be modernization of python plugin support code and python plugin installer - I agree those are not the most shiny pieces of code that we are proud of, and I am one of those responsible for that mess :-) That code survived with relatively few changes since its introduction somewhere around 2007 - we were all Python beginners at that point...
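The version-conflict question in the list above (plugin1 wants depX < 10, plugin2 wants depX >= 12) can be illustrated with a toy check. This is a deliberately simplified sketch with made-up helper names that only handles purely numeric versions and four comparison operators; a real implementation would more likely rely on an existing specifier parser such as the one in the packaging library.

```python
import operator

# Longer symbols first so that "<=" is not mistaken for "<"
OPERATORS = (
    ("<=", operator.le),
    (">=", operator.ge),
    ("<", operator.lt),
    (">", operator.gt),
)


def parse_version(text):
    """Turn '12.1' into (12, 1) so that versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def parse_constraint(text):
    """Turn '>=12' into (operator.ge, (12,))."""
    text = text.strip()
    for symbol, func in OPERATORS:
        if text.startswith(symbol):
            return func, parse_version(text[len(symbol):])
    raise ValueError(f"unsupported constraint: {text!r}")


def compatible_versions(available, constraints):
    """Return the available versions that satisfy every constraint."""
    parsed = [parse_constraint(item) for item in constraints]
    return [
        version
        for version in available
        if all(func(parse_version(version), bound) for func, bound in parsed)
    ]


available = ["9.0", "10.2", "12.1"]
print(compatible_versions(available, ["<10"]))          # plugin1 alone
print(compatible_versions(available, [">=12"]))         # plugin2 alone
print(compatible_versions(available, ["<10", ">=12"]))  # both plugins enabled
```

With both plugins enabled, the result is empty: within a single interpreter there is no version of depX that satisfies both, so the installer could at best detect and report the conflict.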

@borysiasty
Member

borysiasty commented Jun 4, 2020

@wonder-sk said:

and I am one of those responsible for that mess :-)

Martin is not guilty! It was just a tiny piece of mess I enlarged into the current spaghetti monster...

@sbrunner

sbrunner commented Jun 5, 2020

For me, the idea of getting the plugins with pip and discovering them with an entry point is technically attractive, but I agree with @wonder-sk and @olivierdalang that it is probably not a good idea for QGIS desktop, so +1 for the proposition of @olivierdalang. Perhaps it would be a good idea to have this kind of mechanism for QGIS server, where we actually don't have any tool to install a plugin.

@andreasneumann
Member

Just some brief feedback about the decision of the PSC to not accept this proposal (for now): we all agree that the Python plugin installer situation should be improved, but from the discussion here, we thought that there is no consensus among core devs yet. Once this consensus is reached we can rediscuss supporting this proposal.

@m-kuhn
Member

m-kuhn commented Jun 5, 2020

Short summary of thoughts:

  • It's an important discussion
  • Very good discussion, notably good points by @olivierdalang and @wonder-sk towards the end
  • I guess most of us have played with the idea before

More notes about this comment by @wonder-sk

  • how would qgis installation itself integrate with conda/pip system? What I have in mind are packages like PyQt and PyQGIS shipped with QGIS - do we need to make them somehow conda/pip compatible in order to allow plugin packages to depend on them?

  • how about packages that depend on PyQt? For example, if a plugin depends on PyQt3D - from what I remember these Python/Qt dependencies can break easily due to C++ interfaces if they are not compiled using the same compiler (and the same version and some compilation options)

  • I wonder how we would bring conda (or pip) toolchain to Windows/macOS - would we need to ship QGIS with MSVC compiler on Windows and with XCode on macOS?

  • existing macOS packages are based on Homebrew packages, new macOS packages (to be finished soon) are based on custom-built Python - essentially creating another Python distribution like with OSGeo4W (just FYI, not sure if/how that matters with conda/pip integration)

This is where I ended up last time I thought about this.
The only solution I found for this is that we already ship PyQt, Qt and other dependencies coming from this system (I evaluated conda) and compile QGIS against conda libraries.

The biggest opportunity of that would be consistent dependencies across all platforms (got QGIS 3.10.6 using the downloadable installer made from conda? You'll always have exactly the same GDAL version). Also, it will be easier to collaboratively manage packages compared to the current OSGeo4W situation.
The biggest risk is the dependency on a massive distributed build system, which is reminiscent of the Homebrew dependency hell, and things like updating Qt on this system are a huge task in themselves (at least at the moment; there have been several attempts to update to a newer version, with different build failures on every OS).

Just some brief feedback about the decision of the PSC to not accept this proposal (for now): we all agree that the Python plugin installer situation should be improved, but from the discussion here, we thought that there is no consensus among core devs yet. Once this consensus is reached we can rediscuss supporting this proposal.

thanks for the clarification PSC 👍

@nyalldawson
Contributor

The biggest risk is the dependency on a massive distributed build system, which is reminiscent of the Homebrew dependency hell, and things like updating Qt on this system are a huge task in themselves (at least at the moment; there have been several attempts to update to a newer version, with different build failures on every OS).

Ouch, yeah... I would NOT want the main QGIS packaging to be dependent on something like conda. Homebrew has just burnt me too much to see that as a good thing! I can only see the horror in this situation as our LTR builds suddenly break mid-cycle because someone completely unaware of QGIS has decided to drop the Qt WebKit package because it was causing issues with some completely unrelated desktop app..! 😱

While OSGeo4W does have its downsides, at least we are in direct control of what happens on it and can make packaging decisions in the best interest of QGIS.

@3nids
Member

3nids commented Jun 19, 2020

proposal has not been marked as eligible
see https://lists.osgeo.org/pipermail/qgis-developer/2020-June/061439.html

@s-m-e changed the title from "Turning Plugin Management into Actual Package Management (QGIS Grant 2020 program)" to "Turning Plugin Management into Actual Package Management" Jun 19, 2020
@timlinux
Member

timlinux commented Jul 9, 2020

I know the conversation here has gone a bit cold, so I just wanted to add a couple of thoughts on this proposal. I did join for the second half of the discussion call that was held and I think my feeling along with others is that this QEP (whilst beautifully written) is a bit overwhelming. There are many concepts to grasp and points of potential contention. I know we need a good roadmap for python in QGIS and there are other things potentially interacting with python like native Qt6 python API instead of SIP that will be coming up on the radar.

What I want to suggest is to treat this as a 'meta QEP' and that you break out small components of your plan and implement them in a piecemeal fashion. I think it will be easier to discuss and agree on one thing, e.g. adding pip or virtualenv etc., then add that, and then move on to the next thing after that. I guess the concern with my proposal in the last sentence is that the concerns are so intermeshed that it is going to prove difficult to do that, but if you could find a first low hanging fruit that moves us one step forward to Python Nirvana that would be great....

@s-m-e
Author

s-m-e commented Jul 9, 2020

@timlinux Thanks for your reply.

I know the conversation here has gone a bit cold

Yeah, sorry for that. I have been working on a revised version of the QEP, mostly addressing the raised concerns. I did not yet have the time to get it polished for release.

There are many concepts to grasp and points of potential contention. I know we need a good roadmap for python in QGIS and there are other things potentially interacting with python like native Qt6 python API instead of SIP that will be coming up on the radar. What I want to suggest is to treat this as a 'meta QEP' and that you break out small components of your plan and implement them in a piecemeal fashion.

I agree. I am not entirely sure how to break it into smaller pieces because most of the touched components are heavily interwoven. So an overall roadmap which addresses interactions between the touched components is indeed required. However, I do not think that I am in a position to start it.

I think it will be easier to discuss and agree on one thing, e.g. adding pip or virtualenv etc.

With all due respect, some of the raised concerns are valid from a purely technical perspective, some are not. It's a wild, intermeshed mix which makes it difficult to address them in a short and concise manner. Talking about "adding" pip and virtualenv is one of those more critical points, yes, and also a good example of common misconceptions. Understanding what pip and virtualenv are and how they fit into the ecosystem, I'd never ever consider "adding" them to QGIS in any way, shape or form, but that's a (too?) long story on its own. In this context, some of the proposed "simplifications" to this QEP are simply unmaintainable, but it requires some deeper analysis of relevant technologies and the overall ecosystem for getting to this point.

I am not intending to lecture (and/or offend) people - I am merely offering to educate, so we can eventually have a (more) educated discussion about the subject matter. This is what I am proposing as a next step. I am open to debate. I am also willing to offer a crash course, online, maybe a couple of hours or a day, about the relevant technologies - if there was actual interest. How does this sound?

[...] a first low hanging fruit that moves us one step forward [...]

Let me think about that. There are a few options.

@s-m-e
Author

s-m-e commented Jul 10, 2020

QEP:

It is suggested to isolate metadata, metadatafield, metadataspec and version into a separate, new Python package. Ideally, both QGIS and QGIS-Django could then use this package as a common codebase for QGIS Python plugin metadata handling. Furthermore, relevant QGIS documentation could automatically be derived and updated from metadataspec.

... and ...

Currently, both QGIS and QGIS-Django handle plugin metadata but maintain separate and slightly different code for parsing, analyzing and validating plugin metadata files. As part of the proposed work, it is suggested to make both projects use a common code basis for this purpose.

How about a unified & tested library for metadata handling for starters? As a by-product, plugin authors could be provided with an offline verification tool for their plugins (instead of uploading them first and then troubleshooting them based on feedback from QGIS-Django). It's needed either way and probably one of the least controversial aspects of this QEP.
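To give an idea of what the offline verification part could look like, here is a minimal sketch. The function name and the error messages are made up, and the list of required fields is an assumption based on the plugin documentation; an actual shared library would derive it from a single specification such as the proposed metadataspec.

```python
import configparser

# Assumed set of mandatory fields in the [general] section of metadata.txt
REQUIRED_FIELDS = (
    "name",
    "qgisMinimumVersion",
    "description",
    "about",
    "version",
    "author",
    "email",
    "repository",
)


def validate_metadata(text):
    """Return a list of human-readable problems; an empty list means 'looks fine'."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the case of keys such as qgisMinimumVersion
    try:
        parser.read_string(text)
    except configparser.Error as exc:
        return [f"not parseable: {exc}"]
    if not parser.has_section("general"):
        return ["missing [general] section"]
    problems = []
    for field in REQUIRED_FIELDS:
        if not parser.get("general", field, fallback="").strip():
            problems.append(f"missing or empty field: {field}")
    return problems


sample = """\
[general]
name=Example Plugin
qgisMinimumVersion=3.0
description=Short description
about=Longer text about the plugin
version=0.1
author=Jane Doe
repository=https://example.org/repo
"""

print(validate_metadata(sample))
```

If QGIS and QGIS-Django both called the same function, a plugin that passes the check locally would also pass it on upload.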

@das-g

das-g commented Apr 30, 2022

Has there been any progress and/or discussion of this since summer 2020? Can anything be expected for the near or medium-term future?

(Asking because I'm quite unsure how (and when) to best proceed with NixOS/nixpkgs#59842.)

@s-m-e
Author

s-m-e commented Apr 30, 2022

@das-g As the author of this QEP, I am still interested in pushing this topic, but it certainly won't happen near-term. It got a little silent as I got side-tracked and the discussion here was ultimately not going anywhere, but I am soon going to revive my efforts in a somewhat different form.
