
Removing dependencies from an existing lock file #196

Closed
scottyhq opened this issue May 26, 2022 · 20 comments

@scottyhq
Contributor

I'm surprised that removing a dependency does not remove it and its sub-dependencies from an existing conda-lock.yml (conda-lock 1.0.5):

Steps to reproduce:

1. Run `conda-lock lock --mamba -f environment.yml` with the following environment.yml:

   ```yaml
   channels:
     - conda-forge
   dependencies:
     - python=3.9
     - pystac
   platforms:
     - linux-64
   ```

2. Comment out or delete the `pystac` dependency and re-run `conda-lock lock --mamba -f environment.yml`. `pystac` remains in conda-lock.yml:

   ```yaml
   - name: pystac
     url: https://conda.anaconda.org/conda-forge/noarch/pystac-1.4.0-pyhd8ed1ab_0.tar.bz2
   ```
A workaround is to wipe conda-lock.yml and start fresh, but it would be nice to keep it around for version control.

@riccardoporreca
Contributor

Not removing obsolete dependencies is indeed problematic and probably not a desirable behavior.
Even upgrading an existing dependency can leave behind former transitive dependencies, which may eventually become incompatible with the rest of the current (updated) dependencies.

@weiji14

weiji14 commented Jun 20, 2023

Just checking to see if this issue has been resolved in `conda-lock>=2`? Or do we still need to do the `rm conda-lock.yml && conda-lock lock --mamba -f environment.yml` workaround?

scottyhq pushed a commit to pangeo-data/pangeo-docker-images that referenced this issue Jun 21, 2023
* Rename earthdata to earthaccess

Package was renamed from earthdata to earthaccess in nsidc/earthaccess#181, and reflected on conda-forge at v0.4.5.

* [condalock-command] autogenerated conda-lock files

* Regenerate conda-lock.yml files from scratch

Deleting the conda-lock.yml files and manually running `conda-lock lock` again to remove packages that are not needed anymore. Not locking the forge file as it took too long.

* Delete conda-lock.yml before locking

Workaround for conda/conda-lock#196.

* Manually re-lock forge image

Ran `conda-lock lock -f environment.yml -f ../pangeo-notebook/environment.yml -f ../base-notebook/environment.yml -p linux-64` locally in the `forge/` folder.

---------

Co-authored-by: pangeo-bot <pangeo-bot@users.noreply.github.com>
@maresb
Contributor

maresb commented Jun 21, 2023

As far as I know this is unfortunately not yet resolved.

jomey added a commit to ICESAT-2HackWeek/ICESat-2-Hackweek-2023 that referenced this issue Jul 21, 2023
jomey added a commit to uwhackweek/jupyterbook-template that referenced this issue Jul 21, 2023
@mfisher87
Contributor

mfisher87 commented Aug 18, 2023

Today I stumbled on a nasty consequence of this behavior. In my case, I had a dependency in the pip subsection of my environment.yml, and that dependency recently became available on conda-forge. So naturally, I moved it out of the pip section into the main dependencies list and ran conda-lock.

The result was that the same dependency was listed twice in conda-lock.yml, once for conda and once for pip. Of course, the pip version overwrote the conda version in the last stage of the install. This resulted in a broken environment with incompatible dependencies, even though our freshly-locked environment.yml defined a compatible environment spec.

I created a repository that walks through a simplified version of what I experienced, without the dependency conflicts: https://github.com/mfisher87/sscce-conda-lock-pip-ghosts

The real-world scenario I encountered was that we upgraded an old environment.yml file containing sphinx and autodoc-pydantic dependencies, and moved autodoc-pydantic out of the pip section at the same time. autodoc-pydantic==1.5.1, which is incompatible with recent versions of sphinx, was silently overwriting the "correct" version even though the environment file our lock file was based on only contained the spec autodoc-pydantic==1.9.0. We fixed it by deleting conda-lock.yml and re-locking, but until I figured out what was going on I was very confused 😆
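For illustration, the environment.yml change looked roughly like this (a reconstructed sketch using the package names above, not the actual file from that repository):

```yaml
# Before: autodoc-pydantic was only available via pip
dependencies:
  - sphinx
  - pip:
      - autodoc-pydantic==1.5.1

# After: moved into the conda dependencies list and upgraded
dependencies:
  - sphinx
  - autodoc-pydantic==1.9.0
```

With the merge behavior described in this issue, the stale pip entry for autodoc-pydantic==1.5.1 survived in conda-lock.yml alongside the new conda entry, and the pip copy won at install time.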

@mfisher87
Contributor

Given that this behavior can lead to envs that don't reproduce as expected, could this issue be pinned to help confused folks like myself find it?

@mfisher87
Contributor

Thinking about digging into this as an exercise this weekend. From an outside perspective, without having any design context, I'd expect conda-lock to replace the previous lockfile wholesale every time a solve is completed successfully. Are there reasons not to do that, or would there be major challenges in implementing such behavior?

Thanks all for being welcoming to questions! ❤️

itsarobin added a commit to itsarobin/conda-lock that referenced this issue Aug 20, 2023
…or lock

Addresses conda#196: for requested platforms replaces lock content without erroneously persisting packages.

Co-authored-by: Arie Knoester <arikstr@gmail.com>
Reviewed-by: Matt Fisher <mfisher87@gmail.com>
@mfisher87
Contributor

mfisher87 commented Aug 20, 2023

We took a stab at this. I feel like multiple sources (which I don't fully understand the use case for, and haven't found docs for yet) may be a confounding factor for our solution. Our naive understanding is that we need to retain support for per-platform updates, so we can't simply blow away the config file and drop in a new one. Draft PR coming soon :)

itsarobin added a commit to itsarobin/conda-lock that referenced this issue Aug 21, 2023
@maresb
Contributor

maresb commented Aug 21, 2023

Sorry for my lack of responsiveness this past weekend.

FWIW I don't understand or like the current behavior, and consequently I always delete the conda-lock.yml before relocking, which I find awkward.

I'd like to see things done differently, and would be supportive of doing a major release to change this behavior.

@mariusvniekerk, are there any good reasons for keeping the current behavior which I should know about? Also @bollwyvl?

@bollwyvl
Contributor

I hadn't noticed these issues, mostly because I don't use the .yml format, and even then I still delete my .lock (or now, according to constructor, .txt) files before re-solving. So I can't really comment on how the existing strategy works, but here are some thoughts:

In some other lockfile-based systems I've used, once a named entry appears in the lock, it will never get updated unless something new comes along that can be met by the existing entry, and all unused entries will be removed at the end. If the .yml format allows for partial locking, I could see how this could get very complicated. This is part of the reason I just use the @EXPLICIT files, as when I reach for conda-lock I almost always have multiple environments to loop/matrix over something anyway.

In conda(-forge)-land, generally when I relock, I want the freshest packages that I haven't pinned, and with mutable upstream state like repodata-patches, keeping old, unexamined, transient entries might be actively harmful, if a key dependency has changed (see pydantic<2, urllib<3 all over in the last few months).

@mfisher87
Contributor

> Sorry for my lack of responsiveness this past weekend.

Never a problem in my mind! Hope you got away from work for a bit :)

> FWIW I don't understand or like the current behavior, and consequently I always delete the conda-lock.yml before relocking, which I find awkward.
>
> I'd like to see things done differently, and would be supportive of doing a major release to change this behavior.

Yes, once I discovered this behavior, I realized I should be doing the same... but I don't want to 😆 Maybe a `--merge`/`--no-merge` flag or something could be useful to ensure that absolutely none of the existing lock content is considered. And maybe it should default to `--no-merge` if we're looking at a major release. But that may require some more refactoring to enable `--update` behavior to continue to work as expected.

> once a named entry appears in the lock, it will never get updated unless something new comes along that can be met by the existing entry

I'm not sure I understand this as written; should "can" actually be "can not"? I.e. once an entry, e.g. foo=1.2.3 shows up in the lock file, it will never get updated until another entry is added that is incompatible, e.g. bar=2.3.4, which requires foo>1?

itsarobin added a commit to itsarobin/conda-lock that referenced this issue Aug 27, 2023
itsarobin added a commit to itsarobin/conda-lock that referenced this issue Aug 27, 2023
itsarobin added a commit to itsarobin/conda-lock that referenced this issue Aug 28, 2023
mfisher87 pushed a commit to itsarobin/conda-lock that referenced this issue Sep 3, 2023
mfisher87 pushed a commit to mfisher87/conda-lock that referenced this issue Sep 3, 2023
mfisher87 pushed a commit to itsarobin/conda-lock that referenced this issue Sep 3, 2023
mfisher87 pushed a commit to mfisher87/conda-lock that referenced this issue Sep 3, 2023
@AlbertDeFusco
Contributor

I'll add a little of my experience and expectations here. First, I agree with the growing sentiment that extraneous packages should be removed from the lock, and I've also gotten into the habit of `rm conda-lock.yml && conda-lock lock`. To me this sounds much like `conda env update --prune`, which was recently fixed for the classic solver.

There is one wrinkle worth considering that a user brought to my attention: relocking an env spec with a minimal amount of changes. I wonder if there might be a way to have both a minimal update and pruning of orphaned, un-requested packages.

@bollwyvl
Contributor

bollwyvl commented Sep 6, 2023

> --prune

I've seen some lockfile tools that offer a `--strategy=latest` argument... it seems like having a small number of these wrapping existing conda/mamba arguments could be very useful.

A super useful example, which may or may not be supported by the solvers, would be a `--strategy=oldest`, for building the oldest compatible package set meeting the given specs. This would allow using a single (set of) environment.yml file(s) to generate a (set of) lockfile(s) that accurately reflect both the latest and the oldest solves, both of which took into account things like repodata-patches. I recently had to do something similar to bisect an upstream for a repodata patch, but ended up falling back to the upstream package manager (pip) for the nitty-gritty.

> both a minimal update along with pruning of orphaned

As for combining multiple strategy solves in a single .yml: again, I have no skin in the game, but it seems extra-super complicated.

mfisher87 pushed a commit to mfisher87/conda-lock that referenced this issue Sep 10, 2023
@mfisher87
Contributor

@maresb can we mark this as resolved now that #485 is merged? Or should we wait for the next release?

cc @weiji14 👋 :)

@maresb
Contributor

maresb commented Sep 11, 2023

If you could write a brief release note, let's just do a release now.

@mfisher87
Contributor

Sure, will do my best! This is a tough one to boil down :)

> Resolved an issue causing dependencies to persist in the lock file even if they were removed from the specification file (see #196).

Or, a more low-level explanation (please cut up or edit however you like):

> conda-lock no longer merges a pre-existing lockfile with freshly-generated locked dependencies. This was causing dependencies to persist in the lock file even if they were removed from the specification file (see #196). conda-lock will now always replace the old locked dependencies for a given platform with freshly-generated locked dependencies. In cases where the user requests a lock for a subset of platforms, those platforms not requested for lock will be persisted.

That last sentence is I think a good candidate for removal or revision.

@maresb
Contributor

maresb commented Sep 11, 2023

@mfisher87, would it be accurate to say:

> In most cases, the new behavior of locking with conda-lock is equivalent to `rm conda-lock.yml && conda-lock`. (The exception is when locking a platform is skipped due to an unchanged content hash.)

@mfisher87
Contributor

There's another exception: locking a subset of platforms. For example, if you've previously locked with an environment.yml that specifies 3 platforms, then re-lock with `-p linux-64`, the other two platforms will be persisted from the original lockfile, and linux-64 will be overwritten completely with the new lock results.

> In most cases, the new behavior of locking with conda-lock is equivalent to `rm conda-lock.yml && conda-lock`. (The exception is when locking a platform is skipped due to explicit request or an unchanged content hash.)
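To make the replace-vs-persist semantics concrete, here is a minimal sketch in Python. This is illustrative only, not conda-lock's actual implementation; `merge_lockfiles` and the platform-to-package mapping are invented for the example.

```python
# Illustrative sketch of the post-fix semantics: platforms included in the
# fresh solve get their package lists replaced wholesale (so removed
# dependencies drop out), while platforms absent from the new solve are
# carried over from the old lockfile.

def merge_lockfiles(old, new, locked_platforms):
    """Combine an old lock with a fresh solve.

    old/new map platform name -> list of locked package names;
    locked_platforms lists the platforms covered by the fresh solve.
    """
    merged = {}
    for platform, packages in old.items():
        if platform not in locked_platforms:
            # Not re-locked this time: persist the previous entries.
            merged[platform] = packages
    for platform in locked_platforms:
        # Re-locked: take only the fresh solve; stale entries (e.g. a
        # dependency removed from environment.yml) do not survive.
        merged[platform] = new[platform]
    return merged

old = {
    "linux-64": ["python", "pystac"],
    "osx-64": ["python", "pystac"],
}
new = {"linux-64": ["python"]}  # pystac removed from the spec; only linux-64 re-locked

result = merge_lockfiles(old, new, locked_platforms=["linux-64"])
# linux-64 is replaced (pystac gone); osx-64 persists unchanged
```

Under the pre-fix merge behavior, by contrast, the stale `pystac` entry would have survived in the re-locked platform as well.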

@maresb
Contributor

maresb commented Sep 11, 2023

Okay, new release is out. I went for a minor version since I think the only real "breakage" here would be more aggressive updating.

For the release notes, I wanted to avoid complication, and avoid discussion of the exceptional cases. (In case I wrote something wrong/misleading I can still edit it.)

Thanks everyone for all the great feedback and discussion!

There are some unaddressed points about implementation of a minimal update strategy. Let's open a fresh issue for that.

@mfisher87
Contributor

Would you mind updating the release notes to credit co-author @ArieKnoester for the PR? Thanks, Ben!

@maresb
Contributor

maresb commented Sep 12, 2023

@mfisher87 oops, they're autogenerated and I missed that. Thanks, fixed!
