Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log warnings when skipping editable installations #12779

Merged
merged 2 commits into from
Aug 9, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
102 changes: 69 additions & 33 deletions crates/red_knot_python_semantic/src/module_resolver/resolver.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ use std::iter::FusedIterator;
use rustc_hash::{FxBuildHasher, FxHashSet};

use ruff_db::files::{File, FilePath, FileRootKind};
use ruff_db::system::{DirectoryEntry, System, SystemPath, SystemPathBuf};
use ruff_db::system::{DirectoryEntry, SystemPath, SystemPathBuf};
use ruff_db::vendored::VendoredPath;

use super::module::{Module, ModuleKind};
Expand Down Expand Up @@ -155,23 +155,41 @@ fn try_resolve_module_resolution_settings(
let mut static_search_paths = vec![];

for path in extra_paths {
files.try_add_root(db.upcast(), path, FileRootKind::LibrarySearchPath);
static_search_paths.push(SearchPath::extra(system, path.clone())?);
let search_path = SearchPath::extra(system, path.clone())?;
files.try_add_root(
db.upcast(),
search_path.as_system_path().unwrap(),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to get the path from search_path because it is the canonicalized path (whereas path isn't).

This is covered by the symlink test

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the added .as_system_path().unwrap() calls are kind of a shame though. Could we do the files.try_add_root() call inside the SearchPath::extra() call (as soon as we've canonicalised the system path, but before we've wrapped it in the enum)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A side effect like adding a path to files.try_add_root feels off to me in the SearchPath constructor.

But agree, it's a bit an annoyance. The only other solution I can think of is to separate the validation of a system-search path from its construction. But that feels more involved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A side effect like adding a path to files.try_add_root feels off to me in the SearchPath constructor.

To me it feels correct, actually. The constructor creates a validated, normalized search path; once a SearchPath instance has been created, that's meant to represent that it's been fully validated and can be used without issue by other modules. But one of the necessary steps to get to that point is to register the search path as a FileRoot; so to me it makes sense to have it as part of the constructor

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think it couples too many different things together. A search path should be allowed to exist without having a FileRoot. They don't strictly belong together. It's only site packages where we need one, but not because the site-packages search path would be incomplete without one.

I think my preferred solution would be to actually decouple the existence of a SearchPath from its validation but that's a longer discussion that we probably don't agree with.

I would prefer to keep it as is for now

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough

Copy link
Member Author

@MichaReiser MichaReiser Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can address your concern of too many unwraps by moving the canonicalize call out of the constructor. Would you prefer this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can address your concern of too many unwraps by moving the canonicalize call out of the constructor. Would you prefer this?

Yeah... I think so... but we should then add a note to the docs of the constructor that say that it's expected that the path has already been canonicalised before it's passed to the constructor

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I'll do that but I think I'm going to do it as a follow up PR so that I have fewer merge conflicts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

FileRootKind::LibrarySearchPath,
);
static_search_paths.push(search_path);
}

static_search_paths.push(SearchPath::first_party(system, src_root.clone())?);

static_search_paths.push(if let Some(custom_typeshed) = custom_typeshed.as_ref() {
let search_path = SearchPath::custom_stdlib(db, custom_typeshed.clone())?;
files.try_add_root(
db.upcast(),
custom_typeshed,
search_path.as_system_path().unwrap(),
FileRootKind::LibrarySearchPath,
);
SearchPath::custom_stdlib(db, custom_typeshed.clone())?
search_path
} else {
SearchPath::vendored_stdlib()
});

let mut site_packages_paths: Vec<_> = Vec::with_capacity(site_packages.len());

for path in site_packages {
let search_path = SearchPath::site_packages(system, path.to_path_buf())?;
files.try_add_root(
db.upcast(),
search_path.as_system_path().unwrap(),
FileRootKind::LibrarySearchPath,
);
site_packages_paths.push(search_path);
}

// TODO vendor typeshed's third-party stubs as well as the stdlib and fallback to them as a final step

let target_version = program.target_version(db.upcast());
Expand All @@ -197,7 +215,7 @@ fn try_resolve_module_resolution_settings(
Ok(ModuleResolutionSettings {
target_version,
static_search_paths,
site_packages_paths: site_packages.to_owned(),
site_packages_paths,
})
}

Expand Down Expand Up @@ -238,47 +256,64 @@ pub(crate) fn dynamic_resolution_paths(db: &dyn Db) -> Vec<SearchPath> {
let files = db.files();
let system = db.system();

for site_packages_dir in site_packages_paths {
for site_packages_search_path in site_packages_paths {
let site_packages_dir = site_packages_search_path
.as_system_path()
.expect("Expected site package path to be a system path");

if !existing_paths.insert(Cow::Borrowed(site_packages_dir)) {
continue;
}
let site_packages_root = files.try_add_root(
db.upcast(),
site_packages_dir,
FileRootKind::LibrarySearchPath,
);

let site_packages_root = files
.root(db.upcast(), site_packages_dir)
.expect("Site-package root to have been created.");

// This query needs to be re-executed each time a `.pth` file
// is added, modified or removed from the `site-packages` directory.
// However, we don't use Salsa queries to read the source text of `.pth` files;
// we use the APIs on the `System` trait directly. As such, add a dependency on the
// site-package directory's revision.
site_packages_root.revision(db.upcast());

dynamic_paths
.push(SearchPath::site_packages(system, site_packages_dir.to_owned()).unwrap());
dynamic_paths.push(site_packages_search_path.clone());

// As well as modules installed directly into `site-packages`,
// the directory may also contain `.pth` files.
// Each `.pth` file in `site-packages` may contain one or more lines
// containing a (relative or absolute) path.
// Each of these paths may point to an editable install of a package,
// so should be considered an additional search path.
let Ok(pth_file_iterator) = PthFileIterator::new(db, site_packages_dir) else {
continue;
let pth_file_iterator = match PthFileIterator::new(db, site_packages_dir) {
Ok(iterator) => iterator,
Err(error) => {
tracing::warn!(
"Failed to search for editable installation in {site_packages_dir}: {error}"
);
continue;
}
};

// The Python documentation specifies that `.pth` files in `site-packages`
// are processed in alphabetical order, so collecting and then sorting is necessary.
// https://docs.python.org/3/library/site.html#module-site
let mut all_pth_files: Vec<PthFile> = pth_file_iterator.collect();
all_pth_files.sort_by(|a, b| a.path.cmp(&b.path));

for pth_file in &all_pth_files {
for installation in pth_file.editable_installations() {
if existing_paths.insert(Cow::Owned(
installation.as_system_path().unwrap().to_path_buf(),
)) {
dynamic_paths.push(installation);
all_pth_files.sort_unstable_by(|a, b| a.path.cmp(&b.path));

let installations = all_pth_files
.iter()
.flat_map(PthFile::editable_installations);

for installation in installations {
if existing_paths.insert(Cow::Owned(installation.clone())) {
match SearchPath::editable(system, installation) {
Ok(search_path) => {
dynamic_paths.push(search_path);
}

Err(error) => {
tracing::warn!("Skipping editable installation: {error}");
}
}
}
}
Expand Down Expand Up @@ -324,7 +359,6 @@ impl<'db> FusedIterator for SearchPathIterator<'db> {}
/// One or more lines in a `.pth` file may be a (relative or absolute)
/// path that represents an editable installation of a package.
struct PthFile<'db> {
system: &'db dyn System,
path: SystemPathBuf,
contents: String,
site_packages: &'db SystemPath,
Expand All @@ -333,9 +367,8 @@ struct PthFile<'db> {
impl<'db> PthFile<'db> {
/// Yield paths in this `.pth` file that appear to represent editable installations,
/// and should therefore be added as module-resolution search paths.
fn editable_installations(&'db self) -> impl Iterator<Item = SearchPath> + 'db {
fn editable_installations(&'db self) -> impl Iterator<Item = SystemPathBuf> + 'db {
AlexWaygood marked this conversation as resolved.
Show resolved Hide resolved
let PthFile {
system,
path: _,
contents,
site_packages,
Expand All @@ -354,8 +387,8 @@ impl<'db> PthFile<'db> {
{
return None;
}
let possible_editable_install = SystemPath::absolute(line, site_packages);
SearchPath::editable(*system, possible_editable_install).ok()

Some(SystemPath::absolute(line, site_packages))
})
}
}
Expand Down Expand Up @@ -404,12 +437,15 @@ impl<'db> Iterator for PthFileIterator<'db> {
continue;
}

let Ok(contents) = db.system().read_to_string(&path) else {
continue;
let contents = match system.read_to_string(&path) {
Ok(contents) => contents,
Err(error) => {
tracing::warn!("Failed to read .pth file '{path}': {error}");
continue;
}
};

return Some(PthFile {
system,
path,
contents,
site_packages,
Expand All @@ -433,7 +469,7 @@ pub(crate) struct ModuleResolutionSettings {
/// That means we can't know where a second or third `site-packages` path should sit
/// in terms of module-resolution priority until we've discovered the editable installs
/// for the first `site-packages` path
site_packages_paths: Vec<SystemPathBuf>,
site_packages_paths: Vec<SearchPath>,
}

impl ModuleResolutionSettings {