Remove targets and sources and only take files from the src directory #317
I think that we should choose one of:
In other words, if we want to be permissive about files you can include in your package, then we should strip out only files we know for sure shouldn't be included. If we want to be restrictive, then we should only take the I'm not completely sure which one I prefer. My risk-averse self feels like the safe option is to be permissive and only take out files we know for sure we can remove. But I see how removing files keeps package sizes down.
I'm for option (2): we can be restrictive if users have an escape hatch. This is also how Though I still think we should always pick up - even if we don't think we need them - at least the various files that we use in the legacy import (
And I think we can cut this corner for now since it's not strictly necessary to move forward: we could add the
That's not quite how Rather, they take all files except some guaranteed exclusions by default, but if you include a
I'm OK with this, especially if we add the files key to the schema with a link to a discussion like this one. As we've found out recently, if something's there and I can't remember or figure out why, I'm gonna try and remove it! I'm especially OK with this because it feels safer to take all files for legacy imports, rather than be restrictive, just in case someone was doing something funky in their setup.
As we discussed in the call, the way I'd like this to work would be:
And, to be clear, this will be a list of glob patterns, where allowable globs are up to us to define? (We could keep it easy on ourselves at first by deferring to
@f-f I have implemented the strategy we agreed on in this discussion:
I believe this is up to date with all our suggested changes at this point.
ci/src/Registry/API.purs
Outdated
filterUnsafeGlobs :: FilePath -> Array String -> Aff (Array String)
filterUnsafeGlobs path globs = case Array.uncons globs of
If we accept arbitrary globs then we can leak access to the file system, which would allow someone to publish a package that pulls in files from anywhere in the system and puts them in the tarball. We can take two approaches to this:
- We can reject a package altogether if we detect it uses an unsafe glob (one that accesses anything outside of the package directory itself)
- We can skip any globs that are unsafe, simply ignoring the user's input and taking files using safe globs only
I've taken the second path here. But I would understand if we want to `throwWithComment` instead if we can determine that the `files` key contains unsafe globs.
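To make the two options concrete, here is a minimal sketch in TypeScript (not the PureScript under review, and not the PR's actual implementation). The names `isUnsafeGlob`, `filterUnsafeGlobs`, and `rejectUnsafeGlobs` are hypothetical, and the safety check is a deliberately conservative assumption: it treats any pattern containing `..`, an absolute prefix, or a brace expansion as unsafe, since a brace expansion could expand to an absolute path.

```typescript
// Hypothetical helpers for illustration only.

/** Conservative check: true if the pattern might reach outside the package directory. */
function isUnsafeGlob(pattern: string): boolean {
  // Normalize backslashes so Windows-style separators are checked as well.
  const normalized = pattern.replace(/\\/g, "/");
  const hasParentRef = normalized.includes("..");
  const isAbsolute = normalized.startsWith("/") || /^[A-Za-z]:/.test(normalized);
  // Assumption: braces are rejected outright because "{/etc,src}/x" hides an absolute path.
  const hasBraces = normalized.includes("{");
  return hasParentRef || isAbsolute || hasBraces;
}

/** Option 2: silently skip unsafe globs and keep the rest. */
function filterUnsafeGlobs(globs: string[]): string[] {
  return globs.filter((glob) => !isUnsafeGlob(glob));
}

/** Option 1: reject the whole package if any glob is unsafe. */
function rejectUnsafeGlobs(globs: string[]): string[] {
  const unsafe = globs.filter(isUnsafeGlob);
  if (unsafe.length > 0) {
    // A plain Error stands in for whatever the pipeline would use to report failure.
    throw new Error(`Unsafe globs in files key: ${unsafe.join(", ")}`);
  }
  return globs;
}
```

With option 2 a pattern like `../secrets/*` is dropped without telling the author; with option 1 the publish fails and the author sees why.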
I don't think we should use globs at all, as they feel more of a liability than anything else (hard to parse, and to validate the security of).
We should take this list of paths (which can be files or directories), canonicalize them, and check that they point only to subdirectories and not parent directories (this can be done by checking that the current absolute path is a prefix of the canonical path we're checking).
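A minimal sketch of that check, in TypeScript for illustration rather than the PureScript under review. The function name `isInsidePackage` is hypothetical. It uses Node's `path.posix.resolve`, which canonicalizes lexically (it collapses `.` and `..` but does not follow symlinks, so a real implementation would likely also need something like `fs.realpath`).

```typescript
import * as path from "node:path";

// Hypothetical helper; path.posix keeps the behavior identical on every platform.
function isInsidePackage(packageRoot: string, candidate: string): boolean {
  const root = path.posix.resolve(packageRoot);
  // resolve() joins the candidate onto the root and collapses "." and ".." segments.
  // An absolute candidate replaces the root entirely, so it is caught below too.
  const resolved = path.posix.resolve(root, candidate);
  // A bare startsWith(root) would accept "/work/pkg-evil" for root "/work/pkg",
  // so compare against the root followed by a separator.
  const prefix = root.endsWith("/") ? root : root + "/";
  return resolved === root || resolved.startsWith(prefix);
}
```

For a root of `/work/pkg`, `src/../test` resolves to `/work/pkg/test` and is accepted, while `../pkg-evil/x` and `/etc/passwd` are rejected.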
However, we can also just add a TODO in here and skip implementing the `files` field for now - we'd leave it in place and just blow up if anyone is trying to use it.
Globs are undeniably useful, though, for example for matching only `.purs` files in another directory. And if someone is going through the effort to explicitly include files via the `files` key, I suspect they're going to want more fine-grained control than just "a directory".
It's hard to parse globs (hence why we defer to `fast-glob`), but is sanitizing them any harder than sanitizing a list of directory paths?
> It's hard to parse globs (hence why we defer to `fast-glob`), but is sanitizing them any harder than sanitizing a list of directory paths?

Yes, unless the underlying library offers facilities for that. E.g. I couldn't find anything to canonicalize globs in the `fast-glob` library.
I've removed all code related to the `files` key and we can punt on it.
We had another chat in the registry call this morning and agreed to go back to the NPM style as described in my comment above. That means we take all files by default, but let you selectively choose files to include via a
Fixes #316, fixes #292, and fixes #164.
This PR removes the `Target` type altogether as described in #164. It also removes the `source` key altogether. It ensures in the API pipeline that when we create a tarball for a package we only include the `src` directory and some always-included files like the LICENSE or README or `purs.json` file.
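As a rough illustration of that inclusion rule (a TypeScript sketch, not the PureScript in this PR), the predicate below decides whether a path relative to the package root goes into the tarball. The name `shouldIncludeInTarball` is hypothetical, and the matching details are assumptions: the always-included files are matched case-insensitively at the top level only, with any extension allowed for LICENSE and README.

```typescript
// Hypothetical predicate; the exact matching rules are assumptions, not the PR's behavior.
function shouldIncludeInTarball(relativePath: string): boolean {
  // Normalize separators and drop a leading "./".
  const normalized = relativePath.replace(/\\/g, "/").replace(/^\.\//, "");

  // Everything under the src directory is included.
  if (normalized.startsWith("src/")) return true;

  // Always-included files are only considered at the package root.
  if (normalized.includes("/")) return false;

  const lower = normalized.toLowerCase();
  if (lower === "purs.json") return true;

  // Strip a trailing extension so README.md and LICENSE.txt also match.
  const stem = lower.replace(/\.[^.]*$/, "");
  return stem === "license" || stem === "readme";
}
```

Under these assumptions `src/Main.purs`, `LICENSE`, `README.md`, and `purs.json` are kept, while `test/Main.purs` and `docs/README.md` are left out.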