Skip to content
This repository has been archived by the owner on Feb 3, 2018. It is now read-only.

Follow symlinks when stripping vendor directories #199

Merged
merged 22 commits into from
Apr 13, 2017

Conversation

spenczar
Copy link
Contributor

A bit of extra trickiness here because I'm scared of repos with a symlink like ./vendor -> /.

I would love any guidance on how to write an integration test for this! I verified by hacking this into dep and running it on a test repo, but I don't feel comfortable without being able to run cross-platform tests...

Fixes #198

@codecov
Copy link

codecov bot commented Mar 20, 2017

Codecov Report

Merging #199 into master will increase coverage by 0.03%.
The diff coverage is 83.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #199      +/-   ##
==========================================
+ Coverage   77.76%   77.79%   +0.03%     
==========================================
  Files          25       26       +1     
  Lines        3989     3995       +6     
==========================================
+ Hits         3102     3108       +6     
+ Misses        668      667       -1     
- Partials      219      220       +1
Impacted Files Coverage Δ
vcs_source.go 73.76% <ø> (+0.69%) ⬆️
strip_vendor.go 83.33% <83.33%> (ø)
source_manager.go 91.38% <0%> (-0.75%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 65f245d...05ef430. Read the comment docs.

@sdboyer
Copy link
Owner

sdboyer commented Mar 21, 2017

Uuuuughh. Yeah, I spent some time a couple weeks ago trying to work through similar problems in #177, ultimately having to roll back #157 because I couldn't find a satisfactory answer. However, that the need here may be simpler, as I think there's only the simpler requirement of inspecting the outermost symlink. That may make it equivalent to the first-pass of work in #157; the implementation and/or tests in there may be worth looking at.

Hopefully something in there will spark a thought about a cross-platform test. If not...well, we'll have to put our heads together 🤔 😄 because yeah, I'm not willing to merge something as potentially destructive as this without cross-platform test coverage.

symlinks are the fscking worst.

If a dependency contains a symlink named 'vendor', and that symlink points to
a directory, delete the symlink. Keep the directory around - we can tackle
that part when we get to actually pruning the vendored dependencies.

Also, add some test helpers for testing filesystem state changes.
Relative symlinks on windows confuse filepath.Walk. This is Go issue 17540,
golang/go#17540. While waiting for that to get fixed,
just use absolute symlinks in windows tests.
@spenczar
Copy link
Contributor Author

@sdboyer, this is ready for review at last!

@sdboyer
Copy link
Owner

sdboyer commented Mar 29, 2017

Gonna try to have a look tonight - sorry for delay!

Copy link
Owner

@sdboyer sdboyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

god, i hate symlinks.

i'm gonna need more time to really get this in my head, but some comments/questions for now 😄

for _, dir := range fs.dirs {
p := dir.prepend(fs.root)
if err := os.MkdirAll(p.String(), 0777); err != nil {
t.Fatalf("os.MkdirAll(%q, 0777) err=%q", p, 0777)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like second param should be err

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Will do.

for _, dir := range fs.dirs {
p := dir.prepend(fs.root)
if err := os.MkdirAll(p.String(), 0777); err != nil {
t.Fatalf("os.MkdirAll(%q, 0777) err=%q", p, 0777)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ibid above

result.go Outdated
return removeAll(path)
}

if (info.Mode() & os.ModeSymlink) != 0 {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait - shouldn't this check be first, because Windows junction dirs? i must be missing something...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably aren't missing anything. I don't know much of anything about Windows and don't know what a junction dir is. Should these be reversed? What's the test that would prove it?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hah, i scarcely know either. it's just problems discussed in golang/go#17540 - Windows APIs can indicate that a file is both a symlink AND a dir. they do it when it's one of these "junction dirs."

Bottom line is, this approach on windows would potentially (?) erase the target of the symlink.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, gotcha. I'll try to figure out how to create a junction dir and write a test that exercises this.

For what it's worth, if this erases the target of symlinks, then we're already doing that today.

@sdboyer
Copy link
Owner

sdboyer commented Apr 1, 2017

I can't remember if I've pointed you to it before or not, but the work being done in golang/dep#247 might be helpful here.

@spenczar
Copy link
Contributor Author

spenczar commented Apr 2, 2017

Alright, I think we should just ignore junctions if we ever encounter them in this function. I don't think we ever will, though: junctions turn out to have much more in common with hard links than symbolic links, for one thing, and git won't place hard links on your file system. Similarly, msysgit won't put junctions on your file system. It's not at all clear how that would even work, since junctions must point to absolute paths and require lots of permissions to create.

There's also the lack of support in the Go os and filepath packages for handling junctions. It's hard to convince the go standard library's os.Remove call, for example, to remove just the junction link and not the thing it points to. I think you'd need to use the exec package and run linkd to do it. filepath.Walk doesn't behave the way I'd expect with junctions, too.

This function only runs against the code inside ./vendor, so it should only be running against stuff we pulled down with a VCS tool, so it probably won't ever encounter a junction. But if it does, I think the best thing is to ignore these things. Maybe we should raise an error instead, though?

I'm at least confident that we should not try to prune them.

@spenczar
Copy link
Contributor Author

spenczar commented Apr 2, 2017

I can find no way to confidently detect whether a file is a 'regular' symlink versus a junction on windows, other than writing some very gnarly syscalls. The current code checks if a file named vendor is both a symlink and a directory - this only happens on windows, but it happens for both symlinks and junctions. Symlinks named vendor would be safe to os.Remove, but junctions would not. We could just skip both to be safe, or we could try to tell the difference between the two.

syscall.Readlink sniffs to tell the difference in the standard library - it's the only code I've found that gets into the muck of reading the reparse tags that describe this. Beware, unsafe.Pointer calls ahead: https://github.com/golang/go/blob/a1cedf08428bdb91916bb5317c8413212308048c/src/syscall/syscall_windows.go#L1005-L1060

I don't think this sort of thing belongs in gps. It seems way too low-level. I think we should just not support pruning out symlinks on windows. What do you think, @sdboyer?

Copy link
Owner

@sdboyer sdboyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking awesome - thanks so much for the exhaustive work on it! Just a couple small things left, then it seems ready to merge.

t.Fatalf("file %q Close() err=%q", p, err)
}
}
// setupUisngJunctions inflats fs onto the host file system, but uses Windows
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: inflats

// On windows, links can be symlinks (which behave like Unix symlinks, mostly)
// or 'directory junctions', which respond 'true' to os.FileInfo.IsDir(). Run
// all these tests twice: once using symlinks, and once using junctions.
func stripVendorTestCase(tc fsTestCase) func(*testing.T) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

love this use of first-class funcs as subtests 😄

@@ -0,0 +1,3 @@
// +build !windows
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine to not have this file, I think. We've already got a lot of files in the gps pkg 😄

func (f fsPath) String() string { return filepath.Join(f...) }

func (f fsPath) prepend(prefix string) fsPath {
p := fsPath{prefix}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth wrapping prefix here in a filepath.FromSlash(). I realize it's not necessary to do with the current code, as ioutil.TempDir() is the only thing that ends up populating that var, but it takes away a footgun in the event that it gets reused differently in the future by someone not developing on windows.


if info.IsDir() {
_, ok := dirMap[path]
if !ok {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change dirMap, fileMap and linkMap all into map[string]bool. Yes, it's a bit more sloppy underneath, but it makes these if statements more readable. Because the zero value of bool is false, we can just:

if dirMap[path] {
  // ...
else {
  ...
}

instead of having to do the two-value map lookup return style.

result_test.go Outdated
before: filesystemState{
dirs: []fsPath{
fsPath{"package"},
fsPath{"package", "vendor"},
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems unnecessarily verbose to have to declare both of these, as os.MkdirAll() is the call that's used - shouldn't just the latter be sufficient?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, true. I just did it this way to help visualize the package layout. I'll trim out the redundancies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, the other reason is that it helps the before and after states look more parallel. This is the redundant version:

	t.Run("nonvendor symlink", stripVendorTestCase(fsTestCase{
		before: filesystemState{
			dirs: []fsPath{
				fsPath{"package"},
				fsPath{"package", "_vendor"},
			},
			links: []fsLink{
				fsLink{
					path: fsPath{"package", "link"},
					to:   "_vendor",
				},
			},
		},
		after: filesystemState{
			dirs: []fsPath{
				fsPath{"package"},
				fsPath{"package", "_vendor"},
			},
			links: []fsLink{
				fsLink{
					path: fsPath{"package", "link"},
					to:   "_vendor",
				},
			},
		},
	}))

And here's the trimmed down one:

	t.Run("nonvendor symlink", stripVendorTestCase(fsTestCase{
		before: filesystemState{
			dirs: []fsPath{
				fsPath{"package", "_vendor"},
			},
			links: []fsLink{
				fsLink{
					path: fsPath{"package", "link"},
					to:   "_vendor",
				},
			},
		},
		after: filesystemState{
			dirs: []fsPath{
				fsPath{"package"},
				fsPath{"package", "_vendor"},
			},
			links: []fsLink{
				fsLink{
					path: fsPath{"package", "link"},
					to:   "_vendor",
				},
			},
		},
	}))

I like how the after state matches the before state's directory listing, even if it's a little more verbose, so I think I'll keep these in.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fair nuff.

result.go Outdated
case symlink && dir:
// This must be a windows junction directory. Support for these in the
// standard library is spotty, and we could easily delete an important
// folder if we called os.Remove. Just skip these.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is it os.Remove() we'd call here, or os.RemoveAll()? honestly, i don't even know, b/c idk whether to think of those junction dirs as files or dirs 😛

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, either one is bad news :)

@sdboyer
Copy link
Owner

sdboyer commented Apr 10, 2017

Those look like real failures, no?

@spenczar
Copy link
Contributor Author

So, tests don't pass on windows, because even a "normal" windows symlink created with os.Symlink which points at a directory looks like a dir and a symlink, not just junctions.

I'm kind of at a loss here. We could detect the difference between Windows symlinks (which can be safely removed) and junctions (which cannot) but we'd have to use a lot of unsafe and syscall to do so. Or, we could just not prune symlinked vendor directories on Windows. I'd like guidance from the benevolent leader here, @sdboyer :)

@spenczar
Copy link
Contributor Author

This block is what it looks like to determine whether a file is a symlink or a junction, for reference: https://github.com/golang/go/blob/a1cedf08428bdb91916bb5317c8413212308048c/src/syscall/syscall_windows.go#L1007-L1023

@sdboyer
Copy link
Owner

sdboyer commented Apr 10, 2017

https://github.com/golang/go/blob/a1cedf08428bdb91916bb5317c8413212308048c/src/syscall/syscall_windows.go#L1007-L1023

aaaaaack hack hack cough cough splutter

Yeah...let's just fall back to this:

we could just not prune symlinked vendor directories on Windows.

I'm content to kick this can down the road for now, as long as we make a TODO in the code and an issue on the repo. First, we have bigger fish to fry right now. Second, I'm wary about making too many choices with respect to symlinks now, as we may be making promises that ultimately aren't consistent with the final vision for the Go tool. Absent someone reporting an actual problem, I'd rather not invest in a bunch of logic that I don't fully understand that we may need to change anyway.

On Windows, we're unable to distinguish between symlinks and junctions. Deleting
junctions is dangerous and appears to be able to delete the underlying folder,
not just the junction that's pointing to a folder. That's unacceptably
dangerous: a malicious repo could contain a junction named 'vendor' pointing at
'C:\\' and we'd delete all the user's data.
@spenczar
Copy link
Contributor Author

(ノಥ益ಥ)ノ sʍopuıʍ

According to some guy on stackoverflow and some other guy on technet, icacls is unable to modify the permissions of a junction itself. It will always modify the junction's target.

There appears to be no way, other than modifying file system metadata directly (probably by using syscall and unsafe), to change the permission of the junction. You're not supposed to be able to list junctions contents - you have to go to the paths inside them directly (according to this blog post, anyway).

Sadly, we're just going to have to drop the test coverage of junctions. Rough.

It's too hard to create junctions with permissions that let filepath.Walk
process them.
@sdboyer
Copy link
Owner

sdboyer commented Apr 13, 2017

(ノಥ益ಥ)ノ sʍopuıʍ

this

@sdboyer
Copy link
Owner

sdboyer commented Apr 13, 2017

Well, I'm glad we have all the history here - and commits to potentially un-revert with tests - whenever we do decide to tackle this. Thanks for being indefatigable, as ever, @spenczar 😄

@sdboyer sdboyer merged commit 956acf3 into sdboyer:master Apr 13, 2017
krisnova pushed a commit to krisnova/dep that referenced this pull request Apr 21, 2017
Follow symlinks when stripping vendor directories
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants