Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipfs get cross platform paths #2013

Open
djdv opened this issue Nov 28, 2015 · 10 comments
Open

ipfs get cross platform paths #2013

djdv opened this issue Nov 28, 2015 · 10 comments
Labels
help wanted Seeking public contribution on this issue kind/bug A bug in existing code (including security flaws) topic/windows Windows specific

Comments

@djdv
Copy link
Contributor

djdv commented Nov 28, 2015

ipfs get does not handle paths with platform specific separators in them gracefully.

Consider the structure

test (QmSqqKkPViWCn5c6LShAqved78NtYr5okY8gN2frkNzJ9E)
|-- Directory with slash\ in it
|   |-- Filename with slash\ in it
|   `-- Regular filename
`-- Directory without slash in it
    |-- Filename with slash\ in it
    `-- Regular filename

geting this on a system that uses \ as a path separator results in an error. Here's an example on Windows with \.
screenshot
[Note: github doesn't like this image for some reason, hosting via ipfs]

@whyrusleeping mentioned this in PR #1933 (comment) for add but it shouldn't be possible to add an invalid path on your platform so it was a non issue there, that's not the case for get though.

To handle this there would need to be some table of restricted symbols for various file systems/schemes/whatever, said symbols would need to be replaced with some other valid symbol. Maybe try to use the given path/name and fallback to filtering on errors like these? This would tamper with the validity of the objects you're getting, they wouldn't add up to the same hash but since the hash already exists and you already have it, pinning it directly should make this a non-issue.
Outside of Windows restricting \ I'm not sure what else if anything is effected.

A way to workaround this is to circumvent get by using the gateway and external tools that do this conversion, for example using wget
wget -r http://127.0.0.1:8080/ipfs/QmSqqKkPViWCn5c6LShAqved78NtYr5okY8gN2frkNzJ9E
This will percent encode the offending slash.

Paging: @rht

@rht
Copy link
Contributor

rht commented Nov 28, 2015

So in ipfs add, all separators are already converted to '/'s, all ''s are names only. This is unambiguous.

Would it fix this if 1. https://github.com/ipfs/go-ipfs/blob/master/thirdparty/tar/extractor.go#L68 is replaced with gopath.Join (use / as separator), 2. if on windows, additionally escape the backslash if strings.Contains(fname, '\')?

@djdv
Copy link
Contributor Author

djdv commented Nov 28, 2015

The problem on Windows is that it accepts both '' and '/' as path delimiters even when mixed so it always splits the path. 1 and 2 would work but it couldn't be escaped it would have to be replaced with another symbol altogether and I'm not sure what would be appropriate or if that approach is acceptable either, a potential replacement could be a Unicode mimic such as . For Windows in particular all of these are forbidden for filenames and would potentially cause issues <>:"/\|?* I'll have to look into those, of these only the question mark is commonly seen in my experience.

@djdv
Copy link
Contributor Author

djdv commented Nov 28, 2015

As expected it doesn't like this tree either.

test/ (QmW86XfuidKM4hEie6oGfb5pYyp99EwVLjfqcFALsoRpjv)
`-- symbols
    |-- ".txt
    |-- *.txt
    |-- :.txt
    |-- < .txt
    |-- >.txt
    |-- ?.txt
    |-- \.txt
    `-- |.txt

@rht
Copy link
Contributor

rht commented Nov 28, 2015

Path validation was discussed in #1710, where a solution was implemented in #1740.
Accommodating windows paths would further restrict what counts as valid path.

@djdv
Copy link
Contributor Author

djdv commented Dec 3, 2015

@rht thanks for that, I'm going to see if I can think of any other issues related to this and reference them there later.

Another issue is dealing with case-sensitive names on systems that don't care for case. NTFS can be case sensitive but Windows does not respect it under normal conditions https://support.microsoft.com/en-us/kb/100625, I know Mac OS also offers case sensitive file systems as well as other *nix systems.

Given this structure where the lower case path is 1, the mixed case is 2 and the upper case is 3
(Mind the spelling error, early morning)

test/ (QmNpDmaG9QpHFXxfLVCyqc3bFvtpxmT6M1KexGeSbkpkGQ)
    |-- CASE\ SENSATIVE\ FILE
    |-- CASE\ SENSATIVE\ PATH
    |   `-- FileWithData
    |-- Case\ Sensative\ File
    |-- Case\ Sensative\ Path
    |   `-- FileWithData
    |-- case\ sensative\ file
    `-- case\ sensative\ path
        `-- FileWithData

geting it on Windows results in this.
screenshot
Which is odd since that file should contain "File 3", it's keeping the uppercase path but getting the file from the lowercase path. The same is true for "CASE SENSATIVE FILE" which should also be "root file 3" but is "root file 1".

@mildred
Copy link
Contributor

mildred commented Dec 15, 2015

Accommodating windows paths would further restrict what counts as valid path.

I don't think we should restrict paths everywhere. What happens if tomorrow we port IPFS to a system where filenames must be 8 character long max with a 3 character long extension ? Will we impose this restriction on every other IPFS user ?

The correct solution is either to fail ion Windows (it is always possible to ipfs cat to get the file content on stdout) or find an escape mechanism that doesn't use restricted characters. Possibly using unicode.

@rht
Copy link
Contributor

rht commented Jan 21, 2016

There was an issue on case insensitive fs in #288.

@whyrusleeping
Copy link
Member

Can someone verify if this is still an issue?

@whyrusleeping whyrusleeping added the need/verification This issue needs verification label Aug 23, 2016
@djdv
Copy link
Contributor Author

djdv commented Aug 24, 2016

Version: 0.4.4-dev-8830aae
Still an issue on my Windows machine.
screenshot
Relevant hash for testing:
[removed]

@whyrusleeping whyrusleeping added kind/bug A bug in existing code (including security flaws) help wanted Seeking public contribution on this issue and removed need/verification This issue needs verification labels Aug 24, 2016
@djdv
Copy link
Contributor Author

djdv commented Feb 13, 2017

@leerspace pointed out another problem case #3660 (comment), directories with a trailing space are not valid in Windows and cause issues when using ipfs get. i.e. directory name \file.ext. I added this to the test hash.

Edit:
Added some more whitespace cases. /ipfs/QmPQmkrUq9w8qqkNQjtkqM72Kv4n7dQtfGJpKDwxHZMqCD

```
tree -N "Windows has a nightmare"
tree -N "Windows has a nightmare"
Windows has a nightmare
├── case
│   ├── case sensitive file
│   ├── Case Sensitive File
│   ├── CASE SENSITIVE FILE
│   ├── case sensitive path
│   │   └── FileWithData
│   ├── Case Sensitive Path
│   │   └── FileWithData
│   └── CASE SENSITIVE PATH
│       └── FileWithData
├── slashes
│   ├── Directory with slash\ in it
│   │   ├── Filename with slash\ in it
│   │   └── Regular filename
│   └── Directory without slash in it
│       ├── Filename with slash\ in it
│       └── Regular filename
├── symbols
│   ├── :.txt
│   ├── ?.txt
│   ├── ".txt
│   ├── *.txt
│   ├── \.txt
│   ├── <.txt
│   ├── >.txt
│   └── |.txt
└── whitespace
    ├──  directory leading with whitespace
    │   ├──  FileWithData.txt
    │   └── FileWithData.txt
    ├──  File name leading with whitspace.txt
    ├── directory ending in whitespace
    │   ├── FileWithData .txt
    │   ├── FileWithData .txt
    │   └── FileWithData.txt
    ├── File name ending with whitespace after extension.txt
    ├── File name ending with whitespace before extension .txt
    └── File name with whitespace before and after extension .txt
```

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Seeking public contribution on this issue kind/bug A bug in existing code (including security flaws) topic/windows Windows specific
Projects
None yet
Development

No branches or pull requests

4 participants