Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std::Command::new fails for some CMD scripts on Windows 10 msvc nightly 1.19/1.20 #42791

Closed
budziq opened this issue Jun 21, 2017 · 21 comments · Fixed by #44542
Closed

std::Command::new fails for some CMD scripts on Windows 10 msvc nightly 1.19/1.20 #42791

budziq opened this issue Jun 21, 2017 · 21 comments · Fixed by #44542
Labels
C-bug Category: This is a bug. O-windows-msvc Toolchain: MSVC, Operating system: Windows P-high High priority regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@budziq
Copy link
Contributor

budziq commented Jun 21, 2017

While working on mdbook build.rs issue I've discovered that on nightly the std::Command::new() for a windows 10 batch script "npm.cmd" results in underlying node-js process to not find required resource paths.

I tried this code:

use std::process::Command;

fn main() {
    let status = Command::new("npm.cmd")
        .arg("-v")
        .output();

    println!("Command exited with '{:?}'", status);
    let status = status.expect("Process spawn should succeed!");
    if !status.status.success() {
        panic!("Command failed!");
    }
}

I expected to see this happen:
This works on all channels for my linux and windows 7 boxes (did not test other than msvc here) as well as stable and beta for windows 10 box (and appveyor)

Command exited with 'Ok(Output { status: ExitStatus(ExitStatus(0)), stdout: "3.10.10\n", stderr: "" })'

Instead, this happened only on windows 10 (and appveyor) msvc nightly:

Command exited with 'Ok(Output { status: ExitStatus(ExitStatus(1)), stdout: "", stderr: "module.js:471\r\n    throw err;\r\n    ^\r\n\r\nError: Cannot find module \'C:\\projects\\nightly-cmd-fail\\node_modules\\npm\\bin\\npm-cli.js\'\r\n    at Function.Module._resolveFilename (module.js:469:15)\r\n    at Function.Module._load (module.js:417:25)\r\n    at Module.runMain (module.js:604:10)\r\n    at run (bootstrap_node.js:389:7)\r\n    at startup (bootstrap_node.js:149:9)\r\n    at bootstrap_node.js:504:3\r\nmodule.js:471\r\n    throw err;\r\n    ^\r\n\r\nError: Cannot find module \'C:\\projects\\nightly-cmd-fail\\node_modules\\npm\\bin\\npm-cli.js\'\r\n    at Function.Module._resolveFilename (module.js:469:15)\r\n    at Function.Module._load (module.js:417:25)\r\n    at Module.runMain (module.js:604:10)\r\n    at run (bootstrap_node.js:389:7)\r\n    at startup (bootstrap_node.js:149:9)\r\n    at bootstrap_node.js:504:3\r\n" })'
thread 'main' panicked at 'Command failed!', main.rs:11
note: Run with `RUST_BACKTRACE=1` for a backtrace.

Please note that the panic is explicitly triggered in main.rs to make appveyor report the error.

The npm.cmd sets paths for node-js via environmental variables and these seem not to be forwarded properly.

Sorry for mostly useless description as I have no experience with npm/node and no access to the actual win10 box on which the problem was reproduced (I was only able to locate this problem via appveor logs).

Meta

rustc --version --verbose
rustc 1.20.0-nightly (445077963 2017-06-20)
binary: rustc
commit-hash: 445077963c55297ef1e196a3525723090fe80b22
commit-date: 2017-06-20
host: x86_64-pc-windows-msvc
release: 1.20.0-nightly
LLVM version: 4.0

But the problem exists at least since (sorry no verbose output here. I have no access to that win10 box anymore)
rustc 1.19.0-nightly (04145943a 2017-06-19)

Link to appveyor showing the problem:
https://ci.appveyor.com/project/budziq/nightly-cmd-fail/build/1.0.10

@retep998
Copy link
Member

So you're saying the combination of Rust nightly and either appveyor or Windows 10 is causing the npm stuff to fail?

Also, the actual error that is occurring here:

module.js:471
    throw err;
    ^

Error: Cannot find module \'C:\\projects\\nightly-cmd-fail\\node_modules\\npm\\bin\\npm-cli.js\'
    at Function.Module._resolveFilename (module.js:469:15)
    at Function.Module._load (module.js:417:25)
    at Module.runMain (module.js:604:10)
    at run (bootstrap_node.js:389:7)
    at startup (bootstrap_node.js:149:9)
    at bootstrap_node.js:504:3
module.js:471
    throw err;
    ^

Error: Cannot find module \'C:\\projects\\nightly-cmd-fail\\node_modules\\npm\\bin\\npm-cli.js\'
    at Function.Module._resolveFilename (module.js:469:15)
    at Function.Module._load (module.js:417:25)
    at Module.runMain (module.js:604:10)
    at run (bootstrap_node.js:389:7)
    at startup (bootstrap_node.js:149:9)
    at bootstrap_node.js:504:3

@budziq
Copy link
Contributor Author

budziq commented Jun 21, 2017

So you're saying the combination of Rust nightly and either appveyor or Windows 10 is causing the npm stuff to fail?

What I was able to gather sofar is that env variables in the cmd script might not be forwarded correctly to the nodejs subprocess.

@Mark-Simulacrum Mark-Simulacrum added O-windows-msvc Toolchain: MSVC, Operating system: Windows T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Jun 23, 2017
@alexcrichton
Copy link
Member

@budziq what makes you think this is a rustc bug? It's looking like the process was spawned correctly and output was captured, which makes me think that this may be a local configuration issue?

@budziq
Copy link
Contributor Author

budziq commented Jun 25, 2017

@alexcrichton I am not positive that it is a rustc bug. But identical setup works on windows7 (stable/beta/nightly) as well as the exact same setup works for windows 10 stable/beta and fails on nightly suggests that this might be a regression in stdlib.

I am assuming that while process start and output capture works, there might be problem with how the env variables are passed to the child process. I'm no expert on windows nor nodejs (I just happened to help with triage on mdBook issue ) I am assuming that very specific contents of npm.cmd trigger this behavior. I might try to get a hold of win10 box again and try to get the reproduction path down to a manageable minimum.

@ollie27
Copy link
Member

ollie27 commented Jul 17, 2017

After some investigating it turns out this is an interesting Windows feature/quirk/bug in how %~dp0 is handled. We're seeing this now in Rust because of #42436 (so my bad). %~dp0 is supposed to return the drive and path to the running batch file but that can fail if it was called with a quoted name.

This isn't a new issue: https://stackoverflow.com/a/26851883.

Note that this can still be triggered on stable if the name of the batch file contains a space such as Command::new("foo bar.bat") so reverting #42436 won't really help.

What I suggest doing in this case is using Command::new("cmd").arg("/c").arg("npm.cmd") instead which seems to work around the issue. It's also what you should be doing according to the CreateProcess docs.

@Mark-Simulacrum Mark-Simulacrum marked this as a duplicate of #43273 Jul 19, 2017
@Mark-Simulacrum Mark-Simulacrum added the regression-from-stable-to-beta Performance or correctness regression from stable to beta. label Jul 19, 2017
@Mark-Simulacrum
Copy link
Member

Marking as a regression from stable to beta, which I think is now true? Uncertain. Either way, I suspect we should consider this a compatibility note and close, since based on @ollie27's comment directly above people using this functionality should be doing so differently according to the Windows documentation.

@azerupi
Copy link
Contributor

azerupi commented Jul 19, 2017

This is an annoying regression though. Because instead of having to cfg only the name of the binary you now have to cfg all the places where you create / run the command because you need different args etc.

We used the method below in our build.rs, but in the actual code the Command part is repeated a couple of times in different places. So this regression would result in more cfg boilerplate.

#[cfg(windows)]
mod execs {
    pub const NPM: &'static str = "npm.cmd";
    pub const STYLUS: &'static str = "stylus.cmd";
}
#[cfg(not(windows))]
mod execs {
    pub const NPM: &'static str = "npm";
    pub const STYLUS: &'static str = "stylus";
}

...

if !Command::new(execs::STYLUS)
        .arg(stylus_dir)
        .arg("--out")
        .arg(theme_dir)
        .arg("--use")
        .arg("nib")
        .status()?
        .success() {
    bail!("Stylus encoutered an error");
}

@budziq
Copy link
Contributor Author

budziq commented Jul 19, 2017

Either way, I suspect we should consider this a compatibility note and close, since

I would opt for a fix instead of closing with compatibility note based on the fact that problem is focused on windows 10 and nightly (not sure if it has already crawled into beta) with the original (stable) behaviour being desirable and this regression possibly affecting large number of crates but being hard to catch with cargo bomb (i suppose that not many crates run windows cmd scripts in their regression tests)

based on @ollie27's comment directly above people using this functionality should be doing so differently according to the Windows documentation.

You are essentially suggesting that crates / binaries implement a workaround. IMHO this is what stdlib is for (providing a portable abstraction) and client code should not be interested (or even aware) of the fine details of underlying implementation.

@ollie27
Copy link
Member

ollie27 commented Jul 20, 2017

@azerupi something like the following should work and as I mentioned should probably be done anyway.

#[cfg(windows)]
mod execs {
    use std::process::Command;
    pub fn npm() -> Command {
        let mut cmd = Command::new("cmd");
        cmd.arg("/c").arg("npm.cmd");
        cmd
    }
    pub fn stylus() -> Command {
        let mut cmd = Command::new("cmd");
        cmd.arg("/c").arg("stylus.cmd");
        cmd
    }
}
#[cfg(not(windows))]
mod execs {
    use std::process::Command;
    pub fn npm() -> Command {
        Command::new("npm")
    }
    pub fn stylus() -> Command {
        Command::new("stylus")
    }
}

the fact that problem is focused on windows 10

Is it? I haven't been able to reproduce any differences between Window 10 64 bit and Windows 7 32 bit in a VM. The Stack Overflow answer I linked talks about this in Windows XP so if there really is a difference between 7 and 10 then there might be a different bug.

You are essentially suggesting that crates / binaries implement a workaround.

I believe the best place for the workaround is in the batch files themselves. Using a subroutine (like shown here) for %~dp0 seems to be the only reliable thing to do.

IMHO this is what stdlib is for (providing a portable abstraction) and client code should not be interested (or even aware) of the fine details of underlying implementation.

std doesn't provide a portable way to run batch files. If you use Command::new("foo.bat") then you have to specify the file extension anyway so using cmd /c isn't really any less portable. Client code should not rely on the fine details of the underlying implementation either, in this case whether the executable name is quoted.

Command is a bit of a mess on Windows. In this case it's definitely the fault of Windows. #37519 is Rust's fault though.

It wouldn't be a disaster to revert #42436 and return to the previous behaviour which just has different bugs.

@budziq
Copy link
Contributor Author

budziq commented Jul 20, 2017

@ollie27

My comments (sofar and below) may came out negative so please excuse me. I understand the technical details and greately appreciate the work you guys are doing here! I can live with any workaround in my code really and this is more of a concerned citizen speaking as I find this problem a serious regression (as noted the problem did not occur in stable nor beta and its win 10 speciffic)

the fact that problem is focused on windows 10

Is it? I haven't been able to reproduce any differences between Window 10 64 bit and Windows 7 32 bit in a VM. The Stack Overflow answer I linked talks about this in Windows XP so if there really is a difference between 7 and 10 then there might be a different bug.

yep. I was able to reproduce the problem only on borrowed win 10 laptop and nightly (it does not reproduce on win 7 nor on any platform when using stable and beta - the statement about beta might no longer be true if the problem code migrated there) and in appveyor VM (as before the problem was on nightly only)
https://ci.appveyor.com/project/azerupi/mdbook/build/1.0.412

std doesn't provide a portable way to run batch files

Then there should be a section in docs mentioning that. In sane OS es there is no distinction and windows strives to make it transparent for the most part.

If you use Command::new("foo.bat") then you have to specify the file extension anyway

Which is a bug in itself requiring a workaround in user code. This exactly proves my point about providing portable abstractions instead of leaking backend specific requirements to the user.

Command is a bit of a mess on Windows

to say the least :/

The npm.cmd uses %~dp0 but I'll not be able to verify if the workaround works untill I'm back from holidays in 1.5 weeks. Is there a chance to leave this issue open untill I have chance to verify the workaround on my side?

Thanks for your time!

@azerupi
Copy link
Contributor

azerupi commented Jul 20, 2017

@ollie27 Thanks for the suggestion, it is cleaner than what I was going to write :)

Command is a bit of a mess on Windows. In this case it's definitely the fault of Windows. #37519 is Rust's fault though.

Indeed, it seems like it. If there is no way to abstract the Windows part away to provide a more unified abstraction, so be it. I can live with the workaround. It would be a shame though, because I suppose a lot of people use Command in their build script and would be happy if it worked seamlessly on all the platforms.

Personally, I don't have a Windows machine to test my code on, I just assume std code is cross-platform if I don't use OS specific modules. Maybe it is a wrong assumption to make, but so far it has worked out pretty well and that is just amazing!

I guess I'm just asking to not dismiss this issue too quickly because it is not "our fault" and see if we can end up with something better. Even if it wasn't working flawlessly in a cross-platform way before, I would consider diverging further as a serious regression that should at least be well thought out.

@retep998
Copy link
Member

retep998 commented Jul 20, 2017

The fact that you even can run bat files with CreateProcess honestly feels like a backwards compatibility hack that Microsoft wishes they could undo. A bat file is not a binary, and when you do run it the process which is actually running is cmd.exe. Unfortunately the fact that CreateProcess is able to run bat files at all leads people into the false belief that it is any good at that. In fact, imposing the requirement that Command be able to run bat files precludes us from adding the ability to run binaries from paths longer than MAX_PATH! In order to support long paths, we'd have to specify the application name in the lpApplicationName parameter, which does not support executing bat files directly!

So please, just use cmd /c instead of executing bat files directly so that Rust will have the ability to make Command better in the future.

@budziq
Copy link
Contributor Author

budziq commented Jul 20, 2017

I would like to excuse for the terse form (I'm on mobile only) so please take this with grain of salt.

The fact that you even can run bat files with CreateProcess honestly feels like a backwards compatibility hack that Microsoft wishes they could undo

It might be but I have not encountered a modern language stdlib that would not allow running bat files within the PATH. But if this is the policy then either forbid it explicitly in API or at least document such limitation in the docs.

A bat file is not a binary, and when you do run it the process which is actually running is cmd.exe.

neither are any other executable scripts on NIXes. IMHO this is besides the point

Unfortunately the fact that CreateProcess is able to run bat files at all leads people into the false belief that it is any good at that.

Please note that we are discussing two completely different entities. You are mentioning the internal Windows implementation CreateProcess and its limitations which I perfectly understand. While I'm talking about the user facing rust API std::Command that IMHO should hide as many platform specific warts as possible and document its limitations if former is not possible.

Also I have a big problem with accepting a "wont fix" on a regression with an explanation such as "this was always a mistake and we should have never allowed it". I understand API stability not only as a "compilability" but also behavioral contract.

Please note that (possibly most) stdlib users might consider not being able to run bat files directly a serious drawback (not found in other mainstream languages).

@budziq
Copy link
Contributor Author

budziq commented Jul 20, 2017

Hmm this still came out stronger than intended 😞 .

Please note that as a pragmatic programmer I'll test and implement the suggested workaround (thanks for the help with that!)

previous comments were a well intended suggestions from someone concerned about rust and its success on the crowded programming language market.

Please keep at the good work!:heart:

@retep998
Copy link
Member

retep998 commented Jul 20, 2017

If you want something that is capable of running any executable script, what you really want is a wrapper around ShellExecuteEx and not CreateProcess.

In order to spawn a child process, Rust has to use either CreateProcess or ShellExecuteEx. Rust decided to use CreateProcess which means it is fundamentally limited in the sort of behavior it can support by what CreateProcess does. In order to paper over differences in behavior between platforms Rust has to do a lot of its own work, such as searching in PATH itself if you modify the PATH of the Command to emulate posix semantics and that's already given rise to a bunch of bugs (#37519). When Rust goes too far in providing cross platform behavior it ends up causing bugs like #29494. Some things simply do not work when you try to make the behavior the same across platforms #31259. Rust cannot completely unify the behavior across platforms. At some point Rust has to draw a line for where the platforms have different behavior and document it accordingly.

#42436 fixes an issue which could result in a security hole. You could attempt to run foo but if it doesn't exist it might end up running a completely different binary named foo bar.exe. It would be great if we could provide complete stability in behavior, but when fixing bugs there will inevitably be some corner case that is broken. It's unfortunate that quoting the binary name causes %~dp0 to no longer work but fortunately there is a clear way to change your code such that it reliably works across all Rust versions.

I really wish we could let Command support binary paths longer than MAX_PATH but a lot of people already rely on Command being able to directly execute bat scripts, far more than those that relied on %~dp0, so I don't think we'll ever be able to fix this bug. Because of std's stability guarantees it looks like if we ever want a good Command we'll have to rely on third party crates.

@budziq
Copy link
Contributor Author

budziq commented Jul 20, 2017

Thanks for the background information. I feel bad to have wasted so much of your time on a discussion that was less than productive 😞 so feel free to disregard my ramblings.

looks like if we ever want a good Command we'll have to rely on third party crates.

I think that this might be the best course of action.

@Mark-Simulacrum Mark-Simulacrum added I-nominated P-high High priority C-bug Category: This is a bug. labels Jul 27, 2017
@alexcrichton
Copy link
Member

@vadimcn do you have thoughts here? Or do you agree with the previous summaries?

@vadimcn
Copy link
Contributor

vadimcn commented Aug 1, 2017

I was quite surprised to learn that CreateProcess can execute batch scripts at all. MSDN explicitly says "To run a batch file, you must start the command interpreter; set lpApplicationName to cmd.exe and set lpCommandLine to the following arguments: /c plus the name of the batch file."

I agree with @retep998 that trying to abstract away all platform differences is fraught with peril.

@alexcrichton
Copy link
Member

Ok thanks for the info @vadimcn!

In light of that the libs team's conclusion during the meeting of "we're not going to revert" I believe still stands, so I'm going to close this.

If there's difficulty in updating applications though please let us know! We can try to help work around it if necessary.

@azerupi
Copy link
Contributor

azerupi commented Aug 4, 2017

In light of that the libs team's conclusion during the meeting of "we're not going to revert" I believe still stands, so I'm going to close this.

I am fine with the decision, but could I ask for more explicit documentation about this behavior?

The example on the std::process::Command doc shows the right thing to do, but it is never explicitly explained. Having a note about this in the docs would probably help avoid some confusion for people that are less familiar with Windows, like myself. 😄

@budziq
Copy link
Contributor Author

budziq commented Aug 4, 2017

Agreed. Actually I would also expect that such change in behaviour would also be mentioned in the rust release notes. Even if the original behaviour was unintended side effect, this is an important change in behavioral contract for stdlib.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Sep 13, 2017
In rust-lang#42436 the behavior for spawning processes on Windows was tweaked slightly to
fix various bugs, but this caused rust-lang#42791 as a regression, namely that to spawn
batch scripts they need to be manually spawned with `cmd /c` instead now. This
updates the compiler to handle this case explicitly for Emscripten.

Closes rust-lang#42791
alexcrichton added a commit to nrc/rust that referenced this issue Sep 14, 2017
In rust-lang#42436 the behavior for spawning processes on Windows was tweaked slightly to
fix various bugs, but this caused rust-lang#42791 as a regression, namely that to spawn
batch scripts they need to be manually spawned with `cmd /c` instead now. This
updates the compiler to handle this case explicitly for Emscripten.

Closes rust-lang#42791
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Sep 15, 2017
…, r=nikomatsakis

rustc: Spawn `cmd /c emcc.bat` explicitly

In rust-lang#42436 the behavior for spawning processes on Windows was tweaked slightly to
fix various bugs, but this caused rust-lang#42791 as a regression, namely that to spawn
batch scripts they need to be manually spawned with `cmd /c` instead now. This
updates the compiler to handle this case explicitly for Emscripten.

Closes rust-lang#42791
alexcrichton added a commit to alexcrichton/rust that referenced this issue Sep 16, 2017
…, r=nikomatsakis

rustc: Spawn `cmd /c emcc.bat` explicitly

In rust-lang#42436 the behavior for spawning processes on Windows was tweaked slightly to
fix various bugs, but this caused rust-lang#42791 as a regression, namely that to spawn
batch scripts they need to be manually spawned with `cmd /c` instead now. This
updates the compiler to handle this case explicitly for Emscripten.

Closes rust-lang#42791
frewsxcv added a commit to frewsxcv/rust that referenced this issue Sep 16, 2017
…, r=nikomatsakis

rustc: Spawn `cmd /c emcc.bat` explicitly

In rust-lang#42436 the behavior for spawning processes on Windows was tweaked slightly to
fix various bugs, but this caused rust-lang#42791 as a regression, namely that to spawn
batch scripts they need to be manually spawned with `cmd /c` instead now. This
updates the compiler to handle this case explicitly for Emscripten.

Closes rust-lang#42791
alexcrichton added a commit to alexcrichton/rust that referenced this issue Sep 17, 2017
…, r=nikomatsakis

rustc: Spawn `cmd /c emcc.bat` explicitly

In rust-lang#42436 the behavior for spawning processes on Windows was tweaked slightly to
fix various bugs, but this caused rust-lang#42791 as a regression, namely that to spawn
batch scripts they need to be manually spawned with `cmd /c` instead now. This
updates the compiler to handle this case explicitly for Emscripten.

Closes rust-lang#42791
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. O-windows-msvc Toolchain: MSVC, Operating system: Windows P-high High priority regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants