Expose raw Stdout/err/in #58326

ParadoxSpiral · 2019-02-09T16:17:17Z

Currently there is no easy or obvious way to get an unbuffered Stdout/err/in. The types do exist in stdio, but they are not public, for reasons that are not documented.

For example these types would be useful for CLI applications that write a lot of data at once without it getting unnecessarily flushed.

One can use platform-specific extensions such as from_raw_fd on Unix and from_raw_handle on Windows as a workaround.

josephlr · 2019-02-10T19:10:43Z

This is similar to #23818, where having more control over the buffering/flushing behavior of stdio/stdin would be helpful.

pitdicker · 2019-02-15T19:02:26Z

So I've been thinking about the same thing while working on #58454.

I do wonder how useful they would be though.

> For example these types would be useful for CLI applications that write a lot of data at once without it getting unnecessarily flushed.

I don't think Stdout flushes unnecessarily? It should flush just as often as or less often than StdoutRaw, because Stdout lets large writes (greater than 8k) pass through directly. And Stderr isn't buffered at all.
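The pass-through of large writes can be observed with a plain BufWriter, which is what LineWriter is built on. This is a minimal sketch, not the Stdout implementation: `RecordingSink` is a hypothetical type invented for the illustration, and the buffer is set to a tiny 8 bytes instead of the real default so the effect shows up with small inputs.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink (not part of std) that records the size of every write.
struct RecordingSink {
    sizes: Vec<usize>,
}

impl Write for RecordingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.sizes.push(buf.len());
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Returns the sizes of the writes that actually reached the sink.
fn large_write_passthrough() -> io::Result<Vec<usize>> {
    let sink = RecordingSink { sizes: Vec::new() };
    // Assumption: an 8-byte buffer stands in for the much larger real one.
    let mut writer = BufWriter::with_capacity(8, sink);
    writer.write_all(b"abc")?; // smaller than the buffer: held back
    writer.write_all(&[b'x'; 32])?; // larger than the buffer: handed to the sink as one write
    writer.flush()?;
    Ok(writer.get_ref().sizes.clone())
}
```

The recorded sizes should show the 32-byte write arriving at the sink in one piece rather than being chopped into buffer-sized chunks.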

Stdout does have an interesting comment:

pub struct Stdout {
    // FIXME: this should be LineWriter or BufWriter depending on the state of
    //        stdout (tty or not). Note that if this is not line buffered it
    //        should also flush-on-panic or some form of flush-on-abort.
    inner: Arc<ReentrantMutex<RefCell<LineWriter<Maybe<StdoutRaw>>>>>,
}

Working on this would change flushing behavior for piped buffered stdout, but is not at all relevant for StdoutRaw.

Stdout and Stderr do synchronize writes across threads however. I suppose that is why the panic code in the standard library writes to StderrRaw directly.

Another point: the Windows stdio implementation turns out to be quite tricky because it converts from UTF-8 to UTF-16 when writing to the console, but accepts arbitrary bytes when writing to a pipe. Ideally StdinRaw holds some state to deal with sliced UTF-16 buffers. And the buffered LineWriter in Stdout helps to give valid UTF-8 slices, because it slices at line breaks, preventing incomplete UTF-8 code points.

Finally, the non-raw types have a Maybe wrapper in between to gracefully handle the case where stdin/stdout/stderr is not available, see RFC 1014.

Some advantages the *Raw types could have:

StderrRaw does not do synchronization, I suppose there are cases where that is necessary to prevent deadlocks.
StdoutRaw will write everything out directly, no need for flushing. Bad for performance, and code that gradually builds up a line or message can give very messy output when multi-threaded. Still it can be used well and is desired by some (print! macro should flush stdout #23818).
Having the raw methods allows for building different abstractions.

A single synchronized Stdin with its own buffer seems pretty solid to me. I see no real advantage in StdinRaw.

ParadoxSpiral · 2019-02-26T14:24:44Z

> I don't think Stdout flushes unnecessarily? It should flush just as often as or less often than StdoutRaw, because Stdout lets large writes (greater than 8k) pass through directly. And Stderr isn't buffered at all.

Oh, I wasn't aware that LineWriter hands big writes through. However it would still be useful if you could wrap StdoutRaw in a BufWriter that you have more control over without also still having the LineWriter beneath.

> Some advantages the *Raw types could have:

A slight advantage is that you can more easily check if any of them are not present, since the raw functions return Results.

> A single synchronized Stdin with its own buffer seems pretty solid to me. I see no real advantage in StdinRaw.

I agree a StdinRaw is probably not desirable.

pitdicker · 2019-02-27T07:34:34Z

> A slight advantage is that you can more easily check if any of them are not present, since the raw functions return Results.

Sorry, I just made a PR that makes them no longer return a Result (#58768). But it didn't work anyway: the raw types on all systems already only return Ok.

I am still trying to make up my mind whether writing up a pre-RFC or PR to expose the *Raw types brings something useful.

StdoutRaw can in some sense break the synchronization promise of Stdout: when Stdout is locked by one thread it can write multiple lines without having lines from another thread 'interrupt'. So any custom implementation that wraps StdoutRaw and wants to play nice with the standard library should lock Stdout before using StdoutRaw. That seems to make the idea of using different synchronization primitives not really interesting anymore.
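For reference, the synchronization promise mentioned here looks roughly like this with the public API. A minimal sketch: `write_block` is a hypothetical helper name, and it only uses the existing `Stdout::lock`, not any raw type.

```rust
use std::io::{self, Write};

/// Writes all lines while holding the stdout lock, so output from other
/// threads that go through `io::stdout()` cannot land between them.
fn write_block(lines: &[&str]) -> io::Result<()> {
    let stdout = io::stdout();
    let mut guard = stdout.lock();
    for line in lines {
        writeln!(guard, "{line}")?;
    }
    guard.flush()
}
```

A writer built on a raw handle would bypass this lock entirely unless it took the lock itself first, which is the concern raised above.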

> Oh, I wasn't aware that LineWriter hands big writes through. However it would still be useful if you could wrap StdoutRaw in a BufWriter that you have more control over without also still having the LineWriter beneath.

I am preparing a PR that switches the buffering mode between LineWriter and BufWriter depending on whether a terminal is connected, but expect it to not land easily... Would that fit your needs, or some method on Stdout that gives more control over the buffering?
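A user-side approximation of that switch can be sketched with `std::io::IsTerminal` (stable since Rust 1.70). This is only an illustration, not the PR being described: `pick_writer` and `adaptive_stdout` are hypothetical names, and the writer still sits on top of the LineWriter inside Stdout, so it only changes how writes are batched before they reach it.

```rust
use std::io::{self, BufWriter, IsTerminal, LineWriter, Write};

/// Wraps `inner` in a LineWriter for interactive use, or a BufWriter otherwise.
fn pick_writer<W: Write + 'static>(inner: W, interactive: bool) -> Box<dyn Write> {
    if interactive {
        Box::new(LineWriter::new(inner))
    } else {
        Box::new(BufWriter::new(inner))
    }
}

/// Chooses the buffering mode based on whether stdout is a terminal.
fn adaptive_stdout() -> Box<dyn Write> {
    let stdout = io::stdout();
    let interactive = stdout.is_terminal();
    pick_writer(stdout, interactive)
}
```

Callers need to flush (or drop) the returned writer before the process exits, since a block-buffered writer may still hold pending bytes.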

retep998 · 2019-02-27T10:03:50Z

One big question is should we have StdoutRaw and StdinRaw for Windows consoles that allow the user to read and write [u16] directly? Should we also allow the user to write arbitrary [u8] bytes through the narrow codepage?

ParadoxSpiral · 2019-02-27T10:17:48Z

> I am preparing a PR that switches the buffering mode between LineWriter and BufWriter depending on whether a terminal is connected, but expect it to not land easily... Would that fit your needs, or some method on Stdout that gives more control over the buffering?

I would like to be able to set the capacity of the inner BufWriter.
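What is possible today is an outer BufWriter with a chosen capacity wrapped around the locked handle. A minimal sketch, assuming a hypothetical 64 KiB capacity and hypothetical helper names; it does not change the capacity of the buffer inside Stdout, which stays underneath.

```rust
use std::io::{self, BufWriter, StdoutLock, Write};

/// Hypothetical capacity chosen for the illustration.
const CAPACITY: usize = 64 * 1024;

/// Returns a block-buffered writer over the locked stdout handle.
fn big_buffered_stdout() -> BufWriter<StdoutLock<'static>> {
    BufWriter::with_capacity(CAPACITY, io::stdout().lock())
}

/// Writes every chunk through the outer buffer and flushes once at the end.
fn write_bulk(chunks: &[&str]) -> io::Result<()> {
    let mut out = big_buffered_stdout();
    for chunk in chunks {
        out.write_all(chunk.as_bytes())?;
    }
    out.flush()
}
```

Because the writer holds the stdout lock for its whole lifetime, other threads printing to stdout will block until it is dropped.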

BartMassey · 2019-10-29T09:08:43Z

I wrote up some relevant stuff on Reddit here. My repo is here if you want to play with it. ~~It all amounts to a pretty good argument that exposing StdoutRaw isn't going to buy you much performance in typical cases, I think.~~

BartMassey · 2019-11-10T19:01:07Z

After further investigation it looks like writing directly to the underlying File with custom buffering is much faster than anything that I've been able to figure out for using stdout(). I'm playing with this right now; see http://github.com/BartMassey/rust-nonstdio for a very early preview of a thing. The Background section of the README has some information that is relevant here.

jgoerzen · 2022-06-08T00:34:39Z

This led to an otherwise-unnecessary use of std::io::copy for me. I have data that I want to pipe to a command. This data may come from stdin, or it may come from some other Read. I can't just call .stdin() on this with a handle that comes from io::stdin() if I've read anything from that handle, because the first read, even a read_exact of 1 byte, will have read 8K, and the remainder of the 8K block will be discarded.

This is an unfortunate data loss bug that is entirely non-obvious in the library and not blocked by the type system.
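The over-read can be reproduced without a real stdin. A minimal sketch, assuming a BufReader over an in-memory Cursor as a stand-in for the buffered stdin handle, with a 64-byte buffer instead of 8K; the function name is hypothetical.

```rust
use std::io::{self, BufReader, Cursor, Read};

/// Reads one byte through a BufReader and reports how far the underlying
/// source was advanced by that single read.
fn source_position_after_one_byte() -> io::Result<u64> {
    let source = Cursor::new(vec![0u8; 100]);
    // Assumption: 64 bytes stands in for the real stdin buffer size.
    let mut reader = BufReader::with_capacity(64, source);
    let mut one = [0u8; 1];
    reader.read_exact(&mut one)?;
    // The source has been drained by more than the one byte requested;
    // the surplus now lives only in the reader's private buffer.
    Ok(reader.get_ref().position())
}
```

Anything that later reads from the underlying source directly, such as a child process inheriting the descriptor, starts after the buffered bytes and never sees them.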

agausmann · 2022-07-27T19:14:36Z

In my application I need to forward I/O from a serial port to stdio, and the serial port is an interactive console where the remote device echoes the characters back if they should be printed on the terminal, and controls various other aspects of the terminal. (Come to think of it, it doesn't have to be a serial port; another similar example with this behavior is an SSH client.)

Line-buffered stdin simply does not work for this case; action needs to be taken for every input byte, not for every line.

tbu- · 2022-10-05T19:42:40Z

Having line-buffered stdout is also an unnecessary performance overhead when outputting binary data. First the code scans for newlines to only partially write the bytes to stdout and copy the remaining bytes into a buffer, only for me to flush the remaining bytes out.

One syscall too many for every write that I do and unnecessary buffer copying.
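The splitting can be made visible with a LineWriter over a recording sink. A minimal sketch, not the actual Stdout code path: `SizeLog` is a hypothetical type that stands in for the file descriptor, with each recorded write corresponding to what would be a syscall.

```rust
use std::io::{self, LineWriter, Write};

/// Hypothetical sink (not part of std) that records the size of every write.
struct SizeLog {
    sizes: Vec<usize>,
}

impl Write for SizeLog {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.sizes.push(buf.len());
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes one chunk containing a newline in the middle and returns the
/// sizes of the writes that reached the sink.
fn line_writer_split() -> io::Result<Vec<usize>> {
    let mut writer = LineWriter::new(SizeLog { sizes: Vec::new() });
    // Everything up to the last newline is written out; the tail is buffered.
    writer.write_all(b"ab\ncd")?;
    // The explicit flush pushes the buffered tail out as a separate write.
    writer.flush()?;
    Ok(writer.get_ref().sizes.clone())
}
```

A single logical write therefore reaches the sink in more than one piece whenever the data happens to contain a newline byte, which is arbitrary for binary output.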

SUPERCILEX · 2022-12-16T23:24:47Z

Created a proposal for this: rust-lang/libs-team#148

WieeRd · 2023-11-08T18:12:48Z

If anyone else came across this issue while looking for a workaround, this is what I'm using right now.

use std::{fs::File, io};

#[cfg(unix)]
pub fn stdout_raw() -> File {
    use std::os::fd::{AsRawFd, FromRawFd};

    let stdout = io::stdout();
    let raw_fd = stdout.as_raw_fd(); // or just use `1`
    unsafe { File::from_raw_fd(raw_fd) }
}

#[cfg(windows)]
pub fn stdout_raw() -> File {
    use std::os::windows::io::{AsRawHandle, FromRawHandle};

    let stdout = io::stdout();
    let raw_handle = stdout.as_raw_handle();
    unsafe { File::from_raw_handle(raw_handle) }
}

#[cfg(test)]
mod test {
    use super::*;
    use std::io::{self, Write};

    #[test]
    fn rawwwwww() -> io::Result<()> {
        let mut stdout = stdout_raw();
        stdout.write_all(b"This stdout... is RAWWWWWW!!!")?;

        Ok(())
    }
}

jgoerzen · 2023-11-08T18:27:48Z

In Filespooler, for Unix, I am using:

use std::fs::File;
use std::io::stdin;
use std::os::fd::{AsRawFd, FromRawFd};

/// stdin is buffered by default on Rust, and you can't change it.  Since
/// we need to precisely read the header before letting a subprocess
/// handle the payload in stdin-process, we have to use trickery.  Bleh.
///
/// Take care not to let this value drop before spawning, because that would
/// cause stdin to be closed.
///
/// See: https://github.com/rust-lang/rust/issues/97855
pub fn get_unbuffered_stdin() -> File {
    let s = stdin();
    let locked = s.lock();
    let file = unsafe { File::from_raw_fd(locked.as_raw_fd()) };
    file
}

FWIW

tbu- · 2023-11-08T19:02:25Z

AFAICT, both of these will close the actual stdin/stdout FD once the returned file gets dropped. This is probably not intended.

Additionally, the solution posted by @jgoerzen looks like it's trying to lock stdin, but the lock is immediately dropped after the function is returned.
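One way to avoid closing the real descriptor is to duplicate it first, so the returned File owns its own copy. A minimal Unix-only sketch using `AsFd` and `BorrowedFd::try_clone_to_owned`; the function names are hypothetical, and this is an alternative to the snippets above rather than a fix endorsed in this thread.

```rust
use std::fs::File;
use std::io::{self, Write};

/// Returns a File backed by a duplicate of the stdout descriptor.
/// Dropping it closes only the duplicate, not stdout itself.
#[cfg(unix)]
pub fn stdout_dup() -> io::Result<File> {
    use std::os::fd::AsFd;

    let owned = io::stdout().as_fd().try_clone_to_owned()?;
    Ok(File::from(owned))
}

/// Drops one duplicate, then shows that stdout can still be duplicated
/// and written to afterwards.
#[cfg(unix)]
pub fn write_drop_write() -> io::Result<()> {
    let mut first = stdout_dup()?;
    first.write_all(b"first\n")?;
    drop(first);

    let mut second = stdout_dup()?;
    second.write_all(b"second\n")
}
```

Writes through the duplicate still bypass the buffer and lock of `io::stdout()`, so mixing the two can interleave or reorder output.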

WieeRd · 2023-11-10T16:33:38Z

I definitely wouldn't recommend using my snippet in any serious or larger code; it's just suggested as a quick and dirty workaround for a trivial but I/O-intensive scenario. In my case I was solving an algorithm problem when I encountered this.

jonas-schievink added the T-libs-api and C-feature-request labels Feb 9, 2019

jonhoo mentioned this issue Jan 6, 2021

Use BufWriter when STDOUT is not a TTY jonhoo/inferno#206

Merged

jgoerzen mentioned this issue Jun 8, 2022

process::Command data loss when using Stdio::inherit for stdin and stdin has been read from #97855

Open

SUPERCILEX mentioned this issue Dec 16, 2022

Expose raw std{in,out,err} rust-lang/libs-team#148

Open

NobodyXu mentioned this issue Dec 9, 2023

Project Stalled BartMassey/rust-nonstdio#1

Open
