-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize Seek::stream_len
impl for File
#125087
base: master
Are you sure you want to change the base?
Conversation
r? @ChrisDenton rustbot has assigned @ChrisDenton. Use |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
☔ The latest upstream changes (presumably #127360) made this pull request unmergeable. Please resolve the merge conflicts. |
It uses the file metadata on Unix with a fallback for files incorrectly reported as zero-sized. It uses `GetFileSizeEx` on Windows. This reduces the number of syscalls needed for determining the file size of an open file from 3 to 1.
Alas, this is insufficient. There are other filesystems with questionable
|
Amazing, I did not foresee this.
For this particular example, this doesn't look insufficient, since it doesn't regress anything: use std::fs::File;
use std::io;
use std::io::Seek as _;
use std::io::SeekFrom;
fn main() -> io::Result<()> {
let file = File::open("/sys/kernel/oops_count")?;
println!("{}", (&file).seek(SeekFrom::End(0))?);
Ok(())
} This outputs 4096 on my machine, the same as the
|
EDIT: Seems like I was incorrect. It uses some heuristic to
|
Ok, maybe seeking is nonsense on those files too and they're only meant to be used via But that still leaves FUSE drivers which can implement basically arbitrary behavior. And any approach distinguishing good from bad filesystems would require additional syscalls. |
What is Rust's platform support for bad FUSE drivers? Who decides which of the methods to trust ( I can kinda see not crashing the Rust program if a FUSE driver decides to return |
FUSE is just the canary in the coal mine. sysfs shows that in-kernel filesystems do strange things too. If we have 2 examples already then I assume there's some network filesystem will also do weird things when weird remote machines are involved. So I'd say the trust hierarchy when it comes to "how many bytes does this file contain is" read > seek > stat It's unfortunate that |
I had assumed
So after checking again, it seems that even for procfs |
Sorry, just catching up. Given the above, @the8472 are you happy that this is at least no worse than the status quo? I.e. if seek or stat is returning wrong or nonsense values then that's a problem in either case. |
I think it would be better to only do this for regular files. Non-regular ones have weird behavior in general. E.g. directories have a size according to Even with that restriction I think someone will still eventually run into issues around FUSE, network filesystems or weird filesystems. That said, stream_len is an unstable API and it's a convenience method, so we don't necessarily have to provide ironclad guarantees. Maybe we should just rephrase the documentation of
this would be better
I mean, it always applies to trait methods that trait impls can override them and can provide different behavior, so this isn't special. But perhaps it's better to not suggest that implementations will only ever behave exactly like that. |
It uses the file metadata on Unix with a fallback for files incorrectly reported as zero-sized. It uses
GetFileSizeEx
on Windows.This reduces the number of syscalls needed for determining the file size of an open file from 3 to 1.