Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'fs' can not read data if same FD is used for write and readFile #13572

Closed
thisconnect opened this issue Jun 9, 2017 · 20 comments
Closed

'fs' can not read data if same FD is used for write and readFile #13572

thisconnect opened this issue Jun 9, 2017 · 20 comments
Labels
fs Issues and PRs related to the fs subsystem / file system.

Comments

@thisconnect
Copy link

  • Version: v6.11.0
  • Platform: Darwin computer.local 16.6.0 Darwin Kernel Version 16.6.0: Fri Apr 14 16:21:16 PDT 2017; root:xnu-3789.60.24~6/RELEASE_X86_64 x86_64
  • Subsystem: fs

Description:

#3163 added file descriptor support to *File() functions (for example fs.readFile).

Issue: Writing then reading on the same FD does not read the data when using fs.readFile(fd). Only reading (witout writing) works as expected.

The demo opens a file descriptor, writes data to it and tries to read the data using the same FD and fs.readFile. The demo code uses sync methods for simplicity, non-sync version has the same behavior.

const fs = require('fs');
const path = require('path');

const filepath = path.resolve(__dirname, './test-write-and-read.txt');

// 'w+' - Open file for reading and writing.
// The file is created (if it does not exist)
// or truncated (if it exists).
const fd = fs.openSync(filepath, 'w+');

// 'r+' - Open file for reading and writing.
// An exception occurs if the file does not exist.
// const fd = fs.openSync(filepath, 'r+');

// fs.writeFileSync(fd, 'foo bar'); // no issue
fs.writeSync(fd, 'FOOBAR'); // ISSUE: readFileSync(fd) will not read content

// fs.fsyncSync(fd); // <- seems to have no impact

// always reads content (no issue)
const buf = Buffer.alloc(20);
fs.readSync(fd, buf, 0, 20, 0);
console.log('read with fd:', buf.toString())

// always reads content (no issue)
console.log('readFileSync with filepath:', fs.readFileSync(filepath, { encoding: 'utf8' }));

// ISSUE: does not read content if previously writeSync(fd, 'foo bar')
console.log('readFileSync with fd:', fs.readFileSync(fd, { encoding: 'utf8' }));

fs.closeSync(fd);

Expected output:

read with fd: FOOBAR
readFileSync with filepath: FOOBAR
readFileSync with fd: FOOBAR

Actual output:

read with fd: FOOBAR
readFileSync with filepath: FOOBAR
readFileSync with fd:

(last line is missing the file content FOOBAR)

Note:

  • fs.fsync seems to have no impact. Assuming this could help and flush the data to the filesystem
  • readFileSync with fd does successfully read data if previously no data has been written to the FD before and the test file contains content.
@vsemozhetbyt vsemozhetbyt added the fs Issues and PRs related to the fs subsystem / file system. label Jun 9, 2017
@vsemozhetbyt
Copy link
Contributor

Can reproduce with Node.js 8.1.0 on Windows 7 x64.

This is something like #7099, but not valid for readFile* cases: these methods should not read from the current fd position.

cc @nodejs/fs

@vsemozhetbyt
Copy link
Contributor

vsemozhetbyt commented Jun 9, 2017

@thisconnect Flushing the data to the filesystem seems unneeded, the file size is correctly changed before reading, if I get this right:

const fs = require('fs');
const path = require('path');

const filepath = path.resolve(__dirname, './test-write-and-read.txt');
const fd = fs.openSync(filepath, 'w+');

console.log(fs.statSync(filepath).size);
fs.writeSync(fd, 'FOOBAR');
console.log(fs.statSync(filepath).size);
console.log(fs.readFileSync(fd, { encoding: 'utf8' }));
0
6
  // empty string

@vsemozhetbyt
Copy link
Contributor

vsemozhetbyt commented Jun 9, 2017

  1. The cause for readFileSync() seems to be this: in this call chain the reading position is undefined:

node/lib/fs.js

Line 576 in 47b9772

bytesRead = tryReadSync(fd, isUserFd, buffer, pos, size - pos);

node/lib/fs.js

Line 542 in 47b9772

bytesRead = fs.readSync(fd, buffer, pos, len);

node/lib/fs.js

Line 680 in 47b9772

return binding.read(fd, buffer, offset, length, position);

Which may fall into this note from the doc:

If position is null, data will be read from the current file position.

Does the binding function equal undefined and null here?

  1. For the fs.readFile() the same happens in this call chain:

node/lib/fs.js

Line 348 in 47b9772

req.oncomplete = readFileAfterOpen;

node/lib/fs.js

Line 426 in 47b9772

req.oncomplete = readFileAfterStat;

node/lib/fs.js

Line 458 in 47b9772

context.read();

node/lib/fs.js

Line 396 in 47b9772

binding.read(this.fd, buffer, offset, length, -1, req);

Does the binding function equal -1 and null here?

It seems we should transfer 0 as a position in these both cases.

@vsemozhetbyt vsemozhetbyt added the confirmed-bug Issues with confirmed bugs. label Jun 9, 2017
@seishun
Copy link
Contributor

seishun commented Jun 9, 2017

This happens because when called with an fd, readFile[Sync] reads from the current file position. I tried to document this behavior in #10853, but it got stalled and then closed.

@vsemozhetbyt
Copy link
Contributor

vsemozhetbyt commented Jun 9, 2017

It seems this better be fixed than documented. readFile[Sync] are supposed to return the full file content according to method names, and the current behavior seems confusing.

@seishun
Copy link
Contributor

seishun commented Jun 9, 2017

@vsemozhetbyt Please read the discussion in that PR. TL;DR some fd are unseekable, so the behavior would be inconsistent between seekable and unseekable fds.

@vsemozhetbyt
Copy link
Contributor

Seems to be a duplicate of the #9671

@seishun seishun removed the confirmed-bug Issues with confirmed bugs. label Jun 9, 2017
@MylesBorins
Copy link
Contributor

MylesBorins commented Jun 9, 2017

@thisconnect are you seeing the same failure in 6.10.x?

edit: this appears to fail on 6.10.x and 6.9.x... so it is the same older bug not a new regression

@thisconnect
Copy link
Author

thisconnect commented Jun 9, 2017

@MylesBorins yes same on 6.10.x

@thisconnect
Copy link
Author

@thisconnect Flushing the data to the filesystem seems unneeded, the file size is correctly changed before reading, if I get this right

@vsemozhetbyt flushing was just an attempt to rule out this being a possible problem. I too get correct sizes with stats.

@thisconnect
Copy link
Author

After reading why to not use fs.exists (see RECOMMENDED section of fs.exists in https://nodejs.org/api/fs.html#fs_fs_exists_path_callback) wrote my own small(ish) module, that nobody uses :) . See fildes (https://github.com/thisconnect/fildes), it promises some fs methods, and takes care about creating missing dir structure if flag is 'w', 'w+', 'a' or 'a+'. I was exited that readFile and writeFile support FD, that landed in Node 5.x and started using it with Node 6.x LTS.

I wasn't aware about the fact that fd have something like a position or could open non-seekable files (sockets?). After reading through the issues libuv/libuv#1357 , #7099, #9671, #9699, #10809, #10853

As a naive Node.js user, this is what I suspect:

  • fs.readFile should read the whole file, else seems like a bug
  • fs.readFileSync(path) should equal fs.readFileSync(fs.openSync(path))
  • FD's having something like a "current file position" is not a feature nor documented (afaik)
  • if FD would have postion support, there would be other API needed to work with it
  • fs.read should be used if one does not want the whole file
  • fs.read and fs.write chaning the position is unexpected
  • fs.readFile non-seekable should either read everything it can or throw an error

@sam-github
Copy link
Contributor

sam-github commented Jun 10, 2017

I pretty much agree with the expectations, FYI, here is some more info:

  • fs.readFile should read the whole file, else seems like a bug
    • agreed
  • fs.readFileSync(path) should equal fs.readFileSync(fs.openSync(path))
    • that simple use does, but if you fd=openSync() and readFile twice from the same fd, the second read will get no data :-(
  • FD's having something like a "current file position" is not a feature nor documented (afaik)
    • Its fundamental to POSIX fd handling, and as such, not documented in the node fs docs, but it is indeed a feature. fds come in two flavours, seekable and not-seekable. files are seekable, because random access is possible, sockets and pipes are not seekable, you can only read the next data in the stream.
  • if FD would have postion support, there would be other API needed to work with it
  • fs.read should be used if one does not want the whole file
    • agreed
  • fs.read and fs.write chaning the position is unexpected
    • well, undocumented, yes, but expected (EDIT: at least by those familiar with the underlying C calls) because its what happens on Windows, Linux,and all other runtimes that use fds to read and write files (all of them?)
  • fs.readFile non-seekable should either read everything it can or throw an error
    • agreed

@thisconnect
Copy link
Author

@sam-github Thanks for the info!

  • fs.readFileSync(path) should equal fs.readFileSync(fs.openSync(path))
    • that simple use does, but if you fd=openSync() and readFile twice from the same fd, the second read will get no data :-(

Just for the record. This behavior is unexpected for me and feels like a bug

@seishun
Copy link
Contributor

seishun commented Jun 11, 2017

Should we create an issue to discuss what we're going to do with readFile and fds (document it or change the behavior)?

@sam-github
Copy link
Contributor

#10853 already dicusses, it could be reopened.

@thisconnect
Copy link
Author

thisconnect commented Jun 12, 2017

IMO The expected cases should be fixed and behavior changed according to the current documentation. readFile should read whole file. writeFile should write whole file.

The weird edge cases with non-seekable fd's (files without a beginning) should be documented (assuming they already kind of work as currently documented).

#10853 is already too long and hard to follow. I suggest to open a new issue with a proposal like this (if you agree) that people can agree or decline. To me this is a bug and should be fixed in 6.x too

EDIT: read @sam-github's summary #13572 (comment)

@Trott
Copy link
Member

Trott commented Apr 30, 2018

@nodejs/fs What should happen with this? Close it? Doc change? Code change? Discussion issue? Something else?

@bnoordhuis
Copy link
Member

Below are some observations but to summarize: it doesn't matter what code changes we make, we'll still need to document the gotchas.

  1. The example is about sync methods; an async version (or a mix of sync and async) would be rife with race conditions because concurrent file operations would observe the file position moving around.

  2. fs.readFile() could be taught to use positional reads (fs.readFileSync() already does) but positional reads aren't atomic on Windows so (1) rears its ugly head again.

  3. fs.appendFile() and fs.writeFile() in append-only mode are inherently non-positional and therefore also suffer from (1)...

  4. ...as does any file that isn't seekable for whatever reason.

@bzoz Would it be possible to teach uv_fs_read() and uv_fs_write() to do positional ops without affecting the file position? ReadFile() and WriteFile() take an OVERLAPPED that lets you specify the offset, don't they?

@bzoz
Copy link
Contributor

bzoz commented May 2, 2018

@bnoordhuis positional ReadFile/WriteFile moves the file pointer, thus we have a workaround with restoring the position in libuv/libuv#1357

We can make readFile use positional reads, using the same fd in multiple async operations is asking for trouble anyway. The current behavior however however has been in Node for quite some time, it is possible that we would break something with such change.

OTOH, maybe we should just add fs.seek?

@apapirovski
Copy link
Member

New discussion is in #23433 so I'm going to close this out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fs Issues and PRs related to the fs subsystem / file system.
Projects
None yet
Development

No branches or pull requests

9 participants