Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API proposal: Expose FileOptions.NoBuffering flag and support O_DIRECT on Linux #27408

Open
buybackoff opened this issue Sep 16, 2018 · 18 comments
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.IO
Milestone

Comments

@buybackoff
Copy link

Proposed API

namespace System.IO
{
    public enum FileOptions
    {
        ...
        NoBuffering = 0x20000000
    }
}

This flag is hidden but supported on Windows by value already (see https://github.com/dotnet/coreclr/blob/75e62c545ac5c7195bf846b47e28c4f27736d64c/src/System.Private.CoreLib/shared/System/IO/FileStream.cs#L204).

This proposal is to make it part of the public surface and implement it on Unix by mapping it to O_DIRECT.


Continuation of https://github.com/dotnet/coreclr/issues/19229 and dotnet/coreclr#19232.

Expose FileOptions.NoBuffering as a part of API. It is already unofficially supported on Windows. Just uncomment this line and add O_DIRECT support for Linux.

PR for the changes.

@jkotas
Copy link
Member

jkotas commented Sep 17, 2018

cc @JeremyKuhne @pjanotti

@stephentoub
Copy link
Member

Per my comments at dotnet/corefx#32314 (comment), I'm hesitant to see this exposed, as it looks like it would be very difficult to use this safely on unix.

@GSPP
Copy link

GSPP commented Sep 25, 2018

It is unfortunate that there is apparently no way to expose this safely due to the unstable unix semantics.

What should happen now if client code passes 0x20000000 on unix? On Windows, this is passed through to the OS although it is undocumented. Should it be silently ignored on unix or should an exception be thrown to warn that the requested mode is not available? I don't know the answer but a conscious decision should be made.

@buybackoff
Copy link
Author

buybackoff commented Sep 25, 2018 via email

@danmoseley
Copy link
Member

What is the proposal now -- to expose for Windows, throw on Unix?
@JeremyKuhne thoughts on whether the value for Windows outweighs the potential to break code when it is later attempted on Unix.

@JeremyKuhne
Copy link
Member

We should probably expose it and throw for Unix (until we can come up with a safe implementation). I think that is better than having a "hidden" option for Windows. @pjanotti, what are your thoughts?

@pjanotti
Copy link
Contributor

pjanotti commented Oct 2, 2018

My take is that we should be minimizing the differences between OSes. The fact that many exist doesn't mean that we should be adding more without a clear tangible benefit. My vote will be to not do anything until Unix story is sorted out.

@JeremyKuhne
Copy link
Member

@pjanotti To be clear- this particular one exists already, you just have to know it is there and that we check for the value. Given that both you and @stephentoub are hesitant we should probably park this pending more feedback and/or a solid proposal for how to tackle the Unix end.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@JeremyKuhne JeremyKuhne removed the untriaged New issue has not been triaged by the area owner label Mar 3, 2020
@adamsitnik
Copy link
Member

As @buybackoff mentioned, it's currently supported on Windows, but not on Linux (the (FileOptions)0x20000000 /* NoBuffering */ check):

if (options != FileOptions.None && (options & ~(FileOptions.WriteThrough | FileOptions.Asynchronous | FileOptions.RandomAccess | FileOptions.DeleteOnClose | FileOptions.SequentialScan | FileOptions.Encrypted | (FileOptions)0x20000000 /* NoBuffering */)) != 0)

We can easily expose it (by adding an enum value) and O_DIRECT to OpenFlags:

private static Interop.Sys.OpenFlags PreOpenConfigurationFromOptions(FileMode mode, FileAccess access, FileShare share, FileOptions options)

The problem is that both on Linux and Windows it would still be hard to use as the undelrying OS mechanisms have non-trivial requirements:

  • Windows (source):
    • File access sizes, including the optional file offset in the OVERLAPPED structure, if specified, must be for a number of bytes that is an integer multiple of the volume sector size. For example, if the sector size is 512 bytes, an application can request reads and writes of 512, 1,024, 1,536, or 2,048 bytes, but not of 335, 981, or 7,171 bytes.
    • File access buffer addresses for read and write operations should be physical sector-aligned, which means aligned on addresses in memory that are integer multiples of the volume's physical sector size. Depending on the disk, this requirement may not be enforced.
  • Linux (source):
    • The data buffer being transferred must be aligned on a memory boundary that is a multiple of the block size.
    • The file or device offset at which data transfer commences must be a multiple of the block size.
    • The length of the data to be transferred must be a multiple of the block size.

Problems:

  • We would need to implement a cross-platform way of getting block size. (doable, but not trivial)
  • As of today, .NET does not expose an API that allows for the allocation of aligned memory:
  • We can allocate a buffer and get the address of the first byte that is X-bytes aligned, but this is not a great user experience and hard to use with async IO (can't use Span for async methods, would need to create a Memory out of IntPtr and I am not sure if we provide any built-in way for doing that).
  • ReadByte and WriteByte would have to take this under consideration (currently they use stackalloc byte[1]). This would require Provide Marshal.Alloc api that accepts alignment argument #33244 for async IO

@Maoni0
Copy link
Member

Maoni0 commented May 4, 2021

for .NET 6.0 I would vote to improve the performance of Marshal.Alloc. it seems like since the current perf is so bad, it wouldn't be very hard to improve it and it would benefit everyone who's currently using AllocateHGlobal.

we are currently adding the regions support. when that's ready (which will not be this release) it would make adding this much easier in GC. we can allocate regions (which are much smaller units of memory) for objects for each different alignment and collect them not just in gen2 GCs. we will also be implementing another parameter in the AllocateArray API which is the generation (again, much easier with regions). does this sound reasonable?

@jkotas
Copy link
Member

jkotas commented May 4, 2021

Marshal.Alloc. it seems like since the current perf is so bad, it wouldn't be very hard to improve it and it would benefit everyone who's currently using AllocateHGlobal.

I do not think that this is a correct conclusion.

On Unix, the performance of Marshal.Alloc* is about as good as it can be.

On Windows, there is some fixed overhead due to Windows APIs that we historically used to implement it. This overhead is typically negliable if you are using Marshal.Alloc* to allocate buffers (compared to the cost of doing something useful with the buffers). Fixing this overhead would be a breaking change (discussed e.g. in #29051).

@Maoni0
Copy link
Member

Maoni0 commented May 4, 2021

you are saying the perf is not blocker on windows so I'm proposing to simply use the Marshal.Alloc as suggested in #33244 for .NET 6.

@msedi
Copy link

msedi commented Sep 1, 2021

Isn't the problem when introducing the NoBuffering that certain conditions need to be met (memory-alignment, sector-wise reading, etc. https://docs.microsoft.com/en-us/windows/win32/fileio/file-buffering). I would love to see this but hiding the internals could lead to the problem that you don't get the full benefit out of it?

@HighPerfDotNet
Copy link

@adamsitnik asked me to comment here, my apologies for saying the obvious stuff and not contributing meaningfully to the technical discussion here.

NoBuffering is a very powerful but still obscure option that most people don't know about and thus sadly leave perf on the table or even struggle to get things running without it (like it happened to us 15 years ago), we use it as default for our file IO with rare exceptions when we know that Windows caching would actually be beneficial.

I don't know Linux well enough, but it would be very handy to have a degree of support on it also, ideally main FileStream object should deal with all the alignment issues (.NET 6 now supports aligned allocations) for the user not to care (we created our own wrapper for that) and supporting existing code 0x20000000 to mean the same there.

To drive adoption of this feature we really need a good blog post in .NET Blog demonstrating its value, especially for network writes in scenarios where amount of data written exceeds available memory on target box and there is some software running there that also got plenty of memory allocated - having it swapped to disk ins one hell of a killer even on NVMes and even disabling swap can still lead to exhaustion of resources.

@ericmutta
Copy link
Contributor

@adamsitnik following up on the discussion at #86836 here are some ideas towards supporting direct I/O on Windows and Linux.

So the first order of business is to create an abstraction over NativeMemory for allocating aligned memory blocks:

using System.Runtime.InteropServices;

// represents a block of memory that is aligned to a specified byte boundary.
public unsafe class AlignedMemoryBlock: IDisposable
{
  // size of the block in bytes.
  public int Size { get; }

  // alignment of the block in bytes.
  public int Alignment { get; }

  // pointer to the block's bytes in memory.
  public IntPtr Pointer { get; }

  // cast the block's bytes to a span of the given type.
  public Span<T> AsSpan<T>() where T : unmanaged => new(Pointer.ToPointer(), Size / sizeof(T));

  // create a new aligned memory block with the given size and alignment.
  public AlignedMemoryBlock(int ArgSize, int ArgAlignment) {
    // store the size and alignment (handy for unit testing).
    (Size, Alignment) = (ArgSize, ArgAlignment);

    // allocate the block.
    Pointer = new IntPtr(NativeMemory.AlignedAlloc((nuint)ArgSize, (nuint)ArgAlignment));
  }

  // free the block.
  public void Dispose() => NativeMemory.AlignedFree(Pointer.ToPointer());
}

// represents a block of memory that is aligned to the sector size of the platform.
public class SectorAlignedMemoryBlock: AlignedMemoryBlock
{
  // get the sector size of the platform (here we just assume it's 512 bytes).
  public static int GetPlatformSectorSize() => 512;

  // create a new sector-aligned memory block with the given size.
  public SectorAlignedMemoryBlock(int ArgSize) : base(ArgSize, GetPlatformSectorSize()) { }
}

Note that the above AlignedMemoryBlock and SectorAlignedMemoryBlock are decoupled from this particular use case and the AlignedMemoryBlock class can be used in any scenario which needs aligned memory blocks for whatever reason 👍

Next up we need a way to open a file with the right attributes for direct I/O using SectorAlignedMemoryBlock instances. This means (for my current needs at least):

  • For Windows: FILE_FLAG_WRITE_THROUGH | FILE_FLAG_NO_BUFFERING | FILE_FLAG_OVERLAPPED
  • For Linux: O_DIRECT | O_SYNC

The FileStream class already has 500 million constructors, so I don't think its a good idea to overload it yet again for this use case. Since I am already using the handy RandomAccess class for this kinda work, and we use File.OpenHandle to get instances of SafeFileHandle, I propose File.OpenBlockHandle and a SafeBlockFileHandle:

// handle to files opened for direct block I/O operations.
public sealed class SafeBlockFileHandle: IDisposable
{
  /// <inheritdoc />
  public void Dispose() { }
}

public partial class File
{
  // open a file for direct block I/O operations. It should have most of the same arguments as the current
  // File.OpenHandle() function but enforce the direct I/O flags (e.g. O_DIRECT | O_SYNC on Linux).
  public static SafeBlockFileHandle OpenBlockHandle(string ArgPath) => new SafeBlockFileHandle();
}

Finally we need functions to read and write blocks directly to/from disk. I think we can just add these as overloads to the RandomAccess class (though it may be cleaner to create a RandomBlockAccess class with similar set of functions but dedicated to this use case):

public partial class RandomAccess
{
  // reads a block of bytes from the given file handle at the given offset into the given buffer.
  public static ValueTask<int> ReadAsync(SafeBlockFileHandle ArgHandle, SectorAlignedMemoryBlock ArgBuffer, long ArgOffset) {
    throw new NotImplementedException();
  }

  // writes a block of bytes to the given file handle at the given offset from the given buffer.
  public static ValueTask WriteAsync(SafeBlockFileHandle ArgHandle, SectorAlignedMemoryBlock ArgBuffer, long ArgOffset) {
    throw new NotImplementedException();
  }
}

With all the pieces in place, usage will look something like this:

    // open a file for direct block I/O operations.
    using var file = File.OpenBlockHandle(@"c:\my-binary-file.dat");

    // allocate a 4KB block.
    using var block = new SectorAlignedMemoryBlock(4096);

    // read data from the file into the block.
    await RandomAccess.ReadAsync(file, block, ArgOffset: 0);

Remaining issues/questions to address:

  • This is a rough draft written in one sitting. It will need more work (e.g. adding cancellation tokens to the read/write functions, making sure an AlignedMemoryBlock can't be disposed multiple times, etc).
  • I added AlignedMemoryBlock.AsSpan<T>() thinking it would give me access to the memory block as a Span<T> but those can't be used in async methods and I couldn't find a way to create a Memory<T> from a void* or an IntPtr. This part needs attention from an expert on the subject because as it stands above, you can create an AlignedMemoryBlock but then you have to deal with the IntPtr when trying to read from or write to it.
  • When doing direct I/O, is there a constraint on the offset parameter as well? If so, the functions will need to validate that argument before doing I/O.
  • The RandomAccess class supports scatter/gather I/O. I was interested in this for my use case but found it too troublesome to handle block-level locking/unlocking when you have to do it for multiple blocks as they are being gathered or scattered. But I suspect others may find use for it so we may need overloads that take SectorAlignedMemoryBlock buffers.
  • People doing this kinda thing usually know that they need to pool instances of classes like SectorAlignedMemoryBlock and the current ArrayPool<T> won't be any help here because it can allocate blocks larger than requested. It might be useful to provide another pooling implementation suitable for fixed-size blocks like the ones for this use case (I know about ObjectPool<T> but that creates new objects when the pool is exhausted which is not ideal if you want to enforce an upper bound on pool size).
  • The RandomAccess class helps with multi-threaded access to a file, but we still don't have good synchronization primitives for use with async methods. I have had to create a custom AsyncReaderWriterLock for my use case (thanks @stephentoub for those excellent articles by the way!) and I wish could use something already built in.

Last but not least: I salute the .NET team for the amazing work you all do. This is just one small set of APIs and it's taken me several hours to type all this up, I can't imagine the insane amount of hours required to churn out all those new APIs we get with each version of .NET!

@adamsitnik
Copy link
Member

I propose File.OpenBlockHandle and a SafeBlockFileHandle

This could be avoided by extending FileOptions with NoBuffering as initially suggested by @buybackoff

So to open the handle you would need to pass FileOptions.NoBuffering to File.OpenHandle

create an abstraction over NativeMemory for allocating aligned memory blocks

We already have a helper that derives from MemoryManager for that:

internal sealed class SectorAlignedMemory<T> : MemoryManager<T>

But it's internal. But overall I agree that if we want to expose NoBuffering and allow our users for an easy way to do async IO we should provide such type so our users don't need to reinvent the wheel anytime they want to use it..

When doing direct I/O, is there a constraint on the offset parameter as well?

Yes, IIRC in needs to be a multiple of volume sector size: https://learn.microsoft.com/en-us/windows/win32/fileio/file-buffering#alignment-and-file-access-requirements

ArrayPool won't be any help here because it can allocate blocks larger than requested

You could always rent a bigger array, pin it so the address does not change when GC happens in the meantime, get the first aligned address of it and use only a slice of it by for example creating a Memory<byte> out of it (new Memory<byte>(pinnedArray, firstAlignedIndex, size))

@ericmutta thank you for your feedback!

@ericmutta
Copy link
Contributor

@adamsitnik: So to open the handle you would need to pass FileOptions.NoBuffering to File.OpenHandle

Thanks for following up! Some questions here: if I pass FileOptions.NoBuffering to File.OpenHandle will there be any validation done by the read/write functions that the size/alignment requirements for direct I/O are being met? If yes, will these be done for every single call to those functions?

My thinking behind suggesting ReadAsync(SafeBlockFileHandle, SectorAlignedMemoryBlock) was that the O_DIRECT requirements would be validated once in the relevant data types, then all the read/write functions have to do for validation is a ensure the parameters are not null. Also, while on the topic of validation, what do Windows and Linux do when the requirements are not met, do we just get an IOException or does the process crash?

I appreciate all the work you are doing on this and look forward to seeing how it evolves to the final result 🙏

@adamsitnik
Copy link
Member

@ericmutta the OS performs the validation for every call anyway. But it reports quite a generic error (invalid parameter) and then our job would be to check the alignment & offset and provide more detailed error message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.IO
Projects
None yet
Development

No branches or pull requests