[mono][wasm] Bundle assemblies as WebCIL (#79416)
Define a new container format for .NET assemblies that looks less like a Windows PE file. Use it for bundling assemblies in wasm projects.

* Implement WebCIL loader

  It will try to look for WebCIL formatted images instead of normal .dll files

* Checkpoint works on wasm sample; add design doc

* Push .dll->.webcil probing lower in the bundle logic

* Also convert satellite assemblies and implement satellite matching

* [wasm] don't leak .webcil image names to the debugger

   In particular this will make source and breakpoint URLs look like `dotnet://foo.dll/Foo.cs` which means that grabbing PDBs via source link will work, etc.

* Add PE DebugTableDirectory to webcil

   This is used to retrieve the PPDB data and/or the PDB checksum from an image.

   Refactor mono_has_pdb_checksum to support webcil in addition to PE images

* Implement a WebcilReader for BrowserDebugProxy like PEReader

  This needs some improvements:
   - add support for reading CodeView and EmbeddedPDB data
   - copy/paste less from the WebcilWriter task
   - copy/paste less from PEReader (will require moving WebcilReader to SRM)

* [debug] Match bundled pdbs if we're looking up .webcil files

  The pdbs are registered by wasm with a notional .dll filename. If the debugger does a lookup using a .webcil name instead, allow the match.

* Adjust debug directory entries when writing webcil files

   The PE COFF debug directory entries contain a 'pointer' field which is an offset from the start of the file.

   When writing the webcil file, the header is typically smaller than the headers of a PE file, so the offsets are wrong.  Adjust each offset by the difference between the data's position in the PE file and its position in the webcil file.

   We assume (and assert) the debug directory entries actually point at some PE COFF sections in the PE file (as opposed to somewhere past the end of the known PE data).

   When writing, we initially just copy all the sections directly, then seek to where the debug directory entries are, and overwrite them with updated entries that have the correct 'pointer'

* Fix bug in WebcilWriter

   Stream.CopyTo takes a buffer size, not the number of bytes to copy.

* bugfix: the debug directory is at pe_debug_rva not at the CLI header

* skip debug fixups if there's no debug directory

* WebcilReader: implement CodeView and Embedded PPDB support

* [WBT] Add UseWebcil option (default to true)

* rename WebcilWriter -> WebcilConverter [NFC]

* fixup AssemblyLoadedEventTest

* hack: no extension on assembly for breakpoint

* pass normal .dll name for MainAssemblyName in config

   let the runtime deal with it - bundle matching will resolve it to the .webcil file

* Wasm.Debugger.Tests: give CI 10 more minutes

* Add Microsoft.NET.WebAssembly.Webcil assembly project

   Mark it as shipping, but not shipping a nuget package.

   The idea is that it will be shipped along with the WasmAppBuilder msbuild task, and with the BrowserDebugProxy tool.

* Move WebcilConverter to Microsoft.NET.WebAssembly.Webcil

* Move WebcilReader to Microsoft.NET.WebAssembly.Webcil

   delete the duplicated utility classes

* make the webcil magic and version longer

* Code style improvements from review

* Improve some exception messages, when possible

* Suggestions from code review

* Add WasmEnableWebcil msbuild property.  Off by default

* Build non-wasm runtimes without .webcil support

* Run WBT twice: with and without webcil

   This is a total of 4 runs: with and without workloads x with and without webcil

* do the cartesian product correctly in msbuild

* also add webcil to template projects

* environment variable has to be non-null and "true"

   We set it to "false" sometimes

* Fix wasm work items

   They should be the same whether or not webcil is used.  Just the WorkloadItemPrefix should be used to change the name.

* Update src/libraries/sendtohelix-wasm.targets

* PInvokeTableGeneratorTests: don't try to use the net472 WasmAppBuilder

   Look for the default target framework subdirectory under the tasks directory in the runtime pack when trying to find the tasks dll. In particular don't try to load the net472 version on modern .NET

* PInvokeTableGeneratorTests: Add more diagnostic output if tasksDir is not found

* simplify prefix comparison in bundled_assembly_match

* WasmAppBuilder improve logging

   Just emit a single Normal importance message about webcil; details as Low importance.

* Add missing using

Co-authored-by: Ankit Jain <radical@gmail.com>
Co-authored-by: Larry Ewing <lewing@microsoft.com>
3 people committed Jan 21, 2023
1 parent 0aad863 commit 68d1b8f
Showing 41 changed files with 1,508 additions and 62 deletions.
111 changes: 111 additions & 0 deletions docs/design/mono/webcil.md
@@ -0,0 +1,111 @@
# WebCIL assembly format

## Version

This is version 0.0 of the Webcil format.

## Motivation

When deploying the .NET runtime to the browser using WebAssembly, we have received some reports from
customers that certain users are unable to use their apps because firewalls and anti-virus software
may prevent browsers from downloading or caching assemblies with a .DLL extension and PE contents.

This document defines Webcil, a new container format for ECMA-335 assemblies.
Webcil files use the `.webcil` extension and look less like Windows PE
files.


## Specification

As our starting point we take section II.25.1 "Structure of the
runtime file format" from ECMA-335 6th Edition.

| |
|--------|
| PE Headers |
| CLI Header |
| CLI Data |
| Native Image Sections |
| |



A Webcil file follows a similar structure


| |
|--------|
| Webcil Headers |
| CLI Header |
| CLI Data |
| |

## Webcil Headers

The Webcil headers consist of a Webcil header followed by a sequence of section headers.
(All multi-byte integers are in little endian format).

### Webcil Header

``` c
struct WebcilHeader {
    uint8_t id[4]; // 'W' 'b' 'I' 'L'
    // 4 bytes
    uint16_t version_major; // 0
    uint16_t version_minor; // 0
    // 8 bytes
    uint16_t coff_sections;
    uint16_t reserved0; // 0
    // 12 bytes

    uint32_t pe_cli_header_rva;
    uint32_t pe_cli_header_size;
    // 20 bytes

    uint32_t pe_debug_rva;
    uint32_t pe_debug_size;
    // 28 bytes
};
```

The Webcil header starts with the magic characters 'W' 'b' 'I' 'L' followed by the version in major
minor format (must be 0 and 0). A count of the section headers and two reserved bytes follow.

The next pairs of integers are a subset of the PE Header data directory specifying the RVA and size
of the CLI header, as well as the directory entry for the PE debug directory.
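
As an illustration of the layout above, the following is a minimal sketch of decoding and validating the header from a byte buffer. It is not the runtime's loader: the names `webcil_header_info`, `webcil_parse_header`, `read_u16_le` and `read_u32_le` are hypothetical, and the check that the section header table fits in the buffer is an assumption about what a careful reader would want, not something this document requires.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded view of the 28-byte Webcil header. */
typedef struct {
    uint16_t version_major;
    uint16_t version_minor;
    uint16_t coff_sections;
    uint32_t pe_cli_header_rva;
    uint32_t pe_cli_header_size;
    uint32_t pe_debug_rva;
    uint32_t pe_debug_size;
} webcil_header_info;

enum { WEBCIL_HEADER_SIZE = 28, WEBCIL_SECTION_HEADER_SIZE = 16 };

/* All multi-byte integers in the file are little endian. */
static uint16_t read_u16_le(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills *out when buf starts with a version 0.0 Webcil
 * header whose section header table fits in the buffer; returns 0 otherwise.
 * The reserved0 field (bytes 10..11) is skipped, not validated. */
int webcil_parse_header(const uint8_t *buf, size_t len, webcil_header_info *out) {
    if (len < WEBCIL_HEADER_SIZE)
        return 0;
    if (memcmp(buf, "WbIL", 4) != 0)
        return 0;

    out->version_major = read_u16_le(buf + 4);
    out->version_minor = read_u16_le(buf + 6);
    out->coff_sections = read_u16_le(buf + 8);
    out->pe_cli_header_rva = read_u32_le(buf + 12);
    out->pe_cli_header_size = read_u32_le(buf + 16);
    out->pe_debug_rva = read_u32_le(buf + 20);
    out->pe_debug_size = read_u32_le(buf + 24);

    if (out->version_major != 0 || out->version_minor != 0)
        return 0;

    /* Assumption: reject files too short to hold the section header table. */
    size_t table_size = (size_t)out->coff_sections * WEBCIL_SECTION_HEADER_SIZE;
    if (len - WEBCIL_HEADER_SIZE < table_size)
        return 0;
    return 1;
}
```

Reading the fields byte by byte avoids depending on host endianness or struct padding, which is why the sketch does not cast the buffer to `struct WebcilHeader`.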


### Section header table

Immediately following the Webcil header is a sequence (whose length is given by `coff_sections`
above) of section headers giving their virtual address and virtual size, as well as the offset in
the Webcil file and the size in the file. This is a subset of the PE section header that includes
enough information to correctly interpret the RVAs from the webcil header and from the .NET
metadata. Other information (such as the section names) is not included.

``` c
struct SectionHeader {
    uint32_t st_virtual_size;
    uint32_t st_virtual_address;
    uint32_t st_raw_data_size;
    uint32_t st_raw_data_ptr;
};
```
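
To show how the section headers are enough to interpret an RVA, here is a minimal sketch of mapping an RVA to an offset in the Webcil file. The names `webcil_section` and `webcil_rva_to_offset` are hypothetical, and the choice to resolve only RVAs that fall within a section's raw data (bytes actually present in the file) is an assumption of this sketch rather than a rule stated by the format.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors SectionHeader above, with fields already decoded from little endian. */
typedef struct {
    uint32_t st_virtual_size;
    uint32_t st_virtual_address;
    uint32_t st_raw_data_size;
    uint32_t st_raw_data_ptr;
} webcil_section;

/* Returns 1 and stores the file offset in *offset when some section's raw
 * data covers rva; returns 0 when no section does. */
int webcil_rva_to_offset(const webcil_section *sections, size_t count,
                         uint32_t rva, uint32_t *offset) {
    for (size_t i = 0; i < count; i++) {
        const webcil_section *s = &sections[i];
        if (rva < s->st_virtual_address)
            continue;
        uint32_t delta = rva - s->st_virtual_address;
        /* Assumption: only bytes stored in the file can be resolved. */
        if (delta >= s->st_raw_data_size)
            continue;
        /* Guard against wrap-around on a malformed file. */
        if (delta > UINT32_MAX - s->st_raw_data_ptr)
            return 0;
        *offset = s->st_raw_data_ptr + delta;
        return 1;
    }
    return 0;
}
```

A reader would typically call this with `pe_cli_header_rva` from the Webcil header to find the CLI header, and again for each RVA found in the metadata.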

### Sections

Immediately following the section table are the sections. These are copied verbatim from the PE file.
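
Copying the sections verbatim moves them to different file offsets, because the Webcil headers are typically smaller than the PE headers. The commit message notes one consequence: the 'pointer' field of each PE COFF debug directory entry is an offset from the start of the file, so the converter rewrites it. The following is a minimal sketch of that adjustment, assuming (as the commit does) that the pointer lands inside a known section. The names `section_mapping` and `remap_file_pointer` are hypothetical and do not come from the converter's source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of where one section's raw data sits in each file. */
typedef struct {
    uint32_t raw_data_size;   /* size of the section data in the file */
    uint32_t pe_raw_data_ptr; /* file offset of the section in the PE file */
    uint32_t wc_raw_data_ptr; /* file offset of the section in the Webcil file */
} section_mapping;

/* Translates a PE file offset into the matching Webcil file offset.
 * Returns 1 on success, or 0 when pe_ptr is outside every section
 * (the case the converter asserts does not happen). */
int remap_file_pointer(const section_mapping *maps, size_t count,
                       uint32_t pe_ptr, uint32_t *wc_ptr) {
    for (size_t i = 0; i < count; i++) {
        const section_mapping *m = &maps[i];
        if (pe_ptr < m->pe_raw_data_ptr)
            continue;
        uint32_t delta = pe_ptr - m->pe_raw_data_ptr;
        if (delta >= m->raw_data_size)
            continue;
        /* Same distance into the section, new section start. */
        *wc_ptr = m->wc_raw_data_ptr + delta;
        return 1;
    }
    return 0;
}
```

The offset within the section is unchanged by the copy, so only the section's starting position needs to be swapped.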

## Rationale

The intention is to include only the information necessary for the runtime to locate the metadata
root, and to resolve the RVA references in the metadata (for locating data declarations and method IL).

A goal is for the files not to be executable by .NET Framework.

Unlike PE files, mixing native and managed code is not a goal.

Lossless conversion from Webcil back to PE is not intended to be supported. The format is being
documented in order to support diagnostic tooling and utilities such as decompilers, disassemblers,
file identification utilities, dependency analyzers, etc.

@@ -0,0 +1,11 @@
<Project>
  <Import Project="..\Directory.Build.props" />
  <PropertyGroup>
    <IsShipping>true</IsShipping>
    <!-- this assembly should not produce a public package, rather it's meant to be shipped by the
         WasmAppBuilder task and the BrowserDebugProxy -->
    <IsShippingPackage>false</IsShippingPackage>
    <!-- This isn't a public API in a public package, don't ship documentation xml in the nugets that consume this assembly -->
    <GenerateDocumentationFile>false</GenerateDocumentationFile>
  </PropertyGroup>
</Project>
@@ -0,0 +1,7 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

namespace System.Runtime.CompilerServices
{
    internal sealed class IsExternalInit { }
}
@@ -0,0 +1,24 @@
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFrameworks>$(NetCoreAppToolCurrent);$(NetFrameworkToolCurrent)</TargetFrameworks>
    <Description>Abstractions for modifying .NET webcil binary images</Description>
    <IncludeSymbols>true</IncludeSymbols>
    <Serviceable>true</Serviceable>
    <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
    <CLSCompliant>false</CLSCompliant>
  </PropertyGroup>

  <ItemGroup>
    <!-- we need to keep the version of System.Reflection.Metadata in sync with dotnet/msbuild and dotnet/sdk -->
    <PackageReference Include="System.Reflection.Metadata" Version="$(SystemReflectionMetadataVersion)" />
    <PackageReference Include="System.Collections.Immutable" Version="$(SystemCollectionsImmutableVersion)" />
  </ItemGroup>

  <ItemGroup>
    <Compile Include="Webcil\**\*.cs" />
  </ItemGroup>

  <ItemGroup Condition="'$(TargetFrameworkIdentifier)' == '.NETFramework'">
    <Compile Include="Common\IsExternalInit.cs" />
  </ItemGroup>
</Project>
@@ -0,0 +1,10 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

namespace Microsoft.NET.WebAssembly.Webcil.Internal;

internal static unsafe class Constants
{
    public const int WC_VERSION_MAJOR = 0;
    public const int WC_VERSION_MINOR = 0;
}
@@ -0,0 +1,33 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.Runtime.InteropServices;

namespace Microsoft.NET.WebAssembly.Webcil;

/// <summary>
/// The header of a WebCIL file.
/// </summary>
///
/// <remarks>
/// The header is a subset of the PE, COFF and CLI headers that are needed by the mono runtime to load managed assemblies.
/// </remarks>
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public unsafe struct WebcilHeader
{
    public fixed byte id[4]; // 'W' 'b' 'I' 'L'
    // 4 bytes
    public ushort version_major; // 0
    public ushort version_minor; // 0
    // 8 bytes

    public ushort coff_sections;
    public ushort reserved0; // 0
    // 12 bytes
    public uint pe_cli_header_rva;
    public uint pe_cli_header_size;
    // 20 bytes
    public uint pe_debug_rva;
    public uint pe_debug_size;
    // 28 bytes
}