SharedTree Engineering Plan #8662

DLehenbauer · 2022-01-05T21:18:44Z

Engineering Plan

January marks the beginning of our engineering work on the converged tree DDS, which builds on Autodesk's pioneering work with the Property Tree DDS and incorporates novel approaches for preserving user intention contributed by the Microsoft Whiteboard team.

Following the Fluid naming conventions, we will adopt the name "SharedTree" for the converged tree DDS going forward (not to be confused with the experimental DDS currently residing in the '/experimental/tree' directory).

Engineering will happen in the main FluidFramework branch. Since we are on a productization path, we will forgo the use of the '/experimental/' directory. However, we will continue to publish packages under "@fluid-experimental/" until we are ready to release.

January

ChangeSet Format (Design)

The changeset format is the most critical aspect of the SharedTree design. Changesets are used as both the wire format for communicating changes between clients as well as the storage format for summarizing the tree to disk (for storage, the changeset is squashed such that it is a snapshot of the current tree state.)

Our highest priority for January is finalizing the changeset format. @yann-achard-MS has identified an algebra of intention preserving data change operations and is working on a proposal to express these operations in an efficient PropertySet-like changeset format.

In general, the design of the changeset format is challenging due to the many constraints:

The format must be compact to minimize storage and network costs.
The format must be spatially ordered (i.e., organized hierarchically matching the tree structure) for efficient filtering for clients with partial views.
However, the format must also preserve temporal order (i.e., the order in which changes were applied) when necessary for intention preservation.
The format must support the ability to be split into (roughly) fixed-size chunks to support efficient random/partial access.
The format must encode enough information so that it may be split and partially applied to pre-existing chunk boundaries.

@yann-achard-MS, along with support from @ruiterr and @PaulKwMicrosoft, will continue to finalize the changeset format in January. Our key result is to unblock implementation of rebasing and encoding of changsets at the beginning of February.

In-Memory Representation (Design)

The client's In-Memory representation is a key integration point that enables end-to-end testing. Lingering concerns about runtime costs when operating on high density data, such as mesh vertices, have made it difficult to completely close on aspects of the in-memory representation.

To rectify this, @CraigMacomber is joining @PaulKwMicrosoft to drill into our remaining pathological cases. (See Craig's document on tree compression here.) The key result is to unblock implementation of the in-memory representation by the end of January so we can integrate other deliverables as they arrive.

Queued Work

The following work items are also candidates for January but deemed less important than the design items called out above (currently they are unfunded)

Op Algebra Implementation

The op algebra used to represent data changes is sufficiently stable that we should begin the initial implementation and verification soon to bolster our confidence in the correctness of the design (intention preservation, convergence, parity with PropertySet and Whiteboard DDSes, etc.)

Changeset Encode/Decode and Chunking

A significant portion of the work to encode/decode and chunk changesets is independent of the final changeset format and could begin now (and would be informative to the changeset design work with respect to size tradeoffs.)

Of particular interest are the algorithms for splitting changesets into (roughly) fixed-sized blobs and splitting incoming changesets along pre-existing chunk/view boundaries. In both cases, it is desirable to perform the intersection operations without decoding and re-encoding either the source or destination chunk.

DLehenbauer · 2022-04-04T14:49:22Z

Content has been merged into Epic #8273

ghost added the triage label Jan 5, 2022

DLehenbauer self-assigned this Jan 5, 2022

yann-achard-MS mentioned this issue Jan 13, 2022

Add docs for prospective SharedTree Changeset format #8741

Merged

tylerbutler added the area: dds Issues related to distributed data structures label Jan 21, 2022

tylerbutler added this to the January 2022 milestone Jan 21, 2022

ghost removed the triage label Jan 21, 2022

tylerbutler added the design-required This issue requires design thought label Jan 21, 2022

anthony-murphy modified the milestones: January 2022, February 2022 Feb 1, 2022

anthony-murphy modified the milestones: February 2022, March 2022 Mar 1, 2022

DLehenbauer closed this as completed Apr 4, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SharedTree Engineering Plan #8662

SharedTree Engineering Plan #8662

DLehenbauer commented Jan 5, 2022 •

edited

Loading

DLehenbauer commented Apr 4, 2022

SharedTree Engineering Plan #8662

SharedTree Engineering Plan #8662

Comments

DLehenbauer commented Jan 5, 2022 • edited Loading

Engineering Plan

January

ChangeSet Format (Design)

In-Memory Representation (Design)

Queued Work

Op Algebra Implementation

Changeset Encode/Decode and Chunking

DLehenbauer commented Apr 4, 2022

DLehenbauer commented Jan 5, 2022 •

edited

Loading