Cosmos: rework expressions around scalar/structural types #33999

Open
roji opened this issue Jun 15, 2024 · 1 comment

roji commented Jun 15, 2024

The same expression (syntax-wise) in Cosmos can return either a scalar or a structural type. For example, x.Foo can return a scalar (e.g. 1) or a structural type (a JSON object representing an entity type). This is different from relational, where an expression type generally returns either a scalar (e.g. ColumnExpression) or a structural type (e.g. TableExpression, which represents a set of structural types). An exception to this in relational is probably JSON column access.

Our general SQL expression tree design mirrors this: SqlExpression represents scalars (has a TypeMapping), non-SqlExpressions represent structural types. In Cosmos (but also in some places in relational) things are different: the same expression can typically return both a scalar and a structural type. For example, a JSON property access (x.Foo) can return a scalar or structural type.

As a result, after #33998 the Cosmos query pipeline has an explosion of expression types which represent the same syntax, but return different things. For example, ScalarAccessExpression represents x.Foo where Foo is a scalar, ObjectAccessExpression represents the same where Foo is a structural type, and ObjectArrayAccessExpression represents the same where Foo is an array of structural types.
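
To make the duplication concrete, here is a minimal sketch in Python (not EF Core code). The class names mirror the three expression types named above, but the fields and the `to_sql` method are hypothetical and exist only for illustration. All three nodes print the same SQL fragment and differ only in what they claim to return.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalarAccessExpression:
    """x.Foo where Foo is a scalar (hypothetical shape, not the real class)."""
    source_alias: str
    property_name: str

    def to_sql(self) -> str:
        return f'{self.source_alias}["{self.property_name}"]'


@dataclass(frozen=True)
class ObjectAccessExpression:
    """x.Foo where Foo is a structural type (hypothetical shape)."""
    source_alias: str
    property_name: str
    structural_type: str  # stand-in for entity type / navigation metadata

    def to_sql(self) -> str:
        return f'{self.source_alias}["{self.property_name}"]'


@dataclass(frozen=True)
class ObjectArrayAccessExpression:
    """x.Foo where Foo is an array of structural types (hypothetical shape)."""
    source_alias: str
    property_name: str
    element_type: str  # stand-in for the element's structural type

    def to_sql(self) -> str:
        return f'{self.source_alias}["{self.property_name}"]'


nodes = [
    ScalarAccessExpression("x", "Foo"),
    ObjectAccessExpression("x", "Foo", "FooEntity"),
    ObjectArrayAccessExpression("x", "Foo", "FooEntity"),
]

# Three distinct node types, one distinct piece of syntax.
rendered = {node.to_sql() for node in nodes}
node_types = {type(node).__name__ for node in nodes}
```

The point of the sketch is only that `rendered` collapses to a single string while `node_types` does not.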

This is a bad state of affairs; I considered unifying by e.g. having a dummy type mapping for structural types (allowing SqlExpression to represent structural types as well), but the expression split goes into shaper generation as well. So for now I continued along the current path of duplicating expression types. A more modern shaper generation architecture (and, I think, one more aligned with relational) wouldn't require this separation at the expression level, but would instead recognize structural types via StructuralTypeShaperExpression; I went a bit in this direction, but more work is needed.

Once our shaper no longer looks at the server/syntax expressions to determine structural type information (but uses StructuralTypeShaperExpression instead), we should be able to remove all structural type/navigation information from those syntax expressions, and unify them. This would make a much clearer separation between server (query) and client (shaper).
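
A rough sketch of that separation, again in Python rather than actual EF Core code: a single syntax node carries no type information, and the client-side shaper nodes wrapping it carry the scalar or structural type. `PropertyAccess`, `ScalarShaper`, `StructuralTypeShaper` and `describe` are hypothetical names; the last shaper is only loosely modeled on StructuralTypeShaperExpression.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class PropertyAccess:
    """Server-side syntax node: no scalar/structural knowledge at all."""
    source_alias: str
    property_name: str

    def to_sql(self) -> str:
        return f'{self.source_alias}["{self.property_name}"]'


@dataclass(frozen=True)
class ScalarShaper:
    """Client-side node: reads the projected value as a scalar."""
    value: PropertyAccess
    clr_type: type


@dataclass(frozen=True)
class StructuralTypeShaper:
    """Client-side node: materializes the projected value as a structural type."""
    value: PropertyAccess
    structural_type: str  # stand-in for entity/complex type metadata


def describe(shaper: Union[ScalarShaper, StructuralTypeShaper]) -> str:
    # Only the shaper is inspected to decide how the result is materialized;
    # the wrapped syntax node is used solely to print SQL.
    if isinstance(shaper, StructuralTypeShaper):
        kind = f"structural:{shaper.structural_type}"
    else:
        kind = f"scalar:{shaper.clr_type.__name__}"
    return f"{shaper.value.to_sql()} -> {kind}"


scalar = ScalarShaper(PropertyAccess("x", "Foo"), int)
structural = StructuralTypeShaper(PropertyAccess("x", "Foo"), "FooEntity")
```

In this arrangement both shapers wrap the same kind of syntax node, so the query side needs only one expression type per piece of syntax.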

roji commented Jun 19, 2024

Note: for 9.0, consider at least unifying all expression pairs which the shaper doesn't care about. We could have a CosmosExpression which can have either a SqlExpression (when it represents a scalar) or an ITypeBase (when it represents a structural type - though actually having the type isn't really needed at the moment).
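
One possible reading of that idea, sketched in Python rather than C#: a wrapper that holds exactly one of a scalar expression or a structural type. The field names, the validation, and the string stand-ins for a type mapping and for ITypeBase are all assumptions made for illustration, not a proposed API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SqlExpression:
    """Stand-in for a scalar expression; type_mapping is a placeholder string."""
    sql: str
    type_mapping: str


@dataclass(frozen=True)
class CosmosExpression:
    """Holds either a scalar SqlExpression or a structural type, never both."""
    scalar: Optional[SqlExpression] = None
    structural_type: Optional[str] = None  # stand-in for ITypeBase

    def __post_init__(self) -> None:
        # Exactly one of the two payloads must be present.
        if (self.scalar is None) == (self.structural_type is None):
            raise ValueError("set exactly one of scalar or structural_type")

    @property
    def is_scalar(self) -> bool:
        return self.scalar is not None


scalar_access = CosmosExpression(scalar=SqlExpression('x["Foo"]', "int"))
object_access = CosmosExpression(structural_type="FooEntity")
```

Code that doesn't care about the distinction can pass `CosmosExpression` around unchanged, and only code that does care checks `is_scalar`.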

ajcvickers removed their assignment Aug 31, 2024