Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Type system ordering of: object interfaces, directive arguments, input object fields, enum values #1063

Open
wants to merge 5 commits into
base: maintain-order
Choose a base branch
from

Conversation

benjie
Copy link
Member

@benjie benjie commented Nov 21, 2023

UPDATE 2024-04-05: Following last night's WG meeting, this PR has been split into 3 parts, and is now stacked behind these PRs:


As raised by @cdaringe in #1062, enum values don't dictate an explicit order. On scanning through the spec in more detail, it turns out this is true of a few other things too.

Everything in SDL is implicitly ordered (since it's a textual representation, one thing after another) and everything in introspection is implicitly ordered (because it's represented via lists). E.g. in introspection, enumValues is a list, and a list is inherently ordered.

I feel it's an unwritten rule that GraphQL introspection should be stable (i.e. introspect the exact same schema twice with the exact same introspection query and the results should be the same). Thus, there should be an order (dictated by the schema designer), and I'd like to make that more explicit.

I researched the current status, and I think we can start to fix this with the few minor edits I made to the spec in this PR, in particular:

  1. changing from using the word set (which is generally perceived as unordered) to the word list (which is always ordered),
  2. specifying list for things that are currently ambiguous.

Generally this was achieved by copying text from similar things, e.g. the directive arguments copied from field arguments; input object fields copied from object fields.

I know that @IvanGoncharov has been very careful in graphql-js to ensure that ordering is stable, I believe he ensures that introspection -> SDL -> introspection always results in the same results.

Status before this PR (emphasis mine)

Object fields: ordered ✅

GraphQL Objects represent a list of named fields, each of which yield a value of a specific type.

Object field arguments: ordered ✅

Object field arguments are defined as a list of all possible argument names and their expected input types

Object interfaces: not declared? 😞

An object type may declare that it implements one or more unique interfaces.

(I couldn't find anything in Section 3 declaring set/list.)

Input object fields: a "set" ❌

A GraphQL Input Object defines a set of input fields

Enum values: a "set" ❌

However Enum types describe the set of possible values

Directive arguments: not declared? 😞

(I couldn't find anything in Section 3 declaring set/list.)

Copy link

netlify bot commented Nov 21, 2023

Deploy Preview for graphql-spec-draft ready!

Name Link
🔨 Latest commit d7d9e37
🔍 Latest deploy log https://app.netlify.com/sites/graphql-spec-draft/deploys/660fbffa67af420008052209
😎 Deploy Preview https://deploy-preview-1063--graphql-spec-draft.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@@ -1473,7 +1474,7 @@ EnumValuesDefinition : { EnumValueDefinition+ }
EnumValueDefinition : Description? EnumValue Directives[Const]?

GraphQL Enum types, like Scalar types, also represent leaf values in a GraphQL
type system. However Enum types describe the set of possible values.
type system. However Enum types describe the list of possible values.
Copy link
Member Author

@benjie benjie Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered "However Enum types describe as a list the set of possible values." but it didn't seem worth the extra words.

@cdaringe
Copy link

The prior presence of “set” to me unambiguously implied unordered. Your switch to “list” i think is likely? a weaker implicit association with ordering for serialized content via introspection.

This proposal definitely works (happy to support it), but i think we needed top level or inline definitions/assertions about what terms (eg list) apply ordered or not.

@benjie
Copy link
Member Author

benjie commented Nov 22, 2023

The prior presence of “set” to me unambiguously implied unordered.

I don't think the term "set" itself implies ordered or unordered, "set" is independent of ordering. For example in JavaScript a Set is ordered, in Java a TreeSet is ordered but a HashSet is not, in mathematics you can have ordered and partially ordered sets. Whether the sets were ordered or not was not stated.

i think we needed top level or inline definitions/assertions about what terms (eg list) apply ordered or not.

I agree. The term "list" itself is not as unamiguously "ordered" as I at first thought; for example here's an excerpt from an A-level Computing syllabus in the UK:

A list is a number of items in an ordered or unordered structure.
-- https://en.wikibooks.org/wiki/A-level_Computing/AQA/Problem_Solving,_Programming,_Operating_Systems,_Databases_and_Networking/Programming_Concepts/Lists

Similarly HTML has both <ol> and <ul> tags for ordered and unordered lists respectively.


An alternative fix would be to leave the existing wording as-is (but maybe slightly edited for consistency) and instead indicate that all lists and sets in the spec are ordered unless otherwise stated. We currently leave the order to be "implementation defined". Essentially, we are asking the question "are the two following schemas the same?":

enum E {
  A
  Z
}
type Query {
  a: E
}
enum E {
  Z
  A
}
type Query {
  a: E
}

I'd argue that they are not, and that the order of the enum is significant. A change in that order would trigger a change in the printed SDL that you might version control, a change in any generated documentation, a change in browsing it in GraphiQL, and could have many other consequences elsewhere (for example code generation). We should definitely be more explicit about this.

@benjie
Copy link
Member Author

benjie commented Nov 22, 2023

I've added a note to the beginning of the spec 👍

spec/GraphQL.md Outdated
**Lists Are Ordered**

Unless otherwise stated (for example, "an unordered list"), any mention of the
term "list" in this document indicates an ordered collection.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost feels like we need to differentiate between syntax order and semantic (~execution?) order. In the syntax, I think I always want determined order. At execution, sometimes the order is unimportant like for input object literals.
Don't want to bikeshed this more than needed though. This PR is a net improvement to clarity so I'm all for it 👍 .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The coercion of input objects is explicitly unordered: https://spec.graphql.org/draft/#sec-Input-Objects.Input-Coercion; the order of a literal is arguably important though, e.g. {f(o:{a:1,b:2})} and {f(o:{b:2,a:1})} are concretely different documents even though the coerced interpretation of the o argument is equivalent.

spec/GraphQL.md Outdated
**Lists Are Ordered**

Unless otherwise stated (for example, "an unordered list"), any mention of the
term "list" in this document indicates an ordered collection.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 . Thanks--this catch-all I think does a lot of important work 😃

@leebyron
Copy link
Collaborator

leebyron commented Jan 4, 2024

Following up on this after the discussion in the WG meeting today, my suggestions:

  1. Add an appendix section describing the data structure terms we use more specifically along with a note about preserving observable order for semantically unordered data when possible.

  2. In the introductory section on conformance, add in references to "data collections" alongside "algorithms" where relevant. What we say there about allowing differences in implementation as long as outcomes are observably correct remains true for what data structures implementors choose to use.

  3. Add in a non-normative note in the introspection section about how order "should" be preserved for the unordered sets when possible (but remove redefining set to list)

This was a bit of a riff, so edit away:

## A.6 Data Collections

This specification describes the semantic properties of data collections 
using types like "list", "set" and "map". These describe observable data 
collections such as the result of applying a grammar and the inputs and 
outputs of algorithms. They also describe unobservable data collections 
such as temporary data internal to an algorithm. Each data collection 
type defines the operations available, and whether values unique or ordered.

**List**

The term "list" describes a sequence of values which may not be unique. 
A list is ordered unless explicitly stated otherwise (as an "unordered 
list"). For clarity the term "ordered list" may be used when an order 
is semantically important.

**Set**

The term "set" describes a unique collection of values, where each value 
is considered a "member" of that set. A set is unordered unless explicitly 
stated otherwise (as an "ordered set"). For clarity the term "unordered set" 
may be used when the lack of an order is semantically important.

**Map**

The term "map" describes a collection of "entry" key and value pairs, where 
the set of keys across all entries is unique but the values across all 
entries may repeat. A map is unordered unless explicitly stated otherwise 
(as an "ordered map"). For clarity the term "unordered map" may be used 
when the lack of an order is semantically important.

Note: To improve legibility, when possible implementations should preserve
observable order for unordered data collections. For example, if an applied 
grammar to an input string results in an unordered set, serializing that 
set (to a string or other observable output) should produce the same order 
found in the original input string.

@benjie
Copy link
Member Author

benjie commented Mar 27, 2024

@leebyron I've changed list, set and map into definitions and have also changed the final non-normative note into a normative statement since it uses the word should (equivalent to we recommend, which is true - we do recommend that) and thus does not add a requirement; but otherwise I've not much changed your section A.6 suggestion other than the final sentence.

I've also attempted to implement your other suggestions, and have then gone through and corrected the types throughout section 3, for example an object doesn't define a list of fields but more an ordered set of fields...

... except, is it really a map where the field names are the key? I'll add this to the WG next month.

Copy link
Member

@michaelstaib michaelstaib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This clarification is nice!

@leebyron leebyron added 📄 Draft (RFC 2) RFC Stage 2 (See CONTRIBUTING.md) and removed 💭 Strawman (RFC 0) RFC Stage 0 (See CONTRIBUTING.md) labels Apr 4, 2024
@benjie benjie changed the base branch from main to maintain-order April 5, 2024 09:10
@benjie
Copy link
Member Author

benjie commented Apr 5, 2024

types which implement this Interface are guaranteed to implement those fields.
Whenever a field claims it will return an Interface type, it will return a valid
implementing Object type during execution.
An `Interface` defines an ordered set of fields; `Object` types and other
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO Interface/Object is not really an ordered set? Maybe it is? It feels a little bit annoying for maintainers to feel like they must maintain order (does the order of fields in the object need to match the order in the interface)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, IMO every "set" we have is in fact an ordered set, so specifying ordered set vs. just set is more, not less, confusing.

Am I wrong and is there in fact any unordered set? Or should we just use set and use unordered set when we don't expect you to be able to preserve order?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A set in set theory is unordered, and calling a programming data structure a set outside of this spec generally implies it to be unordered. Even if it wouldn’t be exactly wrong for the spec to only say in one place "all sets in this document are implicitly ordered" it’s not great spec-writing. Specifying "ordered set" explicitly every time makes readers less likely to make a wrong assumption.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SimonSapin

A set in set theory is unordered

Veering a little off topic, but that's not technically true, more accurate might be to say that pure set theory is "order agnostic" - the concept of order is entirely absent from the definition of a set. Typically set theory and order theory go hand in hand, and a set can be ordered, unordered, or even partially ordered.

That said, I agree with your conclusion that explicit > implicit.


@mjmahone

IMO Interface/Object is not really an ordered set? Maybe it is? It feels a little bit annoying for maintainers to feel like they must maintain order (does the order of fields in the object need to match the order in the interface)?

Great catch. IMO an interface defines an ordered set of fields (order is significant, and must be reflected in introspection); however we should state that implementations can implement this set of fields in any order - I've added a commit 7170d82 to address this.

@@ -920,7 +921,8 @@ of rules must be adhered to by every Object type in a GraphQL schema.
returns {true}.
4. If argument type is Non-Null and a default value is not defined:
1. The `@deprecated` directive must not be applied to this argument.
3. An object type may declare that it implements one or more unique interfaces.
3. An object type may declare that it implements a set of one or more unique
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this also be an ordered set of interfaces?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep; great catch - 602d319

@@ -1347,7 +1350,7 @@ UnionMemberTypes :
- UnionMemberTypes | NamedType
- = `|`? NamedType

GraphQL Unions represent an object that could be one of a list of GraphQL Object
GraphQL Unions represent an object that could be one of a set of GraphQL Object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not ordered?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree ce8c732

Comment on lines 304 to +305
Scalars and Enums form the leaves in response trees; the intermediate levels are
`Object` types, which define a set of fields, where each field is another type
in the system, allowing the definition of arbitrary type hierarchies.
`Object` types, which define an ordered set of fields, where each field is

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn’t objects (and interfaces, and input objects) have an ordered map of fields? The field names are unique, not the field definitions as a whole. foo: Int, foo: String would be a set of two different fields that happen to share a name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something I raised in the WG actually; but that change is beyond the scope of this PR.

types which implement this Interface are guaranteed to implement those fields.
Whenever a field claims it will return an Interface type, it will return a valid
implementing Object type during execution.
An `Interface` defines an ordered set of fields; `Object` types and other

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A set in set theory is unordered, and calling a programming data structure a set outside of this spec generally implies it to be unordered. Even if it wouldn’t be exactly wrong for the spec to only say in one place "all sets in this document are implicitly ordered" it’s not great spec-writing. Specifying "ordered set" explicitly every time makes readers less likely to make a wrong assumption.

@@ -920,7 +921,8 @@ of rules must be adhered to by every Object type in a GraphQL schema.
returns {true}.
4. If argument type is Non-Null and a default value is not defined:
1. The `@deprecated` directive must not be applied to this argument.
3. An object type may declare that it implements one or more unique interfaces.
3. An object type may declare that it implements a set of one or more unique

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented interfaces are semantically a set but is duplication in the type definition ignored or an error? For example, is type T implements I & I equivalent to type T implements I or does it make a schema invalid?

https://spec.graphql.org/draft/#sec-Objects.Type-Validation

An object type may declare that it implements one or more unique interfaces.

This suggest that duplication is invalid, but it could be a bit clearer with wording closer to that used for fields (“The field must have a unique name”):

An object type may declare that it implements one or more interfaces. Declared interface names must be unique.

Similarly in https://spec.graphql.org/draft/#sec-Interfaces.Type-Validation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about a change in the wording:

-3. An object type may declare that it implements an ordered set of one or more
-   unique interfaces.
+3. An object type may declare that it implements one or more unique interfaces,
+   these interfaces form an ordered set.

Personally I think that the wording already covers this ("one or more unique interfaces" implies each can only be specified once, and then "an ordered set of " makes it into an ordered set afterwards); but perhaps this more awkward wording is more precise?

Comment on lines +987 to 988
arguments are defined as an ordered set of all possible argument names and their
expected input types.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to fields (comment above): rather than a set, shouldn’t field definitions (and directive definitions) have an ordered map of argument names to argument definitions?

And the values of this map are more than expected types: an argument definition may also include a default value and directives)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here can be interpreted as "(an ordered set of all possible argument names) (and their expected input types)" so the set is fine; but I hate ambiguity and would rather define it as a map too. I think that's currently out of scope for this PR though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
📄 Draft (RFC 2) RFC Stage 2 (See CONTRIBUTING.md)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants