Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add a way to reorder elements during encoding #2722

Draft
wants to merge 2 commits into
base: dev
Choose a base branch
from

Conversation

Chuckame
Copy link
Contributor

Closes #2668

Copy link
Contributor

@pdvrieze pdvrieze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good starting point, but some issues, including a lack of proper support for nested structs (or a nested struct to be wrapped in a value class).

private val mapElementIndex: (SerialDescriptor, Int) -> Int,
) : CompositeEncoder {
// Each time we encode a field, if the next expected schema field index is not the good one, it is buffered until it's the time to encode it
private var bufferedCalls = Array<BufferedCall?>(structureDescriptor.elementsCount) { null }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cache means you need to create an instance at least per structure descriptor, but more conveniently for every instance. In such case beginStructure can just become private val delegateEncoder: CompositeEncoder.

Although it may also make sense to have this retained (and used for subStructures, although in such case the element index would be an important parameter)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the first part, I suppose you were talking about the supplier that should just be the encoder (no supplier), and I agree.

But for the second part, I don't understand, the buffer is just for the corresponding descriptor, that should not be shared with sub-structure: we are reordering one level of a given descriptor, but not the sub structures.

Comment on lines 215 to 235
override fun beginStructure(descriptor: SerialDescriptor): CompositeEncoder {
invalidCall("beginStructure")
}

override fun encodeInline(descriptor: SerialDescriptor): Encoder {
invalidCall("encodeInline")
}

private fun invalidCall(methodName: String): Nothing {
// This method is normally called by encodeSerializableValue or encodeNullableSerializableValue that is buffered, so we should never go here during buffering as it will be delegated to the concrete CompositeEncoder
throw UnsupportedOperationException("$methodName should not be called when reordering fields, as it should be done by encodeSerializableValue or encodeNullableSerializableValue")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect. It is valid to have an inline/value class have a struct member. That would just result in a nested reordering encoder (delay-created with the result of calling beginStructure).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The beginStructure is called by the serializer from inside the encodeSerializableValue, so it should never be called directly AFAIK 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that you have an encoder on which beginStructure is called. In the implementation it will create/call the reordering composite encoder (and return it), and perhaps also to the delegate encoder (to be passed to the reordering encoder).

serializer: SerializationStrategy<T>,
value: T
) {
bufferEncoding(descriptor, index) { encodeSerializableElement(descriptor, index, serializer, value) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect. You will need a wrapping encoder so Encoder.beginStructure will work correctly. To make that work you also need a wrapping SerializationStrategy that captures the delegate encoder (for the wrapping) unless you add a way to determine the encoder in another way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why it may not work. The serializer is not going to be called during the buffering step. FYI I already tested with a data class with a value class in a value class containing a list, and another value class containing a data class, and beginStructure is called only inside the encodeSerializableValue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say we have a structure: { a: { b: "foo", c: 42}, d: true } then property a is encoded by the outer serializer (o) invoking encodeSerializableElement(descriptor, 0, innerSerializer, value.a). This is then implemented by getting an Encoder instance and calling innerSerializer.serialize(encoder, value). This InnerSerializer.serialize function then calls beginStructure (for the inner struct) which returns another CompositeEncoder.

If you implement this by delegating encodeSerializableElement to the non-reordered encoder it gets to decide what CompositeEncoder is used for the inner struct. If you interject it is possible for the ordered composite encoder to override it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But as you said, it's calling encodeSerializableElement so it won't reach the innerSerializer during buffering, but only during the next step when encoding with the CompositeEncoder delegate.

Comment on lines +129 to +141
override fun encodeInlineElement(descriptor: SerialDescriptor, index: Int): Encoder {
return BufferingInlineEncoder(descriptor, index)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the implementation of this should be equivalent to:
thisEncoder. encodeSerializableElement(MyData. serializer. descriptor, 0, MyInt. serializer(), myInt)

Including the support of nested structs (nested structs always go through encodeSerializableElement)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is working like this after reordering, or maybe I misunderstood

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about:

@Serializable data class Outer(val a: Middle, val b: Int)

@Serializable value class Middle(val i: Inner)

@Serializable value class Inner(val c: String, d: Boolean)

The way I read it, this at least can be implemented using encodeInlineElement(d, 0) on the composite encoder for outer. Which would then mean that the inner value is serialized by calling encodeSerializableValue() on that (BufferingInlineEncoder) encoder.

Copy link
Contributor Author

@Chuckame Chuckame Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test has been successful with your example, so I still don't get your point https://github.com/Kotlin/kotlinx.serialization/pull/2722/files#r1648164815

the inner value is serialized by calling encodeSerializableValue() on that (BufferingInlineEncoder) encoder

No, here are the steps:

  • encodeInlineElement(d, 0) -> returns BufferingInlineEncoder with descriptor of Middle and index 0, but does not buffer anything for the moment (this is the role of BufferingInlineEncoder)
  • BufferingInlineEncoder.encodeSerializableValue is called, so nothing more is done, no nested call is done even for Inner thanks to bufferEncoding buffering compositeEncoderDelegate.encodeInlineElement(descriptor, elementIndex).encodeSerializableValue(serializer, value)
  • then when endStructure is called, replays the elements encoding, so the Inner will be reached out only at this moment

@Chuckame
Copy link
Contributor Author

First, I did not cover everything as I wanted first to agree on the main idea of buffering the calls. It seems that it's ok for you 😄

Secondly, if it's ok for you, I'll focus on the buffering itself, and then refine the public API and namings. I've did the current work quickly to show you the idea.

To finish, thanks for the quick answer 🚀

I'll make some tests to test the different use cases

@pdvrieze
Copy link
Contributor

@Chuckame The buffering is basically correct (I've implemented this for XML for quite some time now)

@Chuckame
Copy link
Contributor Author

Chuckame commented Jul 3, 2024

@pdvrieze I've tried with all the possible combinations, and value classes serializers are not directly using encodeInlineElement but encodeSerializableElement that is buffered, which I think it's a bug where encodeInlineElement is never called (see #2738 freshly created).

Let's imagine that this bug doesn't exist, and encodeInlineElement is called as expected, beginStructure call on the inline's encoder is explicitly forbidden or discouraged as said in the docs:

    /**
     * [...]
     *
     * Calling [Encoder.beginStructure] on returned instance leads to an unspecified behavior and, in general, is prohibited.
     */
    public fun encodeInline(descriptor: SerialDescriptor): Encoder

So I would say that this use case shouldn't be handled to follow the docs, as this beginStructure call should be done by the structure's serializer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants