Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix line ending bug #90

Closed
wants to merge 11 commits into from
Closed

Conversation

kekavc24
Copy link
Contributor

@kekavc24 kekavc24 commented Jul 1, 2024

I debugged #89 further while submitting #91 and noted several issues that are closely coupled:

  • Trailing line-break only applied to YamlList or YamlMap and never a YamlScalar
  • Dangling comments and line-breaks left behind if a comment spans multiple lines.

This PR:

  1. Applies a line break after each YamlNode encoded within a block.
  2. Fixes Literal and folded strings cannot handle "\n \n" at the end of a string #89.
  3. Introduces a normalizeEncodeBlock function that dynamically prunes the additional line break if:
  4. Introduces a skipAndExtractComments function which greedily skips any comments and whitespace belonging to the YamlNode being replaced.

Further changes made in #93


  • I’ve reviewed the contributor guide and applied the relevant portions to this PR.
Contribution guidelines:

Note that many Dart repos have a weekly cadence for reviewing PRs - please allow for some latency before initial review feedback.

@kekavc24
Copy link
Contributor Author

kekavc24 commented Jul 4, 2024

@jonasfj I have improved this PoC further by:

1. Lazily looking ahead for comments as that would bring the draft closer to the how YamlEditor currently does its edits.
2. Refactored the functions that edit YamlNode that are encoded as block elements to take advantage of the skipAndExtractComments which makes code a bit straight-forward and easy to understand (entirely subjective until you take a look) 😅

All in all, the existing and currently failing tests kinda vouch for this PoC but I'm ready to push its limit to prove its worth with stronger and more convoluted tests.

See #90, #93 & #94

@jonasfj
Copy link
Member

jonasfj commented Jul 4, 2024

I'm down for taking a look at this.
Currently, traveling and I fear this requires a bit of attention to understand. So I'll try to take a look next week.

I saw you had nice comments explaining some of the methods. That's awesome.

Feel free to give some review hints in the PR description :)

Also if any of these changes could be done independently, please do consider submitting smaller independent PRs. It's much easier to review, which often leads to faster turnaround time.

@kekavc24 kekavc24 marked this pull request as ready for review July 5, 2024 10:07
@kekavc24
Copy link
Contributor Author

kekavc24 commented Jul 5, 2024

I've tried to split the 22 commits over 3 PRs based on the changes I made. #90 -> #93 -> #94.

@@ -87,8 +87,10 @@ int deepHashCode(Object? value) {
}

/// Returns the [YamlNode] corresponding to the provided [key].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider updating the documentation.

Also feel free to submit this as a separate PR, and hint that you plan on using it in this PR. Then we can reduce the size of this PR a bit.

/// [YamlList] or a top-level [YamlScalar] of the [yaml] string provided.
///
/// [currentEndOffset] represents the end offset of [YamlScalar] or [YamlList]
/// or [YamlMap] being replaced, that is, `end + 1`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably better illustrated with an example.

/// ```dart
/// final yaml = 'my_key: "replaced-value"\n';
/// //                                    ^--- currentEndOffset points to the newline
/// fimal currentEndOffset = 24;
/// ```

ofcourse, I might have counted wrong here 🤣

Also does this mean that currentEndOffset can be greater than the length of the yaml document? (currentEndOffset == yaml.length, so yaml[currentEndOffset] might throw)

Example: k: v will have currentEndOffset: 4 (if I'm replacing v, right).

Copy link
Member

@jonasfj jonasfj Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, if I'm asking dumb questions here, but it might be worth making some examples.

And I think we should make sure we're doing it right, if we get he wrong invariants we'll have a lot of bugs to fix :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also does this mean that currentEndOffset can be greater than the length of the yaml document? (currentEndOffset == yaml.length, so yaml[currentEndOffset] might throw)

currentEndOffset shouldn't be greater than yaml.length when passed in as an argument. May be I should throw if it is?


Example: k: v will have currentEndOffset: 4 (if I'm replacing v, right).

I think 3 since it's always zero-based.


Sorry, if I'm asking dumb questions here, but it might be worth making some examples.

And I think we should make sure we're doing it right, if we get he wrong invariants we'll have a lot of bugs to fix :D

Sure, will add examples.

Comment on lines +276 to +277
/// Extracts comments for a node that is replaced within a [YamlMap] or
/// [YamlList] or a top-level [YamlScalar] of the [yaml] string provided.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it just always a for a YamlNode that is replaced?

Is it only if replaced, or if removed too? (is there a difference? An example might help).

If this can be: "a node within a YamlMap or YamlList, or a top-level YamlScalar".
Then isn't that the same as saying: YamlNode?
Can a YamlNode appear elsewhere in the document?

Just trying to be concise and understand if we listed out the options this way for a reason, or if we could write the equivalent thing more concisely.

Oh, I guess the node that currentEndOffset points to can't be a top-level YamlList or top-level YamlMap is that correct? (or am I reading it too formal).
If so, maybe it'd be shorter to write that. Or write both.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah. I'll change that. I wrote that method for #90 as a PoC before working on #94.

Comment on lines +285 to +286
/// If not sure of the next [YamlNode]'s [nextStartOffset] pass in null and
/// allow this function to handle that manually.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better we know why it's appropriate to use this, rather than: I wasn't sure.

If not passing it always works, then why would we ever pass it in (I don't think we should be overly concerned with performance here, if that's the concern).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote that function to be open ended and versatile in case someone wanted to tackle #71 or do something cool with YamlEditor if they have the next node in a YamlList or YamlMap

Comment on lines +283 to +284
/// May be null if the current [YamlNode] being replaced is the last node
/// in a [YamlScalar] or [YamlList] or if its the only top-level [YamlScalar].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, but does this make assumptions about block-mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a heads-up for anyone who may decide to call that function. It should be YamlMap or YamlList.

final edit = SourceEdit(
start, end - start, yamlEncodeBlock(valueNode, 0, lineEnding));

end = skipAndExtractCommentsInBlock(_yaml, end, null, lineEnding).$1;
Copy link
Member

@jonasfj jonasfj Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
end = skipAndExtractCommentsInBlock(_yaml, end, null, lineEnding).$1;
final (endOffset, _) = skipAndExtractCommentsInBlock(_yaml, end, null, lineEnding);
end = endOffset;

or similar, but let's avoid .$1 if we can (inside a simple helper method like getKeyNode using .$2 is probably not so bad, but elsewhere, especially in larger code blocks, I'd suggest avoiding it).

Or make skipAndExtractCommentsInBlock return a record with named properties. So we can do: skipAndExtractCommentsInBlock(...).endOffset;.

) {
var terminalNode = update;

if (terminalNode is! YamlScalar) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make a utility method called findTerminalNode(YamlNode node)

Then we can properly document what a terminal node is too.

Also if you do that, you won't need to break loop; you can just do a return instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

I'm beginning to think I shouldn't have split this PR to #93 and #94? All those changes address the title of this PR or simply build up to it.

Comment on lines +413 to +414
while (terminalNode is! YamlScalar) {
switch (terminalNode) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider something like while(true) { switch(terminalNode) { instead.

Also, how does this work? Don't the switch-statement have to be exhaustive?
I get that it can only be YamlList and YamlMap, when it's not YamlScalar.
But the compiler can't see that because YamlNode isn't a sealed class.

Maybe add a default case that throws an AssertionError. It should never happen, but if someone makes a new kind of YamlNode in the future, it's better than whatever else might happen.

Comment on lines +388 to +401
/// Normalizes an encoded [YamlNode] encoded as a string by pruning any
/// dangling line-breaks.
///
/// This function checks the last `YamlNode` of the [update] that is a
/// `YamlScalar` and removes any unwanted line-break within the
/// [updateAsString].
///
/// This is achieved by obtaining the chunk of the [yaml] that is after the
/// current node being replaced using its [nodeToReplaceEndOffset]. If:
/// 1. The chunk has any trailing line-break then the it is left untouched.
/// 2. The node being replaced with [update] is not the last node, then it
/// is left untouched.
/// 3. The terminal node in [update] is a `YamlScalar`, that is,
/// the last [YamlNode] within the [update] that is not a collection.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider adding some example.

Also why is it that we don't do this normalization when we create the updateAsString? Wouldn't that be more correct -- I don't know -- I probably don't understand all the context here, that's why I'm asking :D

Copy link
Contributor Author

@kekavc24 kekavc24 Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I’ll add an example.

I do that because yamlEncodeBlock has no context of which method called it or the current structure of the yaml document. It could be:

  1. updateInList or _appendToBlockList or _insertInBlockList for lists
  2. _replaceInBlockMap or _addToBlockMap for block maps
  3. Yaml.update for top level YamlNode.

Thus, the callers should make a call to the function to prune the trailing line break because they have an idea of the yaml's structure.

Additionally, I found it quite advantageous on our part to try and include line breaks for an existing YamlNode being replaced to edits. This reduces bugs and removes the need to try and guess where the next line break is.

It’s better to include it and check later if it was there just before suggesting an edit using a definite endOffset

@kekavc24 kekavc24 closed this Jul 17, 2024
@kekavc24 kekavc24 deleted the fix-line-ending-bug branch July 17, 2024 07:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Literal and folded strings cannot handle "\n \n" at the end of a string
2 participants