Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add keepSoureToken parse option, adding srcToken values to Nodes #309

Merged
merged 2 commits into from
Sep 24, 2021

Conversation

eemeli
Copy link
Owner

@eemeli eemeli commented Sep 11, 2021

Fixes #308

This allows for usage like this:

import { Composer, CST, Parser } from 'yaml'

const src = 'foo:\n  [ 42 ]'
const tokens = Array.from(new Parser().parse(src))

const docs = new Composer({ keepSourceTokens: true }).compose(tokens)
for (const doc of docs) {
  const node = doc.get('foo', true)
  CST.setScalarValue(node.srcToken, 'eek')
}

const res = tokens.map(CST.stringify).join('')
assert.equal(res, 'foo:\n  eek')

The srcToken value isn't available on nodes that are created from empty contents, as those do not have a matching CST token. This means that for example the above would fail if the src was 'foo:\n', as then node.srcToken is undefined.

To handle that case, you'd probably need to CST.visit the parent collection, identify the right CollectionItem, and then use CST.createScalarToken() when setting its value. At which point it might be more straightforward to keep all of the operations in the CST, and to never even compose it into documents.

@andersk
Copy link

andersk commented Sep 11, 2021

This means that for example the above would fail if the src was 'foo:\n', as then node.srcToken is undefined.

Might it make sense to use an empty pseudotoken for that case instead of undefined?

Or provide an API like

setKeyScalarValue(item: CollectionItem, value: string, context?): void;
setValueScalarValue(item: CollectionItem, value: string, context?): void;

for setting item.key and item.value, which would work even if they were initially undefined?

Or an API that returns the a new Token rather than mutating the existing one, making it the user’s responsibility to insert it into the document?

makeScalarValue(token: Token | undefined, value: string, context?): Token;

For the last two ideas, the user needs access to the CollectionItem, so a YAMLMap.getPair(key: K): Pair | undefined and a Pair.srcToken?: CollectionItem would be helpful.

@eemeli
Copy link
Owner Author

eemeli commented Sep 12, 2021

This means that for example the above would fail if the src was 'foo:\n', as then node.srcToken is undefined.

Might it make sense to use an empty pseudotoken for that case instead of undefined?

The problem with that approach is that in this case the source of the Node is not just undefined, it is literally undefined, as in there's no sequence of characters in the source for the Node. It is instead constructed from that very lack of characters.

Or provide an API like

setKeyScalarValue(item: CollectionItem, value: string, context?): void;
setValueScalarValue(item: CollectionItem, value: string, context?): void;

for setting item.key and item.value, which would work even if they were initially undefined?

Or an API that returns the a new Token rather than mutating the existing one, making it the user’s responsibility to insert it into the document?

makeScalarValue(token: Token | undefined, value: string, context?): Token;

I don't think these are necessary, as they'd end up as rather minimal wrappers around CST.createScalarToken and CST.setScalarValue.

For the last two ideas, the user needs access to the CollectionItem, so a YAMLMap.getPair(key: K): Pair | undefined and a Pair.srcToken?: CollectionItem would be helpful.

The accessor you're looking for is findPair(pairs, key), which is available from 'yaml/util'. It should probably be better documented or removed. You can also iterate the pairs with visit(), which would give you similar results.

Adding srcToken also to Pair makes sense, and would provide a decent workaround for the undefined issues.

@andersk
Copy link

andersk commented Sep 12, 2021

The problem with that approach is that in this case the source of the Node is not just undefined, it is literally undefined, as in there's no sequence of characters in the source for the Node. It is instead constructed from that very lack of characters.

An empty sequence is still a sequence. The suggestion was that you could invent a “token” corresponding to an empty string of source characters: { type: "null", offset: (next to the colon), source: "" }.

I don't think these are necessary, as they'd end up as rather minimal wrappers around CST.createScalarToken and CST.setScalarValue.

My goal was to find an API that would automatically work for the undefined case without requiring the user to explicitly branch for it as currently required:

if (item.value === undefined)
    item.value = CST.createScalarToken(value, context);
else
    CST.setScalarValue(item.value, value, context);

But if you’re content with the existing API, don’t let me stop you from merging this.

@eemeli
Copy link
Owner Author

eemeli commented Sep 12, 2021

If you can find a clean way of adding appropriate empty value tokens into the CST during the parse, then I might be interested in considering a PR for it. However, even that won't solve all cases, as explicit ? keys don't need a matching : value.

Assigning a srcToken value that's unconnected to the actual source CST tree doesn't really make sense, as then modifying that would have no effect on the source CST. Or did I misunderstand your idea here?

Similarly, I'd be happy to consider a PR for CST.setScalarValue to be able to handle its first argument being empty, which would probably achieve what you're looking for

@andersk
Copy link

andersk commented Sep 12, 2021

Right, the first idea was that the parser would insert these empty tokens into the CST tree.

Without an object to represent the empty token, it’s not possible for CST.setScalarValue to handle its first argument being empty, for the same reason: modifying undefined would have no effect on the source CST. Hence the other two ideas—the CollectionItem is an object that can be passed to a function and mutated even when its key or value cannot.

@eemeli eemeli merged commit 819e451 into master Sep 24, 2021
@eemeli eemeli deleted the keep-source-tokens branch September 24, 2021 15:18
Nathan-Fenner pushed a commit to Nathan-Fenner/yaml that referenced this pull request Nov 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

keepCstNodes was useful; consider restoring it
2 participants