Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable feature lightup on json strings (classification, bracematching, diagnostics). #59034

Merged
merged 156 commits into from
Feb 2, 2022

Conversation

CyrusNajmabadi
Copy link
Member

@CyrusNajmabadi CyrusNajmabadi commented Jan 23, 2022

Looks like this:

image

Dark mode:

image

TODO:

  • Support new System.Text.Json apis.
  • Support new System.Text.Json options api.
  • Ensure full testing.

@@ -1171,7 +1171,7 @@ public void TestMissingColon()
<Diagnostic Message=""Only properties allowed in an object"" Start=""20"" Length=""3"" />
</Diagnostics>",
@"<Diagnostics>
<Diagnostic Message=""Only properties allowed in an object"" Start=""20"" Length=""3"" />
<Diagnostic Message=""Strings must start with &quot; not '"" Start=""12"" Length=""1"" />
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some tests changed as we now always report the earlier error in a json string.

Copy link
Member

@jasonmalinowski jasonmalinowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:, although a few questions still pending. I think to me the main ones are:

  1. The parser that deals with figuring out options specified by the enum seems a bit strange, since it seems I can pretty easily make combos of flags that don't make sense?
  2. The brace matcher optimization to avoid decending into parts of the trees that aren't needed I think is still pending too, unless I missed where it got added. (GitHub is definitely getting slow with a scale of this PR, so hard to say.)

@@ -13,7 +14,7 @@ public partial class CSharpJsonParserNstTests : CSharpJsonParserTests
private void TestNST(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nst?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Covered in an email i just sent to @jinujoseph and you. An open source suite of json tests.

Comment on lines 96 to 98
if (child.IsNode)
{
var result = FindObjectOrArrayNode(child.Node, ch);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyrusNajmabadi Did this happen or no? (it's possible it snuck into some other place somewhere?)

Comment on lines +56 to +59
// Stop walking up once we hit a statement. We don't need/want statements higher up the parent chain to
// have any impact on this token.
if (syntaxFacts.IsStatement(node))
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I recognized this may have moved, asking anyways):

Does this need some check for lambdas, or OK with something outside a lambda impacting inside of it? I think there's a reasonable argument for it, this just my general caution any time I see check similar to this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. i think this is acceptable for now. We could always grow this later if users find this undesirable.

var name = GetNameOfType(typeNode, syntaxFacts);
if (name != null)
{
if (syntaxFacts.StringComparer.Compare(nameof(Regex), name) == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're just ignoring aliases here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've had discussions around source generators too where a "is the type here being created" helper that does this might be helpful, not sure if we should extract this pattern. At least that way the ignoring of aliases can be in one place so we can fix it everywhere later?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we do. Note: this is going to go semantic in a followup PR this week to support StringSyntaxAttribute. So this is just about detecting this when older frameworks.

/// </summary>
internal sealed class RegexLanguageDetector : AbstractLanguageDetector<RegexOptions, RegexTree>
{
private const string _patternName = "pattern";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming convention isn't PatternName here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will fix in followup :)


public RegexLanguageDetector(
EmbeddedLanguageInfo info,
INamedTypeSymbol? regexType,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this type to be createable even if we don't have a regex type, because there's still other comments we are looking for as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currect! we would still allow regex highlighting if you had // lang=regex on something.

/// </summary>
internal struct LanguageCommentDetector<TOptions> where TOptions : struct, Enum
{
private static readonly Dictionary<string, TOptions> s_nameToOption =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Immutable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can do in followup.

/// </summary>
internal struct LanguageCommentDetector<TOptions> where TOptions : struct, Enum
{
private static readonly Dictionary<string, TOptions> s_nameToOption =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assumption here is TOptions is a flags enum? Does something need to enforce that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure. i can add that in followup.

Comment on lines +12 to +17
/// <summary>
/// Parse using loose rules. This is very relaxed and allows lots of features not allowed by standard IETF 8259
/// (like comments, and trailing commas). This also allows all the additional constructs added by Json.net, like
/// constructors and string that use single-quotes instead of double-quotes.
/// </summary>
Loose = 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be structured that "Loose" is equal to Comments | TrailingCommas and strict = 0? Otherwise, it seems odd that some bits mean "be more strict" and some mean "be more loose"? I guess though this has to work this way because this is how the comment detector is looking for these particular names?

(in general this generic parser seems a bit janky here, since I guess a user could write loose,strict and we'd just go with that?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so... there's two issues here. First: i want // lang=json to be as permissive as possible. Which means the 0 state should be that (though i suppose we could also have a way to pass the 'default' that a detector should pick if no other information is provided).

Second, there are actually two separate groups. The first choice is: "are we going json.net parsing, or .net parsing?" Second, if we're doing .net parsing, is it in strict mode (teh default), or does it allow these two variants.

Note that Loose is still a superset of that, allowing a broad set of other craziness that is not controllable.

I don't have a great way to model this IMO, but i'm open to ideas.

@CyrusNajmabadi
Copy link
Member Author

The brace matcher optimization to avoid decending into parts of the trees that aren't needed I think is still pending too, unless I missed where it got added. (GitHub is definitely getting slow with a scale of this PR, so hard to say.)

It hasn't been done. I want to do it as part of a larger refactoring of the EmbeddedParsing system so that, like with normal Roslyn parsing, information abotu positions/widths flow up to the parent nodes. Then we can use that to narrow the walking down.

@CyrusNajmabadi CyrusNajmabadi merged commit 29dfd6e into dotnet:main Feb 2, 2022
@CyrusNajmabadi CyrusNajmabadi deleted the jsonFeatures branch February 2, 2022 18:17
@ghost ghost added this to the Next milestone Feb 2, 2022
@RikkiGibson RikkiGibson modified the milestones: Next, 17.2.P1 Feb 4, 2022
@ryzngard ryzngard added UX Review Not Required UX Review Not Required and removed Needs UX Triage labels Apr 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants