Skip to content
This repository has been archived by the owner on Aug 31, 2023. It is now read-only.

feat(rome_js_formatter): add format element label #2783

Merged
merged 13 commits into from
Jul 1, 2022

Conversation

denbezrukov
Copy link
Contributor

Summary

This PR introduces the new IR element Label. This IR matches Prettier's label command.

This IR can be useful if representation depends on different representation of child content.
E.g., to decide how to print an assignment expression, we might want to know whether its right-hand side has been printed as a method call chain, not as a plain function call.

Test Plan

Added a new doc showing how to use the new IR

Copy link
Contributor

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

crates/rome_formatter/src/builders.rs Outdated Show resolved Hide resolved
crates/rome_formatter/src/builders.rs Outdated Show resolved Hide resolved
@denbezrukov
Copy link
Contributor Author

denbezrukov commented Jun 24, 2022

@MichaReiser Could you help please?
There is an assertion static_assert!(std::mem::size_of::<crate::FormatElement>() == 32usize); for

pub enum FormatElement {
  ...
  Label(Label)
  ..
}
pub struct Label {
    pub(crate) content: Box<[FormatElement]>,
    label: &'static str,
}

May be we can:

  • use Label(Box<Label>) instead Label(Label).
  • use
pub struct Label {
    pub(crate) content: Box<[FormatElement]>,
    label: Box<&'static str>,
}
  • use another type for label? u16?

Is it ok to use &'static str type? Maybe there are some cases then need String? E.g. create dynamic label?

@MichaReiser
Copy link
Contributor

I don't think using dynamic strings is a good idea as these labels drive the formatting logic. Therefore, using a static string sounds good, in which case the size is 24bytes, and 32 for FormatElement. Or is the assertion triggering with a static string?

An alternative is to do something similar to GroupId where it is a wrapper for a usize but holds a name in debug builds. But I don't think we need this optimisation just yet

@MichaReiser
Copy link
Contributor

MichaReiser commented Jun 24, 2022

Or you could try to make Label a thin wrapper around TypeId. That has the nice added benefit that the compiler assigns compile time constants for each label. It also enforces constants (types actually) for each label, and comparing the labels in release is simply comparing two u64

This requires that Each label has its own Type (zero type struct or enum).

It would probably make sense to add a debug only name field, which you can automatically derive by using type_name

#[derive(Eq, PartialEq, Copy, Clone)]
pub struct Label {
  id: TypeId,
  #[cfg(debug_assertions)]
  label: &'static str
}

impl Label { 
  pub const fn of<T>() -> Self {
    Self {
      id: TypeId::of<T>(),
      #[cfg(debug_assertions)]
      label: type_name::<T>()
    }
  }
}

enum MemberNameLabel{}

let member_name = Label::of<MemberNameLabel>()

Argh, I've no idea how to create a code block on the iPhone 😅

}

impl LabelId {
pub fn of<T: ?Sized + 'static>() -> Self {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unstable api to make this function const.

impl Debug for Label {
fn fmt(&self, fmt: &mut Formatter) -> fmt::Result {
fmt.debug_struct("")
.field("label_id", &self.label_id)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks:

label_id: LabelId {
        id: TypeId {
            t: 4451653586406524753,
        },
        label: "rome_formatter::arguments::tests::test_nesting::SomeChain",
    },

@denbezrukov
Copy link
Contributor Author

Or is the assertion triggering with a static string?

Yes, it is. I had two solutions how to handle this.

  • use Label(Box<Label>) as a variant for FormatElement.
  • use Box<&'static str> as a field for label.

Or you could try to make Label a thin wrapper around TypeId.

This way is great 😻 Thank you! I've edited the example of using this new IR.

Only one problem that we can't use const in pub const fn of<T>() -> Self. Because this API is unstable. I attached the link in the comment.

crates/rome_formatter/src/builders.rs Outdated Show resolved Hide resolved
crates/rome_formatter/src/builders.rs Outdated Show resolved Hide resolved
Comment on lines 556 to 559
/// let is_labelled = match labelled_content {
/// FormatElement::Label(labelled) => labelled.label_id() == label_id,
/// _ => false,
/// };
Copy link
Contributor

@ematipico ematipico Jun 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this is the correct usage. This example shows that the consumer is forced to extract manually the FormatElement which is a low-level API for our formatter.

I wonder instead if we should provide an API, something like f.label_assert_of/label_id.assert_of (it asserts if the type of the label), which returns a boolean for us.

The suggestion is vague because I still need to understand how we do want to use the label and it can be used inside a real world example.

Copy link
Contributor Author

@denbezrukov denbezrukov Jun 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed this prettier API then I looked into isPoorlyBreakableMemberOrCallChain for variable assignment.

They use label to understand that CallExpression has been printed as a member-chain.
I've just realized that it seems they traverse tree to extract label. Because they call isPoorlyBreakableMemberOrCallChain with properties array.

But current this PR API allows look only last FormatElement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would need to come up with a different solution here, I suppose. I am not sure if printCallExpression actually prints in their main buffer or not (it seems not, which works for their logic). Regardless, they are able to extract the IR for they right-hand side of the assignment like expression and then decide the layout.

Our formatter works left-to-right, which means that once we write the right-hand side, it's there, unless we write it in a temporary buffer (which can be expensive, so we should avoid it).

I would suggest another solution for the assignment case. They create the label only when they actually create a member chain. They don't create the member chain in a specific case: https://github.com/prettier/prettier/blob/9dd761a6e491ffff3856eea47fb10b4573b351a6/src/language-js/print/member-chain.js#L342-L347

That condition is this one: https://github.com/rome/tools/blob/main/crates/rome_js_formatter/src/utils/member_chain/groups.rs#L42-L48

If we are able to use that logic inside the assignment like formatting, we might be able to not use the label.

Otherwise, the only solution that I can see is the write the right-hand side inside a temporary buffer, than inspect that buffer and check the label. But as said before, this has a big impact on memory usage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense👍
I guess that we can try it (:

What do you think about another case?
Now we have should_not_indent_if_parent_indents in binary_like_expression module. This function is aware about place where it will be printed. It uses should_break_after_operator from assignment_like module because when binary_like_expression is right part and layout is BreakAfterOperator assignment_like adds indent and to avoid double indent we have to check the same logic in binary_like_expression . May be we can invert this dependency and use label for binary_like_expression that it already has indent and check this label in assignment_like module.

  • fn should_not_indent_if_parent_indents(current_node: &JsAnyBinaryLikeLeftExpression) -> bool {
    let parent = current_node.syntax().parent();
    let parent_kind = parent.as_ref().map(|node| node.kind());
    let great_parent = parent.and_then(|parent| parent.parent());
    let great_parent_kind = great_parent.map(|node| node.kind());
    match (parent_kind, great_parent_kind) {
    (Some(JsSyntaxKind::JS_PROPERTY_OBJECT_MEMBER), _)
    | (Some(JsSyntaxKind::JS_INITIALIZER_CLAUSE), Some(JsSyntaxKind::JS_VARIABLE_DECLARATOR)) => {
    current_node
    .as_expression()
    .and_then(|expression| should_break_after_operator(expression).ok())
    .unwrap_or(false)
    }
    (
    Some(JsSyntaxKind::JS_RETURN_STATEMENT | JsSyntaxKind::JS_ARROW_FUNCTION_EXPRESSION),
    _,
    ) => true,
    _ => false,
    }
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if that would work. Mainly because at the end of the format phase, where that function is used, the logic might add parenthesis to the binary expression, and the indentation has to stay inside the parenthesis. That's why checking the AST is better in this case.

Copy link
Contributor Author

@denbezrukov denbezrukov Jun 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! Thank you!
So to emulate prettier label functionality we can add new extension inspect_label, InspectLabelBuffer and method for FormatElement has_label which traverses IR tree and search expected label?

EDIT:
It seems that second case also doesn't write in the main buffer.

pub fn inspect(&mut self, f: &mut Formatter<Context>) -> FormatResult<&FormatElement> {
let result = self
.memory
.get_mut()
.get_or_insert_with(|| f.intern(&self.inner));
match result.as_ref() {
Ok(content) => Ok(content.deref()),
Err(error) => Err(*error),
}
}

/// Formats `content` into an interned element without writing it to the formatter's buffer.
pub fn intern(&mut self, content: &dyn Format<Context>) -> FormatResult<Interned> {
let mut buffer = VecBuffer::new(self.state_mut());
crate::write!(&mut buffer, [content])?;
Ok(buffer.into_element().intern())
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has_label which traverses IR tree and search expected label?

Is this what Prettier does? Does it traverse ALL the IR in order to find a label? I thought it just checks the first element

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has_label which traverses IR tree and search expected label?

Is this what Prettier does? Does it traverse ALL the IR in order to find a label? I thought it just checks the first element

Sorry, you're right.
I double checked and prettier checks the first element.

Copy link
Contributor Author

@denbezrukov denbezrukov Jul 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest another solution for the assignment case. They create the label only when they actually create a member chain. They don't create the member chain in a specific case: https://github.com/prettier/prettier/blob/9dd761a6e491ffff3856eea47fb10b4573b351a6/src/language-js/print/member-chain.js#L342-L347

That condition is this one: https://github.com/rome/tools/blob/main/crates/rome_js_formatter/src/utils/member_chain/groups.rs#L42-L48

Could you please help me?🙏🏽 I can't find this case in Rome code:

Because it seems this conditional:

is prettier:

UPDATE:
Do I understand correctly that prettier uses only one array for all groups and Rome uses two structs (HeadGroup and Groups)? I was wondering about cutoff value. It can be 2 and 3 depends on should_merge.
It seems that it always should be 1. Because Rome uses two structs and then should_merge is true it mutates Groups vec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The shouldMerge value is used to essentially decide the layout of the formatting, and this affects the head of the group. So I went for a different approach because prettier's one was not working for us.

Yes, in theory cutoff should not be needed anymore, now that we actually mutate the groups vector

crates/rome_formatter/src/buffer.rs Outdated Show resolved Hide resolved
@ematipico ematipico requested a review from leops July 1, 2022 08:10
Co-authored-by: Emanuele Stoppa <my.burning@gmail.com>
@denbezrukov
Copy link
Contributor Author

@ematipico Could I try to implement isPoorlyBreakableMemberOrCallChain after this PR?
https://github.com/prettier/prettier/blob/a043ac0d733c4d53f980aa73807a63fc914f23bd/src/language-js/print/assignment.js#L329

@ematipico
Copy link
Contributor

@ematipico Could I try to implement isPoorlyBreakableMemberOrCallChain after this PR?
https://github.com/prettier/prettier/blob/a043ac0d733c4d53f980aa73807a63fc914f23bd/src/language-js/print/assignment.js#L329

Sure go ahead. I think it's the last piece for completing assignments

@denbezrukov
Copy link
Contributor Author

@ematipico Could I try to implement isPoorlyBreakableMemberOrCallChain after this PR?
https://github.com/prettier/prettier/blob/a043ac0d733c4d53f980aa73807a63fc914f23bd/src/language-js/print/assignment.js#L329

Sure go ahead. I think it's the last piece for completing assignments

🥳🥳

There is one more piece

I guess that we can implement canBreak the same way as willBreak.

@ematipico ematipico merged commit 73f9b7e into rome:main Jul 1, 2022
@denbezrukov denbezrukov deleted the feature/format-element-label branch July 1, 2022 15:27
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants