-
Notifications
You must be signed in to change notification settings - Fork 664
feat(rome_service): introduce a cross-language Workspace
abstraction
#2593
Conversation
Parser conformance results on ubuntu-latestjs/262
jsx/babel
ts/babel
ts/microsoft
|
Deploying with Cloudflare Pages
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's so cool!
About naming. Rome service might be ambiguous considering our plans. I personally would go either with:
rome_server
but I can see how this isn't good considering that some of the infrastructure might be unrelated to running in a server. However, we could still extract these types in the future.rome_workspace
: If it's really about workspaces, but my guess is that the crate contains more than that?
Roslyn calls this component a [compiler server[(https://github.com/dotnet/roslyn/blob/main/docs/compilers/Compiler%20Server.md), which would be another option as crate name.
I'll now jump into the code :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love it but it's a lot to take in. I've a few comments around naming that might be worth looking into. It might also be good to give others some time to take a look at this PR and leave comments that are more familiar with the LSP handling than I am.
let mut options = JsFormatOptions::default(); | ||
/// Read the formatting options for the command line arguments and inject them | ||
/// into the workspace settings | ||
pub(crate) fn parse_format_options(session: &mut CliSession) -> Result<(), Termination> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: should this function be renamed to parse_workspace_settings
. For example, I could see a --project
option where one can specify a workspace configuration file.
/// Wrapper for an underlying `rome_service` error | ||
#[error(transparent)] | ||
WorkspaceError(#[from] RomeError), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Not sure about the naming here. The documentation refers to rome_service
but the error itself is called WorkspaceError
.
crates/rome_fs/src/path.rs
Outdated
@@ -10,7 +10,7 @@ use std::{fs::File, io, io::Write, ops::Deref, path::PathBuf}; | |||
#[derive(Debug, Clone, Eq, Hash, PartialEq)] | |||
pub struct RomePath { | |||
file: PathBuf, | |||
file_id: Option<usize>, | |||
file_id: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to introduce a FileId
new type
green: GreenNode, | ||
} | ||
|
||
impl SendNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat!
use rome_rowan::AstNode; | ||
|
||
use crate::{ | ||
settings::{FormatSettings, Language, LanguageSettings, LanguagesSettings, SettingsHandle}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this Language
trait is different from rome_rowan
's Langauge
trait? I'm a bit concerned that this will be confusing because engineers will be exposed to both and it then is unclear when to use which.
It might help to rename this to SupportedLanguage
, FileType
(a file kind that the workspace supports).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rome_service::Language
trait is actually a supertrait that inherits from rome_rowan::Language
and adds some service-specific features (mostly related to resolving settings for that language at the moment hence why it's declared in settings.rs
)
type FormatSettings = JsFormatSettings; | ||
type FormatOptions = JsFormatOptions; | ||
|
||
fn lookup_settings(languages: &LanguagesSettings) -> &LanguageSettings<Self> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to star at this line for 30 seconds to notice that one struct is LanguageSetting
sand the other is just
LanguageSetting`.
line_with
isn't a language specific option. Are LanguageSettings
and LanguageSetting
normalized struct where Rome fills in all options even if inherited from another option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The concepts of "settings" and "options" differ on a few points, and a significant one is that "language specific settings" doesn't mean "settings that apply only to this language" but "settings that apply to all files of this language". So for instance a workspace may have a global line_width
of 80, and a language-specific line_width
of 120 for markdown files
impl<T> From<Parse<T>> for AnyParse | ||
where | ||
T: AstNode, | ||
T::Language: 'static, | ||
{ | ||
fn from(parse: Parse<T>) -> Self { | ||
let root = parse.syntax(); | ||
let diagnostics = parse.into_diagnostics(); | ||
|
||
Self { | ||
// SAFETY: the parser should always return a root node | ||
root: root.as_send().unwrap(), | ||
diagnostics, | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be defined outside of the javascript.rs
file? It doesn't seem to be specific to javascript. Same for some for the other functions following.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put all these declarations in javascript.rs
as I tried to make it the only file that imports rome_js_*
crates: here Parse<T>
is declared in rome_js_parser
for instance, and the following functions call into functions from rome_js_parser
, rome_js_syntax
or rome_js_formatter
for instance (and rome_analyzer
that's currently specialized for JS only)
#[derive(Default)] | ||
pub struct LanguageSettings<L: Language> { | ||
/// Formatter settings for this language | ||
pub format: L::FormatSettings, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit of this trait over just using L::FormatSettings
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually I expect this struct will gets some additional fields, to specify settings for the linter for instance
What are you thoughts if it would be necessary to make any of the operations on the |
On the naming issue, I think it would make sense to call the crate On making the |
hum...
Can't we model Workspace as enum Workspace {
Local { ... },
Ssh { ... },
Websocket { ... }
} That would allow us to use Another possibility is to implement a loop {
let msg = channel.recv_async().await;
match msg {
...
}
} I grant that is a little bit annoying to generate request/reply with channels.
Do you think this is a huge problem? I think we will in the end use an async runtime, and there is 99% chance that will be tokio. :D One good side effect is that we will probably benefit from https://tokio.rs/blog/2021-12-announcing-tokio-console.
I think because we are on a chicken-egg problem. I was discussing with @MichaReiser and we don't have async on lower levels, because the higher levels don't have async. If the façades start to be async and block when calling lower levels (not ideal, but fine for now), lower levels will soon see the benefit and start to migrate. |
This certainly doesn’t need to be addressed in this PR, but what’s the long-term plan for We would need to change how we handle language capabilities, and it could be the responsibility of whatever calls In addition to allowing cheap copies, an important aspect of a On naming: I do think that the name of the |
Absolutely, to be clear none of the reasons I outlined for not using async right now are actual blockers, only minor inconvenience that led me to not include that in this initial implementation, but I also believe we will eventually end up using async code at least for some parts of the toolchain
This is also my intention, I think in the long term it would certainly make sens to merge
I don't think it would be entirely possible for the
This is somewhat intended, the idea is that at least in the context of a language server the state of the |
|
||
/// Wrapper for an underlying `rome_service` error | ||
#[error(transparent)] | ||
WorkspaceError(#[from] RomeError), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does #[from]
do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an attribute provided by thiserror
, it automatically implements From<RomeError>
for Termination
pub struct OpenFileParams { | ||
pub path: RomePath, | ||
pub content: String, | ||
pub version: i32, | ||
} | ||
|
||
pub struct GetSyntaxTreeParams { | ||
pub path: RomePath, | ||
} | ||
|
||
pub struct ChangeFileParams { | ||
pub path: RomePath, | ||
pub content: String, | ||
pub version: i32, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
version
is a bit opaque, and maybe it requires some documentation. Especially when it's used, where 0
is passed, I can't understand what's its purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the LSP this version number is provided by the editor and incremented on each change, then when the language server asynchronously sends diagnostics for a document the version of the document the diagnostics where computed for is sent alongside, so the editor can automatically invalidate previous diagnostics if the document changes.
I carried this concept over to the workspace since I think it will eventually be useful to tag the result of various queries with the version of the corresponding document, especially if multiple processes are interacting concurrently with the workspace, but this isn't entirely implemented yet.
/// Add a new file to the workspace | ||
fn open_file(&self, params: OpenFileParams) -> Result<(), RomeError>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function name and documentation have different meanings. Maybe we should add more description to the documentation?
// Return a textual, debug representation of the syntax tree for a given document | ||
fn get_syntax_tree(&self, params: GetSyntaxTreeParams) -> Result<String, RomeError>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it work only on supported files? (the ones that we are able to parse)
If so, maybe the documentation should mention it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is something that I didn't really explain in the documentation, but most of the methods on the Workspace
rely on the underlying language implementing the corresponding capability. So for instance get_syntax_tree
will return a RomeError::SourceFileNotSupported
if the language for this document does not implement the parse
and debug_print
capabilities
self.syntax.remove(¶ms.path); | ||
self.documents.insert( | ||
params.path, | ||
Document { | ||
content: params.content, | ||
version: params.version, | ||
}, | ||
); | ||
Ok(()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Workspace
is implemented only once, and here we don't handle any errors (no try operators), would it make sense to change the signature of the function to not return Result
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently there is no reason for WorkspaceServer
to return a Result
since this operation cannot fail, but I made most of the methods on Workspace
return a result to ensure what's using them has error recovery code in place in case errors need to be introduced in the future.
For instance with a WorkspaceClient
connected to a remote WorkspaceServer
the connection may fail at any moment and return an error, this is also the reason why every method takes a single params
struct as that will make it easier to eventually derive Serialize
and Deserialize
on those
I see your reasons, although "workspace" is also another way to identify multiple projects opened in the same editor. At least, that's what I always thought (and how I used it). Considering this premise, I find the term misleading, because it makes me think that that data structure is handling multiple projects/libraries. Which is not the case. Could we change |
Yes that's the intent, eventually I imagine there could be a shared daemon process running in the background and hosting one instance of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still have doubts about naming, but it should not be a show stopper. The changes make sense and we can revisit them in case there's something we want to change.
05406d3
to
d089ea0
Compare
Summary
This PR renames the
rome_core
crate torome_service
and replaces theFeatures
member of theApp
struct with a newWorkspace
object. This new object takes on most of the responsibilities previously handled by the LSP crate: managing a set of "open" documents, allowing for these to be queried for diagnostics and code actions, run through the formatter, or any other action the user may want to perform on a "file"With this change the
Workspace
is used as the backing implementation for most features in the language server (the LSP crate is now mostly acting as an adapter translating calls from the Language Server Protocol to the Workspace service) and the CLI. Overall this means adding support for additional languages should be relatively straightforward and mostly limited to therome_service
crate with little to no changes in the LSP and CLI consumer cratesWorkspace
is declared as a trait with a single implementation (WorkspaceServer
) as it is designed to eventually work over an optional transport layer (an additionalWorkspaceClient
implementation may then delegate calls to a remote instance of the workspace server over an IPC channel for instance)The
Workspace
interface refers to files using theRomePath
struct (that now unconditionally holds an internedFileId
along with the textual path) and is designed to abstract the underlying language-specific processing that may be happening for each call. Internally, theFeatures
andCapabilities
structs have been expanded into explicit vtables holding a set of (optional) function pointers representing the language-specific implementation of each feature supported by a given language. In order to handle a request likelint
orformat
the workspace implementation determines the language associated with the providedRomePath
, then looks up the correspondingCapabilities
and delegates to the language-specific implementationInternally this PR also add a new
SendNode
type torome_rowan
that rootSyntaxNode
can be converted to and from.SendNode
is a language-agnostic handle to the root (green) node of a syntax tree and implements bothSend
andSync
, and can thus be sent or shared between threads. This is used to implement the "syntax cache" of the workspace server to allow documents to be parsed on demand by allowing the result of the parser to be stored in a shared and language-independent container.Test Plan
I haven't written any tests for the
WorkspaceServer
itself yet, but it is already being indirectly checked through the existing test suite for the CLI. Additionally, I also made an attempt at adding a few basic tests for the language server implementation.