Support TypeScript #424

Closed
goodmind opened this issue Aug 16, 2017 · 28 comments · Fixed by #461

Comments

@goodmind

So, TypeScript syntax was merged into Babylon. Would you support it here?

I'm getting this error:

Error: {type: TSArrayType, start: undefined, end: undefined, loc: undefined, elementType: [object Object], __clone: undefined} does not match type Printable

babel/babylon#523
babel/babel#5899

@skovhus

skovhus commented Aug 22, 2017

I would happily help out here, but would like some guidelines. @benjamn

@benjamn
Owner

benjamn commented Aug 22, 2017

This is going to be a lot of work, but it's totally doable, and parallelizable across multiple contributors! Recast's ability to handle arbitrary AST type systems simultaneously is one of its greatest strengths, so I'm happy to see it applied to TypeScript.

At a high level, recast.parse makes a parallel copy of the original AST, so that recast.print can compare the modified AST to the original. In order to reprint the AST conservatively, unmodified source code (corresponding to unmodified AST regions) gets copied straight to the output. Where the AST is modified, Recast falls back to its pretty-printer to reprint just the nodes that were modified, though it strives to do that as infrequently as possible.

If unmodified code happens to contain an AST node that Recast doesn't know how to print, nothing will break, because Recast never has to generate new code for that node. However, if the AST changed in the vicinity of that AST node, Recast may have to use the pretty-printer. To be safe, of course, you need to teach Recast how to print any new types of AST nodes. The beauty is, there's no limit to how many different AST types you can add to the pretty-printer. They don't even have to represent the same language, as long as the AST type names are distinct.
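To make the conservative-reprinting idea concrete, here is a toy sketch. It is not Recast's implementation: it assumes a flat list of node records with hypothetical `start`, `end`, and `modified` fields and a hypothetical `prettyPrinters` table, whereas Recast compares nested nodes against its parallel copy of the AST rather than reading a flag.

```javascript
// Toy model only. The table of printer cases is hypothetical and
// deliberately has no entry for TSArrayType.
const prettyPrinters = {
  Identifier: (node) => node.name,
};

function prettyPrint(node) {
  const printer = prettyPrinters[node.type];
  if (!printer) {
    throw new Error("Don't know how to print " + node.type);
  }
  return printer(node);
}

// `nodes` describes consecutive, non-overlapping regions of `source`.
function reprint(source, nodes) {
  let output = "";
  let cursor = 0;
  for (const node of nodes) {
    // Text between nodes (whitespace, punctuation) is copied verbatim.
    output += source.slice(cursor, node.start);
    output += node.modified
      ? prettyPrint(node) // regenerate only what changed
      : source.slice(node.start, node.end); // reuse the original text
    cursor = node.end;
  }
  return output + source.slice(cursor);
}

const source = "foo: number[]";
const nodes = [
  { type: "Identifier", name: "foo", start: 0, end: 3, modified: false },
  { type: "TSArrayType", start: 5, end: 13, modified: false },
];

// Nothing modified: the unknown TSArrayType node is never pretty-printed.
const unchanged = reprint(source, nodes);

// Only the identifier changed, so only it goes through the printer.
nodes[0] = { ...nodes[0], name: "bar", modified: true };
const renamed = reprint(source, nodes);
```

In this sketch, marking the `TSArrayType` record as modified would throw, which is the toy equivalent of needing a printer case for every node type that can be touched.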

Now, in reality, Recast benefits from having a limited understanding of the relationships between AST types. For example, you're seeing a failure here because recast.types.namedTypes.Printable.check({ type: "TSArrayType", ... }) doesn't know that TSArrayType should be considered a subtype of the Printable supertype. We could just disable those assertions, as long as we implement a printer case for TSArrayType (and all the other TS* types), though the ideal solution would be to add corresponding type definitions to the ast-types library, like we do for Flow already.

Another place where it's going to be useful to understand the AST type system is here, where we check whether a node is an Expression (and not, say, a Statement).
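The two checks described above (is this node Printable, is it an Expression) can be pictured with a toy registry of type names and their base types. This is a sketch, not the ast-types API: the `bases` map, `isSubtype`, and `checkPrintable` are hypothetical names, and the hierarchy shown is only illustrative.

```javascript
// Toy registry: each known type name maps to its direct base types.
const bases = new Map([
  ["Printable", []],
  ["Node", ["Printable"]],
  ["Expression", ["Node"]],
  ["Statement", ["Node"]],
  ["Identifier", ["Expression"]],
]);

function isSubtype(typeName, superName) {
  if (typeName === superName) return true;
  const parents = bases.get(typeName);
  if (!parents) return false; // an unknown type matches nothing
  return parents.some((parent) => isSubtype(parent, superName));
}

function checkPrintable(node) {
  return isSubtype(node.type, "Printable");
}

const node = { type: "TSArrayType" };

// Before any TypeScript definitions exist, the check fails.
const before = checkPrintable(node);

// Adding definitions (assumed hierarchy) makes the same node pass.
bases.set("TSType", ["Node"]);
bases.set("TSArrayType", ["TSType"]);
const after = checkPrintable(node);

// The same registry answers the Expression-vs-Statement question.
const isExpression = isSubtype("TSArrayType", "Expression");
```

The point of the sketch is that the failure comes from a missing definition, not from a missing printer case, which is why adding definitions is the first step.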

In short, I think we should get started on ast-types/def/typescript.js (see benjamn/ast-types#213, with ast-types/def/flow.js as an example).

Once we have enough TypeScript definitions to capture a small TypeScript AST, as parsed by Babylon, then we can begin writing Recast test cases:

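const recast = require("recast");
const babylon = require("babylon"); // Babylon 7 (beta at the time)

const ast = recast.parse(typeScriptSource, {
  // Recast can use any parser you like! A custom parser is an object
  // with a parse method, passed as the parser option.
  parser: {
    parse(source) {
      // Getting the options right here may take some finesse.
      // Presumably TypeScript isn't enabled by default, for example.
      return babylon.parse(source, options);
    }
  }
});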

// Should work even if Recast doesn't know how to print any of the new types.
recast.print(ast).code;

// Will fail unless Recast knows how to print everything.
recast.prettyPrint(ast).code;

These test cases will fail until the pretty-printer learns how to print everything, but picking off new AST types one by one is fun, because you can feel the incremental progress.
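One way to pick off new AST types one by one is to list which node types in a parsed sample still lack a printer case. The helper below is a hypothetical sketch, not part of Recast; it assumes the AST is a plain, acyclic object tree whose nodes carry a string `type`, and the `supported` set stands in for whatever the pretty-printer already handles.

```javascript
// Recursively collect every string-valued `type` found in the tree.
function collectTypes(value, seen = new Set()) {
  if (Array.isArray(value)) {
    value.forEach((child) => collectTypes(child, seen));
  } else if (value && typeof value === "object") {
    if (typeof value.type === "string") seen.add(value.type);
    for (const key of Object.keys(value)) {
      collectTypes(value[key], seen);
    }
  }
  return seen;
}

// Node types present in the AST but absent from the supported set.
function missingPrinters(ast, supported) {
  return [...collectTypes(ast)].filter((type) => !supported.has(type)).sort();
}

// Hand-written AST roughly shaped like `type Names = string[]`.
const sampleAst = {
  type: "File",
  program: {
    type: "Program",
    body: [
      {
        type: "TSTypeAliasDeclaration",
        id: { type: "Identifier", name: "Names" },
        typeAnnotation: {
          type: "TSArrayType",
          elementType: { type: "TSStringKeyword" },
        },
      },
    ],
  },
};

const supported = new Set(["File", "Program", "Identifier"]);
const todo = missingPrinters(sampleAst, supported);
// todo lists TSArrayType, TSStringKeyword, TSTypeAliasDeclaration
```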

As a personal note, I'll be on vacation from this Thursday through Labor Day (September 4th), but I'm happy to answer questions before or after that.

Cheers! 🎉

@skovhus

skovhus commented Aug 22, 2017

@benjamn thanks for taking your time writing this up!

@chriskrycho

Has anyone started work on this at all? I may or may not give it a swing in some free time coming up if not.

@benjamn
Owner

benjamn commented Dec 20, 2017

@chriskrycho Go for it! I haven't made any progress since my last comment.

@blaster151

blaster151 commented Jan 19, 2018

Hi, I'm beginning to use recast and I love it. (Although it's in the context of jscodeshift as a wrapper.) I am extremely interested in understanding how to use it with TypeScript. I have had mostly good success in using the Flow parser because there are many similarities, but the Flow parser only goes so far, and the specific things that break for me are explicit typecasts such as

let x = <any>myVar;

or

let x = myVar as any;

This kind of thing throws off the Flow parser and everything in the rest of the file gets pulled in as a big JSXToken.

Can someone help me understand why the set of parsers available in the AST Explorer tool includes many others beyond what Recast supports? There is a "typescript" parser there that produces a fine AST representation of TypeScript files, including the syntax described above. I need help understanding what it would take to bridge Recast to one of these extended parsers. Thanks in advance!

@skovhus

skovhus commented Jan 20, 2018

@blaster151

Did you see the comment on how to get started with this issue #424 (comment) ?

Cheers

@blaster151

blaster151 commented Jan 20, 2018

Thanks, @skovhus . . . ! I'm a little new to the ecosystem here, and while I think I am beginning to understand the relationships between jscodeshift, recast, and ast-types, I was hoping there was some way to get recast to use a parser that already exists that can handle TypeScript. (I'm guessing that the problem with some of the parsers available via "AST Explorer" is that many of them are not based on the ast-types library?) I will dig in more to try to understand what's going on; I'm sure some of my assumptions here are naive . . .

@Phoenixmatrix
Contributor

I don't want to step in on someone else who is already looking at this, but we are betting pretty hard on Recast to maintain our codebase and are looking at starting to sprinkle some TypeScript in, and there doesn't seem to be anything better out there, so I'd consider jumping in on this. We do a lot of codemods, so we could even make progress using our real-world use cases to test progression.

@chriskrycho did you start yet, or should I just poke at it greenfield?

@chriskrycho

chriskrycho commented Jan 30, 2018 via email

@Phoenixmatrix
Contributor

@benjamn

Question for you: from poking around and your explanation, the recast and ast-types bits look like they'll be pretty straightforward.

There is one thing I'm not too sure about, and that's what to do on the parser front.
So, the default is esprima, and as far as I know, that won't support typescript. That's expected.

Using a custom parser, we have a couple of options (sorry, I'm sure my terminology will be completely off. Parser land is confusing):

1- Prettier and ESLint use typescript-eslint-parser. It uses the primary TypeScript parser and converts the result to an AST compatible with those tools.
2- We could use the actual typescript parser directly and add the definition to ast-types, which I think is possible.
3- Using Babylon@7's typescript support.

Option 1 is a bit fuzzy to me, as the relationship between the Babylon TypeScript support and typescript-eslint-parser seems a little blurry, but it allowed Prettier and ESLint to work, after all, and the author was working actively with the TS team. It does generate an AST that could work here.

Option 2 seems like it would work if I understand the ast-types architecture (which could work with any AST as long as we put it in the definition?). It removes the middleman, but we probably can't reuse much because, as I understand it, the AST format is very different.

Option 3 is only available in Babylon 7, I think, which is in beta, but I have this working locally for a simple case. It seems like it would make things pretty simple (just use the Babylon parser with the typescript and jsx plugins. Boom, done) and a lot could be reused.

On paper, using Babylon's typescript support seems like an obvious choice, but I was curious if anyone understood the relationship between it and typescript-eslint-parser a little better (digging through their issue tracker history, I'm having trouble putting the context together)

@benjamn
Owner

benjamn commented Jan 31, 2018

@Phoenixmatrix I don't know much about typescript-eslint-parser, but I agree with your intuition that most people hoping to use Recast with TypeScript will also be using Babylon 7, or at least I'm comfortable recommending Babylon 7 to anyone who wants this functionality. Babel and Babylon have to be careful to keep JS and TS AST type names distinct, since they might deal with both simultaneously, which also happens to be an important constraint for Recast and ast-types. I know that's a little hand-wavy, but option 3 seems right to me!

@Phoenixmatrix
Contributor

good good. I'll run away with that for now. Btw, the Recast & ast-types design is really clever and nice to work with. Good work. Gives me a whole new level of appreciation for this project.

@blaster151

blaster151 commented Jan 31, 2018 via email

@Phoenixmatrix
Contributor

Progress update: since recast and its dependencies were already on Babylon 7, I didn't have to worry about breaking changes on the existing stuff, which was quite convenient.

I already have a local branch of ast-types and recast processing a couple of simple AST node types and pretty printing them correctly. While the TS ast nodes are different (and prefixed), most are very similar to Flow, so I have been able to use it as inspiration for both the ast and the pretty printer with some pretty trivial tweaks.
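As an illustration of how closely the TS* printer cases can follow the Flow ones, here is a minimal sketch of a type printer that handles both families in the same switch. It is not the code from the branches mentioned in this thread; `printType` is a hypothetical helper, and it ignores details such as parenthesizing union element types.

```javascript
function printType(node) {
  switch (node.type) {
    case "NumberTypeAnnotation": // Flow
    case "TSNumberKeyword": // TypeScript
      return "number";
    case "StringTypeAnnotation": // Flow
    case "TSStringKeyword": // TypeScript
      return "string";
    case "ArrayTypeAnnotation": // Flow
    case "TSArrayType": // TypeScript
      // Both node types keep their element type in `elementType`.
      return printType(node.elementType) + "[]";
    default:
      throw new Error("No printer case for " + node.type);
  }
}

const tsArray = {
  type: "TSArrayType",
  elementType: { type: "TSNumberKeyword" },
};
const flowArray = {
  type: "ArrayTypeAnnotation",
  elementType: { type: "NumberTypeAnnotation" },
};

const tsPrinted = printType(tsArray);
const flowPrinted = printType(flowArray);
```

Both calls produce the same text because the two node shapes differ only in their type names, which is the similarity described above.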

I'll probably have a good portion of the simple cases working by the end of the week and then I'll push the branches up.

The only things that will be even remotely challenging are:

  • The TypeScript AST types that are completely different from anything else. We'll actually have to try for those ;)

  • Making sure the AST definition created in ast-types is complete. That isn't required to get stuff "to work", so that's where other contributors will help the most: completing the AST definition in places where I cut corners to get things working.

@Phoenixmatrix
Contributor

Phoenixmatrix commented Feb 4, 2018

Progress is going at a good clip. Using the existing babylon support as a base for the typescript stuff is making things much, much easier (I'm getting all of the javascript constructs for free and only have to implement babylon 7's TS* nodes, which are generally much simpler overall).

I have most of the basic constructs (type annotations/declarations/aliases, most assertions and interfaces) working. You can see the progress in the following branches:

my ast-types branch
my recast branch

There's still a decent chunk of work to do, but most simple cases work now. Some things I didn't really pay attention to:

  • the .build calls in the ast (not needed to get the printer working, but I'll have to actually think those through eventually)
  • the bases of the ast nodes. I'm going at it semi-randomly right now.
  • taking into consideration configurations in the printer. Right now I'm making a lot of assumptions just to get things working.
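For readers unfamiliar with what bases and .build calls declare, the following toy mini-DSL mimics the general shape of a type definition. It is an assumption-laden sketch, not ast-types itself: `def`, `builderFor`, and the record layout are re-implemented here in a few lines purely for illustration, with no validation of field types.

```javascript
// Toy registry of definitions, keyed by type name.
const defs = new Map();

function def(name) {
  if (!defs.has(name)) {
    defs.set(name, { bases: [], fields: [], buildParams: [] });
  }
  const record = defs.get(name);
  const api = {
    // Declare the supertypes of this node type.
    bases(...names) { record.bases = names; return api; },
    // Declare a field the node may carry.
    field(fieldName) { record.fields.push(fieldName); return api; },
    // Declare which fields the builder takes, and in what order.
    build(...params) { record.buildParams = params; return api; },
  };
  return api;
}

// Turn a definition into a builder function for that node type.
function builderFor(name) {
  return (...args) => {
    const node = { type: name };
    defs.get(name).buildParams.forEach((param, i) => {
      node[param] = args[i];
    });
    return node;
  };
}

def("TSType").bases("Node");
def("TSStringKeyword").bases("TSType").build();
def("TSArrayType").bases("TSType").build("elementType").field("elementType");

const tsStringKeyword = builderFor("TSStringKeyword");
const tsArrayType = builderFor("TSArrayType");
const built = tsArrayType(tsStringKeyword());
```

In this sketch the printer never consults `buildParams`, which matches the observation above that builder declarations are not needed to get printing working.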

Still, progress is progress. I'm basically going through the TypeScript documentation and copy pasting/tweaking the examples in the tests one by one. You can take a look at the basic printing tests to see what I've got working.

@Phoenixmatrix
Contributor

Got most of the AST supported now. I'm playing a bit of whack-a-mole with the more advanced types, pasting examples from the docs, but there are fewer and fewer left to do. Getting there! Will probably make a little bit more progress tonight, then it will be time to get some eyes on my approach before spending too much time polishing.

@benjamn
Owner

benjamn commented Feb 4, 2018

@Phoenixmatrix Awesome work! Just drop a note here whenever you're ready for feedback/approval.

@hoschi

hoschi commented Feb 5, 2018

@Phoenixmatrix awesome that you started this!

I have a question about the surrounding projects. As I understand it, Babylon needs to be used as the parser when working with TypeScript. But Babylon's AST differs from the ESTree spec. Doesn't that mean one must also use babel-types instead of ast-types (which you modified) to construct/check nodes? E.g. ast-types has b.literal, whereas in Babylon's AST you need to work with the different kinds, like b.stringLiteral.

I ask because I want to use the checking methods and tree walking usage (with TS support) in hoschi/yode#11 .
Also, last week I started a customer project, a generator that converts configuration objects into a React app, so I use the node builder API there.

I found ast-types first, but now I'm confused: using Babylon/babel-types with its latest syntax support seems important to me, so I can't see where I would use ast-types. Did I get the connections between the projects wrong?

@Phoenixmatrix
Contributor

Phoenixmatrix commented Feb 5, 2018

@hoschi ast-types has type definitions for the babylon ast. You can see, for example, the string literal type defined here

The only thing is that babylon is not the default parser. But Recast can take a babylon AST no problem.

If all you need is to walk/check nodes, you might be able to get away with just using babylon/babel-types, I think, and they already have stuff for TypeScript. My goal is to be able to make changes to code without altering the original code's formatting, and to do so at scale (hundreds/thousands of projects), thus the Recast TypeScript support effort.

@benjamn
Owner

benjamn commented Feb 5, 2018

Just to add to @Phoenixmatrix's comment, Recast doesn't enforce any opinions about how you should traverse or modify the AST between recast.parse and recast.print, so you can use whatever traversal/transform tool you like, which is important because there are so many different tools to choose from, each with its own advantages and disadvantages.

The nice thing about ast-types is that it can simultaneously support everything you might want. Here's where it defines support for Babel's StringLiteral type, and here's where it defines the ESTree Literal type. Both AST formats coexist in peace.

So, for example, either of these two builder calls

require("recast").types.builders.stringLiteral("hi")
require("ast-types").builders.stringLiteral("hi")

will give you an AST node with type and value identical to

require("@babel/types").stringLiteral("hi")

Of course, @babel/types only supports its own AST format, so

require("@babel/types").literal("bye")

does not work. Since ast-types supports ESTree types as well as Babel types,

require("ast-types").builders.literal("hi again")

works as you would expect when you're working with an ESTree AST.

If there's anything @babel/types can do that ast-types can't, it's either because it's something super Babel-specific that I didn't care to support, or just because no one has implemented it yet. In any case, ast-types is a good bet for working with AST nodes in any format.
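Because the two formats use distinct type names for string literals, code that walks an AST can handle either one without knowing which parser produced it. The helper below is a hypothetical sketch written against plain node objects, so it depends on neither ast-types nor @babel/types.

```javascript
// Return the string value of a string-literal node in either format,
// or undefined for anything else.
function stringValue(node) {
  if (node.type === "StringLiteral") {
    return node.value; // Babel-style node
  }
  if (node.type === "Literal" && typeof node.value === "string") {
    return node.value; // ESTree-style node holding a string
  }
  return undefined;
}

const babelNode = { type: "StringLiteral", value: "hi" };
const estreeNode = { type: "Literal", value: "hi again" };
const numericNode = { type: "Literal", value: 42 };

const fromBabel = stringValue(babelNode);
const fromEstree = stringValue(estreeNode);
const fromNumber = stringValue(numericNode);
```

The ESTree branch needs the extra `typeof` check because a single Literal type covers strings, numbers, booleans, and so on, while Babel splits them into separate node types.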

@Phoenixmatrix
Contributor

@benjamn ok, ready to at least get some initial set of eyes before I go too much further. Again, there's a lot of work to be done so don't get too scared if things are a bit sloppy, but I'd like to get some input before I start gold plating things

#461 and benjamn/ast-types#249

Obviously since the ast-types isn't pushed out, the travis builds will fail, but it should be normal for now.

@hoschi

hoschi commented Feb 5, 2018

@Phoenixmatrix didn't see that, should have dug deeper.
@benjamn thanks for the explanation, makes a lot more sense now.

@zhenyanghua

Now that Babylon@next (v7 beta) is archived and continued under the name @babel/parser, we can use TypeScript and JSX through a custom parser configuration with the existing Recast API:

import { parse } from 'recast';
import * as babelParser from '@babel/parser';

const tsAst = parse(entryFile, {
  parser: {
    parse(source) {
      return babelParser.parse(source, {
        // additional options
        sourceType: 'module',
        plugins: ['jsx', 'typescript']
      });
    }
  }
});

@Gregoor

Gregoor commented Oct 20, 2021

Hi everyone, thanks for your work on recast 👋🏼

Now that Babylon@next (v7 beta) is archived and continued under the name @babel/parser, we can use TypeScript and JSX through a custom parser configuration with the existing Recast API:

import { parse } from 'recast';
import * as babelParser from '@babel/parser';

const tsAst = parse(entryFile, {
  parser: {
    parse(source) {
      return babelParser.parse(source, {
        // additional options
        sourceType: 'module',
        plugins: ['jsx', 'typescript']
      });
    }
  }
});

I happened to call it just like that, but then just now ran into a parsing error located in a recast->esprima dependency. I am a little confused, because I thought this API allowed me to use my own parser instead. What is esprima doing there?

Thanks again!

@zhenyanghua

zhenyanghua commented Oct 20, 2021 via email

@zhenyanghua

I happened to call it just like that, but then just now ran into a parsing error located in a recast->esprima dependency. I am a little confused, because I thought this API allowed me to use my own parser instead. What is esprima doing there?

recast/lib/parser.ts

Lines 36 to 44 in 3741e4d

// Use ast.tokens if possible, and otherwise fall back to the Esprima
// tokenizer. All the preconfigured ../parsers/* expose ast.tokens
// automatically, but custom parsers might need additional configuration
// to avoid this fallback.
const tokens: any[] = Array.isArray(ast.tokens)
  ? ast.tokens
  : require("esprima").tokenize(sourceWithoutTabs, {
      loc: true,
    });

@Gregoor If the Babel parser doesn't provide the tokens, Recast falls back to the Esprima tokenizer. All you need to do is set tokens: true in your parser options.
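A sketch of what that looks like, under stated assumptions: `babelOptions` is the options object that would be passed to @babel/parser's parse call inside the custom parser, and `pickTokens` is a hypothetical stand-in that mirrors the fallback decision quoted above, with the fallback tokenizer injected so the example runs without any packages installed.

```javascript
// Options for the custom parser. `tokens: true` asks @babel/parser to
// attach the token list to the returned AST as `ast.tokens`.
const babelOptions = {
  sourceType: "module",
  plugins: ["jsx", "typescript"],
  tokens: true,
};

// Same decision as the quoted snippet: use ast.tokens when it is an
// array, otherwise call the fallback tokenizer (Esprima in Recast).
function pickTokens(ast, fallbackTokenize) {
  return Array.isArray(ast.tokens) ? ast.tokens : fallbackTokenize();
}

// Stand-in ASTs: one parsed with tokens enabled, one without.
const withTokens = { type: "File", tokens: [{ value: "let" }] };
const withoutTokens = { type: "File" };

let fallbackCalls = 0;
const fallback = () => {
  fallbackCalls += 1;
  return [];
};

const kept = pickTokens(withTokens, fallback);
const replaced = pickTokens(withoutTokens, fallback);
```

Only the AST without a tokens array triggers the fallback, which is why a TypeScript file parsed without tokens enabled ends up in a tokenizer that cannot read it.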

@Gregoor

Gregoor commented Oct 23, 2021

I happened to call it just like that, but then just now ran into a parsing error located in a recast->esprima dependency. I am a little confused, because I thought this API allowed me to use my own parser instead. What is esprima doing there?

recast/lib/parser.ts

Lines 36 to 44 in 3741e4d

// Use ast.tokens if possible, and otherwise fall back to the Esprima
// tokenizer. All the preconfigured ../parsers/* expose ast.tokens
// automatically, but custom parsers might need additional configuration
// to avoid this fallback.
const tokens: any[] = Array.isArray(ast.tokens)
  ? ast.tokens
  : require("esprima").tokenize(sourceWithoutTabs, {
      loc: true,
    });

@Gregoor If the Babel parser doesn't provide the tokens, Recast falls back to the Esprima tokenizer. All you need to do is set tokens: true in your parser options.

That indeed did the trick. Thank you so much!


9 participants