rework and vastly expand the MIR section #67

Merged (4 commits) on Feb 28, 2018

nikomatsakis
Contributor

This isn't done, but we now explain a lot more about MIR. These are all topics I've explained in the last few days, so they're on my mind.

cc @sgrif -- this includes a pretty detailed writeup of skolemization

Member

@mark-i-m left a comment

Thanks @nikomatsakis ! I learned a lot.

Sorry for assaulting you with so many comments 🤕

There are a lot of places where things should be placed in triple-backticks to format properly but aren't, especially in the regionck chapter.

Do you happen to know what READMEs are subsumed by this chapter? It would be good to mark them off in #2, and in rust-lang/rust#48479.

src/glossary.md Outdated
@@ -18,12 +18,14 @@ HIR Map | The HIR map, accessible via tcx.hir, allows you to qu
generics | the set of generic type parameters defined on a type or item
ICE | internal compiler error. When the compiler crashes.
ICH | incremental compilation hash. ICHs are used as fingerprints for things such as HIR and crate metadata, to check if changes have been made. This is useful in incremental compilation to see if part of a crate has changed and should be recompiled.
inference variable | when doing type or region inference, an "inference variable" is a kind of special type/region that represents value you are trying to find. Think of `X` in algebra.
Member

Perhaps something like

when doing type or region inference, an "inference variable" is a kind of special type/region that represents what you are trying to infer. Think of X in algebra. For example, if we are trying to infer the type of a variable in a program, we create an inference variable to represent that unknown type.
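
A minimal Rust sketch of the situation the suggested wording describes. The `?T` notation in the comments and the function name `first_element` are illustrative assumptions, not text from the PR:

```rust
/// Returns the first element of a vector whose element type is
/// never written out explicitly.
fn first_element() -> u32 {
    // At this point the element type is unknown. Informally, the compiler
    // treats `v` as `Vec<?T>`, where `?T` is an inference variable.
    let mut v = Vec::new();

    // Pushing a `u32` constrains the unknown: `?T` must be `u32`.
    v.push(22u32);

    // With `?T` resolved, indexing yields a `u32`.
    v[0]
}
```

Calling `first_element()` returns `22`; the point is only that the type of `v` was solved for rather than declared.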

@@ -0,0 +1,122 @@
# Background topics
Member

I think it would be good to add all of these topics to the glossary in brief form with links to the right spot in this chapter.

connected by edges. The key idea of a basic block is that it is a set
of statements that execute "together" -- that is, whenever you branch
to a basic block, you start at the first statement and then execute
all the remainder. Only at the end of the is there the possibility of
Member

"Only at the end of the block"

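To make the quoted description of basic blocks concrete, here is a small sketch of a control-flow graph in Rust. The types `BasicBlock` and `Terminator` and the helper names are hypothetical simplifications chosen for illustration; they are not rustc's actual MIR definitions.

```rust
/// How control leaves a block. Branching is only possible here,
/// at the end of the block. (Hypothetical, simplified type.)
#[derive(Debug, Clone, PartialEq)]
enum Terminator {
    Goto(usize),
    Branch { if_true: usize, if_false: usize },
    Return,
}

/// A straight-line run of statements followed by one terminator.
#[derive(Debug, Clone)]
struct BasicBlock {
    statements: Vec<String>,
    terminator: Terminator,
}

/// Every statement runs whenever the block is entered, so the number
/// executed per entry is simply the number of statements.
fn statements_per_entry(block: &BasicBlock) -> usize {
    block.statements.len()
}

/// The blocks that control may flow to next; only the terminator decides.
fn successors(block: &BasicBlock) -> Vec<usize> {
    match &block.terminator {
        Terminator::Goto(target) => vec![*target],
        Terminator::Branch { if_true, if_false } => vec![*if_true, *if_false],
        Terminator::Return => vec![],
    }
}

/// A three-block graph: block 0 branches to 1 or 2, block 1 jumps to 2,
/// and block 2 returns.
fn example_cfg() -> Vec<BasicBlock> {
    vec![
        BasicBlock {
            statements: vec!["a = 1".to_string(), "b = a + 1".to_string()],
            terminator: Terminator::Branch { if_true: 1, if_false: 2 },
        },
        BasicBlock {
            statements: vec!["c = b".to_string()],
            terminator: Terminator::Goto(2),
        },
        BasicBlock {
            statements: vec![],
            terminator: Terminator::Return,
        },
    ]
}
```

Keeping the terminator separate from the statement list is what encodes the "execute together" property: nothing inside `statements` can redirect control.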

<a name=variance>

## What is co- and contra-variance?
Member

I think there is content from the nomicon that could be borrowed here...

refer to local variables that are defined *outside* of the
expression. We say that those variables **appear free** in the
expression. To see why this term makes sense, consider the next
example.
Member

Missing example here?

Contributor Author

just poor phrasing, fixed
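
For reference, one way to illustrate a variable that appears free is a closure. This is a sketch only: the function name `add_free` is made up here, and this is not the example from the chapter under review.

```rust
/// Adds a value captured from the enclosing scope to the argument.
fn add_free(y: i32) -> i32 {
    let x = 10;

    // Inside the closure body `x + z`:
    // - `z` is bound, because the closure's own parameter list declares it;
    // - `x` appears free, because it is defined outside the closure.
    let add = |z: i32| x + z;

    add(y)
}
```

Here `add_free(5)` evaluates to `15`. Renaming `z` changes nothing outside the closure, whereas the meaning of `x` depends entirely on the surrounding scope.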

src/mir.md Outdated
leading underscore, like `_1`. There is also a special "local"
(`_0`) allocated to store the return value.
- **Places:** expressions that identify a location in memory, like `_1` or `_1.f`.
- **Rvalues:** expressions that product a value. The "R" stands for
Member

"that produce"

src/mir.md Outdated
constant (like `22`) or a place (like `_1`).

You can get a feeling for how MIR is structed by translating simple
programs into MIR and ready the pretty printed output. In fact, the
Member

"and reading"

src/mir.md Outdated
like `_0` or `_1`. We also intermingle the user's variables (e.g.,
`_1`) with temporary values (e.g., `_2` or `_3`). You can tell the
difference between user-defined variables have a comment that gives
you their original name (`// "vec" in scope 1...`).
Member

What are the "scope" blocks?

src/mir.md Outdated

<a name=promoted>

### Promoted constants
Member

Perhaps add "promoted constants" to the glossary?

Contributor Author

done, I wonder if we should have some kind of "index" where we can define these "in situ" and then have another pass that adds them to the glossary

Member

That would be pretty nifty... I was kind of banking on people requesting things be added as they came across them 😛

src/mir.md Outdated
- **Rvalues** are represented by the enum `Rvalue`.
- **Operands** are represented by the enum `Operand`.

## MIR Visitor
Member

I think the content of this subsection can be moved to the other subchapter.

@nikomatsakis mentioned this pull request on Feb 25, 2018
19 tasks
@nikomatsakis
Contributor Author

> Sorry for assaulting you with so many comments 🤕

💯 keep 'em comin'

@nikomatsakis
Contributor Author

> There are a lot of places where things should be placed in triple-backticks to format properly but aren't, especially in the regionck chapter.

Do backticks format differently from "4-space indent"?

@nikomatsakis
Contributor Author

> Do you happen to know what READMEs are subsumed by this chapter?

I'll take a look.

@nikomatsakis
Contributor Author

Haha, at first I only saw a few comments... then I realized GH was hiding 28 of them from me =)

@mark-i-m
Member

> Do backticks format differently from "4-space indent"?

Oh, apparently GH renders them the same. Do you know if mdbook will do the same? I didn't know you could do that.

> I'll take a look.

I updated #2 with my best guesses. Let me know what you think.

src/glossary.md Outdated
DefId | an index identifying a definition (see `librustc/hir/def_id.rs`). Uniquely identifies a `DefPath`.
HIR | the High-level IR, created by lowering and desugaring the AST ([see more](hir.html))
HirId | identifies a particular node in the HIR by combining a def-id with an "intra-definition offset".
HIR Map | The HIR map, accessible via tcx.hir, allows you to quickly navigate the HIR and convert between various forms of identifiers.
free variable | a "free variable" is one that is not bound within an expression or term; see [the background chapter for more](./background.html#free-vs-bound)
Member

Sorry, could you actually move this after DefId? The glossary is a bit out of order at the moment. This is fixed in #56...

Contributor Author

done

@nikomatsakis
Contributor Author

nikomatsakis commented Feb 26, 2018

> Oh, apparently GH renders them the same. Do you know if mdbook will do the same? I didn't know you could do that.

I assume so, 4-space indent was the "original markdown" technique, triple-backticks were added later. I've never quite kicked the habit of 4-space indent...

UPDATE: seems to look fine

@nikomatsakis
Contributor Author

@mark-i-m I added some notes to #2

@nikomatsakis
Contributor Author

@mark-i-m rebased, fixed the order of glossary. The alphabet is hard.

@mark-i-m
Member

> The alphabet is hard.

Lol, tell me about it... I've gotten this wrong for a few PR cycles now 😛

@mark-i-m
Member

Thanks @nikomatsakis!

I need to run now, but I think I should come around to this tomorrow

@mark-i-m mentioned this pull request on Feb 26, 2018
27 tasks
Member

@davidtwco left a comment

Was just reading through this to see if I could pick anything up from it, noticed a small typo.

regions are the results of lexical region inference and hence are
not of much interest. The intention is that -- eventually -- they
will be "erased regions" (i.e., no information at all), since we
don't be doing lexical region inference at all.
Member

since we don't be doing -> since we won't be doing?

Member

@mark-i-m left a comment

Thanks for all the effort @nikomatsakis ! This is a great chapter! I found a few more typos.

The universes section clears up a lot of things 👍

src/glossary.md Outdated
query | perhaps some sub-computation during compilation ([see more](query.html))
region | another term for "lifetime" often used in the literature and in the borrow checker.
sess | the compiler session, which stores global data used throughout compilation
side tables | because the AST and HIR are immutable once created, we often carry extra information about them in the form of hashtables, indexed by the id of a particular node.
sigil | like a keyword but composed entirely of non-alphanumeric tokens. For example, `&` is a sigil for references.
skolemization | a way of handling subtyping around "for-all" types (e.g., `for<'a> fn(&'a u32)` as well as solving higher-ranked trait bounds (e.g., `for<'a> T: Trait<'a>`). See [the chapter on skolemization and universes](./mir-regionck.html#skol) for more details.
Member

I think you are missing a closing paren after the first example...

[appear free][fvb] in the function body.
- First, it finds the set of regions that appear within the
signature of the function (e.g., `'a` in `fn foo<'a>(&'a u32) {
... }`. These are called the "universal" or "free" regions -- in
Member

I think you are missing a closing paren after the example

@@ -154,6 +155,118 @@ outlives `'static`. Now, this *might* be true -- after all, `'!1`
could be `'static` -- but we don't *know* that it's true. So this
should yield up an error (eventually).

### What is a universe
Member

This is a really useful section. It might be worthwhile to note that the root universe is not unique in some sense, IIUC. There is a root universe for every "instance" of regionck, right? Otherwise, you would have generics in your universe from other items, right?

}
```

Here, the root universe would consider of the lifetimes `'static` and
Member

"would consider of" -> "would consist of"

```

When we enter *this* type, we will again create a new universe, which
let's call `U2`. It's parent will be the root universe, and U1 will be
Member

"which we'll"

in common. And because everything in U1 is scoped to just U1 and its
children, that inference variable X would have to be in U0. And since
X is in U0, it cannot name anything from U1 (or U2). This is perhaps easiest
to see by using a kind of generic "logic" example:
Member

Woah. That's really cool. Mind blown. 💥
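
The scoping rule in the quoted passage can be sketched as a small tree of universes. This is a hypothetical model that follows only the description quoted in this thread (a root universe with children `U1` and `U2`); the names `Universe`, `UniverseTree`, and `can_name` are assumptions for illustration, and rustc's own representation may differ.

```rust
/// Index of a universe in the tree. (Hypothetical type.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Universe(usize);

/// A tree of universes stored as parent links; index 0 is the root.
struct UniverseTree {
    parents: Vec<Option<Universe>>,
}

impl UniverseTree {
    /// Creates a tree containing only the root universe.
    fn new() -> Self {
        UniverseTree { parents: vec![None] }
    }

    fn root(&self) -> Universe {
        Universe(0)
    }

    /// Creates a fresh universe nested inside `parent`.
    fn new_child(&mut self, parent: Universe) -> Universe {
        self.parents.push(Some(parent));
        Universe(self.parents.len() - 1)
    }

    /// Something living in `from` can name things declared in `target`
    /// only if `target` is `from` itself or one of its ancestors.
    fn can_name(&self, from: Universe, target: Universe) -> bool {
        let mut current = Some(from);
        while let Some(universe) = current {
            if universe == target {
                return true;
            }
            current = self.parents[universe.0];
        }
        false
    }
}
```

With `U1` and `U2` both created as children of the root, `can_name(U1, root)` holds, while `can_name(root, U1)` and `can_name(U1, U2)` do not. That matches the quoted reasoning: a variable shared between `U1` and `U2` has to live in the root, and from there it cannot name anything introduced in either child.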

@nikomatsakis
Contributor Author

@mark-i-m ok I addressed those changes I believe

Member

@mark-i-m left a comment

🎉

@mark-i-m merged commit 03044a3 into rust-lang:master on Feb 28, 2018