Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor DebruijnIndex to be 0-based #49813

Closed
nikomatsakis opened this issue Apr 9, 2018 · 7 comments
Closed

Refactor DebruijnIndex to be 0-based #49813

nikomatsakis opened this issue Apr 9, 2018 · 7 comments
Labels
A-traits Area: Trait system C-cleanup Category: PRs that clean code up or issues documenting cleanup. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-traits Working group: Traits, https://internals.rust-lang.org/t/announcing-traits-working-group/6804

Comments

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Apr 9, 2018

Part of #49810:

The DebruijnIndex type -- for some reason I no longer recall -- is 1-based:

rust/src/librustc/ty/sty.rs

Lines 897 to 941 in 4b9b70c

/// A [De Bruijn index][dbi] is a standard means of representing
/// regions (and perhaps later types) in a higher-ranked setting. In
/// particular, imagine a type like this:
///
/// for<'a> fn(for<'b> fn(&'b isize, &'a isize), &'a char)
/// ^ ^ | | |
/// | | | | |
/// | +------------+ 1 | |
/// | | |
/// +--------------------------------+ 2 |
/// | |
/// +------------------------------------------+ 1
///
/// In this type, there are two binders (the outer fn and the inner
/// fn). We need to be able to determine, for any given region, which
/// fn type it is bound by, the inner or the outer one. There are
/// various ways you can do this, but a De Bruijn index is one of the
/// more convenient and has some nice properties. The basic idea is to
/// count the number of binders, inside out. Some examples should help
/// clarify what I mean.
///
/// Let's start with the reference type `&'b isize` that is the first
/// argument to the inner function. This region `'b` is assigned a De
/// Bruijn index of 1, meaning "the innermost binder" (in this case, a
/// fn). The region `'a` that appears in the second argument type (`&'a
/// isize`) would then be assigned a De Bruijn index of 2, meaning "the
/// second-innermost binder". (These indices are written on the arrays
/// in the diagram).
///
/// What is interesting is that De Bruijn index attached to a particular
/// variable will vary depending on where it appears. For example,
/// the final type `&'a char` also refers to the region `'a` declared on
/// the outermost fn. But this time, this reference is not nested within
/// any other binders (i.e., it is not an argument to the inner fn, but
/// rather the outer one). Therefore, in this case, it is assigned a
/// De Bruijn index of 1, because the innermost binder in that location
/// is the outer fn.
///
/// [dbi]: http://en.wikipedia.org/wiki/De_Bruijn_index
#[derive(Clone, PartialEq, Eq, Hash, RustcEncodable, RustcDecodable, Debug, Copy, PartialOrd, Ord)]
pub struct DebruijnIndex {
/// We maintain the invariant that this is never 0. So 1 indicates
/// the innermost binder. To ensure this, create with `DebruijnIndex::new`.
pub depth: u32,
}

This seems a bit confusing and unnecessary. If we are going to unify it with CanonicalVar, it'd be nicer if it were 0-based. Also, maybe we should make the internal field private and use an accessor (e.g., to_depth() -> usize or something).

I guess that the first step to doing this refactor is to remove the assertion from new:

rust/src/librustc/ty/sty.rs

Lines 1158 to 1160 in 4b9b70c

pub fn new(depth: u32) -> DebruijnIndex {
assert!(depth > 0);
DebruijnIndex { depth: depth }

A quick ripgrep reveals that DebruijnIndex::new is often invoked with a hard-coded 1 or 2, which can .. presumably be just adjusted to 0 and 1, respectively:

/home/nmatsakis/.cargo/bin/rg --no-heading --color never 'DebruijnIndex::new'
src/librustc_driver/test.rs:330:        let r = self.re_late_bound_with_debruijn(id, ty::DebruijnIndex::new(1));
src/librustc_driver/test.rs:487:            let t_ptr_bound2 = env.t_rptr_late_bound_with_debruijn(1, ty::DebruijnIndex::new(2));
src/librustc_driver/test.rs:524:            let t_rptr_bound2 = env.t_rptr_late_bound_with_debruijn(1, ty::DebruijnIndex::new(2));
src/librustc_driver/test.rs:552:        let t_rptr_bound1 = env.t_rptr_late_bound_with_debruijn(1, ty::DebruijnIndex::new(1));
src/librustc_driver/test.rs:555:        let t_rptr_bound2 = env.t_rptr_late_bound_with_debruijn(1, ty::DebruijnIndex::new(2));
src/librustc_driver/test.rs:571:        let re_bound1 = env.re_late_bound_with_debruijn(1, ty::DebruijnIndex::new(1));
src/librustc_driver/test.rs:586:            let t_rptr_bound2 = env.t_rptr_late_bound_with_debruijn(1, ty::DebruijnIndex::new(2));
src/librustc_trans/common.rs:428:            let env_region = ty::ReLateBound(ty::DebruijnIndex::new(1), ty::BrEnv);
src/librustc/util/ppaux.rs:500:            tcx.mk_region(ty::ReLateBound(ty::DebruijnIndex::new(1), br))
src/librustc/ty/sty.rs:939:    /// the innermost binder. To ensure this, create with `DebruijnIndex::new`.
src/librustc/ty/fold.rs:390:                    self.mk_region(ty::ReLateBound(ty::DebruijnIndex::new(current_depth), br))
src/librustc/ty/fold.rs:451:            self.mk_region(ty::ReLateBound(ty::DebruijnIndex::new(1), ty::BrAnon(counter)))
src/librustc/ty/util.rs:592:        let env_region = ty::ReLateBound(ty::DebruijnIndex::new(1), ty::BrEnv);
src/librustc/middle/resolve_lifetime.rs:101:        let depth = ty::DebruijnIndex::new(1);
src/librustc/middle/resolve_lifetime.rs:110:        let depth = ty::DebruijnIndex::new(1);
src/librustc_typeck/check/intrinsic.rs:122:                    tcx.mk_imm_ref(tcx.mk_region(ty::ReLateBound(ty::DebruijnIndex::new(1),
src/librustc_typeck/check/intrinsic.rs:301:                    tcx.mk_imm_ref(tcx.mk_region(ty::ReLateBound(ty::DebruijnIndex::new(1),
src/librustc_typeck/check/generator_interior.rs:128:        fcx.tcx.mk_region(ty::ReLateBound(ty::DebruijnIndex::new(current_depth),
src/librustc/infer/higher_ranked/mod.rs:420:                    return infcx.tcx.mk_region(ty::ReLateBound(ty::DebruijnIndex::new(1), *a_br));
src/librustc/infer/higher_ranked/mod.rs:476:        fldr(region, ty::DebruijnIndex::new(current_depth))
src/librustc/infer/higher_ranked/mod.rs:754:                        ty::DebruijnIndex::new(current_depth - 1), br.clone()))
@nikomatsakis nikomatsakis added A-traits Area: Trait system T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-traits Working group: Traits, https://internals.rust-lang.org/t/announcing-traits-working-group/6804 labels Apr 9, 2018
@muscar
Copy link

muscar commented Apr 9, 2018

Hi. I can give this a go.

@muscar
Copy link

muscar commented Apr 10, 2018

Quick update: I'm currently working on this. I made the changes locally, but I'm getting some errors. I'll open a WIP PR today.

@muscar
Copy link

muscar commented Apr 10, 2018

WIP PR: #49840

@nikomatsakis
Copy link
Contributor Author

@muscar nice!

@nikomatsakis
Copy link
Contributor Author

I've been thinking about this. I think perhaps the way to go is to do this in some more fine-grained steps. I think we want to get to the point where the field of DebruijnIndex is private, so the definition sort of looks like this:

pub struct DebruijnIndex {
    index: u32,
}

Moreover, right now there are two concepts at play. There is "depth", which is 0 when there are no binders in scope and 1 if we have passed through one binder, and "index", which would start at 0. In the existing definition, we kind of unify those by using depth all the time. So e.g. there are a bunch of visitors that carry a binders field

struct GatherLifetimes<'a> {
map: &'a NamedRegionMap,
binder_depth: u32,
have_bound_regions: bool,
lifetimes: FxHashSet<Region>,
}

The field starts at 0 and increments as we pass through a binder:

if let hir::TyBareFn(_) = ty.node {
self.binder_depth += 1;
}

and we then compare against the depth field:

if debruijn.depth < self.binder_depth =>

I want to refactor those to carry not a "number of binders" as a u32 but rather a DebruijnIndex, representing an "outer" binder that was not passed through -- let's call it outer_index. It would start as the "innermost" debruijn index value (today: 1, after this issue is closed: 0). We would then increment and decrement same as today, but instead of checking if debruijn.depth < self.binder_depth, as in the code above, we would check debruijn.depth < self.outer_index in order to see if the late-bound region is bound by one of the binders we passed through.

Also, rather than directly manipulating the field and adding one, we should use helper functions. This is the DebruijnIndex API I was working towards:

impl DebruijnIndex {
    pub const INNERMOST: DebruijnIndex = DebruijnIndex::new(0);

    /// Returns a DebruijnIndex with the given index.
    pub fn new(index: u32) -> Self {
        DebruijnIndex { index }
    }

    /// Returns the index of the enclosing binder that binds this
    /// region/type. Returns 0 to represent the innermost binder.
    pub fn index(self) -> usize {
        self.index
    }

    /// Returns the resulting index when this value is moved into
    /// `amount` number of new binders. So e.g. if you had
    ///
    ///    for<'a> fn(&'a x)
    ///
    /// and you wanted to change to
    ///
    ///    for<'a> fn(for<'b> fn(&'a x))
    ///
    /// you would need to shift the index for `'a` into 1 new binder.
    pub fn shifted_in(self, amount: u32) -> DebruijnIndex {
        DebruijnIndex { index: self.index + amount }
    }

    /// Update this index in place by shifting it "in" through
    /// `amount` number of binders.
    pub fn shift_in(&mut self, amount: u32) {
        *self = self.shifted(amount);
    }

    /// Returns the resulting index when this value is moved out from
    /// `amount` number of new binders.
    pub fn shifted_out(self, amount: u32) -> DebruijnIndex {
        DebruijnIndex { index: self.index - amount }
    }

    /// Update in place by shifting out from `amount` binders.
    pub fn shift_out(&self, amount: u32) {
        *self = self.shifted(amount);
    }
}

@nikomatsakis
Copy link
Contributor Author

@csmoe mentioned an interest in that; @muscar, I've not heard from you in a while, so I'm assuming you got busy (which is totally fine!), but let me know otherwise!

@nikomatsakis
Copy link
Contributor Author

oh btw the approach I would take here is to:

  • add the API methods above, but without changing the definition of the DebruijnIndex struct yet (i.e., keep field public, and keep it starting at 1)
  • convert each visitor to the new style, one by one
  • convert any other uses of the field, one by one
  • then make field private and remove remaining uses

Use ./x.py check in between iterations to ensure everything builds at each step. Then, when an error occurs, we can bisect to see why. =)

@jkordish jkordish added the C-cleanup Category: PRs that clean code up or issues documenting cleanup. label May 16, 2018
bors added a commit that referenced this issue May 29, 2018
Refactor DebruijnIndex to be 0-based

Fixes #49813
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-traits Area: Trait system C-cleanup Category: PRs that clean code up or issues documenting cleanup. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-traits Working group: Traits, https://internals.rust-lang.org/t/announcing-traits-working-group/6804
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants