Miri engine: avoid having multiple AllocId for the same static #61345

RalfJung · 2019-05-30T09:24:05Z

We currently actually assign two AllocId to every static/const: one pointing to the DefId (a "lazy" ID that can be created without evaluating anything), and one pointing to an actual Allocation (a "resolved ID" available only after it has been evaluated). Also see the comments added in #61278. The "resolved ID" should never be visible to other CTFE or Miri evaluations, because then we'd have two different IDs for the same allocation!

The second ID gets assigned when we intern the result of const evaluation. Or rather, it gets assigned and added to the local map of the CTFE engine when we allocate the return place for const evaluation. That return place is later used as the root for interning, at which point it gets moved into the global tcx allocation map.

First of all, are we entirely sure that the stuff we intern will not use the resolved ID anywhere? Rust provides no "obvious" way to take the address of the return place, and I think that would be the only way the "resolved ID" could leak into the evaluated program. Still, can we make interning not intern the "root", just to be really sure that if that ID leaks somehow it will not cause problems? Or even better, can we not allocate a new ID for the return place and instead use the "original" ID? In the CTFE engine, that ID could map to an Allocation in the local memory map, even though it maps to a Static in the global tcx memory. By avoiding even assigning a second ID we'd avoid all problems!
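To make the two-ID situation concrete, here is a minimal toy model. It is only a sketch: `GlobalMap`, `lazy_id`, and `intern_with_fresh_id` are hypothetical names invented for illustration and do not mirror the real rustc data structures. It shows how a single static ends up with a "lazy" ID and a distinct "resolved" ID when interning reserves a fresh ID.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the compiler's types.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct AllocId(u64);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DefId(u32);

#[derive(Clone, Debug, PartialEq)]
struct Allocation {
    bytes: Vec<u8>,
}

#[derive(Clone, Debug, PartialEq)]
enum GlobalAlloc {
    /// "Lazy" entry: names the static without evaluating it.
    Static(DefId),
    /// "Resolved" entry: backed by actual evaluated memory.
    Memory(Allocation),
}

#[derive(Default)]
struct GlobalMap {
    next: u64,
    allocs: HashMap<AllocId, GlobalAlloc>,
    lazy_ids: HashMap<DefId, AllocId>,
}

impl GlobalMap {
    fn reserve(&mut self) -> AllocId {
        let id = AllocId(self.next);
        self.next += 1;
        id
    }

    /// Returns the lazy ID for a static, creating it on first use.
    fn lazy_id(&mut self, def: DefId) -> AllocId {
        if let Some(&id) = self.lazy_ids.get(&def) {
            return id;
        }
        let id = self.reserve();
        self.allocs.insert(id, GlobalAlloc::Static(def));
        self.lazy_ids.insert(def, id);
        id
    }

    /// Models the situation described above: the evaluated return place
    /// is interned under a freshly reserved ID.
    fn intern_with_fresh_id(&mut self, alloc: Allocation) -> AllocId {
        let id = self.reserve();
        self.allocs.insert(id, GlobalAlloc::Memory(alloc));
        id
    }
}

fn main() {
    let mut map = GlobalMap::default();
    let foo = DefId(0);
    let lazy = map.lazy_id(foo);
    let resolved = map.intern_with_fresh_id(Allocation { bytes: vec![42, 0, 0, 0] });
    // Same static, two different IDs: anything comparing pointers by
    // AllocId would treat them as unrelated allocations.
    assert_ne!(lazy, resolved);
    println!("lazy = {:?}, resolved = {:?}", lazy, resolved);
}
```

In this toy model, the proposal of reusing the "original" ID would amount to never calling `intern_with_fresh_id` for a static's return place at all.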

If we cannot avoid assigning a second ID, we also have to make sure that other computations, when they request the content of a static, do not use the "resolved ID". Currently, the const_eval_raw query will return a RawConst pointing exactly to that "resolved ID". Maybe we should instead make it return a &'tcx Allocation? We don't really need the ID in there because whoever made the const_eval_raw query obviously had the DefId needed to do that, and that's all it takes to look up the "lazy" ID in the tcx! But we could still include it for convenience.
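A rough sketch of that API shape, under loud assumptions: `Tcx`, `eval_static_contents`, and `lazy_id` below are hypothetical toy names, not the real query system. The point is only that a query keyed by `DefId` can hand back the contents by reference, and the caller can recover the "lazy" ID from the same `DefId`, so no "resolved ID" ever needs to escape.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the compiler's types.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DefId(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct AllocId(u64);

#[derive(Debug, PartialEq)]
struct Allocation {
    bytes: Vec<u8>,
}

/// Toy context: one table of lazy IDs, one table of evaluated contents.
struct Tcx {
    lazy_ids: HashMap<DefId, AllocId>,
    evaluated: HashMap<DefId, Allocation>,
}

impl Tcx {
    /// Hypothetical query: returns the evaluated contents, not an ID.
    fn eval_static_contents(&self, def: DefId) -> Option<&Allocation> {
        self.evaluated.get(&def)
    }

    /// The only ID a caller can obtain is the lazy one, via the DefId.
    fn lazy_id(&self, def: DefId) -> Option<AllocId> {
        self.lazy_ids.get(&def).copied()
    }
}

/// Builds a toy context containing a single evaluated static.
fn sample_tcx() -> Tcx {
    let def = DefId(0);
    let mut lazy_ids = HashMap::new();
    lazy_ids.insert(def, AllocId(10));
    let mut evaluated = HashMap::new();
    evaluated.insert(def, Allocation { bytes: vec![42, 0, 0, 0] });
    Tcx { lazy_ids, evaluated }
}

fn main() {
    let tcx = sample_tcx();
    let def = DefId(0);
    let contents = tcx.eval_static_contents(def).expect("static was evaluated");
    let id = tcx.lazy_id(def).expect("static has a lazy ID");
    println!("static {:?}: id {:?}, {} bytes", def, id, contents.bytes.len());
}
```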

Cc @oli-obk


RalfJung · 2019-05-30T10:01:42Z

A potentially related problem is that we create new allocations for constant slices every time eval_const_to_op gets executed.

RalfJung · 2019-05-30T10:55:31Z

Similarly, LLVM codegen also creates new allocations for slices in from_const. That also seems strange.

RalfJung · 2019-12-23T21:43:45Z

@oli-obk do #67000 and the other planned changes around statics change anything here?

oli-obk · 2019-12-23T21:51:00Z

I think we can do this, yes. Instead of having GlobalAlloc::Static we could just have this AllocId refer to the real allocation directly. Since we have a scheme to lazily fill in the allocation for an ID, this should work out by temporarily also having the ID in the local alloc map. The only worry I have is that we'll be able to mutate a static's memory via its name in the initializer of said static. The mutated value will be overridden in the end though, so it's not observable except for the weird situation where you can write to it.

RalfJung · 2022-07-20T14:55:51Z

With #99420, we have the same situation also for vtables -- they have a "symbolic" alloc ID and a "concrete" one that is backed by an actual &Allocation.

I think my preferred solution at this point would be to

- make vtable_allocation return a &Allocation, so that we never have a concrete AllocId for vtables in the global alloc_map. However this will require significant changes to a lot of consumers of this function.
- do something similar for statics: when evaluating a static, put the return place for that static at the AllocId that is symbolic in the global alloc_map -- we will then have a concrete allocation in the interpreter memory, but overlap between these two is fine (it happens all the time in Miri) and the local interpreter memory takes precedence. That avoids allocating a second AllocId for this static. It means that eval_static needs to return an &Allocation (with the concrete contents), not an AllocId -- does that make sense?
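The "local interpreter memory takes precedence" idea for statics can be sketched as follows. This is a toy model under stated assumptions: `Interp`, `allocate_return_place_at`, and `Lookup` are hypothetical names invented here, not the interpreter's real API. The return place is allocated at the static's existing symbolic ID, so no second ID is created, and lookups consult the local map before the global one.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the compiler's types.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct AllocId(u64);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DefId(u32);

#[derive(Clone, Debug, PartialEq)]
struct Allocation {
    bytes: Vec<u8>,
}

#[derive(Debug, PartialEq)]
enum GlobalAlloc {
    Static(DefId),
    Memory(Allocation),
}

/// Result of a lookup: which map answered the request.
#[derive(Debug, PartialEq)]
enum Lookup<'a> {
    Local(&'a Allocation),
    Global(&'a GlobalAlloc),
}

struct Interp<'g> {
    global: &'g HashMap<AllocId, GlobalAlloc>,
    local: HashMap<AllocId, Allocation>,
}

impl<'g> Interp<'g> {
    fn new(global: &'g HashMap<AllocId, GlobalAlloc>) -> Self {
        Interp { global, local: HashMap::new() }
    }

    /// Puts the return place at an ID that already exists (symbolically)
    /// in the global map, instead of reserving a fresh ID.
    fn allocate_return_place_at(&mut self, id: AllocId, size: usize) {
        self.local.insert(id, Allocation { bytes: vec![0; size] });
    }

    /// Local memory shadows the global map for the same ID.
    fn lookup(&self, id: AllocId) -> Option<Lookup<'_>> {
        if let Some(alloc) = self.local.get(&id) {
            return Some(Lookup::Local(alloc));
        }
        self.global.get(&id).map(|g| Lookup::Global(g))
    }
}

fn main() {
    let mut global = HashMap::new();
    let symbolic = AllocId(0);
    global.insert(symbolic, GlobalAlloc::Static(DefId(0)));

    let mut interp = Interp::new(&global);
    println!("before evaluation: {:?}", interp.lookup(symbolic));
    interp.allocate_return_place_at(symbolic, 4);
    println!("during evaluation: {:?}", interp.lookup(symbolic));
}
```

Other evaluations, which only see the global map in this sketch, keep resolving the same ID to the symbolic `Static` entry, so there is exactly one ID per static.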

…, r=RalfJung,cjgillot

Store static initializers in metadata instead of the MIR of statics.

This means that adding generic statics would be even more difficult, as we can't evaluate statics from other crates anymore, but the subtle issue I have encountered makes me think that having this be an explicit problem is better.

The issue is that

```rust
static mut FOO: &mut u32 = &mut 42;
static mut BAR = unsafe { FOO };
```

gets different allocations, instead of referring to the same one. This is also true for non-static mut, but promotion makes `static FOO: &u32 = &42;` annoying to demo.

Fixes rust-lang#61345

## Why is this being done?

In order to ensure all crates see the same nested allocations (which is the last issue that needs fixing before we can stabilize [`const_mut_refs`](rust-lang#57349)), I am working on creating anonymous (from the Rust side, to LLVM it's like a regular static item) static items for the nested allocations in a static. If we evaluate the static item in a downstream crate again, we will end up duplicating its nested allocations (and in some cases, like the `match` case, even duplicate the main allocation).


RalfJung mentioned this issue May 30, 2019

Miri: give machine the chance to tag all allocations #61278

Merged

RalfJung changed the title ~~Miri engine~~ Miri engine: avoid having multiple AllocId for the same static May 30, 2019

RalfJung mentioned this issue Apr 16, 2020

Make TLS accesses explicit in MIR #71192

Merged

oli-obk added the E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. label Jul 20, 2022

RalfJung mentioned this issue Oct 9, 2023

Store static initializers in metadata instead of the MIR of statics. #116564

Merged

bors closed this as completed in 6a4222b Feb 15, 2024
