Refactor the way interpreters handle special functions #121273

WaffleLapkin · 2024-02-18T19:16:00Z

... by reusing the ExtraFnVal type.

This changes find_mir_or_eval_fn into find_mir_or_extra_fn, which helps structure code a little bit better. Having separate "find info for evaluation" and "evaluate" steps is nice on its own, but it's also required for tail calls.

(this is needed for #113128, as otherwise correctly implementing tail calls in the interpreter creates a dependency loop)

r? @oli-obk

rustbot · 2024-02-18T19:16:10Z

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

The Miri subtree was changed

cc @rust-lang/miri

By reusing the `ExtraFnVal` type. This changes `find_mir_or_eval_fn` into `find_mir_or_extra_fn`, which helps structure code a little bit better. Having separate "find info for evaluation" and "evaluate" steps is nice on its own, but it's also required for tail calls.

WaffleLapkin · 2024-02-18T20:55:06Z

Stack overflow happens on this example:

#![feature(const_align_offset)]

const _: usize = (1 as *const ()).align_offset(2);

fn main() {}

With the following infinite recursion:

eval_fn_call -> call_extra_fn (ref)
call_extra_fn -> eval_fn_call (ref)

I'm not exactly sure how to resolve this... When handling align_offset we need to try special const eval version, but if it fails (the pointer doesn't point to an allocation) we need to actually execute the normal function body, which gets us to the same exact Instance which makes us retry the const eval path....

I can add a dont_use_extra_fn: bool to the eval_fn_call/find_mir_or_extra_fn so that find_mir_or_extra_fn returns the actual body if it's true but uuuhhhhh. I don't like this solution... Any ideas?

compiler/rustc_const_eval/src/const_eval/machine.rs

compiler/rustc_const_eval/src/interpret/terminator.rs

oli-obk · 2024-02-19T10:25:21Z

compiler/rustc_const_eval/src/interpret/machine.rs

@@ -191,22 +192,17 @@ pub trait Machine<'mir, 'tcx: 'mir>: Sized {
    /// nor just jump to `ret`, but instead push their own stack frame.)
    /// Passing `dest`and `ret` in the same `Option` proved very annoying when only one of them
    /// was used.
-    fn find_mir_or_eval_fn(
+    fn find_mir_or_extra_fn(


The documentation of this function needs an update

I fixed it up, should be better now.

Athough it looks like there are other tests and comments mentioning __rust_maybe_catch_panic, which doesn't seem to exist anymore?... Possibly someone needs to go over them and update them too.

WaffleLapkin · 2024-02-19T13:53:37Z

@bors r=oli-obk

bors · 2024-02-19T13:53:40Z

📌 Commit 3f17b34 has been approved by oli-obk

It is now in the queue for this repository.

RalfJung

Thanks for the PR! I still have some concerns though. @bors r-

Having separate "find info for evaluation" and "evaluate" steps is nice on its own,

I buy this in the abstract, but looking at the PR I am not entirely sold:

You had to expose a fairly low-level function (eval_fn_call) as fully pub now, and add an even-lower-level helper (eval_body) as pub(crate), indicating maybe the new structure is worse than the old one in terms of encapsulation.
The phase separation doesn't seem to be entirely working, given that in a bunch of cases call_extra_fn has to push a stack frame with MIR in it. So now we have two different code paths that do something similar: find_mir_or_extra_fn returns MIR, or it returns an ExtraFnVal and then call_extra_fn gets the MIR and has to deal with it by itself. That's definitely worse than before. Is there a way to avoid this redundancy?

Maybe explain on a high level the problem you are trying to solve. We should look for a better design that solves your problem while avoiding introducing these issues.

RalfJung · 2024-02-19T19:51:41Z

compiler/rustc_const_eval/src/const_eval/machine.rs

+                        // const eval extra fn which ends up here, so calling `eval_fn_call`
+                        // would cause infinite recursion and stack overflow.
+                        let body = ecx.load_mir(instance.def, None)?;
+                        ecx.eval_body(


Instead of having to introduce and expose a new very low-level function here, would it work to have find_mir_or_extra_fn make the choice of whether to call the real function or the interpreter shim? Fundamentally the problem seems to be that we first said we want an ExtraFn but then later we decided we want MIR anyway.

OTOH, Miri also has that situation, I wonder how this is handled there.

RalfJung · 2024-02-19T19:52:34Z

src/tools/miri/src/shims/foreign_items.rs

+pub enum ExtraFnVal {
+    ForeignFn { link_name: Symbol },
+    DynSym(DynSym),
+}


Both of these represent function pointers to a named global symbol. Why do we need to distinguish these variants?

I don't know. They were handled differently before and I just kept the same difference — DynSym has an assert that makes sure that we actually used a shim, and not an exported symbol.

Before, this type had a clear meaning: it represented "things a function pointer can point to". The interpreter natively supports function pointers that point to an Instance, but for dlsym support we needed function pointers that point to something else, and that's what this extra-fn stuff is about.

But with your change, now there can be a function pointer that points to DynSym("functionname"), and a function pointer that points to ForeignFn { link_name: "functionname" }, and you made those be different things. That's not pre-existing, that's new in this PR. And I don't see why we need these two kinds of function pointers. The assert indicates that either these should be two different types, or one type that's just a Symbol like DynSym before. But I can't see why it would be a good idea to make this a sum type with two variants that conceptually represent the same thing.

Using ExtraFnVal as return type in find_mir_or_extra_fn makes some sense, because it matches the same dichotomy: either a native Rust function (Instance/MIR body), or "something else". Probably things could be cleaned up a bit by actually using FnVal<ExtraFnVal> as return type, then the function pointer ("indirect call"/Ty::FnPtr) codepath and the direct call (Ty::FnDef) code path can be even more unified.

But that is something we can do later, currently there are bigger problems.

That's not pre-existing

I'm pretty sure it is. DynSyms are handled in emulate_dyn_sym here:

rust/src/tools/miri/src/shims/foreign_items.rs

Lines 156 to 157 in 0395fa3

let res = self.emulate_foreign_item(sym.0, abi, args, dest, ret, unwind)?;

assert!(res.is_none(), "DynSyms that delegate are not supported");

While usual foreign function are handled in a branch in find_mir_or_eval_fn here:

rust/src/tools/miri/src/machine.rs

Line 960 in 0395fa3

return ecx.emulate_foreign_item(link_name, abi, &args, dest, ret, unwind);

Those are slightly distinct (see the assert) even though they result in basically the same emulate_foreign_item call.

I have nothing against having only a single type, but I'm not sure how to keep this assert then...

I'm pretty sure it is. DynSyms are handled in emulate_dyn_sym here:

While usual foreign function are handled in a branch in find_mir_or_eval_fn here:

Yes, but that's not what I was talking about. We just had two separate codepaths for handling calls to Rust items vs dynamically loaded items. The "obvious" thing for you to do would be having two different types to represent this. You decided to unify these types, which I can get behind (in fact I think it's a great idea), but it's a bad idea to unify them by introducing redundancy in the value type of the interpreter.

Here's what I said: there were before not two ways to represent a function pointer. You then pointed at two functions doing related things. That's a category error as I was talking about representation! So any counterargument can only point at data in the interpreter, not interpreter functions.

Those are slightly distinct (see the assert) even though they result in basically the same emulate_foreign_item call.

Yeah, the dynsym handling doesn't currently support calling a MIR-implemented function. We simply didn't need this so far. It may be that with your patch, supporting this becomes trivial, so then there's no more reason for the assert.

RalfJung · 2024-02-19T19:53:32Z

src/tools/miri/src/shims/foreign_items.rs

                    | "ExitProcess"
                    => {
                        let exp_abi = if link_name.as_str() == "exit" {
                            Abi::C { unwind: false }
                        } else {
                            Abi::System { unwind: false }
                        };
-                        let [code] = this.check_shim(abi, exp_abi, link_name, args)?;
+                        let args = this.copy_fn_args(args)?; // FIXME: Should `InPlace` arguments be reset to uninit?


Why is this duplicated so often now?

Because there is a code path which needs &[FnArg], but most need a &[OpTy].

There's now more of these asserts than before though, I think? So what changed?

RalfJung · 2024-02-19T19:54:28Z

compiler/rustc_const_eval/src/interpret/terminator.rs

@@ -497,15 +498,15 @@ impl<'mir, 'tcx: 'mir, M: Machine<'mir, 'tcx>> InterpCx<'mir, 'tcx, M> {
    /// `with_caller_location` indicates whether the caller passed a caller location. Miri
    /// implements caller locations without argument passing, but to match `FnAbi` we need to know
    /// when those arguments are present.
-    pub(crate) fn eval_fn_call(
+    pub fn eval_fn_call(


I'm not happy with making this more public, this is a rather low-level function. Can this be avoided?

It can probably be avoided with the return Result<Option<Instance>> scheme.

I feel like returning Option<Instance> is the same level of awkwardness...

This PR has the awkwardness of attempting to split "find the fn" from "call the fn", but then in quite a few cases "call the fn" ends up finding MIR elsewhere, completely breaking the structure.

If it is okay for call_extra_fn to call eval_fn_call, then we're just back to where we were before but with more messy code, I think? I don't see how you gained anything.

RalfJung · 2024-02-19T19:54:52Z

compiler/rustc_const_eval/src/interpret/terminator.rs

@@ -885,6 +709,198 @@ impl<'mir, 'tcx: 'mir, M: Machine<'mir, 'tcx>> InterpCx<'mir, 'tcx, M> {
        }
    }

+    pub(crate) fn eval_body(


Please add a doc comment.

RalfJung · 2024-02-19T19:59:20Z

@bors r-
since bors does not seem to be reading review comments

WaffleLapkin · 2024-02-19T21:03:42Z

@RalfJung the problem I'm trying to solve is needing to pass arguments to find_mir_or_.... This is a problem because it creates the following circular dependency: to evaluate arguments for a tail call we need to create a new stack frame¹, but to create a new stack frame, we need to know what function we are calling, which requires calling find_mir_or_..., and so on.

would it work to have find_mir_or_extra_fn make the choice of whether to call the real function or the interpreter shim?

I don't think so, no. This won't work because align_offset needs to check arguments to know if we want to call normal function or the shim (depending on if this is a pointer to an allocation or a pointer typed integer).

Although I guess we could implement both paths in a shim?... The problem is only the complexity of align_offset implementation.

But other than that, I think all other cases can decide based on just the instance.

Fundamentally the problem seems to be that we first said we want an ExtraFn but then later we decided we want MIR anyway.

Yes, that's the awkward thing. It both causes my original problem (evaluating special function requires arguments, but getting mir shouldn't) and the awkwardness of this PR (everything calling into each other, eval_body, etc).

We need to copy arguments to a new stack frame, otherwise they die before being passed to the function ↩

WaffleLapkin · 2024-02-19T21:08:36Z

Although. Mh. It looks like naive implementation of align_offset is not that complex, actually? (~~unless I've missed some edgecase ofc~~; UPD: I indeed missed the case where size is not a power of two)

(~~play~~)

WaffleLapkin · 2024-02-19T21:21:08Z

Actually, InterpCx::align_offset already calls the original instance in the const eval case, so this might be easier than I thought?... Ah, nevermind, copying this call would also cause infinite recursion, I'm pretty sure.

RalfJung · 2024-02-19T22:29:11Z

Yes, that's the awkward thing. It both causes my original problem (evaluating special function requires arguments, but getting mir shouldn't) and the awkwardness of this PR (everything calling into each other, eval_body, etc).

Ah okay so that's why you can't have find_mir_or_extra_fn check the argument for align_offset to decide whether to pick a MIR body or call a special handler? I'll have to spend more time to understand why you need to find the MIR without knowing the arguments (not sure when I will have the time to read all the things you wrote about that), but I can just take this constraint as given for now.

For align_offset, I think this means you need to always pick the MIR body (making align_offset not even be a special function any more), and then in the MIR body call a special intrinsic that allows to early return in a special case in const-eval. This could be done as a separate PR preparing for this one.

But the problem isn't solved then, Miri also has some cases where it wants to run a MIR body:

easy: panic_impl should call the #[panic_handler] function. That can just be done in find_mir_or_extra_fn rather than eval_extra_fn.
medium: the Rust allocator functions should call the #[global_allocator] when it exists, but be special magic functions without MIR when it does not. This may mean that both find_mir_or_extra_fn and eval_extra_fn need to match on these names, to handle these two cases.
hard: when a foreign (extern) function is not supported, we fall back to finding a function with that name as its #[link_name]. But of course in find_mir_or_extra_fn we can't know yet whether it is supported or not... I am not sure how to best to this without losing the ability to detect the situation where someone uses #[link_name] to overwrite a function Miri supports (which are generally well-known functions that it is UB to overwrite). We could completely refactor the "foreign item" handling in Miri such that there is a global table / registry from function names to handlers for these functions, and then find_mir_or_extra_fn could check #[link_name] and check if an entry exists in that table to make sure we don't overwrite such well-known functions... but that would be a huge refactoring, I hope there's an easier way.

WaffleLapkin · 2024-02-19T23:34:27Z

For align_offset, I think this means you need to always pick the MIR body (making align_offset not even be a special function any more), and then in the MIR body call a special intrinsic that allows to early return in a special case in const-eval. This could be done as a separate PR preparing for this one.

Makes sense, sure.

But the problem isn't solved then, Miri also has some cases where it wants to run a MIR body:

I think all of those are solved by allowing find_mir_or_extra_fn to return a different instance/body pair (as it was before the current changes) + using more enums inside ExtraFnVal.

panic_impl — find_mir_or_extra_fn returns the instance/body of the right function
global_allocator — find_mir_or_extra_fn checks the allocator mode and either returns the instance/body of the right function or returns an enum describing the function to be emulated
foreign functions — same deal

RalfJung · 2024-02-20T06:53:19Z

foreign functions — same deal

How is find_mir_or_extra_fn supposed to know whether Miri supports a given foreign function?
This is one big match over the function name, and the match will live in eval_extra_fn with your approach. I certainly don't want to duplicate this list.

In fact it gets even worse -- Miri has support for loading a .so file and dynamically using functions from that .so file to implement foreign functions. So knowing whether we support a function involves checking whether the .so file contains a function of the given name.

RalfJung · 2024-02-20T07:33:51Z

Looking at the current align_offset handling (in master) -- woah that is messy.^^ align_offset calls eval_fn_call which recursively calls align_offset again but now with different arguments so there's no loop.

Miri foreign item handling also has to sometimes push a function, which is similar to what align_offset needs to do, but Miri does it quite differently.

And neither of these seem great, they're both accessing pretty low-level interpreter functionality. If I were to refactor this now I'd try to extract the common code between eval_fn_call and Miri's call_function: something that takes instance, MIR body, ABIs, etc, as well as arguments, and then pushes the stack frame and passes the arguments. Basically, the part of eval_fn_call that's after find_mir_or_eval_fn. Both CTFE align_offset and Miri should be able to use this. And this seems quite similar to your eval_body as well! I didn't realize this is what it does, the diff makes it quite hard to see. So maybe that would be a good first step -- make eval_fn_call and push_stack_frame private (or pub(super) if needed) and make the only pub thing something like eval_body -- though I'd name it something more like push_stack_frame_with_args or so.

That doesn't achieve your desired split of "find fn" and "call fn" though, which I still need to understand better.

RalfJung · 2024-02-20T10:07:17Z

This is a problem because it creates the following circular dependency: to evaluate arguments for a tail call we need to create a new stack frame1, but to create a new stack frame, we need to know what function we are calling, which requires calling find_mir_or_..., and so on.

Hm... but normal calls also already have that issue, don't they? We evaluate arguments after pushing the stack frame. And it does work fine. What changes when considering tail calls?

align_offset is awkward indeed. I've now sketched two ways to improve it:

make it not be special by itself, but instead call an intrinsic to allow the interpreter to short-circuit its evaluation.
use push_stack_frame_with_args instead of eval_fn_call to avoid the circularity.

oli-obk · 2024-02-20T11:24:44Z

For intrinsics with fallback bodies we will need a similar scheme soon (fall back to a different instance than the one invoked)

oli-obk · 2024-02-20T19:16:39Z

I don't understand the state of the discussion here anymore. We can figure out how to untangle align_offset, but that seems orthogonal to this PR

IMO the PR I approved was a good refactoring. We could make it more codegen-like by not calling the bodies directly, but by bubbling the new instance outwards towards the regular call logic. But that's not strictly necessary

RalfJung · 2024-02-20T21:21:08Z

IMO this PR is a strict decrease in code organization and maintainability.

Currently we have a single clear entrypoint for "what happens on a fn call": find_mir_or_eval_fn. It either returns the MIR+Instance to use or just completely emulates handles the call itself. What align_offset does is messy, but this PR doesn't improve that I think?

With this PR we have one function (find_mir_or_extra_fn) that will sometimes find the MIR and sometimes return a ExtraFnVal, and then a second function (call_extra_fn) that sometimes finds the MIR and sometimes handles the call. When call_extra_fn finds the MIR it has to take care of initializing the stack frame itself, but when find_mir_or_extra_fn finds the MIR then the stack frame handling is done by the core interpreter. This is redundant, and inconsistent. That looks strictly worse than the status quo to me. We also have to make previously private interpreter functions public, which is never a good sign. And then there is the change to the value type for function pointers in Miri (which is ExtraFnVal's primary purpose), which I don't think makes sense.

Also if it is somehow necessary to find the MIR in find_mir_or_extra_fn for tail calls, this PR doesn't achieve that since sometimes it's call_extra_fn that finds the MIR. So I don't even understand how this resolves whatever issue there is with tail calls.

I can think of various ways to clean this up, since I agree that improvements are possible here. I've mentioned some of them above. But I don't think this PR is an improvement. And I don't understand how tail calls factor in so I don't know which of the refactorings I have in my mind are prepared for tail calls.

Without taking into account tail calls, the first refactor I would try is to notice that eval_fn_call kind of has 3 branches: call_extra_fn, call_intrinsic, find_mir_or_eval_fn. They differ quite a bit in what the hooks can do and which services are "offered" to them (like being able to just return a MIR body and have someone else do the work of initializing the stack frame). These could probably be merged into a single function (i.e., a single machine hook, probably still called find_mir_or_eval_fn) that takes FnVal and returns Option<(&'mir mir::Body<'tcx>, ty::Instance<'tcx>)>. And then I already mentioned above that it would be a good idea to make eval_fn_call and push_stack_frame private and only expose a push_stack_frame_with_args. (Though that does make me wonder which caller FnAbi Miri's call_function helper should use.)

What I don't see is why splitting this new version of find_mir_or_eval_fn into two parts is a good idea. This introduces a two-phase split with a "narrow waist" into the logic of that function, and that split is just not natural if we look at what this function does today. (If this truly did achieve separating of "find function" from "eval function", I could be convinced that this helps structure the code. But this is not what the PR does. The "eval function" hook in many cases still ends up just finding the function. And it's not surprising, the things we want to do in Miri just don't lend themselves to introducing such a "narrow waist". So looking at the machine trait I might be misled into thinking that things are organized one way, when actually that organization isn't present.) Intrinsic fallback bodies shouldn't need this. Tail calls maybe do but I haven't yet understood why. Once I understand the constraints I'm happy to help design a refactor that satisfies them. I don't want to block tail calls, but before we land a PR that imposes a structure on the code that the code just doesn't have, complete with the work-arounds required because of that mismatch, I'd like us to take a step back to the drawing board.

RalfJung · 2024-02-20T21:50:48Z

(I've kept editing the post above, please read on Github of you're usually just reading the email notifications.)

bors · 2024-02-27T04:08:55Z

☔ The latest upstream changes (presumably #121655) made this pull request unmergeable. Please resolve the merge conflicts.

WaffleLapkin · 2024-02-28T21:49:08Z

@RalfJung

Also if it is somehow necessary to find the MIR in find_mir_or_extra_fn for tail calls, this PR doesn't achieve that since sometimes it's call_extra_fn that finds the MIR. So I don't even understand how this resolves whatever issue there is with tail calls.

I think what I had in mind is that if find_mir_or_extra_fn returns an extra fn, we downgrade the tail call to a normal call. This is questionable, but I believe would be correct given that extra fns don't use tail calls themselves.

Still, I agree that this PR is not a clear improvement. Let me try to explain the problem I'm trying to solve.

So the problem is here (TerminatorKind::TailCall PR):

rust/compiler/rustc_const_eval/src/interpret/terminator.rs

Lines 170 to 189 in fb3375c

    
           let Some(prev_frame) = self.stack_mut().pop() else { 
        
               span_bug!( 
        
                   terminator.source_info.span, 
        
                   "empty stack while evaluating this tail call" 
        
               ) 
        
           }; 
        
           let StackPopCleanup::Goto { ret, unwind } = prev_frame.return_to_block else { 
        
               span_bug!(terminator.source_info.span, "tail call with the root stack frame") 
        
           }; 
        
           self.eval_fn_call( 
        
               callee, 
        
               (fn_sig.abi, fn_abi), 
        
               &args, 
        
               with_caller_location, 
        
               &prev_frame.return_place, 
        
               ret, 
        
               unwind, 
        
           )?;

(this will be refactored into a replace_stack_frame function but that's not the point)

This currently is completely broken. stack_mut().pop() invalidates the arguments' locals, causing ICE when performing the call. To fix this I wanted to do the following (in replace_stack_frame):

push the new stack frame
move the arguments from the old stack frame to the new one
delete the old stack frame with .remove(len - 2) (-ish)
evaluate the function 🎉

The problem is that there is a circular dependency — 1 requires knowing the function (stack frame contains body & instance), but the only way to know the function is to call find_mir_or_eval_fn, which requires arguments, which we can"t have, because we haven't moved them yet, which we can't do because there is no new stack frame yet. The dependency cycle is new stack frame -> function -> arguments -> new stack frame.

My idea (granted, not fully implemented here) was to break the "function -> arguments" edge. I'm not sure what other options there are, breaking this edge seems like a requirement. Hopefully you have ideas now to solve the problem better, than this PR =)

LMK if something in this explanation is not clear.

RalfJung · 2024-03-04T17:47:53Z

delete the old stack frame with .remove(len - 2) (-ish)

Uh, I'm not sure that will work. We have an optimized representation for local variables that never have their address taken: we store them not in an Allocation but directly in the locals map of the stack frame. We still need to be able to create places that point to them though, so the place is represented as a Local + the index of the frame in the stack. If you remove a non-top stack frame the indices shift and who knows what happens...

I once tried to get rid of this index and make it implicitly always the "current" stack frame, but it turns out when it comes to return values we actually do use cross-stack-frame indexing here: the callee stores the place in the caller that the return value should be put in, and that may be such an optimized place. We could make sure the return place is always a "proper" place in memory, and that would probably not cost a lot of performance. (The long-lived PlaceTy here does make me nervous, so making this a MPlaceTy would make things a bit less fragile anyway.) But then I got worried there might be other places where we're doing this and I didn't want to touch this system...

But anyway, taking a step back... in an idealized interpreter, we would evaluate all the arguments to values, then pop the stack frame (deallocating the backing memory for all its locals), then push the new one. In the rustc interpreter this does not directly work, since an evaluated argument (OpTy) might actually not contain the value, but just reference the value in memory -- yet another optimization. But you should still be able to pop the frame from the stack, as long as you don't deallocate the locals. Then you can push the new stack frame using the same method as today. And then you deallocate all the locals from the old frame. I think that should work?

So in other words:

evaluate all arguments to OpTy
pop the old stack frame without deallocating any locals (but keep it around, we need its local map later); this does not invalidate the OpTy (it would invalidate mir::Operand but we don't need those any more)
push the new stack frame; we have the OpTy so this should work as today
now that the arguments have been copied, deallocate the locals from the old frame

RalfJung · 2024-03-04T17:52:40Z

Ah, it's slightly more complicated since evaluated arguments are not OpTy, they are FnArg. Right, that's for in-place argument passing. That may store a PlaceTy and that may be invalidated by the pop.

But what even are the semantics of in-place argument passing with tail calls? IOW, what does become func(move mylocal) do?

WaffleLapkin · 2024-03-05T12:04:57Z

But what even are the semantics of in-place argument passing with tail calls? IOW, what does become func(move mylocal) do?

I'm not exactly sure, since I don't even know what precisely is f(move local)... Am I interpreting your words correctly that in-place = by-ref? (i.e. func accepts a pointer to some place in the caller that stores mylocal)

From the backend perspective I think it means "move mylocal to the place of the same argument of the current function". Example with low level pseudo code:

fn a(s: Struct) {
    become b(String::new());
}

fn b(s: String) {}

==>

fn a(s_ptr: *mut Struct) {
    // evaluation of the arguments
    let arg0 = String::new();

    // drops of all the locals
    drop_in_place(s_ptr);

    // move of arguments to the appropriate place
    ptr::write(s_ptr, arg0);

    // reset everything to be as if the function is just called... somehow? idk

    // jump to b
    jump b;
}

fn b(s_ptr: *mut String) { ... }

Note that we can always do that, because tail calls (as currently designed) always require signatures/abis to match.

How to represent this in the Rust AM? I'm not sure... Somehow the arguments need to move from the replaced frame to the new frame (which is done in the backend by just reusing the same frame).

RalfJung · 2024-03-05T12:46:18Z

I'm not exactly sure, since I don't even know what precisely is f(move local)... Am I interpreting your words correctly that in-place = by-ref? (i.e. func accepts a pointer to some place in the caller that stores mylocal)

It means that the compiler is allowed but not required to make the callee directly use the caller memory for this place. In the AM, what we do is we create a copy as usual but we also mark the caller memory as "inaccessible" so any attempt to observe or overwrite it causes UB, meaning the compiler can then optimize to reuse that memory since it can't be used for anything else.

But with tail calls, the caller memory is deallocated before the callee even starts. So I think it just makes no sense to even have move in a tail call?

Well I guess it makes sense in your example because the actual location of the argument is even further up the stack, so it doesn't go away with the stack frame. Hm...

I guess in principle we could change how arguments are evaluated to FnArg. If it's an in-place argument in an optimized stack frame, then we may as well use OpTy, since there's anyway no memory to "protect" that could be reused. Then FnArg no longer contains a PlaceTy and we can keep them around even after popping the stack frame.

WaffleLapkin · 2024-03-05T15:15:00Z

But with tail calls, the caller memory is deallocated before the callee even starts. So I think it just makes no sense to even have move in a tail call?

Well I guess it makes sense in your example because the actual location of the argument is even further up the stack, so it doesn't go away with the stack frame. Hm...

Yeah. The idea is that due to how tail call restrictions are currently done¹ you can always find a stack frame where everything can live, because there was some original normal call to a function with the same signature, which had to pass arguments of the same types.

I guess in principle we could change how arguments are evaluated to FnArg. If it's an in-place argument in an optimized stack frame, then we may as well use OpTy, since there's anyway no memory to "protect" that could be reused. Then FnArg no longer contains a PlaceTy and we can keep them around even after popping the stack frame.

I have little idea what all of this means but 👍🏻 ahaha.

NB: we potentially want to eventually add an ABI which does not have these restrictions, but that's a whole other issue for another time ↩

RalfJung · 2024-03-05T16:30:20Z

Yeah. The idea is that due to how tail call restrictions are currently done1 you can always find a stack frame where everything can live, because there was some original normal call to a function with the same signature, which had to pass arguments of the same types.

Hm, interesting. But in the model I don't think we want to require that if a call was "move" then all tail calls it makes have to be "move" as well, so we can't exploit that.

I would be curious about which ICEs you ran into. The code you posted already does eval_fn_call_arguments before popping the stack frame and it seems to entirely omit cleaning up the locals of that call (-> memory leak), so not sure why it should ICE -- except maybe it is actually due to the FnArg issue I mentioned above. Does the call that ICEs involve move?

WaffleLapkin · 2024-03-05T22:37:14Z

@RalfJung so.. I was a bit misremembering and I'm getting errors, not ICEs:

#![allow(incomplete_features, dead_code)]
#![feature(explicit_tail_calls)]

const fn f(x: u32) {
    become f(x);
}

const _: () = f(0);

error[E0080]: evaluation of constant value failed
 --> ./t.rs:8:15
  |
8 | const _: () = f(0);
  |               ^^^^ accessing a dead local variable

And looking at the mir it does contain a move:

// MIR for `f` after built

fn f(_1: u32) -> () {
    debug x => _1;
    let mut _0: ();
    let mut _2: !;
    let mut _3: u32;

    bb0: {
        StorageLive(_3);
        _3 = _1;
        tailcall f(Spanned { node: move _3, span: ./t.rs:5:14: 5:15 (#0) });
    }

    bb1: {
        StorageDead(_3);
        unreachable;
    }

    bb2: {
        return;
    }
}

WaffleLapkin · 2024-03-05T22:39:39Z

Aha, found how to make an example without move (by passing a constant). The following hangs the compiler (as it should!):

#![allow(incomplete_features, dead_code)]
#![feature(explicit_tail_calls)]

const fn f(_: u32) {
    become f(1);
}

const _: () = f(1);

// MIR for `f` after built

fn f(_1: u32) -> () {
    let mut _0: ();
    let mut _2: !;

    bb0: {
        tailcall f(Spanned { node: const 1_u32, span: ./t.rs:5:14: 5:15 (#0) });
    }

    bb1: {
        FakeRead(ForMatchedPlace(None), _1);
        unreachable;
    }

    bb2: {
        unreachable;
    }

    bb3: {
        return;
    }
}

RalfJung · 2024-03-06T06:59:58Z

Okay so I think the first order of business would be to get rid of the PlaceTy in FnArg, replace it by MPlaceTy.

That would require changes in eval_fn_call_arguments: in the Move case, call eval_place. Then do something like:

if let Either::Left(mplace) = place.as_mplace_or_local() {
  FnArg::InPlace(mplace)
} else {
  // This argument doesn't live in memory, so there's no place
  // to make inaccessible during the call.
  // This is also crucial for tail calls, where we want the `FnArg` to
  // stay valid when the old stack frame gets popped.
  FnArg::Copy(self.place_to_op(place)?)
}

Tweak the way we protect in-place function arguments in interpreters Use `MPlaceTy` instead of `PlaceTy` in `FnArg` and ignore (copy) locals in an earlier step ("Locals that don't have their address taken are as protected as they can ever be"). This seems to be crucial for tail call support (as they can't refer to caller's locals which are killed when replacing the stack frame). r? `@RalfJung` cc `@oli-obk` see rust-lang#121273 (comment)

Rollup merge of rust-lang#122076 - WaffleLapkin:mplace-args, r=RalfJung Tweak the way we protect in-place function arguments in interpreters Use `MPlaceTy` instead of `PlaceTy` in `FnArg` and ignore (copy) locals in an earlier step ("Locals that don't have their address taken are as protected as they can ever be"). This seems to be crucial for tail call support (as they can't refer to caller's locals which are killed when replacing the stack frame). r? `@RalfJung` cc `@oli-obk` see rust-lang#121273 (comment)

Tweak the way we protect in-place function arguments in interpreters Use `MPlaceTy` instead of `PlaceTy` in `FnArg` and ignore (copy) locals in an earlier step ("Locals that don't have their address taken are as protected as they can ever be"). This seems to be crucial for tail call support (as they can't refer to caller's locals which are killed when replacing the stack frame). r? `@RalfJung` cc `@oli-obk` see rust-lang/rust#121273 (comment)

rustbot assigned oli-obk Feb 18, 2024

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 18, 2024

This comment has been minimized.

Sign in to view

WaffleLapkin added 3 commits February 18, 2024 20:22

Adapt to the interpreter refactor in miri

e5f7a74

Remove an unused parameter

99db695

oli-obk reviewed Feb 18, 2024

View reviewed changes

compiler/rustc_const_eval/src/const_eval/machine.rs Show resolved Hide resolved

Fix infinite recursion in ctfe align_offset

586d704

rust-cloud-vms bot force-pushed the find-mir-please branch from 0bd2727 to 586d704 Compare February 18, 2024 21:31

oli-obk reviewed Feb 18, 2024

View reviewed changes

compiler/rustc_const_eval/src/interpret/terminator.rs Outdated Show resolved Hide resolved

Move eval_body call inside a match

dc9b611

oli-obk reviewed Feb 19, 2024

View reviewed changes

oli-obk approved these changes Feb 19, 2024

View reviewed changes

Fixup doc comment

3f17b34

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 19, 2024

RalfJung reviewed Feb 19, 2024

View reviewed changes

bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Feb 19, 2024

WaffleLapkin mentioned this pull request Mar 6, 2024

Tweak the way we protect in-place function arguments in interpreters #122076

Merged

WaffleLapkin closed this Mar 6, 2024

WaffleLapkin deleted the find-mir-please branch March 6, 2024 10:07

	let res = self.emulate_foreign_item(sym.0, abi, args, dest, ret, unwind)?;
	assert!(res.is_none(), "DynSyms that delegate are not supported");

Refactor the way interpreters handle special functions #121273

Refactor the way interpreters handle special functions #121273

Conversation

WaffleLapkin commented Feb 18, 2024

rustbot commented Feb 18, 2024

This comment has been minimized.

WaffleLapkin commented Feb 18, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

WaffleLapkin commented Feb 19, 2024

bors commented Feb 19, 2024

RalfJung left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

RalfJung Feb 19, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

RalfJung Feb 20, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

RalfJung commented Feb 19, 2024

WaffleLapkin commented Feb 19, 2024

Footnotes

WaffleLapkin commented Feb 19, 2024 • edited Loading

WaffleLapkin commented Feb 19, 2024 • edited Loading

RalfJung commented Feb 19, 2024

WaffleLapkin commented Feb 19, 2024

RalfJung commented Feb 20, 2024 • edited Loading

RalfJung commented Feb 20, 2024 • edited Loading

RalfJung commented Feb 20, 2024

oli-obk commented Feb 20, 2024

oli-obk commented Feb 20, 2024

RalfJung commented Feb 20, 2024 • edited Loading

RalfJung commented Feb 20, 2024

bors commented Feb 27, 2024

WaffleLapkin commented Feb 28, 2024

RalfJung commented Mar 4, 2024 • edited Loading

RalfJung commented Mar 4, 2024

WaffleLapkin commented Mar 5, 2024

RalfJung commented Mar 5, 2024

WaffleLapkin commented Mar 5, 2024

Footnotes

RalfJung commented Mar 5, 2024

WaffleLapkin commented Mar 5, 2024

WaffleLapkin commented Mar 5, 2024

RalfJung commented Mar 6, 2024

WaffleLapkin commented Feb 18, 2024 •

edited

Loading

RalfJung left a comment •

edited

Loading

RalfJung Feb 19, 2024 •

edited

Loading

RalfJung Feb 20, 2024 •

edited

Loading

WaffleLapkin commented Feb 19, 2024 •

edited

Loading

WaffleLapkin commented Feb 19, 2024 •

edited

Loading

RalfJung commented Feb 20, 2024 •

edited

Loading

RalfJung commented Feb 20, 2024 •

edited

Loading

RalfJung commented Feb 20, 2024 •

edited

Loading

RalfJung commented Mar 4, 2024 •

edited

Loading