
Segfault due to stack size limit when recursing? #30

Closed
critiqjo opened this issue Jan 30, 2016 · 22 comments

Comments

@critiqjo

use coio::Scheduler;

fn crazy_recurse(d: usize) -> usize {
    if d == 0 { d }
    else { 1 + crazy_recurse(d - 1) }
}

fn main() {
    println!("Heil Hydra {}", crazy_recurse(16000));
    Scheduler::new().with_workers(1)
        .run(|| {
            for i in 0..4 {
                Scheduler::spawn(move || {
                    println!("Heil Hydra {}", crazy_recurse(1964)); //1963 works fine!
                });
            }
        }).unwrap();
}

The issue only occurs in debug builds, probably because the recursion gets optimized out in release builds. If this can't be resolved quickly, I think it is worth a mention in the readme.

@lhecker
Collaborator

lhecker commented Jan 30, 2016

The default stack size of new coroutines is 128KB. This means that you're using about 64 bytes per recursion.
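(Concretely: 128 KiB = 131072 bytes, and 131072 / 1964 ≈ 67 bytes per frame, which lines up with your observation that 1963 recursions succeed and 1964 overflow.)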

BUT: You can optionally specify the stack size of newly spawned coroutines. Just replace your call to spawn(func) with spawn_opts(func, opts), where opts is a coio::Options instance.
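For illustration, that might look roughly like this (spawn_opts and coio::Options are the names given above; the Options::new().stack_size(..) builder chain is my guess at the API, not confirmed):

use coio::{Options, Scheduler};

// Hypothetical sketch: give this coroutine a 1 MiB stack instead of the
// 128 KB default. The exact constructor/setter names are assumptions.
Scheduler::spawn_opts(move || {
    println!("Heil Hydra {}", crazy_recurse(16000));
}, Options::new().stack_size(1024 * 1024));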

If you have any idea how to improve this API you can mention it here. 😃
It might take some time until we implement it, though, since I'd consider this low-priority (PRs are always welcome).

@critiqjo
Author

A quick resolution could simply be emitting a panic instead of a segfault. Resizing the stack at runtime seems hard.

@critiqjo
Author

I don't know if it can be "quick", but it should be "easier", right? A segfault is really bad!

@lhecker
Collaborator

lhecker commented Jan 30, 2016

You do realize that not even plain Rust itself panics on a stack overflow, right? In practice it simply does not happen because your stack is usually so large that it will never overflow. So... I think it's unlikely that we will ever implement it.

It might be possible though by using some kind of guard page that triggers an interrupt when it's accessed, or something like that. But I've never done anything like that before and wouldn't really know where to start.

What is possible though is to have growing stacks on x64 by using lazily allocated pages. But that is also something I've never done before, and it's not exactly an "easy" thing to achieve, so it would be low-priority too.
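For the record, the usual shape of that guard-page-plus-signal idea is sketched below (Linux-flavored, using the libc crate; purely illustrative, not something coio does today):

use std::{mem, ptr};
use libc::{c_int, c_void, malloc, sigaction, sigaltstack, siginfo_t, stack_t,
           SA_ONSTACK, SA_SIGINFO, SIGSEGV, SIGSTKSZ};

extern "C" fn on_overflow(_: c_int, _: *mut siginfo_t, _: *mut c_void) {
    // The overflowed stack is unusable, so we run on the alternate stack
    // installed below. Only async-signal-safe calls are allowed here.
    let msg = b"stack overflow (guard page hit)\n";
    unsafe { libc::write(2, msg.as_ptr() as *const c_void, msg.len()); }
    std::process::abort();
}

fn install_overflow_handler() {
    unsafe {
        // Give SIGSEGV its own stack, otherwise the handler itself would
        // fault while pushing its frame onto the exhausted stack.
        let alt = stack_t {
            ss_sp: malloc(SIGSTKSZ),
            ss_flags: 0,
            ss_size: SIGSTKSZ,
        };
        sigaltstack(&alt, ptr::null_mut());

        let mut sa: sigaction = mem::zeroed();
        sa.sa_sigaction = on_overflow as usize;
        sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
        sigaction(SIGSEGV, &sa, ptr::null_mut());
    }
}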

@critiqjo
Author

I see... By lazily allocating, do you mean: issuing a very large malloc assuming that the physical pages will be mapped lazily? Or is it something else?

@lhecker
Collaborator

lhecker commented Jan 30, 2016

I've only remotely heard about this before, but AFAIR: yes, by using a large allocation and letting the kernel map in physical pages lazily (on first-touch page faults).

@zonyitoo
Owner

It is impossible to panic when you trigger a stack overflow. Currently, Coio uses a protected page at the end of the allocated stack space. When you access that page, the OS kills the program for the invalid access, which is why you get the "unknown error occur" message in your program.

I personally don't want to support segmented stacks or stack reallocation like the latest Go does, because it just adds complexity to the runtime. If you want to run your crazy recursion function, you can increase the stack size like this:

use coio::Builder;

Builder::new().stack_size(whatever_stack_size_you_want).spawn(|| { ... });

And yes, the memory allocation is lazy. The OS will allocate a page only when you actually touch it.
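For the curious, the mechanism looks roughly like this (a minimal sketch using the libc crate; illustrative only, not coio's actual code):

use std::ptr;
use libc::{mmap, mprotect, MAP_ANON, MAP_FAILED, MAP_PRIVATE,
           PROT_NONE, PROT_READ, PROT_WRITE};

const PAGE: usize = 4096;

// Reserve `size` bytes of stack plus one guard page at the low end. The
// mapping is lazy: the OS commits a physical page only on first touch.
unsafe fn alloc_stack(size: usize) -> *mut u8 {
    let total = size + PAGE;
    let base = mmap(ptr::null_mut(), total,
                    PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANON, -1, 0);
    assert!(base != MAP_FAILED, "mmap failed");
    // Stacks grow downward, so protect the lowest page; touching it kills
    // the process (SIGSEGV/SIGBUS) instead of silently corrupting memory.
    assert_eq!(mprotect(base, PAGE, PROT_NONE), 0);
    (base as *mut u8).offset(total as isize) // high end = initial stack top
}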

@critiqjo
Author

I personally don't want to support segmented stacks or stack reallocation like the latest Go does, because it just adds complexity to the runtime.

Fair enough... Just a suggestion that it might be good to add a "Limitations" section in the Readme.

Closing.

@critiqjo
Author

A draft:

Limitations

  • Stack size per coroutine is limited to 128 KiB by default; you can change this by using spawn_opts or a custom Builder.
  • The last page of the stack is "protected": hitting it triggers an invalid access and terminates the process. (This is not guaranteed when using FFI ... [reasons]??)

@zonyitoo
Owner

Let me explain why I don't want to add segmented stacks or continuous stacks:

  1. Segmented stacks clearly have a huge performance issue. Imagine a loop right at the boundary of the current stack: a new stack segment would be created and destroyed on every iteration (see the sketch after this list).
  2. Continuous stacks are bad for FFI calls. It is very hard to do right without help from the compiler, and many Rust libraries use FFI to call C libraries, so it is not an option.
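To make point 1 concrete, this is the kind of code that hits the worst case, assuming a hypothetical segmented-stack runtime (which neither Rust nor coio has):

// Hypothetical "hot split": if the current segment has just under leaf's
// frame size left, every call grows a new segment and every return frees
// it, so the tight loop degenerates into a loop of stack allocations.
fn leaf(x: u64) -> u64 {
    let buf = [x; 64]; // a 512-byte frame, enough to cross the boundary
    buf.iter().sum()
}

fn hot_loop() -> u64 {
    (0u64..1_000_000).map(leaf).sum()
}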

You could open another issue for adding a limitations section to the README, or just open a PR; I am happy to merge it.

@zonyitoo
Owner

This is not guaranteed when using FFI

No, it is guaranteed. The process will be killed by the OS when you access the last page (4 KiB on most OSes).

@critiqjo
Author

No, it is guaranteed.

Are these comments outdated, then?

@zonyitoo
Owner

Ah, I noticed a possible situation. The last page is 4 KiB. What if a function call eventually skips that page and accesses the pages beyond it? For example, you allocate a huge array on the stack but don't use it, and it happens to cover the whole guard page. Then other local variables may end up beyond the guard page.

@zonyitoo
Owner

This could be added to that comment. I nearly forgot about it.

@critiqjo
Author

For example, you allocate a huge array on the stack but don't use it, and it happens to cover the whole guard page.

Is it limited to FFI, or can it happen in safe Rust too?

@zonyitoo
Owner

Nope. You can never use uninitialized memory in safe Rust. So it only affects unsafe Rust (including FFI).

@critiqjo
Author

You can never use uninitialized memory in safe Rust.

Did you mean: you can never have allocated but uninitialized stack regions in safe Rust?

Yay, got "an unknown error" here ✌️ :

use coio::Scheduler;

fn print_test() -> usize {
    println!("Hello!");
    return 128;
}

fn main() {
    Scheduler::new()
        .with_workers(1)
        .run(|| {
            let x = print_test();
            if x != 128 {
                let a = [0usize; 16000]; // thought this would be uninitialized
                                         // until the program counter reached here
                println!("{}", a[8000]); // guess not!
            }
        }).unwrap();
}

@zonyitoo
Owner

The expression [0usize; 16000] is equivalent to a memset of the allocated memory to 0 in C. So it is initialized, and the program will hit the protected page.

This is the unsafe Rust version of your program:

use std::mem;

use coio::Scheduler;

fn print_test() -> usize {
    println!("Hello!");
    return 128;
}

fn main() {
    Scheduler::new()
        .with_workers(1)
        .run(|| {
            let x = print_test();
            if x != 128 {
                let a: [usize; 16000] = unsafe { mem::uninitialized() }; // Allocate without initialization
                println!("{}", a[8000]);
            }
        }).unwrap();
}

This should work, but it may be killed by the OS somehow.

On my laptop, it reports SIGBUS.

@zonyitoo
Owner

I got the emitted LLVM IR:

; Function Attrs: inlinehint uwtable
define internal void @"_ZN4main16_$LT$closure$GT$12closure.7156E"(%closure) unnamed_addr #9 {
entry-block:
  %x = alloca i64
  %_a = alloca [16000 x i64]
  %1 = call i64 @_ZN10print_test20hddcb26cfc9bc004ahaaE(), !dbg !12035
  store i64 %1, i64* %x, align 8, !dbg !12035
  call void @llvm.dbg.declare(metadata i64* %x, metadata !12038, metadata !6391), !dbg !12035
  %2 = load i64, i64* %x, align 8, !dbg !12039
  %3 = icmp ne i64 %2, 128, !dbg !12039
  br i1 %3, label %then-block-61-, label %next-block, !dbg !12039

then-block-61-:                                   ; preds = %entry-block
  call void @_ZN3mem13uninitialized21h12808030131495532484E([16000 x i64]* noalias nocapture sret dereferenceable(128000) %_a), !dbg !12040
  call void @llvm.dbg.declare(metadata [16000 x i64]* %_a, metadata !12043, metadata !6391), !dbg !12044
  br label %next-block

next-block:                                       ; preds = %entry-block, %then-block-61-
  ret void, !dbg !12045
}

It seems it is still doing something with the allocated %_a. Note the alloca [16000 x i64] in the entry block: the whole 128000-byte frame is reserved up front, before the branch is even evaluated, which is presumably what trips the guard page and actually triggered the SIGBUS.

@critiqjo
Author

I got the emitted LLVM IR:
... too many symbols here ...
... actually triggered SIGBUS.

I see... (well, not really 😄)

I will open a PR with a "Limitations" section within 3 days (hopefully)...

@critiqjo
Author

critiqjo commented Feb 4, 2016

rust-lang/rust#31273 might be of interest.

@zonyitoo
Owner

zonyitoo commented Feb 4, 2016

Good idea. But they are going to implement it in the signal handler, right? Then we cannot define it ourselves.
