Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the GC in Rust #1750

Merged
merged 21 commits into from
Sep 3, 2020
Merged

Implement the GC in Rust #1750

merged 21 commits into from
Sep 3, 2020

Conversation

osa1
Copy link
Contributor

@osa1 osa1 commented Jul 23, 2020

This implements the current garbage collector in Rust. No changes were made to
the GC design -- it's just ports the one implemented in code generator to Rust.

The goals are:

  • Evaluate Rust for Motoko's RTS implementation
  • Make the collector easier to read, understand, modify, and extend.

@osa1 osa1 requested review from nomeata and ggreif July 23, 2020 09:05
@osa1 osa1 marked this pull request as draft July 23, 2020 09:05
@dfinity-ci
Copy link

dfinity-ci commented Jul 23, 2020

In terms of gas, 2 tests regressed and the mean change is +6.4%.
In terms of size, 2 tests regressed and the mean change is +3.1%.

Copy link
Collaborator

@nomeata nomeata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work! Impressed how you were able to map the kinda obscure Ocaml code to Rust so nicely.

rts/motoko-rts/src/array.rs Outdated Show resolved Hide resolved
rts/motoko-rts/src/gc.rs Outdated Show resolved Hide resolved
rts/motoko-rts/src/gc.rs Outdated Show resolved Hide resolved
rts/motoko-rts/src/gc.rs Outdated Show resolved Hide resolved
rts/motoko-rts/src/gc.rs Outdated Show resolved Hide resolved
rts/motoko-rts/src/gc.rs Outdated Show resolved Hide resolved
rts/motoko-rts/src/gc.rs Outdated Show resolved Hide resolved
rts/motoko-rts/src/gc.rs Outdated Show resolved Hide resolved
rts/motoko-rts/src/gc.rs Outdated Show resolved Hide resolved
rts/motoko-rts/src/types.rs Outdated Show resolved Hide resolved
@osa1
Copy link
Contributor Author

osa1 commented Aug 20, 2020

Failing tests:

  • shared-object
  • stable-exp
  • life-mut
  • life
  • stable-mutable

@osa1
Copy link
Contributor Author

osa1 commented Aug 24, 2020

One test remaining to fix ...

@osa1
Copy link
Contributor Author

osa1 commented Aug 24, 2020

This is now passing the tests, but the last bug fix was a bit weird, I'll write more about it after a break.

Next: documentation and refactoring.

rts/motoko-rts/src/gc.rs Outdated Show resolved Hide resolved
@osa1 osa1 changed the title [WIP] Implement the GC in Rust Implement the GC in Rust Aug 24, 2020
@osa1 osa1 marked this pull request as ready for review August 24, 2020 13:52
@osa1 osa1 requested review from nomeata and crusso August 24, 2020 13:52
@osa1 osa1 force-pushed the osa1/rust_gc branch 2 times, most recently from 97ef41f to a84f426 Compare August 26, 2020 11:09
@osa1
Copy link
Contributor Author

osa1 commented Aug 27, 2020

The bug mentioned above is an instrumentation bug. We debugged it with @crusso and I'm currently preparing a PR to fix it.

@osa1
Copy link
Contributor Author

osa1 commented Aug 27, 2020

Instrumentation fix: https://github.com/dfinity-lab/dfinity/pull/5129

@osa1 osa1 force-pushed the osa1/rust_gc branch 2 times, most recently from 49671c3 to 6932f21 Compare August 27, 2020 12:16
osa1 added a commit that referenced this pull request Aug 27, 2020
Current master includes a bug fix that unblocks #1750
@osa1 osa1 mentioned this pull request Aug 27, 2020
… more, fix BigInt layout

- I figured out how to implement methods on pointer types, make
  payload_addr a method on pointers

- Fix BigInt layout (sign field was missing)

- Remove magic values in object_size
rts/motoko-rts/src/gc.rs Outdated Show resolved Hide resolved
@osa1
Copy link
Contributor Author

osa1 commented Sep 3, 2020

I just fixed a bug that testing did not catch. See the last commit. Basically I forgot to add a field to BigInt so we used incorrect layout for it in the garbage collector (object_size worked fine though as we had hard-coded 5 for the object size).

I removed magic values in object_size now and I'm using size of structs defined in types.rs for object sizes. That reduces amount of duplicate information (object sizes) from 3 (code gen, object_size function, structs in types.rs) to 2. Code generator and the structs in types.rs still need to be in sync though and we have no way of knowing automatically when they go out of sync.

@nomeata
Copy link
Collaborator

nomeata commented Sep 3, 2020

I just fixed a bug that testing did not catch. See the last commit. Basically I forgot to add a field to BigInt so we used incorrect layout for it in the garbage collector (object_size worked fine though as we had hard-coded 5 for the object size).

Good catch! We need more testing of the GC somehow…

I removed magic values in object_size now and I'm using size of structs defined in types.rs for object sizes.

Great! I think you forgot Object

@osa1
Copy link
Contributor Author

osa1 commented Sep 3, 2020

For some reason both gas usage and size got worse after moving heap pointer to the RTS. This doesn't look right, I should investigate.

@nomeata
Copy link
Collaborator

nomeata commented Sep 3, 2020

For some reason both gas usage and size got worse after moving heap pointer to the RTS. This doesn't look right, I should investigate.

One guess: Wasm global + extra function calls is cheaper than load and store of a static memory location. Especially if the linker doesn’t optimize the base + offset calculation in each access.

~/dfinity/motoko/test $ ./compare-wat.sh -f 1f8b54a -t 52060df perf/*.mo|& less
…
   (func $alloc_bytes (type 5) (param i32) (result i32)
-    (local i32)
+    (local i32 i32 i32)
     i32.const 92672
-    i32.const 32
-    i32.add
     local.tee 1
-    local.get 1
-    i64.load
+    i32.const 40
+    i32.add
+    local.tee 2
+    local.get 2
+    i32.load
+    local.tee 2
     local.get 0
     i32.const 3
     i32.add
     i32.const -4
     i32.and
     local.tee 0
+    i32.add
+    local.tee 3
+    i32.store
+    local.get 1
+    i32.const 32
+    i32.add
+    local.tee 1
+    local.get 1
+    i64.load
+    local.get 0
     i64.extend_i32_u
     i64.add
     i64.store
-    call $get_hp
-    local.tee 1
-    local.get 0
-    i32.add
-    local.tee 0
-    call $set_hp
     block  ;; label = @1

seems to confirm that. There are awful lot of constants being propagated.

@osa1
Copy link
Contributor Author

osa1 commented Sep 3, 2020

Can you share compare-wat.sh? It looks useful.

@nomeata
Copy link
Collaborator

nomeata commented Sep 3, 2020

Sometimes I fear we have packet drop https://dfinity.slack.com/archives/CPL67E7MX/p1599040660028400?thread_ts=1599040527.027900&cid=CPL67E7MX ;-)

It's in test/

@osa1
Copy link
Contributor Author

osa1 commented Sep 3, 2020

One guess: Wasm global + extra function calls is cheaper than load and store of a static memory location. Especially if the linker doesn’t optimize the base + offset calculation in each access.

Sigh... this is unfortunate. I briefly searched for using globals with LLVM. It seems like it's currently only possible with using an assembly file: emscripten-core/emscripten#11035

So if we want to use globals instead of memory locations for globals we'll have to generate the Wasm for those globals ourselves, and then provide getter and setter functions. We can perhaps optimize away those functions in link time (in mo-ld).

@nomeata
Copy link
Collaborator

nomeata commented Sep 3, 2020

One guess: Wasm global + extra function calls is cheaper than load and store of a static memory location. Especially if the linker doesn’t optimize the base + offset calculation in each access.

Sigh... this is unfortunate. I briefly searched for using globals with LLVM. It seems like it's currently only possible with using an assembly file: emscripten-core/emscripten#11035

Yes, it’s unfortunate that rust just doesn’t use globals when it can.

But I think the real problem (in terms of gas counter) is the redunant address offset calculations post-linker. I couldn't trick wasm-opt into optimizing them away for now.

Nevertheless I would use the current code, optimizing for conceptual cleanness (heap pointer maintained by the allocator) over the gas usage. At least we know where the gas is expended.

@osa1
Copy link
Contributor Author

osa1 commented Sep 3, 2020

Yes, it’s unfortunate that rust just doesn’t use globals when it can.

It's a problem with semantics of variables in C, C++, and Rust. If I have a global variable x in any of these languages then I should be able to get address of it. That's not possible if x is represented as a Wasm global.

I think it should be possible to use a Wasm global for a global variable when we never take address of it, and in our case this is the case as we don't export the variable and never take address of it in the library, but maybe for some other reason it's not possible to do this analysis, I'm not sure. I'll ask this on llvm-dev mailing list.

@nomeata
Copy link
Collaborator

nomeata commented Sep 3, 2020

It's a problem with semantics of variables in C, C++, and Rust. If I have a global variable x in any of these languages then I should be able to get address of it.

Even rust? I was hoping it would not be the case there. Was too optimistic.

@osa1
Copy link
Contributor Author

osa1 commented Sep 3, 2020

Even rust? I was hoping it would not be the case there. Was too optimistic.

Indeed. It's unsafe (you need a unsafe { } block or need to be in an unsafe function), but possible.


I think at this point I addressed all comments. There's room for improvement in a few areas which I'll open issues for.

Do we have any tests other than what CI runs that I can try manually to test this more?

@nomeata how does it look now? Any other comments/requests?

@nomeata
Copy link
Collaborator

nomeata commented Sep 3, 2020

Do we have any tests other than what CI runs that I can try manually to test this more?

Unfortunately not. Ideally we do some random checking, maybe @ggreif has ideas when he comes back.

@osa1
Copy link
Contributor Author

osa1 commented Sep 3, 2020

I forgot to address https://github.com/dfinity-lab/motoko/pull/1750/files#r481861309 . I'll update one last time and then put this on merge queue.

(The word "forwarding pointer" is used widely in the literature for
these things)
@osa1 osa1 added the automerge-squash When ready, merge (using squash) label Sep 3, 2020
@mergify mergify bot merged commit 1461f56 into master Sep 3, 2020
@mergify mergify bot deleted the osa1/rust_gc branch September 3, 2020 15:01
@mergify mergify bot removed the automerge-squash When ready, merge (using squash) label Sep 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants