Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Game crashes in Release exports, no error in Debug (GDScript Pool*Array with type hinted from Array) #48090

Closed
WolfgangSenff opened this issue Apr 22, 2021 · 17 comments

Comments

@WolfgangSenff
Copy link
Contributor

I have looked for a few other issues and while some might be the same, I have a repro project whereas those did not. It's not a minimal project, but it's easy enough to cause the bug with this project. It might be the same or very similar to #46533, not sure, as I didn't test it just with deleting nodes explicitly (mine happens on scene change).

Godot version:
3.3.stable, GLES2 for web exporting (but happens regardless of which system exported for)

OS/device including version:
macOS Catalina, 10.15.5, Radeon Pro 560X, also integrated Intel UHD Graphics 630

Issue description:
When switching scenes in my game while in release mode, the game often freezes. This happens regardless of which system I'm exporting for. It seems very little is listed in the Mac OS logs as well - I actually couldn't find anything. However, it looks like possibly the web export has a bit more information (but note the freeze is definitely not specific to the web build):
Screen Shot 2021-04-22 at 9 08 11 AM

Steps to reproduce:

  1. Clone and open the project
  2. Export for any platform, without debug info (this is the most important part - it won't crash if you export with debug info); this even happens exporting from Mac to Mac.
  3. Run the app from the exported package
  4. Play single player
  5. Move to the right with D or right arrow, jumping enemies with spacebar or left mouse click until you get to the door and touch the door with your snoot.

For my testing, I found this always froze on the next screen. Since it's picking scenes at random to send you to next, I don't think it's specific to my code, but it's possible, as I use an inherited root scene for each level.

Minimal reproduction project:
I don't have a minimal reproduction, just the big one. I'm not sure exactly what's causing it, but as it's a crashing bug, I think it's pretty important to include it here anyway:

https://github.com/WolfgangSenff/GOTMJam3

@Zireael07
Copy link
Contributor

'function signature mismatch' is an error specific to web build.

Does it happen if you remove the randomizing part of the portal (to exclude the possibility that one of the scenes is corrupted itself)?

@WolfgangSenff
Copy link
Contributor Author

I'm not sure, but if that's specific to the web build, then maybe it's not actually useful for debugging, doh. It definitely happens on all other platforms as well. However, regarding the randomizing, none of the scenes are corrupted - used to go through them linearly and was still frozen. The randomization was only added toward the end.

@akien-mga akien-mga added the bug label Apr 22, 2021
@akien-mga akien-mga added this to the 3.4 milestone Apr 22, 2021
@akien-mga
Copy link
Member

I can't test the project, the Firebase plugin seems to have missing files:
Screenshot_20210422_160636

@WolfgangSenff
Copy link
Contributor Author

Oh. I must have put that one into my gitignore. Ugh. Let me make a branch that removes it, as it's only used for multiplayer, sorry.

@WolfgangSenff
Copy link
Contributor Author

For now, if you want, you can grab that file from here: https://github.com/GodotNuts/GodotFirebase/blob/main/addons/godot-firebase/firebase/firebase.gd - it'll cause weird bugs not having the data in there, but only if you try multiplayer. Single player should be fine without the config stuff at the top.

@akien-mga
Copy link
Member

Thanks, I can reproduce a segfault on Linux. I'll grab a stacktrace.

@akien-mga
Copy link
Member

The crash also happens with 3.2.3-stable so it's not a recent regression.

Crashes here:

Thread 1 "godot.x11.opt.6" received signal SIGSEGV, Segmentation fault.
Variant::call_ptr (this=this@entry=0x4365510, p_method=..., p_args=p_args@entry=0x7fffffffc738, p_argcount=p_argcount@entry=0, r_ret=0x7fffffffc6d8, r_error=...) at core/variant_call.cpp:1165
1165                    funcdata.call(ret, *this, p_args, p_argcount, r_error);
(gdb) bt
#0  Variant::call_ptr (this=this@entry=0x4365510, p_method=..., p_args=p_args@entry=0x7fffffffc738, p_argcount=p_argcount@entry=0, r_ret=0x7fffffffc6d8, r_error=...) at core/variant_call.cpp:1165
#1  0x00000000007f6355 in GDScriptFunction::call (this=0x3ff6b50, p_instance=p_instance@entry=0x4365bc0, p_args=p_args@entry=0x7fffffffc9f8, p_argcount=p_argcount@entry=1, r_err=..., p_state=<optimized out>)
    at modules/gdscript/gdscript_function.cpp:1086
#2  0x00000000007b80f0 in GDScriptInstance::call_multilevel (this=0x4365bc0, p_method=..., p_args=0x7fffffffc9f8, p_argcount=1) at modules/gdscript/gdscript.cpp:1224
#3  0x0000000000ec1d07 in Node::_notification (this=0x4365690, p_notification=<optimized out>) at scene/main/node.cpp:69
#4  0x0000000001384f98 in Node::_notificationv (p_reversed=false, p_notification=16, this=0x4365690) at ./scene/main/node.h:46
#5  CanvasItem::_notificationv (p_reversed=false, p_notification=16, this=0x4365690) at ./scene/2d/canvas_item.h:166
#6  Node2D::_notificationv (p_reversed=false, p_notification=16, this=0x4365690) at ./scene/2d/node_2d.h:38
#7  CollisionObject2D::_notificationv (p_reversed=false, p_notification=16, this=0x4365690) at ./scene/2d/collision_object_2d.h:39
#8  PhysicsBody2D::_notificationv (p_reversed=false, p_notification=16, this=0x4365690) at scene/2d/physics_body_2d.h:43
#9  KinematicBody2D::_notificationv (this=0x4365690, p_notification=16, p_reversed=<optimized out>) at scene/2d/physics_body_2d.h:290
#10 0x0000000001caeda4 in Object::notification (this=0x4365690, p_notification=16, p_reversed=<optimized out>) at core/object.cpp:929
#11 0x0000000000ee2dda in SceneTree::_notify_group_pause (this=0x2ac1370, p_group=..., p_notification=16) at scene/main/scene_tree.cpp:993
#12 0x0000000000ee6ed1 in SceneTree::iteration (this=0x2ac1370, p_time=<optimized out>) at scene/main/scene_tree.cpp:486
#13 0x00000000005961dd in Main::iteration () at main/main.cpp:2100
#14 0x0000000000578731 in OS_X11::run (this=this@entry=0x7fffffffce00) at platform/x11/os_x11.cpp:3641
#15 0x0000000000566035 in main (argc=3, argv=0x7fffffffd6d8) at platform/x11/godot_x11.cpp:56

CC @RandomShaper (in case you're interested in this kind of release-only issue).

@akien-mga akien-mga changed the title Godot 3.3: Game Freeze in Release Mode (only) Exports Game crashes in Release exports, no error in Debug Apr 22, 2021
@WolfgangSenff
Copy link
Contributor Author

In case anyone needs the branch upon which I've removed the Firebase-related error that Akien reported above, here you are: https://github.com/WolfgangSenff/GOTMJam3/tree/remove-firebase-for-bugfix

@akien-mga akien-mga modified the milestones: 3.4, 3.5 Nov 8, 2021
@akien-mga
Copy link
Member

Tested with 3.4-stable, this is still reproducible.

@RandomShaper
Copy link
Member

RandomShaper commented Nov 27, 2021

I've found the issue.

This is a simplification from Leaf.gd:

var points := [] setget set_points

func set_points(value):
    points = value

func _physics_process(delta: float):
    if ...:
        points.pop_front()

points is being assigned a PoolVector2Array from outside (i.e., set_points() is called with a PoolVector2Array value argument). PoolVector2Array doesn't feature a pop_front() method, so what happens at runtime is somewhat expected.

However, on a debug build, because of points being statically typed as Array, the points = value assignment forces the PoolVector2Array to be converted to an Array. Therefore, no error happens there, because Array has pop_front().

The difference of behavior across debug and release is the bug. I'm not sure if what we want is that the coercion to Array happens on release as well, or that it doesn't happen on either.

This little scripts shows the magical change of type on debug:

var a = PoolVector2Array([ Vector2(1, 2), Vector2(3, 4) ])
var b := []
b = a
pass # Set a breakpoint here: 'a' will be a PoolVector2Array, 'b' will be an Array

P. S.: Removing the : from the := works in both cases to keep the value being a PoolVector2Array.
P. P. S.: @WolfgangSenff, a simple fix for your project while the root issue is fixed would be to do points = Array(value), so you have an Array on release as well.

CC @vnen

@WolfgangSenff
Copy link
Contributor Author

@RandomShaper - I actually didn't see this until just now. You are some sort of superman for finding this. That is absolutely baffling, as well as impressive debugging!

@YuriSizov
Copy link
Contributor

The difference of behavior across debug and release is the bug. I'm not sure if what we want is that the coercion to Array happens on release as well, or that it doesn't happen on either.

I think that the appropriate solution would be to raise an error in debug here. We use strict typing for a reason, and we don't allow coercion a lot of time when it would've been preferred by some users (see int vs float debacles). Implicit behavior here hides a bug in code, crash or not. So yeah, should be an error as early as possible.


And a huge thank you for the investigation @RandomShaper! We just ran into this issue ourselves with a project, and it's a very nasty bug to figure out. This helps immensely.

@akien-mga akien-mga changed the title Game crashes in Release exports, no error in Debug Game crashes in Release exports, no error in Debug (GDScript Pool*Array with type hinted from Array) Dec 29, 2021
@cdemirer
Copy link
Contributor

I think that the appropriate solution would be to raise an error in debug here. We use strict typing for a reason, and we don't allow coercion a lot of time when it would've been preferred by some users (see int vs float debacles). Implicit behavior here hides a bug in code, crash or not. So yeah, should be an error as early as possible.

I'm not sure if this comment is 3.x specific, but at least on master, the existing code suggests that when the destination is statically typed (hard), a conversion should happen on assignment unless the source is also the same static (hard) type. If conversion is not possible, that's when it's an error. (Though there are cases for emitting warnings even when conversion is possible.)

Relevant:

OPCODE(OPCODE_ASSIGN_TYPED_BUILTIN) {
CHECK_SPACE(4);
GET_INSTRUCTION_ARG(dst, 0);
GET_INSTRUCTION_ARG(src, 1);
Variant::Type var_type = (Variant::Type)_code_ptr[ip + 3];
GD_ERR_BREAK(var_type < 0 || var_type >= Variant::VARIANT_MAX);
if (src->get_type() != var_type) {
#ifdef DEBUG_ENABLED
if (Variant::can_convert_strict(src->get_type(), var_type)) {
#endif // DEBUG_ENABLED
Callable::CallError ce;
Variant::construct(var_type, *dst, const_cast<const Variant **>(&src), 1, ce);
} else {
#ifdef DEBUG_ENABLED
err_text = "Trying to assign value of type '" + Variant::get_type_name(src->get_type()) +
"' to a variable of type '" + Variant::get_type_name(var_type) + "'.";
OPCODE_BREAK;
}
} else {
#endif // DEBUG_ENABLED
*dst = *src;
}
ip += 4;

case ARRAY: {
static const Type valid[] = {
PACKED_BYTE_ARRAY,
PACKED_INT32_ARRAY,
PACKED_INT64_ARRAY,
PACKED_FLOAT32_ARRAY,
PACKED_FLOAT64_ARRAY,
PACKED_STRING_ARRAY,
PACKED_COLOR_ARRAY,
PACKED_VECTOR2_ARRAY,
PACKED_VECTOR3_ARRAY,
NIL
};
valid_types = valid;
} break;
// arrays
case PACKED_BYTE_ARRAY: {
static const Type valid[] = {
ARRAY,
NIL
};
valid_types = valid;
} break;

So I believe the appropriate solution would be making the release export consistent with debug. But I don't know if there are design differences in this case for 3.x

@YuriSizov
Copy link
Contributor

I'm not sure if this comment is 3.x specific

The comment is not 3.x specific, but the reported crash is. We haven't tested what happens in master, and GDScript 2 is too buggy for the moment to make a fair assessment anyway. But I don't think that the conversion described by Pedro is expected or desired when you use types. Like I've said, it hides a bug in code, which strict typing is supposed to prevent.

We already act very strictly when using floats and ints together and actively avoid any implicit behavior. We also avoid coercion when adding a variant to a string, even though you could implicitly convert it to a string and concatenate (conversion works perfectly fine with string formatting). It's not very relevant if this behavior has been done on purpose or not, as it introduces an inconsistency with how we handle type coercion. To the point where it leads to a crash because that inconsistency wasn't accounted for in release builds.

That said, making release builds coerce types the same way as debug build does would solve the problem with the crash specifically. I still want to underline that this hides a bug in code from the developer. So at the very least we should reconsider it for Godot 4 while we can break compatibility. In 3.x we can accept coercion in release to avoid crashing, though I'd say it's still undesired and should be fixed properly.

@cdemirer
Copy link
Contributor

cdemirer commented Dec 29, 2021

But if the design intention is to do conversion when assigning to a hard-typed destination (unless the source is the same hard-type), then it's not "hiding a bug in code from the developer". With this design, the semantics of assigning to a hard-typed variable do include type conversion.

What you mention about strings is in fact accounted for within the difference between Variant::can_convert and Variant::can_convert_strict, where the strict version of the two defines all the allowed conversions when assigning to a hard-typed variable (there are also checks elsewhere for Objects etc.). Here's the 3.4 version of these functions for reference:

bool Variant::can_convert_strict(Variant::Type p_type_from, Variant::Type p_type_to) {

We already act very strictly when using floats and ints together and actively avoid any implicit behavior.

With regards to assignments, that doesn't seem to be the case. The only time such assignment is prevented seems to be const initializations (which can be argued not to be an assignment itself).

Conversion when assigning to hard-typed variables is something I'm very fond of about GDScript. If changing that is desired by some, then maybe it should be a proposal. Preventing it would also hurt safe polymorphism / make it very cumbersome to write.

@YuriSizov
Copy link
Contributor

With regards to assignments, that doesn't seem to be the case.

I wasn't talking about assignments there specifically, sorry. Just general cases of possible coercion that the users run into from time to time.

But if the design intention is to do conversion when assigning to a hard-typed destination (unless the source is the same hard-type), then it's not "hiding a bug in code from the developer".

It very much is, as evident by the fact that only after reading that investigation by Pedro did we realize that we were in fact assigning one type to a variable that is supposed to hold another. It was a relic of old code being refactored, skipped past the developer who did the refactoring, and we were none the wiser. Coercion made us think the code was safe, when it wasn't. IMO crash is irrelevant here, so even if it didn't crash I would still consider that a problem.

Conversion when assigning to hard-typed variables is something I'm very fond of about GDScript. If changing that is desired by some, then maybe it should be a proposal. Preventing it would also hurt safe polymorphism / make it very cumbersome to write.

Fair. Then, at the very least, it should be a warning that you can disable or silence per line. We have a few such warnings when ints and floats are, again, involved. When you implicitly cast you do something potentially unsafe.

And I'm not sure I agree with being very cumbersome to write. We have explicit casting, and we have overloaded constructors that would extend your code by a few characters, but would make the intended logic very clear.

@akien-mga
Copy link
Member

akien-mga commented Feb 9, 2022

Fixed by #57851.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

No branches or pull requests

7 participants