Inconsistent, but seemingly deterministic crash of runtime in debug mode (repro attached) #8569

Open
gasnica opened this issue Jul 24, 2024 · 0 comments

gasnica commented Jul 24, 2024

Hi guys,

Thank you for your absolutely aaawweesome work :-)

To help you improve things, here is a seemingly deterministic repro of a runtime crash that occurs when debug mode is enabled. It is triggered by a particular, fairly complex pattern of allocating fields and writing to them, and it appears to be independent of the types or shapes of the fields involved.

It has been confirmed on multiple machines, on Windows and Ubuntu, all with Intel processors.


In short: with debug=True, creating and updating many fields in a complicated pattern is likely to crash the runtime.

The repro runs on the CPU backend with 1 thread.

I have reduced this to a minimal, heavily simplified crashing program. It creates multiple fields and writes to them in a pattern. The types and sizes/dimensions/shapes of the fields do not seem to matter at all; only the sequence in which they are created and written to seems to matter.

To trigger the crash, the final step has to read one field's value outside of a kernel and use it as the size of another new field. The crash then occurs on the next field access (in the repro, a kernel write to a previously allocated field).

Please note that changing the allocation and writing pattern easily removes or hides the crash. Also, the crash often happens silently, without a stack trace being printed.
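
Because the crash is often silent, one way to check how consistently it reproduces is to run the repro several times in fresh processes and record each exit code. The following is only a minimal sketch: the helper name `run_repeatedly` and the file name `minimal_debug_crash.py` (the attachment renamed back to `.py`) are assumptions for illustration, not part of the original report. It relies on Python's standard `-X faulthandler` option, which asks the interpreter to dump a Python traceback on fatal signals where it can.

```python
import subprocess
import sys


def run_repeatedly(cmd, runs=5, timeout=120):
    """Run `cmd` in `runs` fresh processes and return the list of exit codes.

    Hypothetical helper for illustration only. A non-zero exit code means the
    child process failed; on POSIX a negative code means it was killed by a
    signal (for example a segmentation fault).
    """
    codes = []
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            errors="replace",  # crash output may contain undecodable bytes
            timeout=timeout,
        )
        codes.append(proc.returncode)
        if proc.returncode != 0 and proc.stderr:
            # Print only the tail of stderr so that repeated crashes stay readable.
            print(proc.stderr[-2000:], file=sys.stderr)
    return codes


# Example usage (assumed file name, adjust to wherever the repro is saved):
# codes = run_repeatedly(
#     [sys.executable, "-X", "faulthandler", "minimal_debug_crash.py"]
# )
# print("exit codes:", codes)
```

If every run returns the same non-zero code, the crash is deterministic for that allocation pattern; a mix of zero and non-zero codes would point to something timing- or layout-dependent.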

By the way, is there a reason why I'm seeing win_amd64.pyd files in the crash stack, even though I'm on an Intel-based CPU here?

Thank you,
Adrian


The crashing program is included in minimal_debug_crash.txt (renamed to be able to attach it here)
minimal_debug_crash.txt

The crash stack trace I'm getting is attached in crash.txt
crash.txt


Code inlined for convenience:

import taichi as ti


@ti.kernel
def WriteSingleInt(field: ti.template()):
    field[None] = 1

def CreateField(shape = ()):
    return ti.field(int, shape=shape)


def main():
    ti.init(
        ti.cpu,
        #cpu_max_num_threads=4,
        debug=True,
        # kernel_profiler=True,
        # random_seed=42,
    )

    # hold on to all fields, just in case
    allFields = []

    # do field allocation & assignment in a pattern
    seq = [(13,1), (1,2), (3, 1), (1,5)]
    for loopCount, batchSize in seq:
        for _ in range(loopCount):
            fields = []
            for _ in range(batchSize):
                fields += [CreateField()]
            for field in fields:
                WriteSingleInt(field)

            allFields += fields

    # alloc and write 'size' to a field
    intFieldA = CreateField()
    WriteSingleInt(intFieldA)

    # alloc another field after that
    intFieldB = CreateField()

    # use 'size' from earlier field to create another new field
    unusedIntFieldC = CreateField(intFieldA[None])

    # crash:
    print("crash here:")
    WriteSingleInt(intFieldB)
    print("did not crash ?!?")


if __name__ == "__main__":
    main()