module deinitialization paradigms, cross-platform termination detection #9388

Ben-Fields · 2021-03-20T16:54:42Z

~~deinit(). Similar to init(), but called at the end of the module's lifetime. (also similar to free() for structs).~~

~~This can be accomplished using C code currently - C.atexit() (int atexit(void (*func)(void))).~~

C.atexit() exists, but with scoping & avoiding globals, the relevant context(s) cannot be accessed. Request more robust cross-platform termination detection, and a potential similar alternative to init(), so that meta/system-level control flow can be properly handled by the main module.

Discussion TL;DR: Current methods not cross-platform (C.atexit(), os.signal()), encourage wasting resources (init()), encourage sub-modules to alter system-level control flow (os.signal(), exit(), panic()). Pattern with closures proposed.

Sub-issue of sub-modules having globals and side-effects. Globals are a necessity for C interop and low-level (effectively discouraged for general use), but side-effects (making globals apparent to the main module, or altering system-level control flow) can and should be avoided, at least not encouraged.


dumblob · 2021-03-21T08:29:16Z

I'm afraid this is mixing two inherently different things. atexit() registers a callback to be called when a running os process is about to exit.

But this has nothing to do with modules (module is a compile-time concept - anything runtime is not handled by V AFAIK and thus is handled explicitly by the programmer - imagine something like manual DLL loading using winapi). Remember, V modules are statically compiled, so there is no point in time when a module is "not reachable" or "unloaded" etc.

So could you clarify what's the use case and the desired semantics of a potential trigger running a deinit() function?

As a side note, I'd discourage anyone from using atexit() as it has pretty constrained semantics - most of the program is no longer valid (so you can't just read any data from anywhere, you can't just execute arbitrary functions, you don't know what the threading state and scheduling look like in edge cases, etc.). For those knowing what they're doing and knowing that they won't need such functionality in JavaScript, they can already now use C.atexit().

Ben-Fields · 2021-03-21T13:32:09Z

I gave C.atexit() as a current workaround. I imagine the implementation would be identical to init(). The use cases would be similar as well.

> they can already now use C.atexit()

The goal as I understand is to continually eliminate C from V code.

@larpon

JalonSolov · 2021-03-21T14:46:36Z

Yes. There should be "pure" V implementations of everything. If that gets translated to C.atexit by cgen, that's fine, but it should not be the only implementation.

Otherwise you are throwing roadblocks to other backends, such as x64, JS, WASM, etc., etc.

dumblob · 2021-03-21T15:24:13Z

I might not have been clear enough - sorry for that. My comment wasn't about atexit() (note I explicitly wrote that I strongly discourage using it at all). My comment was about the definition of the semantics of deinit(), as there is no point in time at which to call it! So again:

When exactly should `deinit()` be called and how will safety be guaranteed?

(note, that even Rust is by far not expressive enough to describe the safety of an atexit() equivalent - therefore Rust does not even provide anything close to this)

Ben-Fields · 2021-03-21T15:57:49Z

Safety would be guaranteed in the usual way. For example, you might have the following for explicit manual memory management:

module my_module
[manualfree]
fn init() { /* ... */ }
[manualfree]
fn deinit() { /* ... */ }

Though you could use them, of course, without manual memory management.
deinit() would not be special in any other way besides that it is auto-called.
If I were implementing init() and deinit(), I would auto-generate calls at the beginning and end of main(), respectively. The exact order would be based on module dependency. Probably naive.
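A minimal sketch of what those auto-generated calls might look like if written out by hand. The function names (`net_init`, `gui_deinit`, and so on) are hypothetical stand-ins for per-module `init()`/`deinit()` hooks, and tearing down in reverse dependency order is an assumption, not something the proposal specifies.

```v
module main

// Hypothetical per-module hooks. In the proposal these would be the
// private init()/deinit() functions of each imported module.
fn net_init() {
	println('net: init')
}

fn net_deinit() {
	println('net: deinit')
}

fn gui_init() {
	println('gui: init')
}

fn gui_deinit() {
	println('gui: deinit')
}

fn main() {
	// What the compiler would insert first: dependencies before dependents.
	net_init()
	gui_init()

	println('user code in main()')

	// What the compiler would insert last (assumed order: reverse of init).
	gui_deinit()
	net_deinit()
}
```

Calls placed like this only run when main() reaches its end, which matches the "normal execution" case described later in the thread.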

dumblob · 2021-03-21T16:12:05Z

@Ben-Fields I'm afraid I still do not understand anything about the "safety" measure(s) you're proposing.

Let's put it this way - a module's init() is called with "no previous data available" and thus can be made safe, because there is no reference/pointer/ID/opened_descriptor/opened_connection/thread/lock/you_name_it and thus if nothing weird will be contained in init() itself, "nothing" can go wrong. On the contrary having deinit() implies one wants to work with existing reference/pointer/ID/opened_descriptor/opened_connection/thread/lock/you_name_it none of which can be made fully safe.

Ben-Fields · 2021-03-21T16:36:27Z

@dumblob From the docs:

> The init function cannot be public - it will be called automatically. This feature is particularly useful for initializing a C library.

You're right, maybe it won't be terribly useful beyond cleaning up C __globals or without unsafe blocks. Sorry if I'm missing your point yet again, but I'm not proposing you should be able to do anything beyond what's already possible in V inside deinit()?

@larpon Could you share your use-case for C.atexit() and/or deinit()?

larpon · 2021-03-21T20:27:30Z

Yes. My specific case is a C library I'm currently wrapping. The C library provides its lifetime management via 2 methods (like many, many other C libs do): it has an init() and a deinit(), and in between those two calls you get to use all the other useful calls.

Since V has a module init, that part was easy to adapt - but placing the deinit() is harder since V module life-cycle is currently only the init.

After working with it I'm still unsure how to solve it properly. If the module gets included in a big project involving several independent vendors that happen to include the module twice, having a module deinit() could make a mess since the module would be shut down twice? (or maybe it couldn't - I'm not sure how V handles it currently. Edit: I'm pretty sure module init is only called once no matter how many times it's included, so this is my best option since the C lib is "one instance only")

Wrapping it in a struct with a new/free cycle is also problematic since the C lib is only built for monolithic use (as far as I can see). So it's not really easy to put into a V "life-cycle" scheme. You could call it a kind of "singleton module"

In the end it'll probably just be shipped with a warning that tells you to only use one instance at every point in time - but currently my best option is the module level init() combined with C.atexit and os.signal() to cover the most cases I think.

I agree with @dumblob here that there's no reliable way to know when to shut things down for certain in all use cases. In my case a module deinit function should fire only once, when execution terminates in some way - be it by Ctrl+C or a "normal" exit from main just before it returns 0 to the calling shell.

I'm not sure about process suspension/continuing - in my case it's not using a ton of memory - so I probably won't deinit -> init again in this case

larpon · 2021-03-21T20:53:32Z

Hmm... maybe a module deinit with a way to specify at what points deinit should be called ?

dumblob · 2021-03-21T21:10:38Z

> with a way to specify at what points deinit should be called

Sounds like infinite dependency hell to me...

But how about approaching this whole thing differently? Namely, find a canonical way to make a singleton across processes, then a singleton across os threads and finally a singleton across go routines. If this is readily available, then manually calling any deinit()s at the right points in time will be a piece of cake with only advantages: it will be explicit, it will be independent of the language and the language's runtime (imagine modules or whatever will change in V 1.1 or in V 2.0, or some other "subtle" but enormously important underlying change will happen), and through visibility you get at least some quasi-safety.

larpon · 2021-03-21T21:53:43Z

Yeah, maybe 🤷‍♂️
We have shared so maybe a [singleton] attribute on top or something

dumblob · 2021-03-21T21:58:17Z

> We have shared so maybe a [singleton] attribute on top or something

Just to clarify for further discussion - I meant to have all 3 (independent from other) singleton kinds available in V. Not just one of them (and thus only one [singleton] attribute).

Ben-Fields · 2021-03-21T23:38:36Z

> find a canonical way to make singleton across processes

> We have shared so maybe a [singleton] attribute on top or something

If you allow multiple instantiations and freeing of a [singleton] at different points in time (still only one instantiated at a given time), I'm not sure that is something you could practically enforce at compile time (baggage for the user or compiler overhead, or restriction to main process). There are common singleton paradigms, but they are all managed by the user as far as I'm aware. Ensuring only one is ever instantiated may be useful...

I think this diverges from my original post; I was misleading bringing up "lifetime" I suppose. My thinking was that deinit() would not solve all de-initialization issues. Perhaps the name is misleading also, but it best mirrors the already established init().

Just as init() is called once at the beginning of the program, no matter how many times the module is imported, my thinking was deinit() would be called at the end of a program's normal execution (in effect at the end of main()), no matter how many times the module is imported. This is useful simply because modules that are not the main module do not have access to main().

If you want to deinit() on abnormal termination also, just pass it as a callback to os.signal(). We already have that, as has been mentioned. I think this fulfills @larpon 's use-case.

larpon · 2021-03-22T06:24:49Z

Yes, singletons are user controlled, agreed. It's a V version of C.atexit() and os.signal() I'm looking for. I think modules have a lifecycle, since there are points in time where they are inited and points in time where they clearly do not exist anymore - like after a process exits

dumblob · 2021-03-22T08:17:47Z

> If you allow multiple instantiations and freeing of a [singleton] at different points in time (still only one instantiated at a given time), I'm not sure that is something you could practically enforce at compile time (baggage for the user or compiler overhead, or restriction to main process). There are common singleton paradigms, but they are all managed by the user as far as I'm aware. Ensuring only one is ever instantiated may be useful...

Here I'll again point out, that I was talking about different lifetimes - and there are exactly 3 language-level lifetimes in V if I'm not mistaken - so the ever you're mentioning is important, but has at least 3 different mutually independent forms ("ever throughout the os process instance", "ever throughout the thread instance", and "ever throughout the go routine instance").

How this will be enforced is another question which I wasn't discussing 😉.

> I think this diverges from my original post; I was misleading bringing up "lifetime" I suppose. My thinking was that deinit() would not solve all de-initialization issues. Perhaps the name is misleading also, but it best mirrors the already established init().
>
> Just as init() is called once at the beginning of the program, no matter how many times the module is imported, my thinking was deinit() would be called at the end of a program's normal execution (in effect at the end of main()), no matter how many times the module is imported. This is useful simply because modules that are not the main module do not have access to main().

I think safety here is enormously important - and saying "useful because modules that are not the main module do not have access to main()" implies you want to perform some extremely brittle actions. So I'm really against anything like deinit() in whatever form. I'd rather introduce e.g. an attribute [has_to_be_called_at_least_once "you forgot to call mydeinitfn()"] for a given function/method from a module which would enforce in compile time calling it. This could then be named arbitrarily (e.g. deinit()) and you can easily call it with defer{} guaranteed at the end of main(). Easy, explicit, safe (as safe as the rest of V). And there can be any number of [has_to_be_called_at_least_once "you forgot to call myanotherfn()"] functions/methods in a module.
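The explicit half of that suggestion can be sketched with plain `defer`. The names `mylib_init` and `mylib_deinit` are hypothetical and inlined here only to keep the example self-contained; in practice they would live in the wrapper module. The `[has_to_be_called_at_least_once]` attribute is only a proposal in this thread, so the sketch does not use it.

```v
module main

// Hypothetical wrapper functions around a C library's lifecycle calls.
fn mylib_init() {
	println('mylib: init')
}

fn mylib_deinit() {
	println('mylib: deinit')
}

fn main() {
	mylib_init()
	// The deferred block runs when main() returns normally.
	defer {
		mylib_deinit()
	}
	println('working with mylib')
}
```

This covers a normal return from main() only; abnormal termination is the separate problem discussed in the rest of the thread.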

> If you want to deinit() on abnormal termination also, just pass it as a callback to os.signal(). We already have that, as has been mentioned. I think this fulfills @larpon 's use-case.

This won't work with JS backend nor with bare metal "backend" and I thought we agreed above we actually need a cross-platform solution. For now it should be enough, that's for sure 😉.

Ben-Fields · 2021-03-31T07:56:08Z

> I think modules have a lifecycle

To be clear, I completely agree with @dumblob 's first post that modules are a compile-time concept and do not have a lifetime.

I am backtracking on the initial "deinit()" part of the proposal, since the intent as interpreted goes beyond the role of modules and beyond what I think should be implemented. Pure-V termination detection was the original intent of this issue, which I ineptly titled "deinit"; I do not think its method of implementation should be that difficult to determine. However, I now agree that it is not something that should be added onto "modules".

With regards to os.signal():

> This won't work with JS backend nor with bare metal "backend" ... for now it should be enough, that's for sure

I'm unfamiliar with vweb, but ability to register js events would be required (ex. onbeforeunload). For embedded, you'll have to wrap / rely on C.atexit() and know the limitations of the embedded platform you're targeting. For direct assembly, it depends on the target, so an implementation of the previous system-level methods (os.signal()/C.atexit()) is needed. All these methods of abnormal termination could be bundled in an abort() function, declared simply like main(), and restricted to the main module. You could then use $ifs to distinguish targets, since the methods of termination are contextually different.

Some abort() function is not "one way" if mechanisms are available for the different abnormal termination events, but it "just works". Maybe keep to abort() and give warnings when those particular events are accessed individually. Keep in mind:

1. System callbacks are currently able to be registered in external modules
2. Due to the nature of abnormal termination, this is an instance in V where it is unclear how to pass data to the function, based on existing convention. Though less elegant, perhaps callbacks are better, save reason 1.

> I think safety here is enormously important

I'm going to slightly disagree on this one in that I more highly value simplicity and clarity than I do enforcement of 'safety' abstractions. I think the former, when the two are at odds, will produce better code (which is why V is an attractive language; though yes, it tries to do both).

For the provided use-case, either define a custom module init/deinit (vaguely similar to beginning/ending a graphics context) with [unsafe], or, as has been suggested, wrap the module in a struct with a defined free(). Both approaches are managed by the module user. The second method was dismissed due to lack of enforcement (ex. a singleton attribute). In this scenario, I think the user should not be guarded from misuse. It is not beyond the scope of the intended minimal use of unsafe blocks (you're directly interfacing with C). unsafe blocks are natural when you use [manualfree] in a function that isn't self-contained and/or you use code that isn't pure V.

Many of the standard modules are wrappers for C libraries. To gain safety, among other things, V gives up some power where raw data management is concerned, and to gain some of that power back in certain situations, it has great C interoperability, where you must (re-)compromise on safety.

As a side note, as I understand, a similar adherence to simplicity and clarity is why "auto (de)referencing is going away".

I think the above compromise is better than ensuring the module is initialized for the entire program lifetime, which may not be desired, or until the world is written in V, and we can cast aside distasteful 'unsafe' code dependencies.

Misc.

Modules are a static container for functionality. If it were solely up to me (which it's not), I would remove init(). V is very explicit. For example, functions are pure by default. The only code I want to be called implicitly is free(), a primary feature of the language.

There may be valid use cases for module-level init/deinit as discussed (once on start/termination), but I imagine they are rare such that it would not be a hindrance to have the module user explicitly call them. In the more common case, such as initializing a C library, the false assumption that the module will be used throughout the program's entire lifetime leads to wasting resources.

TL;DR: Don't overcomplicate. Implement some abort() function. Use unsafe when it fits best. Opinion.

dumblob · 2021-03-31T10:07:03Z

@Ben-Fields I think I understand. But could you sum up how it would look like in practice?

I actually see 2 different mutually orthogonal use cases.

1. The module creator knows there are some activities needed to be executed at the very beginning (before any meaningful computation begins) and/or at the very end of the os process. This is what init() in a module currently does (and assuming there will be a language-level built-in process-global list of callbacks to call "at exit" - i.e. at the very end of main() - we can easily do atexit.add( fn(){content_of_my_callback...} ) in module's init()). This is probably something others had in mind above.

   Note though, this does not solve the primary issue (the motivation of this whole discussion as it seems) - namely it doesn't stop wasting resources at all.

2. The module creator knows, there are some activities needed to be executed just before a given struct/object/function/method from the module starts to be used and/or right after it stops to be used. Here I think it's enough if mymod.MyStruct.new() or mystruct_instance.init() returns a callable (preferably with no arguments - i.e. the callable would be a closure) which the programmer would be obliged to call whenever she finds fit and wants to free resources. It's a similar principle as with f := open( 'a' ) defer{ f.close() }. V documentation shall then emphasize, that the returned closure callable shall be both reentrant and thread safe to make programmer's life easier.

   Note, this is what actually does solve the primary issue - namely it does save the resources!

As a side note to enforce calling the closure callable something like "context manager" block (imagine with in Python) would be needed. But that has IMHO more cons than pros.

If both (1) and (2) get implemented, then nothing stops you to prepare some thread-locking in init() and then using it in MyStruct.new() (imagine you're implementing a thread-local storage module and you have to support both systems which natively support TLS and those which don't in which case you want to simulate the behavior using locks - as it's being done in Python).
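Use case (2) could look roughly like the sketch below. It assumes a V version with closure support (a later comment in this thread notes closures were not implemented at the time). `Conn` and `open_conn` are hypothetical names standing in for `mymod.MyStruct` and its constructor, and the release closure here makes no attempt at being reentrant or thread safe.

```v
module main

// Hypothetical resource type standing in for `mymod.MyStruct`.
struct Conn {
	name string
}

// open_conn acquires the resource and returns it together with a
// zero-argument closure that releases it.
fn open_conn(name string) (Conn, fn ()) {
	conn := Conn{
		name: name
	}
	println('open ${name}')
	// The closure captures `name`, so the caller needs no extra context.
	release := fn [name] () {
		println('close ${name}')
	}
	return conn, release
}

fn main() {
	conn, release := open_conn('db')
	// Same shape as the `open` / `defer { f.close() }` idiom.
	defer {
		release()
	}
	println('using ${conn.name}')
}
```

The caller decides when to release, so the resource does not have to stay alive for the whole program.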

Ben-Fields · 2021-03-31T22:06:46Z

@dumblob

> The module creator knows, there are some activities needed to be executed just before a given struct/object/function/method from the module starts to be used and/or right after it stops to be used. Here I think it's enough if mymod.MyStruct.new() or mystruct_instance.init() returns a callable (preferably with no arguments - i.e. the callable would be a closure) which the programmer would be obliged to call whenever she finds fit and wants to free resources. It's a similar principle as with f := open( 'a' ) defer{ f.close() }. V documentation shall then emphasize, that the returned closure callable shall be both reentrant and thread safe to make programmer's life easier.

This exactly. There is locality, it's similar to file resource management, and documentation is hugely important. With existing V, it's simple, and is this not already possible?

This is where I think the role of the "module creator" ends, and the "module user" has responsibility to use the module correctly. As such, I am against introducing measures such as to "enforce calling the closure callable" (agree that there are more cons than pros).

In Larpon's situation, there is some global state. I maintain that I would use something similar to the dismissed struct method / what is quoted above, giving power/responsibility to the module user. The global state case is a rare case (globals are discouraged in V). Importing two modules that both use the global state module as a dependency is an even rarer case, so perhaps simply keep an initialization flag to ensure the initialization / de-initialization only happens once.
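A minimal sketch of such an initialization flag. To avoid `__global`, it keeps the flag in a struct that the module user owns and passes around, which is an assumption on my part; `LibState`, `setup` and `teardown` are hypothetical names, and the guard is not thread safe.

```v
module main

// Hypothetical guard around a single-instance C library.
struct LibState {
mut:
	initialized bool
}

// setup initializes the library only if it is not already initialized.
fn (mut s LibState) setup() {
	if s.initialized {
		return
	}
	s.initialized = true
	println('lib: init')
}

// teardown shuts the library down only if it is currently initialized.
fn (mut s LibState) teardown() {
	if !s.initialized {
		return
	}
	s.initialized = false
	println('lib: deinit')
}

fn main() {
	mut state := LibState{}
	state.setup()
	state.setup() // no-op: already initialized
	state.teardown()
	state.teardown() // no-op: already shut down
}
```

Each message is printed once, so two dependents calling `setup()` and `teardown()` on the same state would not initialize or shut down the library twice.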

Also, in situations I can envision, I do not think a module should hook into init() or atexit() (minimize 'hidden execution'). The module user has responsibility to call them / place them in the right hooks.

Separately from the discussion of general initialization and de-initialization, I do think there needs to be an atexit/abort/whatever in pure V:

fn main() {
    on_abort( my_cleanup_callback_1( /* ... */ ) ) // part of builtin, causes warning / prod error outside of 
                                                   // main module, some way to specify context/data
    // ...
}

fn my_cleanup_callback_1( /* ... */ ) {
    // Unexpected or asynchronous program end.
    my_cleanup( /* ... */ )
    output_to_log( /* ... */ )
    ensure_user_data_saved( /* ... */ )
    module1.cleanup( /*... */ )
    module2.end( /* ... */ )
}

Either the validity of the passed data is handled with typical error handling (in a similar vein to struct DAGNode { node: ?Node }), OR
the passed callbacks may only accept one variable. The compiler inserts code after the variable's free() to remove that callback from the callback list.

dumblob · 2021-04-01T08:59:03Z

Very well, then we're in the same boat 😉.

> With existing V, it's simple, and is this not already possible?

Unfortunately not - closures are not yet implemented if I'm not mistaken. So these rather important programming patterns did (could) not emerge yet 😢.

> Separately from the discussion of general initialization and de-initialization, I do think there needs to be an atexit/abort/whatever in pure V:

This is actually more complex than it seems from your example and quick description. Generally it boils down to the error() & panic() dualism - the only way an abort can happen in V is via assert or panic(). So panic() is the only culprit and it's still an unsolved issue (but I hope that after #9367 gets fully implemented, there might be a slightly smaller number of panic()s, and hopefully this trend goes on until panic() eventually dies and disappears before V 1.0).

In other words without panic() there is no need for on_abort() nor anything similar 😉 (asynchronous programming in V is explicit, so nothing hidden and all state is explicitly managed by the user, so there is no need for on_abort() either).

Ben-Fields · 2021-04-02T00:37:37Z

@dumblob I appreciate the info, shared issues and your role in them.

I see that this issue needs to be revisited when several parts of the language are refined.

leighmcculloch · 2021-05-03T14:59:12Z

> deinit(). Similar to init(), but called at the end of the module's lifetime. (also similar to free() for structs).

I'd like to not see deinit() introduced, and for us to reconsider the inclusion of init(). The inclusion of either seems at odds with the goal of having no globals. Both globals and module functions that get automatically executed on module import, etc., cause side effects that aren't easy to understand as a user of a module.

The module could provide a function for deinitialization that the importer calls when they are finished with the module. This is much clearer for the importer, easier for readers to understand, and arguably tiny overhead for the importer. It's one function call. This provides much more flexibility because the importer may want to deinitialize the module before the program is finished.

This is also discussed in #7610.

Ben-Fields · 2021-05-04T13:32:04Z

@leighmcculloch Agree completely. There are just a few cases where "a function for deinitialization" cannot be provided due to V's strict scoping. Initial post did not reflect discussion; updated.

leighmcculloch · 2021-05-05T02:01:02Z

@Ben-Fields Could you share a code example of where a function for deinitialization couldn't be used?

Ben-Fields added the Feature Request label Mar 20, 2021

zakuro9715 added the Type: Discussion label Mar 22, 2021

Ben-Fields changed the title ~~module deinit()~~ module deinitialization paradigms, cross-platform termination detection Apr 2, 2021

dumblob mentioned this issue Apr 3, 2021

Return value from main #9558

Closed

vlang locked and limited conversation to collaborators Sep 22, 2021

medvednikov closed this as completed Sep 22, 2021

This issue was moved to a discussion.