This issue was moved to a discussion.

module deinitialization paradigms, cross-platform termination detection #9388

Closed
Ben-Fields opened this issue Mar 20, 2021 · 23 comments
Labels
Feature Request This issue is made to request a feature.

Comments

@Ben-Fields
Contributor

Ben-Fields commented Mar 20, 2021

deinit(). Similar to init(), but called at the end of the module's lifetime. (also similar to free() for structs).

This can currently be accomplished using C code - C.atexit() (int atexit(void (*func)(void))).

C.atexit() exists, but with scoping & avoiding globals, the relevant context(s) cannot be accessed. Request more robust cross-platform termination detection, and a potential similar alternative to init(), so that meta/system-level control flow can be properly handled by the main module.
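To illustrate the scoping problem described above, here is a minimal C sketch. The names `AudioContext`, `audio_at_exit`, and `audio_register_cleanup` are hypothetical and only stand in for a wrapper module's state. Because an atexit() handler takes no arguments, the only way it can reach that state is through a file-scope variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical state owned by a wrapper module. */
typedef struct {
    int open_handles;
} AudioContext;

/* The handler signature is void(void), so the context has to live in
   file-scope storage for the handler to find it. */
static AudioContext *g_ctx = NULL;

/* Exit handler: releases whatever the global currently points at. */
void audio_at_exit(void) {
    if (g_ctx != NULL) {
        g_ctx->open_handles = 0; /* stand-in for real cleanup */
        g_ctx = NULL;            /* makes a second call a no-op */
    }
}

/* Stores the context globally and registers the handler.
   Returns 0 on success, like atexit() itself. */
int audio_register_cleanup(AudioContext *ctx) {
    assert(ctx != NULL);
    g_ctx = ctx;
    return atexit(audio_at_exit);
}
```

The context passed in must outlive main(), since the handler runs after main() returns.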

Discussion TL;DR: Current methods not cross-platform (C.atexit(), os.signal()), encourage wasting resources (init()), encourage sub-modules to alter system-level control flow (os.signal(), exit(), panic()). Pattern with closures proposed.

Sub-issue of sub-modules having globals and side-effects. Globals are a necessity for C interop and low-level code (effectively discouraged for general use), but side-effects (making globals apparent to the main module, or altering system-level control flow) can and should be avoided, or at least not encouraged.

@Ben-Fields Ben-Fields added the Feature Request This issue is made to request a feature. label Mar 20, 2021
@dumblob
Contributor

dumblob commented Mar 21, 2021

I'm afraid this is mixing two inherently different things. atexit() registers a callback to be called when a running os process is about to exit.

But this has nothing to do with modules (module is a compile-time concept - anything runtime is not handled by V AFAIK and thus is handled explicitly by the programmer - imagine something like manual DLL loading using winapi). Remember, V modules are statically compiled, so there is no point in time when a module is "not reachable" or "unloaded" etc.

So could you clarify what's the use case and the desired semantics of a potential trigger running a deinit() function?


As a side note, I'd discourage anyone from using atexit() as it has pretty constrained semantics - most of the program is no longer valid (so you can't just read any data from anywhere, you can't just execute arbitrary functions, you don't know what the threading state and scheduling look like in edge cases, etc.). For those who know what they're doing and know that they won't need such functionality in JavaScript, they can already now use C.atexit().

@Ben-Fields
Contributor Author

I gave C.atexit() as a current workaround. I imagine the implementation would be identical to init(). The use cases would be similar as well.

they can already now use C.atexit()

The goal as I understand is to continually eliminate C from V code.

@larpon

@JalonSolov
Contributor

Yes. There should be "pure" V implementations of everything. If that gets translated to C.atexit by cgen, that's fine, but it should not be the only implementation.

Otherwise you are throwing roadblocks to other backends, such as x64, JS, WASM, etc., etc.

@dumblob
Contributor

dumblob commented Mar 21, 2021

I might not have been clear enough - sorry for that. My comment wasn't about atexit() (note I explicitly wrote that I strongly discourage using it at all). My comment was about the definition of the semantics of deinit() as there is no point in time when to call it! So again:

When exactly should deinit() be called and how will safety be guaranteed?

(note, that even Rust is by far not expressive enough to describe the safety of an atexit() equivalent - therefore Rust does not even provide anything close to this)

@Ben-Fields
Contributor Author

Safety would be guaranteed in the usual way. For example, you might have the following for explicit manual memory management:

module my_module
[manualfree]
fn init() { /* ... */ }
[manualfree]
fn deinit() { /* ... */ }

Though you could use them, of course, without manual memory management.
deinit() would not be special in any other way besides that it is auto-called.
If I were implementing init() and deinit(), I would auto-generate calls at the beginning and end of main(), respectively. The exact order would be based on module dependency. Probably naive.
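A rough C sketch of that idea, under the assumption that the compiler has already sorted modules so that dependencies come first. The `Module` type, the demo modules, and the `trace` buffer are all hypothetical. Deinit runs in the reverse of init order so that a module is torn down before anything it depends on.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *name;
    void (*init)(void);
    void (*deinit)(void);
} Module;

/* Runs init hooks front to back; `mods` is assumed to be sorted so that
   dependencies come before their dependents. */
void modules_init(const Module *mods, size_t n) {
    assert(mods != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (mods[i].init != NULL) mods[i].init();
}

/* Runs deinit hooks back to front, the reverse of init order. */
void modules_deinit(const Module *mods, size_t n) {
    assert(mods != NULL || n == 0);
    for (size_t i = n; i-- > 0; )
        if (mods[i].deinit != NULL) mods[i].deinit();
}

/* Demo modules: each hook appends one character to a trace buffer. */
char trace[16];
size_t trace_len = 0;

static void record(char c) {
    if (trace_len + 1 < sizeof trace) {
        trace[trace_len++] = c;
        trace[trace_len] = '\0';
    }
}

static void core_init(void)   { record('a'); }
static void core_deinit(void) { record('A'); }
static void net_init(void)    { record('b'); } /* "net" depends on "core" */
static void net_deinit(void)  { record('B'); }

const Module demo_modules[] = {
    { "core", core_init, core_deinit },
    { "net",  net_init,  net_deinit  },
};
```

A generated main() would call `modules_init` first and `modules_deinit` last; this covers only normal return from main(), not abnormal termination.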

@dumblob
Contributor

dumblob commented Mar 21, 2021

@Ben-Fields I'm afraid I still do not understand anything about the "safety" measure(s) you're proposing.

Let's put it this way - a module's init() is called with "no previous data available" and thus can be made safe, because there is no reference/pointer/ID/opened_descriptor/opened_connection/thread/lock/you_name_it and thus if nothing weird will be contained in init() itself, "nothing" can go wrong. On the contrary having deinit() implies one wants to work with existing reference/pointer/ID/opened_descriptor/opened_connection/thread/lock/you_name_it none of which can be made fully safe.

@Ben-Fields
Contributor Author

@dumblob From the docs:

The init function cannot be public - it will be called automatically. This feature is particularly useful for initializing a C library.

You're right, maybe it won't be terribly useful beyond cleaning up C __globals or without unsafe blocks. Sorry if I'm missing your point yet again, but I'm not proposing you should be able to do anything beyond what's already possible in V inside deinit()?

@larpon Could you share your use-case for C.atexit() and/or deinit()?

@larpon
Contributor

larpon commented Mar 21, 2021

Yes. My specific case is a C library I'm currently wrapping. The C library provides its lifetime management via 2 methods (like many many other C libs do): it has an init() and a deinit() - and in between those two calls you get to use all the other useful calls.

Since V has a module init, that part was easy to adapt - but placing the deinit() is harder since V module life-cycle is currently only the init.

After working with it I'm still unsure how to solve it properly. If the module gets included in a big project involving several independent vendors that happen to include the module twice, having a module deinit() could make a mess since the module would be shut down twice? (or maybe it couldn't - I'm not sure how V handles it currently. Edit: I'm pretty sure module init is only called once no matter how many times it's included, so this is my best option since the C lib is "one instance only")

Wrapping it in a struct with a new/free cycle is also problematic since the C lib is only built for monolithic use (as far as I can see). So it's not really easy to put into a V "life-cycle" scheme. You could call it a kind of "singleton module"

In the end it'll probably just be shipped with a warning that tells you to only use one instance at every point in time - but currently my best option is the module level init() combined with C.atexit and os.signal() to cover the most cases I think.

I agree with @dumblob here that there's no reliable way to know when to shut things down for certain in all use cases. In my case a module deinit function should fire only once, when execution terminates in some way - be it by Ctrl+C or "normal" exit from main just before it returns 0 to the calling shell.

I'm not sure about process suspension/continuing - in my case it's not using a ton of memory - so I probably won't deinit -> init again in this case
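The combination described in this comment (init plus C.atexit plus a signal hook, with shutdown firing only once) could look roughly like this in C. All names are hypothetical, and `c_lib_init`/`c_lib_deinit` are stubs standing in for the wrapped library. The signal handler only sets a flag, on the assumption that the program's main loop polls it and then returns from main() normally, which lets the atexit() handler do the actual cleanup.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>

/* Stand-in for the wrapped single-instance C library (hypothetical). */
static int c_lib_live = 0;
int shutdown_calls = 0;
static void c_lib_init(void)   { assert(!c_lib_live); c_lib_live = 1; }
static void c_lib_deinit(void) { assert(c_lib_live);  c_lib_live = 0; shutdown_calls++; }

static volatile sig_atomic_t stop_requested = 0;

/* Signal handlers can safely do very little, so this one only sets a flag. */
static void on_sigint(int sig) {
    (void)sig;
    stop_requested = 1;
}

/* Safe to call from several places: only the first call after a start
   reaches the library. */
void wrapper_shutdown(void) {
    if (c_lib_live) c_lib_deinit();
}

/* Plays the role of the module init(): starts the library and hooks both
   exit paths. Returns 0 on success, -1 on failure. */
int wrapper_start(void) {
    if (c_lib_live) return 0;
    c_lib_init();
    if (atexit(wrapper_shutdown) != 0) return -1;
    if (signal(SIGINT, on_sigint) == SIG_ERR) return -1;
    return 0;
}

/* Polled by the main loop to notice Ctrl+C. */
int wrapper_should_stop(void) { return stop_requested != 0; }
```

This still relies on the C runtime and on signals, so it does not address the JS or bare-metal backends raised elsewhere in the thread.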

@larpon
Contributor

larpon commented Mar 21, 2021

Hmm... maybe a module deinit with a way to specify at what points deinit should be called ?

@dumblob
Contributor

dumblob commented Mar 21, 2021

with a way to specify at what points deinit should be called

Sounds like infinite dependency hell to me...

But how about approaching this whole thing differently? Namely find a canonical way to make a singleton across processes, then a singleton across os threads and finally a singleton across go routines. If this is readily available, then manually calling any deinit()s at the right points in time should be a piece of cake with only advantages: it will be explicit, it will be independent from the language and the language's runtime (imagine modules or whatever will change in V 1.1 or in V 2.0 or some other "subtle" but enormously important underlying change will happen) and through visibility you get at least some quasi-safety.

@larpon
Contributor

larpon commented Mar 21, 2021

Yeah, maybe 🤷‍♂️
We have shared so maybe a [singleton] attribute on top or something

@dumblob
Contributor

dumblob commented Mar 21, 2021

We have shared so maybe a [singleton] attribute on top or something

Just to clarify for further discussion - I meant to have all 3 (independent from other) singleton kinds available in V. Not just one of them (and thus only one [singleton] attribute).

@Ben-Fields
Contributor Author

find a canonical way to make singleton across processes

We have shared so maybe a [singleton] attribute on top or something

If you allow multiple instantiations and freeing of a [singleton] at different points in time (still only one instantiated at a given time), I'm not sure that is something you could practically enforce at compile time (baggage for the user or compiler overhead, or restriction to main process). There are common singleton paradigms, but they are all managed by the user as far as I'm aware. Ensuring only one is ever instantiated may be useful...

I think this diverges from my original post; I was misleading bringing up "lifetime" I suppose. My thinking was that deinit() would not solve all de-initialization issues. Perhaps the name is misleading also, but it best mirrors the already established init().

Just as init() is called once at the beginning of the program, no matter how many times the module is imported, my thinking was deinit() would be called at the end of a program's normal execution (in effect at the end of main()), no matter how many times the module is imported. This is useful simply because modules that are not the main module do not have access to main().

If you want to deinit() on abnormal termination also, just pass it as a callback to os.signal(). We already have that, as has been mentioned. I think this fulfills @larpon 's use-case.

@larpon
Contributor

larpon commented Mar 22, 2021

Yes, singletons are user controlled, agreed. It's a V version of C.atexit() and os.signal() I'm looking for. I think modules have a lifecycle since there are points in time where they are inited and points in time where they clearly do not exist anymore - like after a process exits

@dumblob
Contributor

dumblob commented Mar 22, 2021

If you allow multiple instantiations and freeing of a [singleton] at different points in time (still only one instantiated at a given time), I'm not sure that is something you could practically enforce at compile time (baggage for the user or compiler overhead, or restriction to main process). There are common singleton paradigms, but they are all managed by the user as far as I'm aware. Ensuring only one is ever instantiated may be useful...

Here I'll again point out, that I was talking about different lifetimes - and there are exactly 3 language-level lifetimes in V if I'm not mistaken - so the ever you're mentioning is important, but has at least 3 different mutually independent forms ("ever throughout the os process instance", "ever throughout the thread instance", and "ever throughout the go routine instance").

How this will be enforced is another question which I wasn't discussing 😉.

I think this diverges from my original post; I was misleading bringing up "lifetime" I suppose. My thinking was that deinit() would not solve all de-initialization issues. Perhaps the name is misleading also, but it best mirrors the already established init().

Just as init() is called once at the beginning of the program, no matter how many times the module is imported, my thinking was deinit() would be called at the end of a program's normal execution (in effect at the end of main()), no matter how many times the module is imported. This is useful simply because modules that are not the main module do not have access to main().

I think safety here is enormously important - and saying "useful because modules that are not the main module do not have access to main()" implies you want to perform some extremely brittle actions. So I'm really against anything like deinit() in whatever form. I'd rather introduce e.g. an attribute [has_to_be_called_at_least_once "you forgot to call mydeinitfn()"] for a given function/method from a module which would enforce in compile time calling it. This could then be named arbitrarily (e.g. deinit()) and you can easily call it with defer{} guaranteed at the end of main(). Easy, explicit, safe (as safe as the rest of V). And there can be any number of [has_to_be_called_at_least_once "you forgot to call myanotherfn()"] functions/methods in a module.

If you want to deinit() on abnormal termination also, just pass it as a callback to os.signal(). We already have that, as has been mentioned. I think this fulfills @larpon 's use-case.

This won't work with the JS backend nor with the bare metal "backend" and I thought we agreed above we actually need a cross-platform solution. For now it should be enough, that's for sure 😉.

@Ben-Fields
Contributor Author

I think modules have a lifecycle

To be clear, I completely agree with @dumblob 's first post that modules are a compile-time concept and do not have a lifetime.

I am backtracking on the initial "deinit()" part of the proposal, seeing that the interpreted intent is beyond the scope of the role of modules / what I think should be implemented. Pure-V termination detection was my original intent for the posted issue (ineptly titled "deinit" on my part), and I do not think it should be as difficult to settle on a method of implementation for it. However, I now agree that it is not something that should be added onto "modules".

With regards to os.signal():

This won't work with JS backend nor with bare metal "backend" ... for now it should be enough, that's for sure

I'm unfamiliar with vweb, but ability to register js events would be required (ex. onbeforeunload). For embedded, you'll have to wrap / rely on C.atexit() and know the limitations of the embedded platform you're targeting. For direct assembly, it depends on the target, so an implementation of the previous system-level methods (os.signal()/C.atexit()) is needed. All these methods of abnormal termination could be bundled in an abort() function, declared simply like main(), and restricted to the main module. You could then use $ifs to distinguish targets, since the methods of termination are contextually different.

Some abort() function is not "one way" if mechanisms are available for the different abnormal termination events, but it "just works". Maybe keep to abort() and give warnings when those particular events are accessed individually. Keep in mind:

  1. System callbacks are currently able to be registered in external modules
  2. Due to the nature of abnormal termination, this is an instance in V where it is unclear how to pass data to the function, based on existing convention. Though less elegant, perhaps callbacks are better, save reason 1.

I think safety here is enormously important

I'm going to slightly disagree on this one in that I more highly value simplicity and clarity than I do enforcement of 'safety' abstractions. I think the former, when the two are at odds, will produce better code (which is why V is an attractive language; though yes, it tries to do both).

For the provided use-case, either define a custom module init/deinit (vaguely similar to beginning/ending a graphics context) with [unsafe], or, as has been suggested, wrap the module in a struct with a defined free(). Both approaches are managed by the module user. The second method was dismissed due to lack of enforcement (ex. a singleton attribute). In this scenario, I think the user should not be guarded from misuse. It is not beyond the scope of the intended minimal use of unsafe blocks (you're directly interfacing with C). unsafe blocks are natural when you use [manualfree] in a function that isn't self-contained and/or you use code that isn't pure V.

Many of the standard modules are wrappers for C libraries. To gain safety, among other things, V gives up some power where raw data management is concerned, and to gain some of that power back in certain situations, it has great C interoperability, where you must (re-)compromise on safety.

As a side note, as I understand, a similar adherence to simplicity and clarity is why "auto (de)referencing is going away".

I think the above compromise is better than ensuring the module is initialized for the entire program lifetime, which may not be desired, or until the world is written in V, and we can cast aside distasteful 'unsafe' code dependencies.

Misc.

Modules are a static container for functionality. If it were solely up to me (which it's not), I would remove init(). V is very explicit. For example, functions are pure by default. The only code I want to be called implicitly is free(), a primary feature of the language.

  • There may be valid use cases for module-level init/deinit as discussed (once on start/termination), but I imagine they are rare such that it would not be a hindrance to have the module user explicitly call them. In the more common case, such as initializing a C library, the false assumption that the module will be used throughout the program's entire lifetime leads to wasting resources.

TL;DR: Don't overcomplicate. Implement some abort() function. Use unsafe when it fits best. Opinion.

@dumblob
Contributor

dumblob commented Mar 31, 2021

@Ben-Fields I think I understand. But could you sum up what it would look like in practice?

I actually see 2 different mutually orthogonal use cases.

  1. The module creator knows there are some activities needed to be executed at the very beginning (before any meaningful computation begins) and/or at the very end of the os process. This is what init() in a module currently does (and assuming there will be a language-level built-in process-global list of callbacks to call "at exit" - i.e. at the very end of main() - we can easily do atexit.add( fn(){content_of_my_callback...} ) in module's init()). This is probably something others had in mind above.

    Note though, this does not solve the primary issue (the motivation of this whole discussion as it seems) - namely it doesn't stop wasting resources at all.

  2. The module creator knows, there are some activities needed to be executed just before a given struct/object/function/method from the module starts to be used and/or right after it stops to be used. Here I think it's enough if mymod.MyStruct.new() or mystruct_instance.init() returns a callable (preferably with no arguments - i.e. the callable would be a closure) which the programmer would be obliged to call whenever she finds fit and wants to free resources. It's a similar principle as with f := open( 'a' ) defer{ f.close() }. V documentation shall then emphasize, that the returned closure callable shall be both reentrant and thread safe to make programmer's life easier.

    Note, this is what actually does solve the primary issue - namely it does save the resources!

    As a side note to enforce calling the closure callable something like "context manager" block (imagine with in Python) would be needed. But that has IMHO more cons than pros.

If both (1) and (2) get implemented, then nothing stops you to prepare some thread-locking in init() and then using it in MyStruct.new() (imagine you're implementing a thread-local storage module and you have to support both systems which natively support TLS and those which don't in which case you want to simulate the behavior using locks - as it's being done in Python).
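For use case (1), a process-global list of exit callbacks might be sketched in C as follows. This is an assumption about what such a list could look like, not an existing V or C API; `exit_list_add`, `exit_list_run`, and `Resource` are hypothetical names. Each callback is stored with a context pointer, which is what plain atexit() lacks, and the list is drained newest first at the end of main().

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EXIT_CALLBACKS 16

typedef void (*ExitFn)(void *ctx);

static struct { ExitFn fn; void *ctx; } exit_list[MAX_EXIT_CALLBACKS];
static size_t exit_count = 0;

/* Registers a callback with its context.
   Returns 0 on success, -1 if the list is full. */
int exit_list_add(ExitFn fn, void *ctx) {
    assert(fn != NULL);
    if (exit_count == MAX_EXIT_CALLBACKS) return -1;
    exit_list[exit_count].fn = fn;
    exit_list[exit_count].ctx = ctx;
    exit_count++;
    return 0;
}

/* Runs callbacks newest first; each entry is removed before it is called,
   so running the list a second time does nothing. */
void exit_list_run(void) {
    while (exit_count > 0) {
        exit_count--;
        exit_list[exit_count].fn(exit_list[exit_count].ctx);
    }
}

/* Demo resource: releasing it appends its id to a decimal trace. */
typedef struct { int id; int *trace; } Resource;

void release_resource(void *ctx) {
    Resource *r = ctx;
    *r->trace = *r->trace * 10 + r->id;
}
```

As noted in point (1), registering callbacks this way keeps resources alive until the process ends; it does nothing to release them earlier.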

@Ben-Fields
Contributor Author

@dumblob

The module creator knows, there are some activities needed to be executed just before a given struct/object/function/method from the module starts to be used and/or right after it stops to be used. Here I think it's enough if mymod.MyStruct.new() or mystruct_instance.init() returns a callable (preferably with no arguments - i.e. the callable would be a closure) which the programmer would be obliged to call whenever she finds fit and wants to free resources. It's a similar principle as with f := open( 'a' ) defer{ f.close() }. V documentation shall then emphasize, that the returned closure callable shall be both reentrant and thread safe to make programmer's life easier.

This exactly. There is locality, it's similar to file resource management, and documentation is hugely important. With existing V, it's simple, and is this not already possible?
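The pattern quoted above can be approximated in C, where a function pointer paired with its context stands in for the closure. The names `Cleanup`, `cleanup_call`, `Session`, and `session_new` are hypothetical. The sketch makes the cleanup safe to call more than once, but it is not thread safe; that would need a lock or an atomic exchange on top.

```c
#include <assert.h>
#include <stdlib.h>

/* A function pointer plus its context, standing in for a closure. */
typedef struct {
    void (*fn)(void *ctx);
    void *ctx;
} Cleanup;

/* Calls the cleanup at most once; later calls are no-ops. */
void cleanup_call(Cleanup *c) {
    assert(c != NULL);
    if (c->fn == NULL) return;
    void (*fn)(void *) = c->fn;
    c->fn = NULL; /* clear first so a repeated call cannot free twice */
    fn(c->ctx);
}

/* Hypothetical resource handed out by a module. */
typedef struct { int *buf; } Session;

static void session_free(void *ctx) {
    Session *s = ctx;
    free(s->buf);
    free(s);
}

/* Creates a session and fills `out` with the matching cleanup.
   Returns NULL (and an empty cleanup) if allocation fails. */
Session *session_new(size_t n, Cleanup *out) {
    assert(out != NULL);
    out->fn = NULL;
    out->ctx = NULL;
    Session *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->buf = calloc(n, sizeof *s->buf);
    if (s->buf == NULL) { free(s); return NULL; }
    out->fn = session_free;
    out->ctx = s;
    return s;
}
```

The caller decides when to invoke `cleanup_call`, much like pairing open() with a deferred close(); the session pointer must not be used afterwards.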

This is where I think the role of the "module creator" ends, and the "module user" has responsibility to use the module correctly. As such, I am against introducing measures such as to "enforce calling the closure callable" (agree that there are more cons than pros).

In Larpon's situation, there is some global state. I maintain that I would use something similar to the dismissed struct method / what is quoted above, giving power/responsibility to the module user. The global state case is a rare case (globals are discouraged in V). Importing two modules that both use the global state module as a dependency is an even rarer case, so perhaps simply keep an initialization flag to ensure the initialization / de-initialization only happens once.
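The initialization flag mentioned above might look like this in C. `gfx_init` and `gfx_deinit` are hypothetical wrapper functions, and the `c_lib_*` stubs stand in for a library that tolerates only one init/deinit pair at a time.

```c
#include <assert.h>

/* Stand-ins for the wrapped library; the asserts encode its
   "one instance only" contract. */
int init_calls = 0;
int deinit_calls = 0;
static void c_lib_init(void)   { assert(init_calls == deinit_calls);     init_calls++; }
static void c_lib_deinit(void) { assert(init_calls == deinit_calls + 1); deinit_calls++; }

static int gfx_initialized = 0;

/* Returns 1 if this call initialized the library, 0 if it already was. */
int gfx_init(void) {
    if (gfx_initialized) return 0;
    gfx_initialized = 1;
    c_lib_init();
    return 1;
}

/* Returns 1 if this call shut the library down, 0 if it was not running. */
int gfx_deinit(void) {
    if (!gfx_initialized) return 0;
    gfx_initialized = 0;
    c_lib_deinit();
    return 1;
}
```

With a plain flag, the first importer to call `gfx_deinit` shuts the library down for every other importer too; a counter instead of a flag would be one way around that, at the cost of requiring balanced calls.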

Also, in situations I can envision, I do not think a module should hook into init() or atexit() (minimize 'hidden execution'). The module user has responsibility to call them / place them in the right hooks.


Separately from the discussion of general initialization and de-initialization, I do think there needs to be an atexit/abort/whatever in pure V:

fn main() {
    on_abort( my_cleanup_callback_1( /* ... */ ) ) // part of builtin, causes warning / prod error outside of 
                                                   // main module, some way to specify context/data
    // ...
}

fn my_cleanup_callback_1( /* ... */ ) {
    // Unexpected or asynchronous program end.
    my_cleanup( /* ... */ )
    output_to_log( /* ... */ )
    ensure_user_data_saved( /* ... */ )
    module1.cleanup( /*... */ )
    module2.end( /* ... */ )
}
  1. Either the validity of the passed data is handled with typical error handling (in a similar vein to struct DAGNode { node: ?Node }), OR
  2. The passed callbacks may only accept one variable. The compiler inserts code after the variable's free() to remove that callback from the callback list.

@dumblob
Contributor

dumblob commented Apr 1, 2021

Very well, then we're on the same boat 😉.

With existing V, it's simple, and is this not already possible?

Unfortunately not - closures are not yet implemented if I'm not mistaken. So these rather important programming patterns did (could) not emerge yet 😢.

Separately from the discussion of general initialization and de-initialization, I do think there needs to be an atexit/abort/whatever in pure V:

This is actually more complex than it seems from your example and quick description. Generally it boils down to the error() & panic() dualism - the only way an abort can happen in V is either by assert or by panic(). So panic() is the only culprit and it's still an unsolved issue (but I hope that after #9367 gets fully implemented, there might be a slightly smaller number of panic()s, and hopefully this trend goes on until panic() eventually dies and disappears before V 1.0).

In other words without panic() there is no need for on_abort() nor anything similar 😉 (asynchronous programming in V is explicit, so nothing hidden and all state is explicitly managed by the user, so there is no need for on_abort() either).

@Ben-Fields
Contributor Author

@dumblob I appreciate the info, shared issues and your role in them.

I see that this issue needs to be revisited when several parts of the language are refined.

@Ben-Fields Ben-Fields changed the title from "module deinit()" to "module deinitialization paradigms, cross-platform termination detection" Apr 2, 2021
@leighmcculloch
Contributor

leighmcculloch commented May 3, 2021

deinit(). Similar to init(), but called at the end of the module's lifetime. (also similar to free() for structs).

I'd like to not see deinit() introduced, and for us to reconsider the inclusion of init(). The inclusion of either seems at odds with the goal of having no globals. Both globals and module functions that get automatically executed on module import, etc., cause side effects that aren't easy to understand as a user of a module.

The module could provide a function for deinitialization that the importer calls when they are finished with the module. This is much clearer for the importer, easier for readers to understand, and arguably tiny overhead for the importer. It's one function call. This provides much more flexibility because the importer may want to deinitialize the module before the program is finished.

This is also discussed in #7610.

@Ben-Fields
Contributor Author

@leighmcculloch Agree completely. There are just a few cases where "a function for deinitialization" cannot be provided due to V's strict scoping. Initial post did not reflect discussion; updated.

@leighmcculloch
Contributor

@Ben-Fields Could you share a code example of where a function for deinitialization couldn't be used?

@vlang vlang locked and limited conversation to collaborators Sep 22, 2021

7 participants