Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Option<Entity> to leverage niche optimization #3029

Conversation

notverymoe
Copy link
Contributor

@notverymoe notverymoe commented Oct 26, 2021

Objective

Solution

  • Entity::generation was changed to NonZeroU32, over Entity::id because:
    • It reduces the amount of API breakage, being effectively an implementation detail
    • It's, probably, less error-prone than constantly offsetting Entity::id
  • Entities::meta changed to Option<NonZeroU32> to match
    • None indicates a slot that has used all generation values and can no longer be reused.
  • Entity::from_bits was modified to return a Option<Entity>
  • Tests updated (mostly just Entity::generation init with NonZeroU32::new)
  • Added tests for Entity::from_bits when the bits are invalid; Option<Entity>niche optimization; Entities::meta generation increment; Entities::meta generation overflow behaviour;

Edit: PS first PR here, be harsh but gentle 😅

Edit2: Updated to reflect latest commit

@github-actions github-actions bot added the S-Needs-Triage This issue needs to be labelled label Oct 26, 2021
Entity { id, generation: 0 }
Entity {
id,
generation: unsafe{ NonZeroU32::new_unchecked(1) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe

Suggested change
generation: unsafe{ NonZeroU32::new_unchecked(1) }
generation: NonZeroU32::new(1).unwrap(),

as this is pretty much guaranteed to be optimized away anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely should be optimized away like that, but I'm not sure about with debug/non-optimized builds - though I don't imagine it being a bottleneck.

The other option is to introduce a constant in the file, ie. GENERATION_INITIAL, which we almost have anyway via the EntityMeta::EMPTY constant since unsafe is required in that context anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checked, compiler seems happy to optimize that even at opt-level=1 but won't at 0:
https://godbolt.org/z/E8EY5Mj8s

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other option is to introduce a constant in the file, ie. GENERATION_INITIAL, which we almost have anyway via the EntityMeta::EMPTY constant since unsafe is required in that context anyway.

A constant should work. Something like const GENERATION_ONE: NonZeroU32 = if let Some(gen) = NonZeroU32::new(1) { gen } else { panic!() }; should work in safe code and optimize even with opt-level=0. It will require rustc 1.57 for the panic!() though, but replacing panic!() with [][1] works even with the current rustc 1.56.

meta.generation.get()
.checked_add(1)
.unwrap_or(1))
};
Copy link
Contributor

@bjorn3 bjorn3 Oct 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the safe version optimize to the same asm as this code here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, I'll confirm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't appear to optimize to the same code unfortunately, though I might've done it wrong:
https://godbolt.org/z/4r6soaTbj

Copy link
Contributor Author

@notverymoe notverymoe Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible solution:

std::num::NonZeroU32::new(match num.get() {
        u32::MAX => 1,
        v        => v+1
    }).unwrap();
  • [Opt-0] Removes overhead of calls to checked_add and unwrap_or, still has the overhead of NonZeroU32::new.
  • [Opt-1] Shorter asm than unsafe option
  • [Opt-2][Opt-3] Identical asm between it and the unsafe version
  • Can mark the slot dead without effecting the asm much, as per: Allow Option<Entity> to leverage niche optimization #3029 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the dead slot, we don't actually need to unwrap where. But we do need to unwrap when we allocate from the freelist. I'm not sure if that's particularly performant - but it's safe!

Copy link
Contributor

@bjorn3 bjorn3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good approach, but I would prefer less usage of unsafe.

@NiklasEi NiklasEi added A-ECS Entities, components, systems, and events C-Enhancement A new feature C-Performance A change motivated by improving speed, memory usage or compile times S-Needs-Review and removed S-Needs-Triage This issue needs to be labelled labels Oct 26, 2021
Copy link
Member

@DJMcNab DJMcNab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally a reasonable implementation, but we should not use any of this unsafe.
It might actually be viable to make bevy_ecs::entity #![forbid_unsafe_code].
That would require understanding why flush is unsafe, which is unclear to me, so probably isn't a task for this PR.


meta.generation = unsafe{ NonZeroU32::new_unchecked(
meta.generation.get()
.checked_add(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if this addition fails, we should instead choose not to add the entity to the pending Vec.
This effectively marks the slot as dead, but avoids re-using generations.
This should probably give a warning though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only issue I can see is that contains, and any similar checks, will see the u32::MAX generation and match it with the last EntityID - which is still a problem with wrapping anyway, just a more likely problem. We can avoid this by losing 1 generation at the top too though, so that's fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah, well spotted. I guess EntityMeta could contain an Option<NonZeroU32> with None for this case?

Copy link
Contributor Author

@notverymoe notverymoe Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be the dead generation way, avoids unwraps on reallocating free'd but not dead entities:

        meta.generation = NonZeroU32::new(meta.generation.get().saturating_add(1)).unwrap();

        if meta.generation != GENERATION_DEAD {
            self.pending.push(entity.id);

            let new_free_cursor = self.pending.len() as i64;
            *self.free_cursor.get_mut() = new_free_cursor;
            self.len -= 1;
        }

        Some(mem::replace(&mut meta.location, EntityMeta::EMPTY.location))

Optimized, this is the same cost as just a saturating_add, so it correctly removes the NonZeroU32::new/unwrap.

I'll give the Option option a go now

Copy link
Contributor Author

@notverymoe notverymoe Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, dinner! Option method was a little more complex, but it's the more correct solution for sure. Thought there were a few issues revealed by using it, but, it turns out that in all those cases we were accessing meta via indices from pending - so they're all safe to unwrap in this case. But, in the previous case most of them had slipped past my scan through.

Only issue I can foresee is that I don't think the optimizer can handle this case, since it won't know that pending can only contain indexes to options with a value, so all those additional unwraps probably won't be optimized out. I don't think that switching to a raw u32 would "fix" that here either, and I'm not even really sure this would be a bottleneck anyway. Beats a panic or bug though, that's for sure.

@@ -76,17 +80,17 @@ impl Entity {
///
/// No particular structure is guaranteed for the returned bits.
pub fn to_bits(self) -> u64 {
u64::from(self.generation) << 32 | u64::from(self.id)
u64::from(self.generation.get()) << 32 | u64::from(self.id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
u64::from(self.generation.get()) << 32 | u64::from(self.id)
u64::from(self.generation()) << 32 | u64::from(self.id)

Copy link
Member

@DJMcNab DJMcNab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got a few things I'd like to see addressed, but otherwise I'm happy with this

@@ -255,7 +266,7 @@ impl Entities {
// Allocate from the freelist.
let id = self.pending[(n - 1) as usize];
Entity {
generation: self.meta[id as usize].generation,
generation: self.meta[id as usize].generation.unwrap(), // Safe, meta from pending list, so generation is valid
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of the term safe here is peculiar - an unwrap failing/triggering is never UB.

I'd probably change this to use expect

@@ -352,14 +366,14 @@ impl Entities {
let current_meta = &mut self.meta[entity.id as usize];
if current_meta.location.archetype_id == ArchetypeId::INVALID {
AllocAtWithoutReplacement::DidNotExist
} else if current_meta.generation == entity.generation {
} else if current_meta.generation == Some(entity.generation) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


let loc = mem::replace(&mut meta.location, EntityMeta::EMPTY.location);
meta.generation = NonZeroU32::new(meta.generation.unwrap().get().wrapping_add(1));

This comment was marked as resolved.

@@ -397,11 +418,11 @@ impl Entities {
}

/// Returns true if the [`Entities`] contains [`entity`](Entity).
// This will return false for entities which have been freed, even if
// not reallocated since the generation is incremented in `free`
/// This will return false for entities which have been freed, even if
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose to make these not doc comments, since this is explaining to the reader of the code 'how this does what the doc says'.

That is, this is describing an implementation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad!

Comment on lines 439 to 440
if meta.generation.is_none()
|| meta.generation.unwrap() != entity.generation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if meta.generation.is_none()
|| meta.generation.unwrap() != entity.generation
if meta.generation != Some(entity.generation)

or

Suggested change
if meta.generation.is_none()
|| meta.generation.unwrap() != entity.generation
if meta.generation? != entity.generation

would both also work

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, must have been a little tired when I went through this, did it "correct" already elsewhere

@@ -435,14 +457,17 @@ impl Entities {
pub fn resolve_from_id(&self, id: u32) -> Option<Entity> {
let idu = id as usize;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really a fan of the idu name here, however I don't have a good suggestion for anything different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All that comes to mind is:

  • idx / index
  • slot_id / slot_idx / slot_index

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the last push, this is the only comment I don't think I've addressed

@@ -518,13 +545,13 @@ impl Entities {

#[derive(Copy, Clone, Debug)]
pub struct EntityMeta {
pub generation: u32,
pub generation: Option<NonZeroU32>,

This comment was marked as resolved.

NotVeryMoe added 2 commits October 30, 2021 16:42
Change generation.unwrap to generation.expect, add documentation for None generation's in entity meta, add documentation for wrapping_add of generations, revert doc comment change, fix generation comparison
@@ -542,17 +578,67 @@ pub struct EntityLocation {
pub index: usize,
}

// Constant for the initial generation value, removes the need to unwrap in opt-0 code but doesn't
// require unsafe. When rust 1.57 drops, we can exchange [][1] for panic!().
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.57 has landed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, I'll get that in after work

@maniwani
Copy link
Contributor

maniwani commented Dec 16, 2021

  • Entities::meta changed to Option<NonZeroU32> to match
    - None indicates a slot that has used all generation values and can no longer be reused.

Is permanently retiring an id actually behavior that people want? It seems more like it's a consequence of wrapping the generation and not wanting a branch in its increment logic.

  • Entity::generation was changed to NonZeroU32, over Entity::id because:
    - It reduces the amount of API breakage, being effectively an implementation detail
    - It's, probably, less error-prone than constantly offsetting Entity::id

I wouldn't expect people to spawn 4 billion entities 4 billion times each (they might tho), but aside from being less code to change, this part feels arbitrary. I think this PR should argue why retirement is a good idea for everyone, since it would essentially put a hardcoded time-limit on a World, which is strange when things like change ticks wraparound.

(Edit: Also, in the future, it's plausible that Bevy would reduce the number of generation bits to add something else.)

(Edit 2: The current implementation panics if a generation maxes out, so I guess this PR is a net improvement and good as-is, but why not let it wraparound with a warning that stale cached Entity values may have become valid again?)

@notverymoe
Copy link
Contributor Author

notverymoe commented Dec 19, 2021

Oops, just realized that I didn't post my reply, just need to rewrite it - one sec. Sorry if this is a little rambly/overly-specific, bit late here and I like running silly numbers (hopefully I didn't fumble the math at least)

Is permanently retiring an id actually behavior that people want? It seems more like it's a consequence of wrapping the generation and not wanting a branch in its increment logic.

The reason for not wrapping generation IDs is to prevent the entire class of errors caused by aliasing, which range from subtle to significant bugs and are difficult to test for, design around and reproduce. By retiring we trade that class of errors for a single well-defined error case that both already exists (spawn more entities than assignable IDs) and is significantly less likely to occur (both than aliasing and running out of entity IDs alone). This has the additional advantage of potentially allowing us to expose a handler to the developer, if that is something we're interested in (ie. save critical data, shutdown gracefully, perform some operation to free entity IDs).

I wouldn't expect people to spawn 4 billion entities 4 billion times each (they might tho), but aside from being less code to change, this part feels arbitrary. I think this PR should argue why retirement is a good idea for everyone, since it would essentially put a hardcoded time-limit on a World, which is strange when things like change ticks wraparound.

If there is actual demand for destroying and re-creating that many entities over the lifetime of the program, I'd suggest that the first step would be to re-evaluate the approach being taken. Other than standard suggestions like merging entities into groups (ie. a particle system instead of particle entities), I would suggest implementing a pool system. This pool system will better-fit the user's advanced use case as they can then attach (or not attach) any amount of metadata to identify aliasing (ie. a u128 generation), and force the developer to identify which systems care about aliasing. One other solution is sub-apps/sub-worlds, and recreating them if you exhaust IDs, which is a little more transparent and only requires the user to define and facilitate cross-world communication.

I also believe that the chance of encountering this error is so low that any such designs that could, would already need to be designing around limits such as the entity ID address space. Assuming a 4GHz processor incrementing the generation of each entity in-turn, once per tick and performing no other logic/work, it would take 140 years to do. To bring that down to something like a decade, you would need to permanently occupy all but 300k entities (ie. permanently take-up 4 billion, or 93.75% of the address space), at which point you would be much more likely to run out of entity IDs than generations and likely have thus design around it (and it would be a significant burden on system resources, something like 31.8GiB just in Entity ID/Gens alone) - meaning that it wouldn't be unreasonable to also require one to design around a generation limit at that extreme.

Also, in the future, it's plausible that Bevy would reduce the number of generation bits to add something else.

Reducing the number of generation bits could definitely be of concern. Reducing it by even 8 bits in this engineered scenario would only leave us with a little over a half year of uptime. The engineered case is obviously not very useful for doing any work however, and if you're spending 95% of your ticks on actual work, that half year stretches out to just over a decade).

Though as you point out the current implementation panics immediately as soon as generations roll over (though only in debug), and that would occur in just barely over a second (1.07). So on the whole this should be an improvement. Maybe it's something that should be re-visted later when reducing the number of generation bits moves forward? Or is it something worth talking about now? Since this PR effectively allows developers the freedom to chew up generations at a higher rate, and switching back later to be more strict again would break that code and switching to allow wrapping would also break such code

@maniwani
Copy link
Contributor

maniwani commented Dec 19, 2021

The reason for not wrapping generation IDs is to prevent the entire class of errors caused by aliasing, which range from subtle to significant bugs and are difficult to test for, design around and reproduce.

I also believe that the chance of encountering this error is so low that any such designs that could, would already need to be designing around limits such as the entity ID address space.

You acknowledge that developers who could even encounter this would likely be designing around the limits of the address space already, so it seems a bit overkill to enforce it. If we allow the generation to wraparound, we could just emit a warning when the it resets to 1, e.g. "Dangling references to entity ID {id} may have become valid again."

[A] pool system will better-fit the user's advanced use case as they can then attach (or not attach) any amount of metadata to identify aliasing (ie. a u128 generation), and force the developer to identify which systems care about aliasing.

One other solution is sub-apps/sub-worlds, and recreating them if you exhaust IDs, which is a little more transparent and only requires the user to define and facilitate cross-world communication.

The solutions suggested seem more troublesome than just warning the user about possible aliasing, since they expect the user to have a clear appreciation of aliasing anyway or lean on future work to overcome the inconvenience.

Sub-worlds—or any extension to bevy_ecs that would let users reserve a range of entities—are what I'm mostly concerned about, since they're more likely to be impacted by retirement. Recreating an exhausted World does not sound like a cheap operation, nor does Bevy have a method (or idiom) for doing that.

I think retirement is better than the current debug panic / silent release behavior. (Edit: so yeah, I think we can revisit this after this PR) It's just, given the rarity of apps that approach the limits of a 64-bit ID and our shared assumption that their developers would most likely be careful to avoid dangling entity references anyway, ID retirement seems more like a weirdly targeted obstacle than a general user safeguard.

Other ECS additions we're expecting may also reduce the likelihood of encountering entity aliasing bugs. For example, automatic cleanup of dangling references following despawn is one of the highlights of relations whenever they're brought up in discussion.

@notverymoe
Copy link
Contributor Author

notverymoe commented Dec 20, 2021

You acknowledge that developers who could even encounter this would likely be designing around the limits of the address space already, so it seems a bit overkill to enforce it. If we allow the generation to wraparound, we could just emit a warning when the it resets to 1, e.g. "Dangling references to entity ID {id} may have become valid again."

No, the error I'm talking about in those quotes is exhausting the address space without generation rollover. A bad alias from rollovers is either at least as unlikely as world retirement or orders of magnitude more likely to occur - particularly if we were to reduce the number of bits. Enough that I believe it should be avoided by default. Designing for and debugging aliasing would be a non-insignificant burden on developers since it's such a hard issue to reproduce, and I don't feel like the warning helps in this case because it could have a bad alias on generation 33 and that rollover warning 4+ hours ago in the logs.

The solutions suggested seem more troublesome than just warning the user about possible aliasing, since they expect the user to have a clear appreciation of aliasing anyway or lean on future work to overcome the inconvenience.

I expect them to have a clear appreciation of aliasing in that case, because to encounter that case they would already have had to have designed around other limitations of the entity address space. I do think some kind of graceful shutdown in the case of a panic could be nice though - but that's another design talk for a different part of the engine really.

Sub-worlds—or any extension to bevy_ecs that would let users reserve a range of entities—are what I'm mostly concerned about, since they're more likely to be impacted by retirement.

I definitely agree regarding reserved ranges, that's definitely a huge concern.

I'm just not convinced that making generations rollover by default make sense when you're already manipulating the address space. In that case we could make it opt-in since they're already making active choices regarding the address space, for example we could a mechanism flag a range or subworld as having rollover, and still provide aliasing safety by default.

Recreating an exhausted World does not sound like a cheap operation, nor does Bevy have a method (or idiom) for doing that.

Recreating an exhausted world is definitely an expensive operation. However to need to do something like that you would, with the current amount of bits, need to be doing some kind of multi-decade data science number crunching and providing efficient patterns of such fringe use cases of a game engine seems out of scope to me.

Other ECS additions we're expecting may also reduce the likelihood of encountering entity aliasing bugs. For example, automatic cleanup of dangling references following despawn is one of the highlights of relations whenever they're brought up in discussion.

That's definitely a step towards removing aliasing issues. I do have concerns about that regarding efficient access patterns and storage overhead, primarily in the case of grid or tree structures. But I expect after relations drop there will be plenty of efficient implementations of such structures.

I think retirement is better than the current debug panic / silent release behaviour.

Yeah, that definitely makes me uncomfortable, even though it makes sense it still gives me anxiety just thinking about it. I do wonder if going with panic in both debug/release for now might be the better option, since its more restrictive it might be an easier transition when it later opens it up in other specific ways.

our shared assumption that their developers would most likely be careful to avoid dangling entity references anyway, ID retirement seems more like a weirdly targeted obstacle than a general user safeguard.

The main reason I have concerns about it is because the likelihood of it occurring ranges all the way from as unlikely as retiring a world, to almost certain to occur. Conceivably it can be encountered by developers and players, without any strange access patterns, creating a very difficult situation to debug. Where-as the access patterns required to exhaust a world via retirement is a very fringe use-case, and would require the user to already be making informed design decisions around the entity address space.

However, I can't actually quantify the risk of the average user running into aliasing issues. With 32-bit generations it's either as or more significant than exhausting the address space but probably insignificant for most use cases, but dropping to even 24-bits significantly increases the risk of it occurring and is approaching being almost certain at something extreme like 16-bits. In those cases we would be trading a single entity id for effectively an additional 32/24/16-bits worth of unaliased generations, which is why I generally prefer to make the trade.

That all said, it's just the trade I like to make in my hobby ECSs. I trust that the people working on bevy are much more versed in the design space than I am, as well as on the future direction of Bevy's ECS. So, I'm not against rolling rollover it into this PR if its the general consensus - or we could be more conservative and enforce panic at both debug and runtime - or as you say continue with the PR as-is.

PS. Hope I'm not coming off as rude, definitely none of this is meant with any kind of bad tone.

Copy link
Member

@DJMcNab DJMcNab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I for one am in favour of the retirement solution

If someone actually runs into issues related to this, then we can add a setting which enables rollover, and point to it in the warning.

I'd be extremely surprised if someone ever does though, so I don't think we need it in this PR.

const GENERATION_ONE: NonZeroU32 = if let Some(gen) = NonZeroU32::new(1) {
gen
} else {
panic!("Failed to unwrap GENERATION_ONE constant")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
panic!("Failed to unwrap GENERATION_ONE constant")
unreachable!("1≠0")

Copy link
Contributor Author

@notverymoe notverymoe Dec 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately unreachable! with a message isn't const even though panic! is 0.o Without a message however it's fine, should I move it to that? It does just compile down to panic! with a preset message in that case though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a raw unreachable would be fine, otherwise some panic message like unreachable: 1≠0 would also make sense.

Copy link
Member

@alice-i-cecile alice-i-cecile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also prefer re-use, but it sounds like that's out of scope for this PR. Code and docs both look good to me, and I strongly approve of the general idea.

@notverymoe
Copy link
Contributor Author

Just updated this, the merge was a bit ugly to try and resolve so i just recreated the PR line by line.

The hierarchy system required some reworking since it made use of placeholders, I'm not sure my solution is ideal.

@notverymoe
Copy link
Contributor Author

I'm not sure what to do about the failed github check, it seems to have failed to clone the repo

@alice-i-cecile
Copy link
Member

Yep I'm rerunning CI for you :) In the future, just ping me or another maintainer and we can do that.

Copy link
Member

@james7132 james7132 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see some fresh benchmarks, because this adds branches in numerous key entity-related functions.

The removal of PLACEHOLDER and the introduction of Parent::try_get is also generally bad UX IMO.

/// }
/// }
/// ```
pub const PLACEHOLDER: Self = Self::from_raw(u32::MAX);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to retain this with a u32::MAX generation and index.

generation: (bits >> 32) as u32,
index: bits as u32,
pub const fn from_bits(bits: u64) -> Option<Self> {
if let Some(generation) = NonZeroU32::new((bits >> 32) as u32) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could use Option::map.

@james7132 james7132 added S-Needs-Benchmarking This set of changes needs performance benchmarking to double-check that they help and removed S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it labels Mar 13, 2023
@alice-i-cecile alice-i-cecile added this to the 0.11 milestone Mar 13, 2023
@notverymoe
Copy link
Contributor Author

I'd like to see some fresh benchmarks, because this adds branches in numerous key entity-related functions.

The removal of PLACEHOLDER and the introduction of Parent::try_get is also generally bad UX IMO.

Definitely agree on both those last points, I wasn't entirely certain what the preferred solution there was. I hope to get some fixes for this up by the weekend

@james7132
Copy link
Member

I want to also propose an alternative implementation here that might avoid the branches in Entities: use NonZeroU32 for the index not the generation. We can resolve this by padding the Entities internal Vec with an unused EntityMeta, which would require fewer branches and unwraps, and would not incur any cost when indexing into Entities. We lose our on entity zero, but we literally have 2^32 - 2 others, which is probably a worthwhile tradeoff.

@notverymoe
Copy link
Contributor Author

notverymoe commented Mar 15, 2023

For the purposes of the issue being address, niche-optimization, absolutely and I'd be keen to go forward with it since it'll likely be more performant than previous tests for the generation indicate - I hadn't thought about just blacklisting an entity slot to avoid offsetting, very smart. However, we do lose out on having consistent behavior for generation overflows - currently in development exhausting a generation will cause a panic, whilst in production it'll silently overflow.

@notverymoe
Copy link
Contributor Author

I mean, generation overflow behavior might be better addressed with a proposal since subworlds complicate this quite a bit.

@notverymoe
Copy link
Contributor Author

notverymoe commented Jun 15, 2023

Hey there, I haven't forgot about this, I just kept remembering when I was busy and forgot again by the time I was free 😅 Haven't got those fixes up for the branch, but I wrote up the reserved-id branch a little while ago, and missed posting about it:
https://github.com/nvm-contrib/bevy/tree/niche-optimization-for-entity-via-id

I tried running the benches at the time, but I got similar inconsistent results like mentioned earlier in this issue. If I get a chance I'll give the benches another go soon.

@alice-i-cecile
Copy link
Member

Bumping to 0.12 milestone: feel free to remake this PR if you think that will be easier than fixing merge conflicts!

@alice-i-cecile alice-i-cecile modified the milestones: 0.11, 0.12 Jun 19, 2023
Comment on lines 716 to +717
/// The current generation of the [`Entity`].
pub generation: u32,
pub generation: Option<NonZeroU32>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the increase in code clarity here -- this makes it harder to forget to handle the case where an entity is out of room for more generations.

Copy link
Contributor Author

@notverymoe notverymoe Jul 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly this became a big selling point to be with the generation approach, since currently debug will panic and release will just silently overflow and to rectify that discrepancy we're (probably) going to need a slowdown in that area regardless... or just assume nobody will ever exhaust generations or hold a stale reference (which honestly might not be unreasonable)

@alice-i-cecile alice-i-cecile added the S-Adopt-Me The original PR author has no intent to complete this work. Pick me up! label Sep 12, 2023
@alice-i-cecile alice-i-cecile removed this from the 0.12 milestone Sep 12, 2023
@notverymoe
Copy link
Contributor Author

notverymoe commented Sep 22, 2023

Starting a conversation on discord, because honestly I need to know what direction I should be pushing this.

Link here: https://discord.com/channels/691052431525675048/1154573759752183808/1154573764240093224

@notverymoe
Copy link
Contributor Author

notverymoe commented Sep 22, 2023

First pass attempt v2 at Entity::index nicheing. Still some quirks and questions:
https://github.com/nvm-contrib/bevy/pull/1/files

If we decide in the discord discussion that index is the go-forward, I'll close this PR and open one for the index version on here

I'll have an update for the Generation version of this change sometime this weekend. Some notes/changes I'll be looking to add into here, after having talked with Joy/Joy2 on discord:

  • Generation -> NonZeroU32
  • We can make flags and relationships work with it
  • We can mitigate performance regressions for iteration/get by removing Option from the entity meta, at the cost of a slightly more expensive free operation.
  • Wrap generation for now, and table for discussion, since it'll be less problematic to swap from wrapping to not-wrapping than the other way around.
  • There are plenty of generations to go around
  • Already planned work will address theoretical subworld id sync concerns

Give the scope of the changes, and the current conflicts. It might be better to open a new PR. Unsure.

@notverymoe
Copy link
Contributor Author

This PR has been superseded:
#9907

@notverymoe notverymoe closed this Sep 23, 2023
github-merge-queue bot pushed a commit that referenced this pull request Jan 8, 2024
…on (#9907)

# Objective

- Implements change described in
#3022
- Goal is to allow Entity to benefit from niche optimization, especially
in the case of Option<Entity> to reduce memory overhead with structures
with empty slots

## Discussion
- First PR attempt: #3029
- Discord:
https://discord.com/channels/691052431525675048/1154573759752183808/1154573764240093224

## Solution

- Change `Entity::generation` from u32 to NonZeroU32 to allow for niche
optimization.
- The reason for changing generation rather than index is so that the
costs are only encountered on Entity free, instead of on Entity alloc
- There was some concern with generations being used, due to there being
some desire to introduce flags. This was more to do with the original
retirement approach, however, in reality even if generations were
reduced to 24-bits, we would still have 16 million generations available
before wrapping and current ideas indicate that we would be using closer
to 4-bits for flags.
- Additionally, another concern was the representation of relationships
where NonZeroU32 prevents us using the full address space, talking with
Joy it seems unlikely to be an issue. The majority of the time these
entity references will be low-index entries (ie. `ChildOf`, `Owes`),
these will be able to be fast lookups, and the remainder of the range
can use slower lookups to map to the address space.
- It has the additional benefit of being less visible to most users,
since generation is only ever really set through `from_bits` type
methods.
- `EntityMeta` was changed to match
- On free, generation now explicitly wraps:
- Originally, generation would panic in debug mode and wrap in release
mode due to using regular ops.
- The first attempt at this PR changed the behavior to "retire" slots
and remove them from use when generations overflowed. This change was
controversial, and likely needs a proper RFC/discussion.
- Wrapping matches current release behaviour, and should therefore be
less controversial.
- Wrapping also more easily migrates to the retirement approach, as
users likely to exhaust the exorbitant supply of generations will code
defensively against aliasing and that defensive code is less likely to
break than code assuming that generations don't wrap.
- We use some unsafe code here when wrapping generations, to avoid
branch on NonZeroU32 construction. It's guaranteed safe due to how we
perform wrapping and it results in significantly smaller ASM code.
    - https://godbolt.org/z/6b6hj8PrM 

## Migration

- Previous `bevy_scene` serializations have a high likelihood of being
broken, as they contain 0th generation entities.

## Current Issues
 
- `Entities::reserve_generations` and `EntityMapper` wrap now, even in
debug - although they technically did in release mode already so this
probably isn't a huge issue. It just depends if we need to change
anything here?

---------

Co-authored-by: Natalie Baker <natalie.baker@advancednavigation.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Enhancement A new feature C-Performance A change motivated by improving speed, memory usage or compile times S-Adopt-Me The original PR author has no intent to complete this work. Pick me up! S-Needs-Benchmarking This set of changes needs performance benchmarking to double-check that they help X-Controversial There is active debate or serious implications around merging this PR
Projects
Status: Responded
Development

Successfully merging this pull request may close these issues.

9 participants