Reduce size of Typed::type_info() #24171

Open
greeble-dev wants to merge 7 commits into bevyengine:main from greeble-dev:minimize-type-info

@greeble-dev greeble-dev commented May 7, 2026

Objective

Reduce binary sizes by changing the generated code for Typed::type_info(). Also makes some very minor improvements to runtime memory and compile times.

Background

Say I have this reflected struct:

#[derive(Reflect)]
struct Example {
    number: u32,
    length: f32,
}

The generated code for Example::type_info is:

impl bevy_reflect::Typed for Example {
    fn type_info() -> &'static bevy_reflect::TypeInfo {
        static CELL: bevy_reflect::utility::NonGenericTypeInfoCell =
            bevy_reflect::utility::NonGenericTypeInfoCell::new();
        CELL.get_or_set(|| {
            bevy_reflect::TypeInfo::Struct(bevy_reflect::structs::StructInfo::new::<Self>(
                &[
                    bevy_reflect::NamedField::new::<u32>("number"),
                    bevy_reflect::NamedField::new::<f32>("length"),
                ],
            ))
        })
    }
}

So on the first call it creates a bunch of data structures that represent the type (StructInfo, NamedField etc), and later calls just return a reference to that data.
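The initialize-once pattern can be sketched with the standard library alone. This is a minimal illustration, assuming `std::sync::OnceLock` as a stand-in for `NonGenericTypeInfoCell`; the `StructInfo` and `NamedField` types below are simplified, hypothetical versions and not the real bevy_reflect ones.

```rust
use std::sync::OnceLock;

// Simplified stand-ins for the bevy_reflect types (hypothetical, illustration only).
#[derive(Debug)]
pub struct NamedField {
    pub name: &'static str,
    pub type_name: &'static str,
}

#[derive(Debug)]
pub struct StructInfo {
    pub fields: Box<[NamedField]>,
}

pub struct Example {
    pub number: u32,
    pub length: f32,
}

impl Example {
    pub fn type_info() -> &'static StructInfo {
        // One cell per type; OnceLock plays the role of NonGenericTypeInfoCell here.
        static CELL: OnceLock<StructInfo> = OnceLock::new();
        // The closure runs only on the first call and allocates the field list.
        // Every later call just returns the cached reference.
        CELL.get_or_init(|| StructInfo {
            fields: vec![
                NamedField { name: "number", type_name: std::any::type_name::<u32>() },
                NamedField { name: "length", type_name: std::any::type_name::<f32>() },
            ]
            .into_boxed_slice(),
        })
    }
}
```

Calling `Example::type_info()` twice returns the same address, which is why all of the construction code inside the closure is only ever needed once per type.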

The assembly code for type_info can be surprisingly large - Example::type_info is 2KB (x86, release). This might not sound like much, but it adds up when there's over a thousand reflected types, and some of them are much more complex - the assembly for bevy_ui::Node::type_info is 26.5KB. In total there's several megabytes depending on build settings (it's hard to be completely accurate due to inlining and the way some things are hidden behind closures).

Solution

This PR reduces the size of type_info. In a release build, Example::type_info goes from 2KB to 126 bytes, and bevy_ui::Node::type_info from 26.5KB to 10.1KB. The binary goes from 79.03MB to 74.38MB (-4.64MB, -5.9%).

There's a few different tweaks:

1. Avoid allocations

Several structs contain an Arc<CustomAttributes>. But most types don't have any custom attributes. Switching to Option<Arc<CustomAttributes>> avoids allocating an Arc for the empty case.

 pub fn new(name: &'static str) -> Self {
     Self {
         name,
-        custom_attributes: Arc::new(CustomAttributes::default()),
+        custom_attributes: None,
         #[cfg(feature = "reflect_documentation")]
         docs: None,
     }

Similarly, Generics is a Box<[GenericInfo]> that's usually empty.

-pub struct Generics(Box<[GenericInfo]>);
+pub struct Generics(Option<Box<[GenericInfo]>>);

These two changes should also give a small runtime performance and memory win.
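As a rough sketch of the `Option<Arc<...>>` idea, assuming simplified, hypothetical `CustomAttributes` and `FieldInfo` types rather than the real bevy_reflect definitions: the empty case stores `None`, and accessors treat `None` as "no attributes".

```rust
use std::sync::Arc;

// Hypothetical stand-in for bevy_reflect's CustomAttributes.
#[derive(Debug, Default, PartialEq)]
pub struct CustomAttributes {
    pub entries: Vec<(&'static str, i64)>,
}

pub struct FieldInfo {
    pub name: &'static str,
    // `None` means "no custom attributes", so the common case allocates no Arc.
    custom_attributes: Option<Arc<CustomAttributes>>,
}

impl FieldInfo {
    pub fn new(name: &'static str) -> Self {
        Self { name, custom_attributes: None }
    }

    pub fn with_custom_attributes(mut self, attrs: CustomAttributes) -> Self {
        // Only pay for the Arc when there is something to store.
        self.custom_attributes = Some(Arc::new(attrs));
        self
    }

    pub fn has_custom_attributes(&self) -> bool {
        self.custom_attributes.is_some()
    }

    pub fn attribute_count(&self) -> usize {
        // as_deref turns Option<Arc<T>> into Option<&T>; None counts as empty.
        self.custom_attributes
            .as_deref()
            .map_or(0, |attrs| attrs.entries.len())
    }
}
```

On current compilers `Option<Arc<T>>` is the same size as `Arc<T>`, so the field itself should not grow; the saving comes from skipping the heap allocation and reference-count setup in every constructor.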

2. Reduce generic code

StructInfo::new and others look roughly like this:

pub fn new<T: Reflect + TypePath>(fields: &[NamedField]) -> Self {
    ...build internal boxes and HashMaps...
    
    Self {
        ty: Type::of::<T>(),
        fields: fields.to_vec().into_boxed_slice(),
        ...
    }
}

Being generic means that the whole function gets duplicated for each unique T, and often gets inlined too. But T is only used once by Type::of::<T>().

This PR splits the function into a small generic stub that calls a larger non-generic function marked #[inline(never)].

pub fn new<T: Reflect + TypePath>(fields: &[NamedField]) -> Self {
    // Only this thin stub is monomorphized per T.
    Self::from_erased(Type::of::<T>(), fields)
}

// Non-generic: compiled once and shared by every T.
#[inline(never)]
pub fn from_erased(ty: Type, fields: &[NamedField]) -> Self {
    ...
}
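A self-contained sketch of the same split, assuming `std::any::TypeId` and `std::any::type_name` as stand-ins for `Type::of::<T>()`, and hypothetical, simplified `StructInfo` and `NamedField` types:

```rust
use std::any::{type_name, TypeId};

// Hypothetical, simplified field descriptor.
#[derive(Clone, Debug, PartialEq)]
pub struct NamedField {
    pub name: &'static str,
}

pub struct StructInfo {
    pub type_id: TypeId,
    pub type_name: &'static str,
    pub fields: Box<[NamedField]>,
}

impl StructInfo {
    // Generic stub: the only per-T work is looking up the type's identity.
    pub fn new<T: 'static>(fields: &[NamedField]) -> Self {
        Self::from_erased(TypeId::of::<T>(), type_name::<T>(), fields)
    }

    // Non-generic body: the copying and allocation live here, so the compiler
    // emits this code once instead of once per T.
    #[inline(never)]
    pub fn from_erased(
        type_id: TypeId,
        type_name: &'static str,
        fields: &[NamedField],
    ) -> Self {
        Self {
            type_id,
            type_name,
            fields: fields.to_vec().into_boxed_slice(),
        }
    }
}
```

`StructInfo::new::<u32>(..)` and `StructInfo::new::<String>(..)` each get their own tiny stub, but both call the same `from_erased` body.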

3. Reduce inlining

I've also made a few changes to inlining on type_info:

 impl bevy_reflect::Typed for Example {
-    #[inline] 
+    #[inline(never)] 
     fn type_info() -> &'static bevy_reflect::TypeInfo {
         static CELL: bevy_reflect::utility::NonGenericTypeInfoCell =
             bevy_reflect::utility::NonGenericTypeInfoCell::new();
-        CELL.get_or_set(|| {
+        CELL.get_or_set(#[inline(never)] || {
             ...

The first change might be contentious as it makes get_represented_type_info 20% slower in micro-benchmarks, although 20% slower here only means one cycle slower. I'd guess that's worth it to save ~363KB, and it might even be a performance increase in real code due to memory latency. But it's debatable, and I'd be fine with removing it.

The second change prevents inlining on the closure that can only be called once - this gives a ~330KB saving with no performance cost. Although I'm not quite sure why it's a saving. I suspect that non-LTO release builds are ending up with two copies - one regular closure and one inlined copy.
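A minimal sketch of keeping both the accessor and its one-time initializer out of line. The PR puts the attribute directly on the closure; this version moves the initializer into a named function instead, which expresses the same intent. It assumes `std::sync::OnceLock` in place of `NonGenericTypeInfoCell` and a hypothetical, simplified `TypeInfo`.

```rust
use std::sync::OnceLock;

// Hypothetical, simplified stand-in for bevy_reflect::TypeInfo.
#[derive(Debug)]
pub struct TypeInfo {
    pub type_name: &'static str,
    pub field_names: Vec<&'static str>,
}

// Cold path: the initializer is only needed on the first call, so forcing it
// out of line keeps its allocation code from being copied into callers.
#[inline(never)]
fn build_example_info() -> TypeInfo {
    TypeInfo {
        type_name: "Example",
        field_names: vec!["number", "length"],
    }
}

// Hot path: a small accessor. #[inline(never)] asks the compiler to keep a
// single copy rather than duplicating the cell check at every call site.
#[inline(never)]
pub fn example_type_info() -> &'static TypeInfo {
    static CELL: OnceLock<TypeInfo> = OnceLock::new();
    CELL.get_or_init(build_example_info)
}
```

Whether the extra call on the hot path is worth the size saving is the trade-off discussed above; the sketch only shows where the attributes go.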

Putting it all together

Here's the effect of each change on a release build of 3d_scene (Win10, x86).

| Change | Binary size | Difference |
|---|---|---|
| Original | 80,927KB | |
| Change Arc<CustomAttributes> to option | 78,185KB | -2,742KB |
| Change Generics to option | 77,959KB | -226KB |
| Change TypeInfo variants to reduce generics | 76,866KB | -1,093KB |
| Never inline type_info inner closure | 76,536KB | -330KB |
| Never inline type_info itself | 76,173KB | -363KB |

Compile times maybe went down a second or two (2m21s -> 2m19s), although I'm not sure how much to trust that.

And the before/after on a few different builds:

| Build | Before | After | Difference |
|---|---|---|---|
| x86 release | 79.0MB | 74.3MB | -4.7MB, -5.9% |
| x86 release optimized | 45.3MB | 43.9MB | -1.4MB, -3% |
| WASM release | 90.3MB | 88.7MB | -1.6MB, -1.7% |
| WASM release optimized | 61.5MB | 60.6MB | -0.9MB, -1.5% |

The optimized build has codegen-units = 1, lto = "fat", panic = "abort".

Can More Be Done?

Yes, I can imagine a bunch more ways to make type_info smaller. This PR only does the easy stuff. But all the options I could think of are more complex, and there will be diminishing returns.

The dream would be to make TypeInfo entirely const data. But that's a major overhaul and might mean requiring it to leak memory - that's fine for static reflection, but I don't know if it's also used where leaking would be problematic.

A less drastic option would involve changing type_info to not use StructInfo/etc directly - instead it would build up a bunch of smaller POD types that don't need much copying and never need to free memory. Then finally it would pass this simpler type to a non-inlined function that actually creates the real TypeInfo. Would that be worth it? It adds a lot of code, and there's a risk that the savings might be just a few hundred KB.
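To make the "smaller POD types" idea concrete, here is a rough sketch under stated assumptions. Every name in it (`FieldDesc`, `StructDesc`, `build_struct_info`) is hypothetical and not part of bevy_reflect; it only illustrates static descriptors being handed to one shared, non-inlined builder.

```rust
use std::collections::HashMap;

// Plain-old-data descriptors: only 'static references and Copy fields, so they
// can live in read-only data and never need to be freed. (Hypothetical names.)
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct FieldDesc {
    pub name: &'static str,
    pub type_name: &'static str,
}

#[derive(Clone, Copy)]
pub struct StructDesc {
    pub type_name: &'static str,
    pub fields: &'static [FieldDesc],
}

// Stand-in for the "real" heap-backed info type.
pub struct StructInfo {
    pub type_name: &'static str,
    pub fields: Box<[FieldDesc]>,
    pub field_indices: HashMap<&'static str, usize>,
}

// Shared, non-generic builder: all allocation and hashing code exists once,
// and each type_info would only pass a pointer to its static descriptor.
#[inline(never)]
pub fn build_struct_info(desc: &StructDesc) -> StructInfo {
    let field_indices = desc
        .fields
        .iter()
        .enumerate()
        .map(|(index, field)| (field.name, index))
        .collect();
    StructInfo {
        type_name: desc.type_name,
        fields: desc.fields.to_vec().into_boxed_slice(),
        field_indices,
    }
}

// What a derive macro might emit per type: data only, no construction code.
pub static EXAMPLE_DESC: StructDesc = StructDesc {
    type_name: "Example",
    fields: &[
        FieldDesc { name: "number", type_name: "u32" },
        FieldDesc { name: "length", type_name: "f32" },
    ],
};
```

In this shape the per-type cost is mostly the static descriptor itself, which is why the savings would depend on how much construction code the earlier changes have already removed.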

Testing

cargo test -p bevy_reflect
cargo test -p bevy_reflect_derive
cargo test -p bevy_asset
cargo run --example custom_attributes

# Benchmark type_info
cargo bench -p benches --bench reflect -- concrete_struct_type_info

…s>` to `Option<Arc<CustomAttributes>>`. This avoids needing to allocate them for the common case of being empty.
…icInfo]>>`. This avoids needing to allocate it for the common case of being empty.
…eneric parts of their `new`. This avoids monomorphizing calls to the allocator.
@greeble-dev greeble-dev added C-Performance A change motivated by improving speed, memory usage or compile times A-Reflection Runtime information about types S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels May 7, 2026
@github-project-automation github-project-automation Bot moved this to Needs SME Triage in Reflection May 7, 2026
@alice-i-cecile alice-i-cecile added D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes D-Macros Code that generates Rust code labels May 7, 2026
 /// Creates an empty set of generics.
 pub fn new() -> Self {
-    Self(Box::new([]))
+    Self(None)
Contributor

Note that a box doesn't allocate for zero sized allocations. https://godbolt.org/z/oG8vYj14M

Contributor Author

@greeble-dev greeble-dev May 8, 2026

You're right, I didn't know that. But making it an option does still make a difference to binary sizes. Problem is that I haven't got a firm explanation for why.

In x86 release the difference is -137KB, but that doesn't seem to come from type_info functions - there's a ton of small differences across the codebase. WASM release is similar (-57KB).

I tried codegen-units = 1 to make it more consistent. Now the binaries have a much smaller gap (-3KB), but it's clearly in type_info related things (and one Debug::fmt for some reason).

 Delta Bytes │ Item
─────────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
        +111 ┊ <&T as core::fmt::Debug>::fmt::h63ae76a0c6ee540b
         -51 ┊ bevy_animation::_::<impl bevy_reflect::type_info::Typed for bevy_animation::AnimationEventFn>::type_info::{{closure}}::h12b731417151ee00
         -51 ┊ bevy_asset::path::_::<impl bevy_reflect::type_info::Typed for bevy_asset::path::AssetPath>::type_info::{{closure}}::hd80b9ced7d70fa35
         -51 ┊ bevy_asset::render_asset::_::<impl bevy_reflect::type_info::Typed for bevy_asset::render_asset::RenderAssetUsages>::type_info::{{closure}}::hcb8bc4662600b515
         -51 ┊ bevy_camera::camera::_::<impl bevy_reflect::type_info::Typed for bevy_camera::camera::CameraMainTextureUsages>::type_info::{{closure}}::hb3a19a65e0b77a74
         -51 ┊ bevy_camera::camera::_::<impl bevy_reflect::type_info::Typed for bevy_camera::camera::Exposure>::type_info::{{closure}}::h3b5aff6c36ea2d1c
         -51 ┊ bevy_ecs::entity::_::<impl bevy_reflect::type_info::Typed for bevy_ecs::entity::Entity>::type_info::{{closure}}::h8ba36f50a2ead030
         -51 ┊ bevy_ecs::entity::_::<impl bevy_reflect::type_info::Typed for bevy_ecs::entity::EntityGeneration>::type_info::{{closure}}::he8a1fb98c4a32142
         -51 ┊ bevy_ecs::entity::_::<impl bevy_reflect::type_info::Typed for bevy_ecs::entity::EntityIndex>::type_info::{{closure}}::h2e90f066044f2ab0
         -51 ┊ bevy_image::image::_::<impl bevy_reflect::type_info::Typed for bevy_image::image::Image>::type_info::{{closure}}::h890c0d08362bae5d
         -51 ┊ bevy_render::camera::_::<impl bevy_reflect::type_info::Typed for bevy_render::camera::CameraRenderGraph>::type_info::{{closure}}::hbce80d5fd843e51c
         -51 ┊ bevy_render::storage::_::<impl bevy_reflect::type_info::Typed for bevy_render::storage::ShaderBuffer>::type_info::{{closure}}::h237ed12a74b20ba7
         -37 ┊ bevy_reflect::utility::GenericTypeCell<T>::get_or_insert_by_type_id::h0efa23d8c559cc3b
         -37 ┊ bevy_reflect::utility::GenericTypeCell<T>::get_or_insert_by_type_id::h33796cf1654ce1db
         -37 ┊ bevy_reflect::utility::GenericTypeCell<T>::get_or_insert_by_type_id::h41cae767814fb25c
         -37 ┊ bevy_reflect::utility::GenericTypeCell<T>::get_or_insert_by_type_id::hadd00d4cdabd4ae8
         -37 ┊ bevy_reflect::utility::GenericTypeCell<T>::get_or_insert_by_type_id::hc7bf014d11ab9b1a
         -37 ┊ bevy_reflect::utility::GenericTypeCell<T>::get_or_insert_by_type_id::hfb9e3e46da5b9169
         -30 ┊ bevy_reflect::utility::GenericTypeCell<T>::get_or_insert_by_type_id::h1f3bcfa6fdd52b19
         -30 ┊ bevy_reflect::utility::GenericTypeCell<T>::get_or_insert_by_type_id::h2e6cfdb8f5594620
       -1906 ┊ ... and 116 more.
       -2638 ┊ Σ [136 Total Rows]

So my low confidence guess is:

  • In release, something is getting duplicated or failing to optimize across code-gen units, so the problem is magnified.
  • In release with codegen-units = 1, the compiler is able to optimize out a small number of drop calls (not allocations) because it works out the option is always None in some cases.

I'm not sure whether to revert the change to Generics or leave it in. It does make a small difference to release mode, but also complicates the code.

Contributor

I'd leave it in, in some form. It helps communicate intent. Not every type has generics. Plus it's a bit less work for the optimizer. Maybe the simpler code is making the inliner a bit more eager.
