ECS Architecture

gba::ecs uses a static, flat-storage architecture tuned for ARM7TDMI constraints. The design goal is straightforward: make the common operations for a small fixed-capacity game world cheap enough that you can reason about them without a profiler open all day.

File layout and public interface

include/gba/ecs              -> public facade
  +- registry<Capacity, Components...>
  +- group<Components...>
  +- entity_id (handle with generation)
  +- pad<N> (padding utility)

include/gba/bits/ecs/        -> internal implementation
  +- entity.hpp
  +- group.hpp
  +- group_metadata.hpp
  +- registry.hpp

Why this ECS is static

Many desktop ECS libraries optimise for:

  • unlimited entity counts
  • runtime component registration
  • dynamic archetype churn
  • scheduler/tooling integration

gba::ecs optimises for something entirely different:

  • a known maximum entity count (fits in 8 bits; max 255 entities)
  • a small compile-time component set (max 31 components)
  • simple arrays that can live inline inside one registry object
  • predictable loops for handheld game logic

That is why the registry type specifies everything at compile time:

using world_type = gba::ecs::registry<128, position, velocity, health>;

The type itself answers the architectural questions: maximum 128 live entities, exactly three component pools.
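To make that concrete, here is a minimal sketch of how a type can carry those answers as compile-time constants. `toy_registry` and its two members are hypothetical stand-ins invented for this illustration, not the real gba::ecs::registry interface; the limits in the static_asserts are the ones quoted in the list above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for gba::ecs::registry. Only the compile-time
// parameters are modelled; storage and the real API are left out.
template <std::size_t Capacity, typename... Components>
struct toy_registry {
    static_assert(Capacity <= 255, "slot index must fit in 8 bits");
    static_assert(sizeof...(Components) <= 31, "one presence bit per component");

    static constexpr std::size_t capacity = Capacity;
    static constexpr std::size_t component_count = sizeof...(Components);
};

// Assumed component layouts, chosen to match the sizes used later on this page.
struct position { std::int32_t x, y; };
struct velocity { std::int32_t vx, vy; };
struct health   { std::int32_t hp; };

using world_type = toy_registry<128, position, velocity, health>;

// Both facts are known before any code runs.
static_assert(world_type::capacity == 128);
static_assert(world_type::component_count == 3);
```

Because both values are constants, anything sized from them (arrays, masks, loop bounds) can be fixed at compile time as well.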

Registry storage model

Every registry owns its storage inline – no heap allocation, no indirection.

registry<Capacity, Components...>
|
+- hot metadata (cached in Thumb mode)
|  +- m_component_count[N]      (1 byte/component)
|  +- m_free_top                (1 byte)
|  +- m_next_slot               (1 byte)
|  +- m_alive                   (1 byte)
|  +- m_dense_prefix            (1 byte)
|
+- per-slot tracking
|  +- m_mask[Capacity]          (4 bytes/slot)
|  +- m_gen[Capacity]           (1 byte/slot)
|  +- m_free_stack[Capacity]    (1 byte/slot)
|  +- m_alive_list[Capacity]    (1 byte/slot)
|  +- m_alive_index[Capacity]   (1 byte/slot)
|
+- component pools
   +- std::array<C1, Capacity>  (Capacity x sizeof(C1))
   +- std::array<C2, Capacity>  (Capacity x sizeof(C2))
   +- ...

No heap allocation, sparse sets, or type-erased component maps are involved.
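The layout above can be approximated in a few lines of standard C++. This is a sketch under assumptions: `inline_pools`, its member names, and the `get` accessor are made up for illustration, and only two of the tracking arrays are shown. The point is that every pool is a plain member array, so the object's size is known at compile time and nothing is allocated.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <tuple>

struct position { std::int32_t x, y; };
struct health   { std::int32_t hp; };

// Hypothetical miniature of the storage model: tracking arrays plus one
// std::array per component, all stored inline in the object itself.
template <std::size_t Capacity, typename... Components>
struct inline_pools {
    std::array<std::uint32_t, Capacity> mask{};
    std::array<std::uint8_t, Capacity> gen{};
    std::tuple<std::array<Components, Capacity>...> pools{};

    // Look a component up by type and slot; no indirection beyond the index.
    template <typename C>
    constexpr C& get(std::size_t slot) {
        return std::get<std::array<C, Capacity>>(pools)[slot];
    }
};

// Writes and reads one component entirely inside a constant expression,
// which would not compile if any heap allocation were involved (C++17).
constexpr std::int32_t pools_demo() {
    inline_pools<4, position, health> world{};
    world.get<health>(2).hp = 7;
    return world.get<health>(2).hp;
}

static_assert(pools_demo() == 7);
static_assert(sizeof(inline_pools<128, position, health>) >= 128 * (8 + 4));
```

A registry built this way can live in a global, on the stack, or inside another object, and its footprint shows up directly in the linker map.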

Memory consumption breakdown

For gba::ecs::registry<128, position, velocity, health>:

Item                           Size      Notes
Metadata overhead
  Hot scalars (5 bytes)        5 B
  Per-slot tracking (7 × 128)  896 B     m_mask + m_gen + stacks + indices
  Per-component count (3)      3 B
  Metadata subtotal            904 B     (26% of total)
Component pools
  position (8 × 128)           1024 B
  velocity (8 × 128)           1024 B
  health (4 × 128)             512 B
  Data subtotal                2560 B    (74% of total)
Total                          3464 B    (~3.4 KB)

General formula

For a registry with Capacity slots and N components:

Metadata = Capacity × 7 + N + 5

Component data = Capacity × Σ(sizeof(Component))

Total = Metadata + Component data
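The formula is simple enough to check at compile time. The helper names below are hypothetical; the constants (7 bytes per slot, 1 byte per component, 5 bytes of scalars) are taken directly from the formula as stated. Real sizeof values may come out slightly larger if the compiler inserts alignment padding between arrays.

```cpp
#include <cstddef>

// Metadata = Capacity × 7 + N + 5
constexpr std::size_t metadata_bytes(std::size_t capacity, std::size_t components) {
    return capacity * 7 + components + 5;
}

// Component data = Capacity × Σ(sizeof(Component))
constexpr std::size_t data_bytes(std::size_t capacity, std::size_t payload_per_entity) {
    return capacity * payload_per_entity;
}

// Total = Metadata + Component data
constexpr std::size_t total_bytes(std::size_t capacity, std::size_t components,
                                  std::size_t payload_per_entity) {
    return metadata_bytes(capacity, components) +
           data_bytes(capacity, payload_per_entity);
}

// registry<128, position, velocity, health>: payload is 8 + 8 + 4 = 20 bytes.
static_assert(metadata_bytes(128, 3) == 904);
static_assert(data_bytes(128, 20) == 2560);
static_assert(total_bytes(128, 3, 20) == 3464);
```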

Scaling characteristics

Metadata grows linearly with capacity (7 bytes per slot) and is almost independent of component count: each component adds one count byte to the metadata, and everything else it adds goes to the pools.

Config                       Metadata   Data      Total     % Overhead
64 entities, 3 components    456 B      1280 B    1736 B    26%
128 entities, 3 components   904 B      2560 B    3464 B    26%
255 entities, 3 components   1793 B     5100 B    6893 B    26%
128 entities, 6 components   907 B      4608 B    5515 B    16%

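Capacity alone does not change the metadata percentage, because metadata and component data both scale linearly with slot count (it stays at 26% across the three-component rows above). What lowers it is payload: more or larger components spread the same per-slot tracking cost over more data.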

Component groups and logical organisation

Component groups provide compile-time organisation without runtime overhead.

// Define conceptual groups
using physics = gba::ecs::group<position, velocity, acceleration>;
using rendering = gba::ecs::group<sprite_id, palette_bank, x_offset>;

// Use groups in registry declaration
gba::ecs::registry<128, physics, rendering, health> world;

// Internally flattened to:
// gba::ecs::registry<128, position, velocity, acceleration,
//                        sprite_id, palette_bank, x_offset, health>

Groups are completely erased at compile time. They exist for code organisation and readability, not runtime behaviour.
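One way such flattening can be done is with ordinary template metaprogramming. The sketch below is not the library's implementation: `type_list`, `concat`, `expand`, and `flatten_t` are hypothetical helpers, `group` is a local stand-in for gba::ecs::group, and nested groups are not handled.

```cpp
#include <type_traits>

template <typename... Ts> struct type_list {};
template <typename... Cs> struct group {};  // stand-in for gba::ecs::group

// Join any number of type_lists into one.
template <typename... Lists> struct concat;
template <> struct concat<> { using type = type_list<>; };
template <typename... As>
struct concat<type_list<As...>> { using type = type_list<As...>; };
template <typename... As, typename... Bs, typename... Rest>
struct concat<type_list<As...>, type_list<Bs...>, Rest...> {
    using type = typename concat<type_list<As..., Bs...>, Rest...>::type;
};

// A plain component becomes a one-element list; a group becomes its members.
template <typename T> struct expand { using type = type_list<T>; };
template <typename... Cs>
struct expand<group<Cs...>> { using type = type_list<Cs...>; };

template <typename... Ts>
using flatten_t = typename concat<typename expand<Ts>::type...>::type;

// Empty tag types are enough to demonstrate the type-level result.
struct position {}; struct velocity {}; struct acceleration {};
struct sprite_id {}; struct palette_bank {}; struct x_offset {};
struct health {};

using physics = group<position, velocity, acceleration>;
using rendering = group<sprite_id, palette_bank, x_offset>;

static_assert(std::is_same_v<
    flatten_t<physics, rendering, health>,
    type_list<position, velocity, acceleration,
              sprite_id, palette_bank, x_offset, health>>);
```

Everything here is resolved by the compiler; no group object or lookup table exists at run time.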

Why groups matter

  • Logical namespace: Physics components stay together in the code
  • No runtime cost: Groups are pure templates; zero overhead
  • No ambiguity: The registry type fully specifies what exists
  • Iteration unchanged: Use view<position, velocity>() regardless of groups

// A view names components directly, whether or not they were declared in a group:
world.view<position, velocity>().each([](position& p, velocity& v) {
    p.x += v.vx;
});

Entity identity

entity_id is a 16-bit handle:

Bits          Meaning
low 8 bits    slot index
high 8 bits   generation counter

 15            8 7             0
+---------------+---------------+
|  generation   |     slot      |
+---------------+---------------+

Consequences:

  • maximum slots per registry: 255
  • 0xFFFF is reserved for gba::ecs::null
  • stale handles become invalid after destroy() increments generation

This is a very good match for GBA games, where worlds are usually dozens or low hundreds of entities, not tens of thousands.
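The handle layout and the stale-handle rule can be sketched with plain integers. The function names and the raw `std::uint16_t` representation are assumptions for illustration; the real entity_id type may wrap these bits differently.

```cpp
#include <cstdint>

using entity_bits = std::uint16_t;
constexpr entity_bits null_bits = 0xFFFF;  // value reserved for gba::ecs::null

// Generation in the high byte, slot index in the low byte.
constexpr entity_bits make_handle(std::uint8_t slot, std::uint8_t generation) {
    return static_cast<entity_bits>((generation << 8) | slot);
}
constexpr std::uint8_t slot_of(entity_bits h) {
    return static_cast<std::uint8_t>(h & 0xFF);
}
constexpr std::uint8_t generation_of(entity_bits h) {
    return static_cast<std::uint8_t>(h >> 8);
}

// A handle is current only while its generation matches the slot's counter.
constexpr bool is_current(entity_bits h, const std::uint8_t* generations) {
    return h != null_bits && generations[slot_of(h)] == generation_of(h);
}

// Bumping the slot's generation (what destroy() is described as doing)
// invalidates every handle that was issued before the bump.
constexpr bool stale_handle_demo() {
    std::uint8_t generations[4] = {0, 0, 0, 0};
    const entity_bits h = make_handle(2, generations[2]);
    const bool valid_before = is_current(h, generations);
    ++generations[2];
    const bool valid_after = is_current(h, generations);
    return valid_before && !valid_after;
}

static_assert(make_handle(5, 3) == 0x0305);
static_assert(stale_handle_demo());
```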

Presence tracking with one mask per slot

Each slot has one std::uint32_t mask:

  • bit 31 = entity alive flag
  • bits 0-30 = component presence bits

That supports cheap queries:

  • all_of<Cs...>() -> bitwise AND against a compile-time mask, then check that every requested bit survived
  • any_of<Cs...>() -> bitwise AND against a compile-time mask, then check that the result is non-zero
  • view filtering -> compare the slot mask with a required mask
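The three queries above differ only in what they do with the result of the AND. A minimal sketch, assuming hypothetical bit assignments for three components (a real registry would derive them from its template parameter order) and helper names invented for this example:

```cpp
#include <cstdint>

// Bit 31 is the alive flag; bits 0-30 are component presence bits.
constexpr std::uint32_t alive_bit = 1u << 31;

// Hypothetical bit assignments for illustration only.
constexpr std::uint32_t position_bit = 1u << 0;
constexpr std::uint32_t velocity_bit = 1u << 1;
constexpr std::uint32_t health_bit   = 1u << 2;

// all_of: every requested bit must be set.
constexpr bool has_all(std::uint32_t slot_mask, std::uint32_t required) {
    return (slot_mask & required) == required;
}

// any_of: at least one requested bit must be set.
constexpr bool has_any(std::uint32_t slot_mask, std::uint32_t wanted) {
    return (slot_mask & wanted) != 0;
}

// View filter: the slot must be alive and own every requested component.
constexpr bool view_matches(std::uint32_t slot_mask, std::uint32_t required) {
    return has_all(slot_mask, required | alive_bit);
}

constexpr std::uint32_t mover = alive_bit | position_bit | velocity_bit;
constexpr std::uint32_t prop  = alive_bit | position_bit;

static_assert(has_all(mover, position_bit | velocity_bit));
static_assert(!has_all(prop, position_bit | velocity_bit));
static_assert(has_any(prop, position_bit | velocity_bit));
static_assert(!has_any(prop, health_bit));
static_assert(!view_matches(position_bit | velocity_bit, position_bit));  // not alive
```

Each check is one AND plus one compare, which is why the per-slot cost stays low even on the slowest view path.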

Logical per-entity layout vs physical storage

One of the easiest ways to misunderstand the registry is to imagine each entity as one packed struct. That is not what happens.

Logical view

For example, with these components:

Component   Size
position    8 bytes
velocity    8 bytes
health      4 bytes
sprite_id   1 byte

The logical entity data is 21 bytes of component payload.

Physical view

The registry stores them as separate arrays:

position pool: [p0][p1][p2][p3] ...
velocity pool: [v0][v1][v2][v3] ...
health pool:   [h0][h1][h2][h3] ...
sprite pool:   [s0][s1][s2][s3] ...

That is why a view<position, velocity>() can iterate directly over only the pools it needs.
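The difference between the two views can be sketched in code. The struct layouts below are assumptions chosen to match the sizes in the table, and `packed_entity`, `separate_pools`, and `integrate` are hypothetical names; the packed struct is shown only as the layout the registry does not use.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

struct position  { std::int32_t x, y; };
struct velocity  { std::int32_t vx, vy; };
struct health    { std::int32_t hp; };
struct sprite_id { std::uint8_t id; };

// Logical view: 21 bytes of payload per entity.
constexpr std::size_t payload_bytes =
    sizeof(position) + sizeof(velocity) + sizeof(health) + sizeof(sprite_id);
static_assert(payload_bytes == 21);

// NOT how the registry stores data. A packed struct is usually rounded up by
// alignment (typically to 24 bytes here), and every loop walks all of it.
struct packed_entity { position pos; velocity vel; health hp; sprite_id sprite; };
static_assert(sizeof(packed_entity) >= payload_bytes);

// Physical view: one array per component.
template <std::size_t N>
struct separate_pools {
    std::array<position, N> positions{};
    std::array<velocity, N> velocities{};
    std::array<health, N> healths{};
    std::array<sprite_id, N> sprites{};
};

// A movement pass reads and writes two pools; health and sprite data are
// never touched.
template <std::size_t N>
constexpr void integrate(separate_pools<N>& world, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        world.positions[i].x += world.velocities[i].vx;
        world.positions[i].y += world.velocities[i].vy;
    }
}

constexpr std::int32_t integrate_demo() {
    separate_pools<4> world{};
    world.velocities[1] = velocity{3, -2};
    integrate(world, 4);
    integrate(world, 4);
    return world.positions[1].x * 10 + world.positions[1].y;  // 6 * 10 + (-4)
}
static_assert(integrate_demo() == 56);
```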

Metadata arrays and what they buy you

Field                 Role
m_component_count[]   Count of alive entities owning each component
m_free_top            Size of the free-slot stack
m_next_slot           Next never-before-used slot
m_alive               Current alive entity count
m_mask[]              Alive + component presence bits
m_gen[]               Per-slot generation counters
m_free_stack[]        Recycled slot stack
m_alive_list[]        Dense list of alive slots
m_alive_index[]       Reverse map for O(1) removal from m_alive_list

This is the backbone of the ECS. The component pools are simple; the metadata is what makes creation, destruction, and iteration cheap.
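To show how these fields can cooperate, here is a sketch of slot allocation and release. It assumes removal works by moving the last alive entry into the hole (the usual way to get O(1) removal from a dense list with a reverse map); `slot_tracker`, `acquire`, and `release` are hypothetical names, and masks, generations, and bounds checks are left out.

```cpp
#include <array>
#include <cstdint>

template <std::uint8_t Capacity>
struct slot_tracker {
    std::array<std::uint8_t, Capacity> free_stack{};
    std::array<std::uint8_t, Capacity> alive_list{};
    std::array<std::uint8_t, Capacity> alive_index{};
    std::uint8_t free_top = 0;   // size of the free-slot stack
    std::uint8_t next_slot = 0;  // next never-before-used slot
    std::uint8_t alive = 0;      // current alive count

    // Prefer a recycled slot, otherwise take a fresh one. The caller is
    // assumed to have checked that alive < Capacity.
    constexpr std::uint8_t acquire() {
        const std::uint8_t slot =
            free_top > 0 ? free_stack[--free_top] : next_slot++;
        alive_index[slot] = alive;
        alive_list[alive++] = slot;
        return slot;
    }

    // O(1) removal: move the last alive entry into the hole, fix its
    // reverse-map entry, then push the freed slot for reuse.
    constexpr void release(std::uint8_t slot) {
        const std::uint8_t hole = alive_index[slot];
        const std::uint8_t last = alive_list[--alive];
        alive_list[hole] = last;
        alive_index[last] = hole;
        free_stack[free_top++] = slot;
    }
};

constexpr bool tracker_demo() {
    slot_tracker<8> t{};
    t.acquire();  // slot 0
    t.acquire();  // slot 1
    t.acquire();  // slot 2
    t.release(0);
    // Slot 2 moved into the hole, so the dense list is now [2, 1].
    const bool compacted =
        t.alive == 2 && t.alive_list[0] == 2 && t.alive_list[1] == 1;
    // The next acquire reuses slot 0 instead of touching slot 3.
    const bool recycled = t.acquire() == 0 && t.next_slot == 3;
    return compacted && recycled;
}
static_assert(tracker_demo());
```

Note that after the first release, alive slots are no longer in ascending order, which is exactly the situation the view dispatch has to account for.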

View dispatch strategy

view<Cs...>() does not use one always-generic loop. It picks among three runtime paths:

Path                  Condition                                         Cost profile
Dense + all-match     every alive entity has every requested            no alive-list lookup,
                      component, and alive slots are still dense        no mask check
                      from 0..N-1
All-match with gaps   every alive entity has every requested            alive-list lookup,
                      component, but slots are no longer dense          no mask check
Mixed                 some alive entities are missing requested         alive-list lookup plus
                      components                                        per-slot mask check

This matters because many gameplay worlds spend most of their time in one of the first two cases.
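The selection logic might look something like the sketch below. It rests on two assumptions that this page does not confirm: that "every alive entity has every requested component" is detected by comparing each requested component's count with the alive count, and that slot density is available as a single flag. `view_path` and `choose_path` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

enum class view_path { dense_all_match, all_match_with_gaps, mixed };

// requested_counts holds the m_component_count-style entry for each
// requested component; alive is the current alive entity count.
template <std::size_t N>
constexpr view_path choose_path(const std::uint8_t (&requested_counts)[N],
                                std::uint8_t alive, bool slots_dense) {
    for (std::uint8_t count : requested_counts) {
        // If any requested component is missing from some alive entity,
        // the per-slot mask check cannot be skipped.
        if (count != alive) return view_path::mixed;
    }
    return slots_dense ? view_path::dense_all_match
                       : view_path::all_match_with_gaps;
}

// Ten alive entities; both requested components owned by all of them.
constexpr std::uint8_t everyone_has_both[] = {10, 10};
// Ten alive entities; only seven own the second requested component.
constexpr std::uint8_t some_missing[] = {10, 7};

static_assert(choose_path(everyone_has_both, 10, true) == view_path::dense_all_match);
static_assert(choose_path(everyone_has_both, 10, false) == view_path::all_match_with_gaps);
static_assert(choose_path(some_missing, 10, true) == view_path::mixed);
```

Under these assumptions the decision costs a handful of byte compares per view, paid once before the loop rather than once per entity.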

Iteration styles

API                            Best for
range-for over view<Cs...>()   ergonomic gameplay code
.each(fn)                      explicit callback style, constexpr-friendly code
.each_arm(fn)                  measured hot loops where ARM-mode + IWRAM placement matters

Example:

world.view<position, velocity>().each([](position& pos, const velocity& vel) {
    pos.x += vel.vx;
    pos.y += vel.vy;
});

Power-of-two component sizes

Every component type must have a power-of-two sizeof(T).

Size            Allowed?
1               yes
2               yes
4               yes
8               yes
3, 5, 6, 7, …   no

If a type is almost right, pad it:

struct sprite_id {
    std::uint8_t id;
    gba::ecs::pad<3> _;
};

This rule exists to support cheap shift-based addressing in the component pools.
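A sketch of what the rule enables: when sizeof(T) is a power of two, the byte offset of a slot is a left shift instead of a multiply. The helpers below are hypothetical, `pad` is a local stand-in for gba::ecs::pad, and the `rgb` types are invented to show a size that would be rejected.

```cpp
#include <cstddef>
#include <cstdint>

constexpr bool is_power_of_two(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// log2 of a power of two: the shift amount that replaces a multiply.
constexpr unsigned shift_for(std::size_t n) {
    unsigned shift = 0;
    while ((std::size_t{1} << shift) < n) ++shift;
    return shift;
}

// Byte offset of a slot within a pool of T, computed with a shift.
template <typename T>
constexpr std::size_t byte_offset(std::size_t slot) {
    static_assert(is_power_of_two(sizeof(T)),
                  "component size must be a power of two");
    return slot << shift_for(sizeof(T));
}

template <std::size_t N>
struct pad { std::uint8_t bytes[N]; };  // stand-in for gba::ecs::pad

struct rgb        { std::uint8_t r, g, b; };            // 3 bytes: not allowed
struct rgb_padded { std::uint8_t r, g, b; pad<1> _; };  // 4 bytes: allowed

static_assert(!is_power_of_two(sizeof(rgb)));
static_assert(is_power_of_two(sizeof(rgb_padded)));
static_assert(byte_offset<rgb_padded>(5) == 5 * sizeof(rgb_padded));
```

The ARM7TDMI can fold a constant shift into an address calculation, so keeping sizes to powers of two keeps pool indexing to a single instruction's worth of arithmetic.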

What the architecture intentionally omits

To stay small and predictable, gba::ecs deliberately does not include:

  • runtime component registration
  • dynamic archetype storage
  • event buses or schedulers
  • system graphs or task runners
  • serialisation or reflection

The expectation is that you compose those policies at a higher layer if your project needs them.

See Internal Implementation for the field ordering, alive-list mechanics, and the fast-path details that fall out of this architecture.