Dynamic Uniform Buffer¶

This example shows how to use dynamic uniform buffer (UBO) bindings to render multiple objects without creating a separate descriptor set for each one. By storing all transforms in one contiguous buffer and supplying a dynamic offset when binding, we can render many objects with minimal descriptor management overhead. This matters most for scenes with hundreds or thousands of objects.

The example uses the KDGpuExample helper API for simplified setup.

Overview¶

What this example demonstrates:

Dynamic uniform buffer offsets for per-object data
Alignment requirements (minUniformBufferOffsetAlignment)
Single buffer with multiple transforms
Efficient descriptor reuse across draw calls
Instanced rendering alternative

Performance benefit:

One descriptor set shared by N objects (instead of N descriptor sets)
Reduced descriptor set switching overhead
Better cache locality with contiguous buffer
Minimal CPU-side per-object work

Vulkan Requirements¶

Vulkan Version: 1.0+
Extensions: None (dynamic UBOs are core functionality)
Device Limits: minUniformBufferOffsetAlignment (typically 16-256 bytes)

Key Concepts¶

Dynamic Uniform Buffers:

Vulkan provides two types of uniform buffer bindings:

Static UBO: Offset fixed at descriptor set creation time
Dynamic UBO: Offset specified at bind time (vkCmdBindDescriptorSets)

Dynamic UBOs allow binding different regions of the same buffer without creating multiple descriptor sets. This is perfect for per-object data like transform matrices.

Alignment Requirements:

Dynamic UBO offsets must be a multiple of minUniformBufferOffsetAlignment, a device-specific power of two (commonly between 16 and 256 bytes). Even if your data is smaller (e.g., a 64-byte mat4), every entry must start on such a boundary, so the stride between entries is the data size rounded up to the alignment:

// Wrong if sizeof(mat4) is not a multiple of the alignment: may crash or produce incorrect results
offset = objectIndex * sizeof(mat4);  // 64 bytes

// Correct: aligned to device requirement
offset = objectIndex * alignedSize;    // e.g., 256 bytes

Spec: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkPhysicalDeviceLimits.html
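
The rounding step can be sketched as a small helper. This is a minimal illustration, not code from the example: alignUp and alignedOffset are hypothetical names, and the bit trick assumes the alignment is a non-zero power of two (which the Vulkan specification requires for minUniformBufferOffsetAlignment). The example's own std::max shortcut gives the same result only because sizeof(glm::mat4) is itself a power of two.

```cpp
#include <cassert>
#include <cstddef>

// Rounds `size` up to the next multiple of `alignment`.
// Assumption: `alignment` is a non-zero power of two.
constexpr std::size_t alignUp(std::size_t size, std::size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

// Byte offset of object `index` when every entry occupies one aligned slot.
constexpr std::size_t alignedOffset(std::size_t index, std::size_t dataSize, std::size_t alignment)
{
    return index * alignUp(dataSize, alignment);
}
```

For a payload whose size is not a power of two, such as 80 bytes with a 64-byte alignment, alignUp yields 128, whereas taking the larger of the two values would yield 80, which is not a valid offset multiple.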

Use Cases:

Per-object transforms in a scene
Per-material parameters
Per-light data arrays
Skeletal animation bone matrices
Any homogeneous per-item data

Implementation¶

Computing Aligned Stride:

        // Retrieve minimum buffer offset alignment
        const size_t minDynamicUBOOffsetAlignment = m_device.adapter()->properties().limits.minUniformBufferOffsetAlignment;
        // Both values are powers of two, so the larger one is also a multiple of the alignment
        m_dynamicUBOByteStride = std::max(minDynamicUBOOffsetAlignment, sizeof(glm::mat4));

        const BufferOptions bufferOptions = {
            .size = entityCount * m_dynamicUBOByteStride,
            .usage = BufferUsageFlagBits::UniformBufferBit,
            .memoryUsage = MemoryUsage::CpuToGpu // So we can map it to CPU address space
        };
        m_transformDynamicUBOBuffer = m_device.createBuffer(bufferOptions);

Filename: dynamic_ubo/dynamic_ubo_triangles.cpp

Key points:

Query minUniformBufferOffsetAlignment from device limits
Round up data size to alignment boundary
Allocate buffer with: objectCount * alignedStride
Padding bytes are unavoidable; the relative waste can be high, but the absolute cost is usually small

Example: With 256-byte alignment and 64-byte mat4:

Stride = 256 bytes
Waste = 192 bytes per object (75% wasted)
For 1000 objects: 256,000 bytes (250 KiB) total, of which 192,000 bytes (187.5 KiB) are padding
Still more efficient than 1000 descriptor sets!
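
The arithmetic above can be checked with a short sketch. PaddingStats and paddingStats are hypothetical names introduced only for illustration, and the function assumes the stride is at least as large as the payload.

```cpp
#include <cassert>
#include <cstddef>

struct PaddingStats {
    std::size_t totalBytes;  // size of the whole buffer
    std::size_t wastedBytes; // bytes that exist only to satisfy the alignment
};

// Assumption: stride >= dataSize (the stride is the payload rounded up).
constexpr PaddingStats paddingStats(std::size_t objectCount, std::size_t dataSize, std::size_t stride)
{
    return PaddingStats{ objectCount * stride, objectCount * (stride - dataSize) };
}
```

Calling paddingStats(1000, 64, 256) reproduces the figures in the example: 256,000 bytes in total, 192,000 of them padding.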

Bind Group Layout Configuration:

    // Create bind group layout consisting of a single binding holding a UBO
    const BindGroupLayoutOptions bindGroupLayoutOptions = {
        .bindings = {
                {
                        .binding = 0,
                        .resourceType = ResourceBindingType::DynamicUniformBuffer,
                        .shaderStages = ShaderStageFlags(ShaderStageFlagBits::VertexBit),
                },
        },
    };
    const BindGroupLayout bindGroupLayout = m_device.createBindGroupLayout(bindGroupLayoutOptions);

Filename: dynamic_ubo/dynamic_ubo_triangles.cpp

Use ResourceBindingType::DynamicUniformBuffer instead of UniformBuffer. This tells Vulkan that you will provide the offset each time the bind group is bound, rather than fixing it when the bind group is created.

Bind Group with Dynamic Binding:

    const BindGroupOptions bindGroupOptions = {
        .layout = bindGroupLayout,
        .resources = {
                {
                        .binding = 0,
                        // We are dealing with a Dynamic UBO expected to hold a set of transform matrices.
                        // The size we specify for the binding is the size of a single entry in the buffer
                        .resource = DynamicUniformBufferBinding{
                                .buffer = m_transformDynamicUBOBuffer,
                                .size = uint32_t(m_dynamicUBOByteStride),
                        },
                },
        },
    };

Filename: dynamic_ubo/dynamic_ubo_triangles.cpp

The size field specifies the size of ONE entry (the aligned stride), not the entire buffer. Vulkan combines it with the dynamic offset to compute the buffer region the shader actually sees.
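
How the offset and size combine can be sketched as follows. In Vulkan the effective start of a dynamic binding is the offset stored in the descriptor plus the dynamic offset supplied at bind time, and the visible range is the descriptor's size. BoundRange, effectiveRange, and fitsInBuffer are hypothetical helpers, not part of KDGpu or Vulkan.

```cpp
#include <cassert>
#include <cstdint>

struct BoundRange {
    std::uint64_t begin; // first byte visible to the shader
    std::uint64_t end;   // one past the last visible byte
};

// baseOffset and size come from the descriptor; dynamicOffset is supplied at bind time.
constexpr BoundRange effectiveRange(std::uint64_t baseOffset, std::uint32_t dynamicOffset, std::uint64_t size)
{
    return BoundRange{ baseOffset + dynamicOffset, baseOffset + dynamicOffset + size };
}

// The bound region must lie entirely inside the buffer.
constexpr bool fitsInBuffer(BoundRange range, std::uint64_t bufferSize)
{
    return range.end <= bufferSize;
}
```

With a 256-byte stride and four entries (a 1024-byte buffer), offsets 0, 256, 512, and 768 are valid, while an offset of 1024 would reach past the end of the buffer.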

Per-Frame Update:

    // Each frame we want to rotate the triangle a little
    static float angle = 0.0f;
    angle += 0.1f;
    if (angle > 360.0f)
        angle -= 360.0f;

    std::vector<uint8_t> rawTransformData(entityCount * m_dynamicUBOByteStride, 0U);

    // Update EntityCount matrices into the single buffer we have
    for (size_t i = 0; i < entityCount; ++i) {
        auto transform = glm::mat4(1.0f);
        transform = glm::translate(transform, glm::vec3(-0.7f + (i * 0.5f), 0.0f, 0.0f));
        transform = glm::scale(transform, glm::vec3(0.2f));
        transform = glm::rotate(transform, glm::radians(angle + (45.0f * i)), glm::vec3(0.0f, 0.0f, 1.0f));

        std::memcpy(rawTransformData.data() + (i * m_dynamicUBOByteStride), &transform, sizeof(glm::mat4));
    }

    auto *bufferData = m_transformDynamicUBOBuffer.map();
    std::memcpy(bufferData, rawTransformData.data(), rawTransformData.size());
    m_transformDynamicUBOBuffer.unmap();

Filename: dynamic_ubo/dynamic_ubo_triangles.cpp

Build all transforms in a CPU-side byte array, then map the buffer once, copy everything with a single memcpy, and unmap. Each transform is written at offset i * alignedStride.
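
The strided packing step can be isolated into a generic helper, sketched below without any GPU or glm dependency. packAtStride is a hypothetical name, and Mat4 is a plain stand-in for glm::mat4 used only so the sketch is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Copies each item to the start of its own stride-sized slot; padding bytes stay zero.
template<typename T>
std::vector<std::uint8_t> packAtStride(const std::vector<T> &items, std::size_t stride)
{
    static_assert(std::is_trivially_copyable_v<T>, "items are copied with memcpy");
    assert(stride >= sizeof(T)); // each slot must be able to hold one item
    std::vector<std::uint8_t> bytes(items.size() * stride, 0U);
    for (std::size_t i = 0; i < items.size(); ++i)
        std::memcpy(bytes.data() + i * stride, &items[i], sizeof(T));
    return bytes;
}

// Stand-in for glm::mat4 (16 floats, 64 bytes) so the sketch has no external dependencies.
struct Mat4 {
    float m[16];
};
```

The resulting byte vector has the same layout as the mapped buffer, so it can be uploaded with one copy.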

Rendering with Dynamic Offsets:

    for (size_t i = 0; i < entityCount; ++i) {
        // Bind Group and provide offset into the Dynamic UBO that holds all the transform matrices
        const uint32_t dynamicUBOOffset = i * m_dynamicUBOByteStride;
        opaquePass.setBindGroup(0, m_transformBindGroup, m_pipelineLayout, std::array{ dynamicUBOOffset });
        const DrawIndexedCommand drawCmd = { .indexCount = 3 };
        opaquePass.drawIndexed(drawCmd);
    }

Filename: dynamic_ubo/dynamic_ubo_triangles.cpp

The critical call is setBindGroup with the dynamic offset:

opaquePass.setBindGroup(0, m_transformBindGroup, m_pipelineLayout, std::array{ dynamicUBOOffset });

This selects which object's transform the shader sees, without changing descriptor sets.
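
Vulkan takes dynamic offsets as 32-bit unsigned integers, while the loop index and stride in the example are size_t, so the product is narrowed. A defensive version of that computation might look like the sketch below; dynamicOffsetFor is a hypothetical helper, not part of KDGpu.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>

// Returns the dynamic offset for object `index`, or std::nullopt if the
// byte offset does not fit into the 32 bits used for dynamic offsets.
constexpr std::optional<std::uint32_t> dynamicOffsetFor(std::size_t index, std::size_t stride)
{
    const std::uint64_t offset = static_cast<std::uint64_t>(index) * stride;
    if (offset > std::numeric_limits<std::uint32_t>::max())
        return std::nullopt;
    return static_cast<std::uint32_t>(offset);
}
```

For the handful of triangles in this example the check never fires; it only becomes relevant for buffers larger than 4 GiB.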

Performance Notes¶

When to Use Dynamic UBOs:

Good: 10-10,000 objects with small per-object data (transforms, colors)
Good: Data updated frequently (every frame)
Bad: Large per-object data (well beyond 256 bytes) - consider storage buffers instead
Bad: Very sparse updates - static descriptor sets may be better

Memory Overhead:

Alignment waste can be significant for small data
For 64-byte data with 256-byte alignment: 75% wasted
For 256-byte data with 256-byte alignment: 0% wasted
Pack multiple data items to reduce waste
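
As an illustration of packing, the sketch below compares a transform-only payload with a payload that also carries other per-object values in the same slot. Both structs and their fields are hypothetical and are not taken from the example; the sizes assume 4-byte floats.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical payload holding only the model matrix.
struct TransformOnly {
    float model[16]; // 64 bytes
};

// Hypothetical payload that shares the slot with further per-object data.
struct PackedPerObject {
    float model[16];         // 64 bytes
    float normalMatrix[12];  // 48 bytes (three vec4-padded columns)
    float baseColor[4];      // 16 bytes
    float materialParams[4]; // 16 bytes
};

// Percentage of each stride-sized slot that carries real data (rounded down).
constexpr std::size_t slotUtilizationPercent(std::size_t payloadSize, std::size_t stride)
{
    return payloadSize * 100 / stride;
}
```

With a 256-byte stride, the transform-only payload uses 25% of each slot, while the packed payload (144 bytes) uses 56% without increasing the buffer size.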

CPU Performance:

Typically much cheaper than allocating, updating, and binding N descriptor sets
One descriptor set rebound with a new offset per object, instead of N separately created sets
Reduces driver validation overhead

GPU Performance:

Contiguous buffer improves cache locality
Uniform buffer access is very fast
Usually little or no performance difference from static UBOs

Alternatives:

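Push Constants: For very small data (only 128 bytes are guaranteed by maxPushConstantsSize), fastest but limited size
Storage Buffers: For large/variable-size data, no per-entry alignment waste when indexed in the shader, but access may be slower on some hardware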
Instanced Rendering: For identical geometry with per-instance data
Indirect Drawing: For GPU-driven rendering with buffers
