Descriptor Indexing with Bindless Rendering

This example shows how to use descriptor indexing (often called "bindless rendering") to access arrays of resources with non-uniform indices in shaders. Traditional Vulkan requires binding specific descriptors before each draw call; descriptor indexing lets you bind a large array once and index into it dynamically in the shader. This removes most per-draw descriptor set binds from the CPU side and enables efficient material systems, texture atlases, and data-driven rendering.

The example uses the KDGpuExample helper API for simplified setup.

Overview

What this example demonstrates:

Enabling the VK_EXT_descriptor_indexing extension and its required features
Creating descriptor sets with large arrays of uniform buffers
Using the nonuniformEXT qualifier in shaders for dynamic indexing
Drawing multiple objects with different materials/transforms
Variable-length descriptor arrays sized at runtime

Use cases:

Material systems (hundreds/thousands of materials in one array)
Texture streaming and mega-textures
GPU-driven rendering (indirect draws selecting resources)
Bindless vertex/index buffers
Efficient multi-material rendering

Vulkan Requirements

Vulkan Version: 1.2+ (descriptor indexing promoted to core)
Extensions: VK_EXT_descriptor_indexing (core in 1.2)
Features:
- shaderUniformBufferArrayNonUniformIndexing
- runtimeDescriptorArray
- descriptorBindingVariableDescriptorCount
- descriptorBindingPartiallyBound (optional but recommended)
Shader: SPIR-V 1.3+ or GLSL 450+ with GL_EXT_nonuniform_qualifier

Key Concepts

Traditional Descriptor Binding:

// Draw 3 objects with different materials:
for (int i = 0; i < 3; i++) {
    bindDescriptorSet(materialDescriptorSets[i]);  // CPU overhead!
    draw(object[i]);
}

Every material change requires its own descriptor set bind, and each bind adds CPU work while recording the command buffer.

Descriptor Indexing / Bindless:

// Bind array of ALL materials once:
bindDescriptorSet(allMaterialsArray);  // One bind for all!

// Draw all objects with different material indices:
for (int i = 0; i < 3; i++) {
    // Index is computed in shader or passed via push constant
    draw(object[i]);
}

Shader:

#extension GL_EXT_nonuniform_qualifier : require  // provides nonuniformEXT

layout(set = 0, binding = 0) uniform Material {
    mat4 transform;
} materials[16];  // Array of materials/transforms

void main() {
    // computeIndex() is a placeholder; here the index is derived from frame time/angle:
    uint index = computeIndex();
    mat4 transform = materials[nonuniformEXT(index)].transform;
}

Benefits:

Massively reduced bind calls
GPU-driven resource selection
Simplified render loop
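
As a rough illustration of the bind-call difference between the two loops above, the following C++ sketch uses a mock recorder that only counts calls. MockRecorder, recordTraditional, and recordBindless are hypothetical names invented for this illustration; they are not part of KDGpu or Vulkan, and no GPU work is performed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mock (not a KDGpu or Vulkan type): it only counts what gets recorded.
struct MockRecorder {
    std::size_t bindCalls = 0;
    std::size_t drawCalls = 0;
    std::vector<std::uint32_t> pushedIndices; // stand-in for per-draw push constants

    void bindDescriptorSet() { ++bindCalls; }
    void pushIndex(std::uint32_t index) { pushedIndices.push_back(index); }
    void draw() { ++drawCalls; }
};

// Traditional path: one descriptor set per material, rebound before every draw.
inline MockRecorder recordTraditional(std::size_t objectCount)
{
    MockRecorder rec;
    for (std::size_t i = 0; i < objectCount; ++i) {
        rec.bindDescriptorSet();
        rec.draw();
    }
    return rec;
}

// Bindless path: bind the whole array once, then pass only a small index per draw.
inline MockRecorder recordBindless(std::size_t objectCount)
{
    MockRecorder rec;
    rec.bindDescriptorSet();
    for (std::size_t i = 0; i < objectCount; ++i) {
        rec.pushIndex(static_cast<std::uint32_t>(i));
        rec.draw();
    }
    return rec;
}
```

With three objects, recordTraditional(3) records three binds while recordBindless(3) records one; the per-draw cost shifts to passing a small index, for example through a push constant.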

Spec: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_EXT_descriptor_indexing.html

NonUniform Indexing:

The nonuniformEXT qualifier tells the compiler that the index is not guaranteed to be dynamically uniform, meaning its value may differ between shader invocations of the same draw call. Without this:

Index must be a compile-time constant, or
Index must be dynamically uniform, i.e. the same value across the invocations of a draw or dispatch

With nonuniformEXT:

Each triangle/pixel can use a different index
Enables material-per-object and texture-per-pixel selection
May have a small performance cost on some hardware
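
To make the uniform versus non-uniform distinction concrete, the CPU-side sketch below takes the index used by each invocation in a group and reports whether all of them agree. It also counts distinct indices, on the assumption that an implementation may handle a non-uniform index by processing one distinct value per pass; that is one possible strategy, not a statement about any particular GPU. Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <set>
#include <vector>

// True when every invocation in the group uses the same index, i.e. the
// index is uniform and nonuniformEXT would not be needed for this group.
inline bool isDynamicallyUniform(const std::vector<std::uint32_t> &indices)
{
    return std::adjacent_find(indices.begin(), indices.end(),
                              std::not_equal_to<std::uint32_t>()) == indices.end();
}

// Assumption for illustration: a "one distinct index per pass" strategy would
// need as many passes as there are distinct indices in the group.
inline std::size_t distinctIndexPasses(const std::vector<std::uint32_t> &indices)
{
    return std::set<std::uint32_t>(indices.begin(), indices.end()).size();
}
```

Under that assumption, a group where every invocation reads index 3 needs a single pass, while a group using indices 0, 1, 0, 2 needs three, which is one way to picture where the extra cost can come from.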

Variable-Length Arrays:

Traditional Vulkan requires compile-time array sizes. Descriptor indexing allows:

layout(set = 0, binding = 0) uniform Transforms {
    mat4 matrix;
} transforms[];  // Variable length!

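descriptorCount in VkDescriptorSetLayoutBinding sets the upper bound for the array. When the binding is created with VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT, the actual size is chosen at descriptor set allocation time through VkDescriptorSetVariableDescriptorCountAllocateInfo.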

Implementation

Allocating Descriptor Array Buffers:

    // Create a set of TransformsCount UBOs, each holding a distinct rotation matrix
    {
        const BufferOptions bufferOptions = {
            .size = sizeof(glm::mat4),
            .usage = BufferUsageFlagBits::UniformBufferBit,
            .memoryUsage = MemoryUsage::CpuToGpu // So we can map it to CPU address space
        };

        m_transformBuffers.reserve(TransformsCount);

        const float angleStep = 360.0f / float(TransformsCount);

        for (size_t i = 0; i < TransformsCount; ++i) {
            const glm::mat4 mat = glm::rotate(glm::mat4(1.0f), glm::radians(i * angleStep), glm::vec3(0.0f, 0.0f, 1.0f));

            Buffer buf = m_device.createBuffer(bufferOptions);
            auto bufferData = buf.map();
            std::memcpy(bufferData, &mat, sizeof(glm::mat4));
            buf.unmap();

            m_transformBuffers.emplace_back(std::move(buf));
        }
    }
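
The loop above spreads TransformsCount rotations evenly over a full turn, so slot i holds a rotation of i * angleStep degrees. The standalone sketch below shows that spacing and one possible way to map an arbitrary angle back to a slot index. slotAngleDegrees and slotIndexForAngle are hypothetical helpers written for this illustration; the example's actual index computation may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Angle (in degrees) of the rotation stored in slot i, mirroring i * angleStep above.
inline float slotAngleDegrees(std::size_t i, std::size_t count)
{
    assert(count > 0);
    return float(i) * (360.0f / float(count));
}

// Hypothetical mapping from any angle to the slot whose range contains it.
// Negative angles and angles >= 360 wrap around.
inline std::uint32_t slotIndexForAngle(float angleDegrees, std::size_t count)
{
    assert(count > 0);
    const float step = 360.0f / float(count);
    float wrapped = std::fmod(angleDegrees, 360.0f);
    if (wrapped < 0.0f)
        wrapped += 360.0f;
    const std::uint32_t index = static_cast<std::uint32_t>(wrapped / step);
    // The modulo guards against rounding that lands exactly on count.
    return index % static_cast<std::uint32_t>(count);
}
```

With 16 slots the step is 22.5 degrees, so slot 4 holds a 90 degree rotation, and an angle of 100 degrees falls into slot 4 as well.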