
Offscreen Rendering

This example demonstrates how to use Vulkan to render to an image file instead of presenting to the screen. We have no need for KDGui::GuiApplication or the KDGpu::KDGpuExample helper API, since we don't have an event loop.

When the program starts, the first thing we do is fill a large vector with vertex data representing the points we want to plot. These are generated by a function that we won't cover here. Vertices look like this:

    struct Vertex {
        glm::vec2 pos;
        glm::vec4 color;
    };

Filename: offscreen_rendering/offscreen.h
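The generating function is not covered in this walkthrough, but the idea can be sketched in a self-contained way. In the sketch below, generatePoints, the plain Vec2/Vec4 stand-ins for the glm types, and the sine curve are all illustrative assumptions rather than code from the example:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Stand-ins for glm::vec2 / glm::vec4 so the sketch is self-contained.
struct Vec2 { float x, y; };
struct Vec4 { float r, g, b, a; };

struct Vertex {
    Vec2 pos;
    Vec4 color;
};

// Hypothetical generator: sample a sine curve and fade the color along x.
// count must be at least 2.
std::vector<Vertex> generatePoints(std::size_t count)
{
    std::vector<Vertex> points;
    points.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const float t = float(i) / float(count - 1); // 0 .. 1 across the curve
        points.push_back(Vertex{
                { t, std::sin(t * 6.28318530f) }, // one period of a sine wave
                { t, 0.2f, 1.0f - t, 1.0f } }); // blue-to-red gradient
    }
    return points;
}
```

Any function that fills a vector of such vertices would do; only the struct layout matters to the rest of the example.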

We then construct the "Offscreen" object and call its initializeScene method.

createRenderTargets

The first part of initialization is the constructor, which creates a Vulkan API instance, selects a physical adapter, and creates a device and a queue from it:

Offscreen::Offscreen()
    : m_api(std::make_unique<VulkanGraphicsApi>())
{
    m_instance = m_api->createInstance(InstanceOptions{
            .applicationName = "offscreen_rendering",
            .applicationVersion = KDGPU_MAKE_API_VERSION(0, 1, 0, 0) });

    auto adapter = m_instance.selectAdapter(AdapterDeviceType::Default);
    const auto adapterProperties = adapter->properties();
    SPDLOG_INFO("Using adapter: {}", adapterProperties.deviceName);

    // Create a device and grab the first queue
    m_device = adapter->createDevice(DeviceOptions{ .requestedFeatures = adapter->features() });
    m_queue = m_device.queues()[0];

    createRenderTargets();
}

Filename: offscreen_rendering/offscreen.cpp

The constructor also calls createRenderTargets, a long function that configures and creates all the textures we need, as well as an array of KDGpu::TextureMemoryBarrierOptions, whose purpose will be shown later. First, let's look at the color texture initialization.

    // Create a color texture to use as our color render target
    const TextureOptions msaaColorTextureOptions = {
        .type = TextureType::TextureType2D,
        .format = m_colorFormat,
        .extent = { m_width, m_height, 1 },
        .mipLevels = 1,
        .samples = m_samples,
        .usage = TextureUsageFlagBits::ColorAttachmentBit,
        .memoryUsage = MemoryUsage::GpuOnly
    };
    m_msaaColorTexture = m_device.createTexture(msaaColorTextureOptions);
    m_msaaColorTextureView = m_msaaColorTexture.createView();

    // Create a color texture to use as our resolve render target
    const TextureOptions colorTextureOptions = {
        .type = TextureType::TextureType2D,
        .format = m_colorFormat,
        .extent = { m_width, m_height, 1 },
        .mipLevels = 1,
        .samples = SampleCountFlagBits::Samples1Bit,
        .usage = TextureUsageFlagBits::ColorAttachmentBit | TextureUsageFlagBits::TransferSrcBit,
        .memoryUsage = MemoryUsage::GpuOnly
    };
    m_colorTexture = m_device.createTexture(colorTextureOptions);
    m_colorTextureView = m_colorTexture.createView();

Filename: offscreen_rendering/offscreen.cpp

We need to create two textures since we are using multisampling. The scene is rendered into the first texture with multiple samples per pixel and then resolved into the second, single-sampled texture to produce the MSAA result. To use this we simply configure the main pass to use both views:

    m_renderPassOptions = {
        .colorAttachments = {
            {
                .view = m_msaaColorTextureView,
                .resolveView = m_colorTextureView,
                .clearValue = { 0.3f, 0.3f, 0.3f, 1.0f }
            }
        },
        .depthStencilAttachment = {
            .view = m_depthTextureView,
        },
        .samples = m_samples
    };

Filename: offscreen_rendering/offscreen.cpp

For more information on multi-sampling with KDGpu, check out Hello Triangle MSAA.

Next, we initialize a texture that is visible in CPU address space. Vulkan will copy the rendered result into it, and we will then read it back and write it to disk in an image format.

    // Create a color texture that is host visible and in linear layout. We will copy into this.
    const TextureOptions cpuColorTextureOptions = {
        .type = TextureType::TextureType2D,
        .format = m_colorFormat,
        .extent = { m_width, m_height, 1 },
        .mipLevels = 1,
        .samples = SampleCountFlagBits::Samples1Bit,
        .tiling = TextureTiling::Linear, // Linear so we can manipulate it on the host
        .usage = TextureUsageFlagBits::TransferDstBit,
        .memoryUsage = MemoryUsage::CpuOnly
    };
    m_cpuColorTexture = m_device.createTexture(cpuColorTextureOptions);

Filename: offscreen_rendering/offscreen.cpp

The tiling field determines the layout of the texels in memory. Optimal is the default value; in that case the driver lays the texture out in whatever way is fastest for the hardware the program is running on. Linear means row-major order, which is what we need here, since the texture must be addressable from the CPU.
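To make the row-major layout concrete, here is a small self-contained helper, with assumed parameter names rather than KDGpu API, that computes the byte offset of texel (x, y) in a linear texture whose rows are padded to a driver-reported row pitch:

```cpp
#include <cstdint>

// With linear tiling, rows are contiguous but may be padded: the
// driver-reported rowPitch can exceed width * bytesPerTexel.
std::uint64_t texelOffset(std::uint32_t x, std::uint32_t y,
                          std::uint64_t baseOffset,
                          std::uint64_t rowPitch,
                          std::uint32_t bytesPerTexel)
{
    return baseOffset + std::uint64_t(y) * rowPitch + std::uint64_t(x) * bytesPerTexel;
}
```

With a 4-byte format and a row pitch of 256 bytes, texel (2, 3) lives at byte 776 regardless of the texture's width, which is why the pitch, not the width, must be used when walking rows.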

After creating the textures, we need to create an array of memory barrier options. First, let's look at how these options are used later, during rendering:

    commandRecorder.textureMemoryBarrier(m_barriers[uint8_t(TextureBarriers::CopySrcPre)]);
    commandRecorder.textureMemoryBarrier(m_barriers[uint8_t(TextureBarriers::CopyDstPre)]);
    commandRecorder.copyTextureToTexture(m_copyOptions);
    commandRecorder.textureMemoryBarrier(m_barriers[uint8_t(TextureBarriers::CopyDstPost)]);
    commandRecorder.textureMemoryBarrier(m_barriers[uint8_t(TextureBarriers::CopySrcPost)]);

Filename: offscreen_rendering/offscreen.cpp

Memory barriers are commands which ensure that earlier GPU work on a resource has completed and that its results are visible to subsequent operations before processing continues. They can also transition a texture between layouts, which is part of the options we will configure. Before we can perform the copyTextureToTexture operation, we need to ensure that rendering to the color texture has finished and that both textures are in the correct layouts for the transfer. Afterwards, we need memory barriers to return the GPU texture to its original layout and to move the CPU texture into a layout that allows it to be mapped into CPU address space. So, let's set the different memory barrier options:

    // Insert a texture memory barrier to ensure the rendering to the color render target
    // is completed and to transition it into a layout suitable for copying from
    m_barriers[uint8_t(TextureBarriers::CopySrcPre)] = {
        .srcStages = PipelineStageFlagBit::TransferBit,
        .srcMask = AccessFlagBit::MemoryReadBit,
        .dstStages = PipelineStageFlagBit::TransferBit,
        .dstMask = AccessFlagBit::TransferReadBit,
        .oldLayout = TextureLayout::ColorAttachmentOptimal,
        .newLayout = TextureLayout::TransferSrcOptimal,
        .texture = m_colorTexture,
        .range = { .aspectMask = TextureAspectFlagBits::ColorBit }
    };

    // Insert another texture memory barrier to transition the destination cpu visible
    // texture into a suitable layout for copying into
    m_barriers[uint8_t(TextureBarriers::CopyDstPre)] = {
        .srcStages = PipelineStageFlagBit::TransferBit,
        .srcMask = AccessFlagBit::None,
        .dstStages = PipelineStageFlagBit::TransferBit,
        .dstMask = AccessFlagBit::TransferWriteBit,
        .oldLayout = TextureLayout::Undefined,
        .newLayout = TextureLayout::TransferDstOptimal,
        .texture = m_cpuColorTexture,
        .range = { .aspectMask = TextureAspectFlagBits::ColorBit }
    };

    // Transition the destination texture to general layout so that we can map it to the cpu
    // address space later.
    m_barriers[uint8_t(TextureBarriers::CopyDstPost)] = {
        .srcStages = PipelineStageFlagBit::TransferBit,
        .srcMask = AccessFlagBit::TransferWriteBit,
        .dstStages = PipelineStageFlagBit::TransferBit,
        .dstMask = AccessFlagBit::MemoryReadBit,
        .oldLayout = TextureLayout::TransferDstOptimal,
        .newLayout = TextureLayout::General,
        .texture = m_cpuColorTexture,
        .range = { .aspectMask = TextureAspectFlagBits::ColorBit }
    };

    // Transition the color target back to the color attachment optimal layout, ready
    // to render again later.
    m_barriers[uint8_t(TextureBarriers::CopySrcPost)] = {
        .srcStages = PipelineStageFlagBit::TransferBit,
        .srcMask = AccessFlagBit::TransferReadBit,
        .dstStages = PipelineStageFlagBit::TransferBit,
        .dstMask = AccessFlagBit::MemoryReadBit,
        .oldLayout = TextureLayout::TransferSrcOptimal,
        .newLayout = TextureLayout::ColorAttachmentOptimal,
        .texture = m_colorTexture,
        .range = { .aspectMask = TextureAspectFlagBits::ColorBit }
    };

Filename: offscreen_rendering/offscreen.cpp

The last step is to create the copy options for the copyTextureToTexture call shown earlier:

    m_copyOptions = {
        .srcTexture = m_colorTexture,
        .srcLayout = TextureLayout::TransferSrcOptimal,
        .dstTexture = m_cpuColorTexture,
        .dstLayout = TextureLayout::TransferDstOptimal,
        .regions = {{
            .extent = { .width = m_width, .height = m_height, .depth = 1 }
        }}
    };

Filename: offscreen_rendering/offscreen.cpp
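Once the copy has executed and the CPU texture has been mapped, its rows will in general be padded out to the row pitch rather than tightly packed. The repacking step that precedes handing the pixels to an image encoder can be sketched in a self-contained way (packRows is an illustrative helper; the actual KDGpu mapping call and the encoder are not shown):

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Copy a pitched (padded) image into a tightly packed buffer, row by row.
// 'src' points at the first texel of the mapped linear texture.
std::vector<std::uint8_t> packRows(const std::uint8_t *src,
                                   std::uint32_t width, std::uint32_t height,
                                   std::uint32_t bytesPerTexel,
                                   std::uint64_t rowPitch)
{
    const std::uint64_t tightRow = std::uint64_t(width) * bytesPerTexel;
    std::vector<std::uint8_t> packed(tightRow * height);
    for (std::uint32_t y = 0; y < height; ++y)
        std::memcpy(packed.data() + y * tightRow, src + y * rowPitch, tightRow);
    return packed; // ready to hand to an image encoder
}
```

The tightly packed buffer is what typical encoders expect; passing the pitched data directly would skew every row after the first.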

initializeScene

The first thing to do on scene initialization is to load the image that we use to represent points on the graph. The majority of this is identical to the texture loading seen in the Textured Quad example, including the same loadImage helper function. One difference is that we set the scaling filters for magnification and minification:

        m_pointSampler = m_device.createSampler(SamplerOptions{ .magFilter = FilterMode::Linear, .minFilter = FilterMode::Linear });

Filename: offscreen_rendering/offscreen.cpp

Also, we keep track of the buffer upload information in a member variable, to free later. This is a housekeeping task which would normally be handled by KDGpuExample::ExampleEngineLayer::uploadBufferData.


Next we load the shaders. Note that the vertex shader writes the built-in gl_PointSize output, which sets the size in pixels of the square rasterized for each point; with a point-list topology, the vertex shader must write this value.

out gl_PerVertex
{
    vec4 gl_Position;
    float gl_PointSize;
};

layout(set = 1, binding = 0) uniform Transform
{
    mat4 proj;
}
transform;

void main()
{
    color = vertexCol;
    gl_PointSize = 16.0;
    gl_Position = transform.proj * vertexPos;
}

Filename: offscreen_rendering/doc/shadersnippet.vert

Next, we create a buffer to hold the transformation matrix m_proj, and copy an orthographic projection into it:

void Offscreen::setProjection(float left, float right, float bottom, float top)
{
    // NB: We flip bottom and top since Vulkan (and KDGpu) invert the y vs OpenGL
    m_proj = glm::ortho(left, right, top, bottom);
    auto bufferData = m_projBuffer.map();
    std::memcpy(bufferData, &m_proj, sizeof(glm::mat4));
    m_projBuffer.unmap();
}

Filename: offscreen_rendering/offscreen.cpp
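The effect of the flipped bottom/top arguments can be seen by looking at just the y component of the orthographic mapping, reduced here to a plain function (an illustrative sketch, not glm code):

```cpp
// The y component of an orthographic projection: maps [bottom, top] to [-1, 1].
float orthoY(float y, float bottom, float top)
{
    return 2.0f * (y - bottom) / (top - bottom) - 1.0f;
}
```

With orthoY(0.0f, 0.0f, 10.0f) the bottom edge maps to -1; with the arguments swapped, the same point maps to +1. Swapping them therefore negates the resulting NDC coordinate, compensating for Vulkan's inverted y axis.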

We create the necessary bind group and bind group layouts, and then finally create the pipeline options:

    const GraphicsPipelineOptions pipelineOptions = {
        .shaderStages = {
            { .shaderModule = vertexShader, .stage = ShaderStageFlagBits::VertexBit },
            { .shaderModule = fragmentShader, .stage = ShaderStageFlagBits::FragmentBit }
        },
        .layout = m_pipelineLayout,
        .vertex = {
            .buffers = {
                { .binding = 0, .stride = sizeof(Offscreen::Vertex) }
            },
            .attributes = {
                { .location = 0, .binding = 0, .format = Format::R32G32_SFLOAT }, // Position
                { .location = 1, .binding = 0, .format = Format::R32G32B32A32_SFLOAT, .offset = sizeof(glm::vec2) } // Color
            }
        },
        .renderTargets = {{
            .format = m_colorFormat,
            .blending = {
                .blendingEnabled = true,
                .color = {
                    .srcFactor = BlendFactor::SrcAlpha,
                    .dstFactor = BlendFactor::OneMinusSrcAlpha
                },
                .alpha = {
                    .srcFactor = BlendFactor::SrcAlpha,
                    .dstFactor = BlendFactor::OneMinusSrcAlpha
                }
            }
        }},
        .depthStencil = {
            .format = m_depthFormat,
            .depthTestEnabled = false,
            .depthWritesEnabled = false,
            .depthCompareOperation = CompareOperation::Always
        },
        .primitive = {
            .topology = PrimitiveTopology::PointList
        },
        .multisample = {
            .samples = m_samples
        }
    };

Filename: offscreen_rendering/offscreen.cpp

Notice:

  • the use of multisampling
  • CompareOperation::Always, which matches the disabled depth test
  • the point-list topology, so that the vertices are interpreted as points and not triangles
  • that blending is enabled, with settings for both color and alpha
  • the vertex buffer binding, which uses the size of the Offscreen::Vertex struct shown earlier

Data Upload

Having completed initialization, we need to pass the large vector of vertex data we generated earlier into our Offscreen object. We pass it in with setData:

void Offscreen::setData(const std::vector<Offscreen::Vertex> &data)
{
    m_pointCount = data.size();

    const DeviceSize dataByteSize = data.size() * sizeof(Offscreen::Vertex);
    BufferOptions bufferOptions = {
        .size = dataByteSize,
        .usage = BufferUsageFlagBits::VertexBufferBit | BufferUsageFlagBits::TransferDstBit,
        .memoryUsage = MemoryUsage::GpuOnly
    };
    m_dataBuffer = m_device.createBuffer(bufferOptions);
    const BufferUploadOptions uploadOptions = {
        .destinationBuffer = m_dataBuffer,
        .dstStages = PipelineStageFlagBit::VertexAttributeInputBit,
        .dstMask = AccessFlagBit::VertexAttributeReadBit,
        .data = data.data(),
        .byteSize = dataByteSize
    };

    // Initiate the data upload. We note the upload details so that we can
    // test to see when it is safe to destroy the staging buffer. We will check
    // at the end of each render function.
    m_stagingBuffers.emplace_back(m_queue.uploadBufferData(uploadOptions));
}

Filename: offscreen_rendering/offscreen.cpp

At the end of this function we also keep track of the staging buffer so it can be released later, once the upload has completed.
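Stripped of the KDGpu types, the pattern used at the end of each render function reduces to erasing finished uploads from the tracked list; Upload and isFinished below are placeholders for the real staging-buffer handle and its completion query, not KDGpu names:

```cpp
#include <algorithm>
#include <vector>

// Generic version of the cleanup pattern: drop every tracked upload whose
// completion test passes, keeping the rest for a later check.
template<typename Upload, typename Finished>
void releaseCompleted(std::vector<Upload> &uploads, Finished isFinished)
{
    uploads.erase(std::remove_if(uploads.begin(), uploads.end(), isFinished),
                  uploads.end());
}
```

Destroying a staging buffer while the GPU is still reading from it is invalid, which is why the list is polled each frame instead of being cleared immediately after the upload call.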


Updated on 2023-12-22 at 00:05:36 +0000