Hello Triangle Native

This example demonstrates the level of control over Vulkan that is available without the KDGpuExample helper API. To see which parts of this example are abstracted or simplified by KDGpuExample, check out the Hello Triangle example.

This example spells out the patterns that KDGpuExample normally handles for you, so it is more complex than the Hello Triangle example. You are expected to read that example first in order to fully understand this one.

Initialization

Filename: hello_triangle_native/main.cpp

Unlike in the Hello Triangle example, we have to instantiate a Vulkan API object, and then use that to create a Vulkan instance. We must also create a KDGui Window with dimensions, a title, and a quit handler. The handler stores a reference to our app and quits it when called. KDGui calls this handler when the windowing system requests that the window be closed.

Next, we need a Vulkan surface. Requesting one requires a different approach on each operating system:

#if defined(KDGUI_PLATFORM_WIN32)
    auto win32Window = dynamic_cast<Win32PlatformWindow *>(window.platformWindow());
    if (win32Window != nullptr) {
        surfaceOptions.hWnd = win32Window->handle();
    }
#endif

#if defined(KDGUI_PLATFORM_XCB)
    auto xcbWindow = dynamic_cast<LinuxXcbPlatformWindow *>(window.platformWindow());
    if (xcbWindow != nullptr) {
        surfaceOptions.connection = xcbWindow->connection();
        surfaceOptions.window = xcbWindow->handle();
    }
#endif

#if defined(KDGUI_PLATFORM_WAYLAND)
    auto waylandWindow = dynamic_cast<LinuxWaylandPlatformWindow *>(window.platformWindow());
    if (waylandWindow != nullptr) {
        surfaceOptions.display = waylandWindow->display();
        surfaceOptions.surface = waylandWindow->surface();
    }
#endif

#if defined(KDGUI_PLATFORM_COCOA)
    surfaceOptions.layer = createMetalLayer(&window);
#endif

    Surface surface = instance.createSurface(surfaceOptions);

Filename: hello_triangle_native/main.cpp

These are implementation details, but it's good to break things down to understand where the application depends on operating system functions. Note that even the OS-specific operations are simple helper functions and types.

Next, let's look at hardware-specific code. Unlike the OS-specific code, which is selected at compile time, this has to be determined at runtime. First, we look for an adapter of a certain type. Available types can be found at KDGpu::AdapterDeviceType.

    // Enumerate the adapters (physical devices) and select one to use. Here we look for
    // a discrete GPU. In a real app, we could fallback to an integrated one.
    Adapter *selectedAdapter = instance.selectAdapter(AdapterDeviceType::Default);

Filename: hello_triangle_native/main.cpp
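The comment in the snippet above mentions falling back to an integrated GPU. A minimal sketch of that preference order could look like the following. The DeviceType enum, the AdapterInfo struct, and pickAdapter are hypothetical stand-ins invented for illustration; they are not KDGpu types, and the real example simply asks for AdapterDeviceType::Default.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT KDGpu types.
enum class DeviceType { Other, IntegratedGpu, DiscreteGpu, Cpu };

struct AdapterInfo {
    std::string name;
    DeviceType type;
};

// Returns the index of the first adapter of the most preferred type that is
// present: discrete GPUs first, then integrated GPUs. Returns std::nullopt
// when neither kind of adapter is available.
std::optional<std::size_t> pickAdapter(const std::vector<AdapterInfo> &adapters)
{
    const DeviceType preference[] = { DeviceType::DiscreteGpu, DeviceType::IntegratedGpu };
    for (DeviceType wanted : preference) {
        for (std::size_t i = 0; i < adapters.size(); ++i) {
            if (adapters[i].type == wanted)
                return i;
        }
    }
    return std::nullopt;
}
```

With this ordering, a machine that has both kinds of GPU gets the discrete one regardless of enumeration order, and a laptop with only an integrated GPU still gets a usable adapter.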

We can then query that adapter for features (KDGpu::AdapterFeatures) and properties (KDGpu::AdapterProperties). Features are things like multiview, precise occlusion queries, and other GPU-specific capabilities. Properties are things like the device's ID and the latest Vulkan API version that it supports.

    // We can easily query the adapter for various features, properties and limits.
    SPDLOG_LOGGER_DEBUG(appLogger, "maxBoundDescriptorSets = {}", selectedAdapter->properties().limits.maxBoundDescriptorSets);
    SPDLOG_LOGGER_DEBUG(appLogger, "multiDrawIndirect = {}", selectedAdapter->features().multiDrawIndirect);

Filename: hello_triangle_native/main.cpp

Next, we can query the available families of command queues. Command queues are interfaces to which you can submit lists of GPU commands, and the types of GPU commands you can submit to a queue depend on the family it belongs to. For example, submitting a draw call won't work if the only queue you have access to is a compute queue. To check that the hardware supports the queue family you want, combine all the KDGpu::QueueFlagBits that correspond to the family with a bitwise OR (|), like so:

    auto queueTypes = selectedAdapter->queueTypes();
    const bool hasGraphicsAndCompute = queueTypes[0].supportsFeature(QueueFlags(QueueFlagBits::GraphicsBit) | QueueFlags(QueueFlagBits::ComputeBit));
    SPDLOG_LOGGER_DEBUG(appLogger, "Queue family 0 graphics and compute support: {}", hasGraphicsAndCompute);

Filename: hello_triangle_native/main.cpp
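The check above boils down to ordinary bit-flag arithmetic. As a rough sketch of what "supports all of these flags" means, here is a standalone version; the QueueBits values and the supportsAll function are hypothetical and only mirror the idea, they are not the real KDGpu::QueueFlagBits or supportsFeature implementation.

```cpp
#include <cstdint>

// Hypothetical flag values for illustration; NOT the real KDGpu::QueueFlagBits.
enum QueueBits : std::uint32_t {
    GraphicsBit = 1u << 0,
    ComputeBit = 1u << 1,
    TransferBit = 1u << 2,
};

// True only if every bit set in `required` is also set in `supported`.
constexpr bool supportsAll(std::uint32_t supported, std::uint32_t required)
{
    return (supported & required) == required;
}
```

Comparing the masked value against `required`, rather than testing it for non-zero, is what makes this an "all of" check: a compute-only family would pass a non-zero test for graphics-and-compute but fails this one.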

The last of the hardware queries is for the swapchain properties that the adapter offers for our surface. Explore KDGpu::AdapterSwapchainProperties to see these in greater detail. The most important check here is whether presentation is supported for the given surface and queue type. Under the hood, this calls vkGetPhysicalDeviceSurfacePresentModesKHR, which enumerates "presentation modes." These are basically things like v-sync on, v-sync off, or hybrid. If none of these presentation modes are available, then presentation is not supported and your graphical app will fail or be unable to show anything on-screen.

    // We are now able to query the adapter for swapchain properties and presentation support with the window surface
    const auto swapchainProperties = selectedAdapter->swapchainProperties(surface);
    const bool supportsPresentation = selectedAdapter->supportsPresentation(surface, 0); // Query about the 1st queue type
    SPDLOG_LOGGER_DEBUG(appLogger, "Queue family 0 supports presentation: {}", supportsPresentation);

    if (!supportsPresentation || !hasGraphicsAndCompute) {
        SPDLOG_LOGGER_CRITICAL(appLogger, "Selected adapter queue family 0 does not meet requirements. Aborting.");
        return -1;
    }

Filename: hello_triangle_native/main.cpp

Next, we must create the swapchain. For the most part, this is just propagating values that we queried earlier into the options structs for the swapchain.

    // Now we can create a device from the selected adapter that we can then use to interact with the GPU.
    Device device = selectedAdapter->createDevice();
    Queue queue = device.queues()[0];
    SPDLOG_LOGGER_INFO(appLogger, "Created device with {} queues", device.queues().size());

    Swapchain swapchain;
    std::vector<TextureView> swapchainViews;
    Texture depthTexture;
    TextureView depthTextureView;

    Format swapchainFormat;
    Format depthTextureFormat;

    auto createSwapchain = [&] {
        const AdapterSwapchainProperties swapchainProperties = device.adapter()->swapchainProperties(surface);
        const SurfaceCapabilities &surfaceCapabilities = swapchainProperties.capabilities;

        // Create a swapchain of images that we will render to.
        const Extent2D swapchainExtent = {
            .width = std::clamp(window.width(), surfaceCapabilities.minImageExtent.width,
                                surfaceCapabilities.maxImageExtent.width),
            .height = std::clamp(window.height(), surfaceCapabilities.minImageExtent.height,
                                 surfaceCapabilities.maxImageExtent.height),
        };

        const SwapchainOptions swapchainOptions = {
            .surface = surface.handle(),
            .minImageCount = getSuitableImageCount(swapchainProperties.capabilities),
            .imageExtent = { .width = swapchainExtent.width, .height = swapchainExtent.height },
            .oldSwapchain = swapchain,
        };

        swapchain = device.createSwapchain(swapchainOptions);
        SPDLOG_LOGGER_INFO(appLogger, "Created swapchain with {} images", swapchain.textures().size());
        const auto &swapchainTextures = swapchain.textures();
        const auto swapchainTextureCount = swapchainTextures.size();

        swapchainViews.clear();
        swapchainViews.reserve(swapchainTextureCount);
        for (uint32_t i = 0; i < swapchainTextureCount; ++i) {
            auto view = swapchainTextures[i].createView({ .format = swapchainOptions.format });
            swapchainViews.push_back(std::move(view));
        }

        // Create a depth texture to use for rendering
        const TextureOptions depthTextureOptions = {
            .type = TextureType::TextureType2D,
            .format = Format::D24_UNORM_S8_UINT,
            .extent = { swapchainExtent.width, swapchainExtent.height, 1 },
            .mipLevels = 1,
            .usage = TextureUsageFlagBits::DepthStencilAttachmentBit,
            .memoryUsage = MemoryUsage::GpuOnly
        };
        depthTexture = device.createTexture(depthTextureOptions);
        depthTextureView = depthTexture.createView();
        SPDLOG_LOGGER_INFO(appLogger, "Created depth texture");

        swapchainFormat = swapchainOptions.format;
        depthTextureFormat = depthTextureOptions.format;
    };

    createSwapchain();

Filename: hello_triangle_native/main.cpp

However, note the depth texture portion. This provides a level of control which was not available when using KDGpuExample.

        // Create a depth texture to use for rendering
        const TextureOptions depthTextureOptions = {
            .type = TextureType::TextureType2D,
            .format = Format::D24_UNORM_S8_UINT,
            .extent = { swapchainExtent.width, swapchainExtent.height, 1 },
            .mipLevels = 1,
            .usage = TextureUsageFlagBits::DepthStencilAttachmentBit,
            .memoryUsage = MemoryUsage::GpuOnly
        };

Filename: hello_triangle_native/main.cpp
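The createSwapchain lambda calls getSuitableImageCount, which is not shown in this walkthrough. A common Vulkan idiom, and only an assumption about what that helper does, is to request one image more than the minimum and to respect the maximum when one is reported. The Capabilities struct and the suitableImageCount name below are hypothetical stand-ins, not KDGpu API:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in holding only the two fields this sketch needs.
struct Capabilities {
    std::uint32_t minImageCount;
    std::uint32_t maxImageCount; // In Vulkan, 0 means "no upper limit"
};

// Ask for one image more than the minimum so the application is less likely
// to stall waiting for the driver to release an image, but never exceed the
// reported maximum (when there is one).
std::uint32_t suitableImageCount(const Capabilities &caps)
{
    std::uint32_t count = caps.minImageCount + 1;
    if (caps.maxImageCount > 0)
        count = std::min(count, caps.maxImageCount);
    return count;
}
```

For a surface reporting a minimum of 2 and a maximum of 3, this yields 3 images; if the maximum is also 2, it stays at 2.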

The last difference between this example and Hello Triangle is that we must initialize some GPU synchronization primitives. These will be used to ensure that the steps of the render loop execute in the correct order. See the synchronization section for more details.

    const GpuSemaphore imageAvailableSemaphore = device.createGpuSemaphore();
    const GpuSemaphore renderCompleteSemaphore = device.createGpuSemaphore();
    Fence frameInFlightFence = device.createFence(FenceOptions{ .createSignalled = true });

Filename: hello_triangle_native/main.cpp

Per-Frame Render Logic

Our render loop begins like so:

    while (window.visible()) {

Filename: hello_triangle_native/main.cpp

The window's visible status will change based on updates from the OS/window manager.

Our first task on each frame is to acquire the swapchain image to render to. This requires that the swapchain's size and format match the current state of the presentation target. So, we call getNextImageIndex and recreate the swapchain if it reports that it is out of date:

        uint32_t currentImageIndex = 0;
        AcquireImageResult result = swapchain.getNextImageIndex(currentImageIndex, imageAvailableSemaphore);
        if (result == AcquireImageResult::OutOfDate) {
            // This can happen when swapchain was resized
            // We need to recreate the swapchain and retry
            createSwapchain();
            continue;
        }

Filename: hello_triangle_native/main.cpp
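To see the control flow of this "recreate and retry" pattern in isolation, here is a small simulation with no GPU involved. FakeSwapchain, AcquireResult, and runFrames are hypothetical names made up for this sketch; the only thing it shares with the real code is the shape of the loop.

```cpp
// Hypothetical toy types for illustration; these are NOT KDGpu classes.
enum class AcquireResult { Success, OutOfDate };

struct FakeSwapchain {
    int generation = 0; // Incremented every time the swapchain is recreated
    bool stale = false; // Set when the "window" has been resized

    AcquireResult acquire() const
    {
        return stale ? AcquireResult::OutOfDate : AcquireResult::Success;
    }

    void recreate()
    {
        ++generation;
        stale = false;
    }
};

// Runs `iterations` passes of the loop and simulates a resize just before
// pass `resizeAt`. Returns how many passes actually rendered a frame.
int runFrames(FakeSwapchain &swapchain, int iterations, int resizeAt)
{
    int rendered = 0;
    for (int i = 0; i < iterations; ++i) {
        if (i == resizeAt)
            swapchain.stale = true;

        if (swapchain.acquire() == AcquireResult::OutOfDate) {
            swapchain.recreate();
            continue; // Skip this pass; the next one acquires from the new swapchain
        }
        ++rendered; // Stands in for recording, submitting and presenting
    }
    return rendered;
}
```

The pass that hits the stale swapchain renders nothing, which is why a resize costs one loop iteration in this pattern.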

The reason for the imageAvailableSemaphore is explained in the synchronization section.

After this point, the render loop is pretty much identical to the render loop in Hello Triangle. However, the queue submission involves synchronization primitives which are fully exposed to us, so we can use this code as an example that shows what's actually going on in the KDGpuExample API.

Synchronization

In KDGpuExample examples, there is often a section at the end of each render which looks something like this:

    const SubmitOptions submitOptions = {
        .commandBuffers = { m_commandBuffer },
        .waitSemaphores = { m_presentCompleteSemaphores[m_inFlightIndex] },
        .signalSemaphores = { m_renderCompleteSemaphores[m_inFlightIndex] }
    };
    m_queue.submit(submitOptions);

Filename: hello_triangle/hello_triangle.cpp

What are the wait and signal semaphore arguments? When submitting batches of commands to the GPU, it is sometimes necessary for one set of commands to be executed before or after another set. In Vulkan, submitted work is not implicitly completed in the order it was submitted. Instead, it is necessary to explicitly define the dependencies between two sets of GPU commands. This is achieved with semaphores. We initialized some KDGpu::GpuSemaphore primitives earlier, and we will continue to use them each frame. Their destructors will clean up the Vulkan resources at the end of the program. Let's look at the steps of the pipeline which are declared as dependent using semaphores:

        AcquireImageResult result = swapchain.getNextImageIndex(currentImageIndex, imageAvailableSemaphore);

Filename: hello_triangle_native/main.cpp

The acquisition of the swapchain image has a related semaphore (imageAvailableSemaphore). KDGpu will pass this to the GPU so that, when the swapchain image becomes available, this semaphore is marked as signalled.

When submitting our draw commands, we put the imageAvailableSemaphore in the waitSemaphores argument, which KDGpu will attach to the queue submission so that the GPU will not try to draw anything until the swapchain image is available. Then, just as we attached the imageAvailableSemaphore to the swapchain query, we also attach the renderCompleteSemaphore to be signalled when the draw commands are completed. (We will come back to the signal fence.)

        queue.submit(SubmitOptions{
                .commandBuffers = { commands },
                .waitSemaphores = { imageAvailableSemaphore },
                .signalSemaphores = { renderCompleteSemaphore },
                .signalFence = frameInFlightFence,
        });

Filename: hello_triangle_native/main.cpp

Then we present the rendered scene to the surface. KDGpu has neatly folded this into the KDGpu::Queue class, so you can just call present on the same queue you submitted to. We wait for the renderCompleteSemaphore so that we don't present an unfinished render to the screen.

        queue.present(PresentOptions{
                .waitSemaphores = { renderCompleteSemaphore },
                .swapchainInfos = { { .swapchain = swapchain, .imageIndex = currentImageIndex } },
        });

Filename: hello_triangle_native/main.cpp
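The chain acquire, render, present is ordered purely by the two semaphores, not by the order of the calls. The following toy model illustrates that idea on the CPU. It is a deliberately simplified sketch: the Op struct and the execute function are hypothetical, each operation waits on at most one semaphore, and nothing here reflects how a real driver schedules work.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model: each operation waits on at most one semaphore
// and signals at most one. Semaphores are identified by index.
struct Op {
    std::string name;
    int waitSemaphore;   // -1 means "waits on nothing"
    int signalSemaphore; // -1 means "signals nothing"
};

// Repeatedly runs any operation whose wait semaphore is signalled and
// returns the names of the operations in the order they executed.
std::vector<std::string> execute(const std::vector<Op> &ops, int semaphoreCount)
{
    std::vector<bool> signalled(semaphoreCount, false);
    std::vector<bool> done(ops.size(), false);
    std::vector<std::string> order;

    bool progressed = true;
    while (progressed) {
        progressed = false;
        for (std::size_t i = 0; i < ops.size(); ++i) {
            if (done[i])
                continue;
            const int wait = ops[i].waitSemaphore;
            if (wait >= 0 && !signalled[wait])
                continue; // Dependency not satisfied yet
            if (wait >= 0)
                signalled[wait] = false; // Waiting consumes a binary semaphore's signal
            order.push_back(ops[i].name);
            if (ops[i].signalSemaphore >= 0)
                signalled[ops[i].signalSemaphore] = true;
            done[i] = true;
            progressed = true;
        }
    }
    return order;
}
```

If semaphore 0 plays the role of imageAvailableSemaphore and semaphore 1 the role of renderCompleteSemaphore, then listing the operations as present, render, acquire still executes them as acquire, render, present, because each one can only start once the semaphore it waits on has been signalled.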

The last thing that happens is we wait for the render (but not the presentation) to complete before allowing the loop to continue to the next iteration. Doing this requires a KDGpu::Fence. A fence is a synchronization primitive, just like a semaphore, but it exists in CPU address space, so it is meant to be used to block CPU processes. We created one earlier:

    Fence frameInFlightFence = device.createFence(FenceOptions{ .createSignalled = true });

Filename: hello_triangle_native/main.cpp

We start each loop by resetting the fence to its unsignalled state:

        frameInFlightFence.reset();

Filename: hello_triangle_native/main.cpp

And when submitting the render commands we set it as the signalFence so that it will be signalled when rendering has completed:

                .signalFence = frameInFlightFence,

Filename: hello_triangle_native/main.cpp

Finally, after submitting our commands and requesting presentation, we call wait on the fence to block the main thread until the GPU has completed rendering.

        frameInFlightFence.wait();

Filename: hello_triangle_native/main.cpp

Updated on 2024-08-28 at 00:05:10 +0000