Hello Triangle MSAA

This example introduces multi-sample anti-aliasing (MSAA), a hardware-accelerated technique for reducing jagged edges (aliasing) on geometry. MSAA stores several samples per pixel and resolves them into a final color, which visibly smooths edges at a moderate cost in memory and bandwidth. Compare this to Hello Triangle to see the difference.

The example uses the KDGpuExample helper API for simplified setup.

hello_triangle_msaa.png

Overview

What this example demonstrates:

  • Creating multi-sampled textures for rendering
  • Configuring graphics pipelines for MSAA
  • Resolving multi-sampled images to regular textures
  • Dynamically switching between MSAA sample counts (1x, 2x, 4x, 8x)

Visual impact:

  • Without MSAA (1x): Jagged, stair-stepped edges on triangle
  • With MSAA (4x/8x): Smooth, anti-aliased edges

Vulkan Requirements

  • Vulkan Version: 1.0+
  • Extensions: None (MSAA is core functionality)
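  • Device Features: None, but every sample count you use must be supported by the device (query VkPhysicalDeviceLimits::framebufferColorSampleCounts, reached through VkPhysicalDeviceProperties::limits)

The Vulkan specification requires every device to support 1x (no MSAA) and 4x for color framebuffer attachments. 2x and 8x are widely available, and some devices support higher counts.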
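In raw Vulkan the supported counts arrive as a bitmask in which each VkSampleCountFlagBits value equals its sample count (0x1 for 1x, 0x4 for 4x, and so on). The sketch below shows one way to turn such a mask into a usable list and to clamp a requested count to what the device offers. It uses plain integers rather than Vulkan or KDGpu types, and both function names are hypothetical helpers invented for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Vulkan or KDGpu. `mask` is laid out like
// VkSampleCountFlags: the bit value equals the sample count (0x1 = 1x,
// 0x2 = 2x, 0x4 = 4x, ... 0x40 = 64x). Returns the counts in ascending order.
std::vector<std::uint32_t> supportedSampleCounts(std::uint32_t mask)
{
    std::vector<std::uint32_t> counts;
    for (std::uint32_t bit = 1; bit <= 64; bit <<= 1) {
        if (mask & bit)
            counts.push_back(bit);
    }
    return counts;
}

// Hypothetical helper: returns the largest supported count that does not
// exceed `requested`, falling back to 1 (no MSAA) when nothing else fits.
std::uint32_t clampSampleCount(std::uint32_t mask, std::uint32_t requested)
{
    std::uint32_t best = 1;
    for (std::uint32_t count : supportedSampleCounts(mask)) {
        if (count <= requested)
            best = count;
    }
    return best;
}
```

With a mask of 0x7 (1x, 2x and 4x), a request for 8x is clamped to 4x. With a mask of 0x5 (1x and 4x only), a request for 2x falls back to 1x.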

Key Concepts

What is Anti-Aliasing?

When rendering geometry to a pixel grid, edges that don't align with pixel boundaries appear jagged or "aliased" - the stair-step effect. Sampling-based anti-aliasing techniques such as MSAA smooth these edges by sampling multiple points within each pixel and averaging the results.

Multi-Sample Anti-Aliasing (MSAA):

MSAA is a hardware-accelerated anti-aliasing method:

  1. Render to multi-sampled texture: Instead of 1 color per pixel, store N samples (2x, 4x, 8x, etc.)
  2. Fragment shader runs once per pixel: Each covered pixel is shaded once per primitive rather than once per sample, so shading cost stays close to the 1x case (unlike supersampling)
  3. Coverage mask: Hardware determines which samples each triangle covers
  4. Resolve operation: Average the N samples to produce final pixel color

MSAA only anti-aliases geometry edges - textures and shader-computed patterns aren't affected. For those, you need supersampling (SSAA) or other techniques.
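The coverage step can be illustrated on the CPU. The sketch below is a simplified model, not how any particular GPU implements it: the "triangle" is reduced to a half-plane left of a vertical edge, the four sample offsets are illustrative values in a rotated-grid style, and all type and function names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

struct Vec2 { float x; float y; };
struct Color { float r; float g; float b; };

constexpr std::size_t kSampleCount = 4;

// Illustrative 4x sample offsets inside a unit pixel (rotated-grid style).
// Assumption for this sketch; actual hardware positions may differ.
constexpr std::array<Vec2, kSampleCount> kSampleOffsets = {{
    { 0.375f, 0.125f }, { 0.875f, 0.375f }, { 0.125f, 0.625f }, { 0.625f, 0.875f }
}};

// One multi-sampled pixel: N stored colors instead of one.
using PixelSamples = std::array<Color, kSampleCount>;

// Builds the coverage mask for a primitive that covers everything to the
// left of the vertical line x = edgeX (a stand-in for a triangle edge).
// Bit i is set when sample i of the pixel at pixelX lies inside.
std::uint32_t coverageMask(float pixelX, float edgeX)
{
    std::uint32_t mask = 0;
    for (std::size_t i = 0; i < kSampleCount; ++i) {
        if (pixelX + kSampleOffsets[i].x < edgeX)
            mask |= (1u << i);
    }
    return mask;
}

// Writes a single shaded fragment color into the covered samples only.
// The color is computed once per pixel; uncovered samples keep their value.
void writeFragment(PixelSamples& pixel, std::uint32_t mask, Color fragment)
{
    for (std::size_t i = 0; i < kSampleCount; ++i) {
        if (mask & (1u << i))
            pixel[i] = fragment;
    }
}
```

For a pixel starting at x = 10 and an edge at x = 10.5, samples 0 and 2 are covered, so half of the stored samples receive the triangle color. After the resolve, that pixel ends up as an even blend of triangle and background, which is what softens the edge.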

Sample Counts (VkSampleCountFlagBits):

  • 1x (Samples1Bit): No MSAA - standard aliased rendering
  • 2x (Samples2Bit): 2 samples per pixel - minimal smoothing, low cost
  • 4x (Samples4Bit): 4 samples per pixel - good quality/performance balance (most common)
  • 8x (Samples8Bit): 8 samples per pixel - high quality, higher cost
  • 16x/32x/64x: Device-dependent, diminishing returns

For more on MSAA: https://www.khronos.org/opengl/wiki/Multisampling

Implementation

Creating the Multi-Sampled Texture:

Multi-sampled textures are created in the same way as regular textures. Key differences:

  • samples: Set to desired sample count instead of 1
  • usage: TextureUsageFlagBits::ColorAttachmentBit - used as render target
  • Memory cost: N samples means N× memory (4x MSAA = 4× texture memory)

The multi-sampled texture stores intermediate rendering with multiple samples per pixel.

void HelloTriangleMSAA::createRenderTarget()
{

    const TextureOptions options = {
        .type = TextureType::TextureType2D,
        .format = m_swapchainFormat,
        .extent = { .width = m_swapchainExtent.width, .height = m_swapchainExtent.height, .depth = 1 },
        .mipLevels = 1,
        .samples = m_samples.get(),
        .usage = TextureUsageFlagBits::ColorAttachmentBit,
        .memoryUsage = MemoryUsage::GpuOnly,
        .initialLayout = TextureLayout::Undefined
    };
    m_msaaTexture = m_device.createTexture(options);
    m_msaaTextureView = m_msaaTexture.createView();
}

Filename: hello_triangle_msaa/hello_triangle_msaa.cpp

Configuring the Render Pass for MSAA:

The render pass needs three MSAA-specific settings on top of the usual attachment configuration:

  • view: Multi-sampled texture where rendering happens
  • resolveView: Regular texture (swapchain image) where resolved result goes
  • samples: Must match the multi-sampled texture's sample count

The resolve operation happens automatically at the end of the render pass, averaging the multi-sampled data into the final image.

Pipeline Multisample Configuration:

The pipeline must be configured for the same sample count as the render target. Vulkan does not allow mismatches between pipeline and render pass sample counts.

        .multisample = {
            .samples = samples
        }

Filename: hello_triangle_msaa/hello_triangle_msaa.cpp

Resolve Operation:

The resolve happens implicitly when the render pass ends. Vulkan combines the N samples at each pixel location (an average, for the usual floating-point and normalized color formats) and writes the result to the resolveView. GPUs generally handle this in fixed-function hardware, so it is cheap relative to the rendering itself.

From the spec: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkResolveModeFlagBits.html
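To make the averaging concrete, here is a CPU sketch of a box-filter resolve. It is only a model of the arithmetic: the GPU does not run code like this, and the Rgba struct, the function name, and the flat "sampleCount consecutive entries per pixel" layout are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <vector>

struct Rgba { float r; float g; float b; float a; };

// Box-filter resolve sketch (hypothetical helper). `samples` holds
// `sampleCount` consecutive entries per pixel; the result holds one
// averaged entry per pixel. Returns an empty vector for malformed input.
std::vector<Rgba> resolveAverage(const std::vector<Rgba>& samples, std::size_t sampleCount)
{
    std::vector<Rgba> resolved;
    if (sampleCount == 0 || samples.size() % sampleCount != 0)
        return resolved;

    const float weight = 1.0f / static_cast<float>(sampleCount);
    resolved.reserve(samples.size() / sampleCount);

    for (std::size_t base = 0; base < samples.size(); base += sampleCount) {
        Rgba sum{ 0.0f, 0.0f, 0.0f, 0.0f };
        for (std::size_t s = 0; s < sampleCount; ++s) {
            const Rgba& c = samples[base + s];
            sum.r += c.r;
            sum.g += c.g;
            sum.b += c.b;
            sum.a += c.a;
        }
        resolved.push_back({ sum.r * weight, sum.g * weight, sum.b * weight, sum.a * weight });
    }
    return resolved;
}
```

A pixel whose four samples are half red and half black resolves to a red channel of 0.5, while a pixel fully inside the triangle resolves to the unchanged triangle color.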

    auto opaquePass = commandRecorder.beginRenderPass(KDGpu::RenderPassCommandRecorderOptions{
        .colorAttachments = {
            {
                .view = isMsaaEnabled() ? m_msaaTextureView : m_swapchainViews.at(m_currentSwapchainImageIndex),
                .resolveView = isMsaaEnabled() ? m_swapchainViews.at(m_currentSwapchainImageIndex) : Handle<TextureView_t>{},
                .clearValue = { 0.3f, 0.3f, 0.3f, 1.0f },
                .finalLayout = TextureLayout::PresentSrc,
            },
        },
        .depthStencilAttachment = {
            .view = m_depthTextureView,
        },
        // configure for multisampling
        .samples = m_samples.get(),
    });

Filename: hello_triangle_msaa/hello_triangle_msaa.cpp

Dynamic Sample Count:

This example creates one pipeline per supported sample count and switches between them from the UI. When changing sample counts:

  1. Recreate the multi-sampled texture with new sample count
  2. Switch to pipeline configured for that sample count
  3. Update render pass configuration

The helper API populates m_supportedSampleCounts by querying device limits.

void HelloTriangleMSAA::setMsaaSampleCount(SampleCountFlagBits samples)
{
    if (samples == m_samples.get())
        return;

    // get new pipeline
    for (size_t i = 0; i < m_supportedSampleCounts.size(); ++i) {
        if (m_supportedSampleCounts[i] == samples) {
            m_currentPipelineIndex = i;
            break;
        }
    }

    // the ExampleEngineLayer will recreate the depth view when we do this
    m_samples = samples;

    // we must also refresh the view(s) we handle, and reattach them
    createRenderTarget();
}

Filename: hello_triangle_msaa/hello_triangle_msaa.cpp
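The lookup loop in setMsaaSampleCount leaves m_currentPipelineIndex unchanged when the requested count is not in m_supportedSampleCounts, which is safe as long as the UI only offers supported values. If you adapt the code to accept arbitrary input, it may be worth making the "not found" case explicit. The sketch below shows one way to do that; it uses plain integers instead of KDGpu's SampleCountFlagBits, and the function name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not part of the example or of KDGpu. Assumes the
// pipeline at index i was built for supportedCounts[i], mirroring the
// example's table. Returns std::nullopt when the requested count is not
// supported, so the caller can keep the current state instead of ending
// up with a pipeline and render target that disagree.
std::optional<std::size_t> pipelineIndexFor(const std::vector<std::uint32_t>& supportedCounts,
                                            std::uint32_t requested)
{
    for (std::size_t i = 0; i < supportedCounts.size(); ++i) {
        if (supportedCounts[i] == requested)
            return i;
    }
    return std::nullopt;
}
```

A caller would update the sample count and recreate the render target only when the function returns a value, and otherwise ignore the request.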

Performance Notes

Memory Cost:

  • MSAA increases color/depth buffer memory by the sample count multiplier
  • 4x MSAA = 4× memory, 8x MSAA = 8× memory
  • Resolved output (swapchain) is still single-sampled
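As a rough worked example of the multiplier, the sketch below estimates the size of a single attachment. It is a lower bound under simple assumptions: real allocations can add padding, alignment, or compression metadata, and the function name is a hypothetical helper.

```cpp
#include <cstdint>

// Hypothetical helper: estimates attachment size as
// width * height * bytesPerTexel * samples. Ignores padding, alignment
// and any driver-side compression metadata.
std::uint64_t attachmentBytes(std::uint32_t width, std::uint32_t height,
                              std::uint32_t bytesPerTexel, std::uint32_t samples)
{
    return static_cast<std::uint64_t>(width) * height * bytesPerTexel * samples;
}
```

For a 1920×1080 color attachment at 4 bytes per texel, this gives 8,294,400 bytes (about 7.9 MiB) at 1x and 33,177,600 bytes (about 31.6 MiB) at 4x. A multi-sampled depth attachment scales the same way.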

Bandwidth Cost:

  • Rendering writes N samples per pixel (higher bandwidth)
  • Resolve operation adds bandwidth cost (read multi-sampled, write single-sampled)
  • On tile-based GPUs (mobile), resolve cost is lower due to on-chip storage

Shading Cost:

  • MSAA: Fragment shader runs once per pixel (good)
  • Supersampling (SSAA): Fragment shader runs once per sample (N× cost)
  • MSAA is much more performant than SSAA

Recommended Settings:

  • Desktop: 4x MSAA is standard, 8x for high-end
  • Mobile: 2x or 4x MSAA (bandwidth limited)
  • VR: Often disabled or 2x (need high frame rates)

Tile-Based Rendering: Mobile GPUs (Arm Mali, Qualcomm Adreno, Apple) use tile-based architectures, often referred to as tile-based deferred rendering (TBDR). MSAA is very efficient on these GPUs because multi-sampled data can stay on-chip (tile memory) and only the resolved result goes to main memory.

Updated on 2026-03-31 at 00:02:07 +0000