Render to Texture (Dynamic Rendering with Local Read)

Overview
Traditional Vulkan requires creating VkRenderPass and VkFramebuffer objects to define multi-pass rendering. These API objects are verbose to create and inflexible—changing attachment formats or load/store operations requires rebuilding the entire render pass. Vulkan 1.3 introduced Dynamic Rendering, which eliminates these objects by specifying attachments directly in the command buffer.
However, dynamic rendering initially lacked a critical feature: the ability to read from attachments within the same render pass (like subpass input attachments). The VK_KHR_dynamic_rendering_local_read extension (and VK_EXT_shader_tile_image for mobile) fills this gap. You can now render to an attachment in one draw call and read it back at the same pixel location in a subsequent draw call, all within a single dynamic render pass.
This example demonstrates a two-pass render-to-texture pipeline:
- Scene pass: Render scene geometry to an intermediate color attachment
- Post-processing pass: Read that color attachment (the output of the scene pass) as an input attachment, apply a desaturation effect, and render to the final color attachment
Both passes occur within one dynamic rendering scope, avoiding the overhead of separate render passes. On tile-based GPUs (mobile, Apple Silicon), attachments remain in on-chip tile memory between passes, preserving the performance benefits of traditional subpasses.
Vulkan Requirements
- Vulkan Version: 1.3 (for dynamic rendering core support)
- Required Extensions:
  - VK_KHR_dynamic_rendering_local_read (enables reading attachments during dynamic rendering)
  - Alternatively: VK_EXT_shader_tile_image (mobile-focused tile image extension)
- Required Features:
  - dynamicRendering: Enables render pass-free rendering
  - dynamicRenderingLocalRead: Allows reading from local attachments within a dynamic render pass
- Dynamic Rendering: The render pass must be begun with RenderPassCommandRecorderWithDynamicRenderingOptions, which supplies the dynamic rendering options
Key Concepts
Dynamic Rendering:
Instead of pre-creating VkRenderPass and VkFramebuffer, you begin rendering by directly specifying attachments:
Filename: render_to_texture_subpass_dynamic_rendering/render_to_texture_subpass_dynamic_rendering.cpp
This eliminates hundreds of lines of boilerplate and makes render pass configuration data-driven.
Local Read:
Traditional subpasses let a fragment shader read an attachment written by an earlier subpass by declaring it as an input attachment (the input_attachment_index layout qualifier in GLSL).
Local read extends this to dynamic rendering. Put the attachment in the TextureLayout::DynamicLocalRead layout and bind it as an input attachment for subsequent draws within the same render pass:
Filename: render_to_texture_subpass_dynamic_rendering/render_to_texture_subpass_dynamic_rendering.cpp
On-Tile Memory (TBDR GPUs):
Tile-based deferred rendering GPUs (ARM Mali, Qualcomm Adreno, Apple GPUs) render to on-chip "tile memory" before writing to main memory. Local reads keep data on-chip between passes, avoiding costly round trips to main memory:
- Traditional subpasses: Keep data on-chip automatically
- Dynamic rendering + local read: Achieves the same optimization with simpler API
For mobile VR and high-resolution rendering, this bandwidth saving is critical.
Shader Integration:
The scene pass fragment shader writes its output to layout(location = 0), which the pipeline's dynamicOutputLocations maps to color attachment 0:
Filename: assets/shaders/examples/render_to_texture_subpass/rotating_triangle.frag
The post-process fragment shader declares its input using the standard subpassInput type, bound to input_attachment_index = 0.
This index corresponds to the remapping set up by dynamicInputLocations in the post-process pipeline — color attachment 0 is exposed as input attachment 0:
Filename: assets/shaders/examples/render_to_texture_subpass/desaturate.frag
The input is read with subpassLoad, and the result is written to layout(location = 0) which the pipeline maps to color attachment 1 (the swapchain image):
Filename: assets/shaders/examples/render_to_texture_subpass/desaturate.frag
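As a rough illustration of the read-then-write pattern described above, here is a CPU-side C++ sketch of what a desaturating post-process pass does conceptually. It is not the contents of desaturate.frag: the `Color` and `Attachment` types, the `postProcessPass` function and the Rec. 709 luminance weights are all assumptions made for this sketch. What it tries to show is that an input attachment read only ever sees the value at the pixel currently being shaded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an RGBA color; not a KDGpu or Vulkan type.
struct Color {
    float r, g, b, a;
};

// Hypothetical stand-in for a color attachment: one Color per pixel.
struct Attachment {
    std::size_t width;
    std::size_t height;
    std::vector<Color> pixels;

    Attachment(std::size_t w, std::size_t h)
        : width(w), height(h), pixels(w * h, Color{ 0.0f, 0.0f, 0.0f, 1.0f })
    {
    }

    Color &at(std::size_t x, std::size_t y) { return pixels[y * width + x]; }
    const Color &at(std::size_t x, std::size_t y) const { return pixels[y * width + x]; }
};

// Assumed effect: replace RGB with its Rec. 709 luminance and keep alpha.
Color desaturate(const Color &c)
{
    const float luma = 0.2126f * c.r + 0.7152f * c.g + 0.0722f * c.b;
    return Color{ luma, luma, luma, c.a };
}

// Emulates the post-process pass: for every pixel, read the input attachment
// at that same pixel (the role subpassLoad plays in the shader) and write the
// result to the output attachment. No neighboring pixel is ever read.
void postProcessPass(const Attachment &input, Attachment &output)
{
    assert(input.width == output.width && input.height == output.height);
    for (std::size_t y = 0; y < input.height; ++y) {
        for (std::size_t x = 0; x < input.width; ++x) {
            output.at(x, y) = desaturate(input.at(x, y));
        }
    }
}
```

Calling `postProcessPass(scene, result)` with two attachments of equal size fills `result` with the grayscale version of `scene`. Effects that need neighboring pixels (a blur, for instance) do not fit this access pattern.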
Implementation Details
Dynamic Rendering Configuration:
- Specify all attachments when beginning the render pass.
- Specify all attachments when creating the pipelines, stating for each one whether it is used and, if so, which shader input or output index it is bound to.
- BindGroups can use InputAttachmentBinding bindings with layout DynamicLocalRead to enable sampling from input attachments. The DynamicLocalRead layout tells the driver this attachment will be read later in the same pass.
Graphics Pipeline Setup:
Pipelines must declare their attachment formats in dynamicRendering options (since no VkRenderPass provides this info). This replaces the renderPass field used in traditional pipelines.
Attachment Location and Input Remapping
Dynamic rendering with local read introduces a two-level remapping system that controls how color attachments are wired to shader inputs and outputs. Both levels must agree for the draw to be valid.
Level 1 — Pipeline declaration:
Each pipeline statically declares which attachments it reads from and writes to, baked into the pipeline object at creation time. This maps to VkRenderingAttachmentLocationInfoKHR and VkRenderingInputAttachmentIndexInfoKHR in the pNext chain.
- dynamicInputLocations.inputColorAttachments[i]: for each color attachment index i, declares whether it is exposed as an input attachment and, if so, which input attachment index (remappedIndex) the shader sees it under.
- dynamicOutputLocations.outputAttachments[i]: for each color attachment index i, declares whether the pipeline writes to it and, if so, which fragment shader output location (remappedIndex) drives it.
Setting .enabled = false marks an attachment as unused for that pipeline — the pipeline will neither read from nor write to it, even though the attachment exists in the dynamic render pass.
Main scene pipeline declaration — no input attachments, frag output 0 writes only to color attachment 0, color attachment 1 unused:
Filename: render_to_texture_subpass_dynamic_rendering/render_to_texture_subpass_dynamic_rendering.cpp
Post-process pipeline declaration — color attachment 0 is exposed as input attachment 0, frag output 0 writes only to color attachment 1, color attachment 0 output is unused:
Filename: render_to_texture_subpass_dynamic_rendering/render_to_texture_subpass_dynamic_rendering.cpp
Level 2 — Per-draw command buffer state (vkCmdSet*):
When switching pipelines within the same render pass, the render pass instance must also be updated to match the new pipeline's declared mapping. This is done with setInputAttachmentMapping() and setOutputAttachmentMapping(), which call vkCmdSetRenderingInputAttachmentIndicesKHR and vkCmdSetRenderingAttachmentLocationsKHR respectively.
The array passed to each call mirrors the pipeline's declaration:
- setOutputAttachmentMapping(locations): locations[i] is the fragment shader output location that writes to color attachment i. Pass std::nullopt to mark attachment i as unused for output.
- setInputAttachmentMapping(inputIndices, ...): inputIndices[i] is the input attachment index that the shader reads color attachment i from. Pass std::nullopt to mark attachment i as unused for input.
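The requirement that both levels agree can be modeled in a few lines of plain C++. The sketch below is not KDGpu or Vulkan code: `AttachmentMapping`, `PipelineDeclaration`, `CommandBufferState` and `mappingsAgree` are hypothetical names invented here, and the real validation is done by the driver and validation layers. It only assumes the convention stated above, where entry i describes color attachment i and std::nullopt means unused.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model: entry i describes color attachment i.
// std::nullopt means "unused"; a value is the remapped shader index.
using AttachmentMapping = std::vector<std::optional<std::uint32_t>>;

// Level 1: what a pipeline declared at creation time.
struct PipelineDeclaration {
    AttachmentMapping inputIndices;    // input attachment index per color attachment
    AttachmentMapping outputLocations; // fragment output location per color attachment
};

// Level 2: what was most recently set on the command buffer.
struct CommandBufferState {
    AttachmentMapping inputIndices;
    AttachmentMapping outputLocations;
};

// Both levels must describe the same wiring for a draw to be valid.
bool mappingsAgree(const PipelineDeclaration &pipeline, const CommandBufferState &state)
{
    return pipeline.inputIndices == state.inputIndices
        && pipeline.outputLocations == state.outputLocations;
}

// Scene pipeline: no inputs, output location 0 writes color attachment 0.
PipelineDeclaration scenePipeline()
{
    return PipelineDeclaration{ AttachmentMapping{ std::nullopt, std::nullopt },
                                AttachmentMapping{ 0u, std::nullopt } };
}

// Post-process pipeline: color attachment 0 is input attachment 0,
// output location 0 writes color attachment 1.
PipelineDeclaration postProcessPipeline()
{
    return PipelineDeclaration{ AttachmentMapping{ 0u, std::nullopt },
                                AttachmentMapping{ std::nullopt, 0u } };
}

// Stand-in for issuing a draw: this sketch simply asserts on a mismatch.
void recordDraw(const PipelineDeclaration &pipeline, const CommandBufferState &state)
{
    assert(mappingsAgree(pipeline, state));
}
```

In this model, switching from the scene pipeline to the post-process pipeline without also updating the command buffer state makes `mappingsAgree` return false, which is the situation the two set-mapping calls exist to prevent.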
Multi-Pass Execution:
Pass 1 — set state for main scene pipeline, render geometry into color attachment 0, leave attachment 1 untouched:
Filename: render_to_texture_subpass_dynamic_rendering/render_to_texture_subpass_dynamic_rendering.cpp
Pass 2 — update state for post-process pipeline: color attachment 0 is now read as input attachment 0, frag output 0 now routes to color attachment 1:
Filename: render_to_texture_subpass_dynamic_rendering/render_to_texture_subpass_dynamic_rendering.cpp
No nextSubpass() call needed—just update the mapping state and issue draws in sequence.
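To make the routing in the two passes concrete, the following self-contained C++ sketch simulates a single pixel with two color attachments. Everything in it is hypothetical: `RenderPassModel`, its `draw` method and `runTwoPasses` are illustrative names, the mapping members merely stand in for the state set by setInputAttachmentMapping() and setOutputAttachmentMapping(), and halving the value is an arbitrary placeholder for the real post-process effect.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>

constexpr std::size_t kAttachmentCount = 2;
using Mapping = std::array<std::optional<std::uint32_t>, kAttachmentCount>;
using Values = std::array<float, kAttachmentCount>;

// Hypothetical single-pixel model of one dynamic render pass.
struct RenderPassModel {
    Values attachments{ 0.0f, 0.0f };
    Mapping inputIndices{};    // stand-in for the input attachment mapping state
    Mapping outputLocations{}; // stand-in for the output attachment mapping state

    // A "fragment shader" maps input attachment values to output location 0.
    using Shader = std::function<float(const Values &inputs)>;

    void draw(const Shader &shader)
    {
        // Gather: expose each mapped color attachment under its input index.
        Values inputs{};
        for (std::size_t i = 0; i < kAttachmentCount; ++i) {
            if (inputIndices[i]) {
                assert(*inputIndices[i] < kAttachmentCount);
                inputs[*inputIndices[i]] = attachments[i];
            }
        }

        const float result = shader(inputs);

        // Scatter: only attachments mapped to output location 0 are written.
        for (std::size_t i = 0; i < kAttachmentCount; ++i) {
            if (outputLocations[i] && *outputLocations[i] == 0u)
                attachments[i] = result;
        }
    }
};

// Runs both passes in sequence inside the same model render pass.
Values runTwoPasses(float sceneValue)
{
    RenderPassModel pass;

    // Pass 1: no inputs, output location 0 -> color attachment 0.
    pass.inputIndices = Mapping{ std::nullopt, std::nullopt };
    pass.outputLocations = Mapping{ 0u, std::nullopt };
    pass.draw([sceneValue](const Values &) { return sceneValue; });

    // Pass 2: color attachment 0 -> input 0, output location 0 -> attachment 1.
    pass.inputIndices = Mapping{ 0u, std::nullopt };
    pass.outputLocations = Mapping{ std::nullopt, 0u };
    pass.draw([](const Values &inputs) { return inputs[0] * 0.5f; });

    return pass.attachments;
}
```

After `runTwoPasses(0.8f)`, attachment 0 still holds the scene value and attachment 1 holds the processed value. Only the mapping state changed between the two draws; the render pass itself was never restarted.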
Memory Barrier (if needed):
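A draw that reads an attachment written by an earlier draw in the same render pass has a read-after-write dependency on it. Insert a pipeline barrier between the two draws so that the writes are visible to the input attachment read: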
Filename: render_to_texture_subpass_dynamic_rendering/render_to_texture_subpass_dynamic_rendering.cpp
On tile-based architectures such a barrier is usually cheap, since a by-region dependency can be resolved per tile without flushing tile memory to main memory.
Performance Notes
Tile Memory Efficiency: On mobile GPUs, local reads keep intermediate data on-chip:
- Without local read: G-Buffer written to RAM (bandwidth cost), then read back (bandwidth cost)
- With local read: G-Buffer stays in tile memory (~95% bandwidth saving)
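For a 2048x2048 G-Buffer with 3 textures at 8 bytes per texel (for example RGBA16F), one copy occupies 2048 × 2048 × 3 × 8 bytes = 96 MiB. Avoiding both the write to main memory and the read back saves about 192 MiB of bandwidth per frame.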
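The arithmetic behind these figures can be checked with a short C++ sketch. The function names are made up for illustration, and the 8 bytes per texel (for example an RGBA16F texture) is an assumption: it is the texel size that yields 96 MiB for three 2048x2048 textures, whereas 16 bytes per texel would give 192 MiB.

```cpp
#include <cstdint>

constexpr std::uint64_t kMiB = 1024ull * 1024ull;

// Bytes needed to store every G-Buffer texture once.
constexpr std::uint64_t gBufferBytes(std::uint64_t width, std::uint64_t height,
                                     std::uint64_t textureCount,
                                     std::uint64_t bytesPerTexel)
{
    return width * height * textureCount * bytesPerTexel;
}

// Without local read the data is written to main memory once and read back
// once, so the traffic is twice the stored size.
constexpr std::uint64_t roundTripBytes(std::uint64_t storedBytes)
{
    return 2 * storedBytes;
}

// 2048 x 2048 pixels, 3 textures, 8 bytes per texel (assumed).
static_assert(gBufferBytes(2048, 2048, 3, 8) == 96 * kMiB,
              "three 8-byte-per-texel textures occupy 96 MiB");
static_assert(roundTripBytes(gBufferBytes(2048, 2048, 3, 8)) == 192 * kMiB,
              "write plus read back moves 192 MiB per frame");
```

At 60 frames per second that round trip would amount to roughly 11 GiB/s of memory traffic for this one buffer, which is why keeping it in tile memory matters on bandwidth-constrained devices.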
Desktop GPUs: Less dramatic but still beneficial:
- Improved cache locality
- Reduced command buffer overhead
- Driver-side optimization opportunities
API Simplicity: Dynamic rendering removes the need to create and manage render pass and framebuffer objects, which simplifies application code and can lower CPU-side overhead.
Best Practices:
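- Use local read for multi-pass effects that only need the value at the current pixel, such as deferred shading and per-pixel post-processing chains; effects that sample neighboring pixels (SSAO, blur) still need a separate pass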
- Profile bandwidth with GPU tools (Snapdragon Profiler, ARM Streamline, Xcode Instruments)
- Consider VK_EXT_shader_tile_image for explicit tile memory control on supported hardware
- Minimize attachment format changes mid-pass
See also
- Render to Texture with Subpasses for traditional subpass-based implementation
- Hello Triangle MSAA with Dynamic Rendering for basic dynamic rendering
- Render to Texture for separate render pass approach
Further Reading
- Vulkan Dynamic Rendering Guide
- VK_KHR_dynamic_rendering_local_read Spec
- Tile-Based Deferred Rendering
- Shader Tile Image Extension
Updated on 2026-03-31 at 00:02:07 +0000