Descriptor Abstractions
We are about to grow the engine to support textures, which considerably increases its complexity, so we are going to need better abstractions for the descriptor sets.
In chapter 2, we already created two objects, the Descriptor_Allocator and the Descriptor_Layout_Builder. The Descriptor_Allocator gives us a basic way of abstracting a single vk.DescriptorPool to allocate descriptors, and the Descriptor_Layout_Builder abstracts creating descriptor set layouts.
Descriptor Allocator 2
We are going to create a new version of the Descriptor_Allocator, the Descriptor_Allocator_Growable. The one we created before will just crash when the pool runs out of space. This is fine for cases where we know the number of descriptors ahead of time, but it won't work when we need to load meshes from arbitrary files and can't know ahead of time how many descriptors we will need. This new structure behaves almost exactly the same, except that instead of handling a single pool, it handles a bunch of them. Whenever a pool fails to allocate, we create a new one. When this allocator gets cleared, it clears all of its pools. This way we can use 1 descriptor allocator and it will just grow as we need it to.
We will use fixed-size arrays with default maximums based on typical Vulkan implementations. These values can be too much or not enough for your use case, but we can adjust them as needed.
This is the implementation we will have in descriptors.odin.
// Core
import sa "core:container/small_array"
MAX_POOLS :: #config(MAX_POOLS, 32)
Ratios :: sa.Small_Array(MAX_POOL_SIZES, Pool_Size_Ratio)
Full_Pools :: sa.Small_Array(MAX_POOLS, vk.DescriptorPool)
Ready_Pools :: sa.Small_Array(MAX_POOLS, vk.DescriptorPool)
Descriptor_Allocator_Growable :: struct {
device: vk.Device,
ratios: Ratios,
full_pools: Full_Pools,
ready_pools: Ready_Pools,
sets_per_pool: u32,
}
descriptor_growable_init :: proc(
self: ^Descriptor_Allocator_Growable,
device: vk.Device,
max_sets: u32,
pool_ratios: []Pool_Size_Ratio,
) -> (
ok: bool,
) {
return
}
descriptor_growable_clear_pools :: proc(self: ^Descriptor_Allocator_Growable) -> (ok: bool) {
return
}
descriptor_growable_destroy_pools :: proc(self: Descriptor_Allocator_Growable) {
}
descriptor_growable_allocate :: proc(
self: ^Descriptor_Allocator_Growable,
layout: ^vk.DescriptorSetLayout,
pNext: rawptr = nil,
) -> (
ds: vk.DescriptorSet,
ok: bool,
) {
return
}
descriptor_growable_get_pool :: proc(
self: ^Descriptor_Allocator_Growable,
) -> (
pool: vk.DescriptorPool,
ok: bool,
) {
return
}
descriptor_growable_create_pool :: proc(
self: ^Descriptor_Allocator_Growable,
set_count: u32,
) -> (
pool: vk.DescriptorPool,
ok: bool,
) {
return
}
This is very similar to the other descriptor allocator. What has changed is that we now need to store the array of pool size ratios (for when we reallocate the pools), how many sets we allocate per pool, and 2 arrays. full_pools contains the pools we know we can't allocate from anymore, and ready_pools contains the pools that can still be used, or the freshly created ones.
The allocation logic will first grab a pool from ready_pools and try to allocate from it. If it succeeds, it will add the pool back into the ready_pools array. If it fails, it will put the pool in the full_pools array and get another pool to retry. The get_pool procedure will pick up a pool from ready_pools, or create a new one.
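To make the bookkeeping concrete, here is a minimal sketch of the same ready/full flow without any Vulkan calls. Mock_Pool, Mock_Allocator and the MAX_POOLS value used here are hypothetical stand-ins and not part of the engine: a mock pool only counts how many sets it has left.

```odin
package main

import "core:fmt"
import sa "core:container/small_array"

MAX_POOLS :: 8

// Hypothetical stand-in for vk.DescriptorPool: it only counts the sets it has left.
Mock_Pool :: struct {
	remaining: int,
}

Mock_Allocator :: struct {
	ready_pools:   sa.Small_Array(MAX_POOLS, Mock_Pool),
	full_pools:    sa.Small_Array(MAX_POOLS, Mock_Pool),
	sets_per_pool: int,
}

// Reuse a ready pool if there is one, otherwise "create" a fresh pool.
get_pool :: proc(self: ^Mock_Allocator) -> Mock_Pool {
	if sa.len(self.ready_pools) > 0 {
		return sa.pop_back(&self.ready_pools)
	}
	return Mock_Pool{remaining = self.sets_per_pool}
}

allocate :: proc(self: ^Mock_Allocator) {
	pool := get_pool(self)
	if pool.remaining == 0 {
		// Exhausted: retire it and retry with another pool
		sa.push_back(&self.full_pools, pool)
		pool = get_pool(self)
	}
	pool.remaining -= 1
	// The pool we allocated from always goes back into the ready array
	sa.push_back(&self.ready_pools, pool)
}

main :: proc() {
	allocator := Mock_Allocator{sets_per_pool = 2}
	for _ in 0 ..< 5 {
		allocate(&allocator)
	}
	// Two exhausted pools were retired, one pool is still usable
	fmt.println(sa.len(allocator.full_pools), sa.len(allocator.ready_pools)) // prints "2 1"
}
```

Note that a pool only moves to the full array on the allocation after it runs out, because that is the first moment the allocator learns it is exhausted.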
Let's write the get_pool and create_pool procedures.
descriptor_growable_get_pool :: proc(
self: ^Descriptor_Allocator_Growable,
) -> (
pool: vk.DescriptorPool,
ok: bool,
) {
if sa.len(self.ready_pools) > 0 {
// Pop from ready pools
pool = sa.pop_back(&self.ready_pools)
} else {
// Need to create a new pool
pool = descriptor_growable_create_pool(self, self.sets_per_pool) or_return
// Grow pool size by 50% each time, with an upper limit
self.sets_per_pool = u32(f32(self.sets_per_pool) * 1.5)
// Cap the growth at 4092 sets per pool. This is a tunable choice made to keep pools from
// growing without bound, so adjust it if another value suits your workload better
if self.sets_per_pool > 4092 {
self.sets_per_pool = 4092
}
}
return pool, true
}
descriptor_growable_create_pool :: proc(
self: ^Descriptor_Allocator_Growable,
set_count: u32,
) -> (
pool: vk.DescriptorPool,
ok: bool,
) {
pool_sizes: [MAX_POOL_SIZES]vk.DescriptorPoolSize
for i in 0 ..< sa.len(self.ratios) {
ratio := sa.get_ptr(&self.ratios, i)
pool_sizes[i] = vk.DescriptorPoolSize {
type = ratio.type,
// Scale in floating point first, so fractional ratios are not truncated to zero
descriptorCount = u32(f32(ratio.ratio) * f32(set_count)),
}
}
pool_info := vk.DescriptorPoolCreateInfo {
sType = .DESCRIPTOR_POOL_CREATE_INFO,
maxSets = set_count,
poolSizeCount = u32(sa.len(self.ratios)),
pPoolSizes = &pool_sizes[0],
}
vk_check(vk.CreateDescriptorPool(self.device, &pool_info, nil, &pool)) or_return
return pool, true
}
In get_pool, when we create a new pool, we increase sets_per_pool, to mimic something like an array resize. Still, we limit the maximum number of sets per pool to 4092 to keep it from growing too much. This limit can be modified if you find another value works better for your use case.
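As a quick illustration of that growth schedule, the sketch below isolates the resize rule in a standalone procedure. next_sets_per_pool and MAX_SETS_PER_POOL are hypothetical names introduced only for this example; the starting value of 1000 is the one we pass to the allocator later in this chapter.

```odin
package main

import "core:fmt"

// Hypothetical constant mirroring the cap used in get_pool.
MAX_SETS_PER_POOL :: 4092

// Grow by 50%, truncating to an integer, then clamp to the cap.
next_sets_per_pool :: proc(current: u32) -> u32 {
	grown := u32(f32(current) * 1.5)
	return min(grown, MAX_SETS_PER_POOL)
}

main :: proc() {
	sets: u32 = 1000
	// Sizes of the first six pools: 1000, 1500, 2250, 3375, 4092, 4092
	for i in 0 ..< 6 {
		fmt.printf("pool %d: %d sets\n", i, sets)
		sets = next_sets_per_pool(sets)
	}
}
```

With these numbers the cap is reached on the fifth pool, after which every new pool has the same size.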
An important detail of this procedure is that we remove the pool from the ready_pools array when grabbing it. This way, once a descriptor is allocated, we can add the pool back into that array or into the other one.
The create_pool procedure is the same as the one we had in the other descriptor allocator.
Let's create the other procedures we need: init, clear_pools, and destroy_pools.
descriptor_growable_init :: proc(
self: ^Descriptor_Allocator_Growable,
device: vk.Device,
max_sets: u32,
pool_ratios: []Pool_Size_Ratio,
) -> (
ok: bool,
) {
self.device = device
sa.clear(&self.ratios)
sa.clear(&self.full_pools)
sa.clear(&self.ready_pools)
for r in pool_ratios {
sa.push_back(&self.ratios, r)
}
new_pool := descriptor_growable_create_pool(self, max_sets) or_return
// Grow it next allocation
self.sets_per_pool = u32(f32(max_sets) * 1.5)
sa.push_back(&self.ready_pools, new_pool)
return true
}
descriptor_growable_clear_pools :: proc(self: ^Descriptor_Allocator_Growable) -> (ok: bool) {
// Reset ready pools
for &pool in sa.slice(&self.ready_pools) {
vk_check(vk.ResetDescriptorPool(self.device, pool, {})) or_return
}
// Reset full pools and move them to ready
for &pool in sa.slice(&self.full_pools) {
vk_check(vk.ResetDescriptorPool(self.device, pool, {})) or_return
sa.push_back(&self.ready_pools, pool)
}
sa.clear(&self.full_pools)
return true
}
descriptor_growable_destroy_pools :: proc(self: Descriptor_Allocator_Growable) {
// Destroy ready pools
for i in 0 ..< sa.len(self.ready_pools) {
pool := sa.get(self.ready_pools, i)
vk.DestroyDescriptorPool(self.device, pool, nil)
}
// Destroy full pools
for i in 0 ..< sa.len(self.full_pools) {
pool := sa.get(self.full_pools, i)
vk.DestroyDescriptorPool(self.device, pool, nil)
}
}
The init procedure just allocates the first descriptor pool and adds it to the ready_pools array.
Clearing the pools means resetting every pool and moving the ones in the full_pools array back into the ready_pools array.
Destroying loops over both arrays and destroys every pool, which clears the entire allocator.
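The clearing step can be sketched on its own with mock data. Mock_Pool, Pools and reset_all below are hypothetical names for this example only; resetting a mock pool simply restores its remaining count, where the real code calls vk.ResetDescriptorPool.

```odin
package main

import "core:fmt"
import sa "core:container/small_array"

// Hypothetical stand-in for a descriptor pool.
Mock_Pool :: struct {
	capacity:  int,
	remaining: int,
}

Pools :: sa.Small_Array(4, Mock_Pool)

// Reset every pool, then move the full ones back into the ready array.
reset_all :: proc(ready, full: ^Pools) {
	for &pool in sa.slice(ready) {
		pool.remaining = pool.capacity
	}
	for &pool in sa.slice(full) {
		pool.remaining = pool.capacity
		sa.push_back(ready, pool)
	}
	sa.clear(full)
}

main :: proc() {
	ready, full: Pools
	sa.push_back(&ready, Mock_Pool{capacity = 4, remaining = 1})
	sa.push_back(&full, Mock_Pool{capacity = 2, remaining = 0})
	sa.push_back(&full, Mock_Pool{capacity = 3, remaining = 0})

	reset_all(&ready, &full)
	fmt.println(sa.len(ready), sa.len(full)) // prints "3 0"
}
```

After the reset every pool is usable again, so the next frame starts allocating without creating anything new.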
Last is the new allocate procedure.
descriptor_growable_allocate :: proc(
self: ^Descriptor_Allocator_Growable,
layout: ^vk.DescriptorSetLayout,
pNext: rawptr = nil,
) -> (
ds: vk.DescriptorSet,
ok: bool,
) {
// Get or create a pool to allocate from
pool_to_use := descriptor_growable_get_pool(self) or_return
alloc_info := vk.DescriptorSetAllocateInfo {
pNext = pNext,
sType = .DESCRIPTOR_SET_ALLOCATE_INFO,
descriptorPool = pool_to_use,
descriptorSetCount = 1,
pSetLayouts = layout,
}
result := vk.AllocateDescriptorSets(self.device, &alloc_info, &ds)
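// Allocation failed because the pool is exhausted. Retire it and try again
if result == .ERROR_OUT_OF_POOL_MEMORY || result == .ERROR_FRAGMENTED_POOL {
sa.push_back(&self.full_pools, pool_to_use)
pool_to_use = descriptor_growable_get_pool(self) or_return
alloc_info.descriptorPool = pool_to_use
vk_check(vk.AllocateDescriptorSets(self.device, &alloc_info, &ds)) or_return
} else if result != .SUCCESS {
// Any other error is not recoverable here. Keep tracking the pool so it still gets
// destroyed, and report the failure instead of returning an invalid set
sa.push_back(&self.ready_pools, pool_to_use)
vk_check(result) or_return
}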
sa.push_back(&self.ready_pools, pool_to_use)
return ds, true
}
We first grab a pool, then allocate from it. If the allocation fails, we add the pool to the full_pools array (as we know this pool is filled) and then try again. If the second attempt fails too, something is completely broken, so the procedure just errors out and returns. Once we have allocated from a pool, we add it back into the ready_pools array.
Descriptor Writer
When we needed to create a descriptor set for our compute shader, we called Vulkan's vk.UpdateDescriptorSets() the manual way, but this is really annoying to deal with, so we are going to abstract that too. Our writer will have write_image and write_buffer procedures to bind the data. Let's look at the struct declaration, also in the descriptors.odin file.
MAX_IMAGE_INFOS :: #config(MAX_IMAGE_INFOS, 64)
MAX_BUFFER_INFOS :: #config(MAX_BUFFER_INFOS, 64)
MAX_WRITES :: #config(MAX_WRITES, 128)
Image_Infos :: sa.Small_Array(MAX_IMAGE_INFOS, vk.DescriptorImageInfo)
Buffer_Infos :: sa.Small_Array(MAX_BUFFER_INFOS, vk.DescriptorBufferInfo)
Writes :: sa.Small_Array(MAX_WRITES, vk.WriteDescriptorSet)
Descriptor_Writer :: struct {
device: vk.Device,
image_infos: Image_Infos,
buffer_infos: Buffer_Infos,
writes: Writes,
}
descriptor_writer_init :: proc(self: ^Descriptor_Writer, device: vk.Device) {
self.device = device
}
descriptor_writer_write_image :: proc(
self: ^Descriptor_Writer,
binding: int,
image: vk.ImageView,
sampler: vk.Sampler,
layout: vk.ImageLayout,
type: vk.DescriptorType,
loc := #caller_location,
) -> bool {
return false
}
descriptor_writer_write_buffer :: proc(
self: ^Descriptor_Writer,
binding: int,
buffer: vk.Buffer,
size: vk.DeviceSize,
offset: vk.DeviceSize,
type: vk.DescriptorType,
loc := #caller_location,
) -> bool {
return false
}
descriptor_writer_clear :: proc(self: ^Descriptor_Writer) {
}
descriptor_writer_update_set :: proc(
self: ^Descriptor_Writer,
set: vk.DescriptorSet,
loc := #caller_location,
) {
}
Let's look at the definition of vk.WriteDescriptorSet.
WriteDescriptorSet :: struct {
sType: StructureType,
pNext: rawptr,
dstSet: DescriptorSet,
dstBinding: u32,
dstArrayElement: u32,
descriptorCount: u32,
descriptorType: DescriptorType,
pImageInfo: ^DescriptorImageInfo,
pBufferInfo: ^DescriptorBufferInfo,
pTexelBufferView: ^BufferView,
}
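We have a target set, a target binding and array element, and the actual buffer or image, which is passed by pointer. We need to keep the vk.DescriptorBufferInfo and the other info structs somewhere their pointers stay stable, or have a way to fix up those pointers when building the final WriteDescriptorSet array. Here the fixed-size arrays live inline in the Descriptor_Writer, so element addresses do not change as we push, as long as the writer itself is not copied or moved before update_set is called.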
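The pointer stability requirement is easy to check in isolation. The sketch below uses a hypothetical Info struct in place of vk.DescriptorBufferInfo; it shows that a pointer into a small_array stays valid across further pushes, and that it no longer refers to the same storage once the array is copied.

```odin
package main

import "core:fmt"
import sa "core:container/small_array"

// Hypothetical stand-in for vk.DescriptorBufferInfo.
Info :: struct {
	value: int,
}

Infos :: sa.Small_Array(4, Info)

// Returns true if a pointer to element 0 still points at it after more pushes.
pointer_survives_pushes :: proc(infos: ^Infos) -> bool {
	sa.push_back(infos, Info{value = 1})
	first := sa.get_ptr(infos, 0)
	sa.push_back(infos, Info{value = 2})
	sa.push_back(infos, Info{value = 3})
	return first == sa.get_ptr(infos, 0) && first.value == 1
}

main :: proc() {
	infos: Infos
	fmt.println(pointer_survives_pushes(&infos)) // prints "true"

	// Copying the array copies its inline storage, so old pointers do not follow it
	moved := infos
	fmt.println(sa.get_ptr(&moved, 0) == sa.get_ptr(&infos, 0)) // prints "false"
}
```

This is why the writer should be filled and consumed in place: returning a Descriptor_Writer by value after recording writes would leave the stored pointers referring to the old copy.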
Let's look at what the write_buffer procedure does.
descriptor_writer_write_buffer :: proc(
self: ^Descriptor_Writer,
binding: int,
buffer: vk.Buffer,
size: vk.DeviceSize,
offset: vk.DeviceSize,
type: vk.DescriptorType,
loc := #caller_location,
) -> bool {
assert(sa.space(self.buffer_infos) != 0, "No space left in buffer_infos array", loc)
assert(sa.space(self.writes) != 0, "No space left in writes array", loc)
// Add buffer info
sa.push_back(
&self.buffer_infos,
vk.DescriptorBufferInfo{buffer = buffer, offset = offset, range = size},
)
info_ptr := sa.get_ptr(&self.buffer_infos, sa.len(self.buffer_infos) - 1)
// Add write
sa.push_back(
&self.writes,
vk.WriteDescriptorSet {
sType = .WRITE_DESCRIPTOR_SET,
dstBinding = u32(binding),
dstSet = 0, // Left empty for now until we need to write it
descriptorCount = 1,
descriptorType = type,
pBufferInfo = info_ptr,
},
)
return true
}
We have to fill a vk.DescriptorBufferInfo first, with the buffer itself, and then an offset and range (size) for it.
Then, we have to set up the write itself. It's only 1 descriptor, at the given binding slot, with the correct type, and a pointer to the vk.DescriptorBufferInfo.
The descriptor types that are allowed for a buffer are these.
// vk.DescriptorType
.UNIFORM_BUFFER
.STORAGE_BUFFER
.UNIFORM_BUFFER_DYNAMIC
.STORAGE_BUFFER_DYNAMIC
We already explained those types of buffers in the last chapter. When we want to bind one or the other type into a shader, we set the correct type here. Remember that it needs to match the usage when allocating the vk.Buffer.
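As a rough sketch of that matching rule, a small helper could map each buffer descriptor type to the usage flag the vk.Buffer is expected to have been created with. required_buffer_usage is a hypothetical name, not part of the engine or of the Vulkan bindings; the example only uses types from vendor:vulkan and makes no Vulkan calls.

```odin
package main

import "core:fmt"
import vk "vendor:vulkan"

// Hypothetical helper: which usage flag must the buffer carry for this descriptor type?
required_buffer_usage :: proc(
	type: vk.DescriptorType,
) -> (
	usage: vk.BufferUsageFlags,
	ok: bool,
) {
	#partial switch type {
	case .UNIFORM_BUFFER, .UNIFORM_BUFFER_DYNAMIC:
		return {.UNIFORM_BUFFER}, true
	case .STORAGE_BUFFER, .STORAGE_BUFFER_DYNAMIC:
		return {.STORAGE_BUFFER}, true
	}
	// Not one of the buffer descriptor types listed above
	return {}, false
}

main :: proc() {
	usage, ok := required_buffer_usage(.STORAGE_BUFFER_DYNAMIC)
	fmt.println(usage, ok)

	_, is_buffer_type := required_buffer_usage(.SAMPLER)
	fmt.println(is_buffer_type)
}
```

A check like this could be asserted inside write_buffer in debug builds if the engine also tracked the usage flags each buffer was created with.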
For images, this is the other procedure.
descriptor_writer_write_image :: proc(
self: ^Descriptor_Writer,
binding: int,
image: vk.ImageView,
sampler: vk.Sampler,
layout: vk.ImageLayout,
type: vk.DescriptorType,
loc := #caller_location,
) -> bool {
assert(sa.space(self.image_infos) != 0, "No space left in image_infos array", loc)
assert(sa.space(self.writes) != 0, "No space left in writes array", loc)
// Add image info
sa.push_back(
&self.image_infos,
vk.DescriptorImageInfo{sampler = sampler, imageView = image, imageLayout = layout},
)
info_ptr := sa.get_ptr(&self.image_infos, sa.len(self.image_infos) - 1)
// Add write
sa.push_back(
&self.writes,
vk.WriteDescriptorSet {
sType = .WRITE_DESCRIPTOR_SET,
dstBinding = u32(binding),
dstSet = 0, // Left empty for now until we need to write it
descriptorCount = 1,
descriptorType = type,
pImageInfo = info_ptr,
},
)
return true
}
This is very similar to the buffer one, but we have a different info type, vk.DescriptorImageInfo. For that one, we need to give it a sampler, an image view, and the layout the image uses. The layout is almost always going to be either SHADER_READ_ONLY_OPTIMAL, the best layout to use for accessing textures in the shaders, or GENERAL when we are using them from compute shaders and writing them.
The 3 parameters in the image info can be optional, depending on the specific vk.DescriptorType.
.SAMPLER is just the sampler, so it does not need the image view or layout to be set.
.SAMPLED_IMAGE doesn't need the sampler set, because it is going to be accessed with different samplers within the shader. This descriptor type is just a pointer to the image.
.COMBINED_IMAGE_SAMPLER needs everything set, as it holds the information for both the sampler and the image it samples. This is a useful type because it means we only need 1 descriptor binding to access the texture.
.STORAGE_IMAGE was used back in chapter 2. It does not need a sampler, and it is used to allow compute shaders to directly access pixel data.
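The rules above can be written down as a small lookup. This is only a sketch: Image_Descriptor, Image_Info_Field and required_fields are hypothetical names made up for the example, standing in for vk.DescriptorType and the three fields of vk.DescriptorImageInfo.

```odin
package main

import "core:fmt"

// Hypothetical mirror of the image-related descriptor types.
Image_Descriptor :: enum {
	Sampler,
	Sampled_Image,
	Combined_Image_Sampler,
	Storage_Image,
}

// Hypothetical names for the three fields of the image info struct.
Image_Info_Field :: enum {
	Sampler,
	Image_View,
	Layout,
}

Image_Info_Fields :: bit_set[Image_Info_Field]

// Which image info fields have to be filled in for each descriptor type.
required_fields :: proc(type: Image_Descriptor) -> (fields: Image_Info_Fields) {
	switch type {
	case .Sampler:
		fields = {.Sampler}
	case .Sampled_Image:
		fields = {.Image_View, .Layout}
	case .Combined_Image_Sampler:
		fields = {.Sampler, .Image_View, .Layout}
	case .Storage_Image:
		fields = {.Image_View, .Layout}
	}
	return
}

main :: proc() {
	for type in Image_Descriptor {
		fmt.println(type, required_fields(type))
	}
}
```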
In both the write_image and write_buffer procedures, we are being overly generic. This is done for simplicity, but if you want, you can add new ones like write_sampler(), which would use the SAMPLER type and set the image view and layout to zero, and other similar abstractions.
With these done, we can perform the write itself.
descriptor_writer_clear :: proc(self: ^Descriptor_Writer) {
sa.clear(&self.image_infos)
sa.clear(&self.buffer_infos)
sa.clear(&self.writes)
}
descriptor_writer_update_set :: proc(
self: ^Descriptor_Writer,
set: vk.DescriptorSet,
loc := #caller_location,
) {
assert(self.device != nil, "Invalid 'Device'", loc)
for &write in sa.slice(&self.writes) {
write.dstSet = set
}
if sa.len(self.writes) > 0 {
vk.UpdateDescriptorSets(
self.device,
u32(sa.len(self.writes)),
raw_data(sa.slice(&self.writes)),
0,
nil,
)
}
}
The clear procedure resets everything. The update_set procedure takes a descriptor set, connects that set to the array of writes, and then calls vk.UpdateDescriptorSets to write the descriptor set to its new bindings.
Let's look at how this abstraction can be used to replace code we had before in the engine_init_descriptors procedure.
Before:
img_info := vk.DescriptorImageInfo {
imageLayout = .GENERAL,
imageView = self.draw_image.image_view,
}
draw_image_write := vk.WriteDescriptorSet {
sType = .WRITE_DESCRIPTOR_SET,
dstBinding = 0,
dstSet = self.draw_image_descriptors,
descriptorCount = 1,
descriptorType = .STORAGE_IMAGE,
pImageInfo = &img_info,
}
vk.UpdateDescriptorSets(self.vk_device, 1, &draw_image_write, 0, nil)
After:
writer: Descriptor_Writer
descriptor_writer_init(&writer, self.vk_device)
descriptor_writer_write_image(
&writer,
binding = 0,
image = self.draw_image.image_view,
sampler = 0,
layout = .GENERAL,
type = .STORAGE_IMAGE,
)
descriptor_writer_update_set(&writer, self.draw_image_descriptors)
This abstraction will prove much more useful when we have more complex descriptor sets, especially in combination with the allocator and the layout builder.
Dynamic Descriptor Allocation
Let's start using the abstraction to create a global scene data descriptor every frame. This is the descriptor set that all of our draws will use. It will contain the camera matrices so that we can do 3D rendering.
To allocate descriptor sets at runtime, we will hold one descriptor allocator in our Frame_Data structure. This way it will work like the deletion queue, where we flush the resources and delete things as we begin the rendering of that frame. Resetting the whole descriptor pool at once is a lot faster than trying to keep track of individual descriptor set resource lifetimes.
We add it to the Frame_Data struct.
Frame_Data :: struct {
deletion_queue: ^Deletion_Queue,
// Other fields above ---
frame_descriptors: Descriptor_Allocator_Growable,
}
Now, let's initialize it when we initialize the swapchain and create these structs. Add this at the end of engine_init_descriptors().
for &frame in self.frames {
frame_sizes: Ratios
sa.push_back(&frame_sizes, Pool_Size_Ratio{.STORAGE_IMAGE, 3})
sa.push_back(&frame_sizes, Pool_Size_Ratio{.STORAGE_BUFFER, 3})
sa.push_back(&frame_sizes, Pool_Size_Ratio{.UNIFORM_BUFFER, 3})
sa.push_back(&frame_sizes, Pool_Size_Ratio{.COMBINED_IMAGE_SAMPLER, 4})
descriptor_growable_init(
&frame.frame_descriptors,
self.vk_device,
1000,
sa.slice(&frame_sizes),
)
deletion_queue_push(&self.main_deletion_queue, &frame.frame_descriptors)
}
We also need to update the deletion queue:
Resource :: union {
// Higher-level custom resources
^Descriptor_Allocator_Growable,
}
deletion_queue_flush :: proc(queue: ^Deletion_Queue) {
#reverse for &resource in queue.resources {
switch &res in resource {
// Higher-level custom resources
case ^Descriptor_Allocator_Growable:
// Stored by pointer, so pools created after the push are destroyed as well
descriptor_growable_destroy_pools(res^)
}
}
}
And now, we can clear these every frame when we flush the frame deletion queue. This goes at the start of engine_draw().
// Wait until the gpu has finished rendering the last frame. Timeout of 1 second
vk_check(vk.WaitForFences(self.vk_device, 1, &frame.render_fence, true, 1e9)) or_return
deletion_queue_flush(&frame.deletion_queue)
descriptor_growable_clear_pools(&frame.frame_descriptors) or_return // < new
Now that we can allocate descriptor sets dynamically, we will allocate the buffer that holds the scene data and create its descriptor set.
Add a new structure that we will use for the uniform buffer of scene data. We will hold the view and projection matrices separately, along with the premultiplied view-projection matrix. We also add some vec4s for a very basic lighting model that we will be building next.
// Core
import la "core:math/linalg"
GPU_Scene_Data :: struct {
view: la.Matrix4x4f32,
proj: la.Matrix4x4f32,
viewproj: la.Matrix4x4f32,
ambient_color: la.Vector4f32,
sunlight_direction: la.Vector4f32, // w for sun power
sunlight_color: la.Vector4f32,
}
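Since this struct is copied byte for byte into a uniform buffer, it can be worth pinning down its size. The sketch below repeats the struct so that it compiles on its own and adds a compile-time check; the expected 240 bytes is our own arithmetic (three 4x4 f32 matrices at 64 bytes each plus three 4-component f32 vectors at 16 bytes each), not a value taken from the engine.

```odin
package main

import "core:fmt"
import la "core:math/linalg"

GPU_Scene_Data :: struct {
	view:               la.Matrix4x4f32,
	proj:               la.Matrix4x4f32,
	viewproj:           la.Matrix4x4f32,
	ambient_color:      la.Vector4f32,
	sunlight_direction: la.Vector4f32, // w for sun power
	sunlight_color:     la.Vector4f32,
}

// 3 * 64 bytes of matrices + 3 * 16 bytes of vectors, with no padding in between
#assert(size_of(GPU_Scene_Data) == 240)

main :: proc() {
	fmt.println(size_of(GPU_Scene_Data)) // prints "240"
}
```

Using only 4x4 matrices and 4-component vectors keeps every member on a 16-byte boundary, which makes it straightforward to declare a matching uniform block on the shader side.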
Add a new descriptor layout to the Engine structure.
Engine :: struct {
// Scene
scene_data: GPU_Scene_Data,
gpu_scene_data_descriptor_layout: vk.DescriptorSetLayout,
}
Create the descriptor set layout as part of engine_init_descriptors. It will be a descriptor set with a single uniform buffer binding. We use a uniform buffer here instead of an SSBO because this is a small buffer. We aren't using it through buffer device address because we have a single descriptor set for all objects, so there isn't any overhead of managing it.
{
builder: Descriptor_Layout_Builder
descriptor_layout_builder_init(&builder, self.vk_device)
descriptor_layout_builder_add_binding(&builder, 0, .UNIFORM_BUFFER)
self.gpu_scene_data_descriptor_layout = descriptor_layout_builder_build(
&builder,
self.vk_device,
{.VERTEX, .FRAGMENT},
) or_return
}
deletion_queue_push(&self.main_deletion_queue, self.gpu_scene_data_descriptor_layout)
Now, we will create this descriptor set every frame, inside the engine_draw_geometry() procedure. We will also dynamically allocate the uniform buffer itself, as a way to showcase how you could handle temporary per-frame data that is dynamically created. It would be better to keep the buffers cached in our Frame_Data structure, but we will be doing it this way to show how. There are cases with dynamic draws and passes where you might want to do it this way.
engine_draw_geometry :: proc(self: ^Engine, cmd: vk.CommandBuffer) -> (ok: bool) {
// ...
vk.CmdDrawIndexed(
cmd,
self.test_meshes[2].surfaces[0].count,
1,
self.test_meshes[2].surfaces[0].start_index,
0,
0,
)
// Other code above ---
// Allocate a new uniform buffer for the scene data
gpu_scene_data_buffer := create_buffer(
self,
size_of(GPU_Scene_Data),
{.UNIFORM_BUFFER},
.Cpu_To_Gpu,
) or_return
// Add it to the deletion queue of this frame so it gets deleted once its been used
frame := engine_get_current_frame(self)
deletion_queue_push(&frame.deletion_queue, &gpu_scene_data_buffer)
// Write the buffer
scene_uniform_data := cast(^GPU_Scene_Data)gpu_scene_data_buffer.info.mapped_data
scene_uniform_data^ = self.scene_data
// Create a descriptor set that binds that buffer and update it
global_descriptor := descriptor_growable_allocate(
&frame.frame_descriptors,
&self.gpu_scene_data_descriptor_layout,
) or_return
writer: Descriptor_Writer
descriptor_writer_init(&writer, self.vk_device)
descriptor_writer_write_buffer(
&writer,
binding = 0,
buffer = gpu_scene_data_buffer.buffer,
size = size_of(GPU_Scene_Data),
offset = 0,
type = .UNIFORM_BUFFER,
)
descriptor_writer_update_set(&writer, global_descriptor)
vk.CmdEndRendering(cmd)
return true
}
First we allocate the uniform buffer using the Cpu_To_Gpu memory usage, so that it's a memory type that the CPU can write and the GPU can read. This might end up in CPU RAM, but because it's a small amount of data, the GPU is going to have no problem loading it into its caches. We can skip the staging buffer upload to dedicated GPU memory for cases like this.
Then we add it to the deletion queue of the current frame. This will destroy the buffer after the next frame is rendered, which gives the GPU enough time to be done accessing it. All of the resources we dynamically create for a single frame must go here for deletion.
To allocate the descriptor set, we allocate it from frame_descriptors. Those pools get reset every frame, so, same as with the deletion queue, the set will be freed automatically when the GPU is done with it 2 frames later.
Then we write the new buffer into the descriptor set. Now we have the global_descriptor ready to be used for drawing. We aren't using the scene-data buffer right now, but it will be necessary later.
Before we continue with drawing, let's set up textures.