Updated 01.15.2024
12.2.2023
Updated! See the bottom of the article for the final results of this.
I recently came across an issue in Vulkan that I never ran into with OpenGL.
In Kohi, I have descriptors set up by change frequency, in other words, by how often they are updated during the course of a single frame.
For example:
Note that the list indices above (namely 0 and 1) also indicate the index of a descriptor set.
Within these descriptor sets, there are two bindings: a uniform buffer object (UBO) at binding 0 and an array of combined image samplers at binding 1.
The problem with this comes in the form of requiring different sampler types in the GLSL shader code within the array. Consider the sampler types for my recently-written PBR shader:
Ignore the number of maps (yes, I know some of these could/should be combined, I know, it's coming). Note that indices 0-4 are sampler2D while index 5 is a samplerCube. As far as the Vulkan application goes, these are all the same, just combined image samplers. There is no way to differentiate them in application code per se.
In the GLSL shader code, these are all represented in a single array, in a single set/binding. This means types can't be mixed:
layout(set = 1, binding = 1) uniform sampler2D samplers[6];
After looking around a bit, the suggested solution (which I originally found here) to this was similar to what one might do in OpenGL - alias the descriptors but use different types. This is done by simply re-declaring the array as a different type, which _should_ work since samplers are opaque types:
    layout(set = 1, binding = 1) uniform sampler2D samplers[6];
    layout(set = 1, binding = 1) uniform samplerCube cube_samplers[6];
The problem with this configuration is that Vulkan doesn't really like this by default, and throws validation errors that look something like this when referenced:
Validation Error: [ VUID-vkCmdDrawIndexed-viewType-07752 ] Object 0: handle = 0xbbf6ab0000000169, type = VK_OBJECT_TYPE_DESCRIPTOR_SET; Object 1: handle = 0xc25f26000000009c, name = default_cube_view, type = VK_OBJECT_TYPE_IMAGE_VIEW; | MessageID = 0xce261924 | vkCmdDrawIndexed: Descriptor set VkDescriptorSet 0xbbf6ab0000000169[] in binding #1 index 5 ImageView type is VK_IMAGE_VIEW_TYPE_CUBE but the OpTypeImage has (Dim = 2D) and (Arrrayed = 0). The Vulkan spec states: If a VkImageView is accessed as a result of this command, then the image view's viewType must match the Dim operand of the OpTypeImage as described in Instruction/Sampler/Image View Validation (https://vulkan.lunarg.com/doc/view/1.3.250.1/windows/1.3-extensions/vkspec.html#VUID-vkCmdDrawIndexed-viewType-07752)
To solve this, the following changes were needed:
When creating the device, you'll want to make sure partial binding is enabled. This is what actually made the aliasing work for me.
    VkPhysicalDeviceDescriptorIndexingFeatures descriptor_indexing_features = {VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DESCRIPTOR_INDEXING_FEATURES_EXT};
    // Partial binding is required for descriptor aliasing.
    descriptor_indexing_features.descriptorBindingPartiallyBound = VK_TRUE;
Of course you'll need to feed this into the pNext chain of your VkDeviceCreateInfo structure. In my case I didn't actually need to load the extension since I'm using 1.3 on Windows. Note this, we'll come back to it.
When creating descriptor set layouts, you will need to do something like this:
    VkDescriptorSetLayoutCreateInfo layout_info;
    ...
    // Partial binding is required for descriptor aliasing (i.e using different types on the same set/binding)
    VkDescriptorBindingFlags binding_flags = VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT;
    VkDescriptorSetLayoutBindingFlagsCreateInfoEXT extended_info = {VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO_EXT};
    extended_info.bindingCount = 2;
    // In this example, the first binding is a UBO and the second is a combined image sampler array, so it only needs to be set on the second binding in my case.
    VkDescriptorBindingFlagsEXT descriptor_binding_flags[2] = { 0, binding_flags };
    extended_info.pBindingFlags = descriptor_binding_flags;
    layout_info.pNext = &extended_info;
In my case I only set the flags on the second binding since that contains the sampler array I wish to alias.
The problem I have now is that this doesn't currently (at the time of writing, Vulkan SDK 1.3.268) seem to work on macOS. MoltenVK doesn't seem to like this at all.
    [ERROR]: VK_ERROR_INITIALIZATION_FAILED: Shader library compile failed (Error code 3):
    program_source:137:195: error: cannot reserve 'texture' resource locations at index 0
    fragment main0_out main0(main0_in in [[stage_in]], constant instance_uniform_object& instance_ubo [[buffer(2)]], array, 6> samplers [[texture(0)]], array , 6> cube_samplers [[texture(0)]], array samplersSmplr [[sampler(0)]], array cube_samplersSmplr [[sampler(0)]])
    ^
    program_source:137:291: error: cannot reserve 'sampler' resource locations at index 0
    fragment main0_out main0(main0_in in [[stage_in]], constant instance_uniform_object& instance_ubo [[buffer(2)]], array , 6> samplers [[texture(0)]], array , 6> cube_samplers [[texture(0)]], array samplersSmplr [[sampler(0)]], array cube_samplersSmplr [[sampler(0)]])
    ^
    .
    [ERROR]: VK_ERROR_INVALID_SHADER_NV: Fragment shader function could not be compiled into pipeline. See previous logged error.
    [ERROR]: Validation Error: [ VUID-vkSetDebugUtilsObjectNameEXT-pNameInfo-02588 ] | MessageID = 0x30f70d65 | vkSetDebugUtilsObjectNameEXT() pNameInfo->objectHandle cannot be VK_NULL_HANDLE. The Vulkan spec states: pNameInfo->objectHandle must not be VK_NULL_HANDLE (https://vulkan.lunarg.com/doc/view/1.3.261.1/mac/1.3-extensions/vkspec.html#VUID-vkSetDebugUtilsObjectNameEXT-pNameInfo-02588)
    [ERROR]: Validation Error: [ VUID-VkDebugUtilsObjectNameInfoEXT-objectType-02590 ] Object 0: handle = 0x294006f78, type = VK_OBJECT_TYPE_INSTANCE; | MessageID = 0x9b4c6071 | vkSetDebugUtilsObjectNameEXT(): Invalid VkPipeline Object 0x0. The Vulkan spec states: If objectType is not VK_OBJECT_TYPE_UNKNOWN, objectHandle must be VK_NULL_HANDLE or a valid Vulkan handle of the type associated with objectType as defined in the VkObjectType and Vulkan Handle Relationship table (https://vulkan.lunarg.com/doc/view/1.3.261.1/mac/1.3-extensions/vkspec.html#VUID-VkDebugUtilsObjectNameInfoEXT-objectType-02590)
    [ERROR]: vkCreateGraphicsPipelines failed with VK_ERROR_INVALID_SHADER_NV One or more shaders failed to compile or link. More details are reported back to the application via VK_EXT_debug_report if enabled..
There's a lot to unpack here. From what I am gathering, it's attempting to split the aliases twice (likely due to the attempted aliasing). This is what I assume causes that "program_source" error to appear twice. It then logs what it's trying to do, where it splits apart the textures and samplers. Afterward, it fails to reserve resources for index 0 of the first in each set of 4 arrays (index 0 of the texture array, and index 0 of the sampler array), likely for the same reason that actual Vulkan failed above.
After looking into this a bit, I found a recommendation to set the environment variable MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS to a value of 1, which essentially tells MoltenVK to tell Metal to use something called "argument buffers", which is used to "gather multiple resources into a single shader argument". (See "Improving CPU Performance by Using Argument Buffers", developer.apple.com, here)
I'm not all too familiar with Metal (yet), but basically this gives us what we need (or should). The easiest way to test this was to modify my VSCode debugger/launch.json config:
    {
        "version": "0.2.0",
        "configurations": [
            {
                "name": "Launch TestBed",
                ...
                "osx": {
                    "environment": [
                        {"name": "MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS", "value": "1"}
                    ]
                },
            }
        ]
    }
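The launch.json route only helps when running under the debugger. The same variable could presumably be set from inside the application instead. This is a minimal sketch, assuming a POSIX platform (for `setenv`) and assuming MoltenVK does not read its `MVK_CONFIG_*` environment variables until the first Vulkan call is made; the helper name is hypothetical.

```c
#include <stdlib.h>

// Hypothetical helper: sets the MoltenVK argument-buffer switch for the
// current process. Returns 0 on success, -1 on failure (setenv's convention).
// Assumption: this must run before any Vulkan call, so that MoltenVK sees
// the value when it loads its configuration.
static int enable_mvk_argument_buffers(void) {
    // The final argument (1) overwrites any value already in the environment.
    return setenv("MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS", "1", 1);
}
```

Calling this at the very top of the platform startup code, guarded so it only compiles on macOS, would keep the behaviour the same whether or not the debugger launches the process.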
Well, this got me a bit further, but still no dice:
    [mvk-error] SPIR-V to MSL conversion error: Argument buffer resource base type could not be determined. When padding argument buffer elements, all descriptor set resources must be supplied with a base type by the app.
    [mvk-error] VK_ERROR_INVALID_SHADER_NV: Fragment shader function could not be compiled into pipeline. See previous logged error.
Searching that got me to this page, which eventually brought me to this page, which ultimately lists this as a regression from the previously released version, 1.3.261.1.
/sigh.
Well, guess what? It still didn't work. Same result.
Here's what I did next:
Through a bit more searching, I found this issue for SPIRV-Cross listing the exact issue I am having. Here, someone in the comments also mentions this as working in 261.1, but not in 268.1. Woo!
After this, I dug around a little bit more and found this page.
It lists the upcoming release notes for MoltenVK 1.2.7 (Vulkan SDK ships with 1.2.6, the broken version). In there, an important note stuck out to me:
Update to latest SPIRV-Cross: MSL: Fix regression error in argument buffer runtime arrays.
This might be exactly what I am looking for, but the release is TBD as of today (12.2.2023).
The release cadence seems to be every 2-3 months, with the last release being on 10/17. Hopefully this means there will be a release soon. Until the next MoltenVK release, and the Vulkan SDK release that picks it up, the PBR branch of Kohi can't be merged into main and/or released without breaking macOS as a platform.
Now, I don't want to sound unappreciative of all the hard work being done by the MoltenVK and SPIRV-Cross teams. It's amazing work they do and it made porting Kohi to macOS somewhat trivial. However, this does highlight the eventual need to support Metal natively as a renderer backend, and it's an example of why it really needs to be done.
A few things to note - I did post most of my non-macOS findings over on the Reddit post I originally found in case someone goes searching there. I may also post the macOS bits from here over there once this is sorted. I will also update this page when I have solved this issue, one way or another.
UPDATE: I eventually decided against this due to all the trouble it was causing - it just wasn't worth it. I later rewrote my uniform system to handle array types instead and allow dynamic configuration of sampler types which all have their own bindings to eliminate this as an issue.
For example, the configuration of the PBR shader's sampler now looks like this:
    # NOTE: samplers are bound in the order they are configured.
    # albedo,normal,combined (metallic,roughness,ao)
    uniform=sampler2D[3],1,material_textures
    # Shadow map
    uniform=sampler2DArray,1,shadow_textures
    # IBL
    uniform=samplerCube,1,ibl_cube_texture
In this example, material_textures is now an array of 3 sampler2Ds. (The metallic, roughness and AO maps have also been combined into a single map since this article was originally written, which is why this is 3 samplers and not 5.) shadow_textures is now an arrayed (layered) texture that uses a single sampler, and ibl_cube_texture is on its own as a samplerCube. Since the three are configured separately, each has its own binding. The resulting GLSL shader code looks like this:
    // Material textures: albedo, normal, combined (metallic, roughness, ao)
    layout(set = 1, binding = 1) uniform sampler2D material_textures[3];
    // Shadow maps
    layout(set = 1, binding = 2) uniform sampler2DArray shadow_texture;
    // Environment map is at the last index.
    layout(set = 1, binding = 3) uniform samplerCube irradiance_texture;
Note the bindings. This is not only easier to configure and set up, but is also far less error-prone and confusing.
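To make the "bound in the order they are configured" rule concrete, here is a rough sketch of how config lines like the ones above could be turned into per-sampler bindings. This is not Kohi's actual parser: the struct, the function names, and the meaning of the middle field (called `scope` here) are all assumptions for illustration, as is the choice to start sampler bindings at 1 because binding 0 holds the UBO.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical parsed form of one "uniform=<type>[<count>],<scope>,<name>" line.
typedef struct sampler_uniform {
    char type[32];       // e.g. "sampler2D", "samplerCube"
    unsigned array_len;  // 1 when the type has no [N] suffix
    unsigned scope;      // middle field; its exact meaning is assumed here
    char name[64];
    unsigned binding;    // assigned in configuration order
} sampler_uniform;

// Parses a single line. Returns 1 on success, or 0 if the line is blank,
// a comment, or malformed.
static int parse_sampler_uniform(const char* line, unsigned binding, sampler_uniform* out) {
    assert(line && out);
    char type_field[48];
    if (sscanf(line, " uniform=%47[^,],%u,%63s", type_field, &out->scope, out->name) != 3) {
        return 0;
    }
    out->array_len = 1;
    char* bracket = strchr(type_field, '[');
    if (bracket) {
        if (sscanf(bracket, "[%u]", &out->array_len) != 1 || out->array_len == 0) {
            return 0;
        }
        *bracket = '\0';  // strip the "[N]" suffix from the type name
    }
    if (strlen(type_field) >= sizeof(out->type)) {
        return 0;
    }
    strcpy(out->type, type_field);
    out->binding = binding;
    return 1;
}

// Parses every line, giving each sampler uniform its own binding in the order
// configured. Binding 0 is assumed to be the UBO, so samplers start at 1.
static unsigned parse_sampler_uniforms(const char* const* lines, unsigned line_count,
                                       sampler_uniform* out, unsigned max_out) {
    unsigned count = 0;
    for (unsigned i = 0; i < line_count && count < max_out; ++i) {
        if (parse_sampler_uniform(lines[i], count + 1, &out[count])) {
            ++count;
        }
    }
    return count;
}
```

Fed the three uniform lines from the config above, this yields bindings 1, 2 and 3 in that order, which lines up with the bindings in the GLSL. Because each entry carries its own type, nothing needs to be aliased.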
In the end, this was an interesting experiment, but not one that I'll be using in production code.