So here's the story - I wrote the first three Vulkan tutorials under the assumption that the validation layers were active. The purpose of these layers is to do as much validation on the Vulkan API calls as possible and output warning and error messages whenever something is wrong. Since all the extra checks add some overhead to the execution of the driver, you are supposed to activate the layers only during development and then ship the final product to customers with validation disabled to get maximum performance. This was one of the design goals of Vulkan. But as it turned out, I was actually running with validation disabled and missed quite a lot of stuff. Even though the program seemed to run as expected, many things were done wrong, so I decided to do a tutorial about these mistakes and get back to the original subject (which was vertex buffers) in the next tutorial.

The main problem was that I didn't synchronize access to the swap chain images correctly. If we look at the RenderScene() function we see that we start by acquiring an image from the swap chain. This will be the target image for the current frame. This function returns immediately (it does not hang until an image is actually available) with an index to the next free image. The thing is that since the presentation engine may not have finished reading from this image (during some previous frame) it basically tells us: "Hey! Here's a free image for you at index XYZ but make sure you wait on this semaphore before you actually use it". We can get an image without a semaphore (which we did in the previous tutorials by passing in NULL as the semaphore) but it is risky. If we had more to render we could have encountered some glitches. So that's the first semaphore that we need to use. After getting the image, the next thing we do is submit a command buffer for rendering into the render queue. This is where we need to use the semaphore so that the command buffer will only be processed in the background after the presentation engine has finished reading from the target image.

The last thing we do in the RenderScene() function is queue a present command. Again, we need to make sure that presentation will only take place after the command buffer has finished processing. This is where the second semaphore comes into play. We make the present command wait on a semaphore that will only be signalled after the draw command has finished processing. When that semaphore is signalled we know it is safe to present the image. So the draw command must wait on a semaphore that will tell it when it can start rendering and it must signal another semaphore that will trigger the presentation of the finished frame. This is quite a complex chain of dependencies that must be constructed carefully to make sure everything runs smoothly without glitches.
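To make the chain of dependencies a bit more concrete, here is a toy model of it in plain C++. This is only a sketch and not Vulkan code: ToySemaphore, ToyOp, RunToyQueue() and RunToyFrame() are hypothetical names made up for this illustration, and the "acquire" step signals its semaphore right away, whereas the real presentation engine signals it only when it has finished reading from the image.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are NOT part of Vulkan.
struct ToySemaphore {
    bool signalled = false;
};

struct ToyOp {
    std::string   name;
    ToySemaphore* wait;    // semaphore that must be signalled first (nullptr = none)
    ToySemaphore* signal;  // semaphore signalled on completion (nullptr = none)
};

// Repeatedly executes the first pending operation whose wait semaphore is
// signalled, and returns the names in the order they actually ran.
std::vector<std::string> RunToyQueue(std::vector<ToyOp> pending)
{
    std::vector<std::string> order;
    bool progress = true;

    while (progress && !pending.empty()) {
        progress = false;
        for (std::size_t i = 0; i < pending.size(); i++) {
            ToyOp op = pending[i];
            if (op.wait != nullptr && !op.wait->signalled) {
                continue;                      // dependency not satisfied yet
            }
            if (op.wait != nullptr) {
                op.wait->signalled = false;    // a wait resets a binary semaphore
            }
            if (op.signal != nullptr) {
                op.signal->signalled = true;
            }
            order.push_back(op.name);
            pending.erase(pending.begin() + i);
            progress = true;
            break;                             // rescan from the start
        }
    }
    return order;
}

// Queues the three steps of a frame in deliberately reversed order.
std::vector<std::string> RunToyFrame()
{
    ToySemaphore presentComplete;
    ToySemaphore renderComplete;

    std::vector<ToyOp> ops = {
        { "present", &renderComplete,  nullptr          },
        { "draw",    &presentComplete, &renderComplete  },
        { "acquire", nullptr,          &presentComplete },
    };
    return RunToyQueue(ops);
}
```

Even though the operations are queued back to front, the two semaphores force them to run as acquire, draw, present. That is the ordering guarantee we are after in RenderScene().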

Source walkthru

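VkSemaphore OgldevVulkanCore::CreateSemaphore()
{
    VkSemaphoreCreateInfo createInfo = {};
    createInfo.sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO;
    VkSemaphore semaphore;
    VkResult res = vkCreateSemaphore(m_device, &createInfo, NULL, &semaphore);
    CHECK_VULKAN_ERROR("vkCreateSemaphore error %d\n", res);
    return semaphore;
}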

I've added the member function above to the OgldevVulkanCore class. It is a utility function that wraps the vkCreateSemaphore() API and returns a VkSemaphore handle.

class OgldevVulkanApp
{
private:
    // ... existing members ...
    void CreateSemaphores();
    VkSemaphore m_renderCompleteSem;
    VkSemaphore m_presentCompleteSem;
};

void OgldevVulkanApp::CreateSemaphores()
{
    m_presentCompleteSem = m_core.CreateSemaphore();
    m_renderCompleteSem = m_core.CreateSemaphore();
}

I've added a couple of semaphores to the private section of the OgldevVulkanApp class - one to signal that rendering has been completed and the other to signal that the presentation engine has finished presenting the frame. There is also a small function that creates the two semaphores (OgldevVulkanApp::CreateSemaphores) using the utility function above. Don't forget to call this function in OgldevVulkanApp::Init() (it will not be specified here again).

void OgldevVulkanApp::RenderScene()
{
    uint ImageIndex = 0;

    // Acquire the next image; m_presentCompleteSem is signalled once it is safe to render to it
    VkResult res = vkAcquireNextImageKHR(m_core.GetDevice(), m_swapChainKHR, UINT64_MAX, m_presentCompleteSem, VK_NULL_HANDLE, &ImageIndex);
    CHECK_VULKAN_ERROR("vkAcquireNextImageKHR error %d\n" , res);

    // The wait on m_presentCompleteSem takes effect at the color output stage
    VkPipelineStageFlags waitFlags = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;

    VkSubmitInfo submitInfo = {};
    submitInfo.sType                = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    submitInfo.commandBufferCount   = 1;
    submitInfo.pCommandBuffers      = &m_cmdBufs[ImageIndex];
    submitInfo.pWaitSemaphores      = &m_presentCompleteSem;
    submitInfo.waitSemaphoreCount   = 1;
    submitInfo.pWaitDstStageMask    = &waitFlags;
    submitInfo.pSignalSemaphores    = &m_renderCompleteSem;
    submitInfo.signalSemaphoreCount = 1;

    res = vkQueueSubmit(m_queue, 1, &submitInfo, VK_NULL_HANDLE);
    CHECK_VULKAN_ERROR("vkQueueSubmit error %d\n", res);

    // Present only after rendering has signalled m_renderCompleteSem
    VkPresentInfoKHR presentInfo = {};
    presentInfo.sType              = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR;
    presentInfo.swapchainCount     = 1;
    presentInfo.pSwapchains        = &m_swapChainKHR;
    presentInfo.pImageIndices      = &ImageIndex;
    presentInfo.pWaitSemaphores    = &m_renderCompleteSem;
    presentInfo.waitSemaphoreCount = 1;

    res = vkQueuePresentKHR(m_queue, &presentInfo);
    CHECK_VULKAN_ERROR("vkQueuePresentKHR error %d\n" , res);
}

With the two semaphores ready let's see how to use them. The above is the updated RenderScene() function; the new parts are the ones that touch the two semaphores and the wait stage mask. We start by providing the present-complete semaphore to vkAcquireNextImageKHR(), which acquires a free image from the presentation engine. Previously we passed NULL here but now we are asking the presentation engine to signal that semaphore once it has finished presenting the image so it can be rendered to again.

Next we prepare the submit info structure for the command buffer. We provide the address of the present-complete semaphore in the pWaitSemaphores member. This makes the command buffer wait until this semaphore is signalled. pWaitSemaphores actually expects an array of semaphores (giving it the capability to wait on multiple semaphores) so we must also specify the number of semaphores in the waitSemaphoreCount member (just one in our case). We also specify the address of the render-complete semaphore in the pSignalSemaphores member (and don't forget the corresponding member signalSemaphoreCount). This makes the driver signal that semaphore once the command buffer has finished processing.

The last piece of interest in the submit info is the pWaitDstStageMask member. The idea behind wait stage masks is to limit the places where a wait occurs so that unnecessary stalls or cache flushes will be avoided. In our case it is best to wait on the present-complete semaphore only when the pipeline reaches the point where the final color values are written out. Therefore, we specify VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT in the pWaitDstStageMask member. This is also an array and because it must be the same size as the pWaitSemaphores array we don't need to specify its size again.
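The effect of the wait stage mask can be sketched with a tiny model in plain C++. The ToyStage enum and the StageMustWait() helper are hypothetical and exist only for this illustration (real code uses VkPipelineStageFlagBits). The model is also simplified: it only looks at the stages named in the mask, while in Vulkan the stages that logically follow them are held back as well.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stage bits for illustration; real code uses VkPipelineStageFlagBits.
enum ToyStage : std::uint32_t {
    TOY_STAGE_VERTEX_SHADER           = 1u << 0,
    TOY_STAGE_FRAGMENT_SHADER         = 1u << 1,
    TOY_STAGE_COLOR_ATTACHMENT_OUTPUT = 1u << 2,
};

// Returns true when the given stage has to stall: it is named in the wait
// mask and the semaphore has not been signalled yet.
bool StageMustWait(std::uint32_t waitDstStageMask, ToyStage stage, bool semaphoreSignalled)
{
    return !semaphoreSignalled && (waitDstStageMask & stage) != 0;
}
```

With a mask of TOY_STAGE_COLOR_ATTACHMENT_OUTPUT the vertex shader stage is free to run while the semaphore is still unsignalled, and only the color output has to hold. This is why waiting at the color attachment output stage avoids unnecessary stalls.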

The final call in this function is to vkQueuePresentKHR(), which takes a VkPresentInfoKHR struct as a parameter. In order to make presentation wait until rendering is complete we must specify the render-complete semaphore in the pWaitSemaphores member (again, remember to set the number of semaphores in the waitSemaphoreCount member).

We've finished covering all the synchronization related parts of the tutorial. Now let's review the changes that fix the problems detected by the validation layers. We will start with the function where most of the changes took place - CreatePipeline(). Here is the complete source with some comments in-lined:

void OgldevVulkanApp::CreatePipeline()
{
    VkPipelineShaderStageCreateInfo shaderStageCreateInfo[2] = {};
    shaderStageCreateInfo[0].sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
    shaderStageCreateInfo[0].stage = VK_SHADER_STAGE_VERTEX_BIT;
    shaderStageCreateInfo[0].module = m_vsModule;
    shaderStageCreateInfo[0].pName = "main";
    shaderStageCreateInfo[1].sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
    shaderStageCreateInfo[1].stage = VK_SHADER_STAGE_FRAGMENT_BIT;
    shaderStageCreateInfo[1].module = m_fsModule;
    shaderStageCreateInfo[1].pName = "main";

    VkPipelineVertexInputStateCreateInfo vertexInputInfo = {};
    vertexInputInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO;

    VkPipelineInputAssemblyStateCreateInfo pipelineIACreateInfo = {};
    pipelineIACreateInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO;
    pipelineIACreateInfo.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;

    VkViewport vp = {};
    vp.x = 0.0f;
    vp.y = 0.0f;
    vp.width  = (float)WINDOW_WIDTH;
    vp.height = (float)WINDOW_HEIGHT;
    vp.minDepth = 0.0f;
    vp.maxDepth = 1.0f;

    // Static scissor that covers the entire viewport
    VkRect2D scissor;
    scissor.offset.x = 0;
    scissor.offset.y = 0;
    scissor.extent.width = WINDOW_WIDTH;
    scissor.extent.height = WINDOW_HEIGHT;

    VkPipelineViewportStateCreateInfo vpCreateInfo = {};
    vpCreateInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO;
    vpCreateInfo.viewportCount = 1;
    vpCreateInfo.pViewports = &vp;
    vpCreateInfo.scissorCount = 1;
    vpCreateInfo.pScissors = &scissor;

Here you can see that I've added a scissor with the same size as the viewport. The scissor is simply yet another test that determines whether a fragment should be dropped (when it is outside of the scissor rectangle). The missing scissor was flagged by a message from the validation layer that said that since the multi viewport feature is disabled the scissor must be specified. In addition, the spec says in section 25.2 that "If the pipeline state object is created without VK_DYNAMIC_STATE_SCISSOR enabled then the scissor rectangles are set by the VkPipelineViewportStateCreateInfo state of the pipeline state object. Otherwise, to dynamically set the scissor rectangles call: vkCmdSetScissor()". In a simple program such as this it makes sense not to use a dynamic scissor and simply specify it when creating the pipeline object. You will later see that I dropped the usage of vkCmdSetScissor(), per the comment from the spec.
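As an illustration of what the scissor test does, here is a small CPU-side sketch in plain C++. The Toy* structs are hypothetical stand-ins that mirror the fields of VkOffset2D, VkExtent2D and VkRect2D, and PassesScissorTest() is a made-up helper; the real test is performed by the GPU and not by application code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins that mirror the layout of VkOffset2D, VkExtent2D and VkRect2D.
struct ToyOffset2D { std::int32_t x; std::int32_t y; };
struct ToyExtent2D { std::uint32_t width; std::uint32_t height; };
struct ToyRect2D   { ToyOffset2D offset; ToyExtent2D extent; };

// Returns true when the fragment at (x, y) lies inside the scissor rectangle.
// The rectangle is half-open: the right and bottom edges are excluded.
bool PassesScissorTest(const ToyRect2D& scissor, std::int32_t x, std::int32_t y)
{
    // Widen to 64 bits so that offset + extent cannot overflow.
    const std::int64_t left   = scissor.offset.x;
    const std::int64_t top    = scissor.offset.y;
    const std::int64_t right  = left + scissor.extent.width;
    const std::int64_t bottom = top  + scissor.extent.height;

    return x >= left && x < right && y >= top && y < bottom;
}
```

With a scissor that has a zero offset and the same extent as the window, every fragment inside the window passes, so the test has no visible effect in this tutorial.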

    VkPipelineRasterizationStateCreateInfo rastCreateInfo = {};
    rastCreateInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO;
    rastCreateInfo.polygonMode = VK_POLYGON_MODE_FILL;
    rastCreateInfo.cullMode = VK_CULL_MODE_BACK_BIT;
    rastCreateInfo.frontFace = VK_FRONT_FACE_COUNTER_CLOCKWISE;
    rastCreateInfo.lineWidth = 1.0f;
    VkPipelineMultisampleStateCreateInfo pipelineMSCreateInfo = {};
    pipelineMSCreateInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO;
    pipelineMSCreateInfo.rasterizationSamples = VK_SAMPLE_COUNT_1_BIT;

I've added the rasterizationSamples here because the spec (section 24) says that we have to put a valid value. This value is the number of samples per pixel during rasterization and is not too important at this point so I just use the minimal value.
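To see why a zero-initialized struct is not good enough here, note that the VkSampleCountFlagBits values are single bits from 1 (VK_SAMPLE_COUNT_1_BIT) up to 64 (VK_SAMPLE_COUNT_64_BIT), so the zero left behind by "= {}" is not one of them. The check below is a sketch in plain C++; IsSampleCountBit() is a hypothetical helper and not a Vulkan function, and it says nothing about whether a particular device supports a given count.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when 'samples' has the form of a VkSampleCountFlagBits value:
// exactly one bit set, in the range 1 (VK_SAMPLE_COUNT_1_BIT) to 64
// (VK_SAMPLE_COUNT_64_BIT). Hypothetical helper, not a Vulkan API.
bool IsSampleCountBit(std::uint32_t samples)
{
    const bool singleBit = samples != 0 && (samples & (samples - 1)) == 0;
    return singleBit && samples <= 64;
}
```

The value 0 fails the check while 1 passes, which matches the fix of setting rasterizationSamples explicitly.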

    VkPipelineColorBlendAttachmentState blendAttachState = {};
    blendAttachState.colorWriteMask = 0xf;
    VkPipelineColorBlendStateCreateInfo blendCreateInfo = {};
    blendCreateInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO;
    blendCreateInfo.logicOp = VK_LOGIC_OP_COPY;
    blendCreateInfo.attachmentCount = 1;
    blendCreateInfo.pAttachments = &blendAttachState;
    VkPipelineLayoutCreateInfo layoutInfo = {};
    layoutInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
    VkResult res = vkCreatePipelineLayout(m_core.GetDevice(), &layoutInfo, NULL, &m_pipelineLayout);
    CHECK_VULKAN_ERROR("vkCreatePipelineLayout error %d\n", res);

Various resources such as buffers, image views and samplers can be accessed in a shader stage. A descriptor is a data structure which describes these resources and connects them to the pipeline object. Right now our shader is very simple and doesn't need any external data (e.g. transformation matrices) but we are still required to provide some minimal descriptor metadata. We create that at the end of the block of code above.

The object that was missing in the previous tutorial and caused some validation warnings is called the Pipeline Layout. This object points to the descriptor sets, and the descriptor sets are groups of descriptors. Since we are not going to reference any real resources in this tutorial the only thing that we need to create is a pipeline layout object. This object is mandatory for the creation of the pipeline. The pipeline layout usually points to descriptor set layouts but here it is enough to create an empty VkPipelineLayoutCreateInfo struct and only set its type. Next we call vkCreatePipelineLayout() in order to create the pipeline layout.

    VkGraphicsPipelineCreateInfo pipelineInfo = {};
    pipelineInfo.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO;
    pipelineInfo.stageCount = ARRAY_SIZE_IN_ELEMENTS(shaderStageCreateInfo);
    pipelineInfo.pStages = &shaderStageCreateInfo[0];
    pipelineInfo.pVertexInputState = &vertexInputInfo;
    pipelineInfo.pInputAssemblyState = &pipelineIACreateInfo;
    pipelineInfo.pViewportState = &vpCreateInfo;
    pipelineInfo.pRasterizationState = &rastCreateInfo;
    pipelineInfo.pMultisampleState = &pipelineMSCreateInfo;
    pipelineInfo.pColorBlendState = &blendCreateInfo;
    pipelineInfo.layout = m_pipelineLayout;
    pipelineInfo.renderPass = m_renderPass;
    pipelineInfo.basePipelineIndex = -1;
    res = vkCreateGraphicsPipelines(m_core.GetDevice(), VK_NULL_HANDLE, 1, &pipelineInfo, NULL, &m_pipeline);
    CHECK_VULKAN_ERROR("vkCreateGraphicsPipelines error %d\n", res);
}   // end of CreatePipeline()

The only thing that remains in order to complete a valid creation of the graphics pipeline is to set the layout member in the pipeline create info struct to point to the pipeline layout that we just created.

void OgldevVulkanApp::RecordCommandBuffers()
{
    VkCommandBufferBeginInfo beginInfo = {};
    beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;

    VkClearColorValue clearColor = { 164.0f/256.0f, 30.0f/256.0f, 34.0f/256.0f, 0.0f };
    VkClearValue clearValue = {};
    clearValue.color = clearColor;

    VkImageSubresourceRange imageRange = {};
    imageRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
    imageRange.levelCount = 1;
    imageRange.layerCount = 1;

    VkRenderPassBeginInfo renderPassInfo = {};
    renderPassInfo.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO;
    renderPassInfo.renderPass = m_renderPass;
    renderPassInfo.renderArea.offset.x = 0;
    renderPassInfo.renderArea.offset.y = 0;
    renderPassInfo.renderArea.extent.width = WINDOW_WIDTH;
    renderPassInfo.renderArea.extent.height = WINDOW_HEIGHT;
    renderPassInfo.clearValueCount = 1;
    renderPassInfo.pClearValues = &clearValue;

    // The viewport and scissor are static pipeline state (see CreatePipeline()),
    // so vkCmdSetViewport() and vkCmdSetScissor() are no longer recorded here.
    for (uint i = 0 ; i < m_cmdBufs.size() ; i++) {
        VkResult res = vkBeginCommandBuffer(m_cmdBufs[i], &beginInfo);
        CHECK_VULKAN_ERROR("vkBeginCommandBuffer error %d\n", res);
        renderPassInfo.framebuffer = m_fbs[i];

        vkCmdBeginRenderPass(m_cmdBufs[i], &renderPassInfo, VK_SUBPASS_CONTENTS_INLINE);

        vkCmdBindPipeline(m_cmdBufs[i], VK_PIPELINE_BIND_POINT_GRAPHICS, m_pipeline);

        vkCmdDraw(m_cmdBufs[i], 3, 1, 0, 0);

        // The render pass must be ended before the command buffer is ended
        vkCmdEndRenderPass(m_cmdBufs[i]);

        res = vkEndCommandBuffer(m_cmdBufs[i]);
        CHECK_VULKAN_ERROR("vkEndCommandBuffer error %d\n", res);
    }

    printf("Command buffers recorded\n");
}

Above we have the complete RecordCommandBuffers() function with a major change - I dropped all references to the viewports and scissors here. The reason is that I added the scissor definition to the pipeline viewport state (VkPipelineViewportStateCreateInfo) earlier. Since we didn't specify that we plan to use a dynamic scissor, what we have in the pipeline object is enough, and we don't need to specify the viewport and scissor again when we record the command buffers. This makes the command buffer simpler, which is always a good thing.
