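Constant Buffers (Using Root Descriptor Tables)
Author: Internet
In this tutorial, we will see how to send data to the shaders using a descriptor table that contains a constant buffer view.
Root Signatures
I will start this tutorial with an explanation of root signatures.
A root signature is essentially the function signature (also known as the parameter list) of the shaders in the pipeline. This part of the root signature is called the root parameters. A function signature describes the data a function expects:
    void somefunction(int arg1, int arg2); // (int arg1, int arg2) is the function signature, or parameter list
A root signature is also given arguments, the actual data for the root parameters, which are called root arguments.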
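Root Parameters
When you write a shader, you write it as a function. It includes a function signature describing the data the shader expects from the previous stage. Let's look at a simple vertex shader:
    float4 main(float3 pos : POSITION) : SV_POSITION
    {
        return float4(pos, 1.0);
    }
In the function above, you can see that the vertex shader expects a float3 as input. This is the input it expects from the previous pipeline stage, the Input Assembler (IA) stage.
Since the shader already has a parameter list, what is a "root parameter"?
Think of the pipeline itself as a function that is called by the application running on the CPU. The pipeline has its own parameter list, and that parameter list is described by the root signature. These parameters describe the data we want from the application. They can take the form of constant buffer views (CBV), shader resource views (SRV), or unordered access views (UAV).
This can be confusing at first, but "resource view" is just another (older) term for "resource descriptor". CBV, SRV, and UAV simply keep their names from earlier versions of DirectX. Calling them "constant buffer descriptors" or "shader resource descriptors" would mean the same thing.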
Let's visualize the pipeline as a function for a moment (this is of course not how the pipeline actually works; it is only meant to show where the root signature fits in):
    // this is our gpu memory where resource heaps are actually at
    ResourceHeap resourceHeaps[];
    // this is the descriptor heap
    DescriptorHeap descriptorHeaps[];
    // this is our register list
    register b[]; // constant buffer register list
    register t[]; // shader resource register list
    register u[]; // uav register list

    // our root signature is the parameter list to the pipeline
    RenderTargetList RunPipeline(RootSignature rootSignature)
    {
        // loop through each descriptor table
        for(int i = 0; i < rootSignature.DescriptorTables.length; i++)
        {
            int startRegister = rootSignature.DescriptorTables[i].Range.BaseShaderRegister;
            for(int k = 0; k < rootSignature.DescriptorTables[i].Range.length; k++)
            {
                // if its a constant buffer descriptor table use b registers
                if(rootSignature.DescriptorTables[i].Range[k].RangeType == D3D12_DESCRIPTOR_RANGE_TYPE_CBV)
                {
                    // there are two indirections for descriptor tables
                    b[startRegister + k] = GetResourcePointer(GetDescriptorFromTable(rootSignature.DescriptorTables[i].Range[k].descriptorIndex));
                }
                // use t registers for srv's
                else if(rootSignature.DescriptorTables[i].Range[k].RangeType == D3D12_DESCRIPTOR_RANGE_TYPE_SRV)
                {
                    // there are two indirections for descriptor tables
                    t[startRegister + k] = GetResourcePointer(GetDescriptorFromTable(rootSignature.DescriptorTables[i].Range[k].descriptorIndex));
                }
                // ... then uav's and samplers
            }
        }

        // loop through each root descriptor
        for(int i = 0; i < rootSignature.RootDescriptors.length; i++)
        {
            // set registers for root descriptors. There is only one indirection here
        }

        // loop through each root constant
        for(int i = 0; i < rootSignature.RootConstants.length; i++)
        {
            // set registers to root constants. root constants have no indirections, making them the fastest
            // to access, but the number of them is limited by the root signature parameter limit.
        }

        VertexInput vi = null;
        if(rootSignature.D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT)
        {
            // If we specify to use the input assembler in the root signature, the IA will run and assemble
            // all the geometry we have bound to it, then pass the vertices to the vertex shader.
            // It is possible to not use the input assembler at all, but instead draw a certain number of vertices
            // and use their index to differentiate them, then create more geometry in the geometry shader.
            vi = RunInputAssembler();
        }

        // here we run the bound vertex shader
        VertexOutput vo = RunVertexShader(vi);

        // ... run other stages
    }

    // here's an example of a vertex shader now
    VertexOutput RunVertexShader(VertexInput vi)
    {
        // this constant buffer is bound to register b0. We must
        // make sure that the bound root signature has a parameter that
        // sets the b0 register
        cbuffer ConstantBuffer : register(b0)
        {
            float4 positionOffset;
        };

        // here is our vertex shader function. We use positionOffset, which is defined in a constant buffer.
        // This constant buffer is updated through the root signature. We must make sure that the root signature
        // contains a parameter for register b0, since that is what the constant buffer is bound to.
        float4 main(float3 pos : POSITION) : SV_POSITION
        {
            return float4(pos.x + positionOffset.x, pos.y + positionOffset.y, pos.z + positionOffset.z, 1.0);
        }
    }
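The code above may be more than is needed to show where the root signature fits in, but hopefully it gives visual learners a better idea. As you can see, the root signature is the parameter list of the pipeline. Every register that is used must be set through the root signature, which means the root signature must have a parameter for every register you want to set.
There are three types of root parameters: descriptor tables, root descriptors, and root constants.
The size of a root signature is limited. The root parameters can add up to at most 64 DWORDs.
If you use the input assembler, plan for a total of 63 DWORDs for the root parameters, since the input layout flag can take up one root argument slot on some hardware.
Some hardware has less than 64 DWORDs of space available for the root signature. Any root parameters at the end of the root signature that overflow the available memory gain 1 extra indirection.
You can think of an indirection as following a pointer to memory. For example, root constants have 0 indirections, which means the data is immediately available to the shader. Root descriptors have 1 indirection, which means the shader must follow a pointer to get to the actual data. Finally, descriptor tables have 2 indirections: the shader follows the descriptor table pointer into the descriptor heap, then follows the descriptor inside the descriptor heap to the actual resource data. Any parameter that does not fit in the root signature's available memory gains 1 indirection, which results in 1 indirection for root constants, 2 for root descriptors, and 3 for descriptor tables.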
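The indirection counts can be pictured with ordinary pointers. The following is only a sketch in plain C++: Resource, Descriptor, DescriptorTable, and the three read functions are hypothetical stand-ins invented for illustration, not Direct3D 12 types, and real hardware does not literally work this way.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these types exist in Direct3D 12.
struct Resource { float data; };                                // the actual resource data
struct Descriptor { const Resource* resource; };                // a descriptor points at a resource
struct DescriptorTable { const Descriptor* firstDescriptor; };  // a table points into a descriptor heap

// 0 indirections: the value itself is stored in the root signature.
float readRootConstant(float rootConstant) {
    return rootConstant;
}

// 1 indirection: follow the descriptor to the resource data.
float readRootDescriptor(const Descriptor& descriptor) {
    return descriptor.resource->data;
}

// 2 indirections: follow the table to a descriptor, then the descriptor to the data.
float readDescriptorTable(const DescriptorTable& table, int index) {
    return table.firstDescriptor[index].resource->data;
}
```

Each extra pointer that has to be followed is one more memory lookup before the shader reaches the data, which is why the parameter types differ in access cost.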
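When you create a root parameter, you want to deny access to any shader stage that does not need the parameter. This allows the GPU to optimize access to the data.
Root Constants
Root constants are 32-bit values (1 DWORD each) stored directly in the root signature. Each one takes up 1 DWORD of root signature space. They should be the most frequently accessed values, because they are faster to access than a constant buffer that a descriptor points to. A ProjectionView matrix might be a good candidate for root constants, since it is usually accessed for every vertex in the visible scene.
Root Descriptors
Root descriptors are inlined descriptors. Each one takes up 2 DWORDs of root signature space. They should be descriptors for frequently accessed resources, since the space available for parameters in the root signature is limited.
Root descriptors take 1 indirection to get to the resource data. This indirection comes from the descriptor, which is a pointer to the resource data. When a shader accesses a root descriptor, it must look up the resource that the descriptor points to.
Descriptor Tables
Finally, we have descriptor tables. Each one costs 1 DWORD and holds descriptor ranges. A range specifies a start and a count of descriptors within a descriptor heap. With descriptor tables you can use as many descriptors as you like. The downside is the extra indirection.
Descriptor tables take 2 indirections to get to the resource data. The descriptor table points to descriptors inside a descriptor heap, and those descriptors point to the actual resource data.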
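To make the budget concrete, here is a minimal sketch that tallies the DWORD costs stated above (1 per root constant, 2 per root descriptor, 1 per descriptor table) against the 64 DWORD limit. RootParameterCounts, rootSignatureCost, and fitsInRootSignature are hypothetical names made up for this example; Direct3D 12 has no such helpers.

```cpp
#include <cassert>

// Hypothetical helper types, not part of Direct3D 12.
struct RootParameterCounts {
    int rootConstants;     // 1 DWORD each (one 32-bit value)
    int rootDescriptors;   // 2 DWORDs each
    int descriptorTables;  // 1 DWORD each
};

constexpr int kMaxRootSignatureDwords = 64;  // limit stated in the text

// Total root signature space used, in DWORDs.
constexpr int rootSignatureCost(const RootParameterCounts& counts) {
    return counts.rootConstants * 1 + counts.rootDescriptors * 2 + counts.descriptorTables * 1;
}

// True if the parameters fit within the 64 DWORD limit.
constexpr bool fitsInRootSignature(const RootParameterCounts& counts) {
    return rootSignatureCost(counts) <= kMaxRootSignatureDwords;
}
```

For example, a 4x4 float matrix stored as root constants takes 16 DWORDs on its own, so adding two root descriptors and three descriptor tables brings the total to 23 DWORDs.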
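In this tutorial we will use descriptor tables, since they will be the most commonly used parameter type: most scenes contain shaders that need more textures and data than the 64 DWORDs of root parameters can hold.
Frame Buffering
We need a descriptor heap and a resource heap for the constant buffer for each frame. That way, while one frame is using its constant buffer, another frame can update its own. We do not want to update a constant buffer that is currently in use, so we create a constant buffer resource heap and a descriptor heap for every frame.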
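The idea of one constant buffer per frame can be sketched with plain CPU memory. Everything here is an assumption for illustration: ConstantData, FrameResources, and nextFrame are hypothetical names, and the modulo step is only a stand-in, since the tutorial gets its frame index from the swap chain.

```cpp
#include <array>
#include <cassert>

constexpr int frameBufferCount = 3;  // same value the tutorial uses

// Hypothetical stand-in for the contents of one constant buffer.
struct ConstantData { float colorMultiplier[4]; };

// One copy of the constant data per frame, so writing the copy for the
// current frame never touches a copy another frame may still be reading.
struct FrameResources {
    std::array<ConstantData, frameBufferCount> constantBuffers{};

    void update(int frameIndex, float red) {
        constantBuffers[frameIndex].colorMultiplier[0] = red;
    }
};

// Stand-in for advancing to the next frame's resources.
constexpr int nextFrame(int frameIndex) {
    return (frameIndex + 1) % frameBufferCount;
}
```

Writing frame 1's copy leaves frame 0's copy untouched, which is the property the per-frame heaps are meant to provide.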
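Tutorial Code
In this tutorial we will change the color of the quad every frame. To do this, we update a variable called colorMultiplier in the constant buffer each frame, and then multiply the color of each vertex by this value.
We will create a descriptor heap to store the constant buffer view (CBV), and a descriptor table that holds a range into that descriptor heap (there is only one range, since we only have one constant buffer).
Alright, I think we are ready for the code.
Constant Buffer Structure
When we use Map to update constant buffer data on the GPU, we need to make sure we are updating the right part of memory. To make this easier, we can create a constant buffer structure. We create an instance of this structure on the CPU. After updating the instance with the data we want in the constant buffer, we simply copy the data from the instance into the mapped constant buffer memory. We do not strictly need a structure, but it makes updating the constant buffer on the GPU much easier.
Our constant buffer only contains the colorMultiplier variable for now. It is a vector of 4 float values: x, y, z, and w. These are the red, green, blue, and alpha channels of a color. In the vertex shader, we multiply the color passed in with each vertex by this color multiplier to get the new color.
    // this is the structure of our constant buffer.
    struct ConstantBuffer {
        XMFLOAT4 colorMultiplier;
    };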
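New Global Variables
The first variable is an array of descriptor heaps, one per frame. Each heap stores the descriptor for our constant buffer (we only have one, but a larger application would have more).
The second variable is a resource, the actual memory on the GPU that stores our constant buffer data. This is called a resource heap, and we create it as an upload heap.
The third is an instance of our ConstantBuffer structure.
Finally, we have the memory address we get from the resource's Map method. We can copy our constant buffer data to this address. We must have finished copying data to the upload heap before we execute a command list that uses that memory.
    ID3D12DescriptorHeap* mainDescriptorHeap[frameBufferCount]; // this heap will store the descriptor to our constant buffer
    ID3D12Resource* constantBufferUploadHeap[frameBufferCount]; // this is the memory on the gpu where our constant buffer will be placed.

    ConstantBuffer cbColorMultiplierData; // this is the constant buffer data we will send to the gpu
                                          // (which will be placed in the resource we created above)

    UINT8* cbColorMultiplierGPUAddress[frameBufferCount]; // this is a pointer to the memory location we get when we map our constant buffer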
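Adding a Descriptor Table Parameter to the Root Signature
The root signature we create the PSO with must be compatible with the PSO's shaders. Our vertex shader uses a constant buffer bound to register b0, which means we must create a root signature with a parameter that binds to register b0.
We will create a descriptor table that describes a range of descriptors in our constant buffer descriptor heap.
We start by creating a descriptor range. We do this by filling out a D3D12_DESCRIPTOR_RANGE structure.
    typedef struct D3D12_DESCRIPTOR_RANGE {
        D3D12_DESCRIPTOR_RANGE_TYPE RangeType;
        UINT NumDescriptors;
        UINT BaseShaderRegister;
        UINT RegisterSpace;
        UINT OffsetInDescriptorsFromTableStart;
    } D3D12_DESCRIPTOR_RANGE;
- RangeType - This is a D3D12_DESCRIPTOR_RANGE_TYPE enumeration. It says whether this is a range of SRVs, UAVs, CBVs, or samplers.
- NumDescriptors - This is the number of descriptors in the range. In this tutorial we only have one constant buffer, so there is only one descriptor in the range.
- BaseShaderRegister - This is the first register this range binds to. Each descriptor maps to one register. We specify D3D12_DESCRIPTOR_RANGE_TYPE_CBV as the RangeType, which means these are b registers. We only have one descriptor, so we say the range starts at register 0, which is register b0.
- RegisterSpace - This is the register space. We set it to 0.
- OffsetInDescriptorsFromTableStart - This is the offset, in descriptors, from the start of the descriptor table at which this range begins. We can specify D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND here to say that this range directly follows the previous range in the table. Ours is the first range, so it starts at the beginning of the table.
    // create a descriptor range (descriptor table) and fill it out
    // this is a range of descriptors inside a descriptor heap
    D3D12_DESCRIPTOR_RANGE descriptorTableRanges[1]; // only one range right now
    descriptorTableRanges[0].RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_CBV; // this is a range of constant buffer views (descriptors)
    descriptorTableRanges[0].NumDescriptors = 1; // we only have one constant buffer, so the range is only 1
    descriptorTableRanges[0].BaseShaderRegister = 0; // start index of the shader registers in the range
    descriptorTableRanges[0].RegisterSpace = 0; // space 0. can usually be zero
    descriptorTableRanges[0].OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; // this range directly follows the previous range in the table (it is the first, so it starts at offset 0)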
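How a range maps descriptors to registers, and what an appended offset resolves to, can be sketched in portable C++. This is an illustration under assumptions: Range, resolveOffsets, and registerFor are hypothetical names that only mirror the fields discussed above, and kOffsetAppend stands in for D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND.

```cpp
#include <cassert>
#include <vector>

// Stand-in for D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND.
constexpr unsigned kOffsetAppend = 0xffffffffu;

// Hypothetical mirror of the D3D12_DESCRIPTOR_RANGE fields used here.
struct Range {
    unsigned numDescriptors;
    unsigned baseShaderRegister;
    unsigned offsetFromTableStart;  // explicit offset, or kOffsetAppend
};

// Resolves where each range starts, in descriptors from the start of the table.
// An appended range begins right after the range before it.
std::vector<unsigned> resolveOffsets(const std::vector<Range>& ranges) {
    std::vector<unsigned> offsets;
    unsigned next = 0;  // where an appended range would start
    for (const Range& range : ranges) {
        const unsigned start =
            (range.offsetFromTableStart == kOffsetAppend) ? next : range.offsetFromTableStart;
        offsets.push_back(start);
        next = start + range.numDescriptors;
    }
    return offsets;
}

// Register index that the k-th descriptor of a range binds to.
unsigned registerFor(const Range& range, unsigned k) {
    return range.baseShaderRegister + k;
}
```

With a single appended range, as in this tutorial, the range resolves to offset 0 and its only descriptor binds to register 0 of its type (b0 for a CBV range).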
Now we create the descriptor table by filling out a D3D12_ROOT_DESCRIPTOR_TABLE structure.
    typedef struct D3D12_ROOT_DESCRIPTOR_TABLE {
        UINT NumDescriptorRanges;
        const D3D12_DESCRIPTOR_RANGE *pDescriptorRanges;
    } D3D12_ROOT_DESCRIPTOR_TABLE;
- NumDescriptorRanges - The number of ranges this descriptor table contains. We can combine SRV, CBV, and UAV ranges here. Sampler ranges cannot be combined with the other types in the same table.
- pDescriptorRanges - A pointer to the array of ranges.
Although a range can only contain resource descriptors of the same type (CBV, UAV, or SRV), a descriptor table can contain ranges of different types.
    // create a descriptor table
    D3D12_ROOT_DESCRIPTOR_TABLE descriptorTable;
    descriptorTable.NumDescriptorRanges = _countof(descriptorTableRanges); // we only have one range
    descriptorTable.pDescriptorRanges = &descriptorTableRanges[0]; // the pointer to the beginning of our ranges array
Now we create a root parameter. We do this by filling out a D3D12_ROOT_PARAMETER structure.
    typedef struct D3D12_ROOT_PARAMETER {
        D3D12_ROOT_PARAMETER_TYPE ParameterType;
        union {
            D3D12_ROOT_DESCRIPTOR_TABLE DescriptorTable;
            D3D12_ROOT_CONSTANTS Constants;
            D3D12_ROOT_DESCRIPTOR Descriptor;
        };
        D3D12_SHADER_VISIBILITY ShaderVisibility;
    } D3D12_ROOT_PARAMETER;
- ParameterType - This is the type of the parameter, defined by the D3D12_ROOT_PARAMETER_TYPE enumeration. We are creating a descriptor table root parameter, so we use D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE.
- DescriptorTable - Fill this out only when D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE is specified as the ParameterType. This is a D3D12_ROOT_DESCRIPTOR_TABLE structure.
- Constants - Fill this out only when D3D12_ROOT_PARAMETER_TYPE_32BIT_CONSTANTS is specified as the ParameterType. This is a D3D12_ROOT_CONSTANTS structure.
- Descriptor - Fill this out only when one of the other D3D12_ROOT_PARAMETER_TYPE values is specified as the ParameterType. This is a D3D12_ROOT_DESCRIPTOR structure.
- ShaderVisibility - This is a D3D12_SHADER_VISIBILITY enumeration. It describes which shader stages can access this parameter. You can specify D3D12_SHADER_VISIBILITY_ALL to give every stage access, or name one single stage. The values are plain enumerators rather than bit flags, so they cannot be combined with OR (|); a parameter needed by several stages uses D3D12_SHADER_VISIBILITY_ALL. Granting access only to the stage that uses the parameter lets the GPU optimize it. For now, only our vertex shader accesses this constant buffer, so we specify D3D12_SHADER_VISIBILITY_VERTEX.
    // create a root parameter and fill it out
    D3D12_ROOT_PARAMETER rootParameters[1]; // only one parameter right now
    rootParameters[0].ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; // this is a descriptor table
    rootParameters[0].DescriptorTable = descriptorTable; // this is our descriptor table for this root parameter
    rootParameters[0].ShaderVisibility = D3D12_SHADER_VISIBILITY_VERTEX; // our vertex shader will be the only shader accessing this parameter for now
Finally, we fill out our root signature description. We must give it a pointer to the D3D12_ROOT_PARAMETER array.
Note that I also add flags that deny root signature access to the shader stages that do not need it. For now, only the vertex shader needs the root signature.
    CD3DX12_ROOT_SIGNATURE_DESC rootSignatureDesc;
    rootSignatureDesc.Init(_countof(rootParameters), // we have 1 root parameter
        rootParameters, // a pointer to the beginning of our root parameters array
        0, // no static samplers
        nullptr,
        D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT | // we can deny shader stages here for better performance
        D3D12_ROOT_SIGNATURE_FLAG_DENY_HULL_SHADER_ROOT_ACCESS |
        D3D12_ROOT_SIGNATURE_FLAG_DENY_DOMAIN_SHADER_ROOT_ACCESS |
        D3D12_ROOT_SIGNATURE_FLAG_DENY_GEOMETRY_SHADER_ROOT_ACCESS |
        D3D12_ROOT_SIGNATURE_FLAG_DENY_PIXEL_SHADER_ROOT_ACCESS);
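The root signature flags are bit flags, which is why they are combined with the | operator in the Init call. Here is a minimal sketch of that pattern in portable C++. The RootFlag enumerators, hasFlag, and tutorialFlags are hypothetical names, and the bit values are chosen for illustration only; check d3d12.h for the real D3D12_ROOT_SIGNATURE_FLAGS values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of a few root signature flags; one bit per flag.
// Bit values are illustrative, not taken from d3d12.h.
enum RootFlag : std::uint32_t {
    AllowInputLayout   = 1u << 0,
    DenyVertexAccess   = 1u << 1,
    DenyHullAccess     = 1u << 2,
    DenyDomainAccess   = 1u << 3,
    DenyGeometryAccess = 1u << 4,
    DenyPixelAccess    = 1u << 5,
};

// True if the given bit is set in the combined flags.
constexpr bool hasFlag(std::uint32_t flags, RootFlag flag) {
    return (flags & flag) != 0;
}

// The same combination the tutorial passes to Init: allow the input layout
// and deny every stage except the vertex shader.
constexpr std::uint32_t tutorialFlags() {
    return AllowInputLayout | DenyHullAccess | DenyDomainAccess |
           DenyGeometryAccess | DenyPixelAccess;
}
```

Because each flag occupies its own bit, any subset can be combined and later tested independently; the vertex stage keeps access here simply because its deny bit is not set.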
New Vertex List
We have removed the second quad from the previous tutorial.
    // a quad
    Vertex vList[] = {
        // x, y, z, r, g, b, a
        { -0.5f,  0.5f, 0.5f, 1.0f, 0.0f, 0.0f, 1.0f },
        {  0.5f, -0.5f, 0.5f, 1.0f, 0.0f, 1.0f, 1.0f },
        { -0.5f, -0.5f, 0.5f, 0.0f, 0.0f, 1.0f, 1.0f },
        {  0.5f,  0.5f, 0.5f, 0.0f, 1.0f, 0.0f, 1.0f }
    };
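Constant Buffer Descriptor Heap
We must create a descriptor heap to store our constant buffer descriptor. We can in fact create one descriptor heap for all CBV, UAV, and SRV descriptors, so in a later tutorial we will add SRV descriptors to this descriptor heap.
We start by filling out a D3D12_DESCRIPTOR_HEAP_DESC. We already covered this structure in a previous tutorial, so I will not go over it in detail here.
You will notice we set the type of this descriptor heap to D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV, so that we can use it to store constant buffer descriptors.
    for (int i = 0; i < frameBufferCount; ++i)
    {
        D3D12_DESCRIPTOR_HEAP_DESC heapDesc = {};
        heapDesc.NumDescriptors = 1;
        heapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;
        heapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
        hr = device->CreateDescriptorHeap(&heapDesc, IID_PPV_ARGS(&mainDescriptorHeap[i]));
        if (FAILED(hr))
        {
            Running = false;
        }
    }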
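Creating the Constant Buffer Resource Heap
We will create an upload heap to hold our constant buffer. Since we update this constant buffer frequently (at least once per frame), there is no reason to create a default heap and copy the upload heap into it. The constant buffer is uploaded to the GPU every frame anyway, so we keep it in an upload heap only.
We create a buffer that is 64KB in size. This has to do with alignment requirements. Resource heaps must be a multiple of 64KB. So even though our constant buffer is only 16 bytes (a vector of 4 floats), we must allocate at least 64KB. If our constant buffer were 65KB, we would need to allocate 128KB.
Single-texture and buffer resources must be 64KB aligned. Multi-sampled texture resources must be 4MB aligned.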
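The rounding described above can be written as a small helper. This is a sketch in portable C++: roundUpToMultiple and the two constants are hypothetical names introduced for illustration, and the values simply restate the 64KB and 4MB figures from the text.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kResourceAlignment = 64 * 1024;            // 64KB: buffers and single-sample textures
constexpr std::uint64_t kMsaaResourceAlignment = 4 * 1024 * 1024;  // 4MB: multi-sampled textures

// Hypothetical helper: rounds size up to the next multiple of alignment.
constexpr std::uint64_t roundUpToMultiple(std::uint64_t size, std::uint64_t alignment) {
    return ((size + alignment - 1) / alignment) * alignment;
}
```

A 16 byte constant buffer rounds up to 65536 bytes (64KB), and a 65KB one rounds up to 131072 bytes (128KB), matching the figures in the text.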
The shaders will read from this resource heap, so we set its initial state to D3D12_RESOURCE_STATE_GENERIC_READ.
    // create the constant buffer resource heap
    // We will update the constant buffer one or more times per frame, so we will use only an upload heap
    // unlike previously we used an upload heap to upload the vertex and index data, and then copied over
    // to a default heap. If you plan to use a resource for more than a couple frames, it is usually more
    // efficient to copy to a default heap where it stays on the gpu. In this case, our constant buffer
    // will be modified and uploaded at least once per frame, so we only use an upload heap

    // create a resource heap, descriptor heap, and pointer to cbv for each frame
    for (int i = 0; i < frameBufferCount; ++i)
    {
        // named locals, since taking the address of a temporary is not standard C++
        CD3DX12_HEAP_PROPERTIES uploadHeapProperties(D3D12_HEAP_TYPE_UPLOAD); // this heap will be used to upload the constant buffer data
        CD3DX12_RESOURCE_DESC bufferDesc = CD3DX12_RESOURCE_DESC::Buffer(1024 * 64); // size of the resource heap. Must be a multiple of 64KB for single-textures and constant buffers

        hr = device->CreateCommittedResource(
            &uploadHeapProperties,
            D3D12_HEAP_FLAG_NONE, // no flags
            &bufferDesc,
            D3D12_RESOURCE_STATE_GENERIC_READ, // will be data that is read from so we keep it in the generic read state
            nullptr, // we do not use an optimized clear value for constant buffers
            IID_PPV_ARGS(&constantBufferUploadHeap[i]));
        constantBufferUploadHeap[i]->SetName(L"Constant Buffer Upload Resource Heap");

        // (the body of this loop continues in the following sections)
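Creating the Constant Buffer View
We will create a constant buffer view, which describes the constant buffer and contains a pointer to the memory where the constant buffer data resides. To do this, we fill out a D3D12_CONSTANT_BUFFER_VIEW_DESC structure.
We can get a pointer to the GPU memory by calling the GetGPUVirtualAddress method of our constant buffer resource heap.
Constant buffers must be 256 byte aligned, which is a different requirement from the alignment of the resource heap. Constant buffer reads must be aligned to 256 bytes from the start of the resource heap. For the SizeInBytes field, we take the size of our constant buffer and add bytes until it is a multiple of 256 bytes.
        D3D12_CONSTANT_BUFFER_VIEW_DESC cbvDesc = {};
        cbvDesc.BufferLocation = constantBufferUploadHeap[i]->GetGPUVirtualAddress();
        cbvDesc.SizeInBytes = (sizeof(ConstantBuffer) + 255) & ~255; // CB size is required to be 256-byte aligned.
        device->CreateConstantBufferView(&cbvDesc, mainDescriptorHeap[i]->GetCPUDescriptorHandleForHeapStart());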
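The SizeInBytes expression relies on 256 being a power of two: adding 255 and then clearing the low 8 bits rounds up to the next multiple of 256. Below is a minimal sketch of the same arithmetic in portable C++. alignTo256 and constantBufferOffset are hypothetical helpers, and packing several constant buffers into one resource is an assumption for illustration, since this tutorial only stores one per resource.

```cpp
#include <cassert>
#include <cstddef>

// Same rounding as the SizeInBytes expression: add 255, then clear the low 8 bits.
constexpr std::size_t alignTo256(std::size_t size) {
    return (size + 255) & ~static_cast<std::size_t>(255);
}

// Hypothetical: byte offset of the n-th constant buffer if several of the
// same size were packed back to back in one resource.
constexpr std::size_t constantBufferOffset(std::size_t index, std::size_t sizeOfOne) {
    return index * alignTo256(sizeOfOne);
}
```

A 16 byte ConstantBuffer therefore gets a 256 byte view, and a second buffer of the same type would start 256 bytes after the first.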
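Clearing the Constant Buffer Data
Here we simply zero out all the memory in our constant buffer data to start with.
        ZeroMemory(&cbColorMultiplierData, sizeof(cbColorMultiplierData));
Mapping the Constant Buffer
The last thing we need to do with the constant buffer during initialization is map it.
First we create a range. This range is the region of memory within the constant buffer that the CPU intends to read. We can set begin equal to or greater than end, which means the CPU will not read from the constant buffer.
Next we map our constant buffer resource. We get back a pointer to a chunk of memory in the upload heap that the GPU will access when a command list uses it. We must make sure we have finished modifying this mapped memory on the CPU by the time a command list that uses this resource is executed.
You can keep a resource mapped for as long as you need. We only have to make sure we do not access the mapped region after the resource has been released.
We will keep our constant buffer resources mapped for the lifetime of the application.
Once the resource is mapped, we can copy data into it with memcpy. Every time we change a value, we memcpy the whole ConstantBuffer instance to the address we got from Map, so that when the command list is executed, the constant buffer bound to register b0 holds the new data and our vertex shader can access it.
        CD3DX12_RANGE readRange(0, 0); // We do not intend to read from this resource on the CPU. (End is less than or equal to begin)
        hr = constantBufferUploadHeap[i]->Map(0, &readRange, reinterpret_cast<void**>(&cbColorMultiplierGPUAddress[i]));
        memcpy(cbColorMultiplierGPUAddress[i], &cbColorMultiplierData, sizeof(cbColorMultiplierData));
    }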
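The copy itself is an ordinary memcpy from the CPU-side structure into the mapped address. The sketch below shows that step with plain memory standing in for the mapped upload heap. Float4, FakeMappedBuffer, upload, and readBack are all hypothetical stand-ins, and reading the data back on the CPU is done here only to check the copy; the tutorial never reads from the mapped resource.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct Float4 { float x, y, z, w; };                 // hypothetical stand-in for XMFLOAT4
struct ConstantBuffer { Float4 colorMultiplier; };  // same layout as the tutorial's struct

// Hypothetical stand-in for the memory returned by Map: 256 bytes of CPU memory.
struct FakeMappedBuffer {
    std::uint8_t bytes[256] = {};
    std::uint8_t* map() { return bytes; }
};

// Copies the whole CPU-side instance into the mapped memory.
void upload(std::uint8_t* mappedAddress, const ConstantBuffer& data) {
    std::memcpy(mappedAddress, &data, sizeof(data));
}

// Copies the bytes back out, only so the example can verify the upload.
ConstantBuffer readBack(const std::uint8_t* mappedAddress) {
    ConstantBuffer out;
    std::memcpy(&out, mappedAddress, sizeof(out));
    return out;
}
```

Copying the whole structure each time keeps the CPU instance and the mapped memory in step, at the cost of rewriting bytes that did not change.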
The Update Function
Here is our update function. We finally have something to put in it! This is where we update our game logic, which in this tutorial means updating the color multiplier and copying our ConstantBuffer instance data to the mapped constant buffer resource.
    void Update()
    {
        // update app logic, such as moving the camera or figuring out what objects are in view
        static float rIncrement = 0.00002f;
        static float gIncrement = 0.00006f;
        static float bIncrement = 0.00009f;

        cbColorMultiplierData.colorMultiplier.x += rIncrement;
        cbColorMultiplierData.colorMultiplier.y += gIncrement;
        cbColorMultiplierData.colorMultiplier.z += bIncrement;

        // clamp each channel to [0, 1] and reverse its direction when it hits a bound
        if (cbColorMultiplierData.colorMultiplier.x >= 1.0 || cbColorMultiplierData.colorMultiplier.x <= 0.0)
        {
            cbColorMultiplierData.colorMultiplier.x = cbColorMultiplierData.colorMultiplier.x >= 1.0 ? 1.0 : 0.0;
            rIncrement = -rIncrement;
        }
        if (cbColorMultiplierData.colorMultiplier.y >= 1.0 || cbColorMultiplierData.colorMultiplier.y <= 0.0)
        {
            cbColorMultiplierData.colorMultiplier.y = cbColorMultiplierData.colorMultiplier.y >= 1.0 ? 1.0 : 0.0;
            gIncrement = -gIncrement;
        }
        if (cbColorMultiplierData.colorMultiplier.z >= 1.0 || cbColorMultiplierData.colorMultiplier.z <= 0.0)
        {
            cbColorMultiplierData.colorMultiplier.z = cbColorMultiplierData.colorMultiplier.z >= 1.0 ? 1.0 : 0.0;
            bIncrement = -bIncrement;
        }

        // copy our ConstantBuffer instance to the mapped constant buffer resource
        memcpy(cbColorMultiplierGPUAddress[frameIndex], &cbColorMultiplierData, sizeof(cbColorMultiplierData));
    }
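The three channels in Update follow the same pattern: step the value, clamp it to the range 0 to 1, and reverse the step when a bound is hit. As a sketch of how that repetition could be factored out, here is a helper in portable C++. The name bounce is hypothetical and is not part of the tutorial's source.

```cpp
#include <cassert>

// Hypothetical helper: advances value by increment, clamps it to [0, 1],
// and flips the sign of increment when a bound is reached.
void bounce(float& value, float& increment) {
    value += increment;
    if (value >= 1.0f || value <= 0.0f) {
        value = (value >= 1.0f) ? 1.0f : 0.0f;
        increment = -increment;
    }
}
```

With such a helper, each channel update in Update would become a single call, for example bounce(cbColorMultiplierData.colorMultiplier.x, rIncrement).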
Setting the Descriptor Heap and Root Descriptor Table
First we create an array of descriptor heaps. Then we set the pipeline's descriptor heaps to the mainDescriptorHeap of the current frame.
After setting the descriptor heaps, we need to set root parameter 0 (the descriptor table) to the start of mainDescriptorHeap.
    // set constant buffer descriptor heap
    ID3D12DescriptorHeap* descriptorHeaps[] = { mainDescriptorHeap[frameIndex] };
    commandList->SetDescriptorHeaps(_countof(descriptorHeaps), descriptorHeaps);

    // set the root descriptor table 0 to the constant buffer descriptor heap
    commandList->SetGraphicsRootDescriptorTable(0, mainDescriptorHeap[frameIndex]->GetGPUDescriptorHandleForHeapStart());
Cleanup
Finally, we release the resources.
    for (int i = 0; i < frameBufferCount; ++i)
    {
        SAFE_RELEASE(mainDescriptorHeap[i]);
        SAFE_RELEASE(constantBufferUploadHeap[i]);
    }
New Vertex Shader
We have added a constant buffer to our vertex shader.
You will see that we have bound this constant buffer to register b0. We must make sure that the root signature we set is compatible with the shaders we set in the PSO.
We use the colorMultiplier variable from ConstantBuffer to get the new vertex color by multiplying it with the original vertex color.
    struct VS_INPUT
    {
        float3 pos : POSITION;
        float4 color: COLOR;
    };

    struct VS_OUTPUT
    {
        float4 pos: SV_POSITION;
        float4 color: COLOR;
    };

    cbuffer ConstantBuffer : register(b0)
    {
        float4 colorMultiplier;
    };

    VS_OUTPUT main(VS_INPUT input)
    {
        VS_OUTPUT output;
        output.pos = float4(input.pos, 1.0f);
        output.color = input.color * colorMultiplier;
        return output;
    }
Full Source Code
VertexShader.hlsl
    struct VS_INPUT
    {
        float3 pos : POSITION;
        float4 color: COLOR;
    };

    struct VS_OUTPUT
    {
        float4 pos: SV_POSITION;
        float4 color: COLOR;
    };

    cbuffer ConstantBuffer : register(b0)
    {
        float4 colorMultiplier;
    };

    VS_OUTPUT main(VS_INPUT input)
    {
        VS_OUTPUT output;
        output.pos = float4(input.pos, 1.0f);
        output.color = input.color * colorMultiplier;
        return output;
    }
PixelShader.hlsl
    struct VS_OUTPUT
    {
        float4 pos: SV_POSITION;
        float4 color: COLOR;
    };

    float4 main(VS_OUTPUT input) : SV_TARGET
    {
        // return interpolated color
        return input.color;
    }
stdafx.h
#pragma once #ifndef WIN32_LEAN_AND_MEAN #define WIN32_LEAN_AND_MEAN // Exclude rarely-used stuff from Windows headers. #endif #include <windows.h> #include <d3d12.h> #include <dxgi1_4.h> #include <D3Dcompiler.h> #include <DirectXMath.h> #include "d3dx12.h" #include <string> // this will only call release if an object exists (prevents exceptions calling release on non existant objects) #define SAFE_RELEASE(p) { if ( (p) ) { (p)->Release(); (p) = 0; } } using namespace DirectX; // we will be using the directxmath library // Handle to the window HWND hwnd = NULL; // name of the window (not the title) LPCTSTR WindowName = L"BzTutsApp"; // title of the window LPCTSTR WindowTitle = L"Bz Window"; // width and height of the window int Width = 800; int Height = 600; // is window full screen? bool FullScreen = false; // we will exit the program when this becomes false bool Running = true; // create a window bool InitializeWindow(HINSTANCE hInstance, int ShowWnd, bool fullscreen); // main application loop void mainloop(); // callback function for windows messages LRESULT CALLBACK WndProc(HWND hWnd, UINT msg, WPARAM wParam, LPARAM lParam); // direct3d stuff const int frameBufferCount = 3; // number of buffers we want, 2 for double buffering, 3 for tripple buffering ID3D12Device* device; // direct3d device IDXGISwapChain3* swapChain; // swapchain used to switch between render targets ID3D12CommandQueue* commandQueue; // container for command lists ID3D12DescriptorHeap* rtvDescriptorHeap; // a descriptor heap to hold resources like the render targets ID3D12Resource* renderTargets[frameBufferCount]; // number of render targets equal to buffer count ID3D12CommandAllocator* commandAllocator[frameBufferCount]; // we want enough allocators for each buffer * number of threads (we only have one thread) ID3D12GraphicsCommandList* commandList; // a command list we can record commands into, then execute them to render the frame ID3D12Fence* fence[frameBufferCount]; // an object that is 
locked while our command list is being executed by the gpu. We need as many //as we have allocators (more if we want to know when the gpu is finished with an asset) HANDLE fenceEvent; // a handle to an event when our fence is unlocked by the gpu UINT64 fenceValue[frameBufferCount]; // this value is incremented each frame. each fence will have its own value int frameIndex; // current rtv we are on int rtvDescriptorSize; // size of the rtv descriptor on the device (all front and back buffers will be the same size) // function declarations bool InitD3D(); // initializes direct3d 12 void Update(); // update the game logic void UpdatePipeline(); // update the direct3d pipeline (update command lists) void Render(); // execute the command list void Cleanup(); // release com ojects and clean up memory void WaitForPreviousFrame(); // wait until gpu is finished with command list ID3D12PipelineState* pipelineStateObject; // pso containing a pipeline state ID3D12RootSignature* rootSignature; // root signature defines data shaders will access D3D12_VIEWPORT viewport; // area that output from rasterizer will be stretched to. D3D12_RECT scissorRect; // the area to draw in. pixels outside that area will not be drawn onto ID3D12Resource* vertexBuffer; // a default buffer in GPU memory that we will load vertex data for our triangle into ID3D12Resource* indexBuffer; // a default buffer in GPU memory that we will load index data for our triangle into D3D12_VERTEX_BUFFER_VIEW vertexBufferView; // a structure containing a pointer to the vertex data in gpu memory // the total size of the buffer, and the size of each element (vertex) D3D12_INDEX_BUFFER_VIEW indexBufferView; // a structure holding information about the index buffer ID3D12Resource* depthStencilBuffer; // This is the memory for our depth buffer. 
it will also be used for a stencil buffer in a later tutorial ID3D12DescriptorHeap* dsDescriptorHeap; // This is a heap for our depth/stencil buffer descriptor // this is the structure of our constant buffer. struct ConstantBuffer { XMFLOAT4 colorMultiplier; }; ID3D12DescriptorHeap* mainDescriptorHeap[frameBufferCount]; // this heap will store the descripor to our constant buffer ID3D12Resource* constantBufferUploadHeap[frameBufferCount]; // this is the memory on the gpu where our constant buffer will be placed. ConstantBuffer cbColorMultiplierData; // this is the constant buffer data we will send to the gpu // (which will be placed in the resource we created above) UINT8* cbColorMultiplierGPUAddress[frameBufferCount]; // this is a pointer to the memory location we get when we map our constant buffer
main.cpp
#include "stdafx.h" struct Vertex { Vertex(float x, float y, float z, float r, float g, float b, float a) : pos(x, y, z), color(r, g, b, z) {} XMFLOAT3 pos; XMFLOAT4 color; }; int WINAPI WinMain(HINSTANCE hInstance, //Main windows function HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nShowCmd) { // create the window if (!InitializeWindow(hInstance, nShowCmd, FullScreen)) { MessageBox(0, L"Window Initialization - Failed", L"Error", MB_OK); return 1; } // initialize direct3d if (!InitD3D()) { MessageBox(0, L"Failed to initialize direct3d 12", L"Error", MB_OK); Cleanup(); return 1; } // start the main loop mainloop(); // we want to wait for the gpu to finish executing the command list before we start releasing everything WaitForPreviousFrame(); // close the fence event CloseHandle(fenceEvent); // clean up everything Cleanup(); return 0; } // create and show the window bool InitializeWindow(HINSTANCE hInstance, int ShowWnd, bool fullscreen) { if (fullscreen) { HMONITOR hmon = MonitorFromWindow(hwnd, MONITOR_DEFAULTTONEAREST); MONITORINFO mi = { sizeof(mi) }; GetMonitorInfo(hmon, &mi); Width = mi.rcMonitor.right - mi.rcMonitor.left; Height = mi.rcMonitor.bottom - mi.rcMonitor.top; } WNDCLASSEX wc; wc.cbSize = sizeof(WNDCLASSEX); wc.style = CS_HREDRAW | CS_VREDRAW; wc.lpfnWndProc = WndProc; wc.cbClsExtra = NULL; wc.cbWndExtra = NULL; wc.hInstance = hInstance; wc.hIcon = LoadIcon(NULL, IDI_APPLICATION); wc.hCursor = LoadCursor(NULL, IDC_ARROW); wc.hbrBackground = (HBRUSH)(COLOR_WINDOW + 2); wc.lpszMenuName = NULL; wc.lpszClassName = WindowName; wc.hIconSm = LoadIcon(NULL, IDI_APPLICATION); if (!RegisterClassEx(&wc)) { MessageBox(NULL, L"Error registering class", L"Error", MB_OK | MB_ICONERROR); return false; } hwnd = CreateWindowEx(NULL, WindowName, WindowTitle, WS_OVERLAPPEDWINDOW, CW_USEDEFAULT, CW_USEDEFAULT, Width, Height, NULL, NULL, hInstance, NULL); if (!hwnd) { MessageBox(NULL, L"Error creating window", L"Error", MB_OK | MB_ICONERROR); return false; } if 
(fullscreen) { SetWindowLong(hwnd, GWL_STYLE, 0); } ShowWindow(hwnd, ShowWnd); UpdateWindow(hwnd); return true; } void mainloop() { MSG msg; ZeroMemory(&msg, sizeof(MSG)); while (Running) { if (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE)) { if (msg.message == WM_QUIT) break; TranslateMessage(&msg); DispatchMessage(&msg); } else { // run game code Update(); // update the game logic Render(); // execute the command queue (rendering the scene is the result of the gpu executing the command lists) } } } LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam) { switch (msg) { case WM_KEYDOWN: if (wParam == VK_ESCAPE) { if (MessageBox(0, L"Are you sure you want to exit?", L"Really?", MB_YESNO | MB_ICONQUESTION) == IDYES) { Running = false; DestroyWindow(hwnd); } } return 0; case WM_DESTROY: // x button on top right corner of window was pressed Running = false; PostQuitMessage(0); return 0; } return DefWindowProc(hwnd, msg, wParam, lParam); } bool InitD3D() { HRESULT hr; // -- Create the Device -- // IDXGIFactory4* dxgiFactory; hr = CreateDXGIFactory1(IID_PPV_ARGS(&dxgiFactory)); if (FAILED(hr)) { return false; } IDXGIAdapter1* adapter; // adapters are the graphics card (this includes the embedded graphics on the motherboard) int adapterIndex = 0; // we'll start looking for directx 12 compatible graphics devices starting at index 0 bool adapterFound = false; // set this to true when a good one was found // find first hardware gpu that supports d3d 12 while (dxgiFactory->EnumAdapters1(adapterIndex, &adapter) != DXGI_ERROR_NOT_FOUND) { DXGI_ADAPTER_DESC1 desc; adapter->GetDesc1(&desc); if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE) { // we dont want a software device continue; } // we want a device that is compatible with direct3d 12 (feature level 11 or higher) hr = D3D12CreateDevice(adapter, D3D_FEATURE_LEVEL_11_0, _uuidof(ID3D12Device), nullptr); if (SUCCEEDED(hr)) { adapterFound = true; break; } adapterIndex++; } if (!adapterFound) { return false; } // 
Create the device hr = D3D12CreateDevice( adapter, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device) ); if (FAILED(hr)) { return false; } // -- Create a direct command queue -- // D3D12_COMMAND_QUEUE_DESC cqDesc = {}; cqDesc.Flags = D3D12_COMMAND_QUEUE_FLAG_NONE; cqDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT; // direct means the gpu can directly execute this command queue hr = device->CreateCommandQueue(&cqDesc, IID_PPV_ARGS(&commandQueue)); // create the command queue if (FAILED(hr)) { return false; } // -- Create the Swap Chain (double/tripple buffering) -- // DXGI_MODE_DESC backBufferDesc = {}; // this is to describe our display mode backBufferDesc.Width = Width; // buffer width backBufferDesc.Height = Height; // buffer height backBufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM; // format of the buffer (rgba 32 bits, 8 bits for each chanel) // describe our multi-sampling. We are not multi-sampling, so we set the count to 1 (we need at least one sample of course) DXGI_SAMPLE_DESC sampleDesc = {}; sampleDesc.Count = 1; // multisample count (no multisampling, so we just put 1, since we still need 1 sample) // Describe and create the swap chain. 
DXGI_SWAP_CHAIN_DESC swapChainDesc = {}; swapChainDesc.BufferCount = frameBufferCount; // number of buffers we have swapChainDesc.BufferDesc = backBufferDesc; // our back buffer description swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT; // this says the pipeline will render to this swap chain swapChainDesc.SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD; // dxgi will discard the buffer (data) after we call present swapChainDesc.OutputWindow = hwnd; // handle to our window swapChainDesc.SampleDesc = sampleDesc; // our multi-sampling description swapChainDesc.Windowed = !FullScreen; // set to true, then if in fullscreen must call SetFullScreenState with true for full screen to get uncapped fps IDXGISwapChain* tempSwapChain; dxgiFactory->CreateSwapChain( commandQueue, // the queue will be flushed once the swap chain is created &swapChainDesc, // give it the swap chain description we created above &tempSwapChain // store the created swap chain in a temp IDXGISwapChain interface ); swapChain = static_cast<IDXGISwapChain3*>(tempSwapChain); frameIndex = swapChain->GetCurrentBackBufferIndex(); // -- Create the Back Buffers (render target views) Descriptor Heap -- // // describe an rtv descriptor heap and create D3D12_DESCRIPTOR_HEAP_DESC rtvHeapDesc = {}; rtvHeapDesc.NumDescriptors = frameBufferCount; // number of descriptors for this heap. rtvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV; // this heap is a render target view heap // This heap will not be directly referenced by the shaders (not shader visible), as this will store the output from the pipeline // otherwise we would set the heap's flag to D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE rtvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE; hr = device->CreateDescriptorHeap(&rtvHeapDesc, IID_PPV_ARGS(&rtvDescriptorHeap)); if (FAILED(hr)) { return false; } // get the size of a descriptor in this heap (this is a rtv heap, so only rtv descriptors should be stored in it. 
    // descriptor sizes may vary from device to device, which is why there is no set size and we must ask the
    // device to give us the size. we will use this size to increment a descriptor handle offset
    rtvDescriptorSize = device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_RTV);

    // get a handle to the first descriptor in the descriptor heap. a handle is basically a pointer,
    // but we cannot literally use it like a c++ pointer.
    CD3DX12_CPU_DESCRIPTOR_HANDLE rtvHandle(rtvDescriptorHeap->GetCPUDescriptorHandleForHeapStart());

    // Create a RTV for each buffer (double buffering is two buffers, triple buffering is 3).
    for (int i = 0; i < frameBufferCount; i++)
    {
        // first we get the n'th buffer in the swap chain and store it in the n'th
        // position of our ID3D12Resource array
        hr = swapChain->GetBuffer(i, IID_PPV_ARGS(&renderTargets[i]));
        if (FAILED(hr)) { return false; }

        // then we "create" a render target view which binds the swap chain buffer (ID3D12Resource[n]) to the rtv handle
        device->CreateRenderTargetView(renderTargets[i], nullptr, rtvHandle);

        // we increment the rtv handle by the rtv descriptor size we got above
        rtvHandle.Offset(1, rtvDescriptorSize);
    }

    // -- Create the Command Allocators -- //

    for (int i = 0; i < frameBufferCount; i++)
    {
        hr = device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT, IID_PPV_ARGS(&commandAllocator[i]));
        if (FAILED(hr)) { return false; }
    }

    // -- Create a Command List -- //

    // create the command list with the allocator of the current frame
    hr = device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT, commandAllocator[frameIndex], NULL, IID_PPV_ARGS(&commandList));
    if (FAILED(hr)) { return false; }

    // -- Create a Fence & Fence Event -- //

    // create the fences
    for (int i = 0; i < frameBufferCount; i++)
    {
        hr = device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence[i]));
        if (FAILED(hr)) { return false; }
        fenceValue[i] = 0; // set the initial fence value to 0
    }

    // create a handle to a fence event
    fenceEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);
    if (fenceEvent == nullptr) { return false; }

    // create root signature

    // create a descriptor range (one part of a descriptor table) and fill it out
    // this is a range of descriptors inside a descriptor heap
    D3D12_DESCRIPTOR_RANGE descriptorTableRanges[1]; // only one range right now
    descriptorTableRanges[0].RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_CBV; // this is a range of constant buffer views (descriptors)
    descriptorTableRanges[0].NumDescriptors = 1; // we only have one constant buffer, so the range is only 1
    descriptorTableRanges[0].BaseShaderRegister = 0; // start index of the shader registers in the range
    descriptorTableRanges[0].RegisterSpace = 0; // space 0. can usually be zero
    descriptorTableRanges[0].OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; // place this range directly after the previous range in the descriptor table

    // create a descriptor table
    D3D12_ROOT_DESCRIPTOR_TABLE descriptorTable;
    descriptorTable.NumDescriptorRanges = _countof(descriptorTableRanges); // we only have one range
    descriptorTable.pDescriptorRanges = &descriptorTableRanges[0]; // the pointer to the beginning of our ranges array

    // create a root parameter and fill it out
    D3D12_ROOT_PARAMETER rootParameters[1]; // only one parameter right now
    rootParameters[0].ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; // this is a descriptor table
    rootParameters[0].DescriptorTable = descriptorTable; // this is our descriptor table for this root parameter
    rootParameters[0].ShaderVisibility = D3D12_SHADER_VISIBILITY_VERTEX; // our vertex shader will be the only shader accessing this parameter for now

    CD3DX12_ROOT_SIGNATURE_DESC rootSignatureDesc;
    rootSignatureDesc.Init(_countof(rootParameters), // we have 1 root parameter
        rootParameters, // a pointer to the beginning of our root parameters array
        0,
        nullptr,
        D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT | // we can deny shader stages here for better performance
        D3D12_ROOT_SIGNATURE_FLAG_DENY_HULL_SHADER_ROOT_ACCESS |
        D3D12_ROOT_SIGNATURE_FLAG_DENY_DOMAIN_SHADER_ROOT_ACCESS |
        D3D12_ROOT_SIGNATURE_FLAG_DENY_GEOMETRY_SHADER_ROOT_ACCESS |
        D3D12_ROOT_SIGNATURE_FLAG_DENY_PIXEL_SHADER_ROOT_ACCESS);

    ID3DBlob* signature;
    hr = D3D12SerializeRootSignature(&rootSignatureDesc, D3D_ROOT_SIGNATURE_VERSION_1, &signature, nullptr);
    if (FAILED(hr)) { return false; }

    hr = device->CreateRootSignature(0, signature->GetBufferPointer(), signature->GetBufferSize(), IID_PPV_ARGS(&rootSignature));
    if (FAILED(hr)) { return false; }

    // create vertex and pixel shaders

    // when debugging, we can compile the shader files at runtime.
    // but for release versions, we can compile the hlsl shaders
    // with fxc.exe to create .cso files, which contain the shader
    // bytecode. We can load the .cso files at runtime to get the
    // shader bytecode, which of course is faster than compiling
    // them at runtime

    // compile vertex shader
    ID3DBlob* vertexShader; // d3d blob for holding vertex shader bytecode
    ID3DBlob* errorBuff = nullptr; // a buffer holding the error data if any
    hr = D3DCompileFromFile(L"VertexShader.hlsl",
        nullptr,
        nullptr,
        "main",
        "vs_5_0",
        D3DCOMPILE_DEBUG | D3DCOMPILE_SKIP_OPTIMIZATION,
        0,
        &vertexShader,
        &errorBuff);
    if (FAILED(hr))
    {
        // errorBuff is only set when the compiler produced messages, so check it before reading
        if (errorBuff) { OutputDebugStringA((char*)errorBuff->GetBufferPointer()); }
        return false;
    }

    // fill out a shader bytecode structure, which is basically just a pointer
    // to the shader bytecode and the size of the shader bytecode
    D3D12_SHADER_BYTECODE vertexShaderBytecode = {};
    vertexShaderBytecode.BytecodeLength = vertexShader->GetBufferSize();
    vertexShaderBytecode.pShaderBytecode = vertexShader->GetBufferPointer();

    // compile pixel shader
    ID3DBlob* pixelShader;
    hr = D3DCompileFromFile(L"PixelShader.hlsl",
        nullptr,
        nullptr,
        "main",
        "ps_5_0",
        D3DCOMPILE_DEBUG | D3DCOMPILE_SKIP_OPTIMIZATION,
        0,
        &pixelShader,
        &errorBuff);
    if (FAILED(hr))
    {
        if (errorBuff) { OutputDebugStringA((char*)errorBuff->GetBufferPointer()); }
        return false;
    }

    // fill out shader bytecode structure for pixel shader
    D3D12_SHADER_BYTECODE pixelShaderBytecode = {};
    pixelShaderBytecode.BytecodeLength = pixelShader->GetBufferSize();
    pixelShaderBytecode.pShaderBytecode = pixelShader->GetBufferPointer();

    // create input layout

    // The input layout is used by the Input Assembler so that it knows
    // how to read the vertex data bound to it.
    D3D12_INPUT_ELEMENT_DESC inputLayout[] =
    {
        { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
        { "COLOR", 0, DXGI_FORMAT_R32G32B32A32_FLOAT, 0, 12, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 }
    };

    // fill out an input layout description structure
    D3D12_INPUT_LAYOUT_DESC inputLayoutDesc = {};

    // we can get the number of elements in an array by "sizeof(array) / sizeof(arrayElementType)"
    inputLayoutDesc.NumElements = sizeof(inputLayout) / sizeof(D3D12_INPUT_ELEMENT_DESC);
    inputLayoutDesc.pInputElementDescs = inputLayout;

    // create a pipeline state object (PSO)

    // In a real application, you will have many pso's. You need a separate pso for each different shader
    // or combination of shaders, each different blend state or rasterizer state,
    // each topology type (point, line, triangle, patch), and each different number
    // of render targets.

    // VS is the only required shader for a pso. You might be wondering when you would
    // only set the VS. It's possible to have a pso that only outputs data with the stream
    // output, and not to a render target, which means you would not need anything after the stream
    // output.
    D3D12_GRAPHICS_PIPELINE_STATE_DESC psoDesc = {}; // a structure to define a pso
    psoDesc.InputLayout = inputLayoutDesc; // the structure describing our input layout
    psoDesc.pRootSignature = rootSignature; // the root signature that describes the input data this pso needs
    psoDesc.VS = vertexShaderBytecode; // structure describing where to find the vertex shader bytecode and how large it is
    psoDesc.PS = pixelShaderBytecode; // same as VS but for pixel shader
    psoDesc.PrimitiveTopologyType = D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE; // type of topology we are drawing
    psoDesc.RTVFormats[0] = DXGI_FORMAT_R8G8B8A8_UNORM; // format of the render target
    psoDesc.DSVFormat = DXGI_FORMAT_D32_FLOAT; // added here: should match the format of the depth/stencil buffer created below
    psoDesc.SampleDesc = sampleDesc; // must be the same sample description as the swapchain and depth/stencil buffer
    psoDesc.SampleMask = 0xffffffff; // the sample mask selects which samples may be written when multi-sampling. 0xffffffff enables all of them
    psoDesc.RasterizerState = CD3DX12_RASTERIZER_DESC(D3D12_DEFAULT); // a default rasterizer state.
    psoDesc.BlendState = CD3DX12_BLEND_DESC(D3D12_DEFAULT); // a default blend state.
    psoDesc.NumRenderTargets = 1; // we are only binding one render target
    psoDesc.DepthStencilState = CD3DX12_DEPTH_STENCIL_DESC(D3D12_DEFAULT); // a default depth stencil state

    // create the pso
    hr = device->CreateGraphicsPipelineState(&psoDesc, IID_PPV_ARGS(&pipelineStateObject));
    if (FAILED(hr)) { return false; }

    // Create vertex buffer

    // a quad: four vertices, each with a position (x, y, z) and a color (r, g, b, a)
    Vertex vList[] = {
        { -0.5f,  0.5f, 0.5f, 1.0f, 0.0f, 0.0f, 1.0f },
        {  0.5f, -0.5f, 0.5f, 1.0f, 0.0f, 1.0f, 1.0f },
        { -0.5f, -0.5f, 0.5f, 0.0f, 0.0f, 1.0f, 1.0f },
        {  0.5f,  0.5f, 0.5f, 0.0f, 1.0f, 0.0f, 1.0f }
    };

    int vBufferSize = sizeof(vList);

    // create default heap
    // default heap is memory on the GPU. Only the GPU has access to this memory
    // To get data into this heap, we will have to upload the data using
    // an upload heap
    device->CreateCommittedResource(
        &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT), // a default heap
        D3D12_HEAP_FLAG_NONE, // no flags
        &CD3DX12_RESOURCE_DESC::Buffer(vBufferSize), // resource description for a buffer
        D3D12_RESOURCE_STATE_COPY_DEST, // we will start this heap in the copy destination state since we will copy data
                                        // from the upload heap to this heap
        nullptr, // optimized clear value must be null for this type of resource. used for render targets and depth/stencil buffers
        IID_PPV_ARGS(&vertexBuffer));

    // we can give resource heaps a name so when we debug with the graphics debugger we know what resource we are looking at
    vertexBuffer->SetName(L"Vertex Buffer Resource Heap");

    // create upload heap
    // upload heaps are used to upload data to the GPU. CPU can write to it, GPU can read from it
    // We will upload the vertex buffer using this heap to the default heap
    ID3D12Resource* vBufferUploadHeap;
    device->CreateCommittedResource(
        &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_UPLOAD), // upload heap
        D3D12_HEAP_FLAG_NONE, // no flags
        &CD3DX12_RESOURCE_DESC::Buffer(vBufferSize), // resource description for a buffer
        D3D12_RESOURCE_STATE_GENERIC_READ, // GPU will read from this buffer and copy its contents to the default heap
        nullptr,
        IID_PPV_ARGS(&vBufferUploadHeap));
    vBufferUploadHeap->SetName(L"Vertex Buffer Upload Resource Heap");

    // store vertex buffer in upload heap
    D3D12_SUBRESOURCE_DATA vertexData = {};
    vertexData.pData = reinterpret_cast<BYTE*>(vList); // pointer to our vertex array
    vertexData.RowPitch = vBufferSize; // size of all our vertex data
    vertexData.SlicePitch = vBufferSize; // also the size of our vertex data

    // we are now creating a command with the command list to copy the data from
    // the upload heap to the default heap
    UpdateSubresources(commandList, vertexBuffer, vBufferUploadHeap, 0, 0, 1, &vertexData);

    // transition the vertex buffer data from copy destination state to vertex buffer state
    commandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(vertexBuffer, D3D12_RESOURCE_STATE_COPY_DEST, D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER));

    // Create index buffer

    // a quad (2 triangles)
    DWORD iList[] = {
        0, 1, 2, // first triangle
        0, 3, 1, // second triangle
    };

    int iBufferSize = sizeof(iList);

    // create default heap to hold index buffer
    device->CreateCommittedResource(
        &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT), // a default heap
        D3D12_HEAP_FLAG_NONE, // no flags
        &CD3DX12_RESOURCE_DESC::Buffer(iBufferSize), // resource description for a buffer
        D3D12_RESOURCE_STATE_COPY_DEST, // start in the copy destination state
        nullptr, // optimized clear value must be null for this type of resource
        IID_PPV_ARGS(&indexBuffer));

    // we can give resource heaps a name so when we debug with the graphics debugger we know what resource we are looking at
    indexBuffer->SetName(L"Index Buffer Resource Heap");

    // create upload heap to upload index buffer
    ID3D12Resource* iBufferUploadHeap;
    device->CreateCommittedResource(
        &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_UPLOAD), // upload heap
        D3D12_HEAP_FLAG_NONE, // no flags
        &CD3DX12_RESOURCE_DESC::Buffer(iBufferSize), // resource description for a buffer
        D3D12_RESOURCE_STATE_GENERIC_READ, // GPU will read from this buffer and copy its contents to the default heap
        nullptr,
        IID_PPV_ARGS(&iBufferUploadHeap));
    iBufferUploadHeap->SetName(L"Index Buffer Upload Resource Heap");

    // store index buffer in upload heap
    D3D12_SUBRESOURCE_DATA indexData = {};
    indexData.pData = reinterpret_cast<BYTE*>(iList); // pointer to our index array
    indexData.RowPitch = iBufferSize; // size of all our index data
    indexData.SlicePitch = iBufferSize; // also the size of our index data

    // we are now creating a command with the command list to copy the data from
    // the upload heap to the default heap
    UpdateSubresources(commandList, indexBuffer, iBufferUploadHeap, 0, 0, 1, &indexData);

    // transition the index buffer data from copy destination state to index buffer state
    commandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(indexBuffer, D3D12_RESOURCE_STATE_COPY_DEST, D3D12_RESOURCE_STATE_INDEX_BUFFER));

    // Create the depth/stencil buffer

    // create a depth stencil descriptor heap so we can get a pointer to the depth stencil buffer
    D3D12_DESCRIPTOR_HEAP_DESC dsvHeapDesc = {};
    dsvHeapDesc.NumDescriptors = 1;
    dsvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_DSV;
    dsvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE;
    hr = device->CreateDescriptorHeap(&dsvHeapDesc, IID_PPV_ARGS(&dsDescriptorHeap));
    if (FAILED(hr))
    {
        Running = false;
        return false; // do not continue with a null descriptor heap
    }

    D3D12_DEPTH_STENCIL_VIEW_DESC depthStencilDesc = {};
    depthStencilDesc.Format = DXGI_FORMAT_D32_FLOAT;
    depthStencilDesc.ViewDimension = D3D12_DSV_DIMENSION_TEXTURE2D;
    depthStencilDesc.Flags = D3D12_DSV_FLAG_NONE;

    D3D12_CLEAR_VALUE depthOptimizedClearValue = {};
    depthOptimizedClearValue.Format = DXGI_FORMAT_D32_FLOAT;
    depthOptimizedClearValue.DepthStencil.Depth = 1.0f;
    depthOptimizedClearValue.DepthStencil.Stencil = 0;

    device->CreateCommittedResource(
        &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT),
        D3D12_HEAP_FLAG_NONE,
        &CD3DX12_RESOURCE_DESC::Tex2D(DXGI_FORMAT_D32_FLOAT, Width, Height, 1, 0, 1, 0, D3D12_RESOURCE_FLAG_ALLOW_DEPTH_STENCIL),
        D3D12_RESOURCE_STATE_DEPTH_WRITE,
        &depthOptimizedClearValue,
        IID_PPV_ARGS(&depthStencilBuffer)
        );
    dsDescriptorHeap->SetName(L"Depth/Stencil Resource Heap");

    device->CreateDepthStencilView(depthStencilBuffer, &depthStencilDesc, dsDescriptorHeap->GetCPUDescriptorHandleForHeapStart());

    // Create a constant buffer descriptor heap for each frame
    // this is the descriptor heap that will store our constant buffer descriptor
    for (int i = 0; i < frameBufferCount; ++i)
    {
        D3D12_DESCRIPTOR_HEAP_DESC heapDesc = {};
        heapDesc.NumDescriptors = 1;
        heapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;
        heapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
        hr = device->CreateDescriptorHeap(&heapDesc, IID_PPV_ARGS(&mainDescriptorHeap[i]));
        if (FAILED(hr))
        {
            Running = false;
            return false;
        }
    }

    // create the constant buffer resource heap
    // We will update the constant buffer one or more times per frame, so we will use only an upload heap.
    // This differs from the vertex and index data above, which we uploaded through an upload heap and then
    // copied to a default heap. If you plan to use a resource for more than a couple of frames, it is usually more
    // efficient to copy it to a default heap where it stays on the gpu. In this case, our constant buffer
    // will be modified and uploaded at least once per frame, so we only use an upload heap

    // create a resource heap and a constant buffer view, and map a CPU pointer to the buffer, for each frame
    for (int i = 0; i < frameBufferCount; ++i)
    {
        hr = device->CreateCommittedResource(
            &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_UPLOAD), // this heap will be used to upload the constant buffer data
            D3D12_HEAP_FLAG_NONE, // no flags
            &CD3DX12_RESOURCE_DESC::Buffer(1024 * 64), // size of the resource heap. Must be a multiple of 64KB for single-textures and constant buffers
            D3D12_RESOURCE_STATE_GENERIC_READ, // will be data that is read from so we keep it in the generic read state
            nullptr, // we do not use an optimized clear value for constant buffers
            IID_PPV_ARGS(&constantBufferUploadHeap[i]));
        if (FAILED(hr))
        {
            Running = false;
            return false;
        }
        constantBufferUploadHeap[i]->SetName(L"Constant Buffer Upload Resource Heap");

        D3D12_CONSTANT_BUFFER_VIEW_DESC cbvDesc = {};
        cbvDesc.BufferLocation = constantBufferUploadHeap[i]->GetGPUVirtualAddress();
        cbvDesc.SizeInBytes = (sizeof(ConstantBuffer) + 255) & ~255; // CB size is required to be 256-byte aligned.
        device->CreateConstantBufferView(&cbvDesc, mainDescriptorHeap[i]->GetCPUDescriptorHandleForHeapStart());

        ZeroMemory(&cbColorMultiplierData, sizeof(cbColorMultiplierData));

        CD3DX12_RANGE readRange(0, 0); // We do not intend to read from this resource on the CPU. (End is less than or equal to begin)
        hr = constantBufferUploadHeap[i]->Map(0, &readRange, reinterpret_cast<void**>(&cbColorMultiplierGPUAddress[i]));
        if (FAILED(hr))
        {
            Running = false;
            return false; // the mapped pointer is not valid, so do not copy into it
        }
        memcpy(cbColorMultiplierGPUAddress[i], &cbColorMultiplierData, sizeof(cbColorMultiplierData));
    }

    // Now we execute the command list to upload the initial assets (vertex and index data)
    commandList->Close();
    ID3D12CommandList* ppCommandLists[] = { commandList };
    commandQueue->ExecuteCommandLists(_countof(ppCommandLists), ppCommandLists);

    // increment the fence value now, otherwise the buffer might not be uploaded by the time we start drawing
    fenceValue[frameIndex]++;
    hr = commandQueue->Signal(fence[frameIndex], fenceValue[frameIndex]);
    if (FAILED(hr))
    {
        Running = false;
    }

    // create a vertex buffer view for the quad. We get the GPU memory address of the vertex buffer using the GetGPUVirtualAddress() method
    vertexBufferView.BufferLocation = vertexBuffer->GetGPUVirtualAddress();
    vertexBufferView.StrideInBytes = sizeof(Vertex);
    vertexBufferView.SizeInBytes = vBufferSize;

    // create an index buffer view for the quad. We get the GPU memory address of the index buffer using the GetGPUVirtualAddress() method
    indexBufferView.BufferLocation = indexBuffer->GetGPUVirtualAddress();
    indexBufferView.Format = DXGI_FORMAT_R32_UINT; // 32-bit unsigned integer (this is what a dword is, double word, a word is 2 bytes)
    indexBufferView.SizeInBytes = iBufferSize;

    // Fill out the Viewport
    viewport.TopLeftX = 0;
    viewport.TopLeftY = 0;
    viewport.Width = Width;
    viewport.Height = Height;
    viewport.MinDepth = 0.0f;
    viewport.MaxDepth = 1.0f;

    // Fill out a scissor rect
    scissorRect.left = 0;
    scissorRect.top = 0;
    scissorRect.right = Width;
    scissorRect.bottom = Height;

    return true;
}

void Update()
{
    // update app logic, such as moving the camera or figuring out what objects are in view
    // here each color channel of the multiplier bounces back and forth between 0.0 and 1.0
    static float rIncrement = 0.00002f;
    static float gIncrement = 0.00006f;
    static float bIncrement = 0.00009f;

    cbColorMultiplierData.colorMultiplier.x += rIncrement;
    cbColorMultiplierData.colorMultiplier.y += gIncrement;
    cbColorMultiplierData.colorMultiplier.z += bIncrement;

    if (cbColorMultiplierData.colorMultiplier.x >= 1.0 || cbColorMultiplierData.colorMultiplier.x <= 0.0)
    {
        cbColorMultiplierData.colorMultiplier.x = cbColorMultiplierData.colorMultiplier.x >= 1.0 ? 1.0 : 0.0;
        rIncrement = -rIncrement;
    }
    if (cbColorMultiplierData.colorMultiplier.y >= 1.0 || cbColorMultiplierData.colorMultiplier.y <= 0.0)
    {
        cbColorMultiplierData.colorMultiplier.y = cbColorMultiplierData.colorMultiplier.y >= 1.0 ? 1.0 : 0.0;
        gIncrement = -gIncrement;
    }
    if (cbColorMultiplierData.colorMultiplier.z >= 1.0 || cbColorMultiplierData.colorMultiplier.z <= 0.0)
    {
        cbColorMultiplierData.colorMultiplier.z = cbColorMultiplierData.colorMultiplier.z >= 1.0 ? 1.0 : 0.0;
        bIncrement = -bIncrement;
    }

    // copy our ConstantBuffer instance to the mapped constant buffer resource
    memcpy(cbColorMultiplierGPUAddress[frameIndex], &cbColorMultiplierData, sizeof(cbColorMultiplierData));
}

void UpdatePipeline()
{
    HRESULT hr;

    // We have to wait for the gpu to finish with the command allocator before we reset it
    WaitForPreviousFrame();

    // we can only reset an allocator once the gpu is done with it
    // resetting an allocator frees the memory that the command list was stored in
    hr = commandAllocator[frameIndex]->Reset();
    if (FAILED(hr))
    {
        Running = false;
    }

    // reset the command list. by resetting the command list we are putting it into
    // a recording state so we can start recording commands into the command allocator.
    // the command allocator that we reference here may have multiple command lists
    // associated with it, but only one can be recording at any time. Make sure
    // that any other command lists associated to this command allocator are in
    // the closed state (not recording).
    // The second parameter is the initial pipeline state object. Earlier tutorials
    // only cleared the rtv and passed NULL here to get a default initial pipeline;
    // now that we draw geometry, we pass our pipeline state object instead
    hr = commandList->Reset(commandAllocator[frameIndex], pipelineStateObject);
    if (FAILED(hr))
    {
        Running = false;
    }

    // here we start recording commands into the commandList (all of which will be stored in the commandAllocator)

    // transition the "frameIndex" render target from the present state to the render target state so the command list draws to it starting from here
    commandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(renderTargets[frameIndex], D3D12_RESOURCE_STATE_PRESENT, D3D12_RESOURCE_STATE_RENDER_TARGET));

    // here we again get the handle to our current render target view so we can set it as the render target in the output merger stage of the pipeline
    CD3DX12_CPU_DESCRIPTOR_HANDLE rtvHandle(rtvDescriptorHeap->GetCPUDescriptorHandleForHeapStart(), frameIndex, rtvDescriptorSize);

    // get a handle to the depth/stencil buffer
    CD3DX12_CPU_DESCRIPTOR_HANDLE dsvHandle(dsDescriptorHeap->GetCPUDescriptorHandleForHeapStart());

    // set the render target for the output merger stage (the output of the pipeline)
    commandList->OMSetRenderTargets(1, &rtvHandle, FALSE, &dsvHandle);

    // Clear the render target by using the ClearRenderTargetView command
    const float clearColor[] = { 0.0f, 0.2f, 0.4f, 1.0f };
    commandList->ClearRenderTargetView(rtvHandle, clearColor, 0, nullptr);

    // clear the depth/stencil buffer
    commandList->ClearDepthStencilView(dsDescriptorHeap->GetCPUDescriptorHandleForHeapStart(), D3D12_CLEAR_FLAG_DEPTH, 1.0f, 0, 0, nullptr);

    // set root signature
    commandList->SetGraphicsRootSignature(rootSignature);

    // set constant buffer descriptor heap
    ID3D12DescriptorHeap* descriptorHeaps[] = { mainDescriptorHeap[frameIndex] };
    commandList->SetDescriptorHeaps(_countof(descriptorHeaps), descriptorHeaps);

    // point root parameter 0 (our descriptor table) at the start of the constant buffer descriptor heap
    commandList->SetGraphicsRootDescriptorTable(0, mainDescriptorHeap[frameIndex]->GetGPUDescriptorHandleForHeapStart());

    // draw the quad
    commandList->RSSetViewports(1, &viewport); // set the viewports
    commandList->RSSetScissorRects(1, &scissorRect); // set the scissor rects
    commandList->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST); // set the primitive topology
    commandList->IASetVertexBuffers(0, 1, &vertexBufferView); // set the vertex buffer (using the vertex buffer view)
    commandList->IASetIndexBuffer(&indexBufferView); // set the index buffer (using the index buffer view)
    commandList->DrawIndexedInstanced(6, 1, 0, 0, 0); // draw the quad (6 indices, 2 triangles)

    // transition the "frameIndex" render target from the render target state to the present state. If the debug layer is enabled, you will receive a
    // warning if present is called on the render target when it's not in the present state
    commandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(renderTargets[frameIndex], D3D12_RESOURCE_STATE_RENDER_TARGET, D3D12_RESOURCE_STATE_PRESENT));

    hr = commandList->Close();
    if (FAILED(hr))
    {
        Running = false;
    }
}

void Render()
{
    HRESULT hr;

    UpdatePipeline(); // update the pipeline by sending commands to the commandqueue

    // create an array of command lists (only one command list here)
    ID3D12CommandList* ppCommandLists[] = { commandList };

    // execute the array of command lists
    commandQueue->ExecuteCommandLists(_countof(ppCommandLists), ppCommandLists);

    // this command goes in at the end of our command queue. we will know when our command queue
    // has finished because the fence value will be set to "fenceValue" from the GPU since the command
    // queue is being executed on the GPU
    hr = commandQueue->Signal(fence[frameIndex], fenceValue[frameIndex]);
    if (FAILED(hr))
    {
        Running = false;
    }

    // present the current backbuffer
    hr = swapChain->Present(0, 0);
    if (FAILED(hr))
    {
        Running = false;
    }
}

void Cleanup()
{
    // wait for the gpu to finish all frames
    for (int i = 0; i < frameBufferCount; ++i)
    {
        frameIndex = i;
        WaitForPreviousFrame();
    }

    // get swapchain out of full screen before exiting
    // (GetFullscreenState returns an HRESULT, so test it with SUCCEEDED and then check the flag itself)
    BOOL fs = false;
    if (SUCCEEDED(swapChain->GetFullscreenState(&fs, NULL)) && fs)
        swapChain->SetFullscreenState(false, NULL);

    SAFE_RELEASE(device);
    SAFE_RELEASE(swapChain);
    SAFE_RELEASE(commandQueue);
    SAFE_RELEASE(rtvDescriptorHeap);
    SAFE_RELEASE(commandList);

    for (int i = 0; i < frameBufferCount; ++i)
    {
        SAFE_RELEASE(renderTargets[i]);
        SAFE_RELEASE(commandAllocator[i]);
        SAFE_RELEASE(fence[i]);

        SAFE_RELEASE(mainDescriptorHeap[i]);
        SAFE_RELEASE(constantBufferUploadHeap[i]);
    }

    SAFE_RELEASE(pipelineStateObject);
    SAFE_RELEASE(rootSignature);
    SAFE_RELEASE(vertexBuffer);
    SAFE_RELEASE(indexBuffer);

    SAFE_RELEASE(depthStencilBuffer);
    SAFE_RELEASE(dsDescriptorHeap);
}

void WaitForPreviousFrame()
{
    HRESULT hr;

    // swap the current rtv buffer index so we draw on the correct buffer
    frameIndex = swapChain->GetCurrentBackBufferIndex();

    // if the current fence value is still less than "fenceValue", then we know the GPU has not finished executing
    // the command queue since it has not reached the "commandQueue->Signal(fence, fenceValue)" command
    if (fence[frameIndex]->GetCompletedValue() < fenceValue[frameIndex])
    {
        // we have the fence create an event which is signaled once the fence's current value is "fenceValue"
        hr = fence[frameIndex]->SetEventOnCompletion(fenceValue[frameIndex], fenceEvent);
        if (FAILED(hr))
        {
            Running = false;
        }

        // We will wait until the fence has triggered the event, which means its current value has reached "fenceValue".
        // once its value has reached "fenceValue", we know the command queue has finished executing
        WaitForSingleObject(fenceEvent, INFINITE);
    }

    // increment fenceValue for next frame
    fenceValue[frameIndex]++;
}
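The expression `(sizeof(ConstantBuffer) + 255) & ~255` used for `cbvDesc.SizeInBytes` in the code above rounds a size up to the next multiple of 256. As a minimal sketch of that arithmetic, written without any Direct3D headers, the helpers below spell the rounding out. The names `AlignUp`, `AlignUpChecked`, `kCbvAlignment` and `ConstantBufferOffset` are hypothetical and introduced only for illustration, and the `ConstantBuffer` struct is an assumed stand-in holding a single four-float color multiplier, which is what the `Update()` code suggests.

```cpp
#include <cassert>
#include <cstddef>

// Assumed stand-in for the tutorial's ConstantBuffer: one RGBA color multiplier.
struct ConstantBuffer {
    float colorMultiplier[4];
};

// Round `size` up to the next multiple of `alignment`.
// The bit trick is only valid when `alignment` is a power of two.
constexpr std::size_t AlignUp(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Runtime variant that also checks the power-of-two precondition.
inline std::size_t AlignUpChecked(std::size_t size, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return AlignUp(size, alignment);
}

// Constant buffer views are sized in multiples of 256 bytes.
constexpr std::size_t kCbvAlignment = 256;

// Byte offset of the index-th constant buffer when several are packed
// back to back in one upload buffer, each padded to 256 bytes.
constexpr std::size_t ConstantBufferOffset(std::size_t index) {
    return index * AlignUp(sizeof(ConstantBuffer), kCbvAlignment);
}

static_assert(AlignUp(sizeof(ConstantBuffer), kCbvAlignment) == 256, "a 16-byte struct pads to 256");
static_assert(AlignUp(256, kCbvAlignment) == 256, "already aligned sizes are unchanged");
static_assert(AlignUp(257, kCbvAlignment) == 512, "one byte over moves to the next multiple");
```

With this padding, the 64 KB upload buffer created per frame would have room for 65536 / 256 = 256 such slots, although the tutorial only places one constant buffer at the start of each buffer.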
Source: https://blog.51cto.com/u_15273495/2914380