.. _shaders:

*******
Shaders
*******

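Dagor Shading Language (DSHL) is a preprocessor and compiler for pure HLSL shaders. In DSHL you can bind resources to HLSL shaders, configure fixed-function stages (culling, Z-test, and so on), and more.
Pure HLSL code must be enclosed in ``hlsl{...}`` blocks.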

==============================
Defining and Compiling Shaders
==============================

Let's look at a simple DSHL shader:

.. code-block:: c

  shader simple_shader
  {
    // describes the vertex buffer layout that the vertex shader expects
    channel float3 pos=pos; // position
    channel float3 vcol=vcol; // vertex color

    hlsl {
      struct VsInput
      {
        float3 pos: POSITION0;
        float3 color: COLOR0;
      };

      struct VsOutput
      {
        float4 pos : SV_POSITION;
        float3 color : COLOR0;
      };

      VsOutput test_vertex(VsInput input)
      {
        VsOutput ret;
        ret.pos = float4(input.pos, 1.0);
        ret.color = input.color;

        return ret;
      }

      float4 test_pixel(VsOutput input) : SV_Target0
      {
        return float4(input.color.rgb, 1.0);
      }
    }
    compile("target_vs", "test_vertex");
    compile("target_ps", "test_pixel");
  }

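Here, ``shader (name)`` defines the actual name of the shader once it has been compiled into pure HLSL.

The ``pos`` and ``vcol`` channels describe the vertex buffer data that the vertex shader expects to receive.
The DSHL preshader uses these ``channel`` declarations to create the matching layout for the C++ code. See :ref:`channels` for more information.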

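After defining the shader functions in the ``hlsl`` block, specify their entry points with ``compile("target_(stage)", "entry_function")``, where ``entry_function`` is the name of the corresponding shader function in the ``hlsl`` block and ``target_(stage)`` is one of the following targets: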

- ``target_vs`` (vertex shader)
- ``target_hs`` (hull shader)
- ``target_ds`` (domain shader)
- ``target_gs`` (geometry shader)
- ``target_ps`` (pixel shader)
- ``target_cs`` (compute shader)
- ``target_ms`` (mesh shader)
- ``target_as`` (amplification shader)
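- ``target_vs_for_gs`` (on PS4/PS5, a vertex shader used together with a geometry shader must be compiled differently)
- ``target_vs_for_tess`` (on PS4/PS5, a vertex shader used together with tessellation shaders must be compiled differently)
- ``target_vs_half`` (vertex shader using half-precision types)
- ``target_ps_half`` (pixel shader using half-precision types)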

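You can also send the code of an ``hlsl`` block to one specific shader stage by naming the stage in parentheses, for example ``hlsl(stage) {...}``.
The available stages are ``ps``, ``vs``, ``cs``, ``ds``, ``hs``, ``gs``, ``ms``, and ``as``. If the stage is omitted, the code in the ``hlsl{...}`` block is sent to all of these stages.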
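
As a sketch of how stage-scoped blocks fit together, the hypothetical shader below keeps a shared struct in a plain ``hlsl`` block and puts each entry point in the block for its own stage. All shader and function names here are made up for illustration.

.. code-block:: c

  shader stage_scoped_example // hypothetical shader name
  {
    channel float3 pos=pos;

    // no stage given: this code goes to every stage
    hlsl {
      struct VsOutput
      {
        float4 pos : SV_POSITION;
      };
    }

    // vertex stage only
    hlsl(vs) {
      struct VsInput
      {
        float3 pos : POSITION0;
      };

      VsOutput example_vs(VsInput input)
      {
        VsOutput ret;
        ret.pos = float4(input.pos, 1.0);
        return ret;
      }
    }

    // pixel stage only
    hlsl(ps) {
      float4 example_ps(VsOutput input) : SV_Target0
      {
        return float4(1.0, 1.0, 1.0, 1.0);
      }
    }

    compile("target_vs", "example_vs");
    compile("target_ps", "example_ps");
  }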

.. _preshader:

=========
Preshader
=========

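In addition to the shader code itself, DSHL lets you declare a preshader: a script that makes it easy to pass data from the C++ side to the shader.

The most common use of this mechanism is binding textures and buffers. Instead of the old routine of picking a slot, setting the texture into that slot, and remembering not to use the same slot twice, you bind variables to the shader through a global map from ``string`` to ``DSHL data type``, whose entries are called "shader variables".

This map corresponds one-to-one to the global DSHL variables defined in your .dshl files (see :ref:`global-variables`), and it is read-write.

For example, C++ code can read an ``int`` defined in a shader, and it can also assign a texture to a global ``texture`` variable defined in a shader.
On the C++ side you fill the map with calls such as ``set_texture``; on the shader side you ask the preshader system to fetch a shader variable and assign it to an HLSL variable. The syntax is: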

.. code-block:: c

  (shader_stage) {
    hlsl_variable_name @type_suffix = variable|expression [hlsl {/*hlsl text*/}]
  }

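The shader compiler then compiles this code into a sequence of simple interpreter commands, which are stored in the shader dump and executed before the shader runs.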
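
As a minimal sketch of this syntax, the fragment below binds a global shader variable to a pixel-shader constant. The names ``fog_color``, ``preshader_example`` and ``fog_ps`` are hypothetical and exist only for this illustration.

.. code-block:: c

  float4 fog_color; // hypothetical global shader variable

  shader preshader_example // hypothetical shader name
  {
    // the preshader copies the shader variable into the HLSL constant
    // before the shader runs
    (ps) { fog_color@f4 = fog_color; }

    hlsl(ps) {
      float4 fog_ps() : SV_Target0
      {
        return fog_color;
      }
    }
    compile("target_ps", "fog_ps");
    // vertex stage omitted for brevity
  }

The C++ side is expected to store a value for ``fog_color`` in the shader variable map before the shader is used.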

Accepted shader stages:

- ``cs`` -- Compute Shader
- ``ps`` -- Pixel Shader
- ``vs`` -- Vertex Shader
- ``ms`` -- Mesh Shader

Accepted types:

+-----------------+--------------------------------------------+
| Suffix          | Type                                       |
+=================+============================================+
| @f1             | float                                      |
+-----------------+--------------------------------------------+
| @f2             | float2                                     |
+-----------------+--------------------------------------------+
| @f3             | float3                                     |
+-----------------+--------------------------------------------+
| @f4             | float4                                     |
+-----------------+--------------------------------------------+
| @f44            | float4x4                                   |
+-----------------+--------------------------------------------+
| @i1             | int                                        |
+-----------------+--------------------------------------------+
| @i2             | int2                                       |
+-----------------+--------------------------------------------+
| @i3             | int3                                       |
+-----------------+--------------------------------------------+
| @i4             | int4                                       |
+-----------------+--------------------------------------------+
| @u1             | uint                                       |
+-----------------+--------------------------------------------+
| @u2             | uint2                                      |
+-----------------+--------------------------------------------+
| @u3             | uint3                                      |
+-----------------+--------------------------------------------+
| @u4             | uint4                                      |
+-----------------+--------------------------------------------+
| @tex            | Texture                                    |
+-----------------+--------------------------------------------+
| @tex2d          | Texture2D                                  |
+-----------------+--------------------------------------------+
| @tex3d          | Texture3D                                  |
+-----------------+--------------------------------------------+
| @texArray       | Texture2DArray                             |
+-----------------+--------------------------------------------+
| @texCube        | TextureCube                                |
+-----------------+--------------------------------------------+
| @texCubeArray   | TextureCubeArray                           |
+-----------------+--------------------------------------------+
| @uav            | Unordered Access View flag                 |
+-----------------+--------------------------------------------+
| @smp            | Texture with SamplerState                  |
+-----------------+--------------------------------------------+
| @smp2d          | Texture2D with SamplerState                |
+-----------------+--------------------------------------------+
| @smp3d          | Texture3D with SamplerState                |
+-----------------+--------------------------------------------+
| @smpArray       | Texture2DArray with SamplerState           |
+-----------------+--------------------------------------------+
| @smpCube        | TextureCube with SamplerState              |
+-----------------+--------------------------------------------+
| @smpCubeArray   | TextureCubeArray with SamplerState         |
+-----------------+--------------------------------------------+
| @shd            | Texture2D with SamplerComparisonState      |
+-----------------+--------------------------------------------+
| @shdArray       | Texture2DArray with SamplerComparisonState |
+-----------------+--------------------------------------------+
| @buf            | Buffer/StructuredBuffer                    |
+-----------------+--------------------------------------------+
| @cbuf           | ConstantBuffer                             |
+-----------------+--------------------------------------------+
| @static         | Material Texture2D with SamplerState       |
+-----------------+--------------------------------------------+
| @staticCube     | Material TextureCube with SamplerState     |
+-----------------+--------------------------------------------+
| @staticTexArray | Material Texture2DArray with SamplerState  |
+-----------------+--------------------------------------------+
| @tlas           | Top-level acceleration structure (RT)      |
+-----------------+--------------------------------------------+

.. note::
  All variables declared in the ``(vs)`` stage are also visible in ``hlsl(<gs, hs, ds>){...}`` blocks.
  All variables declared in the ``(ms)`` stage are also visible in ``hlsl(as){...}`` blocks.
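
For instance, under the rule in the note above, a variable declared once in ``(vs)`` can be read from a geometry-stage block without a second declaration. The names below are hypothetical.

.. code-block:: c

  float4 wind_params; // hypothetical global shader variable

  shader visibility_example // hypothetical shader name
  {
    (vs) { wind_params@f4 = wind_params; }

    hlsl(gs) {
      // wind_params is declared in (vs) but is visible here as well
      float3 apply_wind(float3 world_pos)
      {
        return world_pos + wind_params.xyz * wind_params.w;
      }
    }
    // entry points and compile(...) calls omitted
  }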

--------
Examples
--------

Let's create a ``float4x4`` matrix:

.. code-block:: c

  (ps) { globtm_psf@f44 = { globtm_psf_0, globtm_psf_1, globtm_psf_2, globtm_psf_3 }; }

Here the preshader initializes the HLSL variable ``globtm_psf`` from ``globtm_psf_0..3``, which are ``float4`` values stored in the global shader variable map.
The C++ code is responsible for calling

.. code-block:: c

  set_color4(get_shader_variable_id("globtm_psf_X"), Point4(...));

with appropriate values for ``X=0..3``. Yes, the name ``color4`` is rather unfortunate.

A built-in ``globtm`` *shader variable* is available in ``(vs)`` blocks. You can use it directly to declare an HLSL ``globtm``:

.. code-block:: c

  (vs) { globtm@f44 = globtm; }

You can also work with arrays:

.. code-block:: c

  (ps) { my_arr@type[] = {element1, element2, ..., elementN}; }
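
A concrete instance of the array form might look like the sketch below. The variable names are hypothetical, and indexing the result as an ordinary HLSL array is an assumption based on the declaration syntax rather than something stated above.

.. code-block:: c

  float4 light_pos_0; // hypothetical global shader variables
  float4 light_pos_1;
  float4 light_pos_2;

  shader array_example // hypothetical shader name
  {
    (ps) { light_positions@f4[] = { light_pos_0, light_pos_1, light_pos_2 }; }

    hlsl(ps) {
      // assumed: the array is indexable like a regular HLSL array
      float3 nearest_light(float3 p)
      {
        float3 best = light_positions[0].xyz;
        for (int i = 1; i < 3; ++i)
        {
          float3 candidate = light_positions[i].xyz;
          if (distance(p, candidate) < distance(p, best))
            best = candidate;
        }
        return best;
      }
    }
  }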

---------------------
Textures and Samplers
---------------------

HLSL textures with the default ``float4`` element type are declared with the ``@tex2d, @tex3d, @texArray, @texCube, @texCubeArray`` suffixes.
For example, the following code

.. code-block:: c

  (ps) {
    hlsl_texture@tex2d = some_texture;
    hlsl_texarray@texArray = some_texarray;
  }

is compiled to

.. code-block:: c

  Texture2D hlsl_texture: register(t16);
  Texture2DArray hlsl_texarray: register(t17);
  // registers are chosen automatically by the compiler

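The ``@smp2d, @smp3d, @smpArray, @smpCube, @smpCubeArray`` suffixes additionally define a ``SamplerState`` object alongside the texture and assign it the same register number.

The ``@shd`` and ``@shdArray`` suffixes define a ``SamplerComparisonState`` object in addition to the ``SamplerState`` (``shd`` stands for shadow, because such textures are typically used for shadows).

For example, the following code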

.. code-block:: c

  (ps) {
    hlsl_texture@smp2d = some_texture;
    hlsl_texarray@smpArray = some_texarray;
    hlsl_shdtexture@shd = some_shdtexture;
  }

is compiled to

.. code-block:: c

  SamplerState hlsl_texture_samplerstate: register(s0);
  SamplerState hlsl_texarray_samplerstate: register(s1);
  SamplerState hlsl_shdtexture_samplerstate: register(s2);

  SamplerComparisonState hlsl_shdtexture_cmpSampler:register(s2);

  Texture2D hlsl_texture: register(t0);
  Texture2DArray hlsl_texarray: register(t1);
  Texture2D hlsl_shdtexture: register(t2);

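Note that inside ``hlsl{...}`` blocks you can use the ``<texture_name>_samplerstate`` and ``<texture_name>_cmpSampler`` objects generated by the shader compiler (for example, ``hlsl_shdtexture_cmpSampler`` from the example above).

The ``@tex`` and ``@smp`` suffixes declare a texture of a specific type and must be followed immediately by an ``hlsl{...}`` block that defines the texture type.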
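
The generated sampler objects can then be passed to the standard HLSL sampling methods. The sketch below assumes the declarations from the example above; the function names are hypothetical.

.. code-block:: c

  hlsl(ps) {
    // regular filtered read through the generated SamplerState
    float4 read_color(float2 uv)
    {
      return hlsl_texture.Sample(hlsl_texture_samplerstate, uv);
    }

    // array texture: the third coordinate selects the slice
    float4 read_slice(float2 uv, float slice)
    {
      return hlsl_texarray.Sample(hlsl_texarray_samplerstate, float3(uv, slice));
    }

    // depth comparison through the generated SamplerComparisonState
    float read_shadow(float2 uv, float compare_depth)
    {
      return hlsl_shdtexture.SampleCmpLevelZero(hlsl_shdtexture_cmpSampler, uv, compare_depth);
    }
  }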

.. code-block:: c

  (ps) {
    // textures without samplers
    uint_texture@tex = uint_texture hlsl { Texture2D<uint> uint_texture@tex; }
    float_texarray@tex = float_texarray hlsl { Texture2DArray<float> float_texarray@tex; }

    // textures with samplers
    uint_texture@smp = uint_texture hlsl { Texture2D<uint> uint_texture@smp; }
    float_texarray@smp = float_texarray hlsl { Texture2DArray<float> float_texarray@smp; }
  }
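
Once declared, a typed texture is used like any other HLSL texture of that type. A minimal sketch, assuming the ``@tex`` declarations above (the function names are hypothetical):

.. code-block:: c

  hlsl(ps) {
    // integer textures are read with Load; the third component is the mip level
    uint read_id(int2 pixel)
    {
      return uint_texture.Load(int3(pixel, 0));
    }

    // for an array texture, Load takes (x, y, slice, mip)
    float read_layer(int2 pixel, int slice)
    {
      return float_texarray.Load(int4(pixel, slice, 0));
    }
  }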

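-----------------
Material Textures
-----------------

Textures bound to a material (diffuse, normal map, and so on) are called *material textures*.
In the preshader they must be declared with the ``@static, @staticCube, @staticTexArray`` suffixes, which distinguish them from global or dynamic textures.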

.. code-block:: c

  shader example_shader
  {
    texture diffuse_tex = material.texture.diffuse;
    texture normal_tex = material.texture[1];
    texture cube_tex = material.texture[2];
    texture some_texarray = material.texture[3];

    (ps) {
      diffuse_tex@static = diffuse_tex;
      normal_tex@static = normal_tex;
      cube_tex@staticCube = cube_tex;
      some_texarray@staticTexArray = some_texarray;
    }
  }

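When compiling for DX12, material textures are automatically used as bindless textures. Vulkan and PlayStation also support bindless textures when the special ``-enableBindless:on`` compiler flag is passed.

In HLSL blocks, refer to a material texture through its getter ``get_<texture_name>()`` rather than by its name: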

.. code-block:: c

  hlsl(ps) {
    float4 albedo = tex2DBindless(get_diffuse_tex(), input.diffuseTexCoord.uv);
  }

.. note::
  The syntax above also works when bindless textures are disabled.

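When bindless textures are used, the ``MaterialProperties`` constant buffer is filled with ``uint2`` indices: the first component indexes the texture and the second indexes the sampler.

These indices are then used to fetch the corresponding texture and sampler from the ``static_textures[]`` and ``static_samplers[]`` arrays.

This is essentially what ``get_<texture_name>()`` does.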
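
Conceptually, the lookup described above amounts to the following sketch. It is not the generated code: the helper name and the way the ``uint2`` reaches the function are assumptions made for illustration, and only ``static_textures[]`` and ``static_samplers[]`` come from the description above.

.. code-block:: c

  hlsl(ps) {
    // indices.x selects the texture, indices.y selects the sampler
    float4 sample_material_texture(uint2 indices, float2 uv) // hypothetical helper
    {
      Texture2D tex = static_textures[indices.x];
      SamplerState smp = static_samplers[indices.y];
      return tex.Sample(smp, uv);
    }
  }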

-------
Buffers
-------

``Buffer`` and ``ConstantBuffer`` declarations must be followed by an ``hlsl{...}`` block. For example:

.. code-block:: c

  (ps) {
    some@buf = my_buffer hlsl {
      #include <myStruct.h>
      StructuredBuffer<MyStruct> some@buf;
    }
  }

  (ps) {
    my_buf@cbuf = my_const_buffer hlsl {
      #include <myStruct.h>
      cbuffer my_buf@cbuf
      {
        MyStruct data;
      };
    }
  }
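
Inside an ``hlsl`` block, these resources are then read with ordinary HLSL syntax. A minimal sketch, assuming the two declarations above and that ``MyStruct`` is visible at this point (the function names are hypothetical):

.. code-block:: c

  hlsl(ps) {
    // StructuredBuffer elements are accessed by index
    MyStruct read_element(uint index)
    {
      return some[index];
    }

    // cbuffer members are accessed directly by name
    MyStruct read_constants()
    {
      return data;
    }
  }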

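-------------------
Hardcoded Registers
-------------------

You can bind any resource to a hardcoded register; automatically allocated resources will not overlap with it.
In addition, the ``always_referenced`` keyword is not required: the integer variable is saved in the dump and can be read on the CPU side.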

.. code-block:: c

  int reg_no = 3;

  shader sh {
    (ps) {
      foo_vec@f4 : register(reg_no);
      foo_tex@smp2d : register(reg_no);
      foo_buf@buf : register(reg_no) hlsl { StructuredBuffer<uint> foo_buf@buf; };
      foo_uav@uav : register(reg_no) hlsl { RWStructuredBuffer<uint> foo_uav@uav; };
    }
  }

The register number must be declared as a global ``int`` variable.

.. note::
  No stcode is generated for resources declared this way.

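----------------------
Unordered Access Views
----------------------

The ``@uav`` suffix tells the shader compiler that the resource should be bound to a ``u`` register.
Note that such a declaration must be followed by an ``hlsl{...}`` block that defines the actual type of the UAV resource.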

.. code-block:: c

  buffer some_buffer;
  texture some_texture;

  shader some_shader {
    (cs) {
      hlsl_buffer@uav = some_buffer hlsl {
        RWStructuredBuffer<uint> hlsl_buffer@uav;
      }
      hlsl_texture@uav = some_texture hlsl {
        RWTexture2D<float4> hlsl_texture@uav;
      }
    }
    // ...
  }
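
A compute shader could then write through these views as sketched below. The entry point name and thread-group size are hypothetical, and the code assumes the dispatch covers no more threads than the resources have elements.

.. code-block:: c

  shader some_shader {
    // ... the (cs) declarations from above ...

    hlsl(cs) {
      [numthreads(8, 8, 1)]
      void fill_cs(uint3 tid : SV_DispatchThreadID)
      {
        // RWTexture2D is indexed by pixel coordinate
        hlsl_texture[tid.xy] = float4(0.0, 0.0, 0.0, 1.0);

        // write one buffer element per thread of the first row
        if (tid.y == 0)
          hlsl_buffer[tid.x] = tid.x;
      }
    }
    compile("target_cs", "fill_cs");
  }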

--------------------------------
Top-Level Acceleration Structure
--------------------------------

For ray tracing, you can also declare a TLAS (top-level acceleration structure) like this:

.. code-block:: c

  tlas some_tlas;

  shader some_shader {
    (cs) {
      hlsl_tlas@tlas = some_tlas;
    }
    // ...
  }

In HLSL terms, ``hlsl_tlas`` has the type ``RaytracingAccelerationStructure``.
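
One possible use is an inline ray query (HLSL Shader Model 6.5). Whether a given target platform or DSHL configuration supports inline ray queries is an assumption here, and the function name is hypothetical.

.. code-block:: c

  hlsl(cs) {
    // returns true if a triangle is hit between origin and origin + dir * max_t
    bool is_occluded(float3 origin, float3 dir, float max_t)
    {
      RayDesc ray;
      ray.Origin = origin;
      ray.Direction = dir;
      ray.TMin = 0.001;
      ray.TMax = max_t;

      // opaque triangles only, stop at the first hit
      RayQuery<RAY_FLAG_FORCE_OPAQUE |
               RAY_FLAG_SKIP_PROCEDURAL_PRIMITIVES |
               RAY_FLAG_ACCEPT_FIRST_HIT_AND_END_SEARCH> query;
      query.TraceRayInline(hlsl_tlas, RAY_FLAG_NONE, 0xFF, ray);
      query.Proceed();

      return query.CommittedStatus() == COMMITTED_TRIANGLE_HIT;
    }
  }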

.. _shader-blocks:

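=============
Shader Blocks
=============

Shader blocks extend the preshader idea: a block defines variables and constants shared by every shader that ``supports`` it.
Their purpose is to reduce constant and texture switching.
For example: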

.. code-block:: c

  float4 world_view_pos;

  block(global_const|frame|scene|object) name_of_block
  {
    (ps) { world_view_pos@f3 = world_view_pos; }
    (vs) { world_view_pos@f3 = world_view_pos; }
  }

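Note that a ``block``, like a ``shader``, defines a preshader script.
This is the main reason blocks are useful:
they let you factor out the part of the preshader that is common to several shaders and execute it once, when the block is set, rather than every time a shader runs.
In this example, ``world_view_pos`` is visible in the pixel and vertex shaders of every shader that supports this block.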

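-------------------
Shader Block Layers
-------------------

The specifier in the parentheses of ``block(...)`` is called a layer. It indicates how often the values inside the block change.
The available layers are:

- ``global_const`` (global constants that rarely change)
- ``frame`` (shader variables that change once per frame)
- ``scene`` (shader variables that change within a frame whenever the rendering mode changes)
- ``object`` (shader variables that change for every object)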

.. warning::
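  Per-object blocks are evil and should be avoided whenever possible.
  They imply a draw-call-per-object model, which has historically proven to hurt performance.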

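The rendering mode mentioned for the ``scene`` layer is user-defined and can be set specifically for each shader.
For example, the ``rendinst_opaque_inc.dshl`` shader has ``4`` scene blocks, which are switched between while a single frame is rendered:

- ``rendinst_scene`` for the color pass
- ``rendinst_depth_scene`` for the depth pass
- ``rendinst_grassify_scene`` for the grass generation pass
- ``rendinst_voxelize_scene`` for the voxelization pass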

------------------------------
Using Shader Blocks in Shaders
------------------------------

The syntax for using such blocks in a shader is:

.. code-block:: c

  shader shader_name
  {
    supports some_block;
    supports some_other_block;

    hlsl(ps) {
      // assuming world_view_pos is defined in these blocks
      float3 multiplied_world_pos = 2.0 * world_view_pos;
      ...
    }
  }

When a shader supports multiple blocks, it can use only the variables in the intersection of those blocks.
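
To illustrate the intersection rule, the sketch below uses two hypothetical blocks. Only ``world_view_pos`` is declared in both, so it is the only block variable the shader can rely on. All block, shader, and variable names other than ``world_view_pos`` are made up for this example.

.. code-block:: c

  float4 world_view_pos;
  float4 sun_dir;   // hypothetical global shader variable
  float4 fog_color; // hypothetical global shader variable

  block(frame) block_a // hypothetical block
  {
    (ps) {
      world_view_pos@f3 = world_view_pos;
      sun_dir@f3 = sun_dir;
    }
  }

  block(frame) block_b // hypothetical block
  {
    (ps) {
      world_view_pos@f3 = world_view_pos;
      fog_color@f4 = fog_color;
    }
  }

  shader intersection_example // hypothetical shader name
  {
    supports block_a;
    supports block_b;

    hlsl(ps) {
      float3 get_view_pos()
      {
        // declared in both blocks, so it is inside the intersection
        return world_view_pos;
        // sun_dir and fog_color are each declared in only one block,
        // so they are outside the intersection and should not be used here
      }
    }
  }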